Many large language models can now be called through the OpenAI library. After installing Python 3.7.1 or later and setting up a virtual environment, install the OpenAI Python library by running the following command in your terminal:

pip install --upgrade openai
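
If you haven't created a virtual environment yet, a typical setup looks like this (a minimal sketch; the environment name is arbitrary, and the activation command differs by shell and platform):

python -m venv openai-env
source openai-env/bin/activate   # on Windows: openai-env\Scripts\activate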

Si's API endpoints for chat, language and code, images, and embeddings are fully compatible with OpenAI's API.

If your application uses OpenAI's client libraries, configuring it to connect to Si's API servers is straightforward, letting you seamlessly run your existing code against our open-source models.

Configuring the OpenAI client to use the Si API

To start using Si with OpenAI's client libraries, pass your Si API key to the api_key option, and change the base_url to https://api.TBA/v1:

import openai

client = openai.OpenAI(
  api_key="YOUR_API_KEY",
  base_url="https://api.TBA/v1",
)

You can find your API key on your settings page.
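
Rather than hardcoding the key, you can read it from an environment variable. A minimal sketch, assuming you have exported the key under a hypothetical name such as SI_API_KEY:

import os
import openai

# SI_API_KEY is an assumed variable name; use whatever your deployment defines.
client = openai.OpenAI(
  api_key=os.environ.get("SI_API_KEY"),
  base_url="https://api.TBA/v1",
)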

Querying an inference model

Now that your OpenAI client is configured to point to Si, you can start using one of our open-source models for your inference queries.

For example, you can query one of our chat models, like Meta Llama 3:

import openai

client = openai.OpenAI(
  api_key="YOUR_API_KEY",
  base_url="https://api.TBA/v1",
)

response = client.chat.completions.create(
  model="meta-llama/Llama-3-8b-chat-hf",
  messages=[
    {"role": "system", "content": "You are a travel agent. Be descriptive and helpful."},
    {"role": "user", "content": "Tell me about San Francisco"},
  ]
)

print(response.choices[0].message.content)
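
The response object also carries OpenAI-style metadata. For example, assuming the endpoint populates the standard usage field with token accounting, you can inspect it like this:

# Token accounting, if the endpoint populates the usage field.
usage = response.usage
print(f"prompt tokens: {usage.prompt_tokens}")
print(f"completion tokens: {usage.completion_tokens}")
print(f"total tokens: {usage.total_tokens}")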

Or you can use one of our code models to generate a code completion:

import openai

client = openai.OpenAI(
  api_key="YOUR_API_KEY",
  base_url="https://api.TBA/v1",
)

response = client.completions.create(
  model="codellama/CodeLlama-34b-Python-hf",
  prompt="def bubbleSort(): ",
  max_tokens=175
)

print(response.choices[0].text)
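
The embeddings endpoint mentioned earlier follows the same pattern. A minimal sketch; the model name below is a placeholder, so substitute a real embedding model from the model catalog:

import openai

client = openai.OpenAI(
  api_key="YOUR_API_KEY",
  base_url="https://api.TBA/v1",
)

response = client.embeddings.create(
  model="example-embedding-model",  # placeholder; pick an embedding model from the catalog
  input="Our solar system orbits the Milky Way galaxy.",
)

print(len(response.data[0].embedding))  # dimensionality of the returned vector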

Streaming with OpenAI

You can also use the OpenAI client's streaming capabilities to stream the response back as it is generated:

import openai

system_content = "You are a travel agent. Be descriptive and helpful."
user_content = "Tell me about San Francisco"

client = openai.OpenAI(
  api_key="YOUR_API_KEY",
  base_url="https://api.TBA/v1",
)

stream = client.chat.completions.create(
  model="mistralai/Mixtral-8x7B-Instruct-v0.1",
  messages=[
    {"role": "system", "content": system_content},
    {"role": "user", "content": user_content},
  ],
  stream=True,
)

for chunk in stream:
  print(chunk.choices[0].delta.content or "", end="", flush=True)
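
If your application is built on asyncio, the same library's async client streams in the same way. A minimal sketch using openai.AsyncOpenAI:

import asyncio
import openai

client = openai.AsyncOpenAI(
  api_key="YOUR_API_KEY",
  base_url="https://api.TBA/v1",
)

async def main():
  stream = await client.chat.completions.create(
    model="mistralai/Mixtral-8x7B-Instruct-v0.1",
    messages=[{"role": "user", "content": "Tell me about San Francisco"}],
    stream=True,
  )
  # Chunks arrive incrementally; print each delta as it comes in.
  async for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="", flush=True)

asyncio.run(main())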

Community libraries

The Si API is also supported by most OpenAI-compatible libraries built by the community.

Feel free to reach out to support if you come across any unexpected behavior when using our API.