chat

Create chat completion

Create a chat completion. Supports streaming.
Request

POST /chat/completions

Request Body

| Field | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | Model name |
| messages | Array<Message> | Yes | Messages list |
| frequency_penalty | number | No | - |
| logit_bias | object | No | - |
| logprobs | boolean | No | When true, stream must be false (OpenAI constraint) |
| top_logprobs | integer | No | - |
| max_tokens | integer | No | - |
| n | integer | No | Number of chat completion choices to generate |
| modalities | Array<string (text, audio)> | No | Output modality types. Use ["text", "audio"] for audio output |
| audio | object | No | Audio output configuration (when modalities includes audio) |
| presence_penalty | number | No | - |
| response_format | object | No | - |
| seed | integer | No | - |
| service_tier | string (auto, default) | No | - |
| stop | string | No | - |
| stream | boolean | No | - |
| stream_options | object | No | - |
| temperature | number | No | - |
| top_p | number | No | - |
| top_k | integer | No | Top-k sampling parameter (non-OpenAI standard, model-specific) |
| tools | Array<Tool> | No | - |
| tool_choice | string (none, auto, required) or object | No | Tool selection. A string, or an object naming a specific function to force (see "Forced tool call" example) |
| parallel_tool_calls | boolean | No | Whether to enable parallel function calling during tool use. Only valid when tools are specified. |
| user | string | No | Unique identifier representing end-user for abuse monitoring |
| reasoning_effort | string (low, medium, high) | No | Reasoning effort level for reasoning models (e.g. the o1 series) |
| max_completion_tokens | integer | No | Upper bound on tokens generated for the completion, including any reasoning tokens (alternative to max_tokens) |
| store | boolean | No | Whether to store the output for use in model distillation or evals |
| metadata | object | No | Custom metadata to attach to the request for tracking purposes |
Request Examples
Basic chat request
```json
{
  "model": "gpt-4o-mini",
  "messages": [
    { "role": "user", "content": "Hello, how are you?" }
  ]
}
```

Request with system prompt
```json
{
  "model": "qwen-plus",
  "messages": [
    { "role": "system", "content": "You are a helpful assistant." },
    { "role": "user", "content": "What is the capital of France?" }
  ],
  "temperature": 0.7,
  "max_tokens": 100
}
```

Streaming request
```json
{
  "model": "gpt-4o-mini",
  "messages": [
    { "role": "user", "content": "Tell me a story" }
  ],
  "stream": true,
  "temperature": 0.8
}
```

Request with tool calls
```json
{
  "model": "gpt-4o-mini",
  "messages": [
    { "role": "user", "content": "What's the weather like in San Francisco and Tokyo?" }
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get the current weather in a given location",
        "parameters": {
          "type": "object",
          "properties": {
            "location": {
              "type": "string",
              "description": "The city and state, e.g. San Francisco, CA"
            },
            "unit": {
              "type": "string",
              "enum": ["celsius", "fahrenheit"],
              "description": "The temperature unit to use"
            }
          },
          "required": ["location"]
        }
      }
    }
  ],
  "tool_choice": "auto",
  "temperature": 0.7
}
```

Tool call result response
```json
{
  "model": "gpt-4o-mini",
  "messages": [
    { "role": "user", "content": "What's the weather in San Francisco?" },
    {
      "role": "assistant",
      "content": null,
      "tool_calls": [
        {
          "id": "call_abc123",
          "type": "function",
          "function": {
            "name": "get_weather",
            "arguments": "{\"location\": \"San Francisco, CA\", \"unit\": \"celsius\"}"
          }
        }
      ]
    },
    {
      "role": "tool",
      "content": "{\"temperature\": 18, \"condition\": \"sunny\", \"humidity\": 65}",
      "tool_call_id": "call_abc123"
    }
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get the current weather in a given location",
        "parameters": {
          "type": "object",
          "properties": {
            "location": { "type": "string" },
            "unit": { "type": "string", "enum": ["celsius", "fahrenheit"] }
          },
          "required": ["location"]
        }
      }
    }
  ]
}
```

Multi-turn conversation
```json
{
  "model": "claude-sonnet-4.5",
  "messages": [
    { "role": "system", "content": "You are a knowledgeable programming tutor." },
    { "role": "user", "content": "How do I create a list in Python?" },
    { "role": "assistant", "content": "In Python, you can create a list using square brackets. For example: my_list = [1, 2, 3]" },
    { "role": "user", "content": "How do I add items to it?" }
  ],
  "max_tokens": 500,
  "temperature": 0.8
}
```

JSON mode output
```json
{
  "model": "gpt-4o-mini",
  "messages": [
    { "role": "system", "content": "You are a helpful assistant that outputs in JSON format." },
    { "role": "user", "content": "Extract the name, age, and occupation from this text: John is 30 years old and works as a software engineer." }
  ],
  "response_format": { "type": "json_object" },
  "temperature": 0.5
}
```

Structured JSON output
```json
{
  "model": "gpt-4o-mini",
  "messages": [
    { "role": "user", "content": "Generate a user profile for a software engineer named Alice" }
  ],
  "response_format": {
    "type": "json_schema",
    "json_schema": {
      "name": "user_profile",
      "description": "A user profile object",
      "schema": {
        "type": "object",
        "properties": {
          "name": { "type": "string" },
          "age": { "type": "integer" },
          "occupation": { "type": "string" },
          "skills": { "type": "array", "items": { "type": "string" } }
        },
        "required": ["name", "occupation"]
      }
    }
  },
  "temperature": 0.7
}
```

Vision input (image understanding)
```json
{
  "model": "gpt-4o-mini",
  "messages": [
    {
      "role": "user",
      "content": [
        { "type": "text", "text": "What's in this image?" },
        {
          "type": "image_url",
          "image_url": { "url": "https://example.com/image.jpg", "detail": "high" }
        }
      ]
    }
  ],
  "max_tokens": 300
}
```

Forced tool call
```json
{
  "model": "gpt-4o-mini",
  "messages": [
    { "role": "user", "content": "Tell me about the weather" }
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get weather information",
        "parameters": {
          "type": "object",
          "properties": { "location": { "type": "string" } },
          "required": ["location"]
        }
      }
    },
    {
      "type": "function",
      "function": {
        "name": "get_time",
        "description": "Get current time",
        "parameters": {
          "type": "object",
          "properties": { "timezone": { "type": "string" } }
        }
      }
    }
  ],
  "tool_choice": { "type": "function", "function": { "name": "get_weather" } }
}
```

Reasoning mode
```json
{
  "model": "grok-4-fast-reasoning",
  "messages": [
    { "role": "user", "content": "Solve this math problem: If a train travels 120 km in 2 hours, then stops for 30 minutes, then travels another 90 km in 1.5 hours, what is the average speed for the entire journey?" }
  ],
  "max_completion_tokens": 2000,
  "reasoning_effort": "high",
  "user": "user_12345"
}
```

Audio output (speech response)
```json
{
  "model": "gpt-4o-mini-audio",
  "messages": [
    { "role": "user", "content": "Tell me a short story about a robot" }
  ],
  "modalities": ["text", "audio"],
  "audio": { "voice": "alloy", "format": "mp3" },
  "max_tokens": 500
}
```

Parallel tool calls enabled
```json
{
  "model": "gpt-4o-mini",
  "messages": [
    { "role": "user", "content": "Check the weather in Tokyo, Paris, and New York simultaneously" }
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get weather information",
        "parameters": {
          "type": "object",
          "properties": { "city": { "type": "string" } },
          "required": ["city"]
        }
      }
    }
  ],
  "parallel_tool_calls": true,
  "temperature": 0.7
}
```

Request with metadata and user tracking
```json
{
  "model": "gpt-4o-mini",
  "messages": [
    { "role": "user", "content": "Explain quantum entanglement in simple terms" }
  ],
  "temperature": 0.8,
  "max_tokens": 300,
  "user": "user_abc123",
  "metadata": {
    "session_id": "session_xyz789",
    "conversation_id": "conv_456",
    "source": "mobile_app",
    "version": "1.2.3"
  },
  "store": true
}
```

Successful response
Response Schema
| Field | Type | Required | Description |
|---|---|---|---|
| id | string | Yes | - |
| object | string | Yes | - |
| created | integer | Yes | - |
| model | string | Yes | - |
| choices | Array<ChatCompletionChoice> | Yes | - |
| usage | object | No | - |
| system_fingerprint | string | No | - |
Response Example
```json
{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "gpt-4o-mini",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! I'm doing well, thank you for asking. How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": { "prompt_tokens": 13, "completion_tokens": 17, "total_tokens": 30 }
}
```

Code Examples
JavaScript (Fetch)
```javascript
const response = await fetch('https://api.r9s.ai/v1/chat/completions', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    "model": "gpt-4o-mini",
    "messages": [
      { "role": "user", "content": "Hello, how are you?" }
    ]
  })
});

const data = await response.json();
console.log(data);
```

Python (requests)
```python
import requests

url = "https://api.r9s.ai/v1/chat/completions"
headers = {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json"
}

response = requests.post(url, json={
    "model": "gpt-4o-mini",
    "messages": [
        { "role": "user", "content": "Hello, how are you?" }
    ]
}, headers=headers)
data = response.json()
print(data)
```

cURL

```shell
curl -X POST "https://api.r9s.ai/v1/chat/completions" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-4o-mini","messages":[{"role":"user","content":"Hello, how are you?"}]}'
```

Schema Reference
ChatCompletionRequest
Section titled “ChatCompletionRequest”| Field | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | Model name |
| messages | Array<Message> | Yes | Messages list |
| frequency_penalty | number | No | - |
| logit_bias | object | No | - |
| logprobs | boolean | No | When true, stream must be false (OpenAI constraint) |
| top_logprobs | integer | No | - |
| max_tokens | integer | No | - |
| n | integer | No | Number of chat completion choices to generate |
| modalities | Array<string (text, audio)> | No | Output modality types. Use ["text", "audio"] for audio output |
| audio | object | No | Audio output configuration (when modalities includes audio) |
| presence_penalty | number | No | - |
| response_format | object | No | - |
| seed | integer | No | - |
| service_tier | string (auto, default) | No | - |
| stop | string | No | - |
| stream | boolean | No | - |
| stream_options | object | No | - |
| temperature | number | No | - |
| top_p | number | No | - |
| top_k | integer | No | Top-k sampling parameter (non-OpenAI standard, model-specific) |
| tools | Array<Tool> | No | - |
| tool_choice | string (none, auto, required) or object | No | Tool selection. A string, or an object naming a specific function to force (see "Forced tool call" example) |
| parallel_tool_calls | boolean | No | Whether to enable parallel function calling during tool use. Only valid when tools are specified. |
| user | string | No | Unique identifier representing end-user for abuse monitoring |
| reasoning_effort | string (low, medium, high) | No | Reasoning effort level for reasoning models (e.g. the o1 series) |
| max_completion_tokens | integer | No | Upper bound on tokens generated for the completion, including any reasoning tokens (alternative to max_tokens) |
| store | boolean | No | Whether to store the output for use in model distillation or evals |
| metadata | object | No | Custom metadata to attach to the request for tracking purposes |
Message
| Field | Type | Required | Description |
|---|---|---|---|
| role | string (system, user, assistant, tool) | Yes | Message role |
| content | string or Array<MessageContent> | No | Message content. Required for user and system messages (text or multimodal content) and for tool messages (the tool's return value, usually a JSON string). Optional for assistant messages when tool_calls is present, where it may be null or omitted. Note: the /v1/responses API requires content to be present and non-null; only /v1/chat/completions accepts null content alongside tool_calls. |
| name | string | No | Sender name |
| reasoning_content | string | No | Reasoning content |
| tool_calls | Array<ToolCall> | No | Tool calls list |
| tool_call_id | string | No | Tool call ID |
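The content-nullability rule above is easiest to see in code. Below is a minimal sketch of the three-message sequence for one tool call round trip; the helper name `tool_round_trip` and its arguments are illustrative, not part of the API.

```python
import json

def tool_round_trip(user_text, call_id, fn_name, args, result):
    """Build the user -> assistant(tool_calls) -> tool message sequence.

    Per the Message schema: the assistant message may carry content=None
    alongside tool_calls, and the tool message echoes the call's id
    in tool_call_id.
    """
    return [
        {"role": "user", "content": user_text},
        {
            "role": "assistant",
            "content": None,  # allowed here because tool_calls is present
            "tool_calls": [{
                "id": call_id,
                "type": "function",
                "function": {"name": fn_name, "arguments": json.dumps(args)},
            }],
        },
        # The tool message carries the tool's result, usually as a JSON string
        {"role": "tool", "content": json.dumps(result), "tool_call_id": call_id},
    ]

messages = tool_round_trip(
    "What's the weather in San Francisco?",
    "call_abc123", "get_weather",
    {"location": "San Francisco, CA"},
    {"temperature": 18, "condition": "sunny"},
)
```

The same `messages` list can be sent back in a follow-up request so the model can phrase the tool result as a final answer (compare the "Tool call result response" example above).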
MessageContent
| Field | Type | Required | Description |
|---|---|---|---|
| type | string (text, image_url) | Yes | - |
| text | string | No | - |
| image_url | object | No | - |
ImageURL
| Field | Type | Required | Description |
|---|---|---|---|
| url | string | Yes | - |
| detail | string (auto, low, high) | No | - |
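OpenAI-compatible endpoints commonly also accept base64 data URLs in `url` for inline images; whether this server does is an assumption, not something the schema above states. A sketch of building such a content part (`image_part_from_bytes` is a hypothetical helper):

```python
import base64

def image_part_from_bytes(raw: bytes, mime: str = "image/jpeg", detail: str = "auto"):
    """Build a MessageContent image part from raw image bytes,
    assuming the endpoint accepts base64 data URLs in image_url.url."""
    b64 = base64.b64encode(raw).decode("ascii")
    return {
        "type": "image_url",
        "image_url": {"url": f"data:{mime};base64,{b64}", "detail": detail},
    }

# Placeholder bytes; in practice this would be open("photo.jpg", "rb").read()
part = image_part_from_bytes(b"\xff\xd8\xff\xe0fake-jpeg-bytes")
```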
ToolCall
| Field | Type | Required | Description |
|---|---|---|---|
| id | string | Yes | - |
| type | string | Yes | - |
| function | object | Yes | - |
FunctionCall
| Field | Type | Required | Description |
|---|---|---|---|
| name | string | Yes | - |
| arguments | string | No | - |
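Note that `arguments` is typed as a string: it carries JSON-encoded text, not a nested object, so clients must decode it before dispatching the call. For example, using the tool-call example above:

```python
import json

# A FunctionCall as returned inside an assistant message's tool_calls.
call = {
    "name": "get_weather",
    "arguments": "{\"location\": \"San Francisco, CA\", \"unit\": \"celsius\"}",
}

# Decode the JSON string before passing the values to the real function.
args = json.loads(call["arguments"])
```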
Audio

| Field | Type | Required | Description |
|---|---|---|---|
| voice | string (alloy, echo, fable, onyx, nova, shimmer) | No | Voice type for audio output |
| format | string (mp3, opus, aac, flac, wav, pcm) | No | Audio output format |
ResponseFormat
| Field | Type | Required | Description |
|---|---|---|---|
| type | string (text, json_object, json_schema) | No | - |
| json_schema | object | No | - |
JsonSchema
| Field | Type | Required | Description |
|---|---|---|---|
| name | string | No | - |
| description | string | No | - |
| schema | object | No | - |
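Even with `json_schema` output, the model's reply arrives as a string in `message.content`, so the client still decodes and checks it. A minimal sketch (`check_required` is an illustrative helper, not full JSON Schema validation; a library such as jsonschema would do that properly):

```python
import json

def check_required(schema: dict, payload: str) -> dict:
    """Decode a structured-output message and verify that the schema's
    required keys are present."""
    data = json.loads(payload)
    missing = [k for k in schema.get("required", []) if k not in data]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    return data

# Schema mirrors the "Structured JSON output" request example above.
schema = {
    "type": "object",
    "properties": {"name": {"type": "string"}, "occupation": {"type": "string"}},
    "required": ["name", "occupation"],
}
profile = check_required(schema, '{"name": "Alice", "occupation": "software engineer"}')
```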
StreamOptions
| Field | Type | Required | Description |
|---|---|---|---|
| include_usage | boolean | No | Whether to include usage statistics |
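With `stream: true`, responses arrive as server-sent events; assuming the usual OpenAI-style framing (`data: <json>` lines terminated by `data: [DONE]`, with the usage object on a final chunk when `include_usage` is set), parsing can be sketched as:

```python
import json

def parse_sse_lines(lines):
    """Collect chunk objects from OpenAI-style SSE lines.

    Assumes each event is a 'data: <json>' line and the stream ends
    with 'data: [DONE]'. With stream_options.include_usage set, the
    chunk before [DONE] carries the usage object.
    """
    chunks = []
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip blank lines and other SSE fields
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break
        chunks.append(json.loads(payload))
    return chunks

# Hypothetical captured stream for illustration.
sample = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    'data: {"choices": [], "usage": {"total_tokens": 30}}',
    "data: [DONE]",
]
chunks = parse_sse_lines(sample)
text = "".join(c["choices"][0]["delta"]["content"] for c in chunks if c["choices"])
```

In a real client the lines would come from the HTTP response body (e.g. `response.iter_lines()` with the requests library) rather than a list.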
Tool

| Field | Type | Required | Description |
|---|---|---|---|
| type | string (function) | Yes | Tool type, currently only supports function |
| function | object | Yes | - |
ChatCompletionResponse
| Field | Type | Required | Description |
|---|---|---|---|
| id | string | Yes | - |
| object | string | Yes | - |
| created | integer | Yes | - |
| model | string | Yes | - |
| choices | Array<ChatCompletionChoice> | Yes | - |
| usage | object | No | - |
| system_fingerprint | string | No | - |
ChatCompletionChoice
| Field | Type | Required | Description |
|---|---|---|---|
| index | integer | Yes | - |
| message | object | Yes | - |
| finish_reason | string (stop, length, tool_calls, content_filter) | Yes | - |
| logprobs | object | No | - |
Usage

| Field | Type | Required | Description |
|---|---|---|---|
| prompt_tokens | integer | Yes | Number of tokens in the prompt (input) |
| prompt_tokens_details | object | No | Details about prompt tokens |
| completion_tokens | integer | Yes | Number of tokens in the completion (output) |
| completion_tokens_details | object | No | Details about completion tokens |
| total_tokens | integer | Yes | Total number of tokens (prompt + completion) |
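Since `total_tokens` is defined as prompt plus completion tokens, a client can sanity-check the usage object it receives:

```python
def check_usage(usage: dict) -> int:
    """Verify total_tokens == prompt_tokens + completion_tokens and return it."""
    expected = usage["prompt_tokens"] + usage["completion_tokens"]
    assert usage["total_tokens"] == expected, "token counts disagree"
    return usage["total_tokens"]

# Matches the usage block in the response example above: 13 + 17 = 30.
total = check_usage({"prompt_tokens": 13, "completion_tokens": 17, "total_tokens": 30})
```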
Related APIs
- API Overview - Learn about authentication and basic information
- models - View models related APIs
- responses - View responses related APIs
- messages - View messages related APIs
- completions - View completions related APIs
- edits - View edits related APIs
- images - View images related APIs
- embeddings - View embeddings related APIs
- engine-embeddings - View engine-embeddings related APIs
- moderations - View moderations related APIs
- audio - View audio related APIs
- search - View search related APIs
- proxy - View proxy related APIs