
responses

Create a response with streaming support. This endpoint corresponds to OpenAI’s Responses API.

POST /responses
Request body

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | Yes | Model name |
| input | string or Array&lt;Message&gt; | Yes | Input content. Either a string (single text input) or a message array (structured conversation history). Important limitations: messages support only the basic fields (role, content, name); tool_calls, tool_call_id, and other tool-related fields are not supported; the content field is required and cannot be null. To use tools, define them in the top-level tools parameter and the model will call them in its first response. Note: the Responses API has deprecated the messages parameter and uses input uniformly. |
| instructions | string | No | System-level instructions that guide model behavior and response style (similar to a system message) |
| temperature | number | No | Controls output randomness; higher values produce more random output |
| top_p | number | No | Nucleus sampling parameter; controls output diversity |
| max_output_tokens | integer | No | Maximum number of tokens to generate |
| stream | boolean | No | Whether to enable streaming |
| modalities | Array&lt;string (text, audio)&gt; | No | Response modality types |
| tools | Array&lt;ResponseTool&gt; | No | List of available tools (flat format) |
| tool_choice | string (none, auto, required) | No | Tool selection strategy |
| parallel_tool_calls | boolean | No | Whether to enable parallel function calling during tool use. When false, at most one tool is called per response. |
| text | object | No | Text output configuration |
| previous_response_id | string | No | ID of a previous response to continue the conversation from; lets you chain responses together and maintain conversation state. When set, the model automatically has access to all previously produced reasoning items and conversation history. |
| store | boolean | No | Whether to store the generated response for later retrieval via the API. Defaults to true; set to false to disable storage (required for ZDR organizations). |
| background | boolean | No | Whether to run the model response asynchronously in the background. Useful for long-running tasks. |
| reasoning | object | No | Configuration for reasoning models (e.g., o1, o3, gpt-5). Controls how the model uses reasoning tokens to "think" through the problem. |
| truncation | string (auto, disabled) | No | Truncation strategy for the model response. auto: if the input exceeds the context window, truncate by dropping items from the beginning. disabled (default): the request fails with a 400 error if the input exceeds the context window. |
| stop | string | No | Up to 4 sequences at which the API stops generating further tokens |
| metadata | object | No | Additional metadata for tracking and organization purposes |
Basic text input

```json
{
  "model": "gpt-4o-mini",
  "input": "Tell me a joke about programming",
  "instructions": "You are a funny assistant",
  "max_output_tokens": 500,
  "temperature": 0.7
}
```
Message array input

```json
{
  "model": "gpt-4o-mini",
  "input": [
    {
      "role": "user",
      "content": "Hello, how are you?"
    }
  ],
  "instructions": "You are a helpful assistant",
  "max_output_tokens": 1000
}
```
Multi-turn conversation with streaming

```json
{
  "model": "qwen-plus",
  "input": [
    {
      "role": "user",
      "content": "What is artificial intelligence?"
    },
    {
      "role": "assistant",
      "content": "Artificial intelligence (AI) is..."
    },
    {
      "role": "user",
      "content": "Can you give me some examples?"
    }
  ],
  "instructions": "You are a knowledgeable AI tutor",
  "max_output_tokens": 2000,
  "stream": true
}
```
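To consume a streaming response, read the server-sent events from the HTTP body. The sketch below uses Python's requests library; the event names (response.output_text.delta, response.completed) follow OpenAI's Responses streaming format and are an assumption about what this endpoint emits, so verify them against the actual stream.

```python
import json
import requests

url = "https://api.r9s.ai/v1/responses"
headers = {"Authorization": "Bearer YOUR_API_KEY", "Content-Type": "application/json"}
body = {"model": "gpt-4o-mini", "input": "Tell me a story", "stream": True}

# Read the SSE stream line by line and print text deltas as they arrive.
with requests.post(url, json=body, headers=headers, stream=True) as resp:
    resp.raise_for_status()
    for line in resp.iter_lines(decode_unicode=True):
        if not line or not line.startswith("data:"):
            continue  # skip blank keep-alives and "event:" lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":  # some gateways terminate streams this way
            break
        event = json.loads(payload)
        # Event names assume OpenAI's Responses streaming format.
        if event.get("type") == "response.output_text.delta":
            print(event.get("delta", ""), end="", flush=True)
        elif event.get("type") == "response.completed":
            break
```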
Function calling with a single tool

```json
{
  "model": "gpt-4o-mini",
  "input": [
    {
      "role": "user",
      "content": "What's the weather like in San Francisco?"
    }
  ],
  "instructions": "You are a helpful assistant with access to tools",
  "max_output_tokens": 2000,
  "temperature": 0.7,
  "modalities": ["text"],
  "tools": [
    {
      "type": "function",
      "name": "get_weather",
      "description": "Get the current weather in a location",
      "parameters": {
        "type": "object",
        "properties": {
          "location": {
            "type": "string"
          }
        },
        "required": ["location"]
      }
    }
  ]
}
```
Multiple tools with automatic tool selection

```json
{
  "model": "gpt-4o-mini",
  "input": [
    {
      "role": "user",
      "content": "Book a flight from NYC to London on Dec 25th and check the weather there"
    }
  ],
  "instructions": "You are a travel assistant. Use tools to help users with travel planning.",
  "tools": [
    {
      "type": "function",
      "name": "search_flights",
      "description": "Search for available flights",
      "parameters": {
        "type": "object",
        "properties": {
          "origin": {
            "type": "string",
            "description": "Departure city"
          },
          "destination": {
            "type": "string",
            "description": "Arrival city"
          },
          "date": {
            "type": "string",
            "description": "Travel date in YYYY-MM-DD format"
          }
        },
        "required": ["origin", "destination", "date"]
      }
    },
    {
      "type": "function",
      "name": "get_weather",
      "description": "Get weather forecast",
      "parameters": {
        "type": "object",
        "properties": {
          "location": {
            "type": "string"
          },
          "date": {
            "type": "string"
          }
        },
        "required": ["location"]
      }
    }
  ],
  "tool_choice": "auto",
  "max_output_tokens": 3000
}
```
Streaming with tools

```json
{
  "model": "gpt-4o-mini",
  "input": [
    {
      "role": "user",
      "content": "Calculate 15% tip on a $85.50 bill and tell me the total"
    }
  ],
  "instructions": "You are a helpful calculator assistant",
  "stream": true,
  "tools": [
    {
      "type": "function",
      "name": "calculate",
      "description": "Perform mathematical calculations",
      "parameters": {
        "type": "object",
        "properties": {
          "expression": {
            "type": "string",
            "description": "Mathematical expression to evaluate"
          }
        },
        "required": ["expression"]
      }
    }
  ],
  "max_output_tokens": 1000
}
```
Forcing tool use with tool_choice: required

```json
{
  "model": "gpt-4o-mini",
  "input": "Search for recent news about artificial intelligence",
  "instructions": "You must use the search tool to find current information",
  "tools": [
    {
      "type": "function",
      "name": "web_search",
      "description": "Search the web for information",
      "parameters": {
        "type": "object",
        "properties": {
          "query": {
            "type": "string"
          },
          "num_results": {
            "type": "integer"
          }
        },
        "required": ["query"]
      }
    }
  ],
  "tool_choice": "required",
  "max_output_tokens": 2000
}
```
Request with metadata

```json
{
  "model": "gpt-4o-mini",
  "input": "Summarize the key points from our discussion",
  "instructions": "You are a meeting assistant",
  "max_output_tokens": 1500,
  "temperature": 0.5,
  "top_p": 0.9,
  "metadata": {
    "user_id": "user_12345",
    "session_id": "session_abc",
    "conversation_id": "conv_xyz"
  }
}
```
Streaming text generation

```json
{
  "model": "gpt-4o-mini",
  "input": "Write a short poem about the ocean",
  "instructions": "You are a creative poet",
  "stream": true,
  "max_output_tokens": 500,
  "temperature": 0.9
}
```
JSON object output

```json
{
  "model": "gpt-4o-mini",
  "input": "Extract person information and return as JSON: John Smith is 35 years old and works as a software engineer in San Francisco",
  "instructions": "Extract structured data and output in JSON format",
  "text": {
    "format": {
      "type": "json_object"
    }
  },
  "max_output_tokens": 500
}
```
Structured output with a JSON Schema

```json
{
  "model": "gpt-4o-mini",
  "input": "Generate a user profile for software developer Alice Chen in JSON format",
  "instructions": "Create a detailed user profile following the schema",
  "text": {
    "format": {
      "type": "json_schema",
      "name": "user_profile",
      "schema": {
        "type": "object",
        "properties": {
          "name": {
            "type": "string"
          },
          "age": {
            "type": "integer"
          },
          "occupation": {
            "type": "string"
          },
          "location": {
            "type": "string"
          },
          "skills": {
            "type": "array",
            "items": {
              "type": "string"
            }
          }
        },
        "required": ["name", "age", "occupation", "location", "skills"],
        "additionalProperties": false
      },
      "strict": true
    }
  },
  "max_output_tokens": 800
}
```
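Because strict mode constrains generation to the schema, the returned output_text can be parsed and, optionally, re-validated client-side. A minimal sketch, assuming the third-party jsonschema package and the response shape shown under "Successful response" below:

```python
import json
import jsonschema  # third-party: pip install jsonschema

user_profile_schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
        "occupation": {"type": "string"},
        "location": {"type": "string"},
        "skills": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["name", "age", "occupation", "location", "skills"],
    "additionalProperties": False,
}

# resp_json is the parsed response body, e.g. requests.post(...).json();
# the indexing assumes a single message output item, as in the examples below.
profile_text = resp_json["output"][0]["content"][0]["text"]
profile = json.loads(profile_text)
jsonschema.validate(instance=profile, schema=user_profile_schema)  # raises on mismatch
```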
Passing tool results back as user messages

```json
{
  "model": "gpt-4o-mini",
  "input": [
    {
      "role": "user",
      "content": "What's the weather in Tokyo?"
    },
    {
      "role": "assistant",
      "content": "I'll check the weather for you."
    },
    {
      "role": "user",
      "content": "The weather tool returned: temperature 22°C, condition sunny, humidity 60%"
    }
  ],
  "instructions": "You are a helpful assistant. Synthesize tool results naturally.",
  "max_output_tokens": 500
}
```

Chained conversation with previous_response_id

```json
{
  "model": "gpt-4o-mini",
  "input": "Can you elaborate more on the second point?",
  "instructions": "You are a helpful assistant",
  "previous_response_id": "resp_abc123xyz456",
  "max_output_tokens": 1000
}
```
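A sketch of chaining two turns with previous_response_id; this relies on the first response being stored (i.e., store not set to false), since the server replays the earlier conversation state:

```python
import requests

url = "https://api.r9s.ai/v1/responses"
headers = {"Authorization": "Bearer YOUR_API_KEY"}

# First turn: the response is stored server-side by default.
first = requests.post(url, json={
    "model": "gpt-4o-mini",
    "input": "Give me three tips for writing clean code",
}, headers=headers).json()

# Second turn: reference the first response instead of resending history.
second = requests.post(url, json={
    "model": "gpt-4o-mini",
    "input": "Can you elaborate more on the second point?",
    "previous_response_id": first["id"],
}, headers=headers).json()
```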
Background processing

```json
{
  "model": "gpt-4o-mini",
  "input": "Analyze this large dataset and provide insights: [dataset details...]",
  "instructions": "You are a data analyst",
  "background": true,
  "max_output_tokens": 5000,
  "temperature": 0.3
}
```
Reasoning configuration

```json
{
  "model": "gpt-5-codex",
  "input": "A farmer needs to transport a fox, a chicken, and a bag of grain across a river. The boat can only carry the farmer and one item. If left alone, the fox will eat the chicken, and the chicken will eat the grain. How can the farmer get everything across safely?",
  "instructions": "Think through this step by step",
  "reasoning": {
    "effort": "high"
  },
  "max_output_tokens": 3000
}
```
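Reasoning output arrives as items with type reasoning alongside the final message (see ResponseOutputItem below). A sketch for separating the two, assuming the output shape shown in the examples in this document:

```python
def split_output(resp_json):
    """Separate reasoning summaries from the final message text."""
    summaries, texts = [], []
    for item in resp_json.get("output", []):
        if item.get("type") == "reasoning":
            summary = item.get("summary")
            # Per the ResponseOutputItem table, summary can be a string
            # or an array of summary parts.
            if isinstance(summary, list):
                summary = " ".join(str(part) for part in summary)
            if summary:
                summaries.append(summary)
        elif item.get("type") == "message":
            for part in item.get("content", []):
                if part.get("type") == "output_text":
                    texts.append(part.get("text", ""))
    return summaries, "".join(texts)
```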

Successful response

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| id | string | Yes | Unique identifier for the response |
| object | string | Yes | Object type, always "response" |
| created_at | integer | Yes | Unix timestamp when the response was created |
| status | string (in_progress, completed, incomplete, failed, cancelled) | Yes | The status of the response |
| background | boolean | No | Whether the response is running in the background |
| billing | object | No | Billing information |
| completed_at | integer | No | Unix timestamp when the response was completed (null while in progress) |
| error | object | No | Error information if the response failed |
| incomplete_details | object | No | Details about why the response is incomplete |
| instructions | string | No | System-level instructions that guided the model's behavior |
| max_output_tokens | integer | No | Maximum number of tokens to generate |
| max_tool_calls | integer | No | Maximum number of tool calls allowed |
| model | string | Yes | The model used for the response |
| output | Array&lt;ResponseOutputItem&gt; | No | Array of output items produced by the model |
| parallel_tool_calls | boolean | No | Whether parallel tool calls are enabled |
| previous_response_id | string | No | ID of the previous response in a chain |
| prompt_cache_key | string | No | Key for prompt caching |
| prompt_cache_retention | string | No | Prompt cache retention policy |
| reasoning | object | No | Reasoning configuration |
| safety_identifier | string | No | Safety identifier for the response |
| service_tier | string (auto, default) | No | Service tier used |
| store | boolean | No | Whether the response is stored |
| temperature | number | No | Temperature parameter used |
| text | object | No | Text format configuration |
| tool_choice | string (none, auto, required) | No | Tool choice strategy used |
| tools | Array&lt;ResponseTool&gt; | No | Tools that were available |
| top_logprobs | integer | No | Number of top log probabilities |
| top_p | number | No | Top-p sampling parameter used |
| truncation | string (auto, disabled) | No | Truncation strategy used |
| user | string | No | User identifier |
| metadata | object | No | Additional metadata |
| usage | object | No | Usage statistics (null while the response is still in progress) |
Completed text response

```json
{
  "id": "resp_0af56b0fcfec6be7006949f1d2bf7881a1ac983a51aa13d9e9",
  "object": "response",
  "created_at": 1766453714,
  "status": "completed",
  "background": false,
  "billing": {
    "payer": "developer"
  },
  "completed_at": 1766453715,
  "error": null,
  "incomplete_details": null,
  "instructions": "You are a funny assistant",
  "max_output_tokens": 500,
  "max_tool_calls": null,
  "model": "gpt-4o-mini-2024-07-18",
  "output": [
    {
      "id": "msg_0af56b0fcfec6be7006949f1d3675881a1a35d943785fb4b8b",
      "type": "message",
      "status": "completed",
      "content": [
        {
          "type": "output_text",
          "annotations": [],
          "logprobs": [],
          "text": "Why do programmers prefer dark mode?\n\nBecause light attracts bugs!"
        }
      ],
      "role": "assistant"
    }
  ],
  "parallel_tool_calls": true,
  "previous_response_id": null,
  "prompt_cache_key": null,
  "prompt_cache_retention": null,
  "reasoning": {
    "effort": null,
    "summary": null
  },
  "safety_identifier": null,
  "service_tier": "default",
  "store": true,
  "temperature": 0.7,
  "text": {
    "format": {
      "type": "text"
    },
    "verbosity": "medium"
  },
  "tool_choice": "auto",
  "tools": [],
  "top_logprobs": 0,
  "top_p": 1,
  "truncation": "disabled",
  "usage": {
    "input_tokens": 22,
    "input_tokens_details": {
      "cached_tokens": 0
    },
    "output_tokens": 13,
    "output_tokens_details": {
      "reasoning_tokens": 0
    },
    "total_tokens": 35
  },
  "user": null,
  "metadata": {}
}
```
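The generated text lives inside output → message → content → output_text parts. A small helper to extract it, based on the shape above:

```python
def output_text(resp_json):
    """Concatenate all output_text parts from a parsed response body."""
    parts = []
    for item in resp_json.get("output", []):
        if item.get("type") != "message":
            continue
        for part in item.get("content", []):
            if part.get("type") == "output_text":
                parts.append(part.get("text", ""))
    return "".join(parts)
```

For the example above this returns "Why do programmers prefer dark mode?\n\nBecause light attracts bugs!".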
Response containing a function call

```json
{
  "id": "resp_tool_abc123",
  "object": "response",
  "created_at": 1766453800,
  "status": "completed",
  "background": false,
  "billing": {
    "payer": "developer"
  },
  "completed_at": 1766453802,
  "error": null,
  "incomplete_details": null,
  "instructions": "You are a helpful assistant with access to tools",
  "max_output_tokens": 2000,
  "model": "gpt-4o-mini-2024-07-18",
  "output": [
    {
      "id": "msg_tool_output_123",
      "type": "message",
      "status": "completed",
      "content": [
        {
          "type": "output_text",
          "text": "I'll check the weather for you."
        }
      ],
      "role": "assistant"
    },
    {
      "id": "call_weather_001",
      "type": "function_call",
      "status": "completed",
      "call_id": "call_abc123",
      "name": "get_weather",
      "arguments": "{\"location\": \"San Francisco, CA\", \"unit\": \"celsius\"}"
    }
  ],
  "parallel_tool_calls": true,
  "tools": [
    {
      "type": "function",
      "name": "get_weather",
      "description": "Get current weather",
      "parameters": {
        "type": "object",
        "properties": {
          "location": {
            "type": "string"
          },
          "unit": {
            "type": "string"
          }
        }
      }
    }
  ],
  "tool_choice": "auto",
  "temperature": 0.7,
  "usage": {
    "input_tokens": 45,
    "output_tokens": 28,
    "total_tokens": 73
  },
  "metadata": {}
}
```
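When the output contains function_call items, parse the arguments string, execute the function locally, and send the result back. Because input messages do not support tool-related fields, the result goes back as plain user text, as in the "Passing tool results back as user messages" example above. A sketch, where the get_weather implementation is a stand-in for your own code:

```python
import json

def run_tool_calls(resp_json, local_tools):
    """Execute function_call output items and format results as user text."""
    lines = []
    for item in resp_json.get("output", []):
        if item.get("type") != "function_call":
            continue
        args = json.loads(item.get("arguments") or "{}")
        result = local_tools[item["name"]](**args)
        lines.append(f"The {item['name']} tool returned: {json.dumps(result)}")
    return "\n".join(lines)

# Stand-in implementation for illustration only.
local_tools = {
    "get_weather": lambda location, unit="celsius": {"temperature": 22, "condition": "sunny"},
}
# Send the returned string back as the content of a new user message.
```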
Response with JSON output

```json
{
  "id": "resp_json_xyz789",
  "object": "response",
  "created_at": 1766453900,
  "status": "completed",
  "background": false,
  "completed_at": 1766453901,
  "instructions": "Extract structured data",
  "max_output_tokens": 500,
  "model": "gpt-4o-mini-2024-07-18",
  "output": [
    {
      "id": "msg_json_output",
      "type": "message",
      "status": "completed",
      "content": [
        {
          "type": "output_text",
          "text": "{\"name\": \"John Smith\", \"age\": 35, \"occupation\": \"software engineer\", \"location\": \"San Francisco\"}"
        }
      ],
      "role": "assistant"
    }
  ],
  "text": {
    "format": {
      "type": "json_object"
    }
  },
  "usage": {
    "input_tokens": 32,
    "output_tokens": 24,
    "total_tokens": 56
  }
}
```
Incomplete response (token limit reached)

```json
{
  "id": "resp_incomplete_456",
  "object": "response",
  "created_at": 1766454000,
  "status": "incomplete",
  "background": false,
  "completed_at": 1766454002,
  "error": null,
  "incomplete_details": {
    "reason": "max_output_tokens"
  },
  "instructions": "Write a detailed essay",
  "max_output_tokens": 100,
  "model": "gpt-4o-mini-2024-07-18",
  "output": [
    {
      "id": "msg_incomplete",
      "type": "message",
      "status": "incomplete",
      "content": [
        {
          "type": "output_text",
          "text": "Artificial intelligence is a rapidly evolving field that encompasses machine learning, neural networks, and..."
        }
      ],
      "role": "assistant"
    }
  ],
  "usage": {
    "input_tokens": 15,
    "output_tokens": 100,
    "total_tokens": 115
  }
}
```
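When status is incomplete with reason max_output_tokens, the text ends mid-generation. A small check for this case; one recovery option is simply re-sending the request with a larger max_output_tokens:

```python
def was_truncated(resp_json):
    """True when generation stopped because the output token budget ran out."""
    details = resp_json.get("incomplete_details") or {}
    return (resp_json.get("status") == "incomplete"
            and details.get("reason") == "max_output_tokens")
```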
Failed response

```json
{
  "id": "resp_failed_789",
  "object": "response",
  "created_at": 1766454100,
  "status": "failed",
  "background": false,
  "completed_at": null,
  "error": {
    "type": "server_error",
    "message": "Model service temporarily unavailable"
  },
  "instructions": "Answer the question",
  "max_output_tokens": 1000,
  "model": "gpt-4o-mini-2024-07-18",
  "output": [],
  "usage": {
    "input_tokens": 10,
    "output_tokens": 0,
    "total_tokens": 10
  }
}
```
Background response (in progress)

```json
{
  "id": "resp_background_999",
  "object": "response",
  "created_at": 1766454200,
  "status": "in_progress",
  "background": true,
  "completed_at": null,
  "instructions": "Analyze large dataset",
  "max_output_tokens": 5000,
  "model": "gpt-4o-mini-2024-07-18",
  "output": [],
  "usage": null
}
```
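A background response returns immediately with status in_progress and usage null. The sketch below polls until it finishes; it assumes a retrieval endpoint of the form GET /responses/{id}, mirroring OpenAI's API, which is not documented in this section and should be confirmed:

```python
import time
import requests

BASE = "https://api.r9s.ai/v1"
headers = {"Authorization": "Bearer YOUR_API_KEY"}

resp = requests.post(f"{BASE}/responses", json={
    "model": "gpt-4o-mini",
    "input": "Analyze this large dataset and provide insights: [dataset details...]",
    "background": True,
}, headers=headers).json()

# Assumed retrieval endpoint: GET /responses/{id} (mirrors OpenAI's API).
while resp.get("status") == "in_progress":
    time.sleep(2)
    resp = requests.get(f"{BASE}/responses/{resp['id']}", headers=headers).json()

print(resp["status"], resp.get("usage"))
```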
JavaScript

```javascript
const response = await fetch('https://api.r9s.ai/v1/responses', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    "model": "gpt-4o-mini",
    "input": "Tell me a joke about programming",
    "instructions": "You are a funny assistant",
    "max_output_tokens": 500,
    "temperature": 0.7
  })
});

const data = await response.json();
console.log(data);
```
Python

```python
import requests

url = "https://api.r9s.ai/v1/responses"
headers = {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json"
}

response = requests.post(url, json={
    "model": "gpt-4o-mini",
    "input": "Tell me a joke about programming",
    "instructions": "You are a funny assistant",
    "max_output_tokens": 500,
    "temperature": 0.7
}, headers=headers)

data = response.json()
print(data)
```
cURL

```bash
curl -X POST "https://api.r9s.ai/v1/responses" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-4o-mini","input":"Tell me a joke about programming","instructions":"You are a funny assistant","max_output_tokens":500,"temperature":0.7}'
```
Types

Message

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| role | string (system, user, assistant, tool) | Yes | Message role |
| content | object | No | Message content. user/system messages: required; contains text or multimodal content. assistant messages: optional when tool_calls is present; may be null or omitted. tool messages: required; contains the tool's return value (usually a JSON string). Important: in the /v1/responses API the content field must be present and cannot be null; in /v1/chat/completions, content may be null when tool_calls is present. |
| name | string | No | Sender name |
| reasoning_content | string | No | Reasoning content |
| tool_calls | Array&lt;ToolCall&gt; | No | List of tool calls |
| tool_call_id | string | No | Tool call ID |
Content part

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| type | string (text, image_url) | Yes | Content part type |
| text | string | No | Text content (for type text) |
| image_url | object | No | Image reference (for type image_url) |
Image URL

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| url | string | Yes | Image URL |
| detail | string (auto, low, high) | No | Image detail level |
ToolCall

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| id | string | Yes | Tool call ID |
| type | string | Yes | Tool call type |
| function | object | Yes | Function invocation details |
Function

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| name | string | Yes | Function name |
| arguments | string | No | Function arguments (JSON string) |
ResponseTool (flat tool format)

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| type | string (function) | Yes | Tool type; currently only function is supported |
| name | string | Yes | Function name |
| description | string | No | Function description; helps the model decide when to call the function |
| parameters | object | No | Function parameter definition in JSON Schema format |
Tool (nested tool format)

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| type | string | Yes | Tool type |
| function | object | Yes | Function definition |
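For contrast, the two tool shapes side by side as Python dicts: /v1/responses expects the flat ResponseTool shape, while the nested Tool shape wraps the definition in a function object (the style used by /v1/chat/completions):

```python
# Flat format (ResponseTool) — used by the /v1/responses `tools` parameter.
response_tool = {
    "type": "function",
    "name": "get_weather",
    "description": "Get the current weather in a location",
    "parameters": {"type": "object", "properties": {"location": {"type": "string"}}},
}

# Nested format (Tool) — the definition sits under a `function` key.
nested_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather in a location",
        "parameters": {"type": "object", "properties": {"location": {"type": "string"}}},
    },
}
```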
ResponseOutputItem

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| id | string | Yes | Unique identifier for the output item |
| type | string (message, function_call, reasoning) | Yes | Output type: message for the final response, function_call for tool calls, reasoning for a reasoning trace |
| status | string (completed, in_progress, incomplete) | No | Output status |
| role | string (user, assistant) | No | Message role |
| content | Array | No | Content array |
| call_id | string | No | Function call ID |
| name | string | No | Function name |
| arguments | string | No | Function arguments (JSON string) |
| output | string | No | Function output |
| summary | string or array | No | Natural-language summary of reasoning (for type reasoning) |
| encrypted_content | object | No | Encrypted reasoning tokens for stateless workflows (for type reasoning) |
Usage

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| input_tokens | integer | Yes | Number of tokens in the input |
| input_tokens_details | object | No | Details about input tokens (e.g., cached_tokens) |
| output_tokens | integer | Yes | Number of tokens in the output |
| output_tokens_details | object | No | Details about output tokens (e.g., reasoning_tokens) |
| total_tokens | integer | Yes | Total number of tokens (input + output) |
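A small helper for logging usage, including the cached and reasoning token details seen in the response examples above; both detail objects may be absent, and usage is null while a response is still in progress:

```python
def log_usage(resp_json):
    """Print a one-line token summary for a parsed response body."""
    usage = resp_json.get("usage")
    if usage is None:  # response still in progress
        return
    cached = (usage.get("input_tokens_details") or {}).get("cached_tokens", 0)
    reasoning = (usage.get("output_tokens_details") or {}).get("reasoning_tokens", 0)
    print(f"in={usage['input_tokens']} (cached={cached}) "
          f"out={usage['output_tokens']} (reasoning={reasoning}) "
          f"total={usage['total_tokens']}")
```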