Chat Completion
Send OpenAI-compatible chat completion requests through Velrix and let gateway policy choose the provider route, fallback behavior, and operational controls.
Endpoint
Use the standard OpenAI chat completions path with the Velrix base URL.
Method
Create a chat completion.
POST /v1/chat/completions
Base URL
OpenAI-compatible Velrix endpoint.
https://api.velrix.ai/v1
Authentication
Headers
Velrix follows OpenAI-compatible bearer-token authentication for this route.
Authorization
Scoped Velrix API key.
Bearer $VELRIX_API_KEY
Content-Type
Requests and responses use JSON.
application/json
Schema
Body parameters
These are the primary fields most clients need. Velrix forwards compatible fields to the selected upstream route when supported by that route.
model (Required)
Model ID to route. Use gpt-5.4 for policy routing or a catalog model ID for pinned routing.
messages (Required)
Conversation messages. Supported roles include developer, system, user, assistant, and tool; content can be text or supported multimodal content parts.
max_completion_tokens (Optional)
Upper bound for generated tokens, including visible output and reasoning tokens. Prefer this over legacy max_tokens for compatible models.
temperature / top_p (Optional)
Sampling controls. Tune one of the two rather than both for predictable production behavior.
stream (Optional)
When true, return server-sent event chunks as output is generated.
stream_options (Optional)
Options for streamed responses, including whether to include usage before the data: [DONE] message.
tools / tool_choice (Optional)
Define callable tools and control whether the model may, must, or must not call them.
response_format (Optional)
Constrain output format, including JSON object mode or JSON schema structured outputs when supported.
logprobs / top_logprobs (Optional)
Request token log probabilities and the most likely alternative tokens when supported.
modalities / audio (Optional)
Request text and, for compatible models, audio output with voice and format settings.
web_search_options (Optional)
Configure built-in web search behavior for models and routes that support it.
metadata (Optional)
Attach key-value metadata for filtering, tracing, or operational bookkeeping.
service_tier (Optional)
Latency tier preference for platforms that support service tier selection.
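As a rough illustration of how the required and optional fields above fit together, the sketch below assembles a request body in Python. The helper name build_chat_body and its validation rules are assumptions made for this example, not part of a Velrix SDK; it only checks the two required fields and the role names listed above.

```python
import json

# Roles listed in the messages parameter description above.
ALLOWED_ROLES = {"developer", "system", "user", "assistant", "tool"}


def build_chat_body(model, messages, **optional):
    """Assemble a chat completion request body (hypothetical helper)."""
    if not model:
        raise ValueError("model is required")
    if not messages:
        raise ValueError("messages must contain at least one message")
    for message in messages:
        if message.get("role") not in ALLOWED_ROLES:
            raise ValueError(f"unsupported role: {message.get('role')!r}")
    body = {"model": model, "messages": list(messages)}
    # Drop optionals left as None so only fields the caller chose are sent.
    body.update({key: value for key, value in optional.items() if value is not None})
    return body


body = build_chat_body(
    "gpt-5.4",
    [{"role": "user", "content": "Summarize this incident timeline."}],
    temperature=0.2,
    top_p=None,  # left unset: only one sampling control is tuned
    max_completion_tokens=512,
    metadata={"service": "dashboard-docs"},
)
print(json.dumps(body, indent=2))
```

Omitting unset optionals keeps the body small and avoids sending fields that a given upstream route may not support.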
Request
Request example
Provide a model and an ordered message list. Use gpt-5.4 for policy routing or a catalog model ID for pinned routing.
curl https://api.velrix.ai/v1/chat/completions \
  -H "Authorization: Bearer $VELRIX_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.4",
    "messages": [
      {
        "role": "developer",
        "content": "Answer with concise operational detail."
      },
      {
        "role": "user",
        "content": "Summarize this incident timeline."
      }
    ],
    "temperature": 0.2,
    "max_completion_tokens": 512,
    "metadata": {
      "service": "dashboard-docs"
    }
  }'

curl https://api.velrix.ai/v1/chat/completions \
  -H "Authorization: Bearer $VELRIX_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.4",
    "messages": [
      {
        "role": "user",
        "content": "Return a JSON object with risk and mitigation."
      }
    ],
    "response_format": {
      "type": "json_object"
    },
    "stream": true,
    "stream_options": {
      "include_usage": true
    }
  }'
Messages
OpenAI documents developer messages as higher-priority instructions for newer models; user messages carry end-user input.
Authorization
Authenticate with a scoped Velrix API key in the bearer token.
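For clients that do not shell out to curl, the same request can be prepared with the Python standard library. This is a minimal sketch, assuming the base URL and headers documented above; build_request is a name invented for this example, and the request is only constructed, not sent.

```python
import json
import os
import urllib.request

BASE_URL = "https://api.velrix.ai/v1"


def build_request(body, api_key):
    """Return a prepared POST request for /chat/completions without sending it."""
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_request(
    {
        "model": "gpt-5.4",
        "messages": [
            {"role": "developer", "content": "Answer with concise operational detail."},
            {"role": "user", "content": "Summarize this incident timeline."},
        ],
        "temperature": 0.2,
        "max_completion_tokens": 512,
    },
    # Falls back to a placeholder so the sketch runs without a real key.
    api_key=os.environ.get("VELRIX_API_KEY", "placeholder-key"),
)
print(request.get_method(), request.full_url)

# To send it (requires network access and a valid key):
# with urllib.request.urlopen(request, timeout=30) as response:
#     completion = json.load(response)
```

Reading the key from the VELRIX_API_KEY environment variable mirrors the curl examples and keeps the secret out of source files.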
Response
Response shape
Responses follow the OpenAI-compatible chat completion shape so existing clients can parse choices and usage metadata.
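A client might read the choices and usage fields along these lines. The sketch is illustrative only: summarize_completion is a hypothetical helper, and the embedded sample is a trimmed version of the response body shown in this section with a placeholder id.

```python
import json

# Trimmed copy of the documented response shape; the id is a placeholder.
sample = json.loads("""
{
  "id": "chatcmpl-example",
  "object": "chat.completion",
  "model": "gpt-5.5",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I assist you today?",
        "refusal": null
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 19, "completion_tokens": 10, "total_tokens": 29}
}
""")


def summarize_completion(completion):
    """Pull the first choice's text, finish reason, and token usage."""
    choice = completion["choices"][0]
    usage = completion.get("usage") or {}
    return {
        "text": choice["message"].get("content"),
        "finish_reason": choice.get("finish_reason"),
        "total_tokens": usage.get("total_tokens"),
    }


summary = summarize_completion(sample)
print(summary)
```

Using .get for optional fields keeps the parser tolerant of routes that omit parts of the usage block.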
{
  "id": "chatcmpl-B9MBs8CjcvOU2jLn4n570S5qMJKcT",
  "object": "chat.completion",
  "created": 1741569952,
  "model": "gpt-5.5",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I assist you today?",
        "refusal": null,
        "annotations": []
      },
      "logprobs": null,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 19,
    "completion_tokens": 10,
    "total_tokens": 29,
    "prompt_tokens_details": {
      "cached_tokens": 0,
      "audio_tokens": 0
    },
    "completion_tokens_details": {
      "reasoning_tokens": 0,
      "audio_tokens": 0,
      "accepted_prediction_tokens": 0,
      "rejected_prediction_tokens": 0
    }
  },
  "service_tier": "default"
}
Advanced
Streaming and tools
OpenAI's official reference documents streaming options and tool choice controls for chat completions.
Streaming
Set stream: true to receive server-sent event chunks. Set stream_options.include_usage to true when you need a usage block in the stream before the final data: [DONE] message.
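The sketch below shows one way a client could consume such a stream: it reads data: lines, concatenates the content deltas, and keeps the usage block if one arrives. It assumes Velrix mirrors the OpenAI chunk format, where the usage chunk carries an empty choices array; collect_stream is a hypothetical helper, and the sample events are hand-written for illustration rather than captured from a live response.

```python
import json


def collect_stream(lines):
    """Accumulate streamed text and the usage block from SSE lines."""
    text_parts = []
    usage = None
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            delta = choice.get("delta", {})
            if delta.get("content"):
                text_parts.append(delta["content"])
        if chunk.get("usage"):
            usage = chunk["usage"]
    return "".join(text_parts), usage


# Hand-written sample events (assumed format, not real gateway output).
sample_events = [
    'data: {"choices": [{"index": 0, "delta": {"role": "assistant", "content": ""}}], "usage": null}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "Hel"}}], "usage": null}',
    'data: {"choices": [{"index": 0, "delta": {"content": "lo"}}], "usage": null}',
    'data: {"choices": [], "usage": {"prompt_tokens": 19, "completion_tokens": 2, "total_tokens": 21}}',
    "data: [DONE]",
]

text, usage = collect_stream(sample_events)
print(text, usage)
```

In a real client the lines would come from the HTTP response body read line by line instead of a list.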
Tool choice
Use tools and tool_choice to let the model call functions automatically, require a call, or force a named function. Tool support depends on the selected route.
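The three behaviors described above map onto tool_choice values in the OpenAI chat completions format: "auto", "required", "none", or an object naming a function. The following sketch builds those values; the lookup_incident tool and the tool_choice_for helper are hypothetical names used only for illustration, and whether a given Velrix route honors them depends on that route.

```python
import json

# Hypothetical tool definition; the name and parameters are illustrative only.
lookup_incident_tool = {
    "type": "function",
    "function": {
        "name": "lookup_incident",
        "description": "Fetch an incident record by ID.",
        "parameters": {
            "type": "object",
            "properties": {"incident_id": {"type": "string"}},
            "required": ["incident_id"],
        },
    },
}


def tool_choice_for(mode, function_name=None):
    """Translate a mode into a tool_choice value (hypothetical helper)."""
    if mode in ("auto", "required", "none"):
        return mode  # may call, must call, or must not call a tool
    if mode == "named":
        if not function_name:
            raise ValueError("function_name is required for mode 'named'")
        return {"type": "function", "function": {"name": function_name}}
    raise ValueError(f"unknown mode: {mode!r}")


body = {
    "model": "gpt-5.4",
    "messages": [{"role": "user", "content": "Look up incident INC-1."}],
    "tools": [lookup_incident_tool],
    "tool_choice": tool_choice_for("named", "lookup_incident"),
}
print(json.dumps(body["tool_choice"]))
```

Swapping the mode for "auto" leaves the decision to the model, while "none" keeps the tool definitions in the request but prevents any call.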