Chat Completion

Send OpenAI-compatible chat completion requests through Velrix and let gateway policy choose the provider route, fallback behavior, and operational controls.

Endpoint

Use the standard OpenAI chat completions path with the Velrix base URL.

Method

Create a chat completion.

POST /v1/chat/completions

Base URL

OpenAI-compatible Velrix endpoint.

https://api.velrix.ai/v1

Authentication

Headers

Velrix follows OpenAI-compatible bearer-token authentication for this route.

Authorization

Scoped Velrix API key.

Bearer $VELRIX_API_KEY

Content-Type

Requests and responses use JSON.

application/json
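
As a minimal sketch of the endpoint and headers above, the following Python builds (but does not send) a request using only the standard library. The helper name build_request and the placeholder key are assumptions for illustration, not part of the Velrix API.

```python
import json
import urllib.request

BASE_URL = "https://api.velrix.ai/v1"  # base URL from this page


def build_request(api_key: str, body: dict) -> urllib.request.Request:
    """Build a POST request for /chat/completions without sending it."""
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_request(
    "example-key",  # placeholder; load a real key from your secret store
    {"model": "gpt-5.4", "messages": [{"role": "user", "content": "ping"}]},
)
print(req.get_method(), req.full_url)
```

Passing the request to urllib.request.urlopen would send it. Keeping construction separate from sending makes the headers easy to inspect in tests.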

Schema

Body parameters

These are the fields most clients need. Velrix forwards each field to the selected upstream route only when that route supports it.

model

Required

Model ID to route. Use gpt-5.4 for policy routing or a catalog model ID for pinned routing.

messages

Required

Conversation messages. Supported roles include developer, system, user, assistant, and tool; content can be text or supported multimodal content parts.

max_completion_tokens

Optional

Upper bound for generated tokens, including visible output and reasoning tokens. Prefer this over legacy max_tokens for compatible models.

temperature / top_p

Optional

Sampling controls. For predictable production behavior, adjust either temperature or top_p, not both.

stream

Optional

When true, return server-sent event chunks as output is generated.

stream_options

Optional

Options for streamed responses. Set include_usage to true to receive a usage chunk before the final data: [DONE] message.

tools / tool_choice

Optional

Define callable tools and control whether the model may, must, or must not call them.

response_format

Optional

Constrain output format, including JSON object mode or JSON schema structured outputs when supported.

logprobs / top_logprobs

Optional

Request token log probabilities and the most likely alternative tokens when supported.

modalities / audio

Optional

Request text and, for compatible models, audio output with voice and format settings.

web_search_options

Optional

Configure built-in web search behavior for models and routes that support it.

metadata

Optional

Attach key-value metadata for filtering, tracing, or operational bookkeeping.

service_tier

Optional

Latency tier preference for platforms that support service tier selection.
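
The parameters above can be assembled with a small helper. This is a sketch under stated assumptions: build_body is a hypothetical name, and the client-side checks (non-empty messages, role whitelist) are illustrative rather than documented Velrix validation rules.

```python
import json

# Roles listed on this page for the messages field.
ALLOWED_ROLES = {"developer", "system", "user", "assistant", "tool"}


def build_body(model: str, messages: list, **optional) -> dict:
    """Assemble a request body, checking the two required fields."""
    if not model:
        raise ValueError("model is required")
    if not messages:
        raise ValueError("messages is required and must be non-empty")
    for message in messages:
        if message.get("role") not in ALLOWED_ROLES:
            raise ValueError(f"unsupported role: {message.get('role')!r}")
    body = {"model": model, "messages": messages}
    # Drop optional fields left as None so they are not sent at all.
    body.update({k: v for k, v in optional.items() if v is not None})
    return body


body = build_body(
    "gpt-5.4",
    [
        {"role": "developer", "content": "Answer with concise operational detail."},
        {"role": "user", "content": "Summarize this incident timeline."},
    ],
    temperature=0.2,
    max_completion_tokens=512,
    top_p=None,  # omitted from the body
)
print(json.dumps(body, indent=2))
```

Omitting unset optional fields, rather than sending null, leaves defaults to the selected route.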

Request

Request example

Provide a model and an ordered message list. Use gpt-5.4 for policy routing or a catalog model ID for pinned routing.

Shell
curl https://api.velrix.ai/v1/chat/completions \
  -H "Authorization: Bearer $VELRIX_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.4",
    "messages": [
      {
        "role": "developer",
        "content": "Answer with concise operational detail."
      },
      {
        "role": "user",
        "content": "Summarize this incident timeline."
      }
    ],
    "temperature": 0.2,
    "max_completion_tokens": 512,
    "metadata": {
      "service": "dashboard-docs"
    }
  }'

Shell (structured and streamed)
curl https://api.velrix.ai/v1/chat/completions \
  -H "Authorization: Bearer $VELRIX_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.4",
    "messages": [
      {
        "role": "user",
        "content": "Return a JSON object with risk and mitigation."
      }
    ],
    "response_format": {
      "type": "json_object"
    },
    "stream": true,
    "stream_options": {
      "include_usage": true
    }
  }'

Messages

OpenAI documents developer messages as higher-priority instructions for newer models; user messages carry end-user input.

Authorization

Authenticate with a scoped Velrix API key in the bearer token.

Response

Response shape

Responses follow the OpenAI-compatible chat completion shape so existing clients can parse choices and usage metadata.

JSON
{
  "id": "chatcmpl-B9MBs8CjcvOU2jLn4n570S5qMJKcT",
  "object": "chat.completion",
  "created": 1741569952,
  "model": "gpt-5.5",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I assist you today?",
        "refusal": null,
        "annotations": []
      },
      "logprobs": null,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 19,
    "completion_tokens": 10,
    "total_tokens": 29,
    "prompt_tokens_details": {
      "cached_tokens": 0,
      "audio_tokens": 0
    },
    "completion_tokens_details": {
      "reasoning_tokens": 0,
      "audio_tokens": 0,
      "accepted_prediction_tokens": 0,
      "rejected_prediction_tokens": 0
    }
  },
  "service_tier": "default"
}
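
A client might read the fields it needs from this shape as sketched below. The summarize helper is a hypothetical name, and the sample is a trimmed copy of the response example above.

```python
import json

# Trimmed copy of the response example on this page.
raw = """
{
  "id": "chatcmpl-B9MBs8CjcvOU2jLn4n570S5qMJKcT",
  "object": "chat.completion",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I assist you today?",
        "refusal": null
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 19,
    "completion_tokens": 10,
    "total_tokens": 29,
    "completion_tokens_details": {"reasoning_tokens": 0}
  }
}
"""


def summarize(response: dict) -> dict:
    """Pull the first choice and the usage counters out of a completion."""
    choice = response["choices"][0]
    usage = response.get("usage") or {}
    details = usage.get("completion_tokens_details") or {}
    return {
        "text": choice["message"]["content"],
        "finish_reason": choice["finish_reason"],
        "total_tokens": usage.get("total_tokens"),
        "reasoning_tokens": details.get("reasoning_tokens"),
    }


summary = summarize(json.loads(raw))
print(summary["text"])
```

The usage lookups use defaults because not every route is guaranteed to return the detailed token breakdown.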

Advanced

Streaming and tools

OpenAI's official reference documents streaming options and tool choice controls for chat completions.

Streaming

Set stream to true to receive server-sent event chunks. Set stream_options.include_usage to true when you need a usage block before the final data: [DONE] message.
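
The sketch below shows one way to consume such a stream, assuming the OpenAI-style chunk layout in which each data: line carries a JSON chunk with choices[].delta.content and the usage chunk arrives with an empty choices list. The collect_stream helper and the sample lines are hand-written illustrations, not captured Velrix output.

```python
import json


def collect_stream(lines):
    """Accumulate text and usage from an iterable of SSE lines."""
    text_parts = []
    usage = None
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and SSE comments
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        if chunk.get("usage"):
            usage = chunk["usage"]
        for choice in chunk.get("choices", []):
            piece = choice.get("delta", {}).get("content")
            if piece:
                text_parts.append(piece)
    return "".join(text_parts), usage


# Illustrative stream; real chunks carry more fields (id, model, created).
sample = [
    'data: {"choices":[{"index":0,"delta":{"role":"assistant","content":""}}],"usage":null}',
    "",
    'data: {"choices":[{"index":0,"delta":{"content":"Hel"}}],"usage":null}',
    "",
    'data: {"choices":[{"index":0,"delta":{"content":"lo"}}],"usage":null}',
    "",
    'data: {"choices":[],"usage":{"prompt_tokens":19,"completion_tokens":10,"total_tokens":29}}',
    "",
    "data: [DONE]",
]

text, usage = collect_stream(sample)
print(text, usage)
```

In practice the lines would come from the HTTP response body, decoded line by line.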

Tool choice

Use tools and tool_choice to let the model call functions automatically, require a call, or force a named function. Tool support depends on the selected route.
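
As a sketch of the modes described above, the following maps each one to an OpenAI-style tool_choice value. The lookup_incident tool and the tool_choice_for helper are hypothetical and exist only for illustration.

```python
import json

# Hypothetical function tool used only for illustration.
lookup_incident = {
    "type": "function",
    "function": {
        "name": "lookup_incident",
        "description": "Fetch an incident record by ID.",
        "parameters": {
            "type": "object",
            "properties": {"incident_id": {"type": "string"}},
            "required": ["incident_id"],
        },
    },
}


def tool_choice_for(mode, name=None):
    """Return a tool_choice value for auto, required, none, or named."""
    if mode == "named":
        if not name:
            raise ValueError("a function name is required for mode 'named'")
        return {"type": "function", "function": {"name": name}}
    if mode in {"auto", "required", "none"}:
        return mode  # these modes are sent as plain strings
    raise ValueError(f"unknown mode: {mode!r}")


body = {
    "model": "gpt-5.4",
    "messages": [{"role": "user", "content": "Look up incident INC-1."}],
    "tools": [lookup_incident],
    "tool_choice": tool_choice_for("named", "lookup_incident"),
}
print(json.dumps(body["tool_choice"]))
```

Here auto lets the model decide, required demands some tool call, none forbids calls, and named forces the given function.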