Count Tokens
Count the input tokens for an Anthropic-style message request before creating it. Use this route to estimate cost, validate limits, and account for tools, images, documents, and system prompts.
Endpoint
Use the Anthropic-compatible Messages token counting path with the Velrix Anthropic base URL.
Method
Count tokens in a message without creating the message.
POST /v1/messages/count_tokens
SDK method
Anthropic TypeScript beta client method.
client.beta.messages.countTokens(params, options?)
Base URL
Anthropic-compatible Velrix endpoint.
https://api.velrix.ai
Authentication
Headers
The official route is beta-scoped and uses Anthropic-compatible authentication headers.
x-api-key
Scoped Velrix API key.
$VELRIX_API_KEY
anthropic-version
Anthropic API version header.
2023-06-01
anthropic-beta
Optional beta feature header when a route or feature requires explicit beta enrollment.
beta-feature-name
Content-Type
Requests and responses use JSON.
application/json
Schema
Body parameters
These fields mirror the official beta Messages count-tokens shape. Velrix forwards compatible fields to the selected upstream route when supported.
model (Required)
Model that will be used for token accounting. Use claude-sonnet-4-6 for policy routing, or pass a catalog model ID.
messages (Required)
Input messages to count. Content can include text and supported Anthropic beta content blocks.
system (Optional)
System prompt included in the token count. Anthropic treats this as a top-level parameter.
tools / tool_choice (Optional)
Tool definitions and tool-use policy included in token accounting.
cache_control (Optional)
Top-level cache control that applies a cache marker to the last cacheable block in the request.
context_management (Optional)
Context management configuration. The response can report original input tokens when context management is applied.
mcp_servers (Optional)
MCP server definitions that should be considered for this beta request.
output_config (Optional)
Configuration for output behavior, such as format, effort, and task budget.
thinking (Optional)
Extended thinking configuration for models that support it.
speed (Optional)
Inference speed mode for compatible routes, such as standard or fast.
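As a rough illustration of how the endpoint, headers, and body parameters above fit together, the following TypeScript sketch assembles a count-tokens request from the parameters you would otherwise send to message creation. The helper name `buildCountTokensRequest`, the `CountTokensRequest` shape, and the list of generation-only fields are assumptions made for this example, not part of the Velrix or Anthropic API. Joining several beta feature names with commas is likewise assumed to be accepted by the route.

```typescript
// Sketch only: helper names and the generation-only field list are assumptions.
type JsonObject = Record<string, unknown>;

interface CountTokensRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Fields assumed to affect generation only, so they are dropped before counting.
const GENERATION_ONLY_FIELDS: string[] = [
  "max_tokens",
  "temperature",
  "top_p",
  "top_k",
  "stop_sequences",
  "stream",
];

function buildCountTokensRequest(
  createParams: JsonObject,
  apiKey: string,
  betaFeatures: string[] = [],
  baseUrl: string = "https://api.velrix.ai"
): CountTokensRequest {
  // `model` and `messages` are the two required body parameters.
  if (
    typeof createParams["model"] !== "string" ||
    !Array.isArray(createParams["messages"])
  ) {
    throw new Error("count_tokens requires `model` and `messages`");
  }

  // Copy every field except the ones assumed to be generation-only.
  const body: JsonObject = {};
  for (const key of Object.keys(createParams)) {
    if (GENERATION_ONLY_FIELDS.indexOf(key) !== -1) continue;
    body[key] = createParams[key];
  }

  const headers: Record<string, string> = {
    "x-api-key": apiKey,
    "anthropic-version": "2023-06-01",
    "Content-Type": "application/json",
  };
  // The beta header is optional; send it only when features are requested.
  if (betaFeatures.length > 0) {
    headers["anthropic-beta"] = betaFeatures.join(",");
  }

  return {
    url: baseUrl + "/v1/messages/count_tokens",
    method: "POST",
    headers: headers,
    body: JSON.stringify(body),
  };
}
```

The function only builds the request, so it can be handed to whichever HTTP client you already use. Keeping tools, system, and the other counted fields untouched means the count reflects the request you actually intend to send.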
Request
Request example
Send the same payload you plan to use when creating the message, minus generation-only fields such as max_tokens.
curl https://api.velrix.ai/v1/messages/count_tokens \
  -H "x-api-key: $VELRIX_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-6",
    "system": "Answer with concise operational detail.",
    "messages": [
      {
        "role": "user",
        "content": "Hello, world"
      }
    ],
    "tools": [
      {
        "name": "lookup_incident",
        "description": "Look up an incident by ID.",
        "input_schema": {
          "type": "object",
          "properties": {
            "incident_id": {
              "type": "string"
            }
          },
          "required": ["incident_id"]
        }
      }
    ]
  }'

import Anthropic from "@anthropic-ai/sdk";
const client = new Anthropic({
  apiKey: process.env.VELRIX_API_KEY,
  baseURL: "https://api.velrix.ai",
});
const betaMessageTokensCount = await client.beta.messages.countTokens({
  messages: [{ content: "Hello, world", role: "user" }],
  model: "claude-sonnet-4-6",
});
console.log(betaMessageTokensCount.context_management);
console.log(betaMessageTokensCount.input_tokens);

Response
Response 200
The official response returns token totals and optional context-management accounting.
{
  "context_management": {
    "original_input_tokens": 0
  },
  "input_tokens": 2095
}
Operations
Usage notes
Use token counting before expensive or user-controlled requests so limits and routing choices are explicit.
Preflight budgets
Count tokens before creating a message when prompts include large files, images, documents, tool definitions, or system instructions.
Tool-aware counting
Include the same tool definitions you plan to send with the message so the token count reflects the real request.
Route-specific totals
Tokenization depends on the routed model. Pin a catalog model when exact counts matter for admission control.
Same key scope
The count request uses the same Velrix key, quota, and logging path as Anthropic-compatible message creation.
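The preflight and admission-control notes above could be applied with a small check like the following TypeScript sketch. It takes the count-tokens response, subtracts the counted input and a reserved output allowance from a token limit, and reports whether the request fits. The names `preflight`, `PreflightDecision`, `tokenLimit`, and `reservedOutputTokens` are hypothetical, and the limit values in the usage lines are placeholders rather than documented limits for any model or route.

```typescript
// Sketch only: names and limits here are hypothetical, not part of the API.
interface CountTokensResult {
  input_tokens: number;
  context_management?: { original_input_tokens: number } | null;
}

interface PreflightDecision {
  admit: boolean;
  inputTokens: number;
  remainingTokens: number;
}

function preflight(
  result: CountTokensResult,
  tokenLimit: number,
  reservedOutputTokens: number = 0
): PreflightDecision {
  const counted = result.input_tokens;
  // Reject malformed counts instead of silently admitting the request.
  if (typeof counted !== "number" || counted < 0 || Math.floor(counted) !== counted) {
    throw new Error("input_tokens must be a non-negative integer");
  }
  // Tokens left after the input and the reserved output allowance.
  const remainingTokens = tokenLimit - reservedOutputTokens - counted;
  return {
    admit: remainingTokens >= 0,
    inputTokens: counted,
    remainingTokens: remainingTokens,
  };
}

// Usage with the sample response from this page and placeholder limits.
const sample: CountTokensResult = {
  context_management: { original_input_tokens: 0 },
  input_tokens: 2095,
};
const roomy = preflight(sample, 200000, 4096); // admit: true, remainingTokens: 193809
const tight = preflight(sample, 2000);         // admit: false, remainingTokens: -95
```

Because tokenization depends on the routed model, a check like this is only as exact as the model used for the count, which is why pinning a catalog model matters when the decision gates admission.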