Count Tokens

Count the input tokens for an Anthropic-style message request before creating it. Use this route to estimate cost, validate limits, and account for tools, images, documents, and system prompts.

Endpoint

Send requests to the Anthropic-compatible Messages token counting path on the Velrix base URL.

Method

Count tokens in a message without creating the message.

POST /v1/messages/count_tokens

SDK method

Anthropic TypeScript beta client method.

client.beta.messages.countTokens(params, options?)

Base URL

Anthropic-compatible Velrix endpoint.

https://api.velrix.ai

Authentication

Headers

The official route is beta-scoped and uses Anthropic-compatible authentication headers.

x-api-key

Scoped Velrix API key.

$VELRIX_API_KEY

anthropic-version

Anthropic API version header.

2023-06-01

anthropic-beta

Optional beta feature header when a route or feature requires explicit beta enrollment.

beta-feature-name

Content-Type

Requests and responses use JSON.

application/json
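As a minimal sketch of how the headers above fit together, the helper below assembles them for a count request. The names `buildCountTokensHeaders`, `VELRIX_BASE_URL`, and `COUNT_TOKENS_PATH` are illustrative and not part of any SDK, and joining several beta feature names with commas is an assumption based on how Anthropic-style clients usually send `anthropic-beta`.

```typescript
// Sketch only: assembles the documented headers for POST /v1/messages/count_tokens.
const VELRIX_BASE_URL = "https://api.velrix.ai";
const COUNT_TOKENS_PATH = "/v1/messages/count_tokens";

// Full URL for the count-tokens route.
const countTokensUrl = `${VELRIX_BASE_URL}${COUNT_TOKENS_PATH}`;

function buildCountTokensHeaders(
  apiKey: string,
  betaFeatures: string[] = [], // hypothetical list of beta feature names
): Record<string, string> {
  if (!apiKey) {
    throw new Error("A Velrix API key is required");
  }
  const headers: Record<string, string> = {
    "x-api-key": apiKey,
    "anthropic-version": "2023-06-01",
    "Content-Type": "application/json",
  };
  // anthropic-beta is optional, so send it only when a feature needs it.
  if (betaFeatures.length > 0) {
    headers["anthropic-beta"] = betaFeatures.join(",");
  }
  return headers;
}
```

The returned object can be passed as the `headers` option of whatever HTTP client you use. When you use the Anthropic SDK, the client sets the key and version headers itself.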

Schema

Body parameters

These fields mirror the official beta Messages count-tokens shape. Velrix forwards compatible fields to the selected upstream route when supported.

model

Required

Model that will be used for token accounting. Use claude-sonnet-4-6 for policy routing or a catalog model ID.

messages

Required

Input messages to count. Content can include text and supported Anthropic beta content blocks.

system

Optional

System prompt included in the token count. Anthropic treats this as a top-level parameter.

tools / tool_choice

Optional

Tool definitions and tool-use policy included in token accounting.

cache_control

Optional

Top-level cache control that applies a cache marker to the last cacheable block in the request.

context_management

Optional

Context management configuration. The response can report original input tokens when context management is applied.

mcp_servers

Optional

MCP server definitions that should be considered for this beta request.

output_config

Optional

Configuration for output behavior, such as format, effort, and task budget.

thinking

Optional

Extended thinking configuration for models that support it.

speed

Optional

Inference speed mode for compatible routes, such as standard or fast.
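The parameter list above can be sketched as a TypeScript type with a small required-field check. The field names come from the list. The nested shapes are simplified to generic records, and the `CountTokensRequest` type and `validateCountTokensRequest` helper are illustrative assumptions rather than the SDK's own definitions. The check also assumes at least one message is expected.

```typescript
// Sketch of the request body described above; nested shapes are simplified.
type Block = Record<string, unknown>;

interface CountTokensMessage {
  role: "user" | "assistant";
  content: string | Block[];
}

interface CountTokensRequest {
  model: string;                  // required
  messages: CountTokensMessage[]; // required
  system?: string | Block[];
  tools?: Block[];
  tool_choice?: Block;
  cache_control?: Block;
  context_management?: Block;
  mcp_servers?: Block[];
  output_config?: Block;
  thinking?: Block;
  speed?: string;                 // for example "standard" or "fast"
}

// Returns a list of problems; an empty list means the required fields are present.
function validateCountTokensRequest(
  body: Partial<CountTokensRequest>,
): string[] {
  const problems: string[] = [];
  if (typeof body.model !== "string" || body.model.length === 0) {
    problems.push("model is required");
  }
  if (!Array.isArray(body.messages) || body.messages.length === 0) {
    problems.push("messages must be a non-empty array");
  }
  return problems;
}
```

Running the check locally before sending avoids spending a round trip on a request that is missing `model` or `messages`.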

Request

Request example

Send the same message payload you plan to create, minus generation-only fields such as max_tokens.

Shell
curl https://api.velrix.ai/v1/messages/count_tokens \
  -H "x-api-key: $VELRIX_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-6",
    "system": "Answer with concise operational detail.",
    "messages": [
      {
        "role": "user",
        "content": "Hello, world"
      }
    ],
    "tools": [
      {
        "name": "lookup_incident",
        "description": "Look up an incident by ID.",
        "input_schema": {
          "type": "object",
          "properties": {
            "incident_id": {
              "type": "string"
            }
          },
          "required": ["incident_id"]
        }
      }
    ]
  }'
TypeScript
import Anthropic from "@anthropic-ai/sdk";

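// Point the Anthropic SDK at the Velrix base URL.
const client = new Anthropic({
  apiKey: process.env.VELRIX_API_KEY,
  baseURL: "https://api.velrix.ai",
});

// Count input tokens without creating a message.
const betaMessageTokensCount = await client.beta.messages.countTokens({
  messages: [{ content: "Hello, world", role: "user" }],
  model: "claude-sonnet-4-6",
});

// Optional context-management accounting, followed by the input token total.
console.log(betaMessageTokensCount.context_management);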
console.log(betaMessageTokensCount.input_tokens);
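One way to keep the count request and the later create request in sync is to derive the count body from the create body by dropping generation-only fields. The sketch below does that. The helper name `toCountTokensBody` is hypothetical, and the field list beyond `max_tokens` is an assumption about which parameters affect only generation, so adjust it to what your route accepts.

```typescript
// Assumed generation-only fields; only max_tokens is named in this reference.
const GENERATION_ONLY_FIELDS: readonly string[] = [
  "max_tokens",
  "temperature",
  "top_p",
  "top_k",
  "stop_sequences",
  "stream",
  "metadata",
];

// Returns a shallow copy of the create payload without generation-only fields.
function toCountTokensBody(
  createBody: Record<string, unknown>,
): Record<string, unknown> {
  const countBody: Record<string, unknown> = { ...createBody };
  for (const field of GENERATION_ONLY_FIELDS) {
    delete countBody[field];
  }
  return countBody;
}

const createBody = {
  model: "claude-sonnet-4-6",
  max_tokens: 1024,
  system: "Answer with concise operational detail.",
  messages: [{ role: "user", content: "Hello, world" }],
};

// Keeps model, system, and messages; drops max_tokens.
const countBody = toCountTokensBody(createBody);
```

Because the copy is shallow and the original object is left untouched, the same `createBody` can be sent to message creation once the count has been checked.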

Response

Response 200

The official response returns token totals and optional context-management accounting.

Response 200
{
  "context_management": {
    "original_input_tokens": 0
  },
  "input_tokens": 2095
}
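A minimal sketch of reading this response follows. It assumes `context_management` may be absent or null when context management is not applied, which the reference describes only as optional. The `TokenCount` type and `parseCountTokensResponse` helper are illustrative names.

```typescript
interface TokenCount {
  inputTokens: number;
  // null when the response carries no context-management accounting.
  originalInputTokens: number | null;
}

// Parses a raw JSON response body and checks the fields this page documents.
function parseCountTokensResponse(raw: string): TokenCount {
  const data: unknown = JSON.parse(raw);
  if (typeof data !== "object" || data === null) {
    throw new Error("Response body is not a JSON object");
  }
  const body = data as {
    input_tokens?: unknown;
    context_management?: { original_input_tokens?: unknown } | null;
  };
  if (typeof body.input_tokens !== "number") {
    throw new Error("Response is missing a numeric input_tokens field");
  }
  const original = body.context_management?.original_input_tokens;
  return {
    inputTokens: body.input_tokens,
    originalInputTokens: typeof original === "number" ? original : null,
  };
}

// The sample response shown above.
const sampleCount = parseCountTokensResponse(
  '{"context_management":{"original_input_tokens":0},"input_tokens":2095}',
);
```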

Operations

Usage notes

Use token counting before expensive or user-controlled requests so limits and routing choices are explicit.

Preflight budgets

Count tokens before creating a message when prompts include large files, images, documents, tool definitions, or system instructions.
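A preflight check along these lines could be sketched as follows. The `Budget` shape, the `fitsBudget` rule (input tokens plus planned output tokens must fit a context window), and the injected `count` function are all assumptions for illustration. The context window value is whatever limit applies to your chosen model and is not specified in this reference.

```typescript
// Any function that returns a token count for a request body.
type CountFn = (
  body: Record<string, unknown>,
) => Promise<{ input_tokens: number }>;

interface Budget {
  contextWindow: number;   // hypothetical limit for the target model
  maxOutputTokens: number; // the max_tokens you intend to request
}

// True when the input plus the planned output fits inside the window.
function fitsBudget(inputTokens: number, budget: Budget): boolean {
  return inputTokens + budget.maxOutputTokens <= budget.contextWindow;
}

// Counts first, then rejects requests that would exceed the budget.
async function preflight(
  count: CountFn,
  body: Record<string, unknown>,
  budget: Budget,
): Promise<number> {
  const { input_tokens } = await count(body);
  if (!fitsBudget(input_tokens, budget)) {
    throw new Error(
      `Request needs ${input_tokens} input tokens plus ` +
        `${budget.maxOutputTokens} output tokens, over the ` +
        `${budget.contextWindow}-token budget`,
    );
  }
  return input_tokens;
}
```

In practice `count` could wrap the SDK's `client.beta.messages.countTokens` call or a plain HTTP request to the count route. Injecting it keeps the budget rule testable without network access.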

Tool-aware counting

Include the same tool definitions you plan to send with the message so the token count reflects the real request.

Route-specific totals

Tokenization depends on the routed model. Pin a catalog model when exact counts matter for admission control.

Same key scope

The count request uses the same Velrix key, quota, and logging path as Anthropic-compatible message creation.