OpenRouter on Bifrost: Multi-Provider LLM Routing, Models, and API Reference

OpenRouter provider summary

OpenRouter is an OpenAI-compatible LLM routing service that aggregates access to 100+ models from vendors including OpenAI, Anthropic, Google, Meta, and Mistral. Bifrost routes requests to OpenRouter, converting parameters where needed, and supports streaming for chat completions, responses, and text completions.

Popular models available through OpenRouter:

openai/gpt-4o (Latest OpenAI)
anthropic/claude-3.5-sonnet (Anthropic)
google/gemini-2.0-flash (Google)
meta-llama/llama-3.1-70b-instruct (Meta)

Property	Details
Description	OpenAI-compatible routing service for 100+ models across multiple vendors.
Provider route on Bifrost	openrouter/<model>
Provider doc	OpenRouter API Docs
API endpoint for provider	https://openrouter.ai/api
Supported endpoints	/v1/chat/completions, /v1/completions, /v1/embeddings, /v1/responses, /v1/models

Supported operations

OpenRouter supports 5 major operations: chat completions, responses, text completions, embeddings, and model listing. Chat Completions, the Responses API, and Text Completions support streaming; Embeddings and List Models do not.

Operation	Non-streaming	Streaming	Upstream endpoint
Chat Completions	Yes	Yes	/v1/chat/completions
Responses API	Yes	Yes	/v1/responses
Text Completions	Yes	Yes	/v1/completions
Embeddings	Yes	No	/v1/embeddings
List Models	Yes	No	/v1/models

Parameter handling

OpenRouter accepts OpenAI-compatible parameters and converts them to provider-specific formats when routing a request. Usage tracking includes token counts reported by the underlying providers.

Reasoning support:

Available on models that support extended thinking
Pricing varies by model and reasoning depth

Tool calling:

Full support for function calling on compatible models
Parameters convert between OpenAI and provider-specific formats

Supported OpenRouter parameters

Quick reference of OpenAI-compatible parameters accepted when routing through Bifrost to OpenRouter.

[
  "stream",
  "temperature",
  "top_p",
  "top_k",
  "max_tokens",
  "stop",
  "presence_penalty",
  "frequency_penalty",
  "seed",
  "response_format",
  "tools",
  "tool_choice",
  "reasoning"
]
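As a rough illustration of how the list above might be used on the client side, the sketch below flags request keys that are not in the quick reference. The helper name `unlisted_params` and the `CORE_FIELDS` set are assumptions made for this example and are not part of Bifrost or OpenRouter. The source does not say what happens to parameters outside the list, so the helper only reports them.

```python
# Parameter names copied from the quick-reference list above.
SUPPORTED_PARAMS = frozenset({
    "stream", "temperature", "top_p", "top_k", "max_tokens", "stop",
    "presence_penalty", "frequency_penalty", "seed", "response_format",
    "tools", "tool_choice", "reasoning",
})

# Assumption: body fields that carry the request itself rather than tuning it.
CORE_FIELDS = frozenset({"model", "messages", "prompt", "input"})


def unlisted_params(payload: dict) -> list[str]:
    """Return payload keys that are neither core fields nor in the list above."""
    known = SUPPORTED_PARAMS | CORE_FIELDS
    return sorted(key for key in payload if key not in known)


payload = {
    "model": "openrouter/openai/gpt-4o",
    "messages": [{"role": "user", "content": "Hello"}],
    "temperature": 0.2,
    "logit_bias": {},  # not in the quick reference
}
print(unlisted_params(payload))
```

A check like this is only a convenience before sending a request; the quick reference may not be exhaustive.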

Popular models via OpenRouter

Use the provider prefix openrouter/ in Bifrost model routes. OpenRouter provides access to 100+ models across multiple vendors.

Provider/Family	Model ID	Bifrost route	Typical usage
OpenAI GPT-4o	openai/gpt-4o	openrouter/openai/gpt-4o	Latest OpenAI model
Anthropic Claude 3.5 Sonnet	anthropic/claude-3.5-sonnet	openrouter/anthropic/claude-3.5-sonnet	Advanced reasoning
Google Gemini 2.0 Flash	google/gemini-2.0-flash	openrouter/google/gemini-2.0-flash	Fast generation
Meta Llama 3.1 70B	meta-llama/llama-3.1-70b-instruct	openrouter/meta-llama/llama-3.1-70b-instruct	Open source LLM
Mistral Large	mistralai/mistral-large	openrouter/mistralai/mistral-large	Efficient model
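The route column in the table is the OpenRouter model ID with the openrouter/ prefix prepended. A minimal sketch of that mapping follows; the function names are hypothetical helpers for this example, not a Bifrost API.

```python
PROVIDER_PREFIX = "openrouter/"


def to_bifrost_route(model_id: str) -> str:
    """Prepend the provider prefix to an OpenRouter model ID."""
    # The prefix is always added, even if the ID itself starts with
    # "openrouter/", so the mapping stays reversible.
    return PROVIDER_PREFIX + model_id


def to_openrouter_id(route: str) -> str:
    """Strip the provider prefix from a Bifrost route."""
    if not route.startswith(PROVIDER_PREFIX):
        raise ValueError(f"not an OpenRouter route: {route!r}")
    return route[len(PROVIDER_PREFIX):]


print(to_bifrost_route("anthropic/claude-3.5-sonnet"))
print(to_openrouter_id("openrouter/mistralai/mistral-large"))
```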

API reference by operation

Gateway paths and OpenRouter upstream endpoints.

1) Chat Completions

The primary path is /v1/chat/completions. Bifrost delegates to the OpenAI implementation, with special handling for reasoning models. OpenRouter supports all standard OpenAI chat parameters; see OpenAI Chat Completions for the full parameter reference, and Chat Completions in the Bifrost docs.

Reasoning parameter handling

Extended thinking is available on compatible models. Bifrost converts its internal reasoning object to OpenRouter's reasoning_effort parameter; in the example below, only effort is carried over. Reasoning models include gpt-oss-120b and other models with special reasoning content handling.

// Bifrost request
{
  "reasoning": {
    "effort": "high",
    "max_tokens": 10000
  }
}

// OpenRouter conversion
{
  "reasoning_effort": "high"
}
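The conversion shown above can be sketched as a small function. This is an illustration of the mapping in the example, not Bifrost's actual implementation; the name `convert_reasoning` is hypothetical. Dropping max_tokens mirrors the example output, and the source does not say whether Bifrost maps it elsewhere.

```python
import copy


def convert_reasoning(request: dict) -> dict:
    """Replace a nested "reasoning" object with a flat "reasoning_effort"."""
    out = copy.deepcopy(request)  # leave the caller's request untouched
    reasoning = out.pop("reasoning", None)
    if isinstance(reasoning, dict) and "effort" in reasoning:
        # Only "effort" is carried over, as in the example above.
        out["reasoning_effort"] = reasoning["effort"]
    return out


bifrost_request = {"reasoning": {"effort": "high", "max_tokens": 10000}}
print(convert_reasoning(bifrost_request))
```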

Cache control stripping

Anthropic-specific cache_control directives on message content blocks are removed before the request is sent upstream.

// cache_control stripped from content blocks
{
  "messages": [{
    "role": "user",
    "content": [{
      "type": "text",
      "text": "...",
      "cache_control": {"type": "ephemeral"}
    }]
  }]
}
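A minimal sketch of the stripping step, assuming cache_control appears only on content blocks inside messages, as in the example above. The function name is hypothetical and the code is not taken from Bifrost.

```python
import copy


def strip_cache_control(request: dict) -> dict:
    """Return a copy of the request with cache_control removed from content blocks."""
    out = copy.deepcopy(request)
    for message in out.get("messages", []):
        content = message.get("content")
        if not isinstance(content, list):
            continue  # plain string content has no blocks to clean
        for block in content:
            if isinstance(block, dict):
                block.pop("cache_control", None)
    return out


request = {
    "messages": [{
        "role": "user",
        "content": [{
            "type": "text",
            "text": "...",
            "cache_control": {"type": "ephemeral"},
        }],
    }]
}
print(strip_cache_control(request))
```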

Filtered parameters

Removed for OpenRouter compatibility: prompt_cache_key, verbosity (Anthropic-specific), store, and service_tier (OpenAI-specific).
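The filtering described above amounts to dropping four top-level keys from the request body. A sketch under that assumption follows; the helper name is hypothetical and this is not Bifrost's own code.

```python
# Parameter names taken from the sentence above.
FILTERED_PARAMS = ("prompt_cache_key", "verbosity", "store", "service_tier")


def drop_filtered_params(request: dict) -> dict:
    """Return a copy of the request without the filtered top-level keys."""
    return {key: value for key, value in request.items() if key not in FILTERED_PARAMS}


request = {
    "model": "openrouter/openai/gpt-4o",
    "messages": [{"role": "user", "content": "Hello"}],
    "store": True,
    "service_tier": "default",
}
print(drop_filtered_params(request))
```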

Messages, tools, responses, and streaming follow standard OpenAI formats
Tool calling: full function definitions and execution on compatible models
Streaming: SSE with usage tracking

curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openrouter/openai/gpt-4o",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
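When "stream": true is set, the response arrives as server-sent events in the standard OpenAI chunk format. The sketch below shows one way a client might collect the text deltas and the usage block from such a stream. It assumes each event is a `data:` line carrying a JSON chunk, that the stream ends with `data: [DONE]`, and that usage arrives in a chunk with a `usage` field. The sample lines are made up for illustration and are not captured output.

```python
import json
from typing import Iterable, Optional, Tuple


def collect_stream(lines: Iterable[str]) -> Tuple[str, Optional[dict]]:
    """Accumulate delta text and the last usage block from SSE lines."""
    parts = []
    usage = None
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and SSE comments
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            delta = choice.get("delta") or {}
            parts.append(delta.get("content") or "")
        if chunk.get("usage"):
            usage = chunk["usage"]
    return "".join(parts), usage


# Hypothetical stream contents, for illustration only.
sample = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    'data: {"choices": [], "usage": {"prompt_tokens": 1, "completion_tokens": 2, "total_tokens": 3}}',
    "data: [DONE]",
]
text, usage = collect_stream(sample)
print(text, usage)
```

In a real client the lines would come from the HTTP response body rather than a list.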

2) Responses API

Available for compatible models on OpenRouter. Maps to the upstream /v1/responses endpoint, which is in beta.

curl -X POST http://localhost:8080/v1/responses \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openrouter/anthropic/claude-3.5-sonnet",
    "input": [{"role": "user", "content": "Hello"}]
  }'

3) Text Completions

Legacy completions endpoint. Maps to upstream /v1/completions. Supports streaming.

curl -X POST http://localhost:8080/v1/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openrouter/meta-llama/llama-3.1-70b-instruct",
    "prompt": "Hello, my name is",
    "max_tokens": 50
  }'

4) Embeddings

Vector embeddings via OpenRouter. Maps to upstream /v1/embeddings. Does not support streaming.

curl -X POST http://localhost:8080/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openrouter/openai/text-embedding-3-small",
    "input": "Hello world"
  }'

5) List Models

Enumerate available models on OpenRouter. Maps to upstream /v1/models endpoint.
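The other operations each come with a request example, so a comparable sketch for model listing may help. It assumes the gateway exposes /v1/models on the same host and port as the curl examples (http://localhost:8080) and that the call is a plain GET; neither is spelled out in this section. The request is built but not sent, and the response shape is not shown because the source does not describe it.

```python
import urllib.request

# Assumption: same gateway address as the curl examples in this document.
BIFROST_BASE_URL = "http://localhost:8080"


def build_list_models_request(base_url: str = BIFROST_BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a GET request for the model list."""
    url = f"{base_url.rstrip('/')}/v1/models"
    return urllib.request.Request(url, method="GET")


req = build_list_models_request()
print(req.get_method(), req.full_url)

# To send it against a running gateway:
# with urllib.request.urlopen(req) as resp:
#     body = resp.read()
```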

Implementation caveats

Caveat	Impact	Severity
Vendor model availability	Not all models available at all times; pricing varies by vendor	Medium
Parameter conversion	OpenAI parameters convert to vendor-specific formats automatically	Low
Reasoning availability	Extended thinking only on compatible models with additional cost	Medium
Tool calling limits	Function calling availability depends on underlying model support	Low
No image/audio support	Image generation, TTS, and STT not available through OpenRouter	Medium