OpenRouter Provider on Bifrost

OpenRouter is an OpenAI-compatible LLM routing service providing access to 100+ models across multiple vendors. Bifrost routes requests to OpenRouter with parameter conversion and reasoning support.

OpenRouter provider summary

OpenRouter aggregates access to 100+ models from vendors including OpenAI, Anthropic, Google, Meta, and Mistral behind a single OpenAI-compatible API. Bifrost routes requests to OpenRouter with parameter conversion, and supports streaming for chat completions, the Responses API, and text completions.

Popular models available through OpenRouter:

  • openai/gpt-4o (Latest OpenAI)
  • anthropic/claude-3.5-sonnet (Anthropic)
  • google/gemini-2.0-flash (Google)
  • meta-llama/llama-3.1-70b-instruct (Meta)

| Property | Details |
|---|---|
| Description | OpenAI-compatible routing service for 100+ models across multiple vendors. |
| Provider route on Bifrost | openrouter/<model> |
| Provider doc | OpenRouter API Docs |
| API endpoint for provider | https://openrouter.ai/api |
| Supported endpoints | /v1/chat/completions, /v1/completions, /v1/embeddings, /v1/responses, /v1/models |

Supported operations

OpenRouter supports five operations through Bifrost: chat completions, the Responses API, text completions, embeddings, and model listing. Chat completions, the Responses API, and text completions support streaming; embeddings and model listing do not.

| Operation | Non-streaming | Streaming | Upstream endpoint |
|---|---|---|---|
| Chat Completions | Yes | Yes | /v1/chat/completions |
| Responses API | Yes | Yes | /v1/responses |
| Text Completions | Yes | Yes | /v1/completions |
| Embeddings | Yes | No | /v1/embeddings |
| List Models | Yes | No | /v1/models |

Parameter handling

OpenRouter accepts OpenAI-compatible parameters and converts them to provider-specific formats as it routes each request. Usage tracking includes the token counts reported by the underlying providers.

Reasoning support:

  • Available on models that support extended thinking
  • Pricing varies by model and reasoning depth

Tool calling:

  • Full support for function calling on compatible models
  • Parameters convert between OpenAI and provider-specific formats

Supported OpenRouter parameters

Quick reference of OpenAI-compatible parameters accepted when routing through Bifrost to OpenRouter.

[
  "stream",
  "temperature",
  "top_p",
  "top_k",
  "max_tokens",
  "stop",
  "presence_penalty",
  "frequency_penalty",
  "seed",
  "response_format",
  "tools",
  "tool_choice",
  "reasoning"
]

Popular models via OpenRouter

Use the provider prefix openrouter/ in Bifrost model routes, followed by the OpenRouter model ID.

| Provider/Family | Model ID | Bifrost route | Typical usage |
|---|---|---|---|
| OpenAI GPT-4o | openai/gpt-4o | openrouter/openai/gpt-4o | Latest OpenAI model |
| Anthropic Claude 3.5 Sonnet | anthropic/claude-3.5-sonnet | openrouter/anthropic/claude-3.5-sonnet | Advanced reasoning |
| Google Gemini 2.0 Flash | google/gemini-2.0-flash | openrouter/google/gemini-2.0-flash | Fast generation |
| Meta Llama 3.1 70B | meta-llama/llama-3.1-70b-instruct | openrouter/meta-llama/llama-3.1-70b-instruct | Open source LLM |
| Mistral Large | mistralai/mistral-large | openrouter/mistralai/mistral-large | Efficient model |
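
The route convention in the table can be sketched as a small helper. The function name bifrost_route is hypothetical and not part of Bifrost; it only illustrates prepending the openrouter/ prefix to an OpenRouter model ID.

```python
PROVIDER_PREFIX = "openrouter/"  # provider prefix used in Bifrost model routes


def bifrost_route(model_id: str) -> str:
    """Return the Bifrost route for an OpenRouter model ID (hypothetical helper)."""
    model_id = model_id.strip()
    if not model_id:
        raise ValueError("model_id must not be empty")
    # Avoid double-prefixing when the caller already passed a full route.
    if model_id.startswith(PROVIDER_PREFIX):
        return model_id
    return PROVIDER_PREFIX + model_id


print(bifrost_route("anthropic/claude-3.5-sonnet"))
```

The resulting string is what goes in the "model" field of a request sent to Bifrost.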

API reference by operation

Gateway paths and OpenRouter upstream endpoints.

1) Chat Completions

Primary path at /v1/chat/completions. Bifrost delegates to the OpenAI implementation with special handling for reasoning models. OpenRouter supports all standard OpenAI chat parameters — see OpenAI Chat Completions for full parameter reference. See Chat Completions in Bifrost docs.

Reasoning parameter handling

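Extended thinking is available on compatible models. Bifrost converts its internal reasoning object to OpenRouter's reasoning_effort field; in the example below, only effort is carried over and max_tokens is absent from the converted request. Reasoning models include gpt-oss-120b and other models with special reasoning content handling.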

// Bifrost request
{
  "reasoning": {
    "effort": "high",
    "max_tokens": 10000
  }
}

// OpenRouter conversion
{
  "reasoning_effort": "high"
}
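
As a rough illustration of the mapping above, the conversion could look like the following. This is a minimal sketch, not Bifrost's actual implementation: the function name convert_reasoning is hypothetical, and dropping max_tokens is inferred only from the example, where it does not appear in the converted request.

```python
from typing import Any, Dict


def convert_reasoning(request: Dict[str, Any]) -> Dict[str, Any]:
    """Sketch: map {"reasoning": {"effort": ...}} to {"reasoning_effort": ...}."""
    # Copy every field except the nested reasoning object.
    converted = {k: v for k, v in request.items() if k != "reasoning"}
    reasoning = request.get("reasoning")
    if isinstance(reasoning, dict) and "effort" in reasoning:
        # Assumption: only effort is forwarded, as in the example above.
        converted["reasoning_effort"] = reasoning["effort"]
    return converted


print(convert_reasoning({"reasoning": {"effort": "high", "max_tokens": 10000}}))
```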

Cache control stripping

Anthropic-specific cache_control directives on message content blocks are removed before the request is sent upstream.

// Request as sent to Bifrost; the cache_control field below is stripped before forwarding
{
  "messages": [{
    "role": "user",
    "content": [{
      "type": "text",
      "text": "...",
      "cache_control": {"type": "ephemeral"}
    }]
  }]
}
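
The stripping step can be sketched as follows. This is an illustrative assumption about the behavior described above, not Bifrost's code; strip_cache_control is a hypothetical name, and the sketch only handles cache_control on message content blocks.

```python
import copy
from typing import Any, Dict


def strip_cache_control(request: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the request with cache_control removed from content blocks."""
    cleaned = copy.deepcopy(request)  # leave the caller's request untouched
    for message in cleaned.get("messages", []):
        content = message.get("content")
        # Plain string content has no blocks, so there is nothing to strip.
        if isinstance(content, list):
            for block in content:
                if isinstance(block, dict):
                    block.pop("cache_control", None)
    return cleaned


request = {
    "messages": [{
        "role": "user",
        "content": [{"type": "text", "text": "...", "cache_control": {"type": "ephemeral"}}],
    }]
}
print(strip_cache_control(request))
```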

Filtered parameters

Removed for OpenRouter compatibility: prompt_cache_key, verbosity (Anthropic-specific), store, and service_tier (OpenAI-specific).
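
A minimal sketch of this filtering, assuming the four parameters are simply dropped from the top level of the request body. The names FILTERED_PARAMS and drop_filtered_params are hypothetical.

```python
from typing import Any, Dict

# Parameters listed above as removed for OpenRouter compatibility.
FILTERED_PARAMS = ("prompt_cache_key", "verbosity", "store", "service_tier")


def drop_filtered_params(request: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the request without the filtered top-level parameters."""
    return {k: v for k, v in request.items() if k not in FILTERED_PARAMS}


print(drop_filtered_params({
    "model": "openrouter/openai/gpt-4o",
    "store": True,
    "service_tier": "default",
    "temperature": 0.7,
}))
```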

  • Messages, tools, responses, and streaming follow standard OpenAI formats
  • Tool calling: full function definitions and execution on compatible models
  • Streaming: SSE with usage tracking

curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openrouter/openai/gpt-4o",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
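
For streamed responses, a client reads SSE lines rather than a single JSON body. The sketch below assumes the standard OpenAI chunk format (data: lines carrying choices[].delta.content, an optional usage object, and a final data: [DONE] marker); collect_stream is a hypothetical helper, and the sample lines are made up for illustration.

```python
import json
from typing import Any, Dict, Iterable, Optional, Tuple


def collect_stream(lines: Iterable[str]) -> Tuple[str, Optional[Dict[str, Any]]]:
    """Join streamed text deltas and return (text, usage) from SSE lines."""
    parts = []
    usage = None
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream marker in the OpenAI format
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            delta = choice.get("delta") or {}
            if delta.get("content"):
                parts.append(delta["content"])
        if chunk.get("usage"):
            usage = chunk["usage"]  # assumed to arrive in a late chunk
    return "".join(parts), usage


sample = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    'data: {"choices": [], "usage": {"total_tokens": 7}}',
    "data: [DONE]",
]
print(collect_stream(sample))
```

In practice the lines would come from an HTTP response to a request that sets "stream": true, decoded one line at a time.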

2) Responses API

Available for compatible models on OpenRouter. Maps to the upstream /v1/responses endpoint, which is in beta.

curl -X POST http://localhost:8080/v1/responses \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openrouter/anthropic/claude-3.5-sonnet",
    "input": "Hello"
  }'

3) Text Completions

Legacy completions endpoint. Maps to upstream /v1/completions. Supports streaming.

curl -X POST http://localhost:8080/v1/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openrouter/meta-llama/llama-3.1-70b-instruct",
    "prompt": "Hello, my name is",
    "max_tokens": 50
  }'

4) Embeddings

Vector embeddings via OpenRouter. Maps to upstream /v1/embeddings. Does not support streaming.

curl -X POST http://localhost:8080/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openrouter/openai/text-embedding-3-small",
    "input": "Hello world"
  }'

5) List Models

Enumerate available models on OpenRouter. Maps to upstream /v1/models endpoint.
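
A hedged sketch of listing models through the gateway. It assumes Bifrost exposes GET /v1/models on the same local address as the curl examples, that the response uses the OpenAI list shape ({"data": [{"id": ...}]}), and that OpenRouter models are returned with the openrouter/ prefix; none of these are confirmed here, and both helper names are hypothetical.

```python
import json
import urllib.request
from typing import List

BIFROST_URL = "http://localhost:8080"  # assumed: same local gateway as the curl examples


def build_models_request(base_url: str = BIFROST_URL) -> urllib.request.Request:
    """Build (but do not send) a GET request for the model list."""
    return urllib.request.Request(base_url.rstrip("/") + "/v1/models", method="GET")


def openrouter_model_ids(body: str) -> List[str]:
    """Extract openrouter/ routes from a response body in the assumed OpenAI list shape."""
    payload = json.loads(body)
    ids = [item["id"] for item in payload.get("data", [])]
    return [model_id for model_id in ids if model_id.startswith("openrouter/")]


# To send the request against a running gateway:
# with urllib.request.urlopen(build_models_request()) as resp:
#     print(openrouter_model_ids(resp.read().decode("utf-8")))
```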

Implementation caveats

| Caveat | Impact | Severity |
|---|---|---|
| Vendor model availability | Not all models available at all times; pricing varies by vendor | Medium |
| Parameter conversion | OpenAI parameters convert to vendor-specific formats automatically | Low |
| Reasoning availability | Extended thinking only on compatible models with additional cost | Medium |
| Tool calling limits | Function calling availability depends on underlying model support | Low |
| No image/audio support | Image generation, TTS, and STT not available through OpenRouter | Medium |
