How to Use Claude Code with Non-Anthropic Models: The Enterprise Gateway Guide (2026)

Use Claude Code with non-Anthropic models through Bifrost. Route to GPT, Gemini, Bedrock, or Vertex with governance, failover, and 11µs overhead.

Claude Code has become the default terminal-based coding agent for engineering teams in 2026, but it ships locked to api.anthropic.com. To use Claude Code with non-Anthropic models, teams need an AI gateway that intercepts requests at the transport layer, translates between provider API formats, and returns responses in Anthropic's format so the client never knows the difference. Bifrost, the open-source AI gateway by Maxim AI, solves this with a single environment variable change, full multi-provider routing across 20+ providers, and enterprise-grade governance built for production teams.

This guide walks through why teams route Claude Code through alternative providers, how Bifrost handles the integration at the protocol layer, the configuration steps, and the governance, observability, and reliability features that come with consolidating Claude Code traffic through a gateway.

Why Teams Run Claude Code with Non-Anthropic Models

Engineering teams in regulated industries, large enterprises, and cost-sensitive organizations consistently hit the same constraints with Claude Code's default Anthropic-only routing. The reasons for switching are practical, not theoretical:

  • Regional compliance: Data residency requirements that mandate AWS Bedrock or Azure OpenAI in specific geographies.
  • Cost optimization: Routing routine, lightweight tasks to faster, cheaper models (Groq-hosted Llama, Gemini Flash) while reserving Claude for complex reasoning.
  • Provider redundancy: Eliminating single-vendor risk during outages or quota throttling.
  • Model specialization: Using GPT-5 for specific codebases where it performs better, or Gemini for monorepo work where larger context windows matter.
  • Existing enterprise contracts: Teams already buying capacity through Bedrock, Vertex AI, or Azure OpenAI want to consolidate spend.

These pressures are amplified by the broader market. Gartner has forecast that 40% of enterprise applications will integrate task-specific AI agents by the end of 2026, up from under 5% in 2025. Each of those agents is, at the infrastructure layer, an LLM call. Anthropic's own Claude Code documentation acknowledges third-party provider configurations through Bedrock and Vertex, but a production-grade setup requires more than environment variables.

How Bifrost Routes Claude Code Through Any LLM Provider

Bifrost works at the transport layer. Claude Code sends Anthropic-formatted requests to what it thinks is api.anthropic.com. Bifrost intercepts those requests at a local endpoint, converts them to the target provider's format, forwards them, and translates responses back into Anthropic's format before returning them to Claude Code. The client binary stays unmodified. The full integration pattern is documented in the Claude Code integration guide.

The architecture provides three distinct advantages over building a custom proxy or modifying the Claude Code client:

  • Single environment variable swap: Set ANTHROPIC_BASE_URL to point at the Bifrost gateway and Claude Code routes through it.
  • Anthropic-compatible endpoint: Bifrost exposes /anthropic as a first-class handler, so streaming, tool calls, and message format remain compatible.
  • Provider-agnostic backend: The same Claude Code session can target Anthropic, Bedrock, Vertex, Azure, OpenAI, Mistral, Groq, Cohere, xAI, or any of 20+ supported providers without client-side changes.

Because the substitution happens entirely at the gateway layer, Bifrost functions as a drop-in replacement for Anthropic's API surface from the client's perspective.
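
The translation step can be illustrated with a simplified sketch: an Anthropic-format Messages request is mapped onto an OpenAI-style chat payload. A real gateway also translates tool calls, streaming chunks, and error shapes; the field names below follow the two public API formats, and everything else is illustrative, not Bifrost's actual implementation.

```python
def anthropic_to_openai(request: dict) -> dict:
    """Translate the common fields of an Anthropic-format Messages request
    into an OpenAI-style chat payload (simplified sketch)."""
    messages = []
    # Anthropic carries the system prompt as a top-level field;
    # OpenAI expects it as the first message in the list.
    if "system" in request:
        messages.append({"role": "system", "content": request["system"]})
    messages.extend(request.get("messages", []))
    return {
        "model": request["model"],
        "messages": messages,
        # max_tokens is required by Anthropic's API, optional in OpenAI's.
        "max_tokens": request.get("max_tokens"),
    }

anthropic_req = {
    "model": "gpt-5",  # already rewritten by the gateway's routing rules
    "system": "You are a coding assistant.",
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "Refactor this function."}],
}
print(anthropic_to_openai(anthropic_req)["messages"][0]["role"])  # system
```

The reverse mapping runs on the response path, so Claude Code only ever sees Anthropic-shaped payloads.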

Configuring Claude Code to Route Through Bifrost

Setting up Claude Code with non-Anthropic models takes three steps. First, install Claude Code and Bifrost. Second, configure routing in the Bifrost dashboard. Third, update Claude Code's settings.json with the gateway URL and model selections.

Step 1: Install Claude Code

npm install -g @anthropic-ai/claude-code

Then run Bifrost locally, or point at your hosted instance, so the gateway is reachable at localhost:8080.

Step 2: Configure Provider-Specific Model Pinning

Update the global settings.json (located at ~/.claude/settings.json on macOS, Linux, or WSL) to merge the following env block. Pinning to Anthropic models on AWS Bedrock looks like this:

"env": {
  "ANTHROPIC_BASE_URL": "http://localhost:8080/anthropic",
  "ANTHROPIC_API_KEY": "your-virtual-key",
  "ANTHROPIC_DEFAULT_HAIKU_MODEL": "bedrock/global.anthropic.claude-haiku-4-6",
  "ANTHROPIC_DEFAULT_SONNET_MODEL": "bedrock/global.anthropic.claude-sonnet-4-6"
}

For Vertex AI, swap the prefix to vertex/. For Azure, use azure/. For Anthropic-hosted models, no prefix is required. Each pattern is covered in the Claude Code documentation.
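
Merging the env block by hand is error-prone if settings.json already has content. A small script can do the merge without clobbering existing settings; this is a convenience sketch (the path and key names come from the steps above, the helper itself is not part of Claude Code or Bifrost):

```python
import json
from pathlib import Path

def merge_env(settings_path: Path, env: dict) -> dict:
    """Merge keys into the "env" block of a Claude Code settings.json,
    preserving any settings already present in the file."""
    settings = {}
    if settings_path.exists():
        settings = json.loads(settings_path.read_text())
    settings.setdefault("env", {}).update(env)
    settings_path.parent.mkdir(parents=True, exist_ok=True)
    settings_path.write_text(json.dumps(settings, indent=2) + "\n")
    return settings

new_env = {
    "ANTHROPIC_BASE_URL": "http://localhost:8080/anthropic",
    "ANTHROPIC_API_KEY": "your-virtual-key",
    "ANTHROPIC_DEFAULT_HAIKU_MODEL": "bedrock/global.anthropic.claude-haiku-4-6",
    "ANTHROPIC_DEFAULT_SONNET_MODEL": "bedrock/global.anthropic.claude-sonnet-4-6",
}
# merge_env(Path.home() / ".claude" / "settings.json", new_env)
```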

Step 3: Override Claude Code's Tier Defaults

Claude Code organizes models into three tiers: Sonnet (default), Opus (complex reasoning), and Haiku (fast, lightweight). With Bifrost, any tier can be overridden to use a model from any provider using the provider/model-name format:

export ANTHROPIC_DEFAULT_SONNET_MODEL="openai/gpt-5"
export ANTHROPIC_DEFAULT_OPUS_MODEL="gemini/gemini-2.5-pro"
export ANTHROPIC_DEFAULT_HAIKU_MODEL="groq/llama-3.3-70b-versatile"
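
The provider/model-name convention splits on the first slash, and a bare name (no prefix) is treated as Anthropic-hosted, as noted in Step 2. A sketch of that parsing rule, not Bifrost's actual parser:

```python
def parse_model(spec: str) -> tuple:
    """Split a provider/model-name spec; bare names default to Anthropic.
    Only the first slash separates provider from model, so model IDs that
    themselves contain slashes survive intact."""
    provider, sep, model = spec.partition("/")
    if not sep:
        return ("anthropic", spec)  # no prefix: Anthropic-hosted model
    return (provider, model)

print(parse_model("openai/gpt-5"))       # ('openai', 'gpt-5')
print(parse_model("claude-sonnet-4-6"))  # ('anthropic', 'claude-sonnet-4-6')
```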

Models can also be switched mid-session using the /model command:

/model openai/gpt-5
/model mistral/mistral-large-latest
/model xai/grok-3

The switch is instantaneous and Claude Code preserves the conversation context across the swap.

Enterprise Governance for Claude Code at Scale

Routing alone is the table-stakes capability. Once Claude Code traffic flows through Bifrost, teams unlock the governance layer that matters most for production deployments. According to a Cloud Security Alliance survey, 82% of organizations discovered an AI agent or workflow in the past year that security or IT did not previously know about. Gateway-level controls close this gap.

Bifrost provides four pillars of governance for Claude Code traffic:

  • Virtual keys: Per-developer or per-team API keys with independent budgets, rate limits, and access scopes. Configured through virtual keys.
  • Hierarchical budgets: Customer, team, virtual key, and provider-level limits enforced at the gateway. When a budget is exhausted, requests are blocked before they reach the provider.
  • Rate limits: Token-per-minute and request-per-minute caps applied per virtual key. Documented under rate limits.
  • Audit logs: Immutable trails for SOC 2 Type II, GDPR, HIPAA, and ISO 27001 compliance. Every Claude Code request is logged with token usage, latency, model, and full request/response inspection.

For teams running Claude Code cost management across multiple developers, the gateway becomes the single point of cost attribution and policy enforcement.
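
In principle, hierarchical enforcement is a walk up the budget chain that rejects the request before any provider call is made. The levels mirror the ones listed above; the data structures and dollar figures here are illustrative only:

```python
# Illustrative sketch of gateway-side hierarchical budget enforcement:
# a request is checked at every level (virtual key -> team -> customer)
# and blocked before reaching the provider if any budget is exhausted.

BUDGETS = {  # remaining spend in USD per level (example figures)
    "customer:acme": 500.0,
    "team:platform": 120.0,
    "key:dev-alice": 4.0,
}

CHAIN = {"key:dev-alice": "team:platform", "team:platform": "customer:acme"}

def admit(level: str, estimated_cost: float) -> bool:
    """Return True only if every level in the chain can cover the request."""
    while level is not None:
        if BUDGETS.get(level, 0.0) < estimated_cost:
            return False  # blocked here; the provider is never called
        level = CHAIN.get(level)
    return True

print(admit("key:dev-alice", 2.0))   # True
print(admit("key:dev-alice", 10.0))  # False: key budget exhausted first
```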

Reliability, Caching, and Observability

Claude Code makes dozens of API calls during agentic coding sessions. Provider outages, quota throttling, and redundant requests directly translate into failed sessions and inflated bills. Bifrost addresses each at the infrastructure layer.

  • Automatic failover: If a primary provider returns errors or goes down, Bifrost reroutes traffic to a configured fallback chain with zero downtime. Configured through automatic fallbacks.
  • Semantic caching: Responses are cached based on semantic similarity, not just exact-match hashes. Repeated queries (or near-duplicates across a session) return cached responses instead of hitting the provider. See semantic caching.
  • Built-in observability: Every request is logged with latency, tokens, cost, and provider information. Native Prometheus metrics and OpenTelemetry support feed existing APM stacks.
  • Performance overhead: Bifrost adds only 11 microseconds of overhead per request at sustained 5,000 RPS. Teams running rapid-fire agentic coding sessions do not pay a latency tax for the governance layer. Reproduce the numbers using the benchmarking guide.

The combined effect is a Claude Code deployment that is faster to recover from provider failures, cheaper to run on repeated workloads, and fully observable for both engineering and finance stakeholders.
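
At its core, failover along a configured fallback chain means trying providers in order until one succeeds. A hedged sketch of that control flow, with placeholder callables standing in for real upstream calls (none of this is Bifrost's API):

```python
class ProviderError(Exception):
    """Raised when an upstream provider fails (outage, quota, 5xx)."""

def call_with_fallback(request: dict, chain: list) -> dict:
    """Try each provider in the configured chain until one succeeds."""
    errors = []
    for provider_call in chain:
        try:
            return provider_call(request)
        except ProviderError as exc:
            errors.append(exc)  # record and fall through to the next provider
    raise ProviderError(f"all {len(chain)} providers failed: {errors}")

# Placeholder provider callables standing in for real upstream calls.
def flaky_primary(req):
    raise ProviderError("primary: 529 overloaded")

def healthy_fallback(req):
    return {"provider": "fallback", "content": "ok"}

result = call_with_fallback({"messages": []}, [flaky_primary, healthy_fallback])
print(result["provider"])  # fallback
```

From Claude Code's perspective the failover is invisible: one request goes out, one Anthropic-shaped response comes back.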

Provider Compatibility Considerations

Not every provider works equally well with Claude Code. Because Claude Code relies heavily on tool calling for file operations, terminal commands, and code editing, the upstream provider must properly support and stream tool call arguments. Two known limitations are worth flagging before configuring routes:

  • OpenRouter: Does not stream function call arguments correctly. Tool calls return with empty arguments fields, causing Claude Code to fail on tool-based actions. Route to direct providers instead.
  • Some proxy providers: May not fully implement the Anthropic API streaming specification for tool calls. If tool execution fails, switching to a direct provider in the Bifrost configuration resolves it.

For coverage analysis across the full set of CLI agents, the CLI agents resource page documents which providers have been validated for tool-heavy workflows. Teams running formal evaluations can also reference the LLM Gateway Buyer's Guide for a side-by-side capability matrix.
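
A quick way to probe a candidate route is to reassemble streamed tool-call argument fragments and check that they parse as a JSON object, which is exactly where the OpenRouter-style failure shows up as empty arguments. The chunk shape below is a simplified stand-in for real streaming deltas:

```python
import json

def tool_args_complete(chunks: list) -> bool:
    """Concatenate streamed tool-call argument fragments and verify the
    result parses as a JSON object. Providers that drop or empty out
    arguments (the failure mode flagged above) fail this check."""
    raw = "".join(chunks)
    if not raw.strip():
        return False  # empty arguments: tool execution would fail
    try:
        return isinstance(json.loads(raw), dict)
    except json.JSONDecodeError:
        return False

print(tool_args_complete(['{"path": "sr', 'c/main.py"}']))  # True
print(tool_args_complete([]))                               # False
```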

MCP Tools and Extended Workflows

Beyond model routing, Bifrost serves as a native MCP gateway, centralizing Model Context Protocol tool connections, governance, and authentication for every connected agent. MCP tools configured in Bifrost are automatically injected into Claude Code's tools array before requests are forwarded to the provider, so Claude Code gains access to filesystem operations, web search, database queries, and any custom MCP server without client-side configuration. Code Mode further reduces token consumption by 50% and latency by 40% for multi-tool workflows, as detailed in the Bifrost MCP Gateway analysis.
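
The injection step amounts to appending gateway-registered tool definitions to the request's tools array before forwarding. A sketch under the assumption that duplicates are resolved by name with client-declared tools winning (the tool shape follows Anthropic's tool definition format; the deduplication rule and sample tool are illustrative):

```python
def inject_mcp_tools(request: dict, mcp_tools: list) -> dict:
    """Append gateway-registered MCP tool definitions to the request's
    tools array before forwarding, skipping names the client already sent."""
    existing = {t["name"] for t in request.get("tools", [])}
    merged = list(request.get("tools", []))
    merged.extend(t for t in mcp_tools if t["name"] not in existing)
    return {**request, "tools": merged}

mcp_tools = [  # hypothetical gateway-side MCP tool registration
    {"name": "web_search", "description": "Search the web",
     "input_schema": {"type": "object"}},
]
req = {"model": "claude-sonnet-4-6", "messages": [], "tools": []}
print([t["name"] for t in inject_mcp_tools(req, mcp_tools)["tools"]])  # ['web_search']
```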

Start Routing Claude Code Through Any Provider

Running Claude Code with non-Anthropic models is no longer an experimental setup. It is a production requirement for teams managing cost, compliance, and provider redundancy at scale. Bifrost handles the protocol translation, governance, failover, and observability in a single open-source gateway with negligible latency overhead.

To see how Bifrost can route Claude Code across your existing provider mix with full enterprise controls, book a Bifrost demo and walk through the integration with the Bifrost team.