Top Enterprise AI Gateways to Use Non-Anthropic Models in Claude Code
TL;DR
Claude Code is locked to Anthropic's models by default. Enterprise AI gateways solve this by sitting between Claude Code and any LLM provider, translating API formats transparently so you can route requests to OpenAI, Google Gemini, Mistral, AWS Bedrock, or any other provider without modifying the client. This article covers five gateways that make this possible: Bifrost, LiteLLM, Cloudflare AI Gateway, Kong AI Gateway, and OpenRouter.
Claude Code has rapidly become one of the most capable terminal-based AI coding agents available. But it only speaks Anthropic's API protocol. For engineering teams that need provider flexibility or the ability to use specific non-Anthropic models for certain tasks, this creates a real bottleneck.
AI gateways solve this. They intercept Claude Code's Anthropic-formatted requests, translate them to the target provider's format, and translate the responses back into Anthropic's format. Claude Code never knows the difference. The integration is simple: set ANTHROPIC_BASE_URL to point at the gateway, and you are running multi-model Claude Code.
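The pattern is the same for every gateway in this list; only the base URL and key change. A minimal generic sketch (both values below are placeholders, not a real gateway):

```shell
# Point Claude Code at any Anthropic-compatible gateway endpoint.
# Both values are placeholders for illustration.
export ANTHROPIC_BASE_URL="https://your-gateway.example.com/anthropic"
export ANTHROPIC_API_KEY="your-gateway-key"

# claude   # launch Claude Code; every request now routes through the gateway
```

Because Claude Code reads these variables at startup, no client-side code changes are ever required.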
Here are five enterprise AI gateways that enable this today.
1. Bifrost
Bifrost is an open-source, high-performance AI gateway built in Go by Maxim AI. It acts as a fully compatible Anthropic API endpoint, making it purpose-built for the Claude Code use case. The integration requires a single environment variable change and zero client-side modifications.
How It Works with Claude Code
Bifrost provides a /anthropic endpoint that accepts native Anthropic-formatted requests. Point Claude Code at it and every request flows through the gateway:
export ANTHROPIC_API_KEY=your-bifrost-virtual-key
export ANTHROPIC_BASE_URL=http://localhost:8080/anthropic
claude
From here, Bifrost translates requests to any of its 20+ supported providers, including OpenAI, AWS Bedrock, Google Vertex, Azure, Groq, and Mistral. You can override Claude Code's default model tiers (Sonnet, Opus, Haiku) to use models from any configured provider:
export ANTHROPIC_DEFAULT_SONNET_MODEL="openai/gpt-5"
export ANTHROPIC_DEFAULT_OPUS_MODEL="anthropic/claude-opus-4-5-20251101"
export ANTHROPIC_DEFAULT_HAIKU_MODEL="azure/claude-haiku-4-5"
Mid-session model switching also works. Running /model vertex/claude-haiku-4-5 inside Claude Code reroutes traffic to a different provider instantly.
MCP Gateway Integration
Where Bifrost pulls ahead for agentic workflows is its MCP gateway capability. Bifrost acts as both an MCP client and server. It connects to external MCP tool servers (filesystem, web search, databases, custom APIs) and exposes them through a single /mcp endpoint. Adding it to Claude Code takes one command:
claude mcp add --transport http bifrost http://localhost:8080/mcp
Bifrost controls tool access through Virtual Keys, so different developers or teams can have different tool permissions, all enforced at the gateway level.
Enterprise Capabilities
Beyond model routing, Bifrost brings production-grade infrastructure features. Automatic failover reroutes traffic when a provider goes down. Semantic caching reduces redundant API calls by matching requests on meaning rather than exact text. Virtual Keys provide hierarchical budget management, rate limiting, and access control across teams. Every request gets logged with full metadata through the built-in observability dashboard, with Prometheus metrics and OpenTelemetry support for production monitoring.
Getting started takes seconds:
npx -y @maximhq/bifrost
2. LiteLLM
Platform Overview
LiteLLM is an open-source Python proxy that provides a unified OpenAI-compatible interface across 100+ LLM providers. It can be configured with Claude Code by setting ANTHROPIC_BASE_URL to point at the LiteLLM proxy endpoint.
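A sketch of the setup, assuming LiteLLM's standard `model_list` config format and its default proxy port of 4000; the model alias and backing model below are illustrative, so adjust them to your deployment:

```shell
# Sketch: route Claude Code through a local LiteLLM proxy.
# Model names and port are illustrative, not prescriptive.
cat > litellm_config.yaml <<'EOF'
model_list:
  - model_name: claude-sonnet-4-5        # alias Claude Code will request
    litellm_params:
      model: openai/gpt-4o               # actual backing model
      api_key: os.environ/OPENAI_API_KEY
EOF

# Start the proxy (requires: pip install 'litellm[proxy]'):
# litellm --config litellm_config.yaml --port 4000

export ANTHROPIC_BASE_URL="http://localhost:4000"
# claude
```

With this config, requests for the aliased model are transparently served by the backing provider defined in `litellm_params`.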
Features
Supports 100+ providers. Includes virtual key management, per-key spend tracking, and configurable routing strategies (latency-based, cost-based). Available as both a Python SDK and a standalone proxy server.
Best For
Python-heavy engineering teams that need quick multi-provider access during development and prototyping. Its wide provider list makes it a flexible starting point, though its Python runtime introduces higher latency at scale compared to compiled alternatives.
3. Cloudflare AI Gateway
Platform Overview
Cloudflare AI Gateway is a managed service running on Cloudflare's global edge network. It proxies and manages LLM API calls with minimal setup through the Cloudflare dashboard. Claude Code integration works by pointing ANTHROPIC_BASE_URL to a Cloudflare gateway endpoint.
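The integration follows Cloudflare's provider-specific endpoint pattern; the account and gateway IDs below are placeholders, and the exact URL for your gateway should be confirmed in the Cloudflare dashboard:

```shell
# Placeholder IDs; substitute your own from the Cloudflare dashboard.
export ANTHROPIC_BASE_URL="https://gateway.ai.cloudflare.com/v1/YOUR_ACCOUNT_ID/YOUR_GATEWAY_ID/anthropic"
export ANTHROPIC_API_KEY="your-anthropic-key"   # your provider key; Cloudflare proxies it through
# claude
```

Note that with this setup you still authenticate against the upstream provider; the gateway adds caching, rate limiting, and analytics in the request path.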
Features
Request caching (exact-match), rate limiting, usage analytics, and logging for LLM traffic. Runs across Cloudflare's 250+ points of presence. A generous free tier covers core features including dashboard analytics and basic logging.
Best For
Teams already embedded in the Cloudflare ecosystem that want low-friction AI traffic management without self-hosting. Note that it lacks semantic caching, per-developer cost attribution, and self-hosted deployment options.
4. Kong AI Gateway
Platform Overview
Kong AI Gateway extends Kong's mature enterprise API management platform with AI-specific plugins for multi-LLM routing and governance. It fits naturally into organizations already standardizing on Kong for API infrastructure.
Features
Token-based rate limiting operates on token consumption rather than raw request counts. Includes AI-specific plugins for prompt templating, response transformation, and traffic control. Leverages Kong's existing ecosystem of authentication, logging, and analytics plugins.
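A rough sketch of what this looks like in Kong's declarative config, using the `ai-proxy` plugin; the field values below are illustrative and the plugin schema should be verified against Kong's current documentation before use:

```shell
# Sketch: declarative Kong config routing LLM traffic through the ai-proxy
# plugin. Route paths, model name, and key are illustrative placeholders.
cat > kong.yml <<'EOF'
_format_version: "3.0"
services:
  - name: anthropic-service
    url: https://api.anthropic.com
    routes:
      - name: claude-code-route
        paths:
          - /anthropic
plugins:
  - name: ai-proxy
    config:
      route_type: llm/v1/chat
      auth:
        header_name: x-api-key
        header_value: "<your-provider-key>"
      model:
        provider: anthropic
        name: claude-sonnet-4-5
EOF
# Load with: kong start -c kong.conf
# Then point ANTHROPIC_BASE_URL at the Kong proxy's /anthropic route.
```

The appeal here is that the same declarative file can also attach Kong's existing rate-limiting, authentication, and logging plugins to the LLM route.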
Best For
Enterprise teams already running Kong for API management that want to extend their existing infrastructure to handle LLM traffic without introducing a separate gateway.
5. OpenRouter
Platform Overview
OpenRouter is a managed routing service providing a single API endpoint for accessing models across multiple providers. It handles billing aggregation and model availability tracking. OpenRouter provides documentation for Claude Code integration specifically.
Features
Single API key for accessing 200+ models from OpenAI, Anthropic, Google, Meta, Mistral, and open-source providers. Automatic model fallback, unified billing, and a model comparison interface with pay-per-use pricing.
Best For
Individual developers and smaller teams that want instant multi-model access without managing separate provider accounts or self-hosting infrastructure. It prioritizes convenience over the fine-grained governance controls that larger enterprise teams typically need.
Choosing the Right Gateway
The right gateway depends on where your team sits on the spectrum between convenience and control. Cloudflare works if you are already in their ecosystem. LiteLLM suits Python-native prototyping. Kong extends existing API infrastructure. OpenRouter removes friction for quick experimentation.
For teams building production agentic workflows with Claude Code that demand high throughput, MCP tool orchestration, and enterprise governance, Bifrost is purpose-built for the job, with native integration into Maxim AI's evaluation and observability platform for a complete stack under an Apache 2.0 license.