How to Run Claude Code with OpenAI, Gemini, Bedrock, and Any LLM Using Bifrost CLI
Claude Code is one of the most capable agentic coding tools available today, bringing AI-powered development directly into the terminal. By default, however, it is locked to Anthropic's model family. Teams that want to experiment with GPT-5, Gemini 2.5 Pro, Groq-hosted models, or deployments on AWS Bedrock and Azure must maintain entirely separate tooling or manually reconfigure environment variables for every session.
Bifrost CLI eliminates this constraint. It is an interactive terminal tool that connects Claude Code (along with Codex CLI, Gemini CLI, and Opencode) to the Bifrost AI Gateway, enabling developers to run Claude Code with any model from any provider through a single command. No environment variables, no config file editing, no provider-specific setup.
Why Claude Code Needs an AI Gateway
Claude Code relies heavily on tool calling for file operations, terminal commands, and code editing. By default, all requests route directly to Anthropic's API. This creates several limitations for engineering teams working at scale:
- No model flexibility: Teams cannot swap in OpenAI, Google, Mistral, or self-hosted models without manually reconfiguring environment variables for each session.
- No automatic failover: If Anthropic's API experiences downtime or rate limiting, Claude Code sessions halt entirely. There is no built-in mechanism to fail over to an alternative provider.
- No cost governance: Without a gateway layer, there is no way to enforce per-developer budgets, rate limits, or usage tracking across a team of engineers using Claude Code concurrently.
- No centralized observability: Individual Claude Code sessions generate no unified telemetry. Engineering leads have no visibility into model usage patterns, error rates, or token consumption across the team.
Bifrost solves each of these problems by sitting between Claude Code and the upstream providers, acting as a high-performance proxy that adds only 11 microseconds of overhead per request at 5,000 requests per second.
Setting Up Bifrost CLI in Under Two Minutes
Getting started with Bifrost CLI requires Node.js 18+ and a running Bifrost gateway. The entire setup is interactive and guided.
Step 1: Start the Bifrost gateway
npx -y @maximhq/bifrost
This launches the gateway locally at http://localhost:8080 with a built-in web UI for provider configuration and real-time monitoring.
Step 2: Launch the CLI
npx -y @maximhq/bifrost-cli
The CLI walks through an interactive setup flow:
- Base URL: Enter the Bifrost gateway address (defaults to http://localhost:8080).
- Virtual key (optional): If the gateway has virtual key authentication enabled, enter the key here. Bifrost stores it securely in the OS keyring, never in plaintext on disk.
- Harness selection: Choose Claude Code from the list. If it is not installed, the CLI offers to install it via npm automatically.
- Model selection: The CLI fetches all available models from the gateway's /v1/models endpoint and presents a searchable, filterable list. Select any model from any configured provider.
Press Enter, and Claude Code launches with all environment variables, API keys, and provider paths configured automatically. No export ANTHROPIC_BASE_URL required.
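For comparison, here is roughly what the CLI automates. This is a sketch of the manual alternative: ANTHROPIC_BASE_URL is the override the article refers to, and the token value is a placeholder for a Bifrost virtual key, not a real credential.

```shell
# Manual alternative to the CLI's automatic setup (sketch).
# ANTHROPIC_BASE_URL points Claude Code at the local Bifrost gateway;
# the auth token value is a placeholder for a Bifrost virtual key.
export ANTHROPIC_BASE_URL="http://localhost:8080"
export ANTHROPIC_AUTH_TOKEN="<your-virtual-key>"
claude   # launch Claude Code against the gateway
```

Bifrost CLI performs the equivalent wiring per session, which is why none of this appears in the interactive flow.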
Running Claude Code with Non-Anthropic Models
Bifrost automatically translates Anthropic API requests to other provider formats, which means Claude Code can work with models from 20+ supported providers. The provider/model-name format specifies the target:
- OpenAI: openai/gpt-5
- Google Gemini: gemini/gemini-2.5-pro
- Groq: groq/llama-3.3-70b-versatile
- Mistral: mistral/mistral-large-latest
- xAI: xai/grok-3
- Self-hosted via Ollama: ollama/llama3
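To see which provider/model identifiers a given gateway actually exposes, the /v1/models endpoint mentioned above can be queried and parsed. The snippet below works offline against a hand-written sample response; the JSON shape is an assumption based on the common OpenAI-style list format, so verify it against your gateway's real output (e.g. curl -s http://localhost:8080/v1/models).

```shell
# Sketch: extract provider/model ids from a /v1/models response.
# The JSON below is a hand-written sample assuming the OpenAI-style
# list shape; a live gateway would be queried with curl instead.
cat > /tmp/models.json <<'EOF'
{"object":"list","data":[{"id":"openai/gpt-5"},{"id":"gemini/gemini-2.5-pro"},{"id":"groq/llama-3.3-70b-versatile"}]}
EOF
# Pull out the id fields (grep keeps this dependency-free; jq would be tidier)
grep -o '"id":"[^"]*"' /tmp/models.json | cut -d'"' -f4
```

Each emitted line is a ready-to-use provider/model-name value for the selection menu or the /model command.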
Claude Code uses three model tiers internally: Sonnet (default), Opus (complex tasks), and Haiku (lightweight). With Bifrost, each tier can be overridden independently to use any model from any provider. For example, a team could run GPT-5 for the primary Sonnet tier, Gemini 2.5 Pro for Opus-level reasoning, and a Groq-hosted model for fast Haiku tasks.
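The tier mapping above can be sketched with Claude Code's own model override variables: ANTHROPIC_MODEL drives the primary tier and ANTHROPIC_SMALL_FAST_MODEL the lightweight tier. Bifrost CLI sets the equivalents automatically when you pick a model, so this is only the manual form, shown to make the mechanism concrete.

```shell
# Manual sketch of the tier overrides described above.
# ANTHROPIC_MODEL sets the primary (Sonnet) tier; ANTHROPIC_SMALL_FAST_MODEL
# sets the lightweight (Haiku) tier. Bifrost CLI configures this for you.
export ANTHROPIC_BASE_URL="http://localhost:8080"
export ANTHROPIC_MODEL="openai/gpt-5"
export ANTHROPIC_SMALL_FAST_MODEL="groq/llama-3.3-70b-versatile"
```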
Mid-session model switching is also supported. The /model command in Claude Code allows developers to swap providers on the fly:
/model openai/gpt-5
/model gemini/gemini-2.5-pro
/model bedrock/claude-sonnet-4-5
One important constraint: non-Anthropic models must support tool calling. Claude Code depends on it for file operations, code editing, and terminal commands, so models without tool calling support will not function correctly.
Cloud Provider Passthrough: Bedrock, Vertex, and Azure
For enterprises running Claude models through their own cloud infrastructure, Bifrost CLI handles the authentication and routing complexity automatically.
- AWS Bedrock: Bifrost's Bedrock passthrough manages AWS authentication on behalf of Claude Code. Set CLAUDE_CODE_SKIP_BEDROCK_AUTH=1 and Bifrost handles credential management, including cross-region routing through its adaptive load balancer.
- Google Vertex AI: The same pattern applies with Bifrost's Vertex endpoint, which handles GCP OAuth and project configuration transparently.
- Azure: Since Claude Code lacks native Azure passthrough, Bifrost routes requests through its Anthropic-compatible endpoint and handles the translation to Azure-hosted models internally.
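For the Bedrock case, the wiring can be sketched with Claude Code's Bedrock-related settings. CLAUDE_CODE_SKIP_BEDROCK_AUTH=1 comes from the passthrough description above; CLAUDE_CODE_USE_BEDROCK and ANTHROPIC_BEDROCK_BASE_URL are Claude Code's Bedrock configuration variables, and pointing the base URL at a local Bifrost instance is an assumption to verify against your deployment.

```shell
# Sketch of the Bedrock passthrough wiring (verify variable values
# against your own Bifrost deployment).
export CLAUDE_CODE_USE_BEDROCK=1          # route Claude Code via its Bedrock path
export CLAUDE_CODE_SKIP_BEDROCK_AUTH=1    # let Bifrost hold the AWS credentials
export ANTHROPIC_BEDROCK_BASE_URL="http://localhost:8080"  # Bifrost passthrough
```

Bifrost CLI applies this configuration automatically when a Bedrock model is selected.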
This is particularly valuable for regulated industries where all LLM traffic must stay within a specific cloud environment. Bifrost's in-VPC deployment option ensures no data leaves the private network.
Tabbed Sessions and MCP Integration
Bifrost CLI does not exit after launching a single session. It maintains a tabbed terminal UI where developers can run multiple agent sessions in parallel. The bottom tab bar shows status badges for each session: active (working), idle (ready), or alert. Use Ctrl+B to switch between tabs, open new sessions with different models, or close completed ones.
For Claude Code specifically, the CLI automatically registers Bifrost's MCP Gateway endpoint, making all configured MCP tools available inside the coding session without any manual claude mcp add-json commands. If a virtual key is configured, authenticated MCP access is set up with the correct authorization headers automatically.
This means developers can use Claude Code to interact with external databases, filesystem tools, web search, and custom business logic through MCP servers, all routed and governed through Bifrost.
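For reference, the manual registration the CLI replaces would look something like the following. The claude mcp add-json command and the HTTP server shape are Claude Code's standard MCP configuration; the server name "bifrost" is arbitrary and the /mcp path is an assumed endpoint, so check the gateway UI for the real URL.

```shell
# Manual equivalent of the automatic MCP registration (sketch).
# "bifrost" is an arbitrary name; the /mcp path is an assumption.
claude mcp add-json bifrost '{
  "type": "http",
  "url": "http://localhost:8080/mcp"
}'
```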
Governance and Observability for Engineering Teams
When multiple developers on a team use Claude Code daily, cost and usage management becomes critical. Bifrost provides this through several layers:
- Virtual keys: Each developer or team can receive a virtual key with per-consumer budgets, rate limits, and model access permissions. One developer might have access to Opus-tier models while another is restricted to Sonnet.
- Budget and rate limits: Hierarchical cost controls at the virtual key, team, and customer level prevent runaway spending.
- Built-in observability: Every Claude Code request flowing through Bifrost is tracked with native Prometheus metrics and OpenTelemetry tracing. Engineering leads gain real-time visibility into token consumption, error rates, latency, and provider health across all Claude Code sessions.
- Audit logs: For compliance-sensitive environments, Bifrost Enterprise provides immutable audit trails covering every request.
Getting Started
Bifrost is open source on GitHub and the CLI requires just two commands to get running. For teams that need enterprise governance, adaptive load balancing, SSO integration, and in-VPC deployments for their Claude Code workflows, book a Bifrost demo to explore how it fits your infrastructure.