Running Claude Code with OpenAI Models: A Step-by-Step Setup Guide

Run Claude Code with OpenAI models like GPT-5 and GPT-4o using Bifrost, the open-source AI gateway. Step-by-step setup, zero code changes.

Running Claude Code with OpenAI models is one of the most common requests from engineering teams that have standardized on Anthropic's terminal-based coding agent but need flexibility on the model layer. Claude Code ships locked to Anthropic's API by default, which limits cost optimization, provider redundancy, and benchmarking workflows. Bifrost, the high-performance open-source AI gateway by Maxim AI, removes that constraint by translating Claude Code's Anthropic-format requests into OpenAI's Chat Completions format transparently. This guide walks through the complete setup, from installing Bifrost to overriding model tiers to validating tool-call behavior in production.

Why Run Claude Code with OpenAI Models

Claude Code's agentic capabilities (file operations, terminal commands, multi-file refactors) make it the default coding agent for many teams, but production engineering organizations frequently need to route traffic to OpenAI models for several reasons:

  • Cost optimization: GPT-5.2 is currently priced at $1.75 per million input tokens and $14 per million output tokens, which can be lower than Claude Sonnet for input-heavy workloads.
  • Model benchmarking: Comparing GPT-5 family performance against Claude on specific codebases without rewriting agent infrastructure.
  • Provider redundancy: Avoiding hard dependency on a single LLM provider for critical developer workflows.
  • Compliance and data residency: Some organizations require routing LLM traffic through Azure OpenAI for regional or regulatory reasons.
  • Specialized model strengths: Using OpenAI models for agentic terminal tasks where they have benchmark leads on Terminal-Bench style workloads.

Without an AI gateway, none of this is practical. Claude Code speaks Anthropic's Messages API exclusively, while OpenAI expects the Chat Completions format. Bifrost solves this by acting as a fully Anthropic-compatible endpoint that translates requests on the fly to whichever provider you target.
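
The difference is concrete. The two requests below sketch the same one-turn prompt in each API's native shape (the model names and prompt are illustrative):

# Anthropic Messages format: what Claude Code emits
curl https://api.anthropic.com/v1/messages \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{"model": "claude-sonnet-4-5", "max_tokens": 1024,
       "messages": [{"role": "user", "content": "Hello"}]}'

# OpenAI Chat Completions format: what the upstream expects
curl https://api.openai.com/v1/chat/completions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-5", "messages": [{"role": "user", "content": "Hello"}]}'

Beyond the different auth headers, the schemas disagree on required fields (max_tokens, the version header) and on how tool calls and streaming chunks are encoded; that delta is exactly what the gateway translates.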

How Bifrost Routes Claude Code Traffic to OpenAI

Bifrost sits between Claude Code and the upstream LLM providers as a high-performance proxy. The routing flow works as follows:

  • Claude Code sends an Anthropic Messages API request to Bifrost instead of api.anthropic.com.
  • Bifrost intercepts the request, parses the model identifier (for example, openai/gpt-5), and translates the request schema to OpenAI's Chat Completions format.
  • The translated request is dispatched to OpenAI's endpoint with the configured API key.
  • OpenAI's response is translated back to Anthropic's response schema and returned to Claude Code.
  • Claude Code processes the response as if it came from Anthropic directly.
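
You can watch this translation happen with nothing but curl. A minimal sketch, assuming the defaults used later in this guide (Bifrost on localhost:8080, with its Anthropic-compatible surface mounted at /anthropic):

# Anthropic-format request answered by an OpenAI model; the
# /anthropic/v1/messages path is assumed from the base URL set in Step 3
curl http://localhost:8080/anthropic/v1/messages \
  -H "x-api-key: dummy" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{"model": "openai/gpt-5", "max_tokens": 256,
       "messages": [{"role": "user", "content": "Say hello"}]}'

The response arrives in Anthropic's schema even though GPT-5 produced it, which is why Claude Code never notices the provider swap.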

This translation happens at runtime with only 11 microseconds of overhead per request at 5,000 requests per second in sustained benchmarks. Bifrost publishes independent performance benchmarks covering latency and throughput across the supported provider matrix.

Bifrost's drop-in replacement architecture is the foundation that makes this work. Claude Code only needs a base URL change; everything else stays identical.

Step 1: Install and Start Bifrost

Bifrost runs as a local gateway that Claude Code connects to over HTTP. The fastest path to a running instance is the NPX installer.

npx -y @maximhq/bifrost

This command starts Bifrost on http://localhost:8080 with a built-in web UI for provider configuration, request logs, and real-time monitoring. The gateway is zero-config: no YAML files, no environment scaffolding required to start.
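
Before wiring in Claude Code, it is worth confirming the gateway is actually listening. Since the web UI serves on the same port, a plain HTTP probe is enough:

# Any HTTP response here means the gateway is up and listening
curl -sI http://localhost:8080 | head -n 1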

For production deployments, Docker is the recommended path:

docker run -p 8080:8080 -v $(pwd)/data:/app/data maximhq/bifrost

Bifrost also supports Kubernetes deployment for teams running it as shared infrastructure across engineering organizations.

Step 2: Configure OpenAI as a Provider

Once Bifrost is running, open the web UI at http://localhost:8080 and add OpenAI as a provider. You will need:

  • An OpenAI API key with access to the models you plan to use (GPT-5, GPT-4o, GPT-4o-mini, etc.)
  • A provider key name for tracking and rotation purposes

You can configure providers through the UI, the provider configuration API, or a config.json file. For a config-file approach, the structure looks like this:

{
  "providers": {
    "openai": {
      "keys": [
        {
          "name": "openai-primary",
          "value": "env.OPENAI_API_KEY",
          "models": [],
          "weight": 1.0
        }
      ]
    }
  }
}

Bifrost reads env.OPENAI_API_KEY from the environment, which keeps the actual key out of the config file. For enterprise deployments, integration with HashiCorp Vault, AWS Secrets Manager, and similar systems replaces direct environment variable usage.
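
In practice that means the key must be present in the environment Bifrost starts from. For the local and Docker paths from Step 1 (the Docker variant assumes the container reads the same variable):

# Local: export the key in the same shell that starts the gateway
export OPENAI_API_KEY=sk-...   # your real OpenAI key
npx -y @maximhq/bifrost

# Docker: pass the variable through to the container
docker run -p 8080:8080 -e OPENAI_API_KEY \
  -v $(pwd)/data:/app/data maximhq/bifrost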

Step 3: Point Claude Code at Bifrost

With Bifrost running and OpenAI configured, redirect Claude Code to the gateway by setting two environment variables:

export ANTHROPIC_API_KEY=your-bifrost-virtual-key
export ANTHROPIC_BASE_URL=http://localhost:8080/anthropic

If virtual keys are disabled in your Bifrost instance, set ANTHROPIC_API_KEY to dummy. With virtual keys enabled, this value becomes your governance handle: per-key budgets, rate limits, and tool filtering all anchor on it.

After exporting these variables, run claude in a new terminal session. Claude Code now routes every request through Bifrost.
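
A one-shot prompt is a quick way to verify the wiring; claude -p (print mode) runs a single prompt non-interactively and exits:

# Sanity check: the reply should round-trip through Bifrost, not api.anthropic.com
claude -p "Reply with the single word OK"

If you get a reply, open Bifrost's request log in the web UI to confirm the request actually passed through the gateway.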

Step 4: Override Model Tiers to Use OpenAI Models

Claude Code uses three model tiers internally: Sonnet (default for most tasks), Opus (complex reasoning), and Haiku (fast, lightweight operations). Each tier can be independently overridden using environment variables:

# Replace the Sonnet tier with GPT-5 for primary coding tasks
export ANTHROPIC_DEFAULT_SONNET_MODEL="openai/gpt-5"

# Replace the Opus tier with GPT-4o for complex reasoning
export ANTHROPIC_DEFAULT_OPUS_MODEL="openai/gpt-4o"

# Keep Haiku on Anthropic for speed-sensitive operations
export ANTHROPIC_DEFAULT_HAIKU_MODEL="anthropic/claude-haiku-4-5-20251001"

The provider/model format tells Bifrost which provider and model to route to. Any provider configured in your Bifrost instance is available here, including 20+ supported providers such as Anthropic, AWS Bedrock, Google Vertex AI, Azure OpenAI, Groq, Mistral, Cohere, and Ollama for local inference.

For full OpenAI routing, override all three tiers:

export ANTHROPIC_DEFAULT_SONNET_MODEL="openai/gpt-5"
export ANTHROPIC_DEFAULT_OPUS_MODEL="openai/gpt-4o"
export ANTHROPIC_DEFAULT_HAIKU_MODEL="openai/gpt-4o-mini"

You can also launch Claude Code with a specific model using the --model flag, or switch models mid-session using /model openai/gpt-5 inside Claude Code itself. The switch is instantaneous and preserves conversation context.
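
If you switch configurations often, a small shell function keeps the overrides scoped to a single invocation instead of leaking into your global environment (the function name is just a local convention):

# Launch Claude Code fully routed to OpenAI for this invocation only;
# the prefix assignments apply solely to this claude process
claude-gpt5() {
  ANTHROPIC_DEFAULT_SONNET_MODEL="openai/gpt-5" \
  ANTHROPIC_DEFAULT_OPUS_MODEL="openai/gpt-4o" \
  ANTHROPIC_DEFAULT_HAIKU_MODEL="openai/gpt-4o-mini" \
  claude "$@"
}

Run claude-gpt5 exactly as you would run claude, passing any flags through.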

Step 5: Validate Tool-Call Behavior

This step is the most important and the most frequently skipped. Claude Code depends heavily on tool calling for file operations, terminal commands, and code editing. Not every model and not every provider streams tool call arguments correctly. Before treating a configuration as production-ready, run a tool-heavy task and verify:

  • File reads and writes complete successfully.
  • Multi-step terminal commands execute without empty argument errors.
  • Long-running edits maintain conversation context across turns.

OpenAI's GPT-5 and GPT-4o models support tool calling natively and work reliably with Claude Code. Some aggregator-style providers do not stream function-call arguments correctly, which causes Claude Code to fail silently on file operations. If a model passes basic chat tests but fails on file edits, the most likely cause is incomplete tool-call streaming on the upstream provider. In that case, switch to a different provider in your Bifrost configuration.
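
A minimal validation pass looks like the following sketch. The file name and prompt wording are illustrative, and depending on your permission settings Claude Code may ask you to approve each tool call first:

# Force a file-write tool call and a terminal tool call, then verify on disk
claude -p "Create a file named bifrost_toolcheck.txt containing the word ok, then run cat on it and report the output"
cat bifrost_toolcheck.txt   # should print: ok
rm bifrost_toolcheck.txt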

Step 6: Add Failover, Caching, and Governance

Once Claude Code is routing through Bifrost, the full set of gateway features becomes available without additional configuration on the Claude Code side:

  • Automatic failover: Configure fallback chains so that if OpenAI hits a rate limit or experiences an outage, traffic shifts to Anthropic, Bedrock, or Vertex without dropping the session (a config sketch follows this list).
  • Semantic caching: Bifrost's semantic caching reduces token spend for semantically similar requests, which compounds quickly across a team using Claude Code daily.
  • Virtual keys and budgets: Virtual keys enforce per-developer or per-team budgets, rate limits, and model access policies. Engineering managers gain hierarchical cost control across teams and customers.
  • Observability: Bifrost exports OpenTelemetry traces and Prometheus metrics natively, so every Claude Code request is captured in Grafana, New Relic, or Datadog dashboards.
  • Guardrails: For regulated environments, enterprise guardrails integrate AWS Bedrock Guardrails, Azure Content Safety, and Patronus AI to enforce policy on Claude Code traffic.
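
To make the failover idea concrete, here is a hypothetical extension of the Step 2 config.json. The fallbacks key and its value format are illustrative only, not Bifrost's documented schema; consult the Bifrost fallback documentation for the real keys:

# Hypothetical sketch: the "fallbacks" field below is illustrative, not
# Bifrost's documented schema
cat > config.json <<'EOF'
{
  "providers": {
    "openai": {
      "keys": [
        { "name": "openai-primary", "value": "env.OPENAI_API_KEY", "weight": 1.0 }
      ],
      "fallbacks": ["anthropic/claude-sonnet-4-5"]
    }
  }
}
EOF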

For teams managing Claude Code at scale, the Claude Code integration page covers the full configuration matrix including AWS Bedrock passthrough and Google Vertex AI authentication.

Persistent Configuration for Daily Workflows

Adding the Bifrost environment variables to a single shell session works for testing. For persistent use, append the exports to ~/.bashrc, ~/.zshrc, or your shell's equivalent:

# Bifrost gateway
export ANTHROPIC_API_KEY=your-bifrost-virtual-key
export ANTHROPIC_BASE_URL=http://localhost:8080/anthropic

# Model tier overrides
export ANTHROPIC_DEFAULT_SONNET_MODEL="openai/gpt-5"
export ANTHROPIC_DEFAULT_OPUS_MODEL="openai/gpt-4o"
export ANTHROPIC_DEFAULT_HAIKU_MODEL="openai/gpt-4o-mini"

If you prefer running Claude Code inside VS Code, install the Claude Code extension and add the same variables to its settings. Bifrost handles every Claude Code surface (CLI, VS Code, JetBrains) identically because it operates at the protocol layer.

Get Started Running Claude Code with OpenAI Models

Routing Claude Code through Bifrost takes a few minutes and unlocks the full multi-provider model flexibility that engineering teams need in 2026. Once configured, you can run Claude Code with OpenAI models, Anthropic models, Bedrock-hosted Claude, Vertex Gemini, Groq-hosted open-weights, or local Ollama, all from the same Claude Code interface, with a single environment variable controlling the active model.

To see how Bifrost handles enterprise Claude Code deployments at scale, including clustering, RBAC, in-VPC deployments, and audit logging, book a demo with the Bifrost team or explore the Bifrost GitHub repository to start running Claude Code with OpenAI models today.