Try Bifrost Enterprise free for 14 days. Request access

Route Claude Code Through Azure Using Bifrost

Route Claude Code Through Azure Using Bifrost
Bifrost is an open-source AI gateway that lets teams point Claude Code at Azure-hosted models with a single environment variable change, adding governance, observability, and provider flexibility on top.

Many enterprise engineering teams run Azure OpenAI Service for compliance, data residency, and procurement reasons. Claude Code, by default, only targets Anthropic's API. Routing Claude Code traffic through Azure requires a translation layer that speaks both Anthropic's Messages API and Azure's deployment-based endpoint format. Bifrost, the open-source Go-based AI gateway by Maxim AI, handles that translation automatically, with sub-12µs added overhead at 5,000 RPS.

This guide covers how to configure the Azure provider in Bifrost and point Claude Code at it using provider-specific model pinning.


Why Route Claude Code Through Azure

Claude Code's native configuration routes all requests to api.anthropic.com. For teams with Azure-first infrastructure, that creates friction at several levels:

  • Data residency: Some organizations require that LLM traffic stays within a specific Azure region or sovereign cloud boundary.
  • Existing procurement: Azure credits, enterprise agreements, and cost tracking infrastructure are already in place.
  • Compliance posture: Regulated industries (financial services, healthcare, government) often require traffic to pass through approved, audited cloud infrastructure.
  • Multi-provider resilience: Teams that want Azure as a primary provider with Anthropic as a fallback, or vice versa, need a routing layer that handles the protocol differences transparently.

Bifrost sits between Claude Code and Azure OpenAI Service. Claude Code sends standard Anthropic Messages API requests to Bifrost. Bifrost maps the model name to the correct Azure deployment ID, converts the request format as needed, forwards it to your Azure endpoint, and returns the response in Anthropic format. Claude Code never needs to know the difference.


Prerequisites

Before configuring the integration, you need:

  • Bifrost running locally or on a server (see the gateway setup docs)
  • An Azure OpenAI resource with at least one model deployment (Claude models or OpenAI models such as GPT-4o)
  • One of the following credentials: an Azure API key, an Entra ID service principal, or a managed identity attached to the host infrastructure
  • Claude Code installed: curl -fsSL <https://claude.ai/install.sh> | bash

Step 1: Configure the Azure Provider in Bifrost

Bifrost supports three authentication methods for Azure: API key, Entra ID service principal, and managed identity via DefaultAzureCredential. The right choice depends on how your Azure environment is set up.

Authentication method: API key

For most individual setups and development environments, direct API key authentication is the fastest path.

Using the Bifrost REST API:

# Register the Azure provider
curl -X POST <http://localhost:8080/api/providers> \
  -H "Content-Type: application/json" \
  -d '{"provider": "azure"}'

# Add a key with deployment aliases
curl -X POST <http://localhost:8080/api/providers/azure/keys> \
  -H "Content-Type: application/json" \
  -d '{
    "name": "azure-api-key",
    "value": "env.AZURE_API_KEY",
    "models": ["*"],
    "weight": 1.0,
    "aliases": {
      "claude-sonnet-4-6": "my-claude-sonnet-deployment",
      "claude-haiku-4-5": "my-claude-haiku-deployment",
      "gpt-4o": "my-gpt4o-deployment"
    },
    "azure_key_config": {
      "endpoint": "env.AZURE_ENDPOINT",
      "api_version": "2024-10-21"
    }
  }'

Or via config.json:

{
  "providers": {
    "azure": {
      "keys": [
        {
          "name": "azure-api-key",
          "value": "env.AZURE_API_KEY",
          "models": ["*"],
          "weight": 1.0,
          "aliases": {
            "claude-sonnet-4-6": "my-claude-sonnet-deployment",
            "claude-haiku-4-5": "my-claude-haiku-deployment"
          },
          "azure_key_config": {
            "endpoint": "env.AZURE_ENDPOINT",
            "api_version": "2024-10-21"
          }
        }
      ]
    }
  }
}

The aliases field (available from Bifrost v1.5.0-prerelease2 onward) maps model names Claude Code sends to Azure deployment IDs. If no alias matches, Bifrost falls back to using the model name directly as the deployment ID.

Authentication method: Entra ID service principal

For production environments, Entra ID service principal authentication avoids storing static API keys and integrates with Azure's RBAC model.

{
  "providers": {
    "azure": {
      "keys": [
        {
          "name": "azure-entra-key",
          "value": "",
          "models": ["*"],
          "weight": 1.0,
          "aliases": {
            "claude-sonnet-4-6": "my-claude-sonnet-deployment",
            "gpt-4o": "my-gpt4o-deployment"
          },
          "azure_key_config": {
            "endpoint": "env.AZURE_ENDPOINT",
            "client_id": "env.AZURE_CLIENT_ID",
            "client_secret": "env.AZURE_CLIENT_SECRET",
            "tenant_id": "env.AZURE_TENANT_ID",
            "scopes": ["<https://cognitiveservices.azure.com/.default>"],
            "api_version": "2024-08-01-preview"
          }
        }
      ]
    }
  }
}

Required Azure roles for the service principal:

  • OpenAI models: Cognitive Services OpenAI User
  • Anthropic (Claude) models: Cognitive Services AI Services User

Authentication method: managed identity

For deployments running on Azure infrastructure (AKS, App Service, Azure VMs), leave value and all Entra ID fields empty. Bifrost calls DefaultAzureCredential, which checks environment variables, workload identity, and managed identity in that order. No credentials need to be stored or rotated.

{
  "azure_key_config": {
    "endpoint": "env.AZURE_ENDPOINT",
    "api_version": "2024-10-21"
  }
}

Authentication precedence is fixed: Entra ID (if all three fields are set) takes priority over API key, which takes priority over DefaultAzureCredential.


Step 2: Create a Virtual Key for Claude Code

Virtual keys are Bifrost's access control layer. Claude Code authenticates against Bifrost using a virtual key rather than a raw Azure credential, which means individual developers never hold production Azure secrets.

Create a virtual key in the Bifrost dashboard under Governance > Virtual Keys, or via the API:

curl -X POST <http://localhost:8080/api/virtual-keys> \
  -H "Content-Type: application/json" \
  -d '{
    "name": "claude-code-azure",
    "allowed_providers": ["azure"],
    "budget": {
      "monthly_limit_usd": 100
    }
  }'

The response includes a bf- prefixed virtual key string. Copy it for the next step.

Virtual keys support budget and rate limits at the key, team, and customer levels, so enterprise teams can enforce per-developer spending caps without modifying Azure credentials or policies.


Step 3: Configure Claude Code to Use Bifrost

Claude Code uses ANTHROPIC_BASE_URL to override the default Anthropic API endpoint. Setting it to http://localhost:8080/anthropic (or your Bifrost instance URL) redirects all traffic through Bifrost.

Update ~/.claude/settings.json (global) or .claude/settings.json (project-level):

"env": {
  "ANTHROPIC_BASE_URL": "<http://localhost:8080/anthropic>",
  "ANTHROPIC_AUTH_TOKEN": "bf-your-virtual-key",
  "ANTHROPIC_DEFAULT_HAIKU_MODEL": "azure/claude-haiku-4-5",
  "ANTHROPIC_DEFAULT_SONNET_MODEL": "azure/claude-sonnet-4-6"
}

The azure/ prefix tells Bifrost which provider to route the request to. Bifrost strips the prefix, looks up the deployment alias for claude-haiku-4-5 or claude-sonnet-4-6, and forwards the request to the Azure endpoint.

Why ANTHROPIC_AUTH_TOKEN is recommended: this method sends the virtual key as a Bearer token in the Authorization header. Claude Code does not need to log in to an Anthropic account. Bifrost handles authentication entirely through the virtual key.

After saving settings.json, restart Claude Code. You can verify the active model inside a session with /model (no arguments).

Verifying the connection

Start a session and run a simple prompt. In the Bifrost dashboard under Logs, you should see a request with provider azure and the deployment name you configured. All agent interactions are logged at http://localhost:8080/logs through the built-in observability interface.


Launching Claude Code Without Environment Variables: Bifrost CLI

Once the Azure provider is configured in Bifrost and a virtual key is created (Steps 1 and 2 above), there are two ways to connect Claude Code to the gateway. Step 3 showed the manual approach via settings.json. The Bifrost CLI is the alternative: an interactive terminal tool that sets all environment variables and launches Claude Code automatically, with no file editing required.

The CLI does not replace or bypass the gateway setup. It requires a running Bifrost gateway and connects to it at the URL you specify. What it replaces is the manual step of setting ANTHROPIC_BASE_URL, ANTHROPIC_AUTH_TOKEN, and model pins in settings.json.

Install and run With your Bifrost gateway already running, open a second terminal and run:

npx -y @maximhq/bifrost-cli

No global install is required. npx downloads and runs the latest version.

Interactive setup flow

The CLI walks through five prompts:

  1. Base URL: enter your Bifrost gateway URL (default: http://localhost:8080)
  2. Virtual key: enter the bf- prefixed virtual key created in Step 2, or press Enter to skip
  3. Harness: select Claude Code from the agent list
  4. Model: the CLI fetches available models from your gateway's /v1/models endpoint; type to filter and select an Azure-hosted model such as azure/claude-sonnet-4-6
  5. Launch: review the summary and press Enter

The CLI sets ANTHROPIC_BASE_URL, ANTHROPIC_AUTH_TOKEN, and the model pins automatically, then launches Claude Code. It also auto-attaches Bifrost's MCP server to Claude Code so tool access is available without a separate claude mcp add command.

Switching models between sessions

The CLI runs inside a persistent tabbed terminal UI. When a Claude Code session ends, the CLI returns to the summary screen with the previous configuration intact. Press m to pick a different model (including switching between Azure deployments or other configured providers) and press Enter to re-launch.

Virtual keys are stored in the OS keyring (macOS Keychain, Windows Credential Manager, or Linux Secret Service), not written to any config file on disk.


Deployment Alias and Routing Details

Azure requires every model to map to a named deployment. Bifrost resolves this in three layers (highest to lowest priority):

  1. Per-request: { "deployment": "custom-deployment" } in the request body
  2. Key-level aliases: the aliases map in the key configuration
  3. Model name fallback: the model name itself is used as the deployment ID

For Claude Code integrations, key-level aliases are the right pattern. Teams running multiple deployments (for example, a standard-throughput and a provisioned-throughput deployment of the same model) can configure separate virtual keys pointing to different alias sets.

Azure also handles OpenAI and Anthropic models differently. When Bifrost detects an Anthropic model name (any claude-* prefix), it routes to /anthropic/v1/ endpoints and sets the API version to 2023-06-01 automatically. For OpenAI models, it routes to /openai/deployments/{id}/chat/completions. This detection is transparent to Claude Code.


Adding Fallback to Direct Anthropic

For teams that want Azure as the primary provider but need Anthropic as a fallback, Bifrost's retries and fallbacks handle this at the gateway level. Add both providers to the Bifrost configuration and use provider routing to set priority order.

When Azure returns a 429 or 5xx, Bifrost retries against the next configured provider automatically. Claude Code receives a successful response either way, with no session interruption.


Governance, Observability, and Compliance

Routing Claude Code through Bifrost adds a governance layer that Azure OpenAI Service alone does not provide at the per-developer level:

  • Virtual keys: each developer or team gets a scoped credential with its own budget and model access restrictions
  • Audit logs: immutable per-request records suitable for SOC 2, HIPAA, and ISO 27001 reviews (Enterprise tier)
  • RBAC: role-based access control integrated with Okta and Microsoft Entra SSO (Enterprise tier)
  • Observability: native Prometheus metrics, OpenTelemetry tracing, and real-time request monitoring via the built-in log dashboard

For regulated industries, Bifrost also supports in-VPC deployments where the gateway itself runs inside the same Azure Virtual Network as the Azure OpenAI resource, keeping all LLM traffic off the public internet entirely.

Teams evaluating enterprise AI gateway capabilities in full detail can reference the LLM Gateway Buyer's Guide for a structured capability comparison.


Connecting Claude Code to MCP Tools Through Bifrost

Beyond model routing, Bifrost exposes all configured MCP servers through a single /mcp endpoint. Claude Code can connect to this endpoint and access every MCP tool Bifrost manages, including filesystem, GitHub, databases, and custom tools, without configuring each server individually.

Add Bifrost as an MCP server in Claude Code:

claude mcp add --transport http bifrost <http://localhost:8080/mcp> \
  --header "Authorization: Bearer bf-your-virtual-key" \
  --scope user

MCP tool filtering is enforced per virtual key. Each Claude Code user only sees the tools their virtual key is authorized to access. For deeper coverage of the MCP gateway pattern, see the Bifrost MCP gateway resource page.


Summary: Azure Routing in Four Steps

  1. Start Bifrost locally or on your server
  2. Add the Azure provider with your endpoint, credentials, and deployment aliases
  3. Create a virtual key with Azure access enabled
  4. Connect Claude Code to the gateway: edit settings.json manually (Step 3), or run npx -y @maximhq/bifrost-cli to have the CLI set environment variables and launch Claude Code interactively

After that, Claude Code routes through Azure transparently. Token usage, latency, and provider activity appear in the Bifrost log dashboard. All governance rules configured at the virtual key level apply automatically to every Claude Code session using that key.

For enterprise deployments with clustering, cross-region routing, RBAC, and audit logging, explore Bifrost Enterprise or book a demo with the Bifrost team.