How to Connect Claude Code to an MCP Gateway

Connect Claude Code to an MCP gateway with Bifrost to centralize tool access, enforce governance, and cut token costs across every connected MCP server.

Claude Code has become one of the most widely used terminal coding agents for AI-assisted engineering, and its native support for the Model Context Protocol (MCP) lets it reach into filesystems, databases, GitHub, web search, and any number of internal tools. The problem starts when that number grows. Connecting Claude Code to a handful of MCP servers is trivial. Connecting it to fifteen, each with its own credentials, auth flow, and config block, is how teams end up with tool sprawl, no access control, and no cost visibility. An MCP gateway fixes that by sitting in front of every upstream server and exposing them to Claude Code through a single endpoint. Bifrost is the open-source AI gateway built for exactly this pattern.

What an MCP gateway is, and why Claude Code needs one

An MCP gateway is a control plane that sits between an MCP client (like Claude Code) and the MCP servers that provide tools. It aggregates tool discovery, centralizes authentication, enforces per-consumer access control, and logs every tool call in one place. Instead of Claude Code holding N separate MCP server configurations, it holds one: the gateway.

The Model Context Protocol was introduced by Anthropic in November 2024 as an open standard for connecting AI applications to external data and tools. In late 2025, MCP was donated to the Agentic AI Foundation under the Linux Foundation, co-founded by Anthropic, Block, and OpenAI, with participation from Google, Microsoft, AWS, Cloudflare, and Bloomberg. The ecosystem now includes thousands of community-built servers, and MCP has emerged as the de facto standard for wiring agents to tools and data.

Claude Code supports MCP natively through the claude mcp add command and its MCP configuration surface. That works well for a single developer connecting to a few known servers. It stops working for teams running production workflows, shared tooling, regulated environments, or large tool catalogs. At that point, the bottleneck is not the client but the architecture.

The operational problems a gateway solves for Claude Code

When Claude Code connects directly to each MCP server, a set of infrastructure problems gets pushed onto the client itself:

  • Credential sprawl: every MCP server carries its own API key, OAuth flow, or auth token, stored locally on every developer's machine.
  • No access control: there is no policy layer that decides which developer, team, or workflow can call which tool.
  • No audit trail: tool calls happen inside a single client's memory and never land in a shared log.
  • No cost visibility: when tools call paid external APIs (search, enrichment, code execution), the costs show up as separate vendor bills with no tie back to the agent run that incurred them.
  • Context bloat: every connected MCP server injects its full tool list into the model's context on every request, inflating token usage as the catalog grows.

A gateway collapses all of that into a single control point. Anthropic's engineering team has written about similar dynamics, noting that as agent tool catalogs grow, the context cost of loading every tool definition on every request becomes the dominant portion of spend.

How Bifrost acts as an MCP gateway for Claude Code

Bifrost is the open-source enterprise AI gateway by Maxim AI. It functions as both an MCP client (connecting upstream to filesystems, databases, search APIs, and internal services) and an MCP server that exposes those tools through a single endpoint. Claude Code connects to Bifrost once. Bifrost handles the rest.

Capabilities that matter for Claude Code users include:

  • Unified /mcp endpoint that aggregates every connected MCP server into one connection.
  • Virtual keys for scoping which tools are available to which consumer.
  • Tool-level filtering (not just server level), so filesystem_read can be granted without filesystem_write.
  • Code Mode for lazy-loading tool definitions into the model context to reduce token usage.
  • Audit logs for every tool call, including the tool name, MCP server, arguments, result, latency, and the virtual key that triggered it.
  • OAuth 2.0 with PKCE for upstream MCP servers, detected automatically by OAuth-capable clients.
  • Health monitoring with automatic reconnection for upstream servers.

Bifrost adds 11 microseconds of overhead per request at 5,000 requests per second in its published performance benchmarks, so the gateway is unlikely to be the latency bottleneck next to model inference and upstream tool calls.

How to connect Claude Code to Bifrost MCP Gateway

End-to-end setup takes a few minutes. Bifrost runs as an HTTP gateway with a built-in web UI.

Step 1: Run Bifrost locally

The fastest path is NPX or Docker:

# NPX
npx -y @maximhq/bifrost

# OR Docker
docker run -p 8080:8080 maximhq/bifrost

Once running, open http://localhost:8080 to access the dashboard. Bifrost also deploys to Kubernetes, Docker Swarm, or bare metal using the same image.

Step 2: Connect upstream MCP servers

In the Bifrost dashboard, navigate to the MCP section and add each upstream server you want Claude Code to reach. Give it a name, choose the connection type (STDIO, HTTP, SSE, or in-process), and enter the endpoint or command. For HTTP and SSE servers, add any required headers (API keys, auth tokens, custom metadata) directly in the UI. Bifrost connects to each server, discovers its tools, and starts syncing on the configured interval. Full configuration options are covered in the MCP connecting to servers guide.

Step 3: Create a virtual key scoped for Claude Code

Create a virtual key for the Claude Code user or team. Under the MCP settings for that key, select which tools are allowed. The scoping is per-tool, so you can grant crm_lookup_customer without granting crm_delete_customer from the same server. Any request made with that key only sees the tools it is permitted to see. For managing access across many keys at once, MCP Tool Groups let you define a named collection of tools and attach it to any combination of keys, teams, or users.

Step 4: Add Bifrost as an MCP server in Claude Code

Bifrost exposes all connected MCP servers through a single /mcp endpoint. Add it to Claude Code using the standard CLI:

claude mcp add --transport http bifrost http://localhost:8080/mcp \
  --header "Authorization: Bearer vk_your_virtual_key"

For production deployments, point Claude Code at your deployed Bifrost URL (typically behind HTTPS via a reverse proxy) and use the virtual key appropriate for that user or environment. Claude Code will discover every tool from every MCP server connected to Bifrost, governed by the virtual key, through that one connection. Adding new upstream MCP servers to Bifrost surfaces them in Claude Code automatically, with no client-side config changes.
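
Teams that prefer a checked-in configuration over a per-developer CLI command can express the same connection as a project-scoped .mcp.json file. The sketch below generates one. It assumes Claude Code's mcpServers layout for HTTP servers (type, url, headers); the server name, URL, and key are the placeholders from the command above, and build_mcp_config is a hypothetical helper, not part of Claude Code or Bifrost.

```python
import json

# Placeholder values copied from the CLI example; replace both for a real deployment.
GATEWAY_URL = "http://localhost:8080/mcp"
VIRTUAL_KEY = "vk_your_virtual_key"


def build_mcp_config(url: str, virtual_key: str, name: str = "bifrost") -> dict:
    """Return an MCP config dict with a single HTTP server entry.

    Assumption: the layout follows Claude Code's .mcp.json format
    (mcpServers -> <name> -> type/url/headers).
    """
    return {
        "mcpServers": {
            name: {
                "type": "http",
                "url": url,
                "headers": {"Authorization": f"Bearer {virtual_key}"},
            }
        }
    }


config = build_mcp_config(GATEWAY_URL, VIRTUAL_KEY)

# Print the JSON that would be saved as .mcp.json in the project root.
print(json.dumps(config, indent=2))
```

Because this file would contain a credential, a real virtual key should be kept out of version control, for example by injecting it at generation time rather than committing it.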

Step 5: Verify the connection

Inside Claude Code, run the /mcp command to see the list of connected servers. Bifrost appears as a single server, and its tool list reflects only the tools your virtual key can access. From here, Claude Code can call any of those tools as part of its agent loop.

Scoping access with virtual keys and tool filtering

Production Claude Code deployments rarely run with unrestricted tool access. Bifrost's MCP tool filtering operates at two levels:

  • Virtual key scoping: each key carries a set of tools it is allowed to call. A customer-facing integration cannot reach internal admin tooling just because both are connected to Bifrost.
  • MCP Tool Groups: a named collection of tools that can be attached to any combination of virtual keys, teams, customers, or providers. Bifrost merges and deduplicates allowed tools at request time.
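
The merge-and-deduplicate behavior described above can be modeled in a few lines. This is an illustrative sketch only: ToolGroup, VirtualKey, and visible_tools are hypothetical names invented for the example and do not reflect Bifrost's internal types or API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ToolGroup:
    """Hypothetical named collection of tool names."""
    name: str
    tools: frozenset


@dataclass
class VirtualKey:
    """Hypothetical virtual key: direct tool grants plus attached groups."""
    key_id: str
    tools: set = field(default_factory=set)
    groups: list = field(default_factory=list)

    def allowed_tools(self) -> set:
        # Set union merges direct grants with every attached group
        # and removes duplicates in the same step.
        allowed = set(self.tools)
        for group in self.groups:
            allowed |= group.tools
        return allowed


def visible_tools(catalog: list, key: VirtualKey) -> list:
    """Allowlist filter: keep only catalog tools the key is permitted to see."""
    allowed = key.allowed_tools()
    return [tool for tool in catalog if tool in allowed]


catalog = [
    "crm_lookup_customer",
    "crm_delete_customer",
    "filesystem_read",
    "filesystem_write",
]
read_only = ToolGroup("read-only", frozenset({"crm_lookup_customer", "filesystem_read"}))
# filesystem_read is granted twice (directly and via the group) to show deduplication.
key = VirtualKey("vk_claude_code", tools={"filesystem_read"}, groups=[read_only])

print(visible_tools(catalog, key))
```

The key point of the design is that the filter is an allowlist: a tool that is neither granted directly nor through a group never appears in the list the client receives.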

Audit logging applies uniformly. Every tool call is a first-class log entry with the tool name, server, arguments, result, latency, virtual key, and parent LLM request that triggered it. For teams operating in regulated environments, this is what makes Bifrost's MCP gateway suitable for SOC 2, GDPR, HIPAA, and ISO 27001 audit scope.
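
To make the shape of such a log entry concrete, here is a minimal sketch of one record serialized as a JSON line. The field names mirror the attributes listed above, but ToolCallLog and every value in the example are assumptions for illustration, not Bifrost's actual log schema.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class ToolCallLog:
    """Hypothetical audit record for a single MCP tool call."""
    tool_name: str
    mcp_server: str
    arguments: dict
    result: str
    latency_ms: float
    virtual_key: str
    parent_request_id: str  # the LLM request that triggered this tool call


# Illustrative values only.
entry = ToolCallLog(
    tool_name="crm_lookup_customer",
    mcp_server="crm",
    arguments={"customer_id": "c_123"},
    result="ok",
    latency_ms=42.0,
    virtual_key="vk_claude_code",
    parent_request_id="req_001",
)

# One JSON object per line is easy to ship to a log pipeline and to query later.
line = json.dumps(asdict(entry), sort_keys=True)
print(line)
```

Carrying the parent request identifier on every record is what lets an auditor trace a tool call back to the agent run that caused it.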

Reducing token cost with Code Mode

One of the less obvious costs of running Claude Code with many MCP servers is context bloat. Every tool from every connected server is injected into the model's context on every request. Fifteen servers with thirty tools each means 450 tool definitions are sent before the model even sees the prompt. Cloudflare's engineering team documented the same dynamic when exploring a TypeScript-based code-execution approach to MCP.
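
The arithmetic is worth spelling out. The server and tool counts below come from the example above; the per-definition token cost is an assumed round number, since real definitions vary with the size of each tool's schema and description.

```python
SERVERS = 15
TOOLS_PER_SERVER = 30

# Assumption: average tokens consumed by one tool definition.
TOKENS_PER_DEFINITION = 150

# Every definition is sent with every request in the classic setup.
total_definitions = SERVERS * TOOLS_PER_SERVER
overhead_tokens = total_definitions * TOKENS_PER_DEFINITION

print(f"{total_definitions} definitions, about {overhead_tokens} tokens per request")
```

Under that assumption, tool definitions alone cost tens of thousands of input tokens on every request, before any user prompt or code is included.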

Bifrost's Code Mode solves this by exposing MCP servers as a virtual filesystem of lightweight Python stub files. The model reads only the stubs it needs, writes a short script to orchestrate the tools, and Bifrost executes the script in a sandboxed Starlark interpreter. Anthropic's engineering team has reported context dropping from roughly 150,000 tokens to 2,000 on representative workflows when moving from classic MCP to code-execution-style orchestration.

In Bifrost's published Code Mode benchmarks, at 508 tools across 16 MCP servers, Code Mode reduced input tokens by 92.8% and cost by 92.2%, with 100% pass rate held across the test suite. Classic MCP loads every tool definition on every request, so connecting more servers makes the problem worse. Code Mode's cost is bounded by what the model actually reads, not by how many tools exist.
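
A toy model shows why the two approaches scale differently. It is not a reproduction of the benchmark: every token figure below is an assumed value, and classic_tokens and lazy_tokens are hypothetical helpers written for this illustration.

```python
# Assumed per-item token costs; real values depend on the tools involved.
TOKENS_PER_DEFINITION = 150  # full tool definition in classic MCP
TOKENS_PER_STUB = 120        # one stub file the model chooses to read
SCRIPT_TOKENS = 300          # the short orchestration script the model writes
STUBS_READ = 4               # stubs a typical task needs (assumption)


def classic_tokens(total_tools: int) -> int:
    """Classic MCP: every definition is loaded, so cost grows with the catalog."""
    return total_tools * TOKENS_PER_DEFINITION


def lazy_tokens(stubs_read: int) -> int:
    """Lazy loading: cost depends only on the stubs read plus the script."""
    return stubs_read * TOKENS_PER_STUB + SCRIPT_TOKENS


for total_tools in (50, 200, 508):
    classic = classic_tokens(total_tools)
    lazy = lazy_tokens(STUBS_READ)
    print(f"{total_tools:>4} tools: classic={classic:>6} lazy={lazy:>4}")
```

In this model the classic figure rises linearly with the number of tools while the lazy figure stays constant, which is the behavior the paragraph above describes.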

Best practices for production Claude Code deployments

A few patterns hold across most production rollouts:

  • One virtual key per user or environment. Do not share keys across developers or between production and staging.
  • Start with an allowlist, not a denylist. Grant only the tools a workflow actually needs.
  • Enable enforce_auth_on_inference. This ensures every MCP request requires a valid virtual key.
  • Deploy Bifrost behind HTTPS. Terminate TLS at a reverse proxy (nginx, Cloudflare, etc.) in front of the gateway.
  • Turn on Code Mode for large tool catalogs. Savings compound with catalog size; at hundreds of tools, Code Mode is the difference between a sustainable and an unsustainable bill.
  • Route all LLM traffic through the same gateway. When model calls and tool calls flow through one control plane, every agent run produces a complete picture: model tokens and tool costs together, under one access control model, in one audit log.

For configuration templates, model routing patterns, and observability integrations specific to Claude Code, see the Claude Code integration resource page.

Get started with Bifrost MCP Gateway

Connecting Claude Code to an MCP gateway is a small configuration change with an outsized operational payoff. One endpoint replaces N configurations, tool access lives behind a virtual key, every tool call lands in an audit log, and Code Mode keeps token costs flat as the tool catalog grows. Bifrost is open source under Apache 2.0 and runs in a single command. To see how Bifrost fits a specific team's Claude Code deployment at scale, including enterprise features like clustering, audit logs, guardrails, vault support, and in-VPC deployments, book a demo with the Bifrost team.