[ MCP GATEWAY ]

Turn AI Models into
Action-Capable Agents

Enable AI models to discover and execute external tools dynamically with the fastest open-source MCP gateway: 11µs of internal overhead and complete security control, with no automatic execution.

[ PERFORMANCE AT A GLANCE ]

11µs
Internal Overhead
Ultra-low latency at high throughput
50%+
Token Savings
With Code Mode vs classic MCP
40%
Faster Execution
Code Mode execution pipeline
20+
Providers
LLM providers supported

[ ARCHITECTURE ]

Complete MCP Gateway Solution

Bifrost acts as both an MCP client (connecting to external tool servers) and an MCP server (exposing tools to external clients like Claude Desktop) through a single deployment.

MCP Client

Bifrost connects to your external MCP servers (filesystem tools, web search, databases, custom APIs) and discovers their capabilities automatically.

  • Auto-discover tools from any MCP server
  • STDIO, HTTP, and SSE transports
  • OAuth 2.0 with automatic token refresh
  • Tool filtering and access control

MCP Server

Bifrost exposes all connected tools through a single gateway URL. MCP clients like Claude Desktop connect to Bifrost and access everything.

  • Single gateway URL for all tools
  • Claude Desktop, Cursor, and any MCP client
  • Unified tool discovery and execution
  • Centralized security and audit trails
Your Application (chat completions API) → Bifrost Gateway (MCP client + server) → MCP Servers (filesystem, DB, APIs)

[ CORE CAPABILITIES ]

How MCP Works in Bifrost

Connect, secure, filter, and execute tools with explicit approval workflows, autonomous agent mode, and Code Mode for high-efficiency orchestration.

Connect to MCP servers

Connect to any MCP-compliant server via STDIO, HTTP, or SSE. Bifrost auto-discovers tools and their schemas at runtime so your AI models can use them immediately.

STDIO + HTTP + SSE

OAuth Authentication

Secure OAuth 2.0 authentication with automatic token refresh

Explicit tool execution

Tool calls from LLMs are suggestions only. Execution requires a separate API call, giving your app full control to validate, filter, and approve every action before it runs.

Security-first

Agent Mode

Enable autonomous multi-step tool execution with configurable auto-approval. Specify exactly which tools can auto-execute while keeping human oversight for sensitive operations.

Configurable auto-approval

Code Mode

AI writes Python to orchestrate multiple tools. Four meta-tools replace 100+ definitions with on-demand schema loading and sandbox execution. Cuts tokens by 50%+ and LLM calls by 3-4x.

Token efficiency

MCP Gateway URL

A single endpoint for tool discovery, execution, and management.

Single gateway URL

[ HOW IT WORKS ]

Stateless Tool Calling with Explicit Approval

The default tool calling pattern is stateless with explicit execution. No unintended API calls, no accidental data modifications, full audit trail of every operation.

Step 01

Register MCP servers

Connect Bifrost to any MCP-compliant server. Bifrost auto-discovers available tools and their schemas at startup.

Terminal
# bifrost config
mcp_servers:
  - name: filesystem
    transport: stdio
    command: npx @modelcontextprotocol/server-filesystem
Step 02

Send a chat request

Your app sends a standard chat completion request. Bifrost injects discovered MCP tools into the request automatically.

Terminal
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "claude-sonnet", "messages": [...]}'
Step 03

Execute tool calls

When the LLM suggests a tool call, your app decides whether to execute it. Bifrost handles the MCP protocol and returns results.

Terminal
# tool call returned in response
# your app approves → Bifrost executes
# full audit trail logged automatically
No automatic execution: Tool calls from LLMs are suggestions; your app decides what runs.
Full audit trail: Every tool suggestion, approval, and execution is logged with metadata.
Stateless design: Each API call is independent; your app controls conversation state entirely.
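
To make the three steps concrete, here is a minimal Python sketch of the loop. The chat completions call matches the curl example above; the tool-execution endpoint path and the exact response field layout are assumptions for illustration, so check the Bifrost docs for the real names.

import requests

BIFROST = "http://localhost:8080"

# Step 02: standard chat completion; Bifrost injects the discovered MCP tools.
resp = requests.post(
    f"{BIFROST}/v1/chat/completions",
    json={
        "model": "claude-sonnet",
        "messages": [{"role": "user", "content": "List the files in /tmp"}],
    },
).json()

# Step 03: tool calls come back as suggestions only; nothing has executed yet.
APPROVED = {"list_directory"}  # your approval policy

for call in (resp["choices"][0]["message"].get("tool_calls") or []):
    if call["function"]["name"] in APPROVED:
        # Hypothetical endpoint name for executing an approved tool call.
        result = requests.post(f"{BIFROST}/v1/mcp/tool/execute", json=call)
        # Append result.json() to your messages and continue the conversation.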

[ CODE MODE ]

50% Fewer Tokens.
40% Faster Execution.

If you're using 3+ MCP servers, classic tool calling becomes expensive. Every request sends all tool schemas to the LLM, burning tokens on definitions instead of work.

Code Mode takes a different approach: instead of exposing 100+ tool definitions, the AI writes Python code to orchestrate tools in a sandboxed environment. One round-trip handles what would take multiple sequential tool calls.

AI generates Python to orchestrate multiple tools
Sandboxed execution with full error handling
One round-trip replaces sequential tool calls
Scales to any number of MCP servers
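
For a feel of what the model produces, here is an illustrative sketch of Code Mode output. The `mcp` helper and the tool names are hypothetical stand-ins for Bifrost's meta-tools, not its actual API; the point is that one sandboxed script replaces N sequential tool-call round-trips.

# Illustrative only: the kind of Python an LLM might emit in Code Mode.
# `mcp` stands in for the sandbox's meta-tool interface (hypothetical names).
def orchestrate(mcp):
    rows = mcp.call("postgres.query", sql="SELECT path FROM reports WHERE stale")
    for row in rows:
        try:
            content = mcp.call("filesystem.read_file", path=row["path"])
            mcp.call("search.index_document", path=row["path"], body=content)
        except Exception as err:
            # Error handling happens in the sandbox, not via LLM retries.
            mcp.call("filesystem.append_file", path="errors.log", body=str(err))
    return f"Re-indexed {len(rows)} stale reports"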
Classic MCP
  • Token usage: High
  • Round-trips per workflow: N tools = N calls
  • Scalability: Degrades at 3+ servers

Bifrost Code Mode
  • Token usage: 50%+ reduction
  • Round-trips per workflow: 1 round-trip
  • Scalability: Any number of servers

[ SECURITY-FIRST DESIGN ]

Enterprise-Grade Security Controls

By default, Bifrost does NOT automatically execute tool calls. All tool execution requires explicit API calls from your application, ensuring human oversight for every operation.

Explicit execution

Tool calls from LLMs are suggestions only. Execution requires a separate API call from your application.

Granular control

Filter tools per-request, per-client, or per-virtual-key. Blacklist dangerous tools globally.

Opt-in auto-execution

Agent Mode with auto-execution must be explicitly configured. Specify exactly which tools are allowed.

Stateless design

Each API call is independent. Your app controls conversation state with full audit trails at every step.
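
As one example of granular control, a request could constrain which tools Bifrost injects. The `allowed_tools` field below is a hypothetical name for per-request filtering, an assumption to verify against the Bifrost docs.

import requests

resp = requests.post(
    "http://localhost:8080/v1/chat/completions",
    json={
        "model": "claude-sonnet",
        "messages": [{"role": "user", "content": "Summarize today's error logs"}],
        # Hypothetical per-request filter: only read-only filesystem tools
        # are injected; everything else is withheld from the model.
        "allowed_tools": ["filesystem.read_file", "filesystem.list_directory"],
    },
)
print(resp.json()["choices"][0]["message"]["content"])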

[ COMPARISON ]

Classic MCP vs. Bifrost Code Mode

Standard MCP tool calling works, but it doesn't scale. Code Mode solves the hard problems.

Dimension                | Classic MCP                          | Bifrost Code Mode
Tool definition overhead | 100+ tool schemas sent every request | AI writes code to call tools
Token usage              | High (all tool schemas in context)   | 50%+ reduction
Execution latency        | Multiple round-trips per tool        | 40% faster execution
Multi-tool orchestration | Sequential tool calls only           | Python orchestrates in one pass
Scalability with servers | Degrades with 3+ servers             | Scales to any number
Error handling           | LLM retries each tool call           | Python try/catch in sandbox

[ TRANSPORT PROTOCOLS ]

STDIO, HTTP, and SSE Support

STDIO

Local process execution via stdin/stdout.

Local tools
  • Filesystem operations
  • Code search
  • Dev scripts

HTTP

Remote MCP servers via HTTP requests.

Microservices
  • Database tools
  • Internal APIs
  • Authentication

SSE

Persistent streaming for real-time data.

Live data
  • Monitoring
  • Live dashboards
  • Streaming

[ USE CASES ]

What You Can Build

Agentic coding pipelines

Connect AI coding agents to filesystem tools, databases, and deployment pipelines. Bifrost handles tool injection transparently with full audit trails for every operation.

Regulated enterprise environments

Deploy in healthcare, finance, or government with explicit approval workflows, PII redaction, and tamper-evident audit logs for SOC 2 and HIPAA compliance.

Multi-tool orchestration

Coordinate filesystem operations, database queries, and API calls in a single request using Code Mode. Reduce token waste and latency when using 3+ MCP servers.

DevOps & infrastructure automation

Supervised infrastructure actions and deployments with role-based tool access. Only approved tools execute, with complete visibility into every automated step.

Centralized tool governance

Manage tool access across teams with virtual keys and per-key tool filtering. Set different tool policies for development, staging, and production environments.

Claude Desktop & MCP clients

Expose your entire tool ecosystem through a single Bifrost gateway URL. Claude Desktop and other MCP clients connect once and discover all available tools automatically.

[ WHY BIFROST ]

The Fastest Open-Source MCP Gateway

11µs overhead at 5,000 requests per second

Stateless architecture with explicit approval

Code Mode: 50% fewer tokens, 40% faster execution

Dual role: MCP Client and MCP Server

Built-in OAuth 2.0 with automatic token refresh

Production-proven at millions of requests/day

Complete audit trails and OpenTelemetry export

Open source (Apache 2.0) with enterprise support

Go-native with zero Python GIL bottleneck

Build production AI agents with Bifrost

Get enterprise-grade MCP gateway performance with explicit security controls, Code Mode for token efficiency, and a single gateway URL for your entire tool ecosystem.

[ BIFROST FEATURES ]

Open Source & Enterprise

Everything you need to run AI in production, from free open source to enterprise-grade features.

01 Model Catalog

Access 8+ providers and 1,000+ AI models through a unified interface. Custom-deployed models are supported too!

02 Budgeting

Set spending limits and track costs across teams, projects, and models.

03 Provider Fallback

Automatic failover between providers ensures 99.99% uptime for your applications.

04 MCP Gateway

Centralize all MCP tool connections, governance, security, and auth. Your AI can safely use MCP tools with centralized policy enforcement. Bye bye chaos!

05 Virtual Key Management

Create different virtual keys for different use-cases with independent budgets and access control.

06 Unified Interface

One consistent API for all providers. Switch models without changing code.

07 Drop-in Replacement

Replace your existing SDK with just one line change. Compatible with OpenAI, Anthropic, LiteLLM, Google GenAI, LangChain, and more.

08 Built-in Observability

Out-of-the-box OpenTelemetry support for observability. Built-in dashboard for quick glances without any complex setup.

09 Community Support

Active Discord community with responsive support and regular updates.

[quick setup]

Drop-in replacement for any AI SDK

Change just one line of code. Works with OpenAI, Anthropic, Vercel AI SDK, LangChain, and more.

import os
from anthropic import Anthropic

anthropic = Anthropic(
    api_key=os.environ.get("ANTHROPIC_API_KEY"),
    base_url="https://<bifrost_url>/anthropic",
)

message = anthropic.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello, Claude"}
    ]
)
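
The same one-line change works with other SDKs. For example, with the OpenAI SDK; the /openai route here is assumed to mirror the /anthropic route above, so verify it against your deployment.

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ.get("OPENAI_API_KEY"),
    base_url="https://<bifrost_url>/openai",  # assumed route, mirroring /anthropic
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}],
)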
Drop in once, run everywhere.