Why MCP Needs a Governance Layer: Access Control, Audit, and Cost
An MCP governance layer gives enterprise AI teams tool-level access control, audit trails, and cost visibility across every connected Model Context Protocol server.
Model Context Protocol (MCP) adoption has moved faster than the infrastructure needed to govern it. Teams connect an MCP server for file access, another for search, a third for internal APIs, then ten more, and within weeks an AI agent has a reach that no engineer would be handed on day one. An MCP governance layer is the missing control plane that decides which tools each agent can call, who is calling them, what each call returns, and what it costs. Bifrost, the open-source AI gateway by Maxim AI, provides this layer through a single MCP gateway that sits between your models and every connected MCP server.
The stakes are not hypothetical. The 2025 OWASP Top 10 for LLM Applications lists Excessive Agency as a top production risk, citing excessive functionality, excessive permissions, and excessive autonomy as the three root causes. MCP amplifies each of them.
What Is an MCP Governance Layer?
An MCP governance layer is the infrastructure between AI agents and the MCP servers they use. It enforces per-tool access control, records every tool invocation as an auditable event, and tracks both token and tool-level cost across all connected servers. It replaces scattered, per-server security with a single policy and observability plane, so teams can scale from one MCP server to dozens without losing control.
Why Ungoverned MCP Deployments Break at Scale
As soon as MCP moves from a local developer setup to a shared production environment, three structural problems surface.
- Excessive agency. Agents are handed more tools than they need, often with broader permissions than the task requires. This is the exact failure mode OWASP catalogs under LLM06.
- Tool poisoning and indirect prompt injection. Malicious or compromised MCP servers can embed hidden instructions inside tool descriptions, which the model reads and treats as authoritative. Microsoft's developer team has documented how tool poisoning works in MCP and why client-side validation alone is insufficient.
- Unbounded token consumption. Every MCP tool definition from every connected server gets injected into the model's context on every single request. A 2025 Anthropic engineering post on code execution with MCP shows one Google Drive to Salesforce workflow dropping from 150,000 tokens to 2,000 tokens once tool definitions stopped being loaded on every turn.
Without a governance layer, none of these problems have a central place to be solved. Each MCP server becomes its own island of access policy, logging, and cost.
Access Control: Scoping What Agents Can Actually Do
Access control for MCP has to operate at the tool level, not the server level. A single MCP server can expose filesystem_read alongside filesystem_write, or crm_lookup_customer alongside crm_delete_customer. Server-level allowlists treat these as a single unit, which defeats least privilege from the start.
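The granularity gap is easy to see in code. The sketch below contrasts the two authorization models; the tool inventory and helper names (`SERVER_TOOLS`, `is_allowed_server_level`, `is_allowed_tool_level`) are invented for illustration, not Bifrost's API.

```python
# Hypothetical illustration: why server-level allowlists defeat least privilege.
SERVER_TOOLS = {
    "filesystem": ["filesystem_read", "filesystem_write"],
    "crm": ["crm_lookup_customer", "crm_delete_customer"],
}

def is_allowed_server_level(tool: str, allowed_servers: set[str]) -> bool:
    """Server-level allowlist: granting any tool on a server grants them all."""
    return any(tool in SERVER_TOOLS[s] for s in allowed_servers)

def is_allowed_tool_level(tool: str, allowed_tools: set[str]) -> bool:
    """Tool-level allowlist: each tool is granted individually."""
    return tool in allowed_tools

# A read-only agent under each model:
assert is_allowed_server_level("filesystem_write", {"filesystem"})  # over-granted
assert not is_allowed_tool_level("filesystem_write", {"filesystem_read"})
```

Granting the `filesystem` server to a read-only agent silently grants `filesystem_write` as well; only the tool-level check expresses the actual intent.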
Bifrost handles this through virtual keys and MCP Tool Groups.
- Virtual keys are scoped credentials issued to each consumer of the gateway: a user, a team, an internal application, or a customer integration. Each key carries an explicit list of the MCP tools it is allowed to call. The model attached to a key never sees definitions for tools outside that scope, so there is no prompt-level workaround.
- MCP Tool Groups are named collections of tools that can be attached to any combination of keys, teams, customers, or providers. At request time Bifrost resolves the right set in memory, with no database queries, and merges overlapping groups deterministically.
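The merge semantics described above can be sketched in a few lines. This is a simplified stand-in, not Bifrost's internals: the group names and `resolve_tools` helper are hypothetical, but it shows why a sorted set union gives the deterministic result the text describes.

```python
# In-memory tool-group resolution sketch: overlapping groups merge into one
# deterministic tool list, independent of attachment order.
TOOL_GROUPS = {
    "read-only-files": {"filesystem_read", "filesystem_list"},
    "crm-support": {"crm_lookup_customer", "crm_update_ticket"},
}

def resolve_tools(attached_groups: list[str]) -> list[str]:
    """Union all attached groups, then sort for a stable, order-independent scope."""
    merged: set[str] = set()
    for group in attached_groups:
        merged |= TOOL_GROUPS.get(group, set())
    return sorted(merged)

# Same scope regardless of the order groups were attached in:
assert resolve_tools(["crm-support", "read-only-files"]) == resolve_tools(
    ["read-only-files", "crm-support"]
)
```

Because the resolved list is what gets injected as tool definitions, determinism here also means reproducible prompts and reproducible audit trails.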
This pattern aligns with where the broader MCP ecosystem is heading. The MCP specification itself has evolved to require OAuth 2.1 with PKCE, and identity providers are now beginning to treat MCP as a first-class authorization surface. A governance layer is where those standards are enforced consistently, regardless of which upstream MCP servers support them natively.
Audit Logging: Making Agent Actions Traceable
The moment an AI agent can call production tools, every invocation has to be a first-class audit event, not a side effect of request logging.
Bifrost captures each MCP tool execution with:
- Tool name and the MCP server it came from
- Arguments passed in and the result returned
- Latency of the tool call
- The virtual key that triggered the request
- The parent LLM request that initiated the agent loop
Teams can pull up any agent run and trace the exact sequence of tool calls, or filter by virtual key to audit what a specific team or customer has been running. Content logging can be disabled per environment when arguments or results carry sensitive data, while metadata (tool name, server, latency, status) is always captured.
This matters beyond debugging. Immutable audit trails are required for SOC 2, HIPAA, GDPR, and ISO 27001 programs, and auditors increasingly expect them to cover AI tool invocations, not just API calls. Bifrost's enterprise audit logs are built for that expectation, with per-environment retention and export to downstream SIEM and data lake tooling.
Cost Control: Token Bloat and Tool Call Economics
MCP cost has two components that a governance layer has to address together: the token cost of loading tool definitions, and the real-dollar cost of the tools themselves.
The token bloat problem
The default MCP execution model injects every tool definition from every connected server into the model's context on every single request. Five servers with thirty tools each means 150 tool definitions shipped before the user's prompt is even read. Industry research has converged on a fix: agents write code against the tool catalog instead of receiving the full catalog on every turn. Both Anthropic's engineering team and Cloudflare have published detailed analyses of the approach.
Bifrost implements this natively through Code Mode. Instead of dumping every tool definition into context, Code Mode exposes MCP servers as a virtual filesystem of lightweight Python stubs. The model reads only what it needs through four meta-tools (listToolFiles, readToolFile, getToolDocs, executeToolCode), and Bifrost executes the resulting script in a sandboxed Starlark interpreter. Bifrost's controlled MCP benchmarks show input token usage dropping 92.8% at 508 tools across 16 servers, with pass rate held at 100%.
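The core idea can be shown with a toy version of the first two meta-tools. The virtual filesystem and functions below are simplified stand-ins for illustration, not Bifrost internals, and the stub contents are made up.

```python
# Toy Code Mode sketch: the model discovers tool stubs on demand instead of
# receiving every definition up front on every turn.
TOOL_FILES = {
    "gdrive/get_document.py": "def get_document(doc_id: str) -> str: ...",
    "salesforce/update_record.py": "def update_record(rid: str, data: dict): ...",
    # ...hundreds more stubs that never enter the context unless read
}

def list_tool_files() -> list[str]:
    """Meta-tool: enumerate stub paths (cheap, names only)."""
    return sorted(TOOL_FILES)

def read_tool_file(path: str) -> str:
    """Meta-tool: load one stub's definition only when the model asks for it."""
    return TOOL_FILES[path]

# The model pays context cost only for the stubs it actually reads:
loaded = [read_tool_file(p) for p in list_tool_files() if p.startswith("gdrive/")]
assert len(loaded) == 1
```

With the full catalog replaced by cheap path listings, context cost scales with the tools a run actually uses rather than with the number of connected servers.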
The tool call economics problem
Not every MCP tool is free. Search APIs, enrichment vendors, code execution services, and paid data providers each carry a per-call price. Bifrost tracks cost at the tool level using a pricing configuration defined per MCP client, and surfaces those costs alongside LLM token costs in the same log view. Teams see the complete cost of an agent run, not just the model portion.
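As a rough sketch of that rollup, the function below combines token spend with per-call tool pricing; the pricing table, rates, and helper name are made-up numbers for illustration only.

```python
# Illustrative per-run cost rollup: LLM token cost plus per-call tool cost.
TOOL_PRICE_USD = {"web_search": 0.005, "enrich_company": 0.02}

def run_cost(input_tokens: int, output_tokens: int,
             tool_calls: list[str],
             in_rate: float = 3e-6, out_rate: float = 15e-6) -> float:
    """Total agent-run cost in USD: token spend plus tool-call spend."""
    token_cost = input_tokens * in_rate + output_tokens * out_rate
    tool_cost = sum(TOOL_PRICE_USD.get(t, 0.0) for t in tool_calls)
    return token_cost + tool_cost

# A run with 20k input tokens, 1.5k output tokens, and three paid tool calls:
cost = run_cost(20_000, 1_500, ["web_search", "web_search", "enrich_company"])
assert round(cost, 4) == round(20_000 * 3e-6 + 1_500 * 15e-6 + 0.03, 4)
```

Reporting only `token_cost` would miss a meaningful share of this run's spend, which is why the text argues both layers belong in the same log view.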
What an MCP Governance Layer Looks Like in Practice
A governance layer that scales has five non-negotiable properties:
- A single endpoint. All connected MCP servers sit behind one /mcp URL that agents connect to. New servers appear without client-side reconfiguration.
- Per-tool authorization. Access is scoped at the tool level, enforced by the gateway, and invisible to anything outside scope.
- Unified audit. Every LLM call and every tool call lands in one log model, correlated by request ID.
- Cost visibility across both layers. Token spend and tool spend are reported together, broken down by virtual key, team, and provider.
- Standards-based authentication. OAuth 2.1 with PKCE, identity-provider integration, and automatic token refresh, rather than static bearer tokens shared across services.
Bifrost centralizes all five inside its governance stack, which covers MCP traffic, model traffic, and the identity and budget primitives that unify them.
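From the client side, the first two properties reduce to something very small: one URL and one scoped credential. The gateway URL, header shape, and helper below are hypothetical, shown only to make the shape of the integration concrete.

```python
# Client-side view of "single endpoint + scoped credential" (hypothetical names).
GATEWAY_URL = "https://gateway.internal.example/mcp"  # one URL for all servers

def mcp_request_headers(virtual_key: str) -> dict:
    """The virtual key carries the tool scope; the agent never needs to know
    which upstream MCP servers sit behind the gateway."""
    return {
        "Authorization": f"Bearer {virtual_key}",
        "Content-Type": "application/json",
    }

headers = mcp_request_headers("vk_support_team")
assert headers["Authorization"].startswith("Bearer vk_")
```

Adding an eleventh MCP server changes nothing in this client code; only the gateway's configuration, and the tool scopes attached to each key, change.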
Getting Started with Bifrost MCP Gateway
MCP is now the default interface between AI agents and enterprise systems, and ungoverned deployments do not stay ungoverned quietly. Access drift, audit gaps, and runaway token bills all compound as the number of connected servers grows. An MCP governance layer is how teams move from early experimentation to production AI infrastructure without giving up control of what their agents can do, what those actions cost, or how those actions are recorded. To see how Bifrost's MCP governance layer fits with your existing agents and servers, book a demo with the Bifrost team.
