AI Governance

Governing Claude Code and Cursor at Enterprise Scale

Model allowlists, per-developer budgets, and tool restrictions for AI coding agents are infrastructure problems. Bifrost enforces all three at the gateway layer across every Claude Code and Cursor session.

Anthropic's own enterprise deployment data puts the average Claude Code cost at $13 per developer per active day, with $150 to $250 per developer per month at scale. For a 200-engineer organization, that translates to $30,000 to $50,000 in monthly spend on Claude Code alone, before accounting for Cursor, Codex CLI, Gemini CLI, or any other AI-assisted coding tool the team has adopted. The variance is the harder problem: the same data shows 10% of users regularly exceeding $30 per active day, and teams running autonomous agent workflows see multipliers on top of that. Without a centralized enforcement layer, there is no practical way to cap a developer's daily spend, restrict which models they can call, or audit which tools their sessions invoked. Bifrost, the open-source AI gateway built in Go by Maxim AI, is the infrastructure layer that closes this gap: every Claude Code and Cursor request routes through Bifrost, where model allowlists, per-key budgets, rate limits, and tool restrictions are applied before any token reaches a provider.

The Governance Problem with AI Coding Agents

Claude Code and Cursor both authenticate against provider APIs using credentials stored in local environment variables or configuration files. In an ungoverned deployment, each developer holds a raw Anthropic or OpenAI API key with no ceiling on spend, no restriction on which models they can request, no visibility into which MCP tools their sessions invoke, and no centralized record of what happened.

Multiply this across a 50-developer team and the governance surface has 50 separate credential-holding endpoints to monitor, each with its own session history that accumulates nowhere visible to the platform team. A single developer running an intensive autonomous refactor session over a weekend can generate more than $200 in charges before anyone is aware of it. Gartner forecasts that 40% of enterprise applications will embed task-specific AI agents by end of 2026; agentic coding tools are the earliest and most pervasive example of this transition. Model selection is uncontrolled: a developer on a free-tier approved model can override to a premium model in one environment variable change. MCP tool access is unscoped: Claude Code can reach any configured server with no per-developer restriction unless the gateway enforces it.

The standard remediation attempts, acceptable-use policies and manual key rotation, do not work at scale for the same reason that any policy-layer control fails against infrastructure-layer behavior: the policy has to be communicated, remembered, and voluntarily followed by every person in the organization, every session. The Bifrost governance model addresses this by moving enforcement out of policy documents and into the request path.

How Bifrost Intercepts Coding Agent Traffic

Both Claude Code and Cursor authenticate using standard API key headers. Claude Code sends its credential in the Authorization: Bearer header via ANTHROPIC_AUTH_TOKEN. Cursor accepts an API key and a base URL override in its provider configuration. Neither requires any modification to its agent logic to route through Bifrost: the only change is the endpoint and the credential.

For Claude Code, the developer sets:

"env": {
  "ANTHROPIC_BASE_URL": "<http://bifrost.internal:8080/anthropic>",
  "ANTHROPIC_AUTH_TOKEN": "bf-vk-developer-alice-001"
}

The ANTHROPIC_AUTH_TOKEN value is a Bifrost virtual key, not an Anthropic API key. Claude Code sends it in the Authorization: Bearer header. Bifrost intercepts it, resolves the virtual key to the correct provider, applies every configured governance rule, and forwards the request. No Anthropic credential is held by the developer.

For Cursor, the configuration is equivalent: set the API key field to the Bifrost virtual key and the base URL to the Bifrost endpoint. Cursor uses the OpenAI-compatible API surface that Bifrost exposes uniformly across all providers.

Once all developer tooling routes through Bifrost, the governance model is centrally enforced. Developer credentials become virtual keys that the platform team issues, scopes, and revokes. Provider API keys stay in the gateway and never reach developer workstations.

Model Allowlists: Restricting Which Models Coding Agents Can Call

The most common governance requirement for coding agent deployments is restricting model access. Organizations approve specific models for specific roles: junior engineers are provisioned with a cost-efficient model, senior engineers get access to a more capable model, and only approved accounts can call the highest-tier reasoning models.

Bifrost implements this through allowed_models on each provider configuration within a virtual key. When a developer's Claude Code session requests a model not in the allowlist, the gateway returns a 403 before the request reaches the provider.

{
  "virtual_keys": [{
    "id": "vk-eng-junior",
    "name": "Junior Engineering Pool",
    "provider_configs": [{
      "provider": "anthropic",
      "weight": 1.0,
      "allowed_models": ["claude-haiku-4-5-20251001"]
    }]
  }, {
    "id": "vk-eng-senior",
    "name": "Senior Engineering Pool",
    "provider_configs": [{
      "provider": "anthropic",
      "weight": 0.6,
      "allowed_models": ["claude-sonnet-4-6", "claude-haiku-4-5-20251001"]
    }, {
      "provider": "bedrock",
      "weight": 0.4,
      "allowed_models": ["bedrock/global.anthropic.claude-sonnet-4-6"]
    }]
  }]
}

Claude Code uses two internal model slots, ANTHROPIC_DEFAULT_SONNET_MODEL and ANTHROPIC_DEFAULT_HAIKU_MODEL, which can be pinned to any provider-model combination in the developer's settings.json. Bifrost's routing rules support dynamic aliasing: platform teams define aliases like sonnet-model or haiku-model at the gateway level and route them to different actual models by scope, header, or any other request attribute. This means developers can have standardized model names in their configuration while the platform team controls what those names resolve to, per role, per environment, or per policy period.

For Bifrost Enterprise, Access Profiles codify this into reusable, declaratively-seeded policies covering provider restrictions, model allowlists, budgets, rate limits, and MCP tool controls in a single profile object that auto-allocates virtual keys at scale.

Per-Developer Budgets: Hard Caps on Daily and Monthly Spend

Virtual key budgets in Bifrost attach a spend ceiling to each credential. When a developer's virtual key exhausts its budget, subsequent requests return an HTTP 402 response. The session stops generating charges. No developer can exceed their cap without the platform team modifying the budget, and that modification is a logged configuration event.

Budget configuration attaches to virtual keys with configurable reset windows:

{
  "budgets": [{
    "id": "budget-alice",
    "virtual_key_id": "vk-alice",
    "max_limit": 30.00,
    "reset_duration": "1d"
  }, {
    "id": "budget-alice-monthly",
    "virtual_key_id": "vk-alice",
    "max_limit": 250.00,
    "reset_duration": "1M",
    "calendar_aligned": true
  }]
}

Both budgets are enforced simultaneously. Exhausting either the daily or the monthly cap blocks the virtual key for that window. Calendar-aligned monthly budgets reset at the start of each calendar month in UTC, so billing periods correspond directly to the organization's billing cycle.

The hierarchical budget system adds team-level and customer-level caps above the virtual key layer. A team of ten developers, each with a $250/month individual cap, can carry an additional team-level cap of $1,500/month. If five developers are heavy users in the same month, the team cap limits aggregate spend even if individual caps have not yet been reached.

Rate limits run alongside budgets and operate on token throughput and request frequency rather than dollar spend. A developer virtual key can carry a token_max_limit of 2,000,000 per hour alongside a request_max_limit of 100 per minute. This prevents burst-mode autonomous agent sessions from consuming the full monthly budget in a single afternoon without generating a spend alert.

MCP Tool Restrictions: Per-Developer Tool Access Control

Claude Code connects to MCP servers to execute file operations, run bash commands, access internal APIs, and call any other tools defined in its MCP configuration. In a direct-connection deployment, every configured MCP server is equally accessible to every developer session. There is no per-developer tool scope.

Routing through Bifrost applies MCP tool filtering per virtual key. The default behavior is deny: a virtual key with no MCP configuration cannot invoke any MCP tool. Platform teams attach explicit tool allowlists to each key, enforcing least privilege at the tool level.

Instead of each developer's Claude Code configuration pointing at N individual MCP servers, it points at a single Bifrost MCP endpoint. Bifrost exposes only the tools the virtual key permits. The developer's session discovers only the tools it is authorized to call, with no visibility into the broader tool registry. This also centralizes MCP server credential management: tool server credentials live in Bifrost, not in each developer's local configuration.

For teams using MCP Tool Groups in Bifrost Enterprise, platform teams compose named tool bundles from the registry and attach them wholesale to roles or virtual keys. A "security-engineer" tool group might include code scanning, dependency audit, and secret detection tools. A "frontend-developer" tool group might include design system, component library, and CSS validation tools. Assigning a tool group to a virtual key is a single configuration operation that takes effect on the next request.

Audit Trails: Who Called What with Which Tools

Once Claude Code and Cursor traffic routes through Bifrost, every session generates a complete, tamper-evident record. Each request log includes the virtual key identity, the provider and model called, token counts at input and output, latency, MCP tools invoked (if any), budget state at request time, and the policy decisions made.

Bifrost's audit logs are cryptographically signed using HMAC to prevent post-hoc modification. They capture security events, configuration changes, and data access events, including prompt injection attempts and guardrail violations. For engineering organizations subject to SOC 2, GDPR, or HIPAA, the audit trail answers the questions those audits require: which credential made which model call, what data was included in the prompt, and what content policy actions occurred.

Logs export to Splunk, Datadog, Elastic, and webhook endpoints. The Datadog connector surfaces per-developer spend, rate limit utilization, and model usage in LLM Observability dashboards alongside the organization's existing engineering metrics.

Enterprise Identity Integration: SSO and RBAC for Developer Tooling

At scale, issuing and managing individual virtual keys manually does not work. Bifrost Enterprise integrates with Okta and Microsoft Entra via OpenID Connect, so virtual key provisioning derives from existing IdP groups. When a developer joins the engineering organization in Okta, they receive a gateway credential with the permissions defined for their role group. When they leave, deprovisioning in the IdP propagates to the gateway automatically.

Role-based access control in Bifrost Enterprise enforces who can create or modify virtual keys, adjust budgets, view audit logs, or change routing rules. A security team can be granted read-only access to audit logs without the ability to modify any governance configuration. A team lead can adjust their team's budget cap without touching any other team's configuration.

The Bifrost governance resource page maps each governance control to specific compliance frameworks, including NIST AI Risk Management Framework alignment, for teams that need to demonstrate control coverage during audits.

Deployment: Making Bifrost the Only Path to Providers

Governance only works if every coding agent session routes through the gateway. The deployment sequence is:

Deploy Bifrost within the organization's network (self-hosted, in-VPC, or on-premises). All provider credentials are held by Bifrost.
Issue virtual keys scoped to the appropriate role and team for each developer or developer group.
Distribute Bifrost endpoint and virtual key configuration to developer settings.json files or via your preferred secrets distribution mechanism.
Rotate and retire any raw provider keys held by individual developers or on developer workstations.

From this point, a developer without a Bifrost virtual key has no path to any LLM provider. A developer whose virtual key has a model allowlist cannot call a model outside it. A developer whose daily budget is exhausted gets a 402 response until the window resets. None of this requires ongoing enforcement effort from the platform team.

The full CLI agent integration guide for Claude Code is at docs.getbifrost.ai/cli-agents/claude-code, and the CLI agents overview covers Cursor, Codex CLI, Gemini CLI, and the other coding agents Bifrost supports natively.

Getting Started

Bifrost is available as an open-source Docker image deployable in minutes. The open-source tier covers virtual keys, model allowlists, per-developer budgets, rate limits, and MCP tool filtering. Bifrost Enterprise adds Access Profiles, SSO integration, RBAC, immutable audit logs, clustering, and in-VPC deployment.

To walk through a governance configuration for Claude Code and Cursor deployments across your engineering organization, book a demo with the Bifrost team.

Governing Claude Code and Cursor at Enterprise Scale

The Governance Problem with AI Coding Agents

How Bifrost Intercepts Coding Agent Traffic

Model Allowlists: Restricting Which Models Coding Agents Can Call

Per-Developer Budgets: Hard Caps on Daily and Monthly Spend

MCP Tool Restrictions: Per-Developer Tool Access Control

Audit Trails: Who Called What with Which Tools

Enterprise Identity Integration: SSO and RBAC for Developer Tooling

Deployment: Making Bifrost the Only Path to Providers

Getting Started

Read next

Budget and Rate Limit Architecture for Multi-Tenant LLM Platforms

What Is LLM Governance? A Framework for Platform Engineers in 2026

PII Redaction at the Gateway Layer for Regulated Industries

Ship your AI agents 5x faster ⚡️