Overview

Bifrost’s budget management system provides comprehensive cost control and financial governance for enterprise AI deployments. It operates through a hierarchical budget structure that enables granular cost management, usage tracking, and financial oversight across your entire organization.
Core Hierarchy:
Customer (has independent budget)
    ↓ (one-to-many)
Team (has independent budget) 
    ↓ (one-to-many)
Virtual Key (has independent budget + rate limits)

OR

Customer (has independent budget)
    ↓ (direct attachment)
Virtual Key (has independent budget + rate limits)

OR

Virtual Key (standalone - has independent budget + rate limits)
Key Capabilities:
  • Virtual Keys - Primary access control via x-bf-vk header (exclusive team OR customer attachment)
  • Budget Management - Independent budget limits at each hierarchy level with cumulative checking
  • Rate Limiting - Request and token-based throttling (VK-level only)
  • Model/Provider Filtering - Granular access control per virtual key
  • Usage Tracking - Real-time monitoring and audit trails
  • Audit Headers - Optional team and customer identification
Budgeting Modes:
  • Mandatory (enforce_governance_header: true) - All requests require x-bf-vk header
  • Optional (enforce_governance_header: false) - Governance applied only when x-bf-vk header present
For detailed implementation architecture, see Architecture > Plugins > Governance.
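The two budgeting modes can be read as a small decision rule. The sketch below is illustrative only: the function name and the three outcome labels are hypothetical, and only the `enforce_governance_header` setting and the `x-bf-vk` header come from the text above.

```python
from typing import Dict


def governance_decision(headers: Dict[str, str], enforce_governance_header: bool) -> str:
    """Return 'govern', 'reject', or 'passthrough' for an incoming request.

    The outcome labels are hypothetical names used for illustration.
    """
    # HTTP header names are case-insensitive, so normalise before lookup.
    normalised = {name.lower(): value for name, value in headers.items()}
    if normalised.get("x-bf-vk"):
        # A virtual key is present: governance applies in both modes.
        return "govern"
    # No virtual key: mandatory mode rejects, optional mode skips governance.
    return "reject" if enforce_governance_header else "passthrough"
```

In mandatory mode a request without the header is rejected; in optional mode the same request simply bypasses governance.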

Virtual Keys

Virtual Keys are the primary governance entity in Bifrost. Users and applications authenticate using the x-bf-vk header, which maps to specific access permissions, budgets, and rate limits.
Key Features:
  • Access Control - Model and provider filtering
  • Cost Management - Independent budgets (checked along with team/customer budgets if attached)
  • Rate Limiting - Token and request-based throttling (VK-level only)
  • Key Restrictions - Limit VK to specific provider API keys (if configured, VK can only use those keys)
  • Exclusive Attachment - Belongs to either one team OR one customer OR neither (mutually exclusive)
  • Active/Inactive Status - Enable/disable access instantly
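The exclusive-attachment rule can be modelled as a simple invariant. This is a minimal sketch, not Bifrost's data model: the class and field names (`VirtualKey`, `team_id`, `customer_id`, `is_active`) are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VirtualKey:
    # All field names here are hypothetical.
    name: str
    team_id: Optional[str] = None
    customer_id: Optional[str] = None
    is_active: bool = True

    def __post_init__(self) -> None:
        # A VK belongs to one team OR one customer OR neither, never both.
        if self.team_id is not None and self.customer_id is not None:
            raise ValueError("virtual key cannot attach to both a team and a customer")

    def attachment(self) -> str:
        """Describe which of the three hierarchy shapes this key uses."""
        if self.team_id is not None:
            return "team"
        if self.customer_id is not None:
            return "customer"
        return "standalone"
```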

Configuration

  1. Navigate to Virtual Keys
    • Open Bifrost UI at http://localhost:8080
    • Go to Governance > Virtual Keys
  2. Create Virtual Key
Required Fields:
  • Name: Descriptive identifier
  • Description: Optional usage details
Access Control:
  • Allowed Models: Specific models (empty = all allowed)
  • Allowed Providers: Specific providers (empty = all allowed)
Budget Settings:
  • Max Limit: Dollar amount (e.g., 10.50)
  • Reset Duration: 1m, 1h, 1d, 1w, 1M
Rate Limits:
  • Token Limit: Max tokens per period
  • Request Limit: Max requests per period
  • Reset Duration: Reset frequency for each limit
Associations:
  • Team: Assign to existing team (mutually exclusive with customer)
  • Customer: Assign to existing customer (mutually exclusive with team)
  • Provider Keys: Restrict VK to specific API keys (optional - leave empty for no restrictions)
  3. Save Configuration
    • Click Create Virtual Key
    • Note the generated VK value for client use
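The "empty = all allowed" rule for Allowed Models and Allowed Providers is easy to get backwards, so here is a small sketch of it. The function names and the order of the two checks are assumptions; the returned strings reuse the `model_blocked` and `provider_blocked` error types listed under Error Responses.

```python
from typing import List, Optional


def is_allowed(value: str, allow_list: List[str]) -> bool:
    # An empty allow-list means no filtering: everything is allowed.
    return not allow_list or value in allow_list


def check_access(model: str, provider: str,
                 allowed_models: List[str],
                 allowed_providers: List[str]) -> Optional[str]:
    """Return an error type if the request is blocked, else None."""
    if not is_allowed(model, allowed_models):
        return "model_blocked"
    if not is_allowed(provider, allowed_providers):
        return "provider_blocked"
    return None
```

For example, a key with `allowed_models=["gpt-4o-mini"]` and an empty provider list accepts `gpt-4o-mini` from any provider and blocks `gpt-4o`.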

Key Restrictions

Virtual Keys can be restricted to use only specific provider API keys. When key restrictions are configured, the VK can only access those designated keys, providing fine-grained control over which API keys different users or applications can utilize.
How It Works:
  • No Restrictions (default): VK can use any available provider keys based on load balancing
  • With Restrictions: VK limited to only the specified key IDs, regardless of other available keys
Example Scenario:
Available Provider Keys:
├── key-prod-001 → sk-prod-key... (Production OpenAI key)
├── key-dev-002  → sk-dev-key...  (Development OpenAI key)  
└── key-test-003 → sk-test-key... (Testing OpenAI key)

Virtual Key Restrictions:
├── vk-prod-main
│   ├── Allowed Models: [gpt-4o]
│   └── Restricted Keys: [key-prod-001] ← ONLY production key
├── vk-dev-main  
│   ├── Allowed Models: [gpt-4o-mini]
│   └── Restricted Keys: [key-dev-002, key-test-003] ← Dev + test keys
└── vk-unrestricted
    ├── Allowed Models: [all models]
    └── Restricted Keys: [] ← Can use ANY available key
Request Behavior:
# Production VK - will ONLY use key-prod-001
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "x-bf-vk: vk-prod-main" \
  -d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "Hello!"}]}'

# Development VK - will load balance between key-dev-002 and key-test-003
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "x-bf-vk: vk-dev-main" \
  -d '{"model": "gpt-4o-mini", "messages": [{"role": "user", "content": "Hello!"}]}'

# VK with no key restrictions - can use any available OpenAI key
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "x-bf-vk: vk-unrestricted" \
  -d '{"model": "gpt-4o-mini", "messages": [{"role": "user", "content": "Hello!"}]}'
Use Cases:
  • Environment Separation - Production VKs use production keys, dev VKs use dev keys
  • Cost Control - Different teams use keys with different billing accounts
  • Access Control - Restrict sensitive keys to specific VKs only
  • Compliance - Ensure certain workloads only use compliant/audited keys
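The restriction rule from the example scenario can be sketched as a filter over the available key IDs. The function name is hypothetical, and how Bifrost then load balances across the eligible keys is not covered here.

```python
from typing import List


def eligible_keys(available: List[str], restricted: List[str]) -> List[str]:
    """Return the provider key IDs a virtual key may use."""
    if not restricted:
        # No restrictions configured: any available key is eligible.
        return list(available)
    # With restrictions: only the listed key IDs, regardless of other keys.
    return [key_id for key_id in available if key_id in restricted]


available = ["key-prod-001", "key-dev-002", "key-test-003"]
print(eligible_keys(available, ["key-prod-001"]))                 # vk-prod-main
print(eligible_keys(available, ["key-dev-002", "key-test-003"]))  # vk-dev-main
print(eligible_keys(available, []))                               # vk-unrestricted
```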

Teams

Teams provide organizational grouping for virtual keys with department-level budget management. Teams can belong to one customer and have their own independent budget allocation.
Key Features:
  • Organizational Structure - Group multiple virtual keys
  • Independent Budgets - Department-level cost control (separate from customer budgets)
  • Customer Association - Can belong to one customer (optional)
  • No Rate Limits - Teams cannot have rate limits (VK-level only)

Configuration

  1. Navigate to Teams
    • Open Bifrost UI at http://localhost:8080
    • Go to Governance > Teams
  2. Create Team
Required Fields:
  • Name: Team identifier
  • Customer: Optional parent customer
Budget Settings:
  • Max Limit: Department budget in dollars
  • Reset Duration: Budget reset frequency
Virtual Key Assignment:
  • Assign existing virtual keys to team
  • Create new virtual keys under team
  3. Save Configuration
    • Click Create Team
    • Assign virtual keys to the team

Customers

Customers represent the highest level in the governance hierarchy, typically corresponding to organizations or major business units. They provide top-level budget control and organizational structure.
Key Features:
  • Top-Level Organization - Highest hierarchy level
  • Independent Budgets - Organization-wide cost control (separate from team/VK budgets)
  • Team Management - Contains multiple teams and direct VKs
  • No Rate Limits - Customers cannot have rate limits (VK-level only)

Configuration

  1. Navigate to Customers
    • Open Bifrost UI at http://localhost:8080
    • Go to Governance > Customers
  2. Create Customer
Required Fields:
  • Name: Organization identifier
Budget Settings:
  • Max Limit: Organization budget in dollars
  • Reset Duration: Budget reset frequency
Team Management:
  • View all teams under customer
  • Create new teams under customer
  • Monitor aggregate usage
  3. Save Configuration
    • Click Create Customer
    • Create teams under the customer

Usage & Headers

Required Header

All governance-enabled requests must include the virtual key header:
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "x-bf-vk: vk-engineering-main" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

Optional Audit Headers

Include additional headers for enhanced tracking and audit trails:
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "x-bf-vk: vk-engineering-main" \
  -H "x-bf-team: team-eng-001" \
  -H "x-bf-customer: customer-acme-corp" \
  -H "x-bf-user-id: user-alice" \
  -d '{
    "model": "gpt-4o-mini", 
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
Header Definitions:
  • x-bf-vk - Required virtual key for access control
  • x-bf-team - Optional team identifier for audit trails
  • x-bf-customer - Optional customer identifier for audit trails
  • x-bf-user-id - Optional user identifier for detailed tracking
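For clients that are not using curl, the same headers can be assembled programmatically. The sketch below uses only the Python standard library; the helper names are hypothetical, and the URL is the local endpoint used in the curl examples. It builds the request without sending it.

```python
import json
import urllib.request
from typing import Dict, Optional

BIFROST_URL = "http://localhost:8080/v1/chat/completions"


def build_headers(virtual_key: str,
                  team: Optional[str] = None,
                  customer: Optional[str] = None,
                  user_id: Optional[str] = None) -> Dict[str, str]:
    """Assemble the required x-bf-vk header plus any optional audit headers."""
    headers = {"Content-Type": "application/json", "x-bf-vk": virtual_key}
    optional = {"x-bf-team": team, "x-bf-customer": customer, "x-bf-user-id": user_id}
    # Only include audit headers that were actually supplied.
    headers.update({name: value for name, value in optional.items() if value})
    return headers


def build_request(payload: dict, headers: Dict[str, str]) -> urllib.request.Request:
    """Create a POST request object; pass it to urllib.request.urlopen to send."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(BIFROST_URL, data=body, headers=headers, method="POST")
```

Calling `build_headers("vk-engineering-main", team="team-eng-001")` yields the content type, the virtual key, and the team header, with the customer and user headers omitted.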

Cost Calculation

Bifrost automatically calculates costs based on:
  • Provider Pricing - Real-time model pricing data
  • Token Usage - Input + output tokens from API responses
  • Request Type - Different pricing for chat, text, embedding, speech, transcription
  • Cache Status - Reduced costs for cached responses
  • Batch Operations - Volume discounts for batch requests
Cost calculation details are covered in Architecture > Plugins > Governance.
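As a rough illustration of the token-usage component only, a per-request cost could be computed as below. This is a sketch under stated assumptions: the function name is hypothetical, the prices passed in are placeholders rather than real provider pricing, and the cache and batch adjustments are omitted because their rates are not specified here.

```python
def token_cost(input_tokens: int, output_tokens: int,
               input_price_per_1m: float, output_price_per_1m: float) -> float:
    """Cost in dollars, given prices expressed per one million tokens."""
    total = input_tokens * input_price_per_1m + output_tokens * output_price_per_1m
    return total / 1_000_000


# Placeholder prices, for illustration only.
cost = token_cost(input_tokens=1000, output_tokens=500,
                  input_price_per_1m=1.0, output_price_per_1m=2.0)
```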

Budget Checking Flow

When a request is made with a virtual key, Bifrost checks all applicable budgets independently in the hierarchy. Each budget must have sufficient remaining balance for the request to proceed.
Checking Sequence:
For VK → Team → Customer:
1. ✓ VK Budget (if VK has budget)
2. ✓ Team Budget (if VK's team has budget)  
3. ✓ Customer Budget (if team's customer has budget)
For VK → Customer (direct):
1. ✓ VK Budget (if VK has budget)
2. ✓ Customer Budget (if VK's customer has budget)
For Standalone VK:
1. ✓ VK Budget (if VK has budget)
Important Notes:
  • All applicable budgets must pass - any single budget failure blocks the request
  • Budgets are independent - each tracks its own usage and limits
  • Costs are deducted from all applicable budgets - same cost applied to each level
  • Rate limits checked only at VK level - teams and customers have no rate limits
Example:
  • VK budget: $9 used of $10 ✓
  • Team budget: $15 used of $20 ✓
  • Customer budget: $45 used of $50 ✓
  • Result: Allowed (no budget is exceeded)
  • After request:
    • Request cost: $2
    • Updated usage: VK = $11/$10, Team = $17/$20, Customer = $47/$50
    • The next request will be blocked because the VK budget is now exceeded.
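The check-then-deduct flow can be sketched as follows, using the figures from the example (VK at $9 of $10, team at $15 of $20, customer at $45 of $50, request cost $2). The class and function names are hypothetical. The sketch treats a budget as exceeded when usage is strictly greater than the limit, mirroring the `>` in the budget error messages; how the exactly-equal case is handled is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Budget:
    owner: str            # "VK", "Team", or "Customer"
    max_limit: float      # dollars
    current_usage: float  # dollars


def check_budgets(chain: List[Budget]) -> Optional[str]:
    """Return an error message for the first exceeded budget, else None."""
    for budget in chain:
        if budget.current_usage > budget.max_limit:
            return (f"{budget.owner} budget exceeded: "
                    f"{budget.current_usage:.2f} > {budget.max_limit:.2f} dollars")
    return None


def record_cost(chain: List[Budget], cost: float) -> None:
    # The same cost is deducted from every applicable budget.
    for budget in chain:
        budget.current_usage += cost


chain = [Budget("VK", 10.0, 9.0), Budget("Team", 20.0, 15.0), Budget("Customer", 50.0, 45.0)]
assert check_budgets(chain) is None  # first request is allowed
record_cost(chain, 2.0)              # usage becomes 11, 17, 47
print(check_budgets(chain))          # VK budget exceeded: 11.00 > 10.00 dollars
```

Because the check runs before the cost is known, a request can push a budget past its limit; it is the following request that gets blocked.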

Error Responses

  • Virtual Key Not Found (400)
{
  "error": {
    "type": "virtual_key_required",
    "message": "x-bf-vk header is missing"
  }
}
  • Virtual Key Blocked (403)
{
  "error": {
    "type": "virtual_key_blocked", 
    "message": "Virtual key is inactive"
  }
}
  • Model Not Allowed (403)
{
  "error": {
    "type": "model_blocked",
    "message": "Model 'gpt-4o' is not allowed for this virtual key"
  }
}
  • Provider Not Allowed (403)
{
  "error": {
    "type": "provider_blocked",
    "message": "Provider 'anthropic' is not allowed for this virtual key"
  }
}
  • Rate Limit Exceeded (429)
{
  "error": {
    "type": "rate_limited",
    "message": "Rate limits exceeded: [token limit exceeded (1500/1000, resets every 1h)]"
  }
}
  • Token Limit Exceeded (429)
{
  "error": {
    "type": "token_limited",
    "message": "Rate limits exceeded: [token limit exceeded (1500/1000, resets every 1h)]"
  }
}
  • Request Limit Exceeded (429)
{
  "error": {
    "type": "request_limited", 
    "message": "Rate limits exceeded: [request limit exceeded (101/100, resets every 1m)]"
  }
}
  • Budget Exceeded (402)
{
  "error": {
    "type": "budget_exceeded",
    "message": "Budget check failed: VK budget exceeded: 105.50 > 100.00 dollars"
  }
}
Budget Error Variations:
  • "VK budget exceeded: 105.50 > 100.00 dollars" - Virtual Key budget exceeded
  • "Team budget exceeded: 250.75 > 250.00 dollars" - Team budget exceeded
  • "Customer budget exceeded: 1500.25 > 1500.00 dollars" - Customer budget exceeded
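On the client side, these responses can be handled by reading the `error.type` field alongside the HTTP status. The sketch below is one possible approach, not an official client: the function name and the `retry_later` flag are hypothetical, and it flags only the 429 rate-limit errors as worth retrying after the reset window.

```python
import json

# Error types returned with status 429 in the examples above.
RATE_LIMIT_TYPES = {"rate_limited", "token_limited", "request_limited"}


def classify_error(status: int, body: str) -> dict:
    """Extract the error type and message and suggest whether to retry later."""
    error = json.loads(body).get("error", {})
    error_type = error.get("type", "unknown")
    return {
        "status": status,
        "type": error_type,
        "message": error.get("message", ""),
        # Rate limits clear after their reset duration; other errors need a config change.
        "retry_later": status == 429 and error_type in RATE_LIMIT_TYPES,
    }
```

A `budget_exceeded` (402) response is deliberately not marked retryable here, since it only clears when the budget resets or its limit is raised.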

Reset Durations

Budgets and rate limits support flexible reset durations.
Format Examples:
  • 1m - 1 minute
  • 5m - 5 minutes
  • 1h - 1 hour
  • 1d - 1 day
  • 1w - 1 week
  • 1M - 1 month
Common Patterns:
  • Rate Limits: 1m, 1h, 1d for request throttling
  • Budgets: 1d, 1w, 1M for cost control
  • Development: 5m, 15m for testing scenarios
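The duration strings follow a number-plus-unit pattern in which case matters: lowercase `m` is minutes and uppercase `M` is months. A minimal parser might look like the sketch below. The function name is hypothetical, and treating a month as a fixed 30 days is an assumption made for illustration; Bifrost may reset on calendar months instead.

```python
import re
from datetime import timedelta

# Unit suffix -> length of one unit. "M" as 30 days is an assumption.
_UNITS = {
    "m": timedelta(minutes=1),
    "h": timedelta(hours=1),
    "d": timedelta(days=1),
    "w": timedelta(weeks=1),
    "M": timedelta(days=30),
}


def parse_reset_duration(text: str) -> timedelta:
    """Parse strings such as '5m', '1h', '1d', '1w', or '1M'."""
    match = re.fullmatch(r"(\d+)([mhdwM])", text)
    if match is None:
        raise ValueError(f"unsupported reset duration: {text!r}")
    count, unit = match.groups()
    return int(count) * _UNITS[unit]
```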

Next Steps