Load Balance

Smart Key Distribution

Bifrost’s key management system goes beyond simple API key storage. It provides load balancing, model-specific key filtering, and weighted distribution to optimize performance and manage costs across multiple API keys. When you configure multiple keys for a provider, Bifrost automatically distributes requests among them using weighted random selection, taking key weights, model compatibility, and deployment mappings into account.

How Key Selection Works

Bifrost follows a precise selection process for every request:

Context Override Check: First checks if a key is explicitly provided in context (bypassing management)
Provider Key Lookup: Retrieves all configured keys for the requested provider
Model Filtering: Filters keys that support the requested model
Deployment Validation: For Azure/Bedrock, validates deployment mappings
Weighted Selection: Uses weighted random selection among eligible keys

This ensures optimal key usage while respecting your configuration constraints.
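The five steps above can be sketched as a single function. This is an illustrative Python model, not Bifrost's actual implementation: the `Key` dataclass, the `deployments` field, the `DEPLOYMENT_PROVIDERS` set, and `select_key` are hypothetical names chosen for this example.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Key:
    value: str
    weight: float = 1.0
    models: list = field(default_factory=list)       # empty list = all models
    deployments: dict = field(default_factory=dict)  # model -> deployment (assumed shape)


# Assumption: providers that need a deployment mapping per model.
DEPLOYMENT_PROVIDERS = {"azure", "bedrock"}


def select_key(provider, model, configured, context_key=None, rng=random):
    # 1. Context override: an explicitly supplied key bypasses key management.
    if context_key is not None:
        return context_key
    # 2. Provider key lookup.
    keys = configured.get(provider, [])
    # 3. Model filtering: keep keys that support the requested model.
    keys = [k for k in keys if not k.models or model in k.models]
    # 4. Deployment validation, only for deployment-based providers.
    if provider in DEPLOYMENT_PROVIDERS:
        keys = [k for k in keys if model in k.deployments]
    if not keys:
        raise LookupError(f"no eligible key for {provider}/{model}")
    # 5. Weighted random selection among the remaining keys.
    return rng.choices(keys, weights=[k.weight for k in keys], k=1)[0]
```

Each step only narrows the candidate list, so the weighted draw in step 5 never sees a key that cannot serve the request.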

Implementation Examples

# Configure multiple keys with weights via API
curl -X POST http://localhost:8080/api/providers \
  -H "Content-Type: application/json" \
  -d '{
    "provider": "openai",
    "keys": [
      {
        "value": "env.OPENAI_API_KEY_1",
        "models": ["gpt-4o", "gpt-4o-mini"],
        "weight": 0.7
      },
      {
        "value": "env.OPENAI_API_KEY_2", 
        "models": [],
        "weight": 0.3
      }
    ]
  }'

# Regular request (uses weighted key selection)
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o-mini",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

# Request with direct API key (bypasses key management)
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-your-direct-api-key" \
  -d '{
    "model": "openai/gpt-4o-mini", 
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

Weighted Load Balancing

Bifrost uses weighted random selection to distribute requests across multiple keys. This allows you to:

Control Traffic Distribution:

Assign higher weights to premium keys with better rate limits
Balance between production and backup keys
Gradually migrate traffic during key rotation

Weight Calculation Example:

Key 1: Weight 0.7 (70% probability)
Key 2: Weight 0.3 (30% probability)
Total Weight: 1.0

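Because each request is assigned at random, the observed split approaches these proportions over many requests; a small batch of requests may deviate from them.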
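The percentages in the example follow from dividing each weight by the total weight. The short sketch below shows that arithmetic; it assumes weights are treated as relative values (consistent with drawing a random number between 0 and the total weight), and `selection_probabilities` is a hypothetical helper name, not part of Bifrost.

```python
def selection_probabilities(weights):
    """Map each key name to its chance of being picked: weight / total weight."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("total weight must be positive")
    return {name: weight / total for name, weight in weights.items()}


# Weights from the example above.
probs = selection_probabilities({"key-1": 0.7, "key-2": 0.3})
for name, p in probs.items():
    print(f"{name}: {p:.0%}")
```

Under this assumption, weights of 3 and 1 would produce the same kind of split (75% and 25%) as 0.75 and 0.25.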

Algorithm Details:

Calculate total weight of all eligible keys
Generate random number between 0 and total weight
Select key based on cumulative weight ranges
If the selected key fails, Bifrost automatically falls back to the next available key
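A minimal Python sketch of the algorithm described above: cumulative-weight selection followed by fallback. The function names are hypothetical, and the fallback order (remaining keys in configured order) is an assumption, since the text only says "next available key".

```python
import random


def pick_weighted(keys, rng=random.random):
    """keys is a list of (name, weight) pairs; returns the selected name."""
    if not keys:
        raise ValueError("no eligible keys")
    total = sum(weight for _, weight in keys)
    point = rng() * total            # random number in [0, total)
    cumulative = 0.0
    for name, weight in keys:
        cumulative += weight         # upper bound of this key's range
        if point < cumulative:
            return name
    return keys[-1][0]               # guard against floating-point rounding


def call_with_fallback(keys, send, rng=random.random):
    """Try the weighted pick first, then the other keys (assumed: configured order)."""
    first = pick_weighted(keys, rng)
    order = [first] + [name for name, _ in keys if name != first]
    last_error = None
    for name in order:
        try:
            return send(name)
        except Exception as err:     # a real client would catch narrower errors
            last_error = err
    raise last_error
```

Passing `rng` as a parameter keeps the selection deterministic in tests while defaulting to `random.random` in normal use.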

Model Whitelisting and Filtering

Keys can be restricted to specific models for access control and cost management.

Model Filtering Logic:

Empty models array: Key supports ALL models for that provider
Populated models array: Key only supports listed models
Model mismatch: Key is excluded from selection for that request
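The three rules above reduce to one predicate per key. The sketch below applies it to the two-key OpenAI configuration from the implementation examples; `eligible_keys` is a hypothetical helper written for illustration, not a Bifrost function.

```python
def eligible_keys(keys, model):
    """Keep keys whose models list is empty (all models) or contains the model."""
    return [k for k in keys if not k["models"] or model in k["models"]]


keys = [
    {"value": "env.OPENAI_API_KEY_1", "models": ["gpt-4o", "gpt-4o-mini"], "weight": 0.7},
    {"value": "env.OPENAI_API_KEY_2", "models": [], "weight": 0.3},
]

# Both keys qualify for a listed model; only the unrestricted key
# qualifies for a model that the first key does not list.
print([k["value"] for k in eligible_keys(keys, "gpt-4o-mini")])
print([k["value"] for k in eligible_keys(keys, "o1-preview")])
```

Note that filtering changes the effective split: for a model that only one key supports, that key receives all of the traffic regardless of its weight.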

Use Cases:

Premium Models: Dedicated keys for expensive models (GPT-4, Claude-3)
Team Separation: Different keys for different teams or projects
Cost Control: Restrict access to specific model tiers
Compliance: Separate keys for different security requirements

Example Model Restrictions (the first key serves only premium models, the second only standard models):

{
  "keys": [
    {
      "value": "premium-key",
      "models": ["gpt-4o", "o1-preview"],
      "weight": 1.0
    },
    {
      "value": "standard-key",
      "models": ["gpt-4o-mini", "gpt-3.5-turbo"],
      "weight": 1.0
    }
  ]
}

Deployment Mapping (Azure & Bedrock)

For cloud providers with deployment-based routing, Bifrost validates deployment availability.

Azure OpenAI:

Keys must have deployment mappings for specific models
Deployment name maps to actual Azure deployment identifier
Missing deployment excludes key from selection

AWS Bedrock:

Supports model profiles and direct model access
Deployment mappings enable inference profile routing
ARN configuration determines URL formation

Deployment Validation Process:

Check if provider uses deployments (Azure/Bedrock)
Verify deployment exists for requested model
Exclude keys without proper deployment mapping
Continue with standard weighted selection
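The validation process above can be sketched as a filter that runs before weighted selection. This is a simplified model under stated assumptions: the `deployments` field (a mapping from model name to deployment identifier), the provider names in `DEPLOYMENT_PROVIDERS`, and the function name are all hypothetical and may not match Bifrost's actual configuration schema.

```python
# Assumption: providers that route through per-model deployments.
DEPLOYMENT_PROVIDERS = {"azure", "bedrock"}


def filter_by_deployment(provider, model, keys):
    """Drop keys that lack a deployment mapping for the requested model."""
    # 1. Providers without deployments skip validation entirely.
    if provider not in DEPLOYMENT_PROVIDERS:
        return list(keys)
    # 2-3. Keep only keys with a deployment mapped for this model.
    return [k for k in keys if model in k.get("deployments", {})]


azure_keys = [
    {"value": "azure-key-a", "weight": 1.0, "deployments": {"gpt-4o": "example-deployment"}},
    {"value": "azure-key-b", "weight": 1.0, "deployments": {}},
]

# 4. Whatever survives this filter goes on to standard weighted selection.
survivors = filter_by_deployment("azure", "gpt-4o", azure_keys)
print([k["value"] for k in survivors])
```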

Direct Key Bypass

For scenarios requiring explicit key control, Bifrost supports bypassing the entire key management system:

Go SDK Context Override: Pass a key directly in the request context using schemas.BifrostContextKey. This completely bypasses provider key lookup and selection.
Gateway Header-based Keys: Send API keys in Authorization (Bearer) or x-api-key headers. Requires the allow_direct_keys setting to be enabled.

Enable Direct Keys:

Navigate to the Configuration page
Toggle “Allow Direct Keys” to enabled
Save configuration

When to Use Direct Keys:

Per-user API key scenarios
External key management systems
Testing with specific keys
Debugging key-related issues
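For per-user key scenarios, the header-based bypass can be driven from client code. The sketch below builds the same request as the direct-key curl example using Python's standard library; the gateway URL is the local address used in the examples on this page, and `build_direct_key_request` is a hypothetical helper name.

```python
import json
import urllib.request

# Local gateway address taken from the curl examples; adjust for your deployment.
GATEWAY_URL = "http://localhost:8080/v1/chat/completions"


def build_direct_key_request(api_key, model, prompt):
    """Build (but do not send) a chat completion request carrying a direct key."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        GATEWAY_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            # The Bearer token is the caller's own provider key.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


req = build_direct_key_request("sk-your-direct-api-key", "openai/gpt-4o-mini", "Hello!")
```

Sending it with `urllib.request.urlopen(req)` requires a running gateway with "Allow Direct Keys" enabled.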
