📋 Overview

Bifrost provides drop-in API compatibility for major AI providers:
  • Zero code changes required in your applications
  • Same request/response formats as original APIs
  • Automatic provider routing and fallbacks
  • Enhanced features (multi-provider, tools, monitoring)
Simply change your base_url and keep everything else the same.

🔄 Quick Migration

Before (Direct Provider)

import openai

client = openai.OpenAI(
    base_url="https://api.openai.com/v1",  # Original API
    api_key="your-openai-key"
)

After (Bifrost)

import openai

client = openai.OpenAI(
    base_url="http://localhost:8080/openai",  # Point to Bifrost
    api_key="your-openai-key"
)
That's it! Your application now benefits from Bifrost's features with no other changes.

🌐 Supported Integrations

Provider      | Endpoint Pattern | Compatibility       | Documentation
OpenAI        | /openai/v1/*     | Full compatibility  | OpenAI Compatible
Anthropic     | /anthropic/v1/*  | Full compatibility  | Anthropic Compatible
Google GenAI  | /genai/v1beta/*  | Full compatibility  | GenAI Compatible
LiteLLM       | /litellm/*       | Proxy compatibility | Coming soon

✨ Benefits of Drop-in Integration

📈 Enhanced Capabilities

Your existing code gets these features automatically:
  • Multi-provider fallbacks - Automatic failover between multiple providers, regardless of the SDK you use
  • Load balancing - Distribute requests across multiple API keys
  • Rate limiting - Built-in request throttling and queuing
  • Tool integration - MCP tools available in all requests
  • Monitoring - Prometheus metrics and observability (see the sketch after this list)
  • Cost optimization - Smart routing to cheaper models
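
For example, the monitoring feature can be checked without touching application code. A minimal sketch, assuming your Bifrost deployment exposes Prometheus metrics at the conventional /metrics path (verify the path against your configuration):
import requests

# Scrape the gateway's Prometheus endpoint (path assumed; adjust to your deployment)
metrics = requests.get("http://localhost:8080/metrics", timeout=10).text

# Print any request-related counters the gateway reports
for line in metrics.splitlines():
    if "request" in line and not line.startswith("#"):
        print(line)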

🔒 Security & Control

  • Centralized API key management - Store keys in one secure location
  • Request filtering - Block inappropriate content or requests
  • Usage tracking - Monitor and control API consumption
  • Access controls - Fine-grained permissions per client

🎯 Operational Benefits

  • Single deployment - One service handles all AI providers
  • Unified logging - Consistent request/response logging
  • Performance insights - Cross-provider latency comparison
  • Error handling - Graceful degradation and error recovery

🛠️ Integration Patterns

SDK-based Integration

Use existing SDKs with a modified base URL:
// OpenAI SDK
import OpenAI from "openai";
const openai = new OpenAI({
  baseURL: "http://localhost:8080/openai",
  apiKey: process.env.OPENAI_API_KEY,
});

// Anthropic SDK
import Anthropic from "@anthropic-ai/sdk";
const anthropic = new Anthropic({
  baseURL: "http://localhost:8080/anthropic",
  apiKey: process.env.ANTHROPIC_API_KEY,
});

HTTP Client Integration

For custom HTTP clients:
import requests

# OpenAI format
response = requests.post(
    "http://localhost:8080/openai/v1/chat/completions",
    headers={
        "Authorization": f"Bearer {openai_key}",
        "Content-Type": "application/json"
    },
    json={
        "model": "gpt-4o-mini",
        "messages": [{"role": "user", "content": "Hello!"}]
    }
)

# Anthropic format
response = requests.post(
    "http://localhost:8080/anthropic/v1/messages",
    headers={
        "x-api-key": anthropic_key,
        "Content-Type": "application/json",
    },
    json={
        "model": "claude-3-sonnet-20240229",
        "max_tokens": 1000,
        "messages": [{"role": "user", "content": "Hello!"}]
    }
)

Environment-based Configuration

Use environment variables for easy switching:
# Development - direct to providers
export OPENAI_BASE_URL="https://api.openai.com/v1"
export ANTHROPIC_BASE_URL="https://api.anthropic.com"

# Production - via Bifrost
export OPENAI_BASE_URL="http://bifrost:8080/openai"
export ANTHROPIC_BASE_URL="http://bifrost:8080/anthropic"
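
Application code then stays identical in both environments; a minimal sketch of reading these variables when constructing the client (the fallback default here is just for illustration):
import os
import openai

# Use whichever base URL the environment provides: Bifrost in production,
# the provider's own endpoint in development.
client = openai.OpenAI(
    base_url=os.getenv("OPENAI_BASE_URL", "https://api.openai.com/v1"),
    api_key=os.getenv("OPENAI_API_KEY"),
)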

🌍 Multi-Provider Usage

Provider-Prefixed Models

Use multiple providers seamlessly by prefixing model names with the provider:
import openai

# Single client, multiple providers
client = openai.OpenAI(
    base_url="http://localhost:8080/openai",
    api_key="dummy"  # API keys configured in Bifrost
)

# OpenAI models
response1 = client.chat.completions.create(
    model="gpt-4o-mini", # (default OpenAI since it's OpenAI's SDK)
    messages=[{"role": "user", "content": "Hello!"}]
)

# Anthropic models using OpenAI SDK format
response2 = client.chat.completions.create(
    model="anthropic/claude-3-sonnet-20240229",
    messages=[{"role": "user", "content": "Hello!"}]
)

# Google Vertex models
response3 = client.chat.completions.create(
    model="vertex/gemini-pro",
    messages=[{"role": "user", "content": "Hello!"}]
)

# Azure OpenAI models
response4 = client.chat.completions.create(
    model="azure/gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}]
)

# Local Ollama models
response5 = client.chat.completions.create(
    model="ollama/llama3.1:8b",
    messages=[{"role": "user", "content": "Hello!"}]
)

Provider-Specific Optimization

import openai

client = openai.OpenAI(
    base_url="http://localhost:8080/openai",
    api_key="dummy"
)

def choose_optimal_model(task_type: str, content: str):
    """Choose the best model based on task requirements"""

    if task_type == "code":
        # OpenAI excels at code generation
        return "openai/gpt-4o-mini"

    elif task_type == "creative":
        # Anthropic is great for creative writing
        return "anthropic/claude-3-sonnet-20240229"

    elif task_type == "analysis" and len(content) > 10000:
        # Anthropic has larger context windows
        return "anthropic/claude-3-sonnet-20240229"

    elif task_type == "multilingual":
        # Google models excel at multilingual tasks
        return "vertex/gemini-pro"

    else:
        # Default to fastest/cheapest
        return "openai/gpt-4o-mini"

# Usage examples
code_response = client.chat.completions.create(
    model=choose_optimal_model("code", ""),
    messages=[{"role": "user", "content": "Write a Python web scraper"}]
)

creative_response = client.chat.completions.create(
    model=choose_optimal_model("creative", ""),
    messages=[{"role": "user", "content": "Write a short story about AI"}]
)

🚀 Deployment Scenarios

Microservices Architecture

# docker-compose.yml
version: "3.8"
services:
  bifrost:
    image: maximhq/bifrost
    ports:
      - "8080:8080"
    volumes:
      - ./config.json:/app/config/config.json
    environment:
      - OPENAI_API_KEY
      - ANTHROPIC_API_KEY

  my-app:
    build: .
    environment:
      - OPENAI_BASE_URL=http://bifrost:8080/openai
      - ANTHROPIC_BASE_URL=http://bifrost:8080/anthropic
    depends_on:
      - bifrost

Kubernetes Deployment

apiVersion: apps/v1
kind: Deployment
metadata:
  name: bifrost
spec:
  replicas: 3
  selector:
    matchLabels:
      app: bifrost
  template:
    metadata:
      labels:
        app: bifrost
    spec:
      containers:
        - name: bifrost
          image: maximhq/bifrost:latest
          ports:
            - containerPort: 8080
          env:
            - name: OPENAI_API_KEY
              valueFrom:
                secretKeyRef:
                  name: ai-keys
                  key: openai-key
---
apiVersion: v1
kind: Service
metadata:
  name: bifrost-service
spec:
  selector:
    app: bifrost
  ports:
    - port: 8080
      targetPort: 8080
  type: LoadBalancer

Reverse Proxy Setup

# nginx.conf
upstream bifrost {
    server bifrost:8080;
}

server {
    listen 80;
    server_name api.yourcompany.com;

    # OpenAI proxy
    location /openai/ {
        proxy_pass http://bifrost/openai/;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }

    # Anthropic proxy
    location /anthropic/ {
        proxy_pass http://bifrost/anthropic/;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}

🧪 Testing Integration

Compatibility Testing

Verify your application works with Bifrost:
# Test OpenAI compatibility
curl -X POST http://localhost:8080/openai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4o-mini", "messages": [{"role": "user", "content": "test"}]}'

# Test Anthropic compatibility
curl -X POST http://localhost:8080/anthropic/v1/messages \
  -H "Content-Type: application/json" \
  -d '{"model": "claude-3-sonnet-20240229", "max_tokens": 100, "messages": [{"role": "user", "content": "test"}]}'

Feature Validation

Test enhanced features through compatible APIs:
import openai

client = openai.OpenAI(
    base_url="http://localhost:8080/openai",
    api_key=openai_key
)

# This request automatically gets:
# - Fallback handling
# - MCP tool integration
# - Monitoring
# - Load balancing
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "user", "content": "List files in current directory"}
    ]
)

# Tools are automatically available
print(response.choices[0].message.tool_calls)

📊 Migration Strategies

Gradual Migration

  1. Start with development - Test Bifrost in dev environment
  2. Canary deployment - Route 5% of traffic through Bifrost
  3. Feature-by-feature - Migrate specific endpoints gradually
  4. Full migration - Switch all traffic to Bifrost

Blue-Green Migration

import os
import random

# Route traffic based on feature flag
def get_base_url(provider: str) -> str:
    if os.getenv("USE_BIFROST", "false") == "true":
        return f"http://bifrost:8080/{provider}"
    else:
        # Note: direct endpoints vary by provider (the OpenAI SDK, for example,
        # expects https://api.openai.com/v1)
        return f"https://api.{provider}.com"

# Gradual rollout
def should_use_bifrost() -> bool:
    rollout_percentage = int(os.getenv("BIFROST_ROLLOUT", "0"))
    return random.randint(1, 100) <= rollout_percentage
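
These helpers can then drive client construction; a sketch of one way to wire them up (assumes the functions above and an OPENAI_API_KEY environment variable):
import os
import openai

def make_openai_client() -> openai.OpenAI:
    # Percentage-based rollout decides whether this client routes through Bifrost
    if should_use_bifrost():
        base_url = "http://bifrost:8080/openai"
    else:
        base_url = "https://api.openai.com/v1"
    return openai.OpenAI(base_url=base_url, api_key=os.getenv("OPENAI_API_KEY"))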

Feature Flag Integration

# Using feature flags for safe migration
import os
import openai
from feature_flags import get_flag

def create_client():
    if get_flag("use_bifrost_openai"):
        base_url = "http://bifrost:8080/openai"
    else:
        base_url = "https://api.openai.com"

    return openai.OpenAI(
        base_url=base_url,
        api_key=os.getenv("OPENAI_API_KEY")
    )
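
Call sites stay the same regardless of which backend the flag selects:
client = create_client()
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello!"}]
)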

📚 Integration Guides

Choose your provider integration:

🤖 OpenAI Compatible

🧠 Anthropic Compatible

🔮 Google GenAI Compatible

🔄 Migration Guide

Architecture: For integration design patterns and performance details, see Architecture Documentation.