Overview

Bifrost provides complete Google GenAI API compatibility through protocol adaptation. The integration handles request transformation, response normalization, and error mapping between Google's GenAI API specification and Bifrost's internal processing pipeline. This lets you use Bifrost features such as governance, load balancing, semantic caching, and multi-provider support while preserving your existing Google GenAI SDK-based architecture.

Endpoint: /genai
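As a rough illustration of the request transformation mentioned above, a GenAI-style `contents` payload maps onto a generic chat-message list roughly as follows. This is a hypothetical sketch for intuition only, not Bifrost's actual internals; the function name and simplified shapes are assumptions.

```python
# Hypothetical sketch of the GenAI -> internal request mapping.
# Not Bifrost's actual code; payload shapes are simplified.

def genai_to_messages(contents):
    """Map GenAI 'contents' (a bare string or a list of
    role/parts dicts) onto a generic chat-message list."""
    if isinstance(contents, str):
        # A bare string is treated as a single user turn.
        return [{"role": "user", "content": contents}]
    messages = []
    for turn in contents:
        # GenAI uses the role "model" where chat APIs use "assistant".
        role = "assistant" if turn.get("role") == "model" else turn.get("role", "user")
        text = "".join(part.get("text", "") for part in turn.get("parts", []))
        messages.append({"role": role, "content": text})
    return messages

print(genai_to_messages("Hello!"))
# -> [{'role': 'user', 'content': 'Hello!'}]
```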

Setup

from google import genai
from google.genai.types import HttpOptions

# Configure client to use Bifrost
client = genai.Client(
    api_key="dummy-key",  # Keys handled by Bifrost
    http_options=HttpOptions(base_url="http://localhost:8080/genai")
)

# Make requests as usual
response = client.models.generate_content(
    model="gemini-1.5-flash",
    contents="Hello!"
)

print(response.text)

Provider/Model Usage Examples

Use multiple providers through the same GenAI SDK format by prefixing model names with the provider:
from google import genai
from google.genai.types import HttpOptions

client = genai.Client(
    api_key="dummy-key",
    http_options=HttpOptions(base_url="http://localhost:8080/genai")
)

# Google Vertex models (default)
vertex_response = client.models.generate_content(
    model="gemini-1.5-flash",
    contents="Hello from Gemini!"
)

# OpenAI models via GenAI SDK format
openai_response = client.models.generate_content(
    model="openai/gpt-4o-mini",
    contents="Hello from OpenAI!"
)

# Anthropic models via GenAI SDK format
anthropic_response = client.models.generate_content(
    model="anthropic/claude-3-sonnet-20240229",
    contents="Hello from Claude!"
)

# Azure OpenAI models
azure_response = client.models.generate_content(
    model="azure/gpt-4o",
    contents="Hello from Azure!"
)

# Local Ollama models
ollama_response = client.models.generate_content(
    model="ollama/llama3.1:8b",
    contents="Hello from Ollama!"
)

Adding Custom Headers

Pass custom headers required by Bifrost plugins (like governance, telemetry, etc.):
from google import genai
from google.genai.types import HttpOptions

# Configure client with custom headers
client = genai.Client(
    api_key="dummy-key",
    http_options=HttpOptions(
        base_url="http://localhost:8080/genai",
        headers={
            "x-bf-vk": "vk_12345",  # Virtual key for governance
            "x-bf-user-id": "user_789",  # User identification
            "x-bf-team-id": "team_456",  # Team identification
            "x-bf-trace-id": "trace_abc123",  # Request tracing
        }
    )
)

response = client.models.generate_content(
    model="gemini-1.5-flash",
    contents="Hello with custom headers!"
)

Using Direct Keys

Pass API keys directly in requests to bypass Bifrost’s load balancing. You can pass any provider’s API key (OpenAI, Anthropic, Mistral, etc.) since Bifrost only looks for Authorization or x-api-key headers. This requires the Allow Direct API keys option to be enabled in Bifrost configuration.
Learn more: See Quickstart Configuration for enabling direct API key usage.
from google import genai
from google.genai.types import GenerateContentConfig, HttpOptions

# Pass different provider keys per request using headers
client = genai.Client(
    api_key="dummy-key",
    http_options=HttpOptions(base_url="http://localhost:8080/genai")
)

# Use Anthropic key for Claude models
anthropic_response = client.models.generate_content(
    model="anthropic/claude-3-sonnet-20240229",
    contents="Hello Claude!",
    config=GenerateContentConfig(
        http_options=HttpOptions(headers={"x-api-key": "your-anthropic-api-key"})
    )
)

# Use OpenAI key for GPT models
openai_response = client.models.generate_content(
    model="openai/gpt-4o-mini",
    contents="Hello GPT!",
    config=GenerateContentConfig(
        http_options=HttpOptions(headers={"Authorization": "Bearer sk-your-openai-key"})
    )
)

Supported Features

The Google GenAI integration supports every feature available in both the Google GenAI SDK and Bifrost core: if the SDK supports it and Bifrost supports it, it works through the integration seamlessly. 😄

Next Steps