Google Gemini Provider on Bifrost

Bifrost routes Google Gemini models through OpenAI-compatible gateway endpoints. The integration covers chat, embeddings, image generation (including Imagen), video generation, speech, and file operations, and converts OpenAI request parameters to their Gemini equivalents.

Gemini provider summary

This summary lists commonly used model IDs and the routing details for the provider.

Common Gemini model IDs used in Bifrost routes:

  • gemini-2.0-flash-001 (Latest)
  • gemini-1.5-pro-001 (High capability)
  • gemini-1.5-flash-001 (Fast)
  • embedding-001 (Embeddings)

Property | Details
Description | Google's Gemini models for chat, embeddings, image/video generation, and speech.
Provider route on Bifrost | gemini/<model>
Provider doc | Google AI
API endpoint for provider | https://generativelanguage.googleapis.com

Supported operations

Bifrost exposes these operations through OpenAI-compatible gateway routes; the table lists upstream Google Gemini API endpoints. Chat, Responses, Speech, and Transcriptions support streaming. Image Variation is not supported upstream. See Supported operations in Bifrost docs.

Operation | Non-streaming | Streaming | Upstream endpoint
Chat Completions | Yes | Yes | /v1beta/models/{model}:generateContent
Responses API | Yes | Yes | /v1beta/models/{model}:generateContent
Speech (TTS) | Yes | Yes | /v1beta/models/{model}:generateContent
Transcriptions (STT) | Yes | Yes | /v1beta/models/{model}:generateContent
Image Generation | Yes | No | /v1beta/models/{model}:generateContent or :predict (Imagen)
Image Edit | Yes | No | /v1beta/models/{model}:generateContent or :predict (Imagen)
Video Generation | Yes | No | /v1beta/models/{model}:predictLongRunning
Image Variation | No | No | -
Embeddings | Yes | No | /v1beta/models/{model}:embedContent
Files | Yes | No | /upload/storage/v1beta/files
Batch | Yes | No | /v1beta/batchJobs
List Models | Yes | No | /v1beta/models

Supported OpenAI parameters

Quick reference of OpenAI parameters accepted when routing through Gemini via Bifrost.

[
  "stream",
  "temperature",
  "top_p",
  "max_tokens",
  "max_completion_tokens",
  "stop",
  "tools",
  "tool_choice",
  "user",
  "reasoning",
  "response_format"
]

Supported Gemini models

Use the provider prefix gemini/ in Bifrost model routes for deterministic provider targeting.

Family | Model ID | Bifrost route | Typical usage
Gemini 2.0 Flash | gemini-2.0-flash-001 | gemini/gemini-2.0-flash-001 | Latest flagship
Gemini 1.5 Pro | gemini-1.5-pro-001 | gemini/gemini-1.5-pro-001 | High capability
Gemini 1.5 Flash | gemini-1.5-flash-001 | gemini/gemini-1.5-flash-001 | Fast, efficient
Gemini Embedding | embedding-001 | gemini/embedding-001 | Embeddings
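
The route format above can be illustrated with a small sketch. The helper names `to_route` and `split_route` are hypothetical and are not part of Bifrost; only the `gemini/<model>` format comes from this guide.

```python
from typing import Tuple

PROVIDER = "gemini"


def to_route(model_id: str, provider: str = PROVIDER) -> str:
    """Prefix a bare model ID with the provider, giving <provider>/<model>."""
    if "/" in model_id:
        raise ValueError(f"model ID already carries a prefix: {model_id!r}")
    return f"{provider}/{model_id}"


def split_route(route: str) -> Tuple[str, str]:
    """Split <provider>/<model> into its two parts."""
    provider, sep, model = route.partition("/")
    if not sep or not provider or not model:
        raise ValueError(f"expected <provider>/<model>, got {route!r}")
    return provider, model
```

For example, `to_route("gemini-2.0-flash-001")` yields the route shown in the first row of the table.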

Multimodal capabilities

Gemini vision models support text, images (URL and base64), video, audio, PDFs, and code execution. Multiple images per message are supported.

Supported content types:

  • ✅ Text content
  • ✅ Image URLs (http, https)
  • ✅ Base64-encoded images
  • ✅ Video files
  • ✅ Audio content
  • ✅ PDF documents
  • ✅ Code execution context

Authentication

Gemini supports API key authentication and OAuth2 Bearer token authentication. Bifrost selects the appropriate method based on the upstream endpoint type. See Authentication in Bifrost docs.

API key authentication

API keys can be sent in two ways depending on the endpoint:

Header method (standard Gemini endpoints)

  • Format: x-goog-api-key: YOUR_API_KEY
  • Used for standard routes such as /v1beta/models/{model}:generateContent

Query parameter method (Imagen and custom endpoints)

  • Format: ?key=YOUR_API_KEY appended to the request URL
  • Used for Imagen models and other custom endpoints
https://generativelanguage.googleapis.com/v1beta/models/imagen-4.0-generate-001:predict?key=YOUR_API_KEY

Bifrost automatically chooses header vs query-parameter API key auth based on the endpoint. Configure your Gemini API key in Bifrost provider settings; OAuth2 Bearer tokens are also supported where applicable.
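
As a rough illustration of the header versus query-parameter choice, the sketch below picks a method from the upstream URL. The rule used here (paths ending in `:predict` get `?key=`, everything else gets the `x-goog-api-key` header) is a simplifying assumption, and `apply_api_key` is a hypothetical helper; the guide does not describe Bifrost's actual detection logic.

```python
from typing import Dict, Tuple
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def apply_api_key(url: str, api_key: str) -> Tuple[str, Dict[str, str]]:
    """Return (url, headers) with the API key attached in the assumed way."""
    parts = urlsplit(url)
    if parts.path.endswith(":predict"):
        # Assumed Imagen-style endpoint: append the key as a query parameter.
        query = parse_qsl(parts.query, keep_blank_values=True)
        query.append(("key", api_key))
        return urlunsplit(parts._replace(query=urlencode(query))), {}
    # Standard Gemini endpoint: send the key in a header, leave the URL as is.
    return url, {"x-goog-api-key": api_key}
```

Keeping the decision in one function means the rest of a client never has to know which endpoints take which form of the key.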

API reference

OpenAI-compatible Bifrost gateway routes mapped to Google Gemini upstream APIs. Content aligned with Bifrost Gemini provider docs.

1) Chat Completions

Primary path via POST /v1/chat/completions. Upstream: /v1beta/models/{model}:generateContent. Supports multimodal input, tools, thinking, and streaming.

Parameter | Gemini handling | Notes
max_completion_tokens | maxOutputTokens | -
temperature, top_p | Direct pass-through | -
stop | stopSequences | -
response_format | responseMimeType + responseJsonSchema | -
tools / tool_choice | functionCallingConfig | See tool choice mapping
reasoning | thinkingConfig | effort → thinkingLevel; max_tokens → thinkingBudget
top_k, penalties, seed | Via extra_params | Gemini-specific

Dropped: logit_bias, logprobs, top_logprobs, parallel_tool_calls, service_tier.

Tool choice: auto → AUTO, none → NONE, required → ANY. Assistant role maps to model; consecutive tool messages merge into one user message.

curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini/gemini-2.0-flash-001",
    "messages": [{"role": "user", "content": "Hello"}],
    "top_k": 40
  }'
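
The parameter and tool-choice mapping described in this section can be sketched as follows. This is an illustration under stated assumptions, not Bifrost's implementation: `to_gemini_options` and `dropped_params` are hypothetical names, treating `max_tokens` as a fallback for `max_completion_tokens` is an assumption, and only the string form of `tool_choice` is handled.

```python
TOOL_CHOICE_MODES = {"auto": "AUTO", "none": "NONE", "required": "ANY"}
DROPPED_PARAMS = {"logit_bias", "logprobs", "top_logprobs",
                  "parallel_tool_calls", "service_tier"}


def to_gemini_options(request: dict) -> dict:
    """Translate a subset of OpenAI chat parameters into Gemini request fields."""
    config = {}
    # Assumption: max_tokens acts as a fallback for max_completion_tokens.
    max_tokens = request.get("max_completion_tokens", request.get("max_tokens"))
    if max_tokens is not None:
        config["maxOutputTokens"] = max_tokens
    if "temperature" in request:
        config["temperature"] = request["temperature"]
    if "top_p" in request:
        config["topP"] = request["top_p"]  # value unchanged; Gemini's field is topP
    stop = request.get("stop")
    if stop is not None:
        config["stopSequences"] = [stop] if isinstance(stop, str) else list(stop)

    body = {"generationConfig": config}
    choice = request.get("tool_choice")
    if isinstance(choice, str):
        mode = TOOL_CHOICE_MODES[choice]
        body["toolConfig"] = {"functionCallingConfig": {"mode": mode}}
    return body


def dropped_params(request: dict) -> list:
    """List the parameters in this request that would be silently dropped."""
    return sorted(DROPPED_PARAMS & set(request))
```

Reporting dropped parameters separately makes it easier to warn callers who rely on options such as `logprobs`.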

2) Responses API

Same upstream generateContent with Responses ↔ Gemini conversion. Gateway: POST /v1/responses.

Parameter | Transformation | Notes
max_output_tokens | maxOutputTokens | -
instructions | System instruction text | -
input | Messages | String or array
text | responseMimeType + responseJsonSchema | -
tools / reasoning | Same as Chat Completions | -
stop, top_k | Via extra_params | stop → stopSequences

  • Tools: function, computer_use_preview, web_search, mcp
  • Streaming emits content_part.added for text and reasoning

curl -X POST http://localhost:8080/v1/responses \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini/gemini-2.0-flash-001",
    "input": "Hello, how are you?",
    "instructions": "You are a helpful assistant."
  }'
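
To make the Responses-to-Gemini conversion concrete, here is a minimal sketch covering only text input. `responses_to_gemini` is a hypothetical name, and the handling of array input (a list of `{"role", "content"}` messages with plain-string content) is an assumption for illustration.

```python
def responses_to_gemini(request: dict) -> dict:
    """Build a Gemini generateContent body from a text-only Responses request."""
    body = {}
    instructions = request.get("instructions")
    if instructions:
        body["systemInstruction"] = {"parts": [{"text": instructions}]}

    raw_input = request.get("input", "")
    if isinstance(raw_input, str):
        # A bare string is treated as a single user message.
        messages = [{"role": "user", "content": raw_input}]
    else:
        messages = raw_input

    body["contents"] = [
        {
            "role": "model" if message["role"] == "assistant" else "user",
            "parts": [{"text": message["content"]}],
        }
        for message in messages
    ]
    if "max_output_tokens" in request:
        body["generationConfig"] = {"maxOutputTokens": request["max_output_tokens"]}
    return body
```

Passing the JSON body from the curl example above produces one `systemInstruction` part and one user entry in `contents`.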

3) Speech (TTS)

Text-to-speech via chat generation with responseModalities: ["AUDIO"]. Gateway: POST /v1/audio/speech. Supports streaming.

Parameter | Gemini handling | Notes
input | contents[0].parts[0].text | Text to synthesize
voice | speechConfig.voiceConfig.prebuiltVoiceConfig.voiceName | e.g. Chant-Female
response_format | wav only (default) | PCM from Gemini auto-converted to WAV

Gemini returns PCM (s16le, 24kHz, mono); Bifrost converts to WAV when response_format: "wav" (default). Multi-speaker configs supported via multiSpeakerVoiceConfig.

curl -X POST http://localhost:8080/v1/audio/speech \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini/gemini-2.0-flash-001",
    "input": "Hello, welcome to Bifrost.",
    "voice": "Chant-Female"
  }'
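
The PCM-to-WAV step can be reproduced with Python's standard `wave` module. This is a sketch of the general technique using the audio format stated above (s16le, 24 kHz, mono); `pcm_to_wav` is a hypothetical helper, and Bifrost's own conversion code is not shown in this guide.

```python
import io
import wave


def pcm_to_wav(pcm: bytes, sample_rate: int = 24000,
               channels: int = 1, sample_width: int = 2) -> bytes:
    """Wrap raw little-endian PCM samples in a WAV container."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(channels)       # mono
        wav_file.setsampwidth(sample_width)   # 2 bytes per sample = 16-bit
        wav_file.setframerate(sample_rate)    # 24 kHz
        wav_file.writeframes(pcm)
    return buffer.getvalue()
```

The output differs from the input only by the RIFF/WAVE header, so the audio samples themselves are left untouched.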

4) Transcriptions (STT)

Implemented as chat completion with audio inline data. Gateway: POST /v1/audio/transcriptions. Supports streaming.

Parameter | Transformation | Notes
file | inlineData in contents | Audio bytes with MIME detection
prompt | First text part | Defaults to transcript prompt
language | Via extra_params | If supported by model

curl -X POST http://localhost:8080/v1/audio/transcriptions \
  -F "model=gemini/gemini-2.0-flash-001" \
  -F "file=@audio.wav" \
  -F "prompt=Transcribe this audio in the original language."
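
The transformation in the table above (audio as `inlineData`, prompt as the first text part) might look like the sketch below. Several details are assumptions: `build_transcription_contents` is a hypothetical name, MIME detection is done by file extension from a small hand-written table, and `DEFAULT_PROMPT` is a placeholder because the guide does not give Bifrost's default prompt text.

```python
import base64
import os

# Assumed extension-to-MIME table; real detection may inspect the bytes instead.
AUDIO_MIME_TYPES = {
    ".wav": "audio/wav",
    ".mp3": "audio/mp3",
    ".flac": "audio/flac",
    ".ogg": "audio/ogg",
}
DEFAULT_PROMPT = "Transcribe this audio."  # placeholder, not Bifrost's actual default


def build_transcription_contents(audio: bytes, filename: str, prompt=None) -> list:
    """Build Gemini `contents` with the prompt first and the audio inline."""
    extension = os.path.splitext(filename)[1].lower()
    mime_type = AUDIO_MIME_TYPES.get(extension)
    if mime_type is None:
        raise ValueError(f"unsupported audio extension: {extension!r}")
    return [{
        "role": "user",
        "parts": [
            {"text": prompt or DEFAULT_PROMPT},
            {"inlineData": {
                "mimeType": mime_type,
                "data": base64.b64encode(audio).decode("ascii"),
            }},
        ],
    }]
```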

5) Embeddings

Single and batch text embeddings. Gateway: POST /v1/embeddings. Upstream: /v1beta/models/{model}:embedContent. Non-streaming.

Parameter | Transformation | Notes
input | content.parts[0].text | Arrays joined with space for batch
dimensions | outputDimensionality | -
task type, title | Via extra_params | -

  • embeddings[].values → data[].embedding
  • Usage from metadata.billableCharacterCount and token metadata

curl -X POST http://localhost:8080/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini/embedding-001",
    "input": "Hello world",
    "dimensions": 768
  }'
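
The response-side mapping (embeddings[].values to data[].embedding) can be sketched as below. `to_openai_embeddings` is a hypothetical name, and usage accounting is left out because the guide does not spell out how the metadata fields are combined.

```python
def to_openai_embeddings(gemini_response: dict, model: str) -> dict:
    """Reshape a Gemini embeddings response into the OpenAI list format."""
    data = [
        {"object": "embedding", "index": index, "embedding": item["values"]}
        for index, item in enumerate(gemini_response.get("embeddings", []))
    ]
    return {"object": "list", "model": model, "data": data}
```

Each vector keeps its upstream order, which is what lets callers match `data[i]` back to the i-th input.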

6) Batch API

Inline request arrays or file-based batch input. Gateway maps to OpenAI-style /v1/batches; upstream /v1beta/batchJobs.

  • POST /v1beta/batchJobs — create
  • GET /v1beta/batchJobs — list (pageToken)
  • GET /v1beta/batchJobs/{batch_id} — retrieve
  • POST /v1beta/batchJobs/{batch_id}:cancel — cancel

Status mapping includes in_progress, completed, failed, cancelled, expired. Results as inline responses or JSONL file output.

7) Files API

Upload files for batch jobs and multimodal requests. S3-style upload path on Google. Gateway: /v1/files.

  • POST /upload/storage/v1beta/files — upload (multipart)
  • GET /v1beta/files — list
  • GET /v1beta/files/{file_id} — retrieve metadata
  • DELETE /v1beta/files/{file_id} — delete
  • GET /v1beta/files/{file_id}/content — download

Field mapping: name → id, displayName → filename, createTime (RFC 3339) → Unix timestamp.

curl -X POST http://localhost:8080/v1/files \
  -F "file=@document.pdf" \
  -F "filename=document.pdf"
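
The field mapping for file metadata can be illustrated with the sketch below. `rfc3339_to_unix` and `to_openai_file` are hypothetical helpers; the output keys `id`, `filename`, and `created_at` follow the OpenAI file object, and the fractional-second trimming is there only so that `datetime.fromisoformat` accepts nanosecond timestamps on older Python versions.

```python
import re
from datetime import datetime


def rfc3339_to_unix(timestamp: str) -> int:
    """Convert an RFC 3339 timestamp such as 2024-01-01T00:00:00Z to Unix seconds."""
    text = timestamp.strip()
    if text.endswith(("Z", "z")):
        text = text[:-1] + "+00:00"
    # Normalize fractional seconds to exactly six digits (microseconds).
    text = re.sub(r"\.(\d{1,9})",
                  lambda m: "." + m.group(1)[:6].ljust(6, "0"),
                  text, count=1)
    return int(datetime.fromisoformat(text).timestamp())


def to_openai_file(gemini_file: dict) -> dict:
    """Map Gemini file metadata onto OpenAI-style file fields."""
    return {
        "id": gemini_file["name"],
        "object": "file",
        "filename": gemini_file.get("displayName", ""),
        "created_at": rfc3339_to_unix(gemini_file["createTime"]),
    }
```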

8) Image Generation

Gemini models use :generateContent with responseModalities: ["IMAGE"]. Imagen models use :predict (auto-detected; API key via ?key=). Non-streaming.

Parameter | Handling | Notes
prompt | Text / Instances[0].Prompt | Gemini vs Imagen path
n | candidateCount or sampleCount | Model-dependent
size | WxH → aspectRatio + imageSize | Imagen: 1k/2k buckets
output_format | MIME type | png, jpeg, webp
seed, negative_prompt | Direct pass-through | -

curl -X POST http://localhost:8080/v1/images/generations \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini/imagen-4.0-generate-001",
    "prompt": "A sunset over the mountains",
    "size": "1024x1024",
    "n": 1,
    "output_format": "png"
  }'
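
One plausible way to derive an aspect ratio from a WxH `size` string is to reduce the two dimensions by their greatest common divisor, as sketched below. This is an assumption about the general approach; `size_to_aspect_ratio` is a hypothetical helper, and Bifrost may instead snap to a fixed list of supported ratios or apply the imageSize buckets differently.

```python
from math import gcd


def size_to_aspect_ratio(size: str) -> str:
    """Reduce a WxH size string to a W:H aspect ratio."""
    width_text, sep, height_text = size.lower().partition("x")
    if not sep:
        raise ValueError(f"expected WxH, got {size!r}")
    width, height = int(width_text), int(height_text)
    if width <= 0 or height <= 0:
        raise ValueError(f"dimensions must be positive: {size!r}")
    divisor = gcd(width, height)
    return f"{width // divisor}:{height // divisor}"
```

With this rule the `1024x1024` request above maps to `1:1`.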

9) Image Edit

Requests must use multipart/form-data. Both the Gemini and Imagen paths are supported; Imagen additionally supports the inpainting, outpainting, inpaint_removal, and bgswap edit modes. Image variation is not supported.

curl -X POST http://localhost:8080/v1/images/edits \
  -F "model=gemini/gemini-2.0-flash-001" \
  -F "prompt=Add a rainbow in the sky" \
  -F "image[]=@photo.png;type=image/png"

10) List Models

Lists Gemini models with OpenAI-style metadata. Gateway: GET /v1/models. Upstream: GET /v1beta/models with pageSize / pageToken.

  • name → id (with gemini/ prefix)
  • displayName → name
  • inputTokenLimit / outputTokenLimit → max token fields

curl http://localhost:8080/v1/models
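
The per-model field mapping can be sketched as follows. `to_openai_model` is a hypothetical helper; stripping a leading `models/` from the upstream name is an assumption about the upstream format, and the keys `max_input_tokens` and `max_output_tokens` are placeholders because the guide only says "max token fields".

```python
def to_openai_model(gemini_model: dict, provider: str = "gemini") -> dict:
    """Map one upstream model entry onto OpenAI-style model metadata."""
    name = gemini_model["name"]
    # Assumed upstream form "models/<id>"; keep only the bare ID.
    if name.startswith("models/"):
        name = name[len("models/"):]
    return {
        "id": f"{provider}/{name}",
        "object": "model",
        "name": gemini_model.get("displayName", name),
        "max_input_tokens": gemini_model.get("inputTokenLimit"),    # placeholder key
        "max_output_tokens": gemini_model.get("outputTokenLimit"),  # placeholder key
    }
```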

11) Video Generation

Veo models via long-running predictLongRunning. JSON body on POST /v1/videos. Poll with GET /v1/videos/{id}; download via /content.

Operation | Supported | Gateway
Generate | Yes | POST /v1/videos
Retrieve status | Yes | GET /v1/videos/{id}
Download | Yes | GET /v1/videos/{id}/content
Delete / List / Remix | No | Not supported

size maps to aspect ratio (e.g. 1280x720 → 16:9). Safety filters may return failed with content_filter.

curl -X POST http://localhost:8080/v1/videos \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini/veo-3.1-generate-preview",
    "prompt": "A calico cat playing piano on stage",
    "seconds": "8",
    "size": "1280x720"
  }'
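
Because generation is long-running, a client has to poll for completion. The sketch below shows one way to structure that loop. `wait_for_video` is a hypothetical helper, `fetch_status` stands in for whatever function performs `GET /v1/videos/{id}` and returns the parsed JSON, and the status value `completed` is an assumption (only `failed` is named in this section).

```python
import time
from typing import Callable


def wait_for_video(fetch_status: Callable[[str], dict], video_id: str,
                   interval: float = 5.0, timeout: float = 600.0,
                   sleep: Callable[[float], None] = time.sleep) -> dict:
    """Poll a video job until it completes, fails, or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        job = fetch_status(video_id)
        status = job.get("status")
        if status == "completed":  # assumed terminal success status
            return job
        if status == "failed":
            # Safety filters may surface here, for example as content_filter.
            raise RuntimeError(f"video {video_id} failed: {job.get('error')}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"video {video_id} not ready after {timeout} seconds")
        sleep(interval)
```

Injecting `fetch_status` and `sleep` keeps the loop independent of any HTTP library and makes it easy to exercise without network access.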

Implementation caveats

Caveat | Impact | Severity
Role remapping | Assistant role maps to "model" in Gemini format | Low
System message handling | System instructions become systemInstruction field (separate) | Medium
Consecutive tool messages | Merged into single user message per Gemini requirements | Medium
Thinking content marking | Thinking blocks appear as marked text parts, not separate | Low
Function call arguments | Converted from objects to JSON strings (requires parsing) | Medium
Streaming finish reasons | Only appear in final chunk; no early completion detection | Low
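
Several of these caveats are easier to see in code. The sketch below illustrates role remapping, the separate systemInstruction field, merging of consecutive tool messages, and serializing function call arguments to a JSON string. It is an illustration under assumptions, not Bifrost's converter: `convert_messages` and `function_call_to_openai` are hypothetical names, tool messages are assumed to carry a `name` field, and only plain-text content is handled.

```python
import json


def convert_messages(messages: list) -> dict:
    """Convert OpenAI-style chat messages into a Gemini-style request body."""
    system_parts, contents = [], []
    previous_was_tool = False
    for message in messages:
        role = message["role"]
        if role == "system":
            # System text is kept apart from the conversation turns.
            system_parts.append({"text": message["content"]})
            continue
        if role == "tool":
            part = {"functionResponse": {
                "name": message.get("name", ""),  # assumed field on tool messages
                "response": {"content": message["content"]},
            }}
            if previous_was_tool:
                contents[-1]["parts"].append(part)  # merge into the same user turn
            else:
                contents.append({"role": "user", "parts": [part]})
            previous_was_tool = True
            continue
        contents.append({
            "role": "model" if role == "assistant" else "user",
            "parts": [{"text": message.get("content") or ""}],
        })
        previous_was_tool = False
    body = {"contents": contents}
    if system_parts:
        body["systemInstruction"] = {"parts": system_parts}
    return body


def function_call_to_openai(call: dict) -> dict:
    """Serialize Gemini function call args (an object) as an OpenAI JSON string."""
    return {"type": "function", "function": {
        "name": call["name"],
        "arguments": json.dumps(call.get("args", {})),
    }}
```

Callers reading `arguments` from the converted tool call need to run `json.loads` on it, which is the parsing step the table refers to.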
