Gemini provider summary
Bifrost routes Google Gemini models with full OpenAI compatibility. Gemini provides advanced multimodal capabilities including chat, embeddings, image generation (Imagen), video generation, speech, and comprehensive file handling.
Common Gemini model IDs used in Bifrost routes:
- gemini-2.0-flash-001 (Latest)
- gemini-1.5-pro-001 (High capability)
- gemini-1.5-flash-001 (Fast)
- embedding-001 (Embeddings)
| Property | Details |
|---|---|
| Description | Google's Gemini models for chat, embeddings, image/video generation, and speech. |
| Provider route on Bifrost | gemini/<model> |
| Provider doc | Google AI |
| API endpoint for provider | https://generativelanguage.googleapis.com |
Supported operations
Bifrost exposes these operations through OpenAI-compatible gateway routes; the table lists upstream Google Gemini API endpoints. Chat, Responses, Speech, and Transcriptions support streaming. Image Variation is not supported upstream. See Supported operations in Bifrost docs.
| Operation | Non-streaming | Streaming | Upstream endpoint |
|---|---|---|---|
| Chat Completions | Yes | Yes | /v1beta/models/{model}:generateContent |
| Responses API | Yes | Yes | /v1beta/models/{model}:generateContent |
| Speech (TTS) | Yes | Yes | /v1beta/models/{model}:generateContent |
| Transcriptions (STT) | Yes | Yes | /v1beta/models/{model}:generateContent |
| Image Generation | Yes | No | /v1beta/models/{model}:generateContent or :predict (Imagen) |
| Image Edit | Yes | No | /v1beta/models/{model}:generateContent or :predict (Imagen) |
| Video Generation | Yes | No | /v1beta/models/{model}:predictLongRunning |
| Image Variation | No | No | - |
| Embeddings | Yes | No | /v1beta/models/{model}:embedContent |
| Files | Yes | No | /upload/storage/v1beta/files |
| Batch | Yes | No | /v1beta/batchJobs |
| List Models | Yes | No | /v1beta/models |
Supported OpenAI parameters
Quick reference of OpenAI parameters accepted when routing through Gemini via Bifrost.
[ "stream", "temperature", "top_p", "max_tokens", "max_completion_tokens", "stop", "tools", "tool_choice", "user", "reasoning", "response_format" ]
Supported Gemini models
Use the provider prefix gemini/ in Bifrost model routes for deterministic provider targeting.
| Family | Model ID | Bifrost route | Typical usage |
|---|---|---|---|
| Gemini 2.0 Flash | gemini-2.0-flash-001 | gemini/gemini-2.0-flash-001 | Latest flagship |
| Gemini 1.5 Pro | gemini-1.5-pro-001 | gemini/gemini-1.5-pro-001 | High capability |
| Gemini 1.5 Flash | gemini-1.5-flash-001 | gemini/gemini-1.5-flash-001 | Fast, efficient |
| Gemini Embedding | embedding-001 | gemini/embedding-001 | Embeddings |
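As a small illustration of the `gemini/<model>` route convention, the sketch below builds and splits route strings. The helper names `to_bifrost_route` and `split_route` are hypothetical and are not part of Bifrost.

```python
from typing import Tuple


def to_bifrost_route(model_id: str, provider: str = "gemini") -> str:
    """Prefix a bare model ID with the provider; leave prefixed routes unchanged."""
    if model_id.startswith(provider + "/"):
        return model_id
    return f"{provider}/{model_id}"


def split_route(route: str) -> Tuple[str, str]:
    """Split 'provider/model' into its two parts, rejecting malformed routes."""
    provider, sep, model = route.partition("/")
    if not sep or not provider or not model:
        raise ValueError(f"expected '<provider>/<model>', got {route!r}")
    return provider, model


print(to_bifrost_route("gemini-2.0-flash-001"))
print(split_route("gemini/embedding-001"))
```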
Multimodal capabilities
Gemini vision models support text, images (URL and base64), video, audio, PDFs, and code execution. Multiple images per message are supported.
Supported content types:
- ✅ Text content
- ✅ Image URLs (http, https)
- ✅ Base64-encoded images
- ✅ Video files
- ✅ Audio content
- ✅ PDF documents
- ✅ Code execution context
Authentication
Gemini supports API key authentication and OAuth2 Bearer token authentication. Bifrost selects the appropriate method based on the upstream endpoint type. See Authentication in Bifrost docs.
API key authentication
API keys can be sent in two ways depending on the endpoint:
Header method (standard Gemini endpoints)
- Format: x-goog-api-key: YOUR_API_KEY
- Used for standard routes such as /v1beta/models/{model}:generateContent
Query parameter method (Imagen and custom endpoints)
- Format: ?key=YOUR_API_KEY appended to the request URL
- Used for Imagen models and other custom endpoints
https://generativelanguage.googleapis.com/v1beta/models/imagen-4.0-generate-001:predict?key=YOUR_API_KEY
Bifrost automatically chooses header vs query-parameter API key auth based on the endpoint. Configure your Gemini API key in Bifrost provider settings; OAuth2 Bearer tokens are also supported where applicable.
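The header-versus-query choice can be sketched as a pure function. This is an illustration only: the rule used here (any path ending in `:predict` gets the query parameter) is an assumption inferred from the Imagen example above, not Bifrost's documented detection logic, and `apply_api_key` is a hypothetical name.

```python
from typing import Dict, Tuple
from urllib.parse import quote

BASE_URL = "https://generativelanguage.googleapis.com"


def apply_api_key(path: str, api_key: str) -> Tuple[str, Dict[str, str]]:
    """Return (url, headers) with the API key attached in the assumed way."""
    url = BASE_URL + path
    # Assumption: ':predict' endpoints (Imagen) take the key as a query parameter.
    if path.endswith(":predict"):
        separator = "&" if "?" in url else "?"
        return f"{url}{separator}key={quote(api_key, safe='')}", {}
    # Standard Gemini endpoints take the key in the x-goog-api-key header.
    return url, {"x-goog-api-key": api_key}


url, headers = apply_api_key("/v1beta/models/gemini-2.0-flash-001:generateContent", "YOUR_API_KEY")
print(url, headers)
```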
API reference
OpenAI-compatible Bifrost gateway routes mapped to Google Gemini upstream APIs. Content aligned with Bifrost Gemini provider docs.
1) Chat Completions
Primary path via POST /v1/chat/completions. Upstream: /v1beta/models/{model}:generateContent. Supports multimodal input, tools, thinking, and streaming.
| Parameter | Gemini handling | Notes |
|---|---|---|
| max_completion_tokens | maxOutputTokens | |
| temperature, top_p | Direct pass-through | |
| stop | stopSequences | |
| response_format | responseMimeType + responseJsonSchema | |
| tools / tool_choice | functionCallingConfig | See tool choice mapping |
| reasoning | thinkingConfig | effort → thinkingLevel; max_tokens → thinkingBudget |
| top_k, penalties, seed | Via extra_params | Gemini-specific |
Dropped: logit_bias, logprobs, top_logprobs, parallel_tool_calls, service_tier.
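The parameter table can be read as a translation from an OpenAI-style request into a Gemini generation config. The sketch below is a simplified illustration, not Bifrost's implementation: the `max_tokens` fallback and the camelCase `topP` key are assumptions, `response_format`, tools, and `extra_params` are left out, and `to_generation_config` is a hypothetical name.

```python
from typing import Any, Dict, List, Tuple

DROPPED = {"logit_bias", "logprobs", "top_logprobs", "parallel_tool_calls", "service_tier"}


def to_generation_config(params: Dict[str, Any]) -> Tuple[Dict[str, Any], List[str]]:
    """Map a subset of OpenAI chat parameters; also report which ones were dropped."""
    config: Dict[str, Any] = {}

    # Assumption: max_tokens is used only when max_completion_tokens is absent.
    max_tokens = params.get("max_completion_tokens", params.get("max_tokens"))
    if max_tokens is not None:
        config["maxOutputTokens"] = max_tokens

    if "temperature" in params:
        config["temperature"] = params["temperature"]
    if "top_p" in params:
        config["topP"] = params["top_p"]  # assumed upstream key name

    stop = params.get("stop")
    if stop is not None:
        config["stopSequences"] = [stop] if isinstance(stop, str) else list(stop)

    # reasoning.effort -> thinkingLevel, reasoning.max_tokens -> thinkingBudget
    reasoning = params.get("reasoning") or {}
    thinking: Dict[str, Any] = {}
    if "effort" in reasoning:
        thinking["thinkingLevel"] = reasoning["effort"]
    if "max_tokens" in reasoning:
        thinking["thinkingBudget"] = reasoning["max_tokens"]
    if thinking:
        config["thinkingConfig"] = thinking

    dropped = sorted(DROPPED & set(params))
    return config, dropped


print(to_generation_config({"max_completion_tokens": 256, "stop": "END", "logprobs": True}))
```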
Tool choice: auto → AUTO, none → NONE, required → ANY. Assistant role maps to model; consecutive tool messages merge into one user message.
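The role and tool-message rules above can be sketched as follows. This is a minimal illustration under stated assumptions: system messages are skipped (they belong in a separate system instruction), the function name is read from a `name` or `tool_call_id` field, and wrapping the tool output under a `result` key is a guess at a reasonable shape rather than Bifrost's actual payload.

```python
from typing import Any, Dict, List

TOOL_CHOICE_MODES = {"auto": "AUTO", "none": "NONE", "required": "ANY"}


def to_gemini_contents(messages: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Convert OpenAI-style messages into Gemini-style contents."""
    contents: List[Dict[str, Any]] = []
    previous_was_tool = False
    for msg in messages:
        role = msg["role"]
        if role == "system":
            continue  # handled separately as a system instruction
        if role == "tool":
            part = {
                "functionResponse": {
                    # Simplification: a real gateway resolves the function name from the call ID.
                    "name": msg.get("name", msg.get("tool_call_id", "")),
                    "response": {"result": msg["content"]},  # assumed wrapper key
                }
            }
            if previous_was_tool:
                contents[-1]["parts"].append(part)  # merge into the same user message
            else:
                contents.append({"role": "user", "parts": [part]})
            previous_was_tool = True
            continue
        previous_was_tool = False
        gemini_role = "model" if role == "assistant" else "user"
        contents.append({"role": gemini_role, "parts": [{"text": msg["content"]}]})
    return contents


print(TOOL_CHOICE_MODES["required"])
```

Two consecutive `tool` messages therefore produce one `user` entry with two parts.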
curl -X POST http://localhost:8080/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "gemini/gemini-2.0-flash-001",
"messages": [{"role": "user", "content": "Hello"}],
"top_k": 40
}'
2) Responses API
Same upstream generateContent with Responses ↔ Gemini conversion. Gateway: POST /v1/responses.
| Parameter | Transformation | Notes |
|---|---|---|
| max_output_tokens | maxOutputTokens | |
| instructions | System instruction text | |
| input | Messages | String or array |
| text | responseMimeType + responseJsonSchema | |
| tools / reasoning | Same as Chat Completions | |
| stop, top_k | Via extra_params | stop → stopSequences |
- Tools: function, computer_use_preview, web_search, mcp
- Streaming emits content_part.added for text and reasoning
curl -X POST http://localhost:8080/v1/responses \
-H "Content-Type: application/json" \
-d '{
"model": "gemini/gemini-2.0-flash-001",
"input": "Hello, how are you?",
"instructions": "You are a helpful assistant."
}'
3) Speech (TTS)
Text-to-speech via chat generation with responseModalities: ["AUDIO"]. Gateway: POST /v1/audio/speech. Supports streaming.
| Parameter | Gemini handling | Notes |
|---|---|---|
| input | contents[0].parts[0].text | Text to synthesize |
| voice | speechConfig.voiceConfig.prebuiltVoiceConfig.voiceName | e.g. Chant-Female |
| response_format | wav only (default) | PCM from Gemini auto-converted to WAV |
Gemini returns PCM (s16le, 24kHz, mono); Bifrost converts to WAV when response_format: "wav" (default). Multi-speaker configs supported via multiSpeakerVoiceConfig.
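The PCM-to-WAV step amounts to wrapping raw samples in a RIFF header. The sketch below shows one way to do that with Python's standard `wave` module, using the stated format (16-bit little-endian, 24 kHz, mono). It illustrates the conversion only and is not Bifrost's own code; `pcm_to_wav` is a hypothetical name.

```python
import io
import wave


def pcm_to_wav(pcm: bytes, sample_rate: int = 24000, channels: int = 1, sample_width: int = 2) -> bytes:
    """Wrap raw s16le PCM samples in a WAV container and return the file bytes."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as writer:
        writer.setnchannels(channels)
        writer.setsampwidth(sample_width)  # 2 bytes per sample = 16-bit
        writer.setframerate(sample_rate)
        writer.writeframes(pcm)
    return buffer.getvalue()


# 0.1 s of silence: 2400 frames of 2 bytes each
wav_bytes = pcm_to_wav(b"\x00\x00" * 2400)
print(len(wav_bytes))
```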
curl -X POST http://localhost:8080/v1/audio/speech \
-H "Content-Type: application/json" \
-d '{
"model": "gemini/gemini-2.0-flash-001",
"input": "Hello, welcome to Bifrost.",
"voice": "Chant-Female"
}'
4) Transcriptions (STT)
Implemented as chat completion with audio inline data. Gateway: POST /v1/audio/transcriptions. Supports streaming.
| Parameter | Transformation | Notes |
|---|---|---|
| file | inlineData in contents | Audio bytes with MIME detection |
| prompt | First text part | Defaults to transcript prompt |
| language | Via extra_params | If supported by model |
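To make the `file` and `prompt` rows concrete, the sketch below assembles a contents array with a text part followed by an inline audio part. Several details are assumptions: MIME detection is done from the file extension (the gateway may inspect the bytes instead), and `DEFAULT_PROMPT` is a placeholder because the actual default prompt text is not given here.

```python
import base64
import mimetypes
from typing import Any, Dict, List, Optional

DEFAULT_PROMPT = "Transcribe this audio."  # hypothetical default, not Bifrost's wording


def build_transcription_contents(filename: str, audio: bytes, prompt: Optional[str] = None) -> List[Dict[str, Any]]:
    """Build a single user message: prompt text first, then the audio as inlineData."""
    mime_type, _ = mimetypes.guess_type(filename)
    if mime_type is None or not mime_type.startswith("audio/"):
        raise ValueError(f"cannot determine an audio MIME type for {filename!r}")
    return [
        {
            "role": "user",
            "parts": [
                {"text": prompt or DEFAULT_PROMPT},
                {
                    "inlineData": {
                        "mimeType": mime_type,
                        "data": base64.b64encode(audio).decode("ascii"),
                    }
                },
            ],
        }
    ]


contents = build_transcription_contents("clip.mp3", b"\x01\x02\x03")
print(contents[0]["parts"][1]["inlineData"]["mimeType"])
```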
curl -X POST http://localhost:8080/v1/audio/transcriptions \
-H "Content-Type: application/json" \
-d '{
"model": "gemini/gemini-2.0-flash-001",
"file": "<binary-audio-data>",
"prompt": "Transcribe this audio in the original language."
}'
5) Embeddings
Single and batch text embeddings. Gateway: POST /v1/embeddings. Upstream: /v1beta/models/{model}:embedContent. Non-streaming.
| Parameter | Transformation | Notes |
|---|---|---|
| input | content.parts[0].text | Arrays joined with space for batch |
| dimensions | outputDimensionality | |
| task type, title | Via extra_params | |
- embeddings[].values → data[].embedding
- Usage from metadata.billableCharacterCount and token metadata
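The request and response mappings for embeddings can be sketched as two small functions. This is an illustration under assumptions: usage accounting is omitted, the OpenAI-style envelope (`object`, `index`) follows the public OpenAI response shape, and both function names are hypothetical.

```python
from typing import Any, Dict, List, Optional, Union


def to_embed_request(input_: Union[str, List[str]], dimensions: Optional[int] = None) -> Dict[str, Any]:
    """Build an embedContent-style body; list inputs are joined with a space."""
    text = input_ if isinstance(input_, str) else " ".join(input_)
    body: Dict[str, Any] = {"content": {"parts": [{"text": text}]}}
    if dimensions is not None:
        body["outputDimensionality"] = dimensions
    return body


def to_openai_embeddings(gemini_response: Dict[str, Any], model: str) -> Dict[str, Any]:
    """Map embeddings[].values onto data[].embedding."""
    data = [
        {"object": "embedding", "index": i, "embedding": item["values"]}
        for i, item in enumerate(gemini_response.get("embeddings", []))
    ]
    return {"object": "list", "model": model, "data": data}


print(to_embed_request(["Hello", "world"], dimensions=768))
```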
curl -X POST http://localhost:8080/v1/embeddings \
-H "Content-Type: application/json" \
-d '{
"model": "gemini/embedding-001",
"input": "Hello world",
"dimensions": 768
}'
6) Batch API
Inline request arrays or file-based batch input. Gateway maps to OpenAI-style /v1/batches; upstream /v1beta/batchJobs.
- POST /v1beta/batchJobs (create)
- GET /v1beta/batchJobs (list, paginated with pageToken)
- GET /v1beta/batchJobs/{batch_id} (retrieve)
- POST /v1beta/batchJobs/{batch_id}:cancel (cancel)
Status mapping includes in_progress, completed, failed, cancelled, expired. Results as inline responses or JSONL file output.
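A status mapping of this kind is usually a lookup table. The sketch below is a guess at its shape: the upstream `JOB_STATE_*` names are assumed from Google's batch documentation and should be verified against the current API reference, and the pairing of each state with an OpenAI-style status is an assumption rather than Bifrost's confirmed mapping.

```python
# Assumed upstream state names and assumed pairings; verify before relying on them.
STATE_TO_STATUS = {
    "JOB_STATE_PENDING": "in_progress",
    "JOB_STATE_RUNNING": "in_progress",
    "JOB_STATE_SUCCEEDED": "completed",
    "JOB_STATE_FAILED": "failed",
    "JOB_STATE_CANCELLED": "cancelled",
    "JOB_STATE_EXPIRED": "expired",
}


def to_openai_batch_status(state: str) -> str:
    """Translate an upstream batch state; fail loudly on anything unrecognised."""
    try:
        return STATE_TO_STATUS[state]
    except KeyError:
        raise ValueError(f"unrecognised batch state: {state!r}") from None


print(to_openai_batch_status("JOB_STATE_SUCCEEDED"))
```

Raising on unknown states, rather than defaulting to in_progress, keeps a new upstream state from being reported as a job that never finishes.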
7) Files API
Upload files for batch jobs and multimodal requests. S3-style upload path on Google. Gateway: /v1/files.
- POST /upload/storage/v1beta/files (upload, multipart)
- GET /v1beta/files (list)
- GET /v1beta/files/{file_id} (retrieve metadata)
- DELETE /v1beta/files/{file_id} (delete)
- GET /v1beta/files/{file_id}/content (download)
Fields: name → id, displayName → filename, RFC3339 createTime → Unix timestamp.
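The field mapping, including the timestamp conversion, can be sketched as below. The output key `created_at` is an assumption based on the OpenAI file object, fractional seconds are discarded before parsing so the code behaves the same across Python versions, and both function names are hypothetical.

```python
import re
from datetime import datetime
from typing import Any, Dict


def rfc3339_to_unix(timestamp: str) -> int:
    """Convert an RFC3339 timestamp such as 2024-01-01T00:00:00.5Z to Unix seconds."""
    cleaned = re.sub(r"\.\d+", "", timestamp)  # drop fractional seconds
    if cleaned[-1] in "Zz":
        cleaned = cleaned[:-1] + "+00:00"  # older fromisoformat versions reject 'Z'
    return int(datetime.fromisoformat(cleaned).timestamp())


def to_openai_file(gemini_file: Dict[str, Any]) -> Dict[str, Any]:
    """Map name -> id, displayName -> filename, createTime -> Unix timestamp."""
    return {
        "id": gemini_file["name"],
        "filename": gemini_file.get("displayName"),
        "created_at": rfc3339_to_unix(gemini_file["createTime"]),  # assumed key name
    }


print(to_openai_file({"name": "files/abc123", "displayName": "document.pdf", "createTime": "2024-01-01T00:00:00Z"}))
```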
curl -X POST http://localhost:8080/v1/files \
-F "file=@document.pdf" \
-F "filename=document.pdf"
8) Image Generation
Gemini models use :generateContent with responseModalities: ["IMAGE"]. Imagen models use :predict (auto-detected; API key via ?key=). Non-streaming.
| Parameter | Handling | Notes |
|---|---|---|
| prompt | Text / Instances[0].Prompt | Gemini vs Imagen path |
| n | candidateCount or sampleCount | Model-dependent |
| size | WxH → aspectRatio + imageSize | Imagen: 1k/2k buckets |
| output_format | MIME type | png, jpeg, webp |
| seed, negative_prompt | Direct pass-through | |
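The `size` row implies reducing a WxH string to an aspect ratio. A minimal sketch of that reduction, using the greatest common divisor, is shown below. It is an illustration only: the upstream API may accept just a fixed set of ratios, and the `imageSize` bucketing is left out because its thresholds are not specified here. `size_to_aspect_ratio` is a hypothetical name.

```python
from math import gcd


def size_to_aspect_ratio(size: str) -> str:
    """Reduce a 'WIDTHxHEIGHT' string to its simplest 'W:H' ratio."""
    width_text, separator, height_text = size.lower().partition("x")
    if not separator:
        raise ValueError(f"expected 'WIDTHxHEIGHT', got {size!r}")
    width, height = int(width_text), int(height_text)
    if width <= 0 or height <= 0:
        raise ValueError(f"dimensions must be positive, got {size!r}")
    divisor = gcd(width, height)
    return f"{width // divisor}:{height // divisor}"


print(size_to_aspect_ratio("1024x1024"))
print(size_to_aspect_ratio("1280x720"))
```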
curl -X POST http://localhost:8080/v1/images/generations \
-H "Content-Type: application/json" \
-d '{
"model": "gemini/imagen-4.0-generate-001",
"prompt": "A sunset over the mountains",
"size": "1024x1024",
"n": 1,
"output_format": "png"
}'
9) Image Edit
multipart/form-data only. Gemini and Imagen paths; Imagen supports inpainting, outpainting, inpaint_removal, bgswap. Image variation is not supported.
curl -X POST http://localhost:8080/v1/images/edits \
-F "model=gemini/gemini-2.0-flash-001" \
-F "prompt=Add a rainbow in the sky" \
-F "image[]=@photo.png;type=image/png"
10) List Models
Lists Gemini models with OpenAI-style metadata. Gateway: GET /v1/models. Upstream: GET /v1beta/models with pageSize / pageToken.
- name → id (with gemini/ prefix)
- displayName → name
- inputTokenLimit / outputTokenLimit → max token fields
curl http://localhost:8080/v1/models
11) Video Generation
Veo models via long-running predictLongRunning. JSON body on POST /v1/videos. Poll with GET /v1/videos/{id}; download via /content.
| Operation | Supported | Gateway |
|---|---|---|
| Generate | Yes | POST /v1/videos |
| Retrieve status | Yes | GET /v1/videos/{id} |
| Download | Yes | GET /v1/videos/{id}/content |
| Delete / List / Remix | No | Not supported |
size maps to aspect ratio (e.g. 1280x720 → 16:9). Safety filters may return failed with content_filter.
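Because generation is long-running, a client typically polls the status route until the job reaches a terminal state. The sketch below keeps the HTTP call abstract: `fetch_status` is a caller-supplied function (for example, a wrapper around GET /v1/videos/{id}) that returns the decoded JSON. The `completed` status value is an assumption; `failed` is the value mentioned above.

```python
import time
from typing import Any, Callable, Dict

TERMINAL_STATUSES = {"completed", "failed"}  # 'completed' is an assumed value


def wait_for_video(
    fetch_status: Callable[[str], Dict[str, Any]],
    video_id: str,
    interval_s: float = 5.0,
    max_polls: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll until the job is terminal, or raise TimeoutError after max_polls checks."""
    for attempt in range(max_polls):
        job = fetch_status(video_id)
        if job.get("status") in TERMINAL_STATUSES:
            return job
        if attempt < max_polls - 1:
            sleep(interval_s)
    raise TimeoutError(f"video {video_id!r} not finished after {max_polls} polls")


# Demonstration with a stubbed status source instead of a real HTTP call.
_responses = iter([{"status": "in_progress"}, {"status": "completed"}])
print(wait_for_video(lambda _id: next(_responses), "vid_123", sleep=lambda s: None))
```

Passing `sleep` as a parameter keeps the loop testable without real delays.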
curl -X POST http://localhost:8080/v1/videos \
-H "Content-Type: application/json" \
-d '{
"model": "gemini/veo-3.1-generate-preview",
"prompt": "A calico cat playing piano on stage",
"seconds": "8",
"size": "1280x720"
}'
Implementation caveats
| Caveat | Impact | Severity |
|---|---|---|
| Role remapping | Assistant role maps to "model" in Gemini format | Low |
| System message handling | System instructions become systemInstruction field (separate) | Medium |
| Consecutive tool messages | Merged into single user message per Gemini requirements | Medium |
| Thinking content marking | Thinking blocks appear as marked text parts, not separate | Low |
| Function call arguments | Converted from objects to JSON strings (requires parsing) | Medium |
| Streaming finish reasons | Only appear in final chunk; no early completion detection | Low |
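The function call arguments caveat means clients must decode the arguments before use. A minimal sketch, assuming the standard OpenAI-style `tool_calls[].function.arguments` layout on the assistant message (`parse_tool_calls` is a hypothetical helper):

```python
import json
from typing import Any, Dict, List, Tuple


def parse_tool_calls(message: Dict[str, Any]) -> List[Tuple[str, Dict[str, Any]]]:
    """Return (function name, decoded arguments) for each tool call in a message."""
    calls: List[Tuple[str, Dict[str, Any]]] = []
    for call in message.get("tool_calls") or []:
        function = call["function"]
        raw = function.get("arguments") or "{}"
        # Arguments arrive as a JSON string; tolerate an already-decoded dict too.
        arguments = json.loads(raw) if isinstance(raw, str) else raw
        calls.append((function["name"], arguments))
    return calls


message = {
    "role": "assistant",
    "tool_calls": [{"id": "call_1", "type": "function",
                    "function": {"name": "get_weather", "arguments": "{\"city\": \"Oslo\"}"}}],
}
print(parse_tool_calls(message))
```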
Authoritative references
- Bifrost Gemini provider reference: docs.getbifrost.ai/providers/supported-providers/gemini
- Google Gemini documentation: ai.google.dev/docs
- Bifrost provider support overview: docs.getbifrost.ai/providers/supported-providers/overview