Vertex AI provider summary
Bifrost routes requests to Google Vertex AI with multi-model support. The provider handles both Gemini and Anthropic Claude models through unified conversion logic, with automatic OAuth2 authentication and region-specific endpoint construction.
Common Vertex AI model IDs used in Bifrost routes:
- gemini-2.0-flash (Latest Gemini)
- gemini-1.5-pro (Advanced)
- claude-3-5-sonnet@20241022 (Claude)
- imagen-3.0-generate-002 (Image Gen)
| Property | Details |
|---|---|
| Description | Google Vertex AI multi-model provider supporting Gemini and Claude models. |
| Provider route on Bifrost | vertex/<model> |
| Provider doc | Vertex AI API Reference |
| API endpoint for provider | https://us-central1-aiplatform.googleapis.com (region-specific; see Region selection) |
| Authentication | Service Account JSON, Application Default Credentials, or API key (OAuth2 tokens are obtained automatically) |
Supported operations
Vertex AI supports 7 operations: Chat Completions, Responses API, Embeddings, Image Generation, Image Edit, Video Generation, and List Models. Only Chat Completions and Responses API support streaming.
| Operation | Non-streaming | Streaming | Upstream endpoint |
|---|---|---|---|
| Chat Completions | Yes | Yes | /generate (Gemini) or /messages (Claude) |
| Responses API | Yes | Yes | /generate (Gemini) or /messages (Claude) |
| Embeddings | Yes | No | /embeddings |
| Image Generation | Yes | No | /generateContent or /predict |
| Image Edit | Yes | No | /generateContent or /predict |
| Video Generation | Yes | No | /predictLongRunning |
| List Models | Yes | No | /models |
Parameter handling
Vertex AI parameters are converted from OpenAI format to Vertex-specific formats. Model detection is automatic based on model name prefixes (gemini vs claude).
Video generation constraints:
- Video generation is exclusive to the Veo model
- Requires valid resolution and duration parameters
Model-specific variations:
- Gemini models: use the /generate endpoint
- Claude models: use the /messages endpoint
- Image operations: use /generateContent or /predict, depending on the model
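The prefix-based detection described above can be sketched in a few lines of Python. This is an illustration only, not Bifrost's implementation: `detect_family` and `chat_endpoint` are hypothetical names, and the exact matching rules Bifrost applies may differ.

```python
# Upstream endpoint suffix per model family, as listed in this section.
CHAT_ENDPOINTS = {"gemini": "/generate", "anthropic": "/messages"}


def detect_family(model):
    """Classify a Vertex model ID by its name prefix (hypothetical helper)."""
    name = model.lower()
    if name.startswith("gemini"):
        return "gemini"
    if name.startswith("claude"):
        return "anthropic"
    return "unknown"


def chat_endpoint(model):
    """Return the upstream endpoint suffix for a chat-capable model."""
    family = detect_family(model)
    if family not in CHAT_ENDPOINTS:
        raise ValueError(f"cannot detect model family for {model!r}")
    return CHAT_ENDPOINTS[family]
```

Raising on an unrecognized prefix is a choice made for this sketch; how Bifrost handles names that match neither prefix is not stated here.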
Supported Vertex AI parameters
Quick reference of OpenAI-compatible parameters accepted when routing through Bifrost to Vertex AI.
[ "stream", "temperature", "max_tokens", "max_output_tokens", "top_p", "top_k", "stop", "tools", "tool_choice", "response_format" ]
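As a client-side aid, the list above can be used to check which optional parameters of a request are in the documented set before sending it. The sketch below assumes nothing about how Bifrost treats parameters outside the list; `split_params` is a hypothetical helper, and request fields such as model and messages are expected to be handled separately.

```python
# Parameter names copied from the quick-reference list above.
SUPPORTED_PARAMS = frozenset({
    "stream", "temperature", "max_tokens", "max_output_tokens", "top_p",
    "top_k", "stop", "tools", "tool_choice", "response_format",
})


def split_params(params):
    """Split optional parameters into (listed, unlisted) dictionaries."""
    listed = {k: v for k, v in params.items() if k in SUPPORTED_PARAMS}
    unlisted = {k: v for k, v in params.items() if k not in SUPPORTED_PARAMS}
    return listed, unlisted
```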
Supported Vertex AI models
Use the provider prefix vertex/ in Bifrost model routes for deterministic provider targeting.
| Family | Model ID | Bifrost route | Typical usage |
|---|---|---|---|
| Gemini 2.0 Flash | gemini-2.0-flash | vertex/gemini-2.0-flash | Fast, cost-effective model |
| Gemini 2.0 Pro | gemini-2.0-pro | vertex/gemini-2.0-pro | Advanced reasoning |
| Gemini 1.5 Flash | gemini-1.5-flash | vertex/gemini-1.5-flash | Fast responses |
| Gemini 1.5 Pro | gemini-1.5-pro | vertex/gemini-1.5-pro | Advanced multimodal |
| Claude 3.5 Sonnet | claude-3-5-sonnet@20241022 | vertex/claude-3-5-sonnet@20241022 | High performance Claude |
| Claude 3 Opus | claude-3-opus@20240229 | vertex/claude-3-opus@20240229 | Complex tasks |
| Imagen 3 | imagen-3.0-generate-002 | vertex/imagen-3.0-generate-002 | Image generation |
| Veo | veo-1 | vertex/veo-1 | Video generation |
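The route format in the table is a provider prefix followed by the model ID. A minimal sketch of splitting such a route, assuming the first slash separates the two parts (`parse_route` is a hypothetical name, not a Bifrost function):

```python
def parse_route(route):
    """Split a Bifrost route such as 'vertex/gemini-2.0-flash' into (provider, model)."""
    provider, sep, model = route.partition("/")
    if not sep or not provider or not model:
        raise ValueError(f"expected '<provider>/<model>', got {route!r}")
    return provider, model
```

Splitting on the first slash only keeps version suffixes such as @20241022 intact as part of the model ID.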
Setup and configuration
Vertex AI requires Google Cloud project configuration and authentication credentials. Three authentication methods are supported. See Setup & configuration in Bifrost docs.
The aliases field (mapping model names to fine-tuned model IDs) requires v1.5.0-prerelease2 or later. On v1.4.x, use deployments inside vertex_key_config instead — see the v1.5.0 migration guide.
1. Service Account JSON (recommended for production)
Provide a credential JSON string in auth_credentials. The JSON must contain a type field. Supported types include service_account (most common), impersonated_service_account, authorized_user, external_account, and external_account_authorized_user.
Web UI
- Navigate to Model Providers → Configurations → Google Vertex
- Click Add Key (or edit an existing key)
- Under Authentication Method, select Service Account (JSON)
- Set Project ID, Region (e.g. us-central1), and Auth Credentials (paste JSON or env var such as env.VERTEX_CREDENTIALS)
- Set Project Number only if using fine-tuned models; configure Aliases for fine-tuned model IDs
- Save
API
# Step 1: Create the provider
curl -X POST http://localhost:8080/api/providers \
-H "Content-Type: application/json" \
-d '{"provider": "vertex"}'
# Step 2: Create a key (Service Account JSON)
curl -X POST http://localhost:8080/api/providers/vertex/keys \
-H "Content-Type: application/json" \
-d '{
"name": "vertex-sa-key",
"value": "",
"models": ["*"],
"weight": 1.0,
"vertex_key_config": {
"project_id": "env.VERTEX_PROJECT_ID",
"region": "us-central1",
"auth_credentials": "env.VERTEX_CREDENTIALS"
}
}'
config.json
{
"providers": {
"vertex": {
"keys": [
{
"name": "vertex-sa-key",
"value": "",
"models": ["*"],
"weight": 1.0,
"vertex_key_config": {
"project_id": "env.VERTEX_PROJECT_ID",
"region": "us-central1",
"auth_credentials": "env.VERTEX_CREDENTIALS"
}
}
]
}
}
}
2. Application Default Credentials
Leave auth_credentials empty. Bifrost calls google.FindDefaultCredentials(), which resolves credentials in this order:
- GOOGLE_APPLICATION_CREDENTIALS (path to a JSON credential file)
- Application default credential file from gcloud auth application-default login
- GCE/GKE/Cloud Run/App Engine metadata server (attached service account or Workload Identity)
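The resolution order above can be modeled as a simple first-match chain. The sketch below only illustrates that ordering; it does not call any Google library, and `first_adc_source` with its three inputs is hypothetical.

```python
def first_adc_source(environ, gcloud_file_exists, metadata_available):
    """Name the first credential source that would match, in the documented order."""
    # 1. Explicit path in the environment variable.
    if environ.get("GOOGLE_APPLICATION_CREDENTIALS"):
        return "GOOGLE_APPLICATION_CREDENTIALS file"
    # 2. File written by `gcloud auth application-default login`.
    if gcloud_file_exists:
        return "gcloud application-default file"
    # 3. Metadata server on GCE, GKE, Cloud Run, or App Engine.
    if metadata_available:
        return "metadata server"
    raise LookupError("no Application Default Credentials found")
```

The practical consequence is that a GOOGLE_APPLICATION_CREDENTIALS value set on a GKE pod takes precedence over the attached service account.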
In the Web UI, select Service Account (Attached), set Project ID and Region, and leave credentials empty. For GKE, see GKE Workload Identity Federation in Bifrost docs.
curl -X POST http://localhost:8080/api/providers/vertex/keys \
-H "Content-Type: application/json" \
-d '{
"name": "vertex-adc-key",
"value": "",
"models": ["*"],
"weight": 1.0,
"vertex_key_config": {
"project_id": "env.VERTEX_PROJECT_ID",
"region": "us-central1",
"auth_credentials": ""
}
}'
3. API key (Gemini and fine-tuned models only)
Set value to your Vertex API key. API key authentication works only for Gemini models and fine-tuned Gemini models. For Anthropic models on Vertex, use Service Account or Application Default Credentials.
curl -X POST http://localhost:8080/api/providers/vertex/keys \
-H "Content-Type: application/json" \
-d '{
"name": "vertex-api-key",
"value": "env.VERTEX_API_KEY",
"models": ["gemini-pro", "gemini-2.0-flash", "my-fine-tuned-model"],
"weight": 1.0,
"aliases": {
"my-fine-tuned-model": "123456789"
},
"vertex_key_config": {
"project_id": "env.VERTEX_PROJECT_ID",
"project_number": "env.VERTEX_PROJECT_NUMBER",
"region": "us-central1"
}
}'
Fine-tuned model support on Vertex is currently in beta. Test non-Gemini fine-tuned models before production use.
Configuration fields
vertex_key_config
| Field | Required | Description |
|---|---|---|
| project_id | Yes | Google Cloud project ID |
| region | Yes | GCP region (e.g. us-central1, europe-west1, global) |
| auth_credentials | No | Service account JSON string (leave empty for ADC) |
| project_number | No | GCP project number (required for fine-tuned models) |
Key-level fields
| Field | Required | Description |
|---|---|---|
| value | No | Vertex API key (Gemini and fine-tuned models only; leave empty for Service Account / ADC) |
| aliases | No | Map model names to fine-tuned model IDs or endpoint identifiers (v1.5.0-prerelease2+) |
| models | Yes | Models this key can serve; use ["*"] to allow all |
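The required fields in the two tables above lend themselves to a pre-flight check before a key is submitted. The following is a sketch under that reading of the tables; `validate_vertex_key` is a hypothetical helper and does not reflect Bifrost's own validation.

```python
def validate_vertex_key(key):
    """Return a list of problems found in a key payload (empty list if none)."""
    errors = []
    cfg = key.get("vertex_key_config") or {}
    # project_id and region are marked as required in vertex_key_config.
    for field in ("project_id", "region"):
        if not cfg.get(field):
            errors.append(f"vertex_key_config.{field} is required")
    # models is marked as required at key level; ["*"] allows all models.
    if not isinstance(key.get("models"), list):
        errors.append("models must be a list; use ['*'] to allow all")
    return errors
```

For example, a payload with region set but no project_id yields a single error naming the missing field.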
API reference
Gateway paths and Vertex AI upstream endpoints.
1) Chat Completions
Primary chat path. Bifrost detects Gemini vs Anthropic from the model name and converts to the appropriate Vertex format. Upstream: /generate (Gemini) or Anthropic message format (Claude). See Chat Completions in Bifrost docs.
| Parameter | Vertex handling | Notes |
|---|---|---|
| model | Maps to Vertex model ID | Region-specific endpoint constructed automatically |
| All other params | Model-specific conversion | Converted per underlying provider (Gemini/Anthropic) |
Gemini models
System prompts, tool usage, and streaming map to Gemini formats. See Gemini provider docs.
Anthropic models (Claude)
- Reasoning parameters convert to the thinking structure
- System messages are extracted to a separate system field
- API version set to vertex-2023-10-16; minimum reasoning budget is 1024 tokens
- Model field removed from upstream request (Vertex uses different identification)
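The Claude conversion steps above can be sketched as follows. This is a simplified illustration, not Bifrost's converter: `to_vertex_anthropic` is a hypothetical name, message content is assumed to be plain strings, the default of 1024 for max_tokens is an assumption made for the sketch, and reasoning conversion is left out.

```python
ANTHROPIC_VERSION = "vertex-2023-10-16"


def to_vertex_anthropic(request):
    """Build an Anthropic-on-Vertex body from an OpenAI-style chat request."""
    system_parts = []
    messages = []
    for msg in request.get("messages", []):
        if msg["role"] == "system":
            # System messages move to the separate top-level system field.
            system_parts.append(msg["content"])
        else:
            messages.append({"role": msg["role"], "content": msg["content"]})
    body = {
        "anthropic_version": ANTHROPIC_VERSION,
        "messages": messages,
        # Assumed default; the Anthropic Messages format requires max_tokens.
        "max_tokens": request.get("max_tokens", 1024),
    }
    if system_parts:
        body["system"] = "\n".join(system_parts)
    # The model field is intentionally not copied into the upstream body.
    return body
```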
Region selection
| Region | Endpoint | Purpose |
|---|---|---|
| us-central1 | us-central1-aiplatform.googleapis.com | US Central |
| us-west1 | us-west1-aiplatform.googleapis.com | US West |
| europe-west1 | europe-west1-aiplatform.googleapis.com | Europe West |
| global | aiplatform.googleapis.com | Global (no region prefix) |
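The table follows one rule: the region is prefixed to the host, except for global, which has no prefix. A minimal sketch of that rule (`vertex_host` is a hypothetical name):

```python
def vertex_host(region):
    """Return the Vertex AI API host for a region, per the table above."""
    if region == "global":
        return "aiplatform.googleapis.com"
    return f"{region}-aiplatform.googleapis.com"
```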
Streaming: Gemini uses SSE; Anthropic uses Anthropic message streaming. Configure vertex_key_config with project_id, region, and auth_credentials (leave auth_credentials empty for ADC).
curl -X POST http://localhost:8080/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "vertex/gemini-2.0-flash",
"messages": [{"role": "user", "content": "Hello"}]
}'
2) Responses API
Available for both Gemini and Anthropic (Claude) models on Vertex. Upstream routes to /messages for Claude. See Responses API in Bifrost docs.
| Parameter | Vertex handling | Notes |
|---|---|---|
| instructions | Becomes system message | Model-specific conversion |
| input | Converted to messages | String or array support |
| max_output_tokens | Model-specific field mapping | Gemini vs Anthropic conversion |
| All other params | Model-specific conversion | Converted per underlying provider |
- Anthropic: endpoint /v1/messages; anthropic_version set to vertex-2023-10-16
- Model and region fields removed from upstream request; raw body passthrough supported
- Gemini: conversion follows Gemini Responses API format
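The instructions and input rows of the table can be illustrated with a small conversion sketch. It assumes array input items are already role/content objects; `responses_to_messages` is a hypothetical helper, and the model-specific conversion Bifrost applies afterwards is not shown.

```python
def responses_to_messages(input_value, instructions=None):
    """Convert Responses API input and instructions into a chat message list."""
    messages = []
    if instructions:
        # instructions becomes the system message.
        messages.append({"role": "system", "content": instructions})
    if isinstance(input_value, str):
        # A plain string is treated as a single user message.
        messages.append({"role": "user", "content": input_value})
    else:
        # Assumed: each array item is already a {"role", "content"} object.
        messages.extend(dict(item) for item in input_value)
    return messages
```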
curl -X POST http://localhost:8080/v1/responses \
-H "Content-Type: application/json" \
-d '{
"model": "vertex/claude-3-5-sonnet",
"input": "What is AI?",
"instructions": "You are a helpful assistant"
}'
3) Embeddings
Supported for Gemini and other embedding-capable models. Upstream /embeddings — no streaming. Use extra_params for task-specific options. See Embeddings in Bifrost docs.
| Parameter | Vertex mapping | Notes |
|---|---|---|
| input | instances[].content | Text to embed |
| dimensions | parameters.outputDimensionality | Optional output size |
Advanced parameters (extra_params)
| Parameter | Type | Description |
|---|---|---|
| task_type | string | RETRIEVAL_QUERY, RETRIEVAL_DOCUMENT, SEMANTIC_SIMILARITY, CLASSIFICATION, CLUSTERING (optional) |
| title | string | Optional title to improve embeddings (used with task_type) |
| autoTruncate | boolean | Auto-truncate input to max tokens (defaults to true) |
Response includes values, statistics.token_count, and statistics.truncated. Bifrost preserves float64 precision from Vertex.
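The request-side mappings in the tables above can be sketched as a body builder. The input and dimensions mappings come from the table; placing task_type and title on each instance and autoTruncate under parameters follows the Vertex text-embeddings request format and is an assumption about what Bifrost emits. `to_vertex_embedding_request` is a hypothetical name.

```python
def to_vertex_embedding_request(inputs, dimensions=None, task_type=None,
                                title=None, auto_truncate=None):
    """Build a Vertex-style embeddings body from OpenAI-style arguments."""
    instances = []
    for text in inputs:
        instance = {"content": text}  # input -> instances[].content
        if task_type is not None:
            instance["task_type"] = task_type
        if title is not None:
            instance["title"] = title
        instances.append(instance)
    parameters = {}
    if dimensions is not None:
        # dimensions -> parameters.outputDimensionality
        parameters["outputDimensionality"] = dimensions
    if auto_truncate is not None:
        parameters["autoTruncate"] = auto_truncate
    body = {"instances": instances}
    if parameters:
        body["parameters"] = parameters
    return body
```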
curl -X POST http://localhost:8080/v1/embeddings \
-H "Content-Type: application/json" \
-d '{
"model": "vertex/text-embedding-004",
"input": ["text to embed"],
"dimensions": 256,
"task_type": "RETRIEVAL_DOCUMENT",
"title": "Document title",
"autoTruncate": true
}'
4) Image Generation
Supported for Gemini and Imagen. Bifrost auto-detects model type and routes to /generateContent or /predict. Streaming is not supported. See Image Generation in Bifrost docs.
| Parameter | Vertex handling | Notes |
|---|---|---|
| model | Mapped to deployment/model identifier | Model type detected automatically |
| prompt | Model-specific conversion | Converted per Gemini or Imagen |
| All other params | Model-specific conversion | Converted per underlying provider |
- Gemini: same conversion as Gemini Image Generation
- Imagen: Imagen-specific format via IsImagenModel()
- Fine-tuned: .../endpoints/{deployment}:generateContent
- Region field removed from request body before upstream call
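The routing decision above can be sketched as a check on the model name. The Python function `is_imagen_model` below is a hypothetical stand-in for the IsImagenModel() check named above; matching on an "imagen" prefix is an assumption, and any model that is not Imagen falls through to /generateContent in this sketch.

```python
def is_imagen_model(model):
    """Guess whether a model ID refers to Imagen (assumed prefix match)."""
    return model.lower().startswith("imagen")


def image_endpoint(model):
    """Pick the upstream endpoint suffix for an image request."""
    # Imagen uses the predict-style endpoint; Gemini uses generateContent.
    return "/predict" if is_imagen_model(model) else "/generateContent"
```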
curl -X POST http://localhost:8080/v1/images/generations \
-H "Content-Type: application/json" \
-d '{
"model": "vertex/imagen-4.0-generate-001",
"prompt": "A sunset over the mountains",
"size": "1024x1024",
"n": 2
}'
5) Image Edit
Uses multipart/form-data, not JSON. Supported for Gemini and Imagen only; other models return ConfigurationError. Image variation is not supported. See Image Edit in Bifrost docs.
| Parameter | Type | Required | Notes |
|---|---|---|---|
| model | string | Yes | Must be Gemini or Imagen model |
| prompt | string | Yes | Text description of the edit |
| image[] | binary | Yes | Image file(s) to edit (supports multiple) |
| mask | binary | No | Mask image file |
| type | string | No | inpainting, outpainting, inpaint_removal, bgswap (Imagen only) |
| n | int | No | Number of images to generate (1–10) |
| output_format | string | No | png, webp, jpeg |
| output_compression | int | No | Compression level (0–100%) |
| seed | int | No | Via ExtraParams["seed"] |
| negative_prompt | string | No | Via ExtraParams["negativePrompt"] |
| maskMode | string | No | Imagen only: MASK_MODE_USER_PROVIDED, BACKGROUND, FOREGROUND, SEMANTIC |
| dilation | float | No | Imagen only: range [0, 1] |
| maskClasses | int[] | No | Imagen only: for MASK_MODE_SEMANTIC |
Conversion matches Gemini Image Edit. Gemini strips unsupported fields; Imagen supports mask modes and dilation. Streaming not supported.
curl -X POST http://localhost:8080/v1/images/edits \
-F "model=vertex/imagen-3.0-generate-002" \
-F "prompt=Add sunglasses to the person" \
-F "image[]=@photo.png"
6) List Models
GET /v1/models — uses project_id and region from key config. No request parameters required. See List Models in Bifrost docs.
Vertex's List Models API returns only custom fine-tuned models (digit-only deployment IDs). Bifrost performs three-pass discovery to include foundation models:
- Custom models from the Vertex API response
- Foundation models from your aliases configuration
- Models in the key-level models allowlist not already in aliases
Allowlist and pagination behavior:
- Empty models and no aliases → no models returned
- models: ["*"] → all passes included
- Non-empty models → filtered to allowlist; duplicates prevented
- Pagination handled internally when next_page_token is present
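The three passes and the allowlist rules can be combined into one merge function, sketched below. It is an interpretation of the rules above rather than Bifrost's code: `discover_models` is a hypothetical name, pagination is omitted, and treating an empty allowlist with aliases present as "no filter" is an assumption.

```python
def discover_models(api_models, aliases, allowlist):
    """Merge the three discovery passes, applying the allowlist rules."""
    if not allowlist and not aliases:
        return []  # empty models and no aliases -> nothing returned
    allow_all = "*" in allowlist
    seen = set()
    result = []

    def add(name):
        if name in seen:
            return  # duplicates prevented
        if allowlist and not allow_all and name not in allowlist:
            return  # filtered out by a non-empty allowlist
        seen.add(name)
        result.append(name)

    for name in api_models:   # pass 1: custom models from the Vertex API
        add(name)
    for name in aliases:      # pass 2: model names from the aliases config
        add(name)
    for name in allowlist:    # pass 3: allowlisted names not yet included
        if name != "*":
            add(name)
    return result
```

With api_models ["123", "456"], aliases {"my-ft": "123"}, and allowlist ["my-ft", "gemini-2.0-flash"], only the two allowlisted names are returned.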
curl http://localhost:8080/v1/models
Example response shape
{
"models": [
{
"name": "projects/{project}/locations/{region}/models/gemini-2.0-flash",
"display_name": "Gemini 2.0 Flash",
"description": "Fast multimodal model",
"version_id": "1",
"version_aliases": ["latest", "stable"],
"capabilities": [...],
"deployed_models": [...]
}
],
"next_page_token": "..."
}
7) Video Generation
Veo models only via /predictLongRunning. Parameters match Gemini Video Generation. See Video Generation in Bifrost docs.
curl -X POST http://localhost:8080/v1/videos \
-H "Content-Type: application/json" \
-d '{
"model": "vertex/veo-2.0-generate-001",
"prompt": "A bird flying through clouds"
}'
Implementation caveats
| Caveat | Impact | Severity |
|---|---|---|
| Project ID and region required | Request fails without valid project_id and region in vertex_key_config | High |
| List Models API limitation | Vertex API returns only custom models; Bifrost three-pass discovery adds aliases/allowlist | High |
| OAuth2 token management | Tokens cached and refreshed automatically; first request may be slower | Medium |
| Anthropic model detection | Gemini vs Claude conversion applied transparently by model name | Medium |
| Anthropic version lock | anthropic_version always vertex-2023-10-16 for Claude on Vertex | Low |
| Embeddings precision | float64 vectors preserved in /v1/embeddings responses | Low |
| Video generation exclusivity | Only Veo models; non-Veo returns configuration error | Medium |
Authoritative references
- Bifrost Vertex AI provider reference: docs.getbifrost.ai/providers/supported-providers/vertex
- Google Vertex AI API docs: cloud.google.com/vertex-ai/generative-ai/docs/reference/rest
- Bifrost provider support overview: docs.getbifrost.ai/providers/supported-providers/overview