Google Vertex AI on Bifrost: Models, Endpoints, Setup, and Mappings

Vertex AI provider summary

Bifrost routes requests to Google Vertex AI with multi-model support. The provider handles both Gemini and Anthropic Claude models through unified conversion logic, with automatic OAuth2 authentication and region-specific endpoint construction.

Common Vertex AI model IDs used in Bifrost routes:

gemini-2.0-flash (Latest Gemini)
gemini-1.5-pro (Advanced)
claude-3-5-sonnet@20241022 (Claude)
imagen-3.0-generate-002 (Image Gen)

Property	Details
Description	Google Vertex AI multi-model provider supporting Gemini and Claude models.
Provider route on Bifrost	vertex/<model>
Provider doc	Vertex AI API Reference
API endpoint for provider	https://us-central1-aiplatform.googleapis.com
Authentication	OAuth2 via Service Account JSON or Application Default Credentials; API key (Gemini and fine-tuned Gemini models only)

Supported operations

Vertex AI supports seven operations through Bifrost: chat completions, the Responses API, embeddings, image generation, image edit, video generation, and model listing. Only Chat Completions and the Responses API support streaming.

Operation	Non-streaming	Streaming	Upstream endpoint
Chat Completions	Yes	Yes	/generate (Gemini) or /messages (Claude)
Responses API	Yes	Yes	/messages (Claude)
Embeddings	Yes	No	/embeddings
Image Generation	Yes	No	/generateContent or /predict
Image Edit	Yes	No	/generateContent or /predict
Video Generation	Yes	No	/predictLongRunning
List Models	Yes	No	/models

Parameter handling

Vertex AI parameters are converted from OpenAI format to Vertex-specific formats. Model detection is automatic based on model name prefixes (gemini vs claude).

Video generation constraints:

Video generation is available only for Veo models
Requests require valid resolution and duration parameters

Model-specific variations:

Gemini models: Uses /generate endpoint
Claude models: Uses /messages endpoint
Image operations: Uses /generateContent or /predict based on model
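
The prefix-based detection described above can be sketched as follows. This is an illustration only: the function names, the family labels, and the imagen/veo prefixes are assumptions, and Bifrost's actual detection logic may differ.

```python
def detect_family(model: str) -> str:
    """Classify a Vertex model ID by its name prefix (illustrative only)."""
    name = model.lower()
    for prefix in ("gemini", "claude", "imagen", "veo"):
        if name.startswith(prefix):
            return prefix
    return "unknown"


# Chat endpoint suffix per family, as listed in this section.
CHAT_ENDPOINT = {"gemini": "/generate", "claude": "/messages"}


def chat_endpoint(model: str) -> str:
    """Return the upstream chat endpoint suffix for a model ID."""
    family = detect_family(model)
    if family not in CHAT_ENDPOINT:
        raise ValueError(f"no chat endpoint known for {model!r}")
    return CHAT_ENDPOINT[family]
```

Calling chat_endpoint("gemini-2.0-flash") selects the Gemini path, while a claude-prefixed ID selects the Anthropic path.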

Supported Vertex AI parameters

Quick reference of OpenAI-compatible parameters accepted when routing through Bifrost to Vertex AI.

[
  "stream",
  "temperature",
  "max_tokens",
  "max_output_tokens",
  "top_p",
  "top_k",
  "stop",
  "tools",
  "tool_choice",
  "response_format"
]
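
A client-side pre-flight check against this list might look like the sketch below. The helper name and the set of core fields (model, messages, input, instructions) are assumptions made for illustration. How Bifrost treats parameters outside the list (dropped, forwarded, or rejected) is not stated here.

```python
# Parameters from the quick reference above.
SUPPORTED_PARAMS = frozenset({
    "stream", "temperature", "max_tokens", "max_output_tokens", "top_p",
    "top_k", "stop", "tools", "tool_choice", "response_format",
})

# Assumption: request fields that are not tuning parameters.
CORE_FIELDS = frozenset({"model", "messages", "input", "instructions"})


def unlisted_params(request: dict) -> list[str]:
    """Return request keys that are neither core fields nor in the quick reference."""
    return sorted(
        key for key in request
        if key not in SUPPORTED_PARAMS and key not in CORE_FIELDS
    )
```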

Supported Vertex AI models

Use the provider prefix vertex/ in Bifrost model routes for deterministic provider targeting.

Family	Model ID	Bifrost route	Typical usage
Gemini 2.0 Flash	gemini-2.0-flash	vertex/gemini-2.0-flash	Fast, cost-effective model
Gemini 2.0 Pro	gemini-2.0-pro	vertex/gemini-2.0-pro	Advanced reasoning
Gemini 1.5 Flash	gemini-1.5-flash	vertex/gemini-1.5-flash	Fast responses
Gemini 1.5 Pro	gemini-1.5-pro	vertex/gemini-1.5-pro	Advanced multimodal
Claude 3.5 Sonnet	claude-3-5-sonnet@20241022	vertex/claude-3-5-sonnet@20241022	High performance Claude
Claude 3 Opus	claude-3-opus@20240229	vertex/claude-3-opus@20240229	Complex tasks
Imagen 3	imagen-3.0-generate-002	vertex/imagen-3.0-generate-002	Image generation
Veo	veo-2.0-generate-001	vertex/veo-2.0-generate-001	Video generation
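
The vertex/<model> route convention can be illustrated with a small parser. This is a sketch: split_route is a hypothetical helper, not part of Bifrost, and it assumes the provider prefix is everything before the first slash.

```python
def split_route(route: str) -> tuple[str, str]:
    """Split a route such as 'vertex/gemini-2.0-flash' into (provider, model)."""
    provider, sep, model = route.partition("/")
    if not sep or not provider or not model:
        raise ValueError(f"expected '<provider>/<model>', got {route!r}")
    return provider, model
```

Version suffixes such as @20241022 stay attached to the model part because only the first slash is treated as a separator.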

Setup and configuration

Vertex AI requires Google Cloud project configuration and authentication credentials. Three authentication methods are supported. See Setup & configuration in Bifrost docs.

The aliases field (mapping model names to fine-tuned model IDs) requires v1.5.0-prerelease2 or later. On v1.4.x, use deployments inside vertex_key_config instead — see the v1.5.0 migration guide.

1. Service Account JSON (recommended for production)

Provide a credential JSON string in auth_credentials. The JSON must contain a type field. Supported types include service_account (most common), impersonated_service_account, authorized_user, external_account, and external_account_authorized_user.
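
Before pasting a credential into auth_credentials, the type field can be checked locally. The sketch below is a hypothetical helper. Because the list above is introduced with "include", treating any other type as an error is an assumption.

```python
import json

# Credential types named in this section.
LISTED_CREDENTIAL_TYPES = {
    "service_account",
    "impersonated_service_account",
    "authorized_user",
    "external_account",
    "external_account_authorized_user",
}


def credential_type(auth_credentials: str) -> str:
    """Return the credential 'type', or raise ValueError if it is missing or unlisted."""
    try:
        data = json.loads(auth_credentials)
    except json.JSONDecodeError as exc:
        raise ValueError("auth_credentials is not valid JSON") from exc
    if not isinstance(data, dict) or "type" not in data:
        raise ValueError("credential JSON must contain a 'type' field")
    if data["type"] not in LISTED_CREDENTIAL_TYPES:
        raise ValueError(f"unlisted credential type: {data['type']!r}")
    return data["type"]
```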

Web UI

Navigate to Model Providers → Configurations → Google Vertex
Click Add Key (or edit an existing key)
Under Authentication Method, select Service Account (JSON)
Set Project ID, Region (e.g. us-central1), and Auth Credentials (paste JSON or env var such as env.VERTEX_CREDENTIALS)
Set Project Number only if using fine-tuned models; configure Aliases for fine-tuned model IDs
Save

API

# Step 1: Create the provider
curl -X POST http://localhost:8080/api/providers \
  -H "Content-Type: application/json" \
  -d '{"provider": "vertex"}'

# Step 2: Create a key (Service Account JSON)
curl -X POST http://localhost:8080/api/providers/vertex/keys \
  -H "Content-Type: application/json" \
  -d '{
    "name": "vertex-sa-key",
    "value": "",
    "models": ["*"],
    "weight": 1.0,
    "vertex_key_config": {
      "project_id": "env.VERTEX_PROJECT_ID",
      "region": "us-central1",
      "auth_credentials": "env.VERTEX_CREDENTIALS"
    }
  }'

config.json

{
  "providers": {
    "vertex": {
      "keys": [
        {
          "name": "vertex-sa-key",
          "value": "",
          "models": ["*"],
          "weight": 1.0,
          "vertex_key_config": {
            "project_id": "env.VERTEX_PROJECT_ID",
            "region": "us-central1",
            "auth_credentials": "env.VERTEX_CREDENTIALS"
          }
        }
      ]
    }
  }
}

2. Application Default Credentials

Leave auth_credentials empty. Bifrost calls google.FindDefaultCredentials(), which resolves credentials in this order:

GOOGLE_APPLICATION_CREDENTIALS (path to a JSON credential file)
Application default credential file from gcloud auth application-default login
GCE/GKE/Cloud Run/App Engine metadata server (attached service account or Workload Identity)
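
To predict which source will be picked on a given machine, the order above can be mirrored in a small diagnostic. This sketch does not load credentials or call any Google library. The function name and return labels are made up for illustration, and the file path shown is the Linux/macOS location of the gcloud ADC file.

```python
import os
from typing import Callable, Mapping

# Linux/macOS location written by `gcloud auth application-default login`.
WELL_KNOWN_ADC = os.path.expanduser(
    "~/.config/gcloud/application_default_credentials.json"
)


def adc_source(
    env: Mapping[str, str] = os.environ,
    exists: Callable[[str], bool] = os.path.exists,
) -> str:
    """Report which ADC source would be tried first, in the documented order."""
    if env.get("GOOGLE_APPLICATION_CREDENTIALS"):
        return "env:GOOGLE_APPLICATION_CREDENTIALS"
    if exists(WELL_KNOWN_ADC):
        return "gcloud-adc-file"
    # Nothing local was found, so the metadata server is the remaining option.
    return "metadata-server"
```

The env and exists arguments are injectable so the check can be exercised without touching the real environment.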

In the Web UI, select Service Account (Attached), set Project ID and Region, and leave credentials empty. For GKE, see GKE Workload Identity Federation in Bifrost docs.

curl -X POST http://localhost:8080/api/providers/vertex/keys \
  -H "Content-Type: application/json" \
  -d '{
    "name": "vertex-adc-key",
    "value": "",
    "models": ["*"],
    "weight": 1.0,
    "vertex_key_config": {
      "project_id": "env.VERTEX_PROJECT_ID",
      "region": "us-central1",
      "auth_credentials": ""
    }
  }'

3. API key (Gemini and fine-tuned models only)

Set value to your Vertex API key. API key authentication works only for Gemini models and fine-tuned Gemini models. For Anthropic models on Vertex, use Service Account or Application Default Credentials.

curl -X POST http://localhost:8080/api/providers/vertex/keys \
  -H "Content-Type: application/json" \
  -d '{
    "name": "vertex-api-key",
    "value": "env.VERTEX_API_KEY",
    "models": ["gemini-pro", "gemini-2.0-flash", "my-fine-tuned-model"],
    "weight": 1.0,
    "aliases": {
      "my-fine-tuned-model": "123456789"
    },
    "vertex_key_config": {
      "project_id": "env.VERTEX_PROJECT_ID",
      "project_number": "env.VERTEX_PROJECT_NUMBER",
      "region": "us-central1"
    }
  }'

Fine-tuned model support on Vertex is currently in beta. Test non-Gemini fine-tuned models before production use.

Configuration fields

vertex_key_config

Field	Required	Description
project_id	Yes	Google Cloud project ID
region	Yes	GCP region (e.g. us-central1, europe-west1, global)
auth_credentials	No	Service account JSON string (leave empty for ADC)
project_number	No	GCP project number (required for fine-tuned models)

Key-level fields

Field	Required	Description
value	No	Vertex API key (Gemini and fine-tuned models only; leave empty for Service Account / ADC)
aliases	No	Map model names to fine-tuned model IDs or endpoint identifiers (v1.5.0-prerelease2+)
models	Yes	Models this key can serve; use ["*"] to allow all
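
The required fields in the two tables above can be checked before a key is submitted. This is a minimal sketch with a hypothetical helper name. It checks only for presence and does not verify that the project or region exists.

```python
def validate_vertex_key(key: dict) -> list[str]:
    """Return a list of problems found in a key definition (empty if none)."""
    problems = []
    cfg = key.get("vertex_key_config") or {}
    for field in ("project_id", "region"):
        if not cfg.get(field):
            problems.append(f"vertex_key_config.{field} is required")
    if not key.get("models"):
        problems.append("models is required; use ['*'] to allow all")
    return problems
```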

API reference

Gateway paths and Vertex AI upstream endpoints.

1) Chat Completions

Primary chat path. Bifrost detects Gemini vs Anthropic from the model name and converts to the appropriate Vertex format. Upstream: /generate (Gemini) or Anthropic message format (Claude). See Chat Completions in Bifrost docs.

Parameter	Vertex handling	Notes
model	Maps to Vertex model ID	Region-specific endpoint constructed automatically
All other params	Model-specific conversion	Converted per underlying provider (Gemini/Anthropic)

Gemini models

System prompts, tool usage, and streaming map to Gemini formats. See Gemini provider docs.

Anthropic models (Claude)

Reasoning parameters convert to thinking structure
System messages extracted to a separate system field
API version set to vertex-2023-10-16; minimum reasoning budget 1024 tokens
Model field removed from upstream request (Vertex uses different identification)
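
The four conversions listed above can be sketched on a simplified request. Several details here are assumptions rather than Bifrost behavior: the reasoning_budget input field is hypothetical, budgets below the minimum are clamped rather than rejected, max_tokens defaults to 1024, and message content is assumed to be plain strings.

```python
MIN_THINKING_BUDGET = 1024
ANTHROPIC_VERSION = "vertex-2023-10-16"


def to_vertex_anthropic(request: dict) -> dict:
    """Convert an OpenAI-style chat request into an Anthropic-on-Vertex body."""
    messages = request["messages"]
    system_parts = [m["content"] for m in messages if m["role"] == "system"]
    body = {
        # The model field is deliberately not copied into the body.
        "anthropic_version": ANTHROPIC_VERSION,
        "messages": [m for m in messages if m["role"] != "system"],
        "max_tokens": request.get("max_tokens", 1024),  # assumed default
    }
    if system_parts:
        body["system"] = "\n".join(system_parts)
    budget = request.get("reasoning_budget")  # hypothetical input field
    if budget is not None:
        body["thinking"] = {
            "type": "enabled",
            "budget_tokens": max(budget, MIN_THINKING_BUDGET),  # assumed clamp
        }
    return body
```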

Region selection

Region	Endpoint	Purpose
us-central1	us-central1-aiplatform.googleapis.com	US Central
us-west1	us-west1-aiplatform.googleapis.com	US West
europe-west1	europe-west1-aiplatform.googleapis.com	Europe West
global	aiplatform.googleapis.com	Global (no region prefix)
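
The pattern in the table reduces to a single rule: prefix the host with the region unless the region is global. A minimal sketch, with a hypothetical function name:

```python
def vertex_host(region: str) -> str:
    """Build the Vertex AI host for a region, following the table above."""
    if not region:
        raise ValueError("region is required")
    if region == "global":
        return "aiplatform.googleapis.com"  # no region prefix
    return f"{region}-aiplatform.googleapis.com"
```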

Streaming: Gemini uses SSE; Anthropic uses Anthropic message streaming. In either case, configure vertex_key_config with project_id and region, and set auth_credentials (or leave it empty to use ADC).

curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "vertex/gemini-2.0-flash",
    "messages": [{"role": "user", "content": "Hello"}]
  }'

2) Responses API

Available for both Gemini and Anthropic (Claude) models on Vertex. Upstream routes to /messages for Claude. See Responses API in Bifrost docs.

Parameter	Vertex handling	Notes
instructions	Becomes system message	Model-specific conversion
input	Converted to messages	String or array support
max_output_tokens	Model-specific field mapping	Gemini vs Anthropic conversion
All other params	Model-specific conversion	Converted per underlying provider

Anthropic: endpoint /v1/messages; anthropic_version set to vertex-2023-10-16
Model and region fields removed from upstream request; raw body passthrough supported
Gemini: conversion follows Gemini Responses API format
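
The instructions and input mappings in the table above can be sketched as a small normalizer. The helper name is hypothetical, and array input is assumed to already consist of role/content message objects. Real Responses API input items can be richer than that.

```python
def responses_to_messages(input_value, instructions=None) -> list[dict]:
    """Turn Responses-style input and instructions into a message list."""
    messages = []
    if instructions:
        # instructions becomes the system message
        messages.append({"role": "system", "content": instructions})
    if isinstance(input_value, str):
        messages.append({"role": "user", "content": input_value})
    else:
        messages.extend(input_value)  # assumed to be message dicts already
    return messages
```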

curl -X POST http://localhost:8080/v1/responses \
  -H "Content-Type: application/json" \
  -d '{
    "model": "vertex/claude-3-5-sonnet@20241022",
    "input": "What is AI?",
    "instructions": "You are a helpful assistant"
  }'

3) Embeddings

Supported for Gemini and other embedding-capable models. Upstream /embeddings — no streaming. Use extra_params for task-specific options. See Embeddings in Bifrost docs.

Parameter	Vertex mapping	Notes
input	instances[].content	Text to embed
dimensions	parameters.outputDimensionality	Optional output size

Advanced parameters (extra_params)

Parameter	Type	Description
task_type	string	RETRIEVAL_QUERY, RETRIEVAL_DOCUMENT, SEMANTIC_SIMILARITY, CLASSIFICATION, CLUSTERING (optional)
title	string	Optional title to improve embeddings (used with task_type)
autoTruncate	boolean	Auto-truncate input to max tokens (defaults to true)

Response includes values, statistics.token_count, and statistics.truncated. Bifrost preserves float64 precision from Vertex.
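
The request-side mapping can be sketched as below. The input and dimensions mappings come from the table above. Placing task_type and title on each instance and autoTruncate under parameters follows the Vertex text-embeddings request shape as understood here and should be treated as an assumption. The function name is hypothetical.

```python
def to_vertex_embedding_body(texts, dimensions=None, task_type=None,
                             title=None, auto_truncate=None) -> dict:
    """Build a Vertex-style embeddings request body from OpenAI-style inputs."""
    instances = []
    for text in texts:
        instance = {"content": text}  # input -> instances[].content
        if task_type:
            instance["task_type"] = task_type
        if title:
            instance["title"] = title
        instances.append(instance)

    parameters = {}
    if dimensions is not None:
        parameters["outputDimensionality"] = dimensions  # dimensions mapping
    if auto_truncate is not None:
        parameters["autoTruncate"] = auto_truncate

    body = {"instances": instances}
    if parameters:
        body["parameters"] = parameters
    return body
```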

curl -X POST http://localhost:8080/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{
    "model": "vertex/text-embedding-004",
    "input": ["text to embed"],
    "dimensions": 256,
    "task_type": "RETRIEVAL_DOCUMENT",
    "title": "Document title",
    "autoTruncate": true
  }'

4) Image Generation

Supported for Gemini and Imagen. Bifrost auto-detects model type and routes to /generateContent or /predict. Streaming is not supported. See Image Generation in Bifrost docs.

Parameter	Vertex handling	Notes
model	Mapped to deployment/model identifier	Model type detected automatically
prompt	Model-specific conversion	Converted per Gemini or Imagen
All other params	Model-specific conversion	Converted per underlying provider

Gemini: same conversion as Gemini Image Generation
Imagen: Imagen-specific format via IsImagenModel()
Fine-tuned: .../endpoints/{deployment}:generateContent
Region field removed from request body before upstream call

curl -X POST http://localhost:8080/v1/images/generations \
  -H "Content-Type: application/json" \
  -d '{
    "model": "vertex/imagen-4.0-generate-001",
    "prompt": "A sunset over the mountains",
    "size": "1024x1024",
    "n": 2
  }'

5) Image Edit

Uses multipart/form-data, not JSON. Supported for Gemini and Imagen only; other models return ConfigurationError. Image variation is not supported. See Image Edit in Bifrost docs.

Parameter	Type	Required	Notes
model	string	Yes	Must be Gemini or Imagen model
prompt	string	Yes	Text description of the edit
image[]	binary	Yes	Image file(s) to edit (supports multiple)
mask	binary	No	Mask image file
type	string	No	inpainting, outpainting, inpaint_removal, bgswap (Imagen only)
n	int	No	Number of images to generate (1–10)
output_format	string	No	png, webp, jpeg
output_compression	int	No	Compression level (0–100%)
seed	int	No	Via ExtraParams["seed"]
negative_prompt	string	No	Via ExtraParams["negativePrompt"]
maskMode	string	No	Imagen only: MASK_MODE_USER_PROVIDED, BACKGROUND, FOREGROUND, SEMANTIC
dilation	float	No	Imagen only: range [0, 1]
maskClasses	int[]	No	Imagen only: for MASK_MODE_SEMANTIC

Conversion matches Gemini Image Edit. Gemini strips unsupported fields; Imagen supports mask modes and dilation. Streaming not supported.
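
The ranges and enumerations in the parameter table can be checked before the multipart request is built. This sketch covers only the constraints stated in the table. The helper name is hypothetical, and it does not enforce the Imagen-only restrictions.

```python
EDIT_TYPES = {"inpainting", "outpainting", "inpaint_removal", "bgswap"}
OUTPUT_FORMATS = {"png", "webp", "jpeg"}


def check_image_edit_params(params: dict) -> list[str]:
    """Return a list of range or enum violations (empty if none)."""
    problems = []
    if "n" in params and not 1 <= params["n"] <= 10:
        problems.append("n must be between 1 and 10")
    if "output_format" in params and params["output_format"] not in OUTPUT_FORMATS:
        problems.append("output_format must be png, webp, or jpeg")
    if "output_compression" in params and not 0 <= params["output_compression"] <= 100:
        problems.append("output_compression must be between 0 and 100")
    if "dilation" in params and not 0 <= params["dilation"] <= 1:
        problems.append("dilation must be in [0, 1]")
    if "type" in params and params["type"] not in EDIT_TYPES:
        problems.append("type must be one of " + ", ".join(sorted(EDIT_TYPES)))
    return problems
```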

curl -X POST http://localhost:8080/v1/images/edits \
  -F "model=vertex/imagen-3.0-generate-002" \
  -F "prompt=Add sunglasses to the person" \
  -F "image[]=@photo.png"

6) List Models

GET /v1/models — uses project_id and region from key config. No request parameters required. See List Models in Bifrost docs.

Vertex's List Models API returns only custom fine-tuned models (digit-only deployment IDs). Bifrost performs three-pass discovery to include foundation models:

Custom models from the Vertex API response
Foundation models from your aliases configuration
Models in the key-level models allowlist not already in aliases

Empty models and no aliases → no models returned
models: ["*"] → all passes included
Non-empty models → filtered to allowlist; duplicates prevented
Pagination handled internally when next_page_token is present
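
The three passes and the filtering rules can be sketched as one function. This illustrates the described behavior and is not Bifrost's implementation: the function name is hypothetical, models are represented as plain name strings, and the handling of an empty allowlist combined with non-empty aliases (treated here as unfiltered) is an assumption because that case is not specified above.

```python
def discover_models(api_models: list[str], aliases: dict[str, str],
                    allowlist: list[str]) -> list[str]:
    """Merge custom, alias, and allowlisted model names without duplicates."""
    if not allowlist and not aliases:
        return []  # empty models and no aliases -> nothing returned

    # Assumption: an empty allowlist with aliases present means no filtering.
    allow_all = "*" in allowlist or not allowlist
    seen: set[str] = set()
    result: list[str] = []

    def add(name: str) -> None:
        if (allow_all or name in allowlist) and name not in seen:
            seen.add(name)
            result.append(name)

    for name in api_models:   # pass 1: custom models from the Vertex API
        add(name)
    for name in aliases:      # pass 2: models named in the aliases config
        add(name)
    for name in allowlist:    # pass 3: allowlisted models not already added
        if name != "*":
            add(name)
    return result
```

With models set to ["gemini-2.0-flash", "my-ft"] and an alias for my-ft, the custom model ID returned by the API is filtered out, the alias is kept, and the foundation model comes in through the third pass.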

curl http://localhost:8080/v1/models

Example response shape

{
  "models": [
    {
      "name": "projects/{project}/locations/{region}/models/gemini-2.0-flash",
      "display_name": "Gemini 2.0 Flash",
      "description": "Fast multimodal model",
      "version_id": "1",
      "version_aliases": ["latest", "stable"],
      "capabilities": [...],
      "deployed_models": [...]
    }
  ],
  "next_page_token": "..."
}

7) Video Generation

Veo models only via /predictLongRunning. Parameters match Gemini Video Generation. See Video Generation in Bifrost docs.

curl -X POST http://localhost:8080/v1/videos \
  -H "Content-Type: application/json" \
  -d '{
    "model": "vertex/veo-2.0-generate-001",
    "prompt": "A bird flying through clouds"
  }'

Implementation caveats

Caveat	Impact	Severity
Project ID and region required	Request fails without valid project_id and region in vertex_key_config	High
List Models API limitation	Vertex API returns only custom models; Bifrost three-pass discovery adds aliases/allowlist	High
OAuth2 token management	Tokens cached and refreshed automatically; first request may be slower	Medium
Anthropic model detection	Gemini vs Claude conversion applied transparently by model name	Medium
Anthropic version lock	anthropic_version always vertex-2023-10-16 for Claude on Vertex	Low
Embeddings precision	float64 vectors preserved in /v1/embeddings responses	Low
Video generation exclusivity	Only Veo models; non-Veo returns configuration error	Medium

Authoritative references

Bifrost Vertex AI provider reference: docs.getbifrost.ai/providers/supported-providers/vertex
Google Vertex AI API docs: cloud.google.com/vertex-ai/generative-ai/docs/reference/rest
Bifrost provider support overview: docs.getbifrost.ai/providers/supported-providers/overview
