Azure on Bifrost: Models, Endpoints, Setup, and Mappings

Azure provider summary

Bifrost connects to Azure OpenAI Service with alias-based mapping of model names to deployments, support for multiple authentication methods, and automatic detection of the model family (OpenAI or Anthropic).

Azure supports:

OpenAI models (GPT-4, GPT-3.5, DALL-E, etc.)
Anthropic Claude models (via Azure partnership)
Flexible deployment mapping via aliases
Multiple authentication methods

Property	Details
Description	Microsoft's hosted OpenAI service, with enterprise features and OpenAI API compatibility.
Provider route on Bifrost	azure/<model>
Provider doc	Azure OpenAI Service
API endpoint pattern	https://<resource>.openai.azure.com
Default API version	2024-10-21

Authentication methods

Bifrost supports three authentication methods. When more than one is available, precedence is: Entra ID (if configured) → API key → managed identity.

Method	Credentials	Use case
Entra ID (Service Principal)	client_id, client_secret, tenant_id	Recommended for enterprise
API Key	Direct API key auth	Simple setup
Default Credential (managed identity)	Automatic discovery	Cloud native
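
The precedence above can be sketched as a small selection function. This is an illustration only, not Bifrost's implementation: the function name pick_auth_method, the flat dict layout, and the api_key field name are assumptions, as is the reading of "configured" as "all three service principal fields are set". Only client_id, client_secret, and tenant_id come from the table.

```python
from typing import Mapping

# Credential names taken from the table above.
ENTRA_FIELDS = ("client_id", "client_secret", "tenant_id")


def pick_auth_method(key_config: Mapping[str, str]) -> str:
    """Return which auth method would win under the documented precedence."""
    # 1) Entra ID wins when all three service principal fields are set (assumed rule).
    if all(key_config.get(field) for field in ENTRA_FIELDS):
        return "entra_id"
    # 2) Otherwise use a direct API key ("api_key" is an assumed field name).
    if key_config.get("api_key"):
        return "api_key"
    # 3) With nothing configured, rely on automatic credential discovery.
    return "default_credential"
```

With both a full service principal and an API key present, the sketch picks Entra ID; with only an API key it picks the key; with an empty config it falls through to the default credential.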

Supported operations

Azure OpenAI Service supports most OpenAI operations. Batch, text completions, speech (TTS), and image variation are not supported upstream. The Responses API uses a preview API version and is available for both OpenAI and Anthropic models.

Operation	Non-streaming	Streaming	Upstream endpoint
Chat Completions	Yes	Yes	/v1/chat/completions
Responses API	Yes	Yes	/openai/v1/responses
Embeddings	Yes	No	/v1/embeddings
Image Generation	Yes	Yes	/openai/v1/images/generations
Image Edit	Yes	Yes	/openai/v1/images/edits
Files	Yes	No	/openai/v1/files
List Models	Yes	-	/openai/v1/models
Video Generation	Yes	-	/openai/v1/videos
Image Variation	No	No	-
Speech (TTS)	No	No	-
Text Completions	No	No	-
Batch	No	No	-
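
A client can check this matrix before sending a request. The sketch below simply encodes the table as data; the names AZURE_SUPPORT and is_supported are hypothetical and not part of Bifrost, and the "-" (not applicable) cells are treated as unsupported for streaming.

```python
# (non_streaming, streaming) flags copied from the table above.
# "-" cells (not applicable) are recorded as False.
AZURE_SUPPORT = {
    "chat_completions": (True, True),
    "responses": (True, True),
    "embeddings": (True, False),
    "image_generation": (True, True),
    "image_edit": (True, True),
    "files": (True, False),
    "list_models": (True, False),
    "video_generation": (True, False),
    "image_variation": (False, False),
    "speech": (False, False),
    "text_completions": (False, False),  # listed as unsupported in the prose above
    "batch": (False, False),
}


def is_supported(operation: str, streaming: bool = False) -> bool:
    """Return True if the operation is listed as supported in the given mode."""
    # Unknown operations are treated as unsupported.
    non_streaming_ok, streaming_ok = AZURE_SUPPORT.get(operation, (False, False))
    return streaming_ok if streaming else non_streaming_ok
```

For example, is_supported("embeddings") is true while is_supported("embeddings", streaming=True) is false.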

Deployment mapping

Azure deployments must be mapped to model names via aliases in your API key configuration. This enables Bifrost to route requests to the correct Azure deployment.

{
  "aliases": {
    "gpt-4o": "my-gpt4o-deployment",
    "gpt-4-turbo": "my-gpt4-turbo-deployment",
    "claude-3-5-sonnet": "my-claude-deployment"
  },
  "azure_key_config": {
    "endpoint": "https://your-org.openai.azure.com",
    "api_version": "2024-10-21"
  }
}
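
To illustrate how an alias table like this one resolves a request, here is a minimal sketch. It assumes the azure/ route prefix is stripped before the lookup; the function name resolve_deployment and the error raised for an unmapped model are hypothetical, and Bifrost's own behavior may differ.

```python
from typing import Mapping

# Same aliases as the configuration above.
ALIASES = {
    "gpt-4o": "my-gpt4o-deployment",
    "gpt-4-turbo": "my-gpt4-turbo-deployment",
    "claude-3-5-sonnet": "my-claude-deployment",
}


def resolve_deployment(model: str, aliases: Mapping[str, str]) -> str:
    """Map a request model such as "azure/gpt-4o" to its Azure deployment name."""
    # Drop the provider prefix used on the Bifrost route (azure/<model>).
    name = model.removeprefix("azure/")
    try:
        return aliases[name]
    except KeyError:
        # Illustrative error only; the real error type and message are not documented here.
        raise LookupError(f"no Azure deployment aliased for model {name!r}") from None
```

Calling resolve_deployment("azure/gpt-4o", ALIASES) yields my-gpt4o-deployment, while a model with no alias raises LookupError, which mirrors why the mapping is required.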

API reference

Standard OpenAI-compatible endpoints routed through Azure OpenAI Service.

1) Chat Completions

Primary chat endpoint. Automatically routes to Azure deployment via alias mapping.

curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "api-key: YOUR_AZURE_API_KEY" \
  -d '{
    "model": "azure/gpt-4o",
    "messages": [{"role": "user", "content": "Hello"}],
    "deployment": "my-gpt4o-deployment",
    "endpoint": "https://my-org.openai.azure.com"
  }'
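
The same call can be prepared from Python with only the standard library. This is a sketch under assumptions: the helper name build_chat_request is hypothetical, the gateway address is the localhost one used in the curl example, and the per-request deployment and endpoint fields are omitted on the assumption that alias mapping is already configured for the key.

```python
import json
import urllib.request


def build_chat_request(base_url: str, api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a chat completion request for the Bifrost gateway."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "api-key": api_key},
        method="POST",
    )


req = build_chat_request("http://localhost:8080", "YOUR_AZURE_API_KEY", "azure/gpt-4o", "Hello")
# To send it against a running gateway: urllib.request.urlopen(req)
```

Building the request separately from sending it keeps the example inspectable without a running gateway.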

2) Responses API

The Responses API is available for both OpenAI and Anthropic models on Azure. Bifrost routes to upstream /openai/v1/responses using the preview API version, which is distinct from the Chat Completions API version.

Parameter	Azure handling	Notes
instructions	Becomes system message	Model-specific conversion
input	Converted to user message(s)	String or array support
max_output_tokens	Model-specific field mapping	OpenAI vs Anthropic conversion
All other params	Model-specific conversion	Converted per underlying provider

OpenAI models (GPT-4, etc.): conversion follows OpenAI's Responses API format.

Anthropic models (Claude): instructions becomes a system message; reasoning maps to the thinking structure.
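
The instructions and input conversions described above can be sketched as follows. This is an illustrative approximation, not Bifrost's converter: the function name responses_to_messages is hypothetical, array items are assumed to be either strings or objects with role and content keys, and the reasoning and max_output_tokens mappings are left out because their target fields are not specified here.

```python
from typing import Any


def responses_to_messages(request: dict[str, Any]) -> list[dict[str, Any]]:
    """Convert Responses-style `instructions` and `input` into chat-style messages."""
    messages: list[dict[str, Any]] = []
    # `instructions` becomes a system message.
    if request.get("instructions"):
        messages.append({"role": "system", "content": request["instructions"]})
    raw_input = request.get("input", [])
    # `input` may be a plain string or an array of items.
    if isinstance(raw_input, str):
        messages.append({"role": "user", "content": raw_input})
    else:
        for item in raw_input:
            if isinstance(item, str):
                messages.append({"role": "user", "content": item})
            else:
                # Assumed item shape: {"role": ..., "content": ...}; role defaults to user.
                messages.append({"role": item.get("role", "user"), "content": item.get("content", "")})
    return messages
```

Given instructions "You are a helpful assistant" and input "Hello, how are you?", the sketch produces one system message followed by one user message.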

curl -X POST http://localhost:8080/v1/responses \
  -H "Content-Type: application/json" \
  -H "api-key: YOUR_AZURE_API_KEY" \
  -d '{
    "model": "azure/claude-3-5-sonnet",
    "input": "Hello, how are you?",
    "instructions": "You are a helpful assistant",
    "deployment": "my-claude-deployment",
    "endpoint": "https://my-org.openai.azure.com"
  }'

Uses /openai/v1/responses with preview API version
Request body conversions are handled automatically by Bifrost
Supports raw request body passthrough for advanced use cases

See also OpenAI Responses API (gpt-oss reasoning) and Anthropic Responses API for model-family specifics.

3) Embeddings

Text embeddings via Azure OpenAI Service. Maps to configured deployment.

Embeddings are supported for OpenAI models only (not available for Anthropic models on Azure).

curl -X POST http://localhost:8080/v1/embeddings \
  -H "Content-Type: application/json" \
  -H "api-key: YOUR_AZURE_API_KEY" \
  -d '{
    "model": "azure/text-embedding-3-small",
    "input": ["text to embed"],
    "deployment": "my-embedding-deployment"
  }'

4) Image Generation

Image Generation is supported for OpenAI models on Azure and uses the OpenAI-compatible format. Bifrost routes to upstream /openai/v1/images/generations. Conversion matches OpenAI Image Generation.

Parameter	Azure handling	Notes
model	Mapped to deployment_id	Deployment ID must be configured in aliases
prompt	Direct pass-through	Prompt text for image generation
All other params	Direct pass-through	Uses OpenAI format via struct embedding

bifrostReq.Model maps to deployment ID; bifrostReq.Prompt passes through
Additional fields from the request are embedded into the upstream struct (OpenAI-compatible)

curl -X POST http://localhost:8080/v1/images/generations \
  -H "Content-Type: application/json" \
  -H "api-key: YOUR_AZURE_API_KEY" \
  -d '{
    "model": "azure/dall-e-3",
    "prompt": "A sunset over the mountains",
    "size": "1024x1024",
    "n": 1,
    "deployment": "my-image-gen-deployment"
  }'

Response conversion

Non-streaming: Azure responses unmarshal into BifrostImageGenerationResponse (superset of OpenAI; fields pass through)
Streaming: Server-Sent Events with the same event types as OpenAI image generation streaming
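
For the streaming case, a client needs to split the Server-Sent Events stream into individual events. The parser below is a generic SSE sketch, not Bifrost code: the function name iter_sse_events is hypothetical, and the sample event names and payloads are placeholders rather than real Azure or OpenAI output.

```python
import json
from typing import Iterable, Iterator


def iter_sse_events(lines: Iterable[str]) -> Iterator[tuple[str, dict]]:
    """Yield (event_type, payload) pairs from already-decoded SSE lines."""
    event_type, data_parts = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("event:"):
            event_type = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data_parts.append(line[len("data:"):].strip())
        elif line == "":
            # A blank line ends the current event.
            data = "\n".join(data_parts)
            # Skip empty events and the "[DONE]" sentinel some OpenAI-style streams send.
            if data and data != "[DONE]":
                yield event_type, json.loads(data)
            event_type, data_parts = "message", []


# Placeholder event names and payloads, not real Azure output.
SAMPLE_STREAM = [
    "event: example.partial\n",
    'data: {"index": 0}\n',
    "\n",
    "event: example.completed\n",
    'data: {"index": 1}\n',
    "\n",
]
events = list(iter_sse_events(SAMPLE_STREAM))
```

In practice the lines would come from the HTTP response body read line by line; the event names to branch on are the ones OpenAI image generation streaming defines.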

5) Video Generation

Azure routes video generation to OpenAI's Sora models via the Azure OpenAI-compatible endpoint. Parameters match OpenAI Video Generation. Bifrost upstream path: /openai/v1/videos.

Operation	Supported	Gateway endpoint
Generate	Yes	POST /v1/videos
Retrieve	Yes	GET /v1/videos/{id}
Download	Yes	GET /v1/videos/{id}/content
Delete	Yes	DELETE /v1/videos/{id}
List	Yes	GET /v1/videos
Remix	No	Not supported

curl -X POST http://localhost:8080/v1/videos \
  -H "Content-Type: application/json" \
  -H "api-key: YOUR_AZURE_API_KEY" \
  -d '{
    "model": "azure/sora",
    "prompt": "A calico cat playing piano on stage"
  }'
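
The gateway routes in the table above can be captured in a small lookup so client code does not hand-build paths. This is a sketch: VIDEO_ROUTES and video_route are hypothetical names, and the video ID used in the usage note is a made-up placeholder.

```python
from typing import Optional

# Method and gateway path per operation, copied from the table above.
VIDEO_ROUTES = {
    "generate": ("POST", "/v1/videos"),
    "retrieve": ("GET", "/v1/videos/{id}"),
    "download": ("GET", "/v1/videos/{id}/content"),
    "delete": ("DELETE", "/v1/videos/{id}"),
    "list": ("GET", "/v1/videos"),
}


def video_route(operation: str, video_id: Optional[str] = None) -> tuple[str, str]:
    """Return (HTTP method, gateway path) for a supported video operation."""
    if operation not in VIDEO_ROUTES:
        # Remix, for example, is listed as not supported.
        raise ValueError(f"unsupported video operation: {operation}")
    method, template = VIDEO_ROUTES[operation]
    if "{id}" in template:
        if not video_id:
            raise ValueError(f"operation {operation!r} requires a video id")
        return method, template.replace("{id}", video_id)
    return method, template
```

For instance, video_route("download", "vid_123") returns ("GET", "/v1/videos/vid_123/content"), and video_route("remix") raises ValueError.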

6) Files API

Files operations are supported for OpenAI models only (not Anthropic deployments on Azure). Bifrost routes file requests to upstream /openai/v1/files. Files are stored in Azure.

Operation	Support
Upload	Yes
List	Yes
Retrieve	Yes
Delete	Yes
Get Content	Yes

curl -X POST http://localhost:8080/v1/files \
  -H "api-key: YOUR_AZURE_API_KEY" \
  -F "file=@document.pdf" \
  -F "purpose=assistants"

Gateway endpoints:

POST /v1/files (multipart upload)
GET /v1/files
GET /v1/files/{file_id}
DELETE /v1/files/{file_id}
GET /v1/files/{file_id}/content

7) List Models

Lists available models and deployments configured for your Azure key. No request parameters are required. Bifrost routes to upstream /openai/v1/models. The response includes model metadata, capabilities, and lifecycle status.

curl http://localhost:8080/v1/models

Example response shape

{
  "data": [
    {
      "id": "gpt-4",
      "object": "model",
      "created": 1687882411,
      "status": "active",
      "lifecycle_status": "stable",
      "capabilities": {
        "chat_completion": true,
        "embeddings": false
      }
    }
  ]
}
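
A client might filter this response to find deployments usable for chat. The sketch below parses the example shape shown above; the function name chat_capable_models is hypothetical, and it assumes that status and capabilities.chat_completion are present as in the example.

```python
import json

# The example response shape from above, embedded so the sketch runs on its own.
RESPONSE_BODY = """
{
  "data": [
    {
      "id": "gpt-4",
      "object": "model",
      "created": 1687882411,
      "status": "active",
      "lifecycle_status": "stable",
      "capabilities": {"chat_completion": true, "embeddings": false}
    }
  ]
}
"""


def chat_capable_models(body: str) -> list[str]:
    """Return IDs of active models whose capabilities include chat completion."""
    models = json.loads(body).get("data", [])
    return [
        m["id"]
        for m in models
        if m.get("status") == "active" and m.get("capabilities", {}).get("chat_completion")
    ]
```

Swapping chat_completion for embeddings in the filter would select embedding-capable models instead.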

Implementation caveats

Caveat	Impact	Severity
Deployment mapping required	Models must be aliased to Azure deployments	High
Responses API preview version	The Responses API uses a preview API version, which may differ from the Chat Completions version	Medium
No batch support	Batch operations return UnsupportedOperationError	Medium
Model auto-detection	Bifrost detects OpenAI vs Anthropic and applies conversion	Low
Anthropic beta headers	Validated per deployment type	Low
Image variation unsupported	The image variation operation is not available on Azure	Low