
[ Provider Guide ]

OpenAI Provider on Bifrost

OpenAI serves as Bifrost's baseline schema. Parameters pass through with minimal conversion—primarily validation and filtering of OpenAI-specific features for downstream provider compatibility.

OpenAI provider summary

Bifrost routes OpenAI models with full schema compatibility. Parameters are validated and filtered according to downstream provider requirements, so multi-provider setups can adapt requests automatically.

Common OpenAI model IDs used in Bifrost routes:

  • gpt-4o-2024-11-20 (Latest)
  • gpt-4-turbo-2024-04-09 (Turbo)
  • gpt-3.5-turbo-0125 (Fast)
  • o1-2024-12-17 (Reasoning)

| Property | Details |
| --- | --- |
| Description | OpenAI models for chat, reasoning, image generation, and audio tasks. |
| Provider route on Bifrost | openai/<model> |
| Provider doc | OpenAI API Reference |
| API endpoint for provider | https://api.openai.com |
| Supported endpoints | /v1/chat/completions, /v1/responses, /v1/completions, /v1/embeddings, /v1/audio/*, /v1/images/*, /v1/videos, /v1/files, /v1/batches, /v1/models |

Supported operations

OpenAI is Bifrost's baseline schema: 13 operations across chat, Responses API, embeddings, audio, images, video, files, batch, and model listing. Streaming is available for chat, responses, text completions, speech, transcriptions, and image generation/edit. See Supported operations in Bifrost docs.

| Operation | Non-streaming | Streaming | Upstream endpoint |
| --- | --- | --- | --- |
| Chat Completions | Yes | Yes | /v1/chat/completions |
| Responses API | Yes | Yes | /v1/responses |
| Text Completions | Yes | Yes | /v1/completions |
| Embeddings | Yes | No | /v1/embeddings |
| Speech (TTS) | Yes | Yes | /v1/audio/speech |
| Transcriptions (STT) | Yes | Yes | /v1/audio/transcriptions |
| Image Generation | Yes | Yes | /v1/images/generations |
| Image Edit | Yes | Yes | /v1/images/edits |
| Image Variation | Yes | No | /v1/images/variations |
| Files | Yes | No | /v1/files |
| Batch | Yes | No | /v1/batches |
| Video Generation | Yes | No | /v1/videos |
| List Models | Yes | No | /v1/models |

Parameter handling

OpenAI parameters pass through with validation. Bifrost filters OpenAI-specific fields (store, service_tier, prompt_cache_key) before a request reaches a non-OpenAI provider. The user field is truncated to 64 characters in chat and text operations.

Reasoning support (o1/o3 models only):

  • Non-gpt-oss models: reasoning summary converted from content blocks
  • gpt-oss variants: reasoning content blocks passed directly
  • Minimum budget enforced for structured output conversion

Token enforcement:

  • max_completion_tokens and max_output_tokens have an enforced minimum of 16 tokens
  • Values below 16 are automatically raised to 16
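
The rules above can be illustrated with a small sketch. This is not Bifrost's implementation; the function name `normalize_request` and the constants are hypothetical, and the filtered-field set is simply the list of fields this guide names.

```python
from typing import Any

# Assumptions drawn from this guide, not from Bifrost's source code.
OPENAI_ONLY_FIELDS = {"store", "service_tier", "prompt_cache_key"}
MAX_USER_LENGTH = 64
MIN_OUTPUT_TOKENS = 16
TOKEN_LIMIT_FIELDS = ("max_completion_tokens", "max_output_tokens")


def normalize_request(body: dict[str, Any], target_provider: str) -> dict[str, Any]:
    """Return a copy of `body` with the documented adjustments applied."""
    out = dict(body)

    # Drop OpenAI-specific fields when the request goes to another provider.
    if target_provider != "openai":
        for field in OPENAI_ONLY_FIELDS:
            out.pop(field, None)

    # Truncate long user identifiers (chat and text operations).
    user = out.get("user")
    if isinstance(user, str) and len(user) > MAX_USER_LENGTH:
        out["user"] = user[:MAX_USER_LENGTH]

    # Raise token limits that fall below the enforced minimum.
    for field in TOKEN_LIMIT_FIELDS:
        value = out.get(field)
        if isinstance(value, int) and value < MIN_OUTPUT_TOKENS:
            out[field] = MIN_OUTPUT_TOKENS

    return out
```

Because truncation is silent, two distinct user IDs that share their first 64 characters would become indistinguishable downstream; hashing long IDs on the client side is one way to avoid that.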

Supported OpenAI parameters

Quick reference of OpenAI parameters accepted when routing through Bifrost.

[
  "stream",
  "temperature",
  "top_p",
  "top_k",
  "max_tokens",
  "max_completion_tokens",
  "stop",
  "presence_penalty",
  "frequency_penalty",
  "logit_bias",
  "logprobs",
  "top_logprobs",
  "seed",
  "response_format",
  "tools",
  "tool_choice",
  "user",
  "reasoning",
  "parallel_tool_calls",
  "service_tier"
]

Supported OpenAI models

Use the provider prefix openai/ in Bifrost model routes for deterministic provider targeting.

| Family | Model ID | Bifrost route | Typical usage |
| --- | --- | --- | --- |
| GPT-4o | gpt-4o-2024-11-20 | openai/gpt-4o-2024-11-20 | Flagship reasoning model |
| GPT-4 Turbo | gpt-4-turbo-2024-04-09 | openai/gpt-4-turbo-2024-04-09 | Previous generation turbo |
| GPT-4 | gpt-4-0613 | openai/gpt-4-0613 | Baseline GPT-4 |
| GPT-3.5 Turbo | gpt-3.5-turbo-0125 | openai/gpt-3.5-turbo-0125 | Fast, lower-cost option |
| O1 | o1-2024-12-17 | openai/o1-2024-12-17 | Extended reasoning model |
| O1-preview | o1-preview-2024-09-12 | openai/o1-preview-2024-09-12 | Earlier reasoning preview |
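
As a minimal illustration of the openai/<model> route format, the helpers below build and split a route string on the client side. Both function names are hypothetical and exist only for this sketch; how Bifrost parses routes internally is not shown in this guide.

```python
def openai_route(model_id: str) -> str:
    """Build the Bifrost route for an OpenAI model ID."""
    return f"openai/{model_id}"


def split_route(route: str) -> tuple[str, str]:
    """Split '<provider>/<model>' into (provider, model).

    Only the first slash separates the provider; any later slashes
    stay in the model part.
    """
    provider, sep, model = route.partition("/")
    if not sep or not provider or not model:
        raise ValueError(f"expected '<provider>/<model>', got {route!r}")
    return provider, model
```

For example, `split_route("openai/gpt-4o-2024-11-20")` yields the provider `openai` and the model ID `gpt-4o-2024-11-20`, while a bare `gpt-4o` raises `ValueError`.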

API reference

OpenAI is Bifrost's baseline schema: parameters pass through with validation and filtering. Gateway routes map 1:1 to upstream OpenAI endpoints. Content aligned with Bifrost OpenAI provider docs.

1) Chat Completions

Primary chat path at /v1/chat/completions. See Chat Completions in Bifrost docs.

| Parameter | Required | Notes |
| --- | --- | --- |
| model | Yes | Model identifier |
| messages | Yes | ChatMessage array; roles: system, user, assistant, tool, developer |
| temperature | No | Sampling temperature (0–2) |
| top_p | No | Nucleus sampling |
| stop | No | Stop sequences |
| max_completion_tokens | No | Min 16 enforced by Bifrost |
| frequency_penalty | No | Frequency penalty (-2 to 2) |
| presence_penalty | No | Presence penalty (-2 to 2) |
| logit_bias | No | Token logit adjustments |
| logprobs | No | Include log probabilities |
| top_logprobs | No | Log probabilities per token |
| seed | No | Reproducibility seed |
| response_format | No | Structured output format |
| tools | No | Function tools; tool_choice: auto, none, required, or specific |
| parallel_tool_calls | No | Multiple simultaneous tool calls |
| stream_options | No | Streaming options; include_usage set by default |
| reasoning | No | reasoning.effort and reasoning.max_tokens passed through |
| user | No | Truncated to 64 characters |
| metadata | No | Custom metadata |
| store | No | Filtered when routing to non-OpenAI providers |
| service_tier | No | Filtered when routing to non-OpenAI providers |
| prompt_cache_key | No | Filtered when routing to non-OpenAI providers |
| prediction | No | Predicted output for acceleration |
| audio | No | Audio output config |
| modalities | No | Response modalities (text, audio) |

  • Messages: text, image_url, input_audio; tool messages include tool_call_id
  • Streaming: SSE with delta.content, delta.tool_calls, finish_reason, usage on final chunk
  • cache_control stripped from messages, content blocks, and tools
  • Reasoning: effort minimal/low/medium/high; minimal → low when routing to other providers
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o-2024-11-20",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
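
For streaming requests, the delta fields listed above arrive as SSE `data:` lines. The sketch below shows one way a client might fold them into the final text. It assumes the stream follows OpenAI's chunk shape and ends with a `data: [DONE]` sentinel; the function name `collect_stream_text` is hypothetical.

```python
import json
from typing import Iterable, Optional


def collect_stream_text(lines: Iterable[str]) -> tuple[str, Optional[str]]:
    """Concatenate delta.content from SSE lines; return (text, finish_reason)."""
    parts: list[str] = []
    finish_reason: Optional[str] = None
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and SSE comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":  # assumed end-of-stream sentinel
            break
        chunk = json.loads(payload)
        # The final usage chunk may carry an empty choices list.
        for choice in chunk.get("choices", []):
            delta = choice.get("delta") or {}
            if delta.get("content"):
                parts.append(delta["content"])
            if choice.get("finish_reason"):
                finish_reason = choice["finish_reason"]
    return "".join(parts), finish_reason
```

The function takes any iterable of decoded lines, so it can be fed from an HTTP response object or from a recorded stream in a test.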

2) Responses API

Structured output API at /v1/responses. Non-gpt-oss models use reasoning summaries; gpt-oss uses reasoning content blocks. See Responses API in Bifrost docs.

| Parameter | Required | Notes |
| --- | --- | --- |
| model | Yes | Model identifier |
| input | Yes | Text or ContentBlock array |
| max_output_tokens | Yes | Min 16 enforced by Bifrost |
| instructions | No | System instructions |
| tools / tool_choice | No | ResponsesTool objects and choice strategy |
| reasoning | No | reasoning.max_tokens removed from upstream JSON |
| temperature | No | Sampling temperature |
| top_p | No | Nucleus sampling |
| parallel_tool_calls | No | Multiple simultaneous tool calls |
| previous_response_id | No | Continue from prior response |
| conversation | No | Conversation ID |
| background | No | Background mode |
| include | No | Extra fields in response (e.g. web_search sources) |
| truncation | No | auto or off |
| user | No | Truncated to 64 characters |
| store | No | Store response for later retrieval |
| stream_options | No | include_usage set by default for streaming |

Supported tool types: function, file_search, computer_use_preview, web_search, mcp, code_interpreter, image_generation, local_shell, custom, web_search_preview. Action types zoom and region are converted to screenshot. Response fields: id, status, output, usage.

| SSE event | Description |
| --- | --- |
| response.created | Response created |
| response.in_progress | In progress |
| response.output_item.added | Output item added |
| response.content_part.added | Content part added |
| response.output_text.delta | Text delta |
| response.function_call_arguments.delta | Function call arguments delta |
| response.completed | Completed |
| response.incomplete | Incomplete |

curl -X POST http://localhost:8080/v1/responses \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o-2024-11-20",
    "input": "Hello",
    "max_output_tokens": 1024
  }'
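
The SSE events listed above can be reduced to a final result on the client side. The following is a sketch under stated assumptions: each event has already been decoded into a dict with a `type` field, and delta events carry their fragment in a `delta` string, as in OpenAI's published event shape. The function name and the local `pending` label are hypothetical.

```python
from typing import Any, Iterable


def fold_response_events(events: Iterable[dict[str, Any]]) -> dict[str, Any]:
    """Reduce decoded SSE events to the final text, arguments, and status."""
    text: list[str] = []
    arguments: list[str] = []
    status = "pending"  # local label used before any event arrives
    for event in events:
        kind = event.get("type")
        if kind == "response.output_text.delta":
            text.append(event.get("delta", ""))
        elif kind == "response.function_call_arguments.delta":
            arguments.append(event.get("delta", ""))
        elif kind in ("response.created", "response.in_progress"):
            status = "in_progress"
        elif kind == "response.completed":
            status = "completed"
        elif kind == "response.incomplete":
            status = "incomplete"
        # Other events (output_item.added, content_part.added) are ignored here.
    return {"text": "".join(text), "arguments": "".join(arguments), "status": status}
```

This simplification concatenates all argument deltas into one string, so a response with several function calls would need the fragments grouped per output item instead.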

3) Text Completions (Legacy)

Legacy API at /v1/completions — prefer Chat Completions for new work. Supports streaming. See Text Completions in Bifrost docs.

| Parameter | Required | Notes |
| --- | --- | --- |
| model | Yes | Model identifier |
| prompt | Yes | Completion prompt(s); array prompts → multiple completions |
| max_tokens | No | Maximum output tokens |
| temperature | No | Sampling temperature |
| top_p | No | Nucleus sampling |
| stop | No | Stop sequences |
| user | No | Truncated to 64 characters |

curl -X POST http://localhost:8080/v1/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-3.5-turbo-instruct",
    "prompt": "Hello, my name is",
    "max_tokens": 50
  }'

4) Embeddings

/v1/embeddings — no streaming. See Embeddings in Bifrost docs.

| Parameter | Required | Notes |
| --- | --- | --- |
| model | Yes | Model identifier |
| input | Yes | Text or array of texts |
| encoding_format | No | float or base64 |
| dimensions | No | Output embedding dimensions |
| user | No | Not truncated (unlike chat/text) |

curl -X POST http://localhost:8080/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/text-embedding-3-large",
    "input": "Hello world"
  }'
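
When encoding_format is base64, each embedding arrives as a string instead of a float array. The sketch below decodes either form. It assumes the base64 payload is a packed sequence of little-endian float32 values, which matches how OpenAI's API encodes embeddings but is not confirmed by this guide; the function name is hypothetical.

```python
import base64
import struct
from typing import Union


def decode_embedding(value: Union[str, list]) -> list[float]:
    """Accept either a float list or a base64 string of float32 values."""
    if isinstance(value, list):
        return [float(x) for x in value]
    raw = base64.b64decode(value)
    if len(raw) % 4 != 0:
        raise ValueError("base64 payload is not a whole number of float32 values")
    count = len(raw) // 4
    # Assumption: little-endian IEEE 754 float32, 4 bytes per dimension.
    return list(struct.unpack(f"<{count}f", raw))
```

The base64 form is more compact on the wire than a JSON float array, which matters mostly for large batches of inputs.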

5) Speech (Text-to-Speech)

/v1/audio/speech — returns raw binary audio; streaming via SSE base64 chunks where supported. See Speech in Bifrost docs.

| Parameter | Required | Notes |
| --- | --- | --- |
| model | Yes | tts-1 or tts-1-hd |
| input | Yes | Text to convert to speech |
| voice | Yes | alloy, echo, fable, onyx, nova, shimmer |
| response_format | No | mp3, opus, aac, flac, wav, pcm |
| speed | No | 0.25 to 4.0 (default 1.0) |

curl -X POST http://localhost:8080/v1/audio/speech \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/tts-1",
    "input": "Hello world",
    "voice": "alloy"
  }' --output speech.mp3
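
Checking the arguments on the client side before sending can save a round trip. The sketch below validates against the values in the table above and returns the JSON body; the function name `speech_payload` is hypothetical, and the allowed sets are copied from this guide rather than fetched from the API.

```python
VOICES = {"alloy", "echo", "fable", "onyx", "nova", "shimmer"}
FORMATS = {"mp3", "opus", "aac", "flac", "wav", "pcm"}


def speech_payload(
    text: str,
    voice: str,
    model: str = "openai/tts-1",
    response_format: str = "mp3",
    speed: float = 1.0,
) -> dict:
    """Validate arguments against the documented values and build the body."""
    if voice not in VOICES:
        raise ValueError(f"unknown voice {voice!r}")
    if response_format not in FORMATS:
        raise ValueError(f"unknown response_format {response_format!r}")
    if not 0.25 <= speed <= 4.0:
        raise ValueError("speed must be between 0.25 and 4.0")
    return {
        "model": model,
        "input": text,
        "voice": voice,
        "response_format": response_format,
        "speed": speed,
    }
```

The returned dict can be serialized with `json.dumps` and posted to /v1/audio/speech; the response body is raw audio in the requested format.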

6) Transcriptions (Speech-to-Text)

/v1/audio/transcriptions — multipart/form-data (not JSON). Formats: mp3, mp4, mpeg, mpga, m4a, wav, webm. Streaming supported. See Transcriptions in Bifrost docs.

| Parameter | Required | Notes |
| --- | --- | --- |
| file | Yes | Audio file (multipart/form-data) |
| model | Yes | e.g. whisper-1 |
| language | No | ISO-639-1 language code |
| prompt | No | Optional context |
| temperature | No | Sampling temperature |
| response_format | No | json, text, srt, vtt, verbose_json |

curl -X POST http://localhost:8080/v1/audio/transcriptions \
  -F file=@audio.mp3 \
  -F model=openai/whisper-1

7) Image Generation

/v1/images/generations — pass-through parameters; streaming via SSE (image_generation.partial_image, image_generation.completed). See Image Generation in Bifrost docs.

| Parameter | Required | Notes |
| --- | --- | --- |
| model | Yes | e.g. dall-e-3 |
| prompt | Yes | Image description |
| n | No | Number of images (1–10) |
| size | No | 256x256 through 1792x1024, auto |
| quality | No | auto, high, medium, low, hd, standard |
| style | No | natural, vivid |
| response_format | No | url or b64_json |
| background | No | transparent, opaque, auto |
| output_format | No | png, webp, jpeg |
| partial_images | No | Partial images 0–3 for streaming |

curl -X POST http://localhost:8080/v1/images/generations \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/dall-e-3",
    "prompt": "A serene landscape",
    "n": 1,
    "size": "1024x1024"
  }'

8) Image Edit

/v1/images/edits — multipart/form-data with image[], optional mask; streaming via image_edit.partial_image / image_edit.completed. See Image Edit in Bifrost docs.

| Parameter | Required | Notes |
| --- | --- | --- |
| model | Yes | Model identifier |
| prompt | Yes | Edit description |
| image[] | Yes | Image file(s) to edit (multipart) |
| mask | No | Mask image file |
| n | No | Number of images (1–10) |
| size | No | Output size |
| quality | No | Image quality |
| stream | No | Enable SSE streaming |
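
Because this endpoint takes multipart/form-data rather than JSON, a client has to encode the form itself if it is not using an HTTP library that does so. The sketch below builds such a body with the standard library only. The helper name `encode_multipart`, the file name, and the model `openai/gpt-image-1` are assumptions for illustration and do not come from this guide.

```python
import uuid


def encode_multipart(
    fields: dict[str, str],
    files: list[tuple[str, str, bytes, str]],
) -> tuple[bytes, str]:
    """Encode form fields and files; return (body, Content-Type header value).

    Each file is (field_name, filename, data, mime_type).
    """
    boundary = uuid.uuid4().hex
    chunks: list[bytes] = []
    for name, value in fields.items():
        chunks.append(
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f'{value}\r\n'.encode()
        )
    for name, filename, data, mime in files:
        header = (
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="{name}"; filename="{filename}"\r\n'
            f'Content-Type: {mime}\r\n\r\n'
        )
        chunks.append(header.encode() + data + b"\r\n")
    chunks.append(f"--{boundary}--\r\n".encode())  # closing delimiter
    return b"".join(chunks), f"multipart/form-data; boundary={boundary}"


# Placeholder bytes stand in for a real PNG; the model name is an assumption.
body, content_type = encode_multipart(
    {"model": "openai/gpt-image-1", "prompt": "Add a red scarf"},
    [("image[]", "photo.png", b"<png bytes>", "image/png")],
)
```

The body and header value can then be sent as a POST to /v1/images/edits with `Content-Type` set to `content_type`. A mask, if used, would be one more entry in the files list under the field name `mask`.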

9) Image Variation

/v1/images/variations — multipart/form-data; no streaming. Only the first image is sent upstream. See Image Variation in Bifrost docs.

| Parameter | Required | Notes |
| --- | --- | --- |
| model | Yes | Model identifier |
| image | Yes | Source image (multipart) |
| n | No | Number of variations (1–10) |
| size | No | Output size |
| response_format | No | url or b64_json |

10) Files API

Upload, list, retrieve, delete, and download files. See Files API in Bifrost docs.

| Parameter | Required | Notes |
| --- | --- | --- |
| file | Yes | File to upload (multipart) |
| purpose | Yes | batch, fine-tune, or assistants |
| filename | No | Custom filename (defaults to file.jsonl) |

  • GET /v1/files — list with purpose, limit, after, order
  • GET /v1/files/{file_id} — metadata
  • DELETE /v1/files/{file_id}
  • GET /v1/files/{file_id}/content — download
curl -X POST http://localhost:8080/v1/files \
  -F "file=@document.pdf" \
  -F "purpose=assistants"

11) Batch API

Async batch jobs at /v1/batches. Statuses: validating, failed, in_progress, finalizing, completed, expired, cancelling, cancelled. See Batch API in Bifrost docs.

| Parameter | Required | Notes |
| --- | --- | --- |
| input_file_id | Conditional | File ID or requests array (not both) |
| requests | Conditional | BatchRequestItem array (converted to JSONL) |
| endpoint | Yes | Target endpoint (e.g. /v1/chat/completions) |
| completion_window | No | 24h (default) |
| metadata | No | Custom metadata |

  • GET /v1/batches/{batch_id} — retrieve
  • POST /v1/batches/{batch_id}/cancel — cancel
  • Results: download output file via Files API when status is completed; parse JSONL BatchResultItem lines
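
To make the JSONL conversion and result parsing concrete, here is a minimal sketch. It assumes each request item has a `custom_id` and a `body`, following OpenAI's batch input line format (`custom_id`, `method`, `url`, `body`); the actual fields of Bifrost's BatchRequestItem and BatchResultItem may differ, and both function names are hypothetical.

```python
import json
from typing import Any


def requests_to_jsonl(requests: list[dict[str, Any]], endpoint: str) -> str:
    """Serialize request items to JSONL, one request per line."""
    lines = []
    for item in requests:
        lines.append(json.dumps({
            "custom_id": item["custom_id"],  # used to match results to requests
            "method": "POST",
            "url": endpoint,
            "body": item["body"],
        }))
    return "".join(line + "\n" for line in lines)


def parse_results(jsonl: str) -> dict[str, Any]:
    """Index result lines by custom_id, skipping blank lines."""
    results: dict[str, Any] = {}
    for line in jsonl.splitlines():
        if line.strip():
            record = json.loads(line)
            results[record["custom_id"]] = record
    return results
```

Indexing by `custom_id` matters because result lines are not guaranteed to come back in the order the requests were submitted.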

12) List Models

GET /v1/models — no request body. Model IDs in responses are prefixed with openai/; results aggregate across configured API keys. See List Models in Bifrost docs.

curl http://localhost:8080/v1/models

13) Video Generation

Sora-style video jobs at /v1/videos. Job statuses: queued → in_progress → completed / failed. See Video Generation in Bifrost docs.

| Parameter | Required | Notes |
| --- | --- | --- |
| model | Yes | e.g. sora-2 |
| prompt | Yes | Video description |
| input_reference | No | Base64 data URL only for image-to-video |
| seconds | No | Duration in seconds |
| size | No | 720x1280, 1280x720, 1024x1792, 1792x1024 |

| Operation | Endpoint | Notes |
| --- | --- | --- |
| Get status | GET /v1/videos/{id} | Poll until status: completed |
| Download | GET /v1/videos/{id}/content | Raw video bytes |
| Delete | DELETE /v1/videos/{id} | Remove video job |
| List jobs | GET /v1/videos | Query: after, limit, order |
| Remix | POST /v1/videos/{id}/remix | Body: {"prompt": "..."} |

curl -X POST http://localhost:8080/v1/videos \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/sora-2",
    "prompt": "A cat walking in the rain"
  }'
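
Since video jobs are asynchronous, a client typically polls the status endpoint until the job settles. The loop below is a sketch of that pattern. The status fetcher is injected as a callable so the HTTP call (a GET to /v1/videos/{id}) stays outside the example; the function name, polling interval, and attempt limit are all hypothetical choices.

```python
import time
from typing import Any, Callable


def wait_for_video(
    fetch_status: Callable[[str], dict[str, Any]],
    video_id: str,
    interval: float = 5.0,
    max_attempts: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> dict[str, Any]:
    """Poll until the job is completed or failed; raise if it never settles."""
    for _ in range(max_attempts):
        job = fetch_status(video_id)
        status = job.get("status")
        if status == "completed":
            return job
        if status == "failed":
            raise RuntimeError(f"video job {video_id} failed")
        # queued or in_progress: wait before the next poll
        sleep(interval)
    raise TimeoutError(f"video job {video_id} still pending after {max_attempts} polls")
```

Once the job is returned as completed, the video bytes can be downloaded from the content endpoint listed in the table above.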

Common error codes

HTTP status to OpenAI error type mapping from Bifrost docs.

| HTTP | Error type |
| --- | --- |
| 400 | invalid_request_error |
| 401 | authentication_error |
| 403 | permission_error |
| 404 | not_found_error |
| 429 | rate_limit_error |
| 500 | api_error |
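
A client can use this mapping to decide how to react to a failed request. The sketch below encodes the table and adds a retry hint. Treating 429 and 500 as retryable is an assumption based on common client practice, not something this guide states, and the `unknown_error` label and function name are hypothetical.

```python
ERROR_TYPES = {
    400: "invalid_request_error",
    401: "authentication_error",
    403: "permission_error",
    404: "not_found_error",
    429: "rate_limit_error",
    500: "api_error",
}

# Assumption: only rate limits and server errors are worth retrying.
RETRYABLE = {429, 500}


def classify_error(status: int) -> tuple[str, bool]:
    """Return (error_type, should_retry) for an HTTP status code."""
    error_type = ERROR_TYPES.get(status, "unknown_error")
    return error_type, status in RETRYABLE
```

Client errors such as 400 or 401 will fail again with the same input, so retrying them only adds load; retries for 429 are usually paired with a backoff delay.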

Implementation caveats

| Caveat | Impact | Severity |
| --- | --- | --- |
| User field truncation | User IDs over 64 characters are silently truncated | Low |
| Provider-specific field filtering | store, service_tier, prompt_cache_key filtered for non-OpenAI providers | Low |
| Cache control stripping | Cache control annotations stripped from messages when routing to non-OpenAI providers | Low |
| Reasoning model differences | gpt-oss models receive reasoning content blocks; others receive summaries | Medium |
| Token minimum enforcement | max_completion_tokens values below 16 automatically raised to 16 | Low |
