OpenAI on Bifrost: Models, Endpoints, Setup, and Mappings

OpenAI provider summary

Bifrost routes OpenAI models with full schema compatibility. It validates request parameters and filters out fields the downstream provider does not accept, so the same request body can be reused across providers in multi-provider setups.

Common OpenAI model IDs used in Bifrost routes:

gpt-4o-2024-11-20 (Latest)
gpt-4-turbo-2024-04-09 (Turbo)
gpt-3.5-turbo-0125 (Fast)
o1-2024-12-17 (Reasoning)

Property	Details
Description	OpenAI models for chat, reasoning, image generation, and audio tasks.
Provider route on Bifrost	openai/<model>
Provider doc	OpenAI API Reference
API endpoint for provider	https://api.openai.com
Supported endpoints	/v1/chat/completions, /v1/responses, /v1/completions, /v1/embeddings, /v1/audio/, /v1/images/, /v1/videos, /v1/files, /v1/batches, /v1/models

Supported operations

OpenAI is Bifrost's baseline schema: 13 operations across chat, Responses API, embeddings, audio, images, video, files, batch, and model listing. Streaming is available for chat, responses, text completions, speech, transcriptions, and image generation/edit. See Supported operations in Bifrost docs.

Operation	Non-streaming	Streaming	Upstream endpoint
Chat Completions	Yes	Yes	/v1/chat/completions
Responses API	Yes	Yes	/v1/responses
Text Completions	Yes	Yes	/v1/completions
Embeddings	Yes	—	/v1/embeddings
Speech (TTS)	Yes	Yes	/v1/audio/speech
Transcriptions (STT)	Yes	Yes	/v1/audio/transcriptions
Image Generation	Yes	Yes	/v1/images/generations
Image Edit	Yes	Yes	/v1/images/edits
Image Variation	Yes	—	/v1/images/variations
Files	Yes	—	/v1/files
Batch	Yes	—	/v1/batches
Video Generation	Yes	—	/v1/videos
List Models	Yes	—	/v1/models

Parameter handling

OpenAI parameters pass through with validation. Bifrost filters OpenAI-specific fields (store, service_tier, prompt_cache_key) when routing to non-OpenAI providers. The user field is truncated to 64 characters in chat and text completion operations.
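The filtering and truncation rules above can be illustrated with a small sketch. This is not Bifrost's implementation; the function and constant names are hypothetical, and the field list is taken from the chat parameter notes on this page.

```python
# Sketch only: mirrors the filtering and truncation rules described above.
# OPENAI_ONLY_FIELDS and prepare_chat_request are hypothetical names.
OPENAI_ONLY_FIELDS = {"store", "service_tier", "prompt_cache_key"}
USER_MAX_CHARS = 64


def prepare_chat_request(request: dict, target_provider: str) -> dict:
    """Return a copy of the request adjusted for the target provider."""
    prepared = dict(request)
    if target_provider != "openai":
        # Drop fields that only OpenAI understands.
        for field in OPENAI_ONLY_FIELDS:
            prepared.pop(field, None)
    user = prepared.get("user")
    if isinstance(user, str):
        # Chat/text operations: keep at most 64 characters.
        prepared["user"] = user[:USER_MAX_CHARS]
    return prepared


example = prepare_chat_request(
    {"model": "gpt-4o-2024-11-20", "store": True, "user": "u" * 80},
    target_provider="anthropic",
)
print(sorted(example))       # ['model', 'user']
print(len(example["user"]))  # 64
```

Because truncation is silent, callers that rely on long user identifiers may want to hash or shorten them before sending.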

Reasoning support (o1/o3 models only):

Non-gpt-oss models: reasoning summary converted from content blocks
gpt-oss variants: reasoning content blocks passed directly
Minimum budget enforced for structured output conversion

Token enforcement:

max_completion_tokens and max_output_tokens have an enforced minimum of 16 tokens
Values below 16 are automatically raised to 16
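A minimal sketch of the token floor, assuming the rule applies only when one of the two fields is present. The helper name is hypothetical and does not come from Bifrost.

```python
# Sketch only: raises too-small token limits to the documented floor of 16.
MIN_OUTPUT_TOKENS = 16


def enforce_token_minimum(request: dict) -> dict:
    """Return a copy of the request with token limits raised to at least 16."""
    adjusted = dict(request)
    for key in ("max_completion_tokens", "max_output_tokens"):
        value = adjusted.get(key)
        if isinstance(value, int) and value < MIN_OUTPUT_TOKENS:
            adjusted[key] = MIN_OUTPUT_TOKENS
    return adjusted


print(enforce_token_minimum({"max_completion_tokens": 5}))   # {'max_completion_tokens': 16}
print(enforce_token_minimum({"max_output_tokens": 1024}))    # {'max_output_tokens': 1024}
```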

Supported OpenAI parameters

Quick reference of OpenAI parameters accepted when routing through Bifrost.

[
  "stream",
  "temperature",
  "top_p",
  "top_k",
  "max_tokens",
  "max_completion_tokens",
  "stop",
  "presence_penalty",
  "frequency_penalty",
  "logit_bias",
  "logprobs",
  "top_logprobs",
  "seed",
  "response_format",
  "tools",
  "tool_choice",
  "user",
  "reasoning",
  "parallel_tool_calls",
  "service_tier"
]

Supported OpenAI models

Use the provider prefix openai/ in Bifrost model routes for deterministic provider targeting.

Family	Model ID	Bifrost route	Typical usage
GPT-4o	gpt-4o-2024-11-20	openai/gpt-4o-2024-11-20	Flagship multimodal model
GPT-4 Turbo	gpt-4-turbo-2024-04-09	openai/gpt-4-turbo-2024-04-09	Previous generation turbo
GPT-4	gpt-4-0613	openai/gpt-4-0613	Baseline GPT-4
GPT-3.5 Turbo	gpt-3.5-turbo-0125	openai/gpt-3.5-turbo-0125	Fast, lower-cost option
O1	o1-2024-12-17	openai/o1-2024-12-17	Extended reasoning model
O1-preview	o1-preview-2024-09-12	openai/o1-preview-2024-09-12	Earlier reasoning preview
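Client code often needs to build or split the openai/<model> route shown in the table. The helpers below are a hypothetical sketch; they assume the first slash separates the provider prefix from the model ID.

```python
# Sketch only: helper names are illustrative, not part of any Bifrost SDK.
def make_route(model_id: str, provider: str = "openai") -> str:
    """Build a provider-prefixed route such as 'openai/gpt-4o-2024-11-20'."""
    return f"{provider}/{model_id}"


def split_route(route: str) -> tuple[str, str]:
    """Split a route into (provider, model_id) at the first slash."""
    provider, sep, model_id = route.partition("/")
    if not sep or not provider or not model_id:
        raise ValueError(f"expected '<provider>/<model>', got {route!r}")
    return provider, model_id


print(make_route("o1-2024-12-17"))              # openai/o1-2024-12-17
print(split_route("openai/gpt-4-0613"))         # ('openai', 'gpt-4-0613')
```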

API reference

OpenAI is Bifrost's baseline schema: parameters pass through with validation and filtering. Gateway routes map 1:1 to upstream OpenAI endpoints. Content aligned with Bifrost OpenAI provider docs.

1) Chat Completions

Primary chat path at /v1/chat/completions. See Chat Completions in Bifrost docs.

Parameter	Required	Notes
model	Yes	Model identifier
messages	Yes	ChatMessage array; roles: system, user, assistant, tool, developer
temperature	No	Sampling temperature (0–2)
top_p	No	Nucleus sampling
stop	No	Stop sequences
max_completion_tokens	No	Min 16 enforced by Bifrost
frequency_penalty	No	Frequency penalty (-2 to 2)
presence_penalty	No	Presence penalty (-2 to 2)
logit_bias	No	Token logit adjustments
logprobs	No	Include log probabilities
top_logprobs	No	Log probabilities per token
seed	No	Reproducibility seed
response_format	No	Structured output format
tools	No	Function tools; tool_choice: auto, none, required, or specific
parallel_tool_calls	No	Multiple simultaneous tool calls
stream_options	No	Streaming options; include_usage set by default
reasoning	No	reasoning.effort and reasoning.max_tokens passed through
user	No	Truncated to 64 characters
metadata	No	Custom metadata
store	No	Filtered when routing to non-OpenAI providers
service_tier	No	Filtered when routing to non-OpenAI providers
prompt_cache_key	No	Filtered when routing to non-OpenAI providers
prediction	No	Predicted output for acceleration
audio	No	Audio output config
modalities	No	Response modalities (text, audio)

Messages: text, image_url, input_audio; tool messages include tool_call_id
Streaming: SSE with delta.content, delta.tool_calls, finish_reason, usage on final chunk
cache_control stripped from messages, content blocks, and tools
Reasoning: effort minimal/low/medium/high; minimal → low when routing to other providers

curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o-2024-11-20",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
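When stream is enabled, the response arrives as SSE lines. The sketch below reassembles the text from delta.content and picks up finish_reason and usage. It assumes the OpenAI wire convention of "data: <json>" lines ending with a "data: [DONE]" sentinel; the sample lines are made up for illustration.

```python
import json


def collect_chat_stream(lines):
    """Accumulate text, finish_reason, and usage from chat SSE lines."""
    text_parts = []
    finish_reason = None
    usage = None
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank lines and comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":  # assumed end-of-stream sentinel
            break
        chunk = json.loads(payload)
        if chunk.get("usage"):
            usage = chunk["usage"]
        for choice in chunk.get("choices", []):
            delta = choice.get("delta", {})
            if delta.get("content"):
                text_parts.append(delta["content"])
            if choice.get("finish_reason"):
                finish_reason = choice["finish_reason"]
    return "".join(text_parts), finish_reason, usage


sample = [
    'data: {"choices":[{"delta":{"content":"Hel"},"finish_reason":null}]}',
    'data: {"choices":[{"delta":{"content":"lo"},"finish_reason":"stop"}]}',
    'data: {"choices":[],"usage":{"total_tokens":7}}',
    "data: [DONE]",
]
print(collect_chat_stream(sample))  # ('Hello', 'stop', {'total_tokens': 7})
```

Tool-call deltas (delta.tool_calls) are ignored here to keep the sketch short; a real client would accumulate them per call index.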

2) Responses API

Structured output API at /v1/responses. Non-gpt-oss models use reasoning summaries; gpt-oss uses reasoning content blocks. See Responses API in Bifrost docs.

Parameter	Required	Notes
model	Yes	Model identifier
input	Yes	Text or ContentBlock array
max_output_tokens	No	Min 16 enforced by Bifrost
instructions	No	System instructions
tools / tool_choice	No	ResponsesTool objects and choice strategy
reasoning	No	reasoning.max_tokens removed from upstream JSON
temperature	No	Sampling temperature
top_p	No	Nucleus sampling
parallel_tool_calls	No	Multiple simultaneous tool calls
previous_response_id	No	Continue from prior response
conversation	No	Conversation ID
background	No	Background mode
include	No	Extra fields in response (e.g. web_search sources)
truncation	No	auto or off
user	No	Truncated to 64 characters
store	No	Store response for later retrieval
stream_options	No	include_usage set by default for streaming

Supported tool types: function, file_search, computer_use_preview, web_search, mcp, code_interpreter, image_generation, local_shell, custom, web_search_preview. Action types zoom and region are mapped to screenshot. Response: id, status, output, usage.

SSE event	Description
response.created	Response created
response.in_progress	In progress
response.output_item.added	Output item added
response.content_part.added	Content part added
response.output_text.delta	Text delta
response.function_call_arguments.delta	Function call arguments delta
response.completed	Completed
response.incomplete	Incomplete

curl -X POST http://localhost:8080/v1/responses \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o-2024-11-20",
    "input": "Hello",
    "max_output_tokens": 1024
  }'
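For streamed Responses API calls, the events in the table above can be folded into a final text and status. This sketch assumes each "data:" line carries a JSON object with a type field matching the event name and, for text deltas, a delta string; the sample events are invented.

```python
import json


def collect_response_text(lines):
    """Join response.output_text.delta events and report the final status."""
    parts = []
    status = None
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # "event:" lines and blanks are not needed here
        event = json.loads(line[len("data:"):].strip())
        kind = event.get("type")
        if kind == "response.output_text.delta":
            parts.append(event.get("delta", ""))
        elif kind in ("response.completed", "response.incomplete"):
            status = kind.rsplit(".", 1)[-1]
    return "".join(parts), status


sample_events = [
    "event: response.created",
    'data: {"type":"response.created"}',
    'data: {"type":"response.output_text.delta","delta":"Hel"}',
    'data: {"type":"response.output_text.delta","delta":"lo"}',
    'data: {"type":"response.completed"}',
]
print(collect_response_text(sample_events))  # ('Hello', 'completed')
```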

3) Text Completions (Legacy)

Legacy API at /v1/completions — prefer Chat Completions for new work. Supports streaming. See Text Completions in Bifrost docs.

Parameter	Required	Notes
model	Yes	Model identifier
prompt	Yes	Completion prompt(s); array prompts → multiple completions
max_tokens	No	Maximum output tokens
temperature	No	Sampling temperature
top_p	No	Nucleus sampling
stop	No	Stop sequences
user	No	Truncated to 64 characters

curl -X POST http://localhost:8080/v1/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-3.5-turbo-instruct",
    "prompt": "Hello, my name is",
    "max_tokens": 50
  }'

4) Embeddings

/v1/embeddings — no streaming. See Embeddings in Bifrost docs.

Parameter	Required	Notes
model	Yes	Model identifier
input	Yes	Text or array of texts
encoding_format	No	float or base64
dimensions	No	Output embedding dimensions
user	No	Not truncated (unlike chat/text)

curl -X POST http://localhost:8080/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/text-embedding-3-large",
    "input": "Hello world"
  }'

5) Speech (Text-to-Speech)

/v1/audio/speech — returns raw binary audio; streaming via SSE base64 chunks where supported. See Speech in Bifrost docs.

Parameter	Required	Notes
model	Yes	tts-1 or tts-1-hd
input	Yes	Text to convert to speech
voice	Yes	alloy, echo, fable, onyx, nova, shimmer
response_format	No	mp3, opus, aac, flac, wav, pcm
speed	No	0.25 to 4.0 (default 1.0)

curl -X POST http://localhost:8080/v1/audio/speech \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/tts-1",
    "input": "Hello world",
    "voice": "alloy"
  }' --output speech.mp3

6) Transcriptions (Speech-to-Text)

/v1/audio/transcriptions — multipart/form-data (not JSON). Formats: mp3, mp4, mpeg, mpga, m4a, wav, webm. Streaming supported. See Transcriptions in Bifrost docs.

Parameter	Required	Notes
file	Yes	Audio file (multipart/form-data)
model	Yes	e.g. whisper-1
language	No	ISO-639-1 language code
prompt	No	Optional context
temperature	No	Sampling temperature
response_format	No	json, text, srt, vtt, verbose_json

curl -X POST http://localhost:8080/v1/audio/transcriptions \
  -F file=@audio.mp3 \
  -F model=openai/whisper-1

7) Image Generation

/v1/images/generations — pass-through parameters; streaming via SSE (image_generation.partial_image, image_generation.completed). See Image Generation in Bifrost docs.

Parameter	Required	Notes
model	Yes	e.g. dall-e-3
prompt	Yes	Image description
n	No	Number of images (1–10)
size	No	256x256 through 1792x1024, auto
quality	No	auto, high, medium, low, hd, standard
style	No	natural, vivid
response_format	No	url or b64_json
background	No	transparent, opaque, auto
output_format	No	png, webp, jpeg
partial_images	No	Partial images 0–3 for streaming

curl -X POST http://localhost:8080/v1/images/generations \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/dall-e-3",
    "prompt": "A serene landscape",
    "n": 1,
    "size": "1024x1024"
  }'

8) Image Edit

/v1/images/edits — multipart/form-data with image[], optional mask; streaming via image_edit.partial_image / image_edit.completed. See Image Edit in Bifrost docs.

Parameter	Required	Notes
model	Yes	Model identifier
prompt	Yes	Edit description
image[]	Yes	Image file(s) to edit (multipart)
mask	No	Mask image file
n	No	Number of images (1–10)
size	No	Output size
quality	No	Image quality
stream	No	Enable SSE streaming

9) Image Variation

/v1/images/variations — multipart/form-data; no streaming. Only the first image is sent upstream. See Image Variation in Bifrost docs.

Parameter	Required	Notes
model	Yes	Model identifier
image	Yes	Source image (multipart)
n	No	Number of variations (1–10)
size	No	Output size
response_format	No	url or b64_json

10) Files API

Upload, list, retrieve, delete, and download files. See Files API in Bifrost docs.

Parameter	Required	Notes
file	Yes	File to upload (multipart)
purpose	Yes	batch, fine-tune, or assistants
filename	No	Custom filename (defaults to file.jsonl)

GET /v1/files — list with purpose, limit, after, order
GET /v1/files/{file_id} — metadata
DELETE /v1/files/{file_id}
GET /v1/files/{file_id}/content — download

curl -X POST http://localhost:8080/v1/files \
  -F "file=@document.pdf" \
  -F "purpose=assistants"

11) Batch API

Async batch jobs at /v1/batches. Statuses: validating, failed, in_progress, finalizing, completed, expired, cancelling, cancelled. See Batch API in Bifrost docs.

Parameter	Required	Notes
input_file_id	Conditional	File ID or requests array (not both)
requests	Conditional	BatchRequestItem array (converted to JSONL)
endpoint	Yes	Target endpoint (e.g. /v1/chat/completions)
completion_window	No	24h (default)
metadata	No	Custom metadata

GET /v1/batches/{batch_id} — retrieve
POST /v1/batches/{batch_id}/cancel — cancel
Results: download output file via Files API when status is completed; parse JSONL BatchResultItem lines
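The JSONL conversion and result parsing can be sketched as follows. The item fields (custom_id, method, url, body) follow OpenAI's batch input format and are assumed to match BatchRequestItem; the helper names are hypothetical.

```python
import json

# Statuses after which a batch no longer changes (subset of the list above).
TERMINAL_STATUSES = {"completed", "failed", "expired", "cancelled"}


def to_jsonl(requests):
    """Serialize request items as one JSON object per line."""
    return "".join(json.dumps(item) + "\n" for item in requests)


def parse_results(jsonl_text):
    """Parse a downloaded output file into a list of result dicts."""
    return [json.loads(line) for line in jsonl_text.splitlines() if line.strip()]


items = [
    {
        "custom_id": "req-1",
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": "gpt-4o-2024-11-20",
            "messages": [{"role": "user", "content": "Hello"}],
        },
    }
]
payload = to_jsonl(items)
print(len(payload.splitlines()))                 # 1
print(parse_results(payload)[0]["custom_id"])    # req-1
```

A polling loop would stop once the batch status is in TERMINAL_STATUSES and download the output file only when it is completed.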

12) List Models

GET /v1/models — no request body. Model IDs in responses are prefixed with openai/; results aggregate across configured API keys. See List Models in Bifrost docs.

curl http://localhost:8080/v1/models

13) Video Generation

Sora-style video jobs at /v1/videos. Job statuses: queued → in_progress → completed / failed. See Video Generation in Bifrost docs.

Parameter	Required	Notes
model	Yes	e.g. sora-2
prompt	Yes	Video description
input_reference	No	Base64 data URL only for image-to-video
seconds	No	Duration in seconds
size	No	720x1280, 1280x720, 1024x1792, 1792x1024

Operation	Endpoint	Notes
Get status	GET /v1/videos/{id}	Poll until status: completed
Download	GET /v1/videos/{id}/content	Raw video bytes
Delete	DELETE /v1/videos/{id}	Remove video job
List jobs	GET /v1/videos	Query: after, limit, order
Remix	POST /v1/videos/{id}/remix	Body: {"prompt": "..."}

curl -X POST http://localhost:8080/v1/videos \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/sora-2",
    "prompt": "A cat walking in the rain"
  }'
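Video jobs are asynchronous, so a client polls the status endpoint until the job finishes. The sketch below keeps the HTTP call out of the loop: get_status is a caller-supplied function (hypothetical) that would issue GET /v1/videos/{id} and return the status string. The interval and poll limit are arbitrary defaults.

```python
import time
from typing import Callable


def wait_for_video(
    get_status: Callable[[], str],
    interval_s: float = 5.0,
    max_polls: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the job reaches completed or failed, then return that status."""
    for _ in range(max_polls):
        status = get_status()
        if status in ("completed", "failed"):
            return status
        sleep(interval_s)  # still queued or in_progress
    raise TimeoutError(f"video job not finished after {max_polls} polls")


# Simulated status sequence; no network access or real sleeping involved.
statuses = iter(["queued", "in_progress", "completed"])
print(wait_for_video(lambda: next(statuses), sleep=lambda _s: None))  # completed
```

Once the status is completed, the bytes can be fetched from GET /v1/videos/{id}/content.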

Common error codes

HTTP status to OpenAI error type mapping from Bifrost docs.

HTTP	Error type
400	invalid_request_error
401	authentication_error
403	permission_error
404	not_found_error
429	rate_limit_error
500	api_error
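The mapping above translates directly into a lookup table for client-side handling. The retry classification is an assumption added for illustration, not something the Bifrost docs state, and "unknown_error" is a placeholder label for unlisted status codes.

```python
# HTTP status to error type, copied from the table above.
ERROR_TYPES = {
    400: "invalid_request_error",
    401: "authentication_error",
    403: "permission_error",
    404: "not_found_error",
    429: "rate_limit_error",
    500: "api_error",
}

# Assumption: rate limits and server errors are usually worth retrying.
RETRYABLE_STATUSES = {429, 500}


def describe_error(status: int) -> str:
    """Return the error type for a status, or a placeholder if unlisted."""
    return ERROR_TYPES.get(status, "unknown_error")


def is_retryable(status: int) -> bool:
    return status in RETRYABLE_STATUSES


print(describe_error(429), is_retryable(429))  # rate_limit_error True
print(describe_error(401), is_retryable(401))  # authentication_error False
```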

Implementation caveats

Caveat	Impact	Severity
User field truncation	User IDs over 64 characters are silently truncated	Low
Provider-specific field filtering	store, service_tier, prompt_cache_key filtered for non-OpenAI	Low
Cache control stripping	Cache control annotations stripped from messages when routing to non-OpenAI	Low
Reasoning model differences	gpt-oss models receive reasoning content blocks; others receive summaries	Medium
Token minimum enforcement	max_completion_tokens and max_output_tokens values below 16 are automatically raised to 16	Low