OpenAI provider summary
Bifrost routes OpenAI models with full schema compatibility. It validates request parameters and filters them against each downstream provider's requirements, so the same request can be sent unchanged in a multi-provider setup.
Common OpenAI model IDs used in Bifrost routes:
- gpt-4o-2024-11-20 (Latest)
- gpt-4-turbo-2024-04-09 (Turbo)
- gpt-3.5-turbo-0125 (Fast)
- o1-2024-12-17 (Reasoning)
| Property | Details |
|---|---|
| Description | OpenAI models for chat, reasoning, image generation, and audio tasks. |
| Provider route on Bifrost | openai/<model> |
| Provider doc | OpenAI API Reference |
| API endpoint for provider | https://api.openai.com |
| Supported endpoints | /v1/chat/completions, /v1/responses, /v1/completions, /v1/embeddings, /v1/audio/*, /v1/images/*, /v1/videos, /v1/files, /v1/batches, /v1/models |
Supported operations
OpenAI is Bifrost's baseline schema: 13 operations across chat, Responses API, embeddings, audio, images, video, files, batch, and model listing. Streaming is available for chat, responses, text completions, speech, transcriptions, and image generation/edit. See Supported operations in Bifrost docs.
| Operation | Non-streaming | Streaming | Upstream endpoint |
|---|---|---|---|
| Chat Completions | Yes | Yes | /v1/chat/completions |
| Responses API | Yes | Yes | /v1/responses |
| Text Completions | Yes | Yes | /v1/completions |
| Embeddings | Yes | — | /v1/embeddings |
| Speech (TTS) | Yes | Yes | /v1/audio/speech |
| Transcriptions (STT) | Yes | Yes | /v1/audio/transcriptions |
| Image Generation | Yes | Yes | /v1/images/generations |
| Image Edit | Yes | Yes | /v1/images/edits |
| Image Variation | Yes | — | /v1/images/variations |
| Files | Yes | — | /v1/files |
| Batch | Yes | — | /v1/batches |
| Video Generation | Yes | — | /v1/videos |
| List Models | Yes | — | /v1/models |
Parameter handling
OpenAI parameters pass through with validation. When a request is routed to a non-OpenAI provider, Bifrost filters out the OpenAI-specific fields store, service_tier, and prompt_cache_key. The user field is truncated to 64 characters in chat and text completion operations.
Reasoning support (o1/o3 models only):
- Non-gpt-oss models: reasoning summary converted from content blocks
- gpt-oss variants: reasoning content blocks passed directly
- Minimum budget enforced for structured output conversion
Token enforcement:
- max_completion_tokens and max_output_tokens enforce a 16-token minimum
- Values below 16 automatically scale up to 16
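The adjustments described in this section can be sketched as one small function. This is a minimal illustration written for this summary, not Bifrost's implementation: the function name `normalize_request`, the `provider` argument, and the constant names are hypothetical, and the sketch only covers the three behaviours stated above (field filtering, the 16-token minimum, and user truncation for chat and text operations).

```python
from typing import Any

# Fields this page lists as filtered when routing to non-OpenAI providers.
OPENAI_ONLY_FIELDS = ("store", "service_tier", "prompt_cache_key")
TOKEN_FIELDS = ("max_completion_tokens", "max_output_tokens")
MIN_TOKENS = 16
MAX_USER_LENGTH = 64


def normalize_request(params: dict[str, Any], provider: str) -> dict[str, Any]:
    """Return a copy of params with the documented adjustments applied.

    Hypothetical helper for illustration; it does not call Bifrost.
    """
    out = dict(params)

    # Drop OpenAI-only fields when the target is a different provider.
    if provider != "openai":
        for field in OPENAI_ONLY_FIELDS:
            out.pop(field, None)

    # Raise token limits that fall below the minimum up to 16.
    for field in TOKEN_FIELDS:
        value = out.get(field)
        if isinstance(value, int) and value < MIN_TOKENS:
            out[field] = MIN_TOKENS

    # Truncate the user field to 64 characters.
    user = out.get("user")
    if isinstance(user, str) and len(user) > MAX_USER_LENGTH:
        out["user"] = user[:MAX_USER_LENGTH]

    return out
```

Running a request through such a helper before sending it shows what the upstream provider would receive, which helps explain why a 100-character user ID or a `max_completion_tokens` of 4 appears changed in logs.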
Supported OpenAI parameters
Quick reference of OpenAI parameters accepted when routing through Bifrost.
[ "stream", "temperature", "top_p", "top_k", "max_tokens", "max_completion_tokens", "stop", "presence_penalty", "frequency_penalty", "logit_bias", "logprobs", "top_logprobs", "seed", "response_format", "tools", "tool_choice", "user", "reasoning", "parallel_tool_calls", "service_tier" ]
Supported OpenAI models
Use the provider prefix openai/ in Bifrost model routes for deterministic provider targeting.
| Family | Model ID | Bifrost route | Typical usage |
|---|---|---|---|
| GPT-4o | gpt-4o-2024-11-20 | openai/gpt-4o-2024-11-20 | Flagship multimodal model |
| GPT-4 Turbo | gpt-4-turbo-2024-04-09 | openai/gpt-4-turbo-2024-04-09 | Previous generation turbo |
| GPT-4 | gpt-4-0613 | openai/gpt-4-0613 | Baseline GPT-4 |
| GPT-3.5 Turbo | gpt-3.5-turbo-0125 | openai/gpt-3.5-turbo-0125 | Fast, lower-cost option |
| O1 | o1-2024-12-17 | openai/o1-2024-12-17 | Extended reasoning model |
| O1-preview | o1-preview-2024-09-12 | openai/o1-preview-2024-09-12 | Earlier reasoning preview |
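A short sketch of how the `openai/<model>` route convention can be handled in client code. The helper names `split_route` and `to_route` are hypothetical and exist only for this example; the sketch assumes the provider is everything before the first slash.

```python
def split_route(route: str) -> tuple[str, str]:
    """Split a 'provider/model' route into (provider, model)."""
    # Partition on the first slash only, so model IDs that themselves
    # contain slashes stay intact in the model part.
    provider, sep, model = route.partition("/")
    if not sep or not provider or not model:
        raise ValueError(f"expected '<provider>/<model>', got {route!r}")
    return provider, model


def to_route(model_id: str, provider: str = "openai") -> str:
    """Build a route string from a bare model ID."""
    return f"{provider}/{model_id}"
```

For example, `to_route("o1-2024-12-17")` produces the route shown in the table above, and `split_route` rejects a bare model ID such as `gpt-4-0613` that lacks a provider prefix.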
API reference
OpenAI is Bifrost's baseline schema: parameters pass through with validation and filtering. Gateway routes map 1:1 to upstream OpenAI endpoints. Content aligned with Bifrost OpenAI provider docs.
1) Chat Completions
Primary chat path at /v1/chat/completions. See Chat Completions in Bifrost docs.
| Parameter | Required | Notes |
|---|---|---|
| model | Yes | Model identifier |
| messages | Yes | ChatMessage array; roles: system, user, assistant, tool, developer |
| temperature | No | Sampling temperature (0–2) |
| top_p | No | Nucleus sampling |
| stop | No | Stop sequences |
| max_completion_tokens | No | Min 16 enforced by Bifrost |
| frequency_penalty | No | Frequency penalty (-2 to 2) |
| presence_penalty | No | Presence penalty (-2 to 2) |
| logit_bias | No | Token logit adjustments |
| logprobs | No | Include log probabilities |
| top_logprobs | No | Log probabilities per token |
| seed | No | Reproducibility seed |
| response_format | No | Structured output format |
| tools | No | Function tools; tool_choice: auto, none, required, or specific |
| parallel_tool_calls | No | Multiple simultaneous tool calls |
| stream_options | No | Streaming options; include_usage set by default |
| reasoning | No | reasoning.effort and reasoning.max_tokens passed through |
| user | No | Truncated to 64 characters |
| metadata | No | Custom metadata |
| store | No | Filtered when routing to non-OpenAI providers |
| service_tier | No | Filtered when routing to non-OpenAI providers |
| prompt_cache_key | No | Filtered when routing to non-OpenAI providers |
| prediction | No | Predicted output for acceleration |
| audio | No | Audio output config |
| modalities | No | Response modalities (text, audio) |
- Messages: text, image_url, input_audio; tool messages include tool_call_id
- Streaming: SSE with delta.content, delta.tool_calls, finish_reason, usage on final chunk
- cache_control stripped from messages, content blocks, and tools
- Reasoning: effort minimal/low/medium/high; minimal → low when routing to other providers
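The streaming behaviour listed above can be consumed with a small accumulator. The sketch below assumes the gateway relays OpenAI's wire format unchanged: each event arrives as a `data: {json}` line and the stream ends with `data: [DONE]`. The function name `collect_chat_stream` is hypothetical, and the input is any iterable of already-decoded text lines, so no HTTP client is implied.

```python
import json
from typing import Iterable, Optional


def collect_chat_stream(lines: Iterable[str]) -> dict:
    """Fold streamed chat chunks into text, finish_reason and usage."""
    text_parts: list[str] = []
    finish_reason: Optional[str] = None
    usage: Optional[dict] = None

    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # assumed end-of-stream sentinel
        chunk = json.loads(payload)

        # Usage is reported on the final chunk when include_usage is set.
        if chunk.get("usage"):
            usage = chunk["usage"]

        for choice in chunk.get("choices", []):
            delta = choice.get("delta") or {}
            if delta.get("content"):
                text_parts.append(delta["content"])
            if choice.get("finish_reason"):
                finish_reason = choice["finish_reason"]

    return {
        "text": "".join(text_parts),
        "finish_reason": finish_reason,
        "usage": usage,
    }
```

Tool-call deltas (`delta.tool_calls`) are left out to keep the sketch short; they would need to be merged per call index in the same loop.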
curl -X POST http://localhost:8080/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "openai/gpt-4o-2024-11-20",
"messages": [{"role": "user", "content": "Hello"}]
}'
2) Responses API
Structured output API at /v1/responses. Non-gpt-oss models use reasoning summaries; gpt-oss uses reasoning content blocks. See Responses API in Bifrost docs.
| Parameter | Required | Notes |
|---|---|---|
| model | Yes | Model identifier |
| input | Yes | Text or ContentBlock array |
| max_output_tokens | Yes | Min 16 enforced by Bifrost |
| instructions | No | System instructions |
| tools / tool_choice | No | ResponsesTool objects and choice strategy |
| reasoning | No | reasoning.max_tokens removed from upstream JSON |
| temperature | No | Sampling temperature |
| top_p | No | Nucleus sampling |
| parallel_tool_calls | No | Multiple simultaneous tool calls |
| previous_response_id | No | Continue from prior response |
| conversation | No | Conversation ID |
| background | No | Background mode |
| include | No | Extra fields in response (e.g. web_search sources) |
| truncation | No | auto or off |
| user | No | Truncated to 64 characters |
| store | No | Store response for later retrieval |
| stream_options | No | include_usage set by default for streaming |
Supported tool types: function, file_search, computer_use_preview, web_search, mcp, code_interpreter, image_generation, local_shell, custom, web_search_preview. Action types zoom/region → screenshot. Response: id, status, output, usage.
| SSE event | Description |
|---|---|
| response.created | Response created |
| response.in_progress | In progress |
| response.output_item.added | Output item added |
| response.content_part.added | Content part added |
| response.output_text.delta | Text delta |
| response.function_call_arguments.delta | Function call arguments delta |
| response.completed | Completed |
| response.incomplete | Incomplete |
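A minimal sketch of how a client might fold the events in the table above into a final result. It assumes each event has already been decoded into a dict with a `type` field and, for text deltas, a `delta` string, which follows OpenAI's event shape; the function name `collect_response_events` is hypothetical.

```python
from typing import Iterable, Optional

# Events from the table above that end the stream.
TERMINAL_EVENTS = ("response.completed", "response.incomplete")


def collect_response_events(events: Iterable[dict]) -> dict:
    """Accumulate text deltas until a terminal event arrives."""
    text_parts: list[str] = []
    final_event: Optional[str] = None

    for event in events:
        kind = event.get("type")
        if kind == "response.output_text.delta":
            text_parts.append(event.get("delta", ""))
        elif kind in TERMINAL_EVENTS:
            final_event = kind
            break  # nothing after a terminal event is processed
        # Lifecycle events such as response.created are ignored here.

    return {"text": "".join(text_parts), "final_event": final_event}
```

A `final_event` of `response.incomplete` signals that the accumulated text may be cut short, so callers would typically check it before using the result.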
curl -X POST http://localhost:8080/v1/responses \
-H "Content-Type: application/json" \
-d '{
"model": "openai/gpt-4o-2024-11-20",
"input": "Hello",
"max_output_tokens": 1024
}'
3) Text Completions (Legacy)
Legacy API at /v1/completions — prefer Chat Completions for new work. Supports streaming. See Text Completions in Bifrost docs.
| Parameter | Required | Notes |
|---|---|---|
| model | Yes | Model identifier |
| prompt | Yes | Completion prompt(s); array prompts → multiple completions |
| max_tokens | No | Maximum output tokens |
| temperature | No | Sampling temperature |
| top_p | No | Nucleus sampling |
| stop | No | Stop sequences |
| user | No | Truncated to 64 characters |
curl -X POST http://localhost:8080/v1/completions \
-H "Content-Type: application/json" \
-d '{
"model": "openai/gpt-3.5-turbo-0125",
"prompt": "Hello, my name is",
"max_tokens": 50
}'
4) Embeddings
/v1/embeddings — no streaming. See Embeddings in Bifrost docs.
| Parameter | Required | Notes |
|---|---|---|
| model | Yes | Model identifier |
| input | Yes | Text or array of texts |
| encoding_format | No | float or base64 |
| dimensions | No | Output embedding dimensions |
| user | No | Not truncated (unlike chat/text) |
curl -X POST http://localhost:8080/v1/embeddings \
-H "Content-Type: application/json" \
-d '{
"model": "openai/text-embedding-3-large",
"input": "Hello world"
}'
5) Speech (Text-to-Speech)
/v1/audio/speech — returns raw binary audio; streaming via SSE base64 chunks where supported. See Speech in Bifrost docs.
| Parameter | Required | Notes |
|---|---|---|
| model | Yes | tts-1 or tts-1-hd |
| input | Yes | Text to convert to speech |
| voice | Yes | alloy, echo, fable, onyx, nova, shimmer |
| response_format | No | mp3, opus, aac, flac, wav, pcm |
| speed | No | 0.25 to 4.0 (default 1.0) |
curl -X POST http://localhost:8080/v1/audio/speech \
-H "Content-Type: application/json" \
-d '{
"model": "openai/tts-1",
"input": "Hello world",
"voice": "alloy"
}' --output speech.mp3
6) Transcriptions (Speech-to-Text)
/v1/audio/transcriptions — multipart/form-data (not JSON). Formats: mp3, mp4, mpeg, mpga, m4a, wav, webm. Streaming supported. See Transcriptions in Bifrost docs.
| Parameter | Required | Notes |
|---|---|---|
| file | Yes | Audio file (multipart/form-data) |
| model | Yes | e.g. whisper-1 |
| language | No | ISO-639-1 language code |
| prompt | No | Optional context |
| temperature | No | Sampling temperature |
| response_format | No | json, text, srt, vtt, verbose_json |
curl -X POST http://localhost:8080/v1/audio/transcriptions \
-F file=@audio.mp3 \
-F model=openai/whisper-1
7) Image Generation
/v1/images/generations — pass-through parameters; streaming via SSE (image_generation.partial_image, image_generation.completed). See Image Generation in Bifrost docs.
| Parameter | Required | Notes |
|---|---|---|
| model | Yes | e.g. dall-e-3 |
| prompt | Yes | Image description |
| n | No | Number of images (1–10) |
| size | No | 256x256 through 1792x1024, auto |
| quality | No | auto, high, medium, low, hd, standard |
| style | No | natural, vivid |
| response_format | No | url or b64_json |
| background | No | transparent, opaque, auto |
| output_format | No | png, webp, jpeg |
| partial_images | No | Partial images 0–3 for streaming |
curl -X POST http://localhost:8080/v1/images/generations \
-H "Content-Type: application/json" \
-d '{
"model": "openai/dall-e-3",
"prompt": "A serene landscape",
"n": 1,
"size": "1024x1024"
}'
8) Image Edit
/v1/images/edits — multipart/form-data with image[], optional mask; streaming via image_edit.partial_image / image_edit.completed. See Image Edit in Bifrost docs.
| Parameter | Required | Notes |
|---|---|---|
| model | Yes | Model identifier |
| prompt | Yes | Edit description |
| image[] | Yes | Image file(s) to edit (multipart) |
| mask | No | Mask image file |
| n | No | Number of images (1–10) |
| size | No | Output size |
| quality | No | Image quality |
| stream | No | Enable SSE streaming |
9) Image Variation
/v1/images/variations — multipart/form-data; no streaming. Only the first image is sent upstream. See Image Variation in Bifrost docs.
| Parameter | Required | Notes |
|---|---|---|
| model | Yes | Model identifier |
| image | Yes | Source image (multipart) |
| n | No | Number of variations (1–10) |
| size | No | Output size |
| response_format | No | url or b64_json |
10) Files API
Upload, list, retrieve, delete, and download files. See Files API in Bifrost docs.
| Parameter | Required | Notes |
|---|---|---|
| file | Yes | File to upload (multipart) |
| purpose | Yes | batch, fine-tune, or assistants |
| filename | No | Custom filename (defaults to file.jsonl) |
- GET /v1/files: list with purpose, limit, after, order
- GET /v1/files/{file_id}: metadata
- DELETE /v1/files/{file_id}
- GET /v1/files/{file_id}/content: download
curl -X POST http://localhost:8080/v1/files \
-F "file=@document.pdf" \
-F "purpose=assistants"
11) Batch API
Async batch jobs at /v1/batches. Statuses: validating, failed, in_progress, finalizing, completed, expired, cancelling, cancelled. See Batch API in Bifrost docs.
| Parameter | Required | Notes |
|---|---|---|
| input_file_id | Conditional | File ID or requests array (not both) |
| requests | Conditional | BatchRequestItem array (converted to JSONL) |
| endpoint | Yes | Target endpoint (e.g. /v1/chat/completions) |
| completion_window | No | 24h (default) |
| metadata | No | Custom metadata |
- GET /v1/batches/{batch_id}: retrieve
- POST /v1/batches/{batch_id}/cancel: cancel
- Results: download output file via Files API when status is completed; parse JSONL BatchResultItem lines
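The "file ID or requests array, not both" rule and the JSONL conversion can be sketched as follows. Both helper names are hypothetical, and the sketch makes no claim about the fields of a BatchRequestItem; it only shows that exactly one input source is accepted and that each array item becomes one JSON line.

```python
import json
from typing import Any, Optional


def build_batch_body(
    endpoint: str,
    input_file_id: Optional[str] = None,
    requests: Optional[list[dict[str, Any]]] = None,
    completion_window: str = "24h",
) -> dict[str, Any]:
    """Build a batch creation body with exactly one input source."""
    if (input_file_id is None) == (requests is None):
        raise ValueError("provide exactly one of input_file_id or requests")
    body: dict[str, Any] = {
        "endpoint": endpoint,
        "completion_window": completion_window,
    }
    if input_file_id is not None:
        body["input_file_id"] = input_file_id
    else:
        body["requests"] = requests
    return body


def requests_to_jsonl(requests: list[dict[str, Any]]) -> str:
    """Serialize request items as JSONL: one JSON object per line."""
    return "".join(json.dumps(item) + "\n" for item in requests)
```

The second helper mirrors the conversion the table attributes to Bifrost and is mainly useful when preparing an input file for upload through the Files API by hand.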
12) List Models
GET /v1/models — no request body. Model IDs in responses are prefixed with openai/; results aggregate across configured API keys. See List Models in Bifrost docs.
curl http://localhost:8080/v1/models
13) Video Generation
Sora-style video jobs at /v1/videos. Job statuses: queued → in_progress → completed / failed. See Video Generation in Bifrost docs.
| Parameter | Required | Notes |
|---|---|---|
| model | Yes | e.g. sora-2 |
| prompt | Yes | Video description |
| input_reference | No | Base64 data URL only for image-to-video |
| seconds | No | Duration in seconds |
| size | No | 720x1280, 1280x720, 1024x1792, 1792x1024 |
| Operation | Endpoint | Notes |
|---|---|---|
| Get status | GET /v1/videos/{id} | Poll until status: completed |
| Download | GET /v1/videos/{id}/content | Raw video bytes |
| Delete | DELETE /v1/videos/{id} | Remove video job |
| List jobs | GET /v1/videos | Query: after, limit, order |
| Remix | POST /v1/videos/{id}/remix | Body: {"prompt": "..."} |
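Because video jobs move through queued → in_progress → completed / failed, clients usually poll the status endpoint. The sketch below shows one way to structure that loop. `wait_for_video` is a hypothetical helper, and `fetch_status` stands in for whatever function performs `GET /v1/videos/{id}` and returns the job's status string; the interval and poll limit are arbitrary example values.

```python
import time
from typing import Callable

# Statuses after which polling should stop.
TERMINAL_STATUSES = ("completed", "failed")


def wait_for_video(
    fetch_status: Callable[[], str],
    interval: float = 5.0,
    max_polls: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the job reaches a terminal status, then return it."""
    for attempt in range(max_polls):
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        # Wait between polls, but not after the last allowed attempt.
        if attempt < max_polls - 1:
            sleep(interval)
    raise TimeoutError(f"video job still pending after {max_polls} polls")
```

Passing `sleep` as a parameter keeps the loop testable without real delays. Once the function returns `completed`, the content endpoint from the table can be used to download the video bytes.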
curl -X POST http://localhost:8080/v1/videos \
-H "Content-Type: application/json" \
-d '{
"model": "openai/sora-2",
"prompt": "A cat walking in the rain"
}'
Common error codes
HTTP status to OpenAI error type mapping from Bifrost docs.
| HTTP | Error type |
|---|---|
| 400 | invalid_request_error |
| 401 | authentication_error |
| 403 | permission_error |
| 404 | not_found_error |
| 429 | rate_limit_error |
| 500 | api_error |
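For client-side handling, the table above can be expressed as a lookup. This is only an illustrative sketch: the names `ERROR_TYPES` and `error_type_for` are hypothetical, and returning `None` for unlisted status codes is a choice made for this example, not documented gateway behaviour.

```python
from typing import Optional

# HTTP status to OpenAI error type, copied from the table above.
ERROR_TYPES = {
    400: "invalid_request_error",
    401: "authentication_error",
    403: "permission_error",
    404: "not_found_error",
    429: "rate_limit_error",
    500: "api_error",
}


def error_type_for(status: int) -> Optional[str]:
    """Return the documented error type, or None if the status is unlisted."""
    return ERROR_TYPES.get(status)
```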
Implementation caveats
| Caveat | Impact | Severity |
|---|---|---|
| User field truncation | User IDs over 64 characters are silently truncated | Low |
| Provider-specific field filtering | store, service_tier, prompt_cache_key filtered for non-OpenAI | Low |
| Cache control stripping | Cache control annotations stripped from messages when routing to non-OpenAI | Low |
| Reasoning model differences | gpt-oss models receive reasoning content blocks; others receive summaries | Medium |
| Token minimum enforcement | max_completion_tokens values below 16 automatically scaled to 16 | Low |
Authoritative references
- Bifrost OpenAI provider reference: docs.getbifrost.ai/providers/supported-providers/openai
- OpenAI platform docs: platform.openai.com/docs/api-reference
- Bifrost provider support overview: docs.getbifrost.ai/providers/supported-providers/overview