Endpoint Overview
Bifrost HTTP transport provides:- Unified API endpoints for all providers
- Drop-in compatible endpoints for existing SDKs
- MCP tool execution endpoint
- Prometheus metrics endpoint
http://localhost:8080 (configurable)
Unified API Endpoints
All endpoints and request/response formats are OpenAI compatible.
POST /v1/chat/completions
Chat conversation endpoint supporting all providers. Request Body:Streaming Responses
To receive a stream of partial responses, set"stream": true in your request. The response will be a text/event-stream of Server-Sent Events (SSE).
Request with Streaming:
data: . The stream is terminated by a [DONE] message.
POST /v1/text/completions
Text completion endpoint for simple text generation. Request Body:POST /v1/mcp/tool/execute
Direct MCP tool execution endpoint. Request Body:Drop-in Compatible Endpoints
OpenAI Compatible
POST /openai/v1/chat/completions Drop-in replacement for OpenAI API:Anthropic Compatible
POST /anthropic/v1/messages Drop-in replacement for Anthropic API:Google GenAI Compatible
POST /genai/v1beta/models/:generateContent Drop-in replacement for Google GenAI API:Monitoring Endpoints
GET /metrics
Prometheus metrics endpoint:Request Parameters
Common Parameters
| Parameter | Type | Description | Example |
|---|---|---|---|
model | string | Provider and model name | "openai/gpt-4o-mini" |
params | object | Model parameters | {"temperature": 0.7} |
fallbacks | array | Fallback model names | ["anthropic/claude-3-sonnet-20240229"] |
Model Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
temperature | float | 1.0 | Randomness (0.0-2.0) |
max_tokens | integer | Provider default | Maximum tokens to generate |
top_p | float | 1.0 | Nucleus sampling (0.0-1.0) |
frequency_penalty | float | 0.0 | Frequency penalty (-2.0-2.0) |
presence_penalty | float | 0.0 | Presence penalty (-2.0-2.0) |
stop | array | null | Stop sequences |
Chat Message Format
Tool Calling
Automatic Tool Integration
MCP tools are automatically available in chat completions:Multi-turn Tool Conversations
Error Handling
Error Response Format
Common Error Codes
| Status | Code | Description |
|---|---|---|
| 400 | invalid_request_error | Bad request format |
| 401 | authentication_error | Invalid API key |
| 403 | permission_error | Access denied |
| 429 | rate_limit_error | Rate limit exceeded |
| 500 | internal_error | Server error |
| 503 | service_unavailable | Provider unavailable |
Error Response Examples
Missing Provider:Language SDK Examples
Python (OpenAI SDK)
JavaScript (OpenAI SDK)
Go (Direct HTTP)
Architecture: For endpoint implementation details and performance, see Architecture Documentation.