Nebius on Bifrost: Models, Endpoints, Setup, and Mappings

Nebius provider summary

Bifrost routes requests to Nebius through an OpenAI-compatible API, with support for streaming and tool calling. Parameters that Nebius does not accept are filtered out before the request goes upstream.

Nebius supports:

Chat, text completion, embeddings, and responses
Server-Sent Events streaming with delta-based updates
AI project ID for Nebius resource organization
Tool calling — function definitions and execution
Image generation with Nebius-specific size and format conversion

Property	Details
Description	OpenAI-compatible cloud inference and embeddings.
Provider route on Bifrost	nebius/<model>
Provider doc	docs.nebius.com
Authentication	API key (Bearer)
Supported endpoints	/v1/chat/completions, /v1/responses, /v1/completions, /v1/embeddings, /v1/images/generations, /v1/models

Authentication

Configure your Nebius API key in Bifrost provider keys. Bifrost sends Authorization: Bearer <key> on upstream requests. See Nebius in Bifrost docs.

Supported operations

Speech, Transcriptions, Files, and Batch return UnsupportedOperationError. Responses API requests are converted internally and sent upstream to /v1/chat/completions. See Supported operations in Bifrost docs.

Operation	Non-streaming	Streaming	Upstream endpoint
Chat Completions	Yes	Yes	/v1/chat/completions
Responses API	Yes	Yes	/v1/chat/completions
Text Completions	Yes	Yes	/v1/completions
Embeddings	Yes	N/A	/v1/embeddings
Image Generation	Yes	N/A	/v1/images/generations
List Models	Yes	N/A	/v1/models
Speech (TTS)	No	No	N/A
Transcriptions (STT)	No	No	N/A
Files	No	No	N/A
Batch	No	No	N/A

AI project ID

Nebius supports an optional ai_project_id for resource organization. Bifrost appends it as a query parameter on the upstream URL. Set it on chat, responses, or image generation requests, either in the request body or in extra_params.

curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "nebius/meta-llama/Meta-Llama-3.1-8B-Instruct-fast",
    "messages": [{"role": "user", "content": "Hello"}],
    "ai_project_id": "project-123"
  }'
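To make the query-parameter behavior concrete, here is a minimal Python sketch of appending ai_project_id to an upstream URL. It is not Bifrost's code: the function name with_project_id and the host nebius.example are placeholders invented for illustration, and only the parameter name ai_project_id comes from the text above.

```python
from typing import Optional
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_project_id(upstream_url: str, ai_project_id: Optional[str]) -> str:
    """Return upstream_url with ai_project_id added as a query parameter.

    Hypothetical helper: it mirrors the documented behavior, not Bifrost's
    actual implementation.
    """
    if not ai_project_id:
        return upstream_url  # optional parameter: leave the URL untouched
    parts = urlsplit(upstream_url)
    # Preserve any query parameters already present on the URL.
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(("ai_project_id", ai_project_id))
    return urlunsplit(parts._replace(query=urlencode(query)))


# nebius.example is a placeholder host, not the real Nebius endpoint.
print(with_project_id("https://nebius.example/v1/chat/completions", "project-123"))
```

Existing query parameters are preserved, and an empty or missing project ID leaves the URL unchanged, which matches the parameter being optional.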

API reference

OpenAI-compatible endpoints routed to Nebius via Bifrost.

1) Chat Completions

The primary path is /v1/chat/completions. It accepts standard OpenAI chat parameters, except for the filtered parameters listed below. See Chat Completions in Bifrost docs and OpenAI Chat Completions.

Filtered parameters

Parameter	Reason	Notes
prompt_cache_key	Not supported	Removed for Nebius compatibility
verbosity	Anthropic-specific	Removed for Nebius compatibility
store	Not supported	Removed for Nebius compatibility
service_tier	Not supported	Removed for Nebius compatibility
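
The filtering in the table above can be pictured as dropping a fixed set of keys from the request body. The Python sketch below is illustrative only: strip_filtered_params is a hypothetical name, and only the four parameter names are taken from the table.

```python
# Parameter names taken from the "Filtered parameters" table.
FILTERED_PARAMS = ("prompt_cache_key", "verbosity", "store", "service_tier")


def strip_filtered_params(body: dict) -> dict:
    """Return a copy of the request body without the filtered parameters.

    Hypothetical helper for illustration; it does not mutate the input.
    """
    return {key: value for key, value in body.items() if key not in FILTERED_PARAMS}


request = {
    "model": "nebius/meta-llama/Meta-Llama-3.1-8B-Instruct-fast",
    "messages": [{"role": "user", "content": "Hello"}],
    "store": True,
    "service_tier": "default",
}
print(sorted(strip_filtered_params(request)))  # prints ['messages', 'model']
```

Because the parameters are removed rather than rejected, a client that sends them gets a normal response and no error.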

curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "nebius/meta-llama/Meta-Llama-3.1-8B-Instruct-fast",
    "messages": [{"role": "user", "content": "Hello"}],
    "stream": true
  }'
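
A streaming response arrives as Server-Sent Events, and the client rebuilds the message from the deltas. The Python sketch below parses already-received lines, assuming the chunks follow the OpenAI streaming shape (choices[].delta.content, terminated by a data: [DONE] line). The helper name iter_deltas and the sample lines are made up for illustration.

```python
import json
from typing import Iterable, Iterator


def iter_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield content fragments from OpenAI-style SSE chat chunks."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # assumed end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text


# Hand-written sample lines, not captured from a real response.
sample = [
    'data: {"choices":[{"index":0,"delta":{"role":"assistant","content":""}}]}',
    "",
    'data: {"choices":[{"index":0,"delta":{"content":"Hel"}}]}',
    'data: {"choices":[{"index":0,"delta":{"content":"lo"}}]}',
    "data: [DONE]",
]
print("".join(iter_deltas(sample)))  # prints "Hello"
```

In a real client the lines would come from the HTTP response body as it streams in; the parsing step stays the same.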

2) Responses API

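Responses API requests are converted internally to Chat Completions requests, and the completion is converted back into a Responses API response. ai_project_id is supported via extra_params. See Responses API in Bifrost docs.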

ResponsesRequest → ChatRequest → ChatCompletion → ResponsesResponse

curl -X POST http://localhost:8080/v1/responses \
  -H "Content-Type: application/json" \
  -d '{
    "model": "nebius/meta-llama/Meta-Llama-3.1-8B-Instruct-fast",
    "input": "Hello",
    "max_output_tokens": 1024
  }'
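
The conversion pipeline can be sketched in Python for the simplest case, a plain string input. Everything here is an assumption made for illustration: the function names, the mapping of max_output_tokens to max_tokens, and the reduced response shape are not taken from Bifrost's source, which handles many more fields.

```python
def responses_to_chat(req: dict) -> dict:
    """Turn a minimal Responses-style request into a chat-style request."""
    raw = req["input"]
    if isinstance(raw, str):
        messages = [{"role": "user", "content": raw}]
    else:
        messages = list(raw)  # assumed to already be role/content items
    chat = {"model": req["model"], "messages": messages}
    if "max_output_tokens" in req:
        # Assumed mapping; the actual field handling may differ.
        chat["max_tokens"] = req["max_output_tokens"]
    return chat


def chat_to_responses(completion: dict) -> dict:
    """Wrap the first chat choice in a simplified Responses-style output."""
    text = completion["choices"][0]["message"]["content"]
    return {
        "object": "response",
        "output": [
            {
                "type": "message",
                "role": "assistant",
                "content": [{"type": "output_text", "text": text}],
            }
        ],
    }


chat = responses_to_chat({"model": "m", "input": "Hello", "max_output_tokens": 1024})
print(chat["messages"], chat["max_tokens"])
```

The practical consequence is that features that exist only in the Responses API have no upstream equivalent on this route.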

3) Text Completions

Legacy format at /v1/completions. See Text Completions in Bifrost docs.

Parameter	Notes
prompt	Direct pass-through
max_tokens	Mapped to max_tokens upstream
temperature	Direct pass-through
top_p	Direct pass-through
stop	Stop sequences
frequency_penalty	Penalty parameter
presence_penalty	Penalty parameter

curl -X POST http://localhost:8080/v1/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "nebius/meta-llama/Meta-Llama-3.1-8B-Instruct-fast",
    "prompt": "Hello, my name is",
    "max_tokens": 50
  }'

4) Embeddings

Text embeddings at /v1/embeddings — no streaming. Response includes vectors and usage. See Embeddings in Bifrost docs.

Parameter	Notes
input	Text or array of texts
model	Embedding model name
encoding_format	"float" or "base64"
dimensions	Custom output dimensions (optional)

curl -X POST http://localhost:8080/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{
    "model": "nebius/BAAI/bge-en-icl",
    "input": "Hello world"
  }'
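
When encoding_format is "base64", the vector arrives as a string and has to be decoded on the client. The Python sketch below assumes the payload is packed little-endian 32-bit floats, which is the convention used by the OpenAI API; whether Nebius uses the same layout is not stated here, so verify it against a "float" response before relying on it. decode_embedding is a hypothetical helper name.

```python
import base64
import struct
from typing import List


def decode_embedding(b64: str) -> List[float]:
    """Decode a base64 embedding, assuming packed little-endian float32."""
    raw = base64.b64decode(b64)
    if len(raw) % 4 != 0:
        raise ValueError("payload length is not a multiple of 4 bytes")
    count = len(raw) // 4
    return list(struct.unpack(f"<{count}f", raw))


# Round-trip a tiny vector to show the layout (values exact in float32).
encoded = base64.b64encode(struct.pack("<3f", 0.5, -1.0, 2.0)).decode("ascii")
print(decode_embedding(encoded))  # prints [0.5, -1.0, 2.0]
```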

5) Image Generation

OpenAI-compatible format at /v1/images/generations. For Nebius, Bifrost converts size (WxH) into separate width and height integers and maps the jpeg output format to jpg. Streaming is not supported. See Image Generation in Bifrost docs.

Parameter	Type	Required	Notes
model	string	Yes	Model identifier
prompt	string	Yes	Text description of the image to generate
size	string	No	WxH format (e.g. 1024x1024); split into width and height integers
output_format	string	No	png, jpeg, webp — jpeg converted to jpg upstream
response_format	string	No	url or b64_json
seed	int	No	Reproducible generation
negative_prompt	string	No	Negative prompt
num_inference_steps	int	No	Number of inference steps
extra_params	object	No	Nebius-specific: guidance_scale, ai_project_id

Extra parameters (via extra_params)

Parameter	Type	Notes
guidance_scale	int	Guidance scale (0–100)
ai_project_id	string	Nebius project ID (added as query parameter)

curl -X POST http://localhost:8080/v1/images/generations \
  -H "Content-Type: application/json" \
  -d '{
    "model": "nebius/black-forest-labs/flux-dev",
    "prompt": "A serene mountain landscape",
    "size": "1024x1024",
    "output_format": "png",
    "extra_params": {
      "guidance_scale": 7,
      "ai_project_id": "project-123"
    }
  }'
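
The two conversions described for this endpoint are small enough to sketch in Python. The helper names split_size and normalize_format are hypothetical, and the sketch makes no claim about the field names Nebius expects upstream, since those are not given here.

```python
from typing import Tuple


def split_size(size: str) -> Tuple[int, int]:
    """Split a WxH string such as "1024x1024" into (width, height)."""
    width, sep, height = size.partition("x")
    if not sep or not width.isdigit() or not height.isdigit():
        raise ValueError(f"size must look like 1024x1024, got {size!r}")
    return int(width), int(height)


def normalize_format(output_format: str) -> str:
    """Map jpeg to jpg and leave other formats unchanged."""
    return "jpg" if output_format == "jpeg" else output_format


print(split_size("1024x1024"))   # prints (1024, 1024)
print(normalize_format("jpeg"))  # prints jpg
```

Rejecting malformed sizes up front is a design choice of this sketch; how Bifrost handles an invalid size value is not described in this section.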

6) List Models

GET /v1/models — lists available Nebius models with capabilities and context lengths. See List Models in Bifrost docs.

curl http://localhost:8080/v1/models

Unsupported features

These operations are not offered by the upstream Nebius API. Bifrost returns UnsupportedOperationError.

Feature	Reason
Speech/TTS	Not offered by Nebius API
Transcription/STT	Not offered by Nebius API
Batch operations	Not offered by Nebius API
File management	Not offered by Nebius API

Implementation caveats

Caveat	Impact	Severity
Cache control stripped	Cache control directives removed from messages; prompt caching does not work	Medium
Parameter filtering	prompt_cache_key, verbosity, store, service_tier removed via filterOpenAISpecificParameters	Low
User field size limit	User identifiers longer than 64 characters are silently dropped (SanitizeUserField)	Low
Image format conversion	jpeg output_format converted to jpg for Nebius upstream	Low
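
The user field caveat is easy to miss because nothing fails. A minimal Python sketch of the documented behavior follows; the function name sanitize_user is a stand-in for illustration and is not the SanitizeUserField implementation.

```python
from typing import Optional

MAX_USER_LEN = 64  # limit stated in the caveats table


def sanitize_user(user: Optional[str]) -> Optional[str]:
    """Return user unchanged, or None when it is longer than 64 characters."""
    if user is not None and len(user) > MAX_USER_LEN:
        return None  # dropped silently: no error is raised
    return user


print(sanitize_user("user-42"))   # prints user-42
print(sanitize_user("x" * 65))    # prints None
```

If per-user attribution matters, keep identifiers within 64 characters on the client side, for example by sending a short hash instead of a long composite ID.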