[ Provider Guide ]

Nebius Provider on Bifrost

Nebius is an OpenAI-compatible provider for chat, text completions, embeddings, image generation, and model listing. Bifrost delegates to the OpenAI implementation with standard parameter filtering and Nebius-specific project ID support.

Nebius provider summary

Bifrost routes requests to Nebius through its OpenAI-compatible API, with streaming, tool calling, and parameter filtering for upstream compatibility.

Nebius supports:

  • Chat, text completion, embeddings, and responses
  • Server-Sent Events streaming with delta-based updates
  • AI project ID for Nebius resource organization
  • Tool calling — function definitions and execution
  • Image generation with Nebius-specific size and format conversion

| Property | Details |
| --- | --- |
| Description | OpenAI-compatible cloud inference and embeddings. |
| Provider route on Bifrost | nebius/<model> |
| Provider doc | docs.nebius.com |
| Authentication | API key (Bearer) |
| Supported endpoints | /v1/chat/completions, /v1/responses, /v1/completions, /v1/embeddings, /v1/images/generations, /v1/models |
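
The nebius/<model> route suggests that the provider prefix is separated from the upstream model name at the first slash, since Nebius model names contain slashes themselves. A minimal Python sketch of that split, assuming this is how the prefix is interpreted (the helper name split_route is hypothetical and is not part of Bifrost):

```python
from typing import Tuple


def split_route(model: str) -> Tuple[str, str]:
    """Split '<provider>/<model>' at the first slash only (assumed behavior)."""
    provider, sep, upstream_model = model.partition("/")
    if not sep or not provider or not upstream_model:
        raise ValueError(f"expected '<provider>/<model>', got {model!r}")
    return provider, upstream_model


# The remaining slashes stay part of the upstream model name.
provider, upstream_model = split_route(
    "nebius/meta-llama/Meta-Llama-3.1-8B-Instruct-fast"
)
```

Here provider is "nebius" and upstream_model keeps its own organization prefix, "meta-llama/Meta-Llama-3.1-8B-Instruct-fast".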

Authentication

Configure your Nebius API key in Bifrost provider keys. Bifrost sends Authorization: Bearer <key> on upstream requests. See Nebius in Bifrost docs.

Supported operations

Speech, Transcriptions, Files, and Batch return UnsupportedOperationError. Responses API requests are converted internally and sent upstream to /v1/chat/completions. See Supported operations in Bifrost docs.

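| Operation | Non-streaming | Streaming | Upstream endpoint |
| --- | --- | --- | --- |
| Chat Completions | Yes | Yes | /v1/chat/completions |
| Responses API | Yes | Yes | /v1/chat/completions |
| Text Completions | Yes | Yes | /v1/completions |
| Embeddings | Yes | No | /v1/embeddings |
| Image Generation | Yes | No | /v1/images/generations |
| List Models | Yes | | /v1/models |
| Speech (TTS) | No | No | - |
| Transcriptions (STT) | No | No | - |
| Files | No | No | - |
| Batch | No | No | - |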

AI project ID

Nebius supports an optional ai_project_id for resource organization. Bifrost appends it as a query parameter on the upstream URL. Set it in chat, Responses, or image generation requests, either in the request body or through extra_params.

curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "nebius/meta-llama/Meta-Llama-3.1-8B-Instruct-fast",
    "messages": [{"role": "user", "content": "Hello"}],
    "ai_project_id": "project-123"
  }'
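
To make the query-parameter behavior concrete, here is a minimal Python sketch of appending ai_project_id to an upstream URL. It illustrates the idea only and is not Bifrost's implementation; the helper name with_project_id and the host nebius.example are placeholders.

```python
from typing import Optional
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_project_id(upstream_url: str, ai_project_id: Optional[str]) -> str:
    """Return upstream_url with ai_project_id added as a query parameter."""
    if not ai_project_id:
        return upstream_url  # the parameter is optional, so leave the URL as-is
    parts = urlsplit(upstream_url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(("ai_project_id", ai_project_id))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Placeholder host; the real upstream base URL is configured by the provider.
url = with_project_id("https://nebius.example/v1/chat/completions", "project-123")
```

With the request above, url ends in /v1/chat/completions?ai_project_id=project-123, and any query parameters already present on the URL are preserved.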

API reference

OpenAI-compatible endpoints routed to Nebius via Bifrost.

1) Chat Completions

Primary path at /v1/chat/completions. Standard OpenAI chat parameters. See Chat Completions in Bifrost docs and OpenAI Chat Completions.

Filtered parameters

| Parameter | Reason | Notes |
| --- | --- | --- |
| prompt_cache_key | Not supported | Removed for Nebius compatibility |
| verbosity | Anthropic-specific | Removed for Nebius compatibility |
| store | Not supported | Removed for Nebius compatibility |
| service_tier | Not supported | Removed for Nebius compatibility |

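
The effect of the filtering in the table above can be pictured with a short Python sketch. This is an illustration under the assumption that the listed keys are simply dropped from the top level of the request body; the function name filter_for_nebius is hypothetical and the actual filter is implemented inside Bifrost.

```python
# Parameter names taken from the filtered-parameters table.
FILTERED_PARAMS = {"prompt_cache_key", "verbosity", "store", "service_tier"}


def filter_for_nebius(body: dict) -> dict:
    """Return a copy of the request body without the filtered parameters."""
    return {key: value for key, value in body.items() if key not in FILTERED_PARAMS}


request = {
    "model": "meta-llama/Meta-Llama-3.1-8B-Instruct-fast",
    "messages": [{"role": "user", "content": "Hello"}],
    "store": True,
    "service_tier": "default",
}
upstream_body = filter_for_nebius(request)  # keeps only model and messages
```

Returning a copy rather than mutating the input keeps the caller's original request intact for logging or retries against other providers.
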
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "nebius/meta-llama/Meta-Llama-3.1-8B-Instruct-fast",
    "messages": [{"role": "user", "content": "Hello"}],
    "stream": true
  }'
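
With "stream": true, the response arrives as Server-Sent Events whose chunks carry delta fragments. The following Python sketch shows one way a client might reassemble the text, assuming the standard OpenAI chunk shape (choices[].delta.content, terminated by data: [DONE]); the sample lines are hand-written for illustration and are not captured Nebius output.

```python
import json
from typing import Iterable


def collect_stream_text(lines: Iterable[str]) -> str:
    """Concatenate delta.content fragments from OpenAI-style SSE lines."""
    parts = []
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            content = (choice.get("delta") or {}).get("content")
            if content:
                parts.append(content)
    return "".join(parts)


# Hand-written sample stream (illustrative only).
sample = [
    'data: {"choices":[{"delta":{"role":"assistant"}}]}',
    "",
    'data: {"choices":[{"delta":{"content":"Hel"}}]}',
    'data: {"choices":[{"delta":{"content":"lo"}}]}',
    "data: [DONE]",
]
text = collect_stream_text(sample)
```

In a real client the lines would come from the HTTP response body read incrementally rather than from a list.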

2) Responses API

Converted internally to Chat Completions. Supports ai_project_id via extra_params. See Responses API in Bifrost docs.

ResponsesRequest → ChatRequest → ChatCompletion → ResponsesResponse
curl -X POST http://localhost:8080/v1/responses \
  -H "Content-Type: application/json" \
  -d '{
    "model": "nebius/meta-llama/Meta-Llama-3.1-8B-Instruct-fast",
    "input": "Hello",
    "max_output_tokens": 1024
  }'
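
The first step of the pipeline above (ResponsesRequest → ChatRequest) can be sketched as follows. This is a simplified illustration, not Bifrost's converter: the function name responses_to_chat is hypothetical, and mapping max_output_tokens to max_tokens is an assumption made for the example.

```python
def responses_to_chat(req: dict) -> dict:
    """Build a chat-style request from a Responses-style one (assumed mapping)."""
    raw_input = req["input"]
    if isinstance(raw_input, str):
        # A plain string input becomes a single user message.
        messages = [{"role": "user", "content": raw_input}]
    else:
        # Assume the input is already a list of role/content items.
        messages = list(raw_input)
    chat = {"model": req["model"], "messages": messages}
    if "max_output_tokens" in req:
        chat["max_tokens"] = req["max_output_tokens"]  # assumed field mapping
    return chat


chat_request = responses_to_chat(
    {
        "model": "meta-llama/Meta-Llama-3.1-8B-Instruct-fast",
        "input": "Hello",
        "max_output_tokens": 1024,
    }
)
```

A full conversion would also need to handle tools, instructions, and the reverse mapping from the chat completion back to a Responses-shaped result, which this sketch leaves out.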

3) Text Completions

Legacy format at /v1/completions. See Text Completions in Bifrost docs.

| Parameter | Mapping | Notes |
| --- | --- | --- |
| prompt | Direct pass-through | |
| max_tokens | max_tokens | |
| temperature | Direct pass-through | |
| top_p | Direct pass-through | |
| stop | | Stop sequences |
| frequency_penalty | | Penalty parameters |
| presence_penalty | | Penalty parameters |

curl -X POST http://localhost:8080/v1/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "nebius/meta-llama/Meta-Llama-3.1-8B-Instruct-fast",
    "prompt": "Hello, my name is",
    "max_tokens": 50
  }'

4) Embeddings

Text embeddings at /v1/embeddings — no streaming. Response includes vectors and usage. See Embeddings in Bifrost docs.

| Parameter | Notes |
| --- | --- |
| input | Text or array of texts |
| model | Embedding model name |
| encoding_format | "float" or "base64" |
| dimensions | Custom output dimensions (optional) |

curl -X POST http://localhost:8080/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{
    "model": "nebius/BAAI/bge-en-icl",
    "input": "Hello world"
  }'
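
When encoding_format is "base64", the client has to decode each embedding itself. The sketch below assumes the OpenAI convention of base64-encoded little-endian float32 values; whether Nebius returns exactly this layout is an assumption to verify against a real response. The helper name decode_embedding is hypothetical.

```python
import base64
import struct
from typing import List


def decode_embedding(b64: str) -> List[float]:
    """Decode a base64 string of little-endian float32 values into floats."""
    raw = base64.b64decode(b64)
    if len(raw) % 4:
        raise ValueError("payload is not a whole number of float32 values")
    return list(struct.unpack(f"<{len(raw) // 4}f", raw))


# Round-trip check with values that float32 represents exactly.
encoded = base64.b64encode(struct.pack("<3f", 1.0, -2.5, 0.25)).decode("ascii")
vector = decode_embedding(encoded)
```

The base64 form is more compact on the wire than a JSON array of floats, which matters mainly for large batches.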

5) Image Generation

OpenAI-compatible format at /v1/images/generations. Bifrost converts size (WxH) into separate width and height integers and maps the jpeg output format to jpg before calling Nebius. Streaming is not supported. See Image Generation in Bifrost docs.

| Parameter | Type | Required | Notes |
| --- | --- | --- | --- |
| model | string | Yes | Model identifier |
| prompt | string | Yes | Text description of the image to generate |
| size | string | No | WxH format (e.g. 1024x1024); split into width and height integers |
| output_format | string | No | png, jpeg, or webp; jpeg is converted to jpg upstream |
| response_format | string | No | url or b64_json |
| seed | int | No | Reproducible generation |
| negative_prompt | string | No | Negative prompt |
| num_inference_steps | int | No | Number of inference steps |
| extra_params | object | No | Nebius-specific: guidance_scale, ai_project_id |

Extra parameters (via extra_params)

| Parameter | Type | Notes |
| --- | --- | --- |
| guidance_scale | int | Guidance scale (0–100) |
| ai_project_id | string | Nebius project ID (added as query parameter) |

curl -X POST http://localhost:8080/v1/images/generations \
  -H "Content-Type: application/json" \
  -d '{
    "model": "nebius/black-forest-labs/flux-dev",
    "prompt": "A serene mountain landscape",
    "size": "1024x1024",
    "output_format": "png",
    "extra_params": {
      "guidance_scale": 7,
      "ai_project_id": "project-123"
    }
  }'
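
The two conversions described for image generation (splitting size into width and height, and mapping jpeg to jpg) can be sketched in Python as below. The helper names split_size and upstream_format are hypothetical, and the validation shown is an assumption; Bifrost's own handling of malformed sizes may differ.

```python
from typing import Tuple


def split_size(size: str) -> Tuple[int, int]:
    """Split a 'WxH' string such as '1024x1024' into (width, height) integers."""
    width, sep, height = size.lower().partition("x")
    if not sep or not width.isdigit() or not height.isdigit():
        raise ValueError(f"expected size in WxH format, got {size!r}")
    return int(width), int(height)


def upstream_format(output_format: str) -> str:
    """Map the OpenAI-style 'jpeg' to 'jpg'; other formats pass through."""
    return "jpg" if output_format == "jpeg" else output_format


width, height = split_size("1024x1024")
fmt = upstream_format("jpeg")
```

For the request above, the size yields width 1024 and height 1024, and png would be passed through unchanged.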

6) List Models

GET /v1/models — lists available Nebius models with capabilities and context lengths. See List Models in Bifrost docs.

curl http://localhost:8080/v1/models

Unsupported features

These operations are not offered by the upstream Nebius API. Bifrost returns UnsupportedOperationError.

| Feature | Reason |
| --- | --- |
| Speech/TTS | Not offered by Nebius API |
| Transcription/STT | Not offered by Nebius API |
| Batch operations | Not offered by Nebius API |
| File management | Not offered by Nebius API |

Implementation caveats

| Caveat | Impact | Severity |
| --- | --- | --- |
| Cache control stripped | Cache control directives removed from messages; prompt caching does not work | Medium |
| Parameter filtering | prompt_cache_key, verbosity, store, service_tier removed via filterOpenAISpecificParameters | Low |
| User field size limit | User identifiers longer than 64 characters are silently dropped (SanitizeUserField) | Low |
| Image format conversion | jpeg output_format converted to jpg for Nebius upstream | Low |
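
The user field caveat is easy to miss because nothing fails visibly. A minimal Python sketch of the described behavior, assuming the limit is measured in characters and that an over-long value is dropped rather than truncated (the function name sanitize_user is hypothetical; Bifrost's own helper is SanitizeUserField):

```python
from typing import Optional

MAX_USER_LENGTH = 64  # limit stated in the caveats table


def sanitize_user(user: Optional[str]) -> Optional[str]:
    """Return the user identifier, or None when it exceeds the length limit."""
    if user is not None and len(user) > MAX_USER_LENGTH:
        return None  # dropped silently: no error is raised
    return user


kept = sanitize_user("u" * 64)     # at the limit, passed through
dropped = sanitize_user("u" * 65)  # one over the limit, becomes None
```

If per-user attribution matters upstream, keep identifiers at 64 characters or fewer, for example by hashing longer ones on the client side.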
