Create a free Databricks workspace, generate an API token, enable billing for production, then integrate with Bifrost for multi-provider routing and cost governance. Complete in 10 minutes.
Bifrost supports Databricks Foundation Model API through OpenAI-compatible endpoints. Databricks provides serverless access to leading LLMs with workspace governance and Unity Catalog integration.
| Property | Details |
|---|---|
| Description | Databricks Foundation Model API provides serverless LLM access with workspace governance, cost tracking, and Unity Catalog integration. |
| Provider route on Bifrost | databricks/<model> |
| Provider doc | Databricks Foundation Model API |
| API endpoint for provider | https://<workspace>.cloud.databricks.com/api/2.0/endpoints/chat/completions |
| Supported endpoints | /chat/completions, /completions, /models |
Before you begin, you will need:
[ QUICK START ]
Sign up for a free trial.
Go to databricks.com and sign up. Create a workspace during onboarding.
Click your user icon in the top-right corner and select "User Settings" to access token management.
Your token is displayed once. Copy it immediately and store it securely.
Click "Generate new token", give the token a descriptive name, and store it as an environment variable:
export DATABRICKS_TOKEN="dapi..."
Add a payment method when ready for production.
Databricks offers trial credits. When you're ready for production or exceed trial limits, add a payment method in the Billing section.
Authenticate with Bearer tokens per Databricks API.
The Databricks API is OpenAI-compatible and expects the token in an Authorization: Bearer $DATABRICKS_TOKEN header:
$ curl https://<workspace>.cloud.databricks.com/api/2.0/endpoints/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $DATABRICKS_TOKEN" \
  -d '{
    "model": "databricks-dbrx-instruct",
    "messages": [{"role":"user","content":"Hello!"}]
  }'
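For reference, the same request can be sketched in Python with only the standard library. This is a minimal illustration, not an official client: the endpoint path is copied from the curl example, and the helper names `build_chat_request` and `send_chat_request` are hypothetical.

```python
import json
import urllib.request

# Path copied from the curl example; the workspace host is supplied by the caller.
CHAT_PATH = "/api/2.0/endpoints/chat/completions"


def build_chat_request(workspace_host: str, token: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a chat completion request with Bearer auth."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        url=f"https://{workspace_host}{CHAT_PATH}",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def send_chat_request(req: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send the request and return the decoded JSON response body."""
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)
```

Building the request separately from sending it makes it easy to inspect the URL and headers before any network call is made. Read the token from the DATABRICKS_TOKEN environment variable rather than hard-coding it.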
[ MODELS ]
| Model | API ID | Best for |
|---|---|---|
| Meta Llama 3.3 70B Instruct | databricks-meta-llama-3-3-70b-instruct | Latest Llama on Databricks Model Serving. |
| Meta Llama 3.1 405B Instruct | databricks-meta-llama-3-1-405b-instruct | Largest Llama deployment on Databricks. |
| Meta Llama 3.1 70B Instruct | databricks-meta-llama-3-1-70b-instruct | Production open-weight chat. |
| Meta Llama 3.1 8B Instruct | databricks-meta-llama-3-1-8b-instruct | Efficient Llama 3.1 tier. |
| DBRX Instruct | databricks-dbrx-instruct | Databricks MoE foundation model. |
| Mixtral 8x7B Instruct | databricks-mixtral-8x7b-instruct | MoE instruct model on Databricks. |
| Mistral 7B Instruct | databricks-mistral-7b-instruct | Compact Mistral on Model Serving. |
| GTE Large (En) | databricks-gte-large-en | English embeddings for RAG. |
Models and availability change over time. See the Databricks foundation models documentation for the latest list and pricing.
[ TROUBLESHOOTING ]
| Error | Likely Cause | What to Do |
|---|---|---|
| 401 Unauthorized | Invalid or missing API token. | Verify your token is correct. Generate a new token if needed. |
| 400 Bad Request | Invalid request format or workspace URL. | Check workspace URL and request format. Verify model ID. |
| 429 Rate Limited | Rate limit exceeded for your workspace. | Implement exponential backoff. Use Bifrost for load distribution. |
| 500 Server Error | Temporary Databricks service issue. | Retry after a delay. Check Databricks status. Configure failover with Bifrost. |
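The backoff advice for 429 and 500 responses can be sketched as a small retry wrapper. This is one possible approach under stated assumptions: `call_with_backoff` is a hypothetical helper, `send` is any callable you supply that returns a `(status, body)` pair, and the delay values are illustrative defaults rather than Databricks recommendations.

```python
import random
import time
from typing import Callable, Tuple

# Statuses from the table above that are worth retrying.
RETRYABLE_STATUSES = {429, 500}


def call_with_backoff(
    send: Callable[[], Tuple[int, str]],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call send() until it returns a non-retryable status or attempts run out."""
    for attempt in range(max_attempts):
        status, body = send()
        if status not in RETRYABLE_STATUSES or attempt == max_attempts - 1:
            return status, body
        # The delay doubles each attempt (1s, 2s, 4s, ...) up to max_delay,
        # plus up to 10% random jitter so clients do not retry in lockstep.
        delay = min(max_delay, base_delay * (2 ** attempt))
        sleep(delay + random.uniform(0, delay * 0.1))
    raise ValueError("max_attempts must be at least 1")
```

Errors such as 400 and 401 are returned immediately, since retrying a malformed request or a bad token will not help. The `sleep` parameter exists so the wait can be replaced in tests.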
[ PRODUCTION-READY ]
Bifrost is a drop-in replacement for Databricks SDKs. Update your base URL and keep your client code. Bifrost handles cost tracking, virtual keys, budgets, and intelligent failover.
Run the Bifrost gateway and configure your Databricks workspace in the Web UI.
$ npx -y @maximhq/bifrost
✓ Bifrost started
├─ HTTP server listening on http://localhost:8080
├─ Web UI available at http://localhost:8080
└─ Configure providers and virtual keys in the dashboard
Update your SDK to route through Bifrost's OpenAI-compatible gateway.
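from openai import OpenAI

# Point the OpenAI client at the local Bifrost gateway instead of Databricks.
client = OpenAI(
    api_key="sk-bf-your-virtual-key",
    base_url="http://localhost:8080/openai"
)

# Model names follow the Bifrost provider route: databricks/<model>.
response = client.chat.completions.create(
    model="databricks/databricks-meta-llama-3-3-70b-instruct",
    messages=[{"role": "user", "content": "Hello from Bifrost!"}]
)
print(response.choices[0].message.content)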
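The client above sends the virtual key as a Bearer token. For clients where you set headers yourself, the two ways of carrying the key can be sketched as below. `virtual_key_headers` is a hypothetical helper, and the `sk-bf-` prefix check is an assumption drawn from the key format used in this guide.

```python
from typing import Dict


def virtual_key_headers(virtual_key: str, use_bearer: bool = False) -> Dict[str, str]:
    """Return the HTTP headers that carry a Bifrost virtual key."""
    if not virtual_key.startswith("sk-bf-"):
        # Assumption: virtual keys use the sk-bf- prefix shown in this guide.
        raise ValueError("expected a virtual key starting with 'sk-bf-'")
    if use_bearer:
        return {"Authorization": f"Bearer {virtual_key}"}
    return {"x-bf-vk": virtual_key}
```

The returned dictionary can be merged into whatever headers your HTTP client already sends.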
Pass your virtual key in the x-bf-vk header or as Authorization: Bearer sk-bf-*, per the Bifrost documentation.
[ WHAT'S NEXT ]
You have workspace-based LLM access. Add governance, guardrails, and MCP controls for production.
[ BIFROST FEATURES ]
Everything you need to run AI in production, from free open source to enterprise-grade features.
01 Governance
SAML-based SSO, role-based access control, and policy enforcement for team collaboration.
02 Adaptive Load Balancing
Automatically optimizes traffic distribution across provider keys and models based on real-time performance metrics.
03 Cluster Mode
High availability deployment with automatic failover and load balancing. Peer-to-peer clustering where every instance is equal.
04 Alerts
Real-time notifications for budget limits, failures, and performance issues on Email, Slack, PagerDuty, Teams, Webhook and more.
05 Log Exports
Export and analyze request logs, traces, and telemetry data from Bifrost with enterprise-grade data export capabilities for compliance, monitoring, and analytics.
06 Audit Logs
Comprehensive logging and audit trails for compliance and debugging.
07 Vault Support
Secure API key management with HashiCorp Vault, AWS Secrets Manager, Google Secret Manager, and Azure Key Vault integration.
08 VPC Deployment
Deploy Bifrost within your private cloud infrastructure with VPC isolation, custom networking, and enhanced security controls.
09 Guardrails
Automatically detect and block unsafe model outputs with real-time policy enforcement and content moderation across all agents.
[ SHIP RELIABLE AI ]
Change just one line of code. Works with OpenAI, Anthropic, Vercel AI SDK, LangChain, and more.
[ FAQ ]
Databricks Foundation Model API provides serverless access to state-of-the-art LLMs, including Meta Llama 3.x, DBRX, and other models, integrated with Databricks workspace infrastructure for governance and security.
Yes, you need a Databricks workspace. Create a free trial workspace at databricks.com to get started with the Foundation Model API.
Databricks offers access to Meta Llama 3.x, DBRX (Databricks' own model), Mixtral, Mistral, and other foundation models. Check the Databricks documentation for the latest model catalog.
Databricks provides an OpenAI-compatible API format for chat completions. You can use OpenAI SDKs with appropriate endpoint configuration.
Use Databricks workspace cost tracking and set up budget alerts. Bifrost provides additional virtual keys, per-developer tracking, and multi-provider cost optimization.
Bifrost connects to your Databricks workspace, providing virtual keys for cost tracking, budget governance, and automatic failover to other providers.