[ THE PROBLEM ]
Most AI gateways are SaaS-only or lack enterprise deployment flexibility. Scaling AI across regulated, multi-cloud, or restricted environments surfaces problems that managed proxies don't solve.
SaaS gateways route every prompt, completion, and API key through a third-party network. Sensitive data leaves your perimeter, and compliance teams block adoption before it starts.
Classified and regulated environments require zero external network access. SaaS gateways cannot operate offline, and most open-source proxies still phone home for updates or telemetry.
When teams deploy a separate gateway in each cloud, policies drift between environments, configuration is duplicated manually, and rate limits or budgets cannot be enforced across providers from a single control plane.
Single-instance gateways are single points of failure. Without native clustering, automatic failover, or zero-downtime rolling updates, any outage takes your entire AI stack offline.
[ DEPLOYMENT MODELS ]
Match your infrastructure, compliance, and security requirements. Bifrost deploys the same way regardless of where it runs.
Deploy entirely within your VPC on AWS, GCP, or Azure. Complete network isolation with native IAM integration, private endpoints, and no external dependencies.
Run on your own hardware via Kubernetes, Docker Compose, or a single binary on bare metal VMs. Full control over compute, storage, and networking with no cloud dependency.
For environments with zero internet access. Export the Bifrost image on a connected machine, transfer via tarball, load into your internal registry. No phone-home, no telemetry.
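The export/transfer/load workflow above can be sketched with standard Docker commands. The image name and internal registry host below are placeholders; substitute the published tag and your own registry:

```shell
# On a connected machine: pull the image and export it to a tarball.
docker pull maximhq/bifrost:latest
docker save -o bifrost.tar maximhq/bifrost:latest

# Transfer bifrost.tar into the air-gapped environment
# (approved media or transfer host, per your security policy).

# Inside the air-gapped environment: load, retag, and push to the internal registry.
docker load -i bifrost.tar
docker tag maximhq/bifrost:latest registry.internal:5000/bifrost:latest
docker push registry.internal:5000/bifrost:latest
```

Updates follow the same cycle: export a new tarball on the connected side, then load and retag on the air-gapped side.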
Run a single Bifrost instance for dev/test, branch offices, or edge deployments. Minimal footprint with instant setup via Docker Compose, fly.io, or a single Go binary.
[ CLOUD SUPPORT ]
Cloud-native authentication and registry distribution per platform. No fighting your cloud provider's security model.
| Cloud | Targets | Auth Method | Registry | IaC |
|---|---|---|---|---|
| AWS | EKS, ECS | IRSA | ECR | Terraform, Helm |
| GCP | GKE, Cloud Run | Workload Identity | Artifact Registry | Terraform, Helm |
| Azure | AKS | Azure WIF | ACR | Terraform, Helm |
| On-Premise | K8s, Docker, Bare Metal | Basic Auth | Internal mirror | Helm, Compose, Single binary |
module "bifrost" {
  source         = "github.com/maximhq/bifrost//terraform/modules/bifrost?ref=terraform/v0.1.0"
  cloud_provider = "aws"        # "aws" | "gcp" | "azure" | "kubernetes"
  service        = "eks"        # AWS: "ecs" | "eks", GCP: "gke" | "cloud-run", Azure: "aks"
  region         = "us-east-1"
  image_tag      = "latest"
}

[ CORE CAPABILITIES ]
Every deployment model ships with the same HA clustering, security controls, and observability stack.
Zero-downtime at scale
Zero trust architecture, audit ready
Full stack monitoring
[ HOW IT WORKS ]
No custom agents, no proprietary orchestration. Deploy with your existing infrastructure tools.
VPC, on-prem, air-gapped, or multi-cloud. Pick the model that matches your infrastructure and compliance requirements.
# In-VPC, On-Premise, Air-Gapped, Multi-Cloud, or Edge

Use the Terraform module or Helm chart. AWS, GCP, Azure, and generic Kubernetes all supported out of the box.
terraform apply

Single JSON config file. Connect a config store (Postgres or file), set up providers, define virtual keys and routing rules.
# config.json
# providers, virtual keys, routing

Single Go binary with minimal dependencies. Helm install, docker compose up, or terraform apply.
helm install bifrost maximhq/bifrost

Change one line in your existing OpenAI, Anthropic, or LiteLLM SDK. Point it at your Bifrost endpoint.
base_url = "https://bifrost.internal"

Enable clustering for HA, connect Prometheus or OpenTelemetry, set up auto-scaling. Production ready.
curl http://bifrost.internal/cluster/status

[ COMPARISON ]
| Capability | SaaS Proxy | OSS (LiteLLM etc.) | Bifrost Enterprise |
|---|---|---|---|
| In-VPC deployment | No | Manual setup | Terraform + Helm |
| Air-gapped support | No | No | Docker save/load |
| Cloud-native auth | API key only | API key only | IRSA, Workload Identity, Azure WIF |
| Vault integration | No | No | HashiCorp, AWS SM, GCP SM, Azure KV |
| RBAC + SSO | Limited | No | Okta, Entra ID, OIDC |
| Audit logs (SOC 2 Type II, HIPAA) | Limited | No | Immutable, compliance ready |
| Data stays in your network | No | Yes | Yes, zero egress |
| P99 latency at 500 RPS | ~50ms | ~90.72s | ~1.68s |
| Uptime SLA | Varies | None | 99.999% |
[ USE CASES ]
Compliance requires all AI traffic to stay within your VPC. No data can transit third-party proxies.
In-VPC deployment with complete network isolation, audit logs for HIPAA/SOC 2 Type II/GDPR, Vault for secrets, and zero data egress. All processing stays inside your controlled environment.
Your environment has no internet. You need a gateway that runs entirely offline with no phone-home.
Docker image export/import workflow, internal registry mirroring, no telemetry leakage. On-prem credential management with offline operation and manual update cycles.
Your org runs AWS for production, GCP for ML, Azure for business units. You need one gateway with consistent governance.
Clustered deployment across clouds with gossip-based state sync, cloud-native auth per environment (IRSA, Workload Identity, WIF), and unified rate limiting and budgets.
You're hitting performance ceilings, missing enterprise features, or struggling with reliability on your current gateway.
Drop-in LiteLLM compatibility, 54x faster P99 latency, 68% less memory. Plus enterprise features like RBAC, guardrails, clustering, and audit logs that alternatives don't offer.
[ GET STARTED ]
Bifrost is open source and production-ready. Teams deploy in hours and scale without rethinking the architecture.
[ BIFROST FEATURES ]
Everything you need to run AI in production, from free open source to enterprise-grade features.
01 Governance
SAML-based SSO, role-based access control, and policy enforcement for team collaboration.
02 Adaptive Load Balancing
Automatically optimizes traffic distribution across provider keys and models based on real-time performance metrics.
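To illustrate the general idea (this is a simplified sketch, not Bifrost's actual algorithm; all names are hypothetical), a latency-weighted key selector sends more traffic to keys that are currently responding fastest:

```python
import random

def pick_key(latencies_ms):
    """Pick a provider key, weighting toward lower observed latency.

    latencies_ms: dict of key name -> recent latency in milliseconds.
    Weights are inverse latency, so faster keys receive more traffic.
    """
    weights = {k: 1.0 / v for k, v in latencies_ms.items()}
    total = sum(weights.values())
    r = random.uniform(0, total)
    cumulative = 0.0
    for key, w in weights.items():
        cumulative += w
        if r <= cumulative:
            return key
    return key  # fallback for floating-point rounding at the boundary

# openai-key-1 is 4x faster, so it should receive roughly 80% of traffic.
keys = {"openai-key-1": 120.0, "openai-key-2": 480.0}
counts = {k: 0 for k in keys}
for _ in range(10_000):
    counts[pick_key(keys)] += 1
```

A production implementation would refresh the latency metrics continuously and fold in error rates and rate-limit headroom, but the weighting principle is the same.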
03 Cluster Mode
High availability deployment with automatic failover and load balancing. Peer-to-peer clustering where every instance is equal.
04 Alerts
Real-time notifications for budget limits, failures, and performance issues on Email, Slack, PagerDuty, Teams, Webhook and more.
05 Log Exports
Export and analyze request logs, traces, and telemetry data from Bifrost with enterprise-grade data export capabilities for compliance, monitoring, and analytics.
06 Audit Logs
Comprehensive logging and audit trails for compliance and debugging.
07 Vault Support
Secure API key management with HashiCorp Vault, AWS Secrets Manager, Google Secret Manager, and Azure Key Vault integration.
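The pattern is to reference secrets from a backend instead of storing them inline. The fragment below is a hypothetical sketch of that shape; the field names are illustrative and not Bifrost's actual config schema:

```json
{
  "providers": {
    "openai": {
      "api_key": {
        "source": "hashicorp_vault",
        "path": "secret/data/llm/openai",
        "field": "api_key"
      }
    },
    "anthropic": {
      "api_key": {
        "source": "aws_secrets_manager",
        "secret_id": "prod/llm/anthropic"
      }
    }
  }
}
```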
08 VPC Deployment
Deploy Bifrost within your private cloud infrastructure with VPC isolation, custom networking, and enhanced security controls.
09 Guardrails
Automatically detect and block unsafe model outputs with real-time policy enforcement and content moderation across all agents.
[ SHIP RELIABLE AI ]
Change just one line of code. Works with OpenAI, Anthropic, Vercel AI SDK, LangChain, and more.
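For example, with any OpenAI-compatible client the only change is the base URL. The sketch below uses only the Python standard library; the `bifrost.internal` hostname, the `/v1` path, and the virtual key value are assumptions about a typical deployment:

```python
import json
import urllib.request

# Before: the client pointed at https://api.openai.com/v1
# After: point the same client at your Bifrost endpoint.
BASE_URL = "https://bifrost.internal/v1"  # assumed internal hostname

def chat_completion_request(model, messages, api_key="sk-virtual-key"):
    """Build an OpenAI-compatible chat completion request aimed at Bifrost."""
    body = json.dumps({"model": model, "messages": messages}).encode()
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

req = chat_completion_request("gpt-4o", [{"role": "user", "content": "ping"}])
# urllib.request.urlopen(req) would send it; omitted here because the
# endpoint is environment-specific.
```

With the official SDKs the equivalent change is passing your Bifrost URL as the client's `base_url`; nothing else in the calling code changes.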
[ FAQ ]
Can Bifrost run entirely inside our network?
Yes. Bifrost deploys entirely within your VPC on AWS, GCP, or Azure with complete network isolation. All LLM requests, API keys, prompts, and completions stay within your network perimeter. Combined with vault support, no secrets leave your infrastructure.
Does Bifrost support air-gapped environments?
Yes. Export the Bifrost Docker image on a connected machine using docker save, transfer the tarball to your air-gapped environment, and load it into your internal registry. No phone-home, no telemetry, fully offline operation.
What infrastructure-as-code options are available?
Bifrost provides a single Terraform module that targets AWS (EKS/ECS), GCP (GKE/Cloud Run), Azure (AKS), and generic Kubernetes. Helm charts are also available for Kubernetes deployments. Docker Compose and a single binary are provided for on-premise and bare metal.
What are the minimum resource requirements?
A single Bifrost node requires 2 vCPU and 4GB RAM minimum. For production HA deployments, a 3-node cluster is recommended. The Go-native binary has a minimal footprint with 68% less memory usage than Python-based alternatives.
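Expressed as a standard Kubernetes manifest (a generic sketch reflecting the stated sizing, not the Helm chart's actual values schema), the recommended production footprint looks like:

```yaml
# 2 vCPU / 4 GiB per node, 3 replicas for production HA.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: bifrost
spec:
  replicas: 3
  selector:
    matchLabels:
      app: bifrost
  template:
    metadata:
      labels:
        app: bifrost
    spec:
      containers:
        - name: bifrost
          image: registry.internal/bifrost:latest  # placeholder image reference
          resources:
            requests:
              cpu: "2"
              memory: 4Gi
```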