Try Bifrost Enterprise free for 14 days.
Request access
[ ENTERPRISE DEPLOYMENT ]

Deploy Bifrost in Your VPC,
On-Premise, or Air-Gapped

Your infrastructure, your keys, zero egress. Deploy with Terraform, connect your vault, and keep all data inside your network.

[ PERFORMANCE AT A GLANCE ]

11µs
Gateway Overhead
Per request at 5K RPS
99.999%
Uptime SLA
Enterprise In-VPC deployments
In-VPC
Supported
AWS, GCP, Azure, Cloudflare, Vercel
0 Bytes
Data Egress
All processing inside your network

[ THE PROBLEM ]

What Happens When Your AI Gateway
Lives Outside Your Network

Most AI gateways are SaaS-only or lack enterprise deployment flexibility. Scaling AI across regulated, multi-cloud, or restricted environments surfaces problems that managed proxies don't solve.

No data sovereignty

SaaS gateways route every prompt, completion, and API key through a third-party network. Sensitive data leaves your perimeter, and compliance teams block adoption before it starts.

No air-gapped support

Classified and regulated environments require zero external network access. SaaS gateways cannot operate offline, and most open-source proxies still phone home for updates or telemetry.

Multi-cloud fragmentation

When teams deploy a separate gateway in each cloud, policies drift between environments, configuration is duplicated manually, and rate limits or budgets cannot be enforced across providers from a single control plane.

No production-grade HA

Single-instance gateways are single points of failure. Without native clustering, automatic failover, or zero-downtime rolling updates, any outage takes your entire AI stack offline.

[ DEPLOYMENT MODELS ]

Four Deployment Models, One Gateway

Match your infrastructure, compliance, and security requirements. Bifrost deploys the same way regardless of where it runs.

In-VPC

In-VPC / Private Cloud

Deploy entirely within your VPC on AWS, GCP, or Azure. Complete network isolation with native IAM integration, private endpoints, and no external dependencies.

AWS EKS/ECS · GCP GKE · Azure AKS
99.999% SLA
Multi-zone HA
Zero data egress
View docs →
On-Premise

On-Premise / Bare Metal

Run on your own hardware via Kubernetes, Docker Compose, or a single binary on bare metal VMs. Full control over compute, storage, and networking with no cloud dependency.

Kubernetes v1.19+ · Docker · Single binary
Private registry mirror
Credential rotation
Helm + Compose
View docs →
Air-Gapped

Air-Gapped Environments

For environments with zero internet access. Export the Bifrost image on a connected machine, transfer it via tarball, and load it into your internal registry. No phone-home, no telemetry.

Docker save/load · Internal registry
Fully offline operation
No telemetry leakage
Image mirroring
View docs →
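The export/import cycle described above can be sketched as a short runbook. The image name, tag, and internal registry host below are illustrative placeholders, not documented values:

```shell
# On a machine with internet access: pull the image and export it as a tarball.
docker pull maximhq/bifrost:latest            # image name/tag assumed for illustration
docker save maximhq/bifrost:latest -o bifrost.tar

# Transfer bifrost.tar into the air-gapped network (removable media, etc.).

# Inside the air-gapped network: load the tarball and push to your internal mirror.
docker load -i bifrost.tar
docker tag maximhq/bifrost:latest registry.internal:5000/bifrost:latest
docker push registry.internal:5000/bifrost:latest
```

The same cycle is repeated for each manual update; nothing in the workflow requires outbound network access from the air-gapped side.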
Edge / Dev

Single-Node & Edge

Run a single Bifrost instance for dev/test, branch offices, or edge deployments. Minimal footprint with instant setup via Docker Compose, fly.io, or a single Go binary.

Docker Compose · fly.io · Helm
2 vCPU / 4GB minimum
Single binary deploy
30-second setup
View docs →

[ CLOUD SUPPORT ]

Native Integration with Every Major Cloud

Cloud-native authentication and registry distribution per platform. No fighting your cloud provider's security model.

Cloud      | Targets                 | Auth Method       | Registry          | IaC
AWS        | EKS, ECS                | IRSA              | Artifact Registry | Terraform, Helm
GCP        | GKE, Cloud Run          | Workload Identity | Artifact Registry | Terraform, Helm
Azure      | AKS                     | Azure WIF         | Artifact Registry | Terraform, Helm
On-Premise | K8s, Docker, Bare Metal | Basic Auth        | Internal mirror   | Helm, Compose, Single binary
main.tf
One module, any cloud
module "bifrost" {
  source         = "github.com/maximhq/bifrost//terraform/modules/bifrost?ref=terraform/v0.1.0"
  cloud_provider = "aws"          # "aws" | "gcp" | "azure" | "kubernetes"
  service        = "eks"          # AWS: "ecs" | "eks", GCP: "gke" | "cloud-run", Azure: "aks"
  region         = "us-east-1"
  image_tag      = "latest"
}

[ CORE CAPABILITIES ]

Enterprise Controls without Changing How You Deploy

Every deployment model ships with the same HA clustering, security controls, and observability stack.

High Availability & Clustering

Zero-downtime at scale

  • Peer-to-peer architecture, no single point of failure
  • 6 discovery methods: K8s, Consul, etcd, DNS, UDP, mDNS
  • Gossip-based state sync for rate limits and traffic
  • 3-node minimum
  • Zero-downtime rolling updates
  • Adaptive load balancing with health monitoring
View docs →

Security & Compliance

Zero trust architecture, audit ready

  • Complete VPC isolation, no external dependencies
  • Vault: HashiCorp, AWS SM, GCP SM, Azure KV
  • TLS 1.3 in-transit, KMS encryption at-rest
  • RBAC + OIDC (Okta, Entra ID) + virtual keys
  • Audit trails for SOC 2 Type II, GDPR, HIPAA, ISO 27001
  • Container image signing + vulnerability scanning
View docs →

Observability

Full stack monitoring

  • Built-in dashboard for real-time monitoring
  • Native Prometheus metrics (scrape or Push Gateway)
  • OpenTelemetry / OTLP distributed tracing
  • Native Datadog/BigQuery connectors
  • Log exports to data lakes and storage
  • Health endpoints: /health, /ready, /cluster/status
View docs →
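The health endpoints listed above give a quick sanity check once a node is up. The hostname below is a placeholder for your own deployment:

```shell
# Liveness, readiness, and cluster membership; hostname is illustrative.
curl -s http://bifrost.internal/health
curl -s http://bifrost.internal/ready
curl -s http://bifrost.internal/cluster/status
```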

[ HOW IT WORKS ]

Zero-Friction Setup: Deploy in Under a Day

No custom agents, no proprietary orchestration. Deploy with your existing infrastructure tools.

Step 01

Choose deployment model

VPC, on-prem, air-gapped, or multi-cloud. Pick the model that matches your infrastructure and compliance requirements.

# In-VPC, On-Premise, Air-Gapped, Multi-Cloud, or Edge
Step 02

Provision infrastructure

Use the Terraform module or Helm chart. AWS, GCP, Azure, and generic Kubernetes all supported out of the box.

terraform apply
Step 03

Configure Bifrost

Single JSON config file. Connect a config store (Postgres or file), set up providers, and define virtual keys and routing rules.

# config.json: providers, virtual keys, routing
Step 04

Deploy

Single Go binary with minimal dependencies. Helm install, docker compose up, or terraform apply.

helm install bifrost maximhq/bifrost
Step 05

Integrate your SDKs

Change one line in your existing OpenAI, Anthropic, or LiteLLM SDK. Point at your Bifrost endpoint.

base_url = "https://bifrost.internal"
Step 06

Monitor & scale

Enable clustering for HA, connect Prometheus or OpenTelemetry, set up auto-scaling. Production ready.

curl http://bifrost.internal/cluster/status

[ COMPARISON ]

Self-Hosted Gateway vs SaaS Proxy
vs Bifrost Enterprise

Capability                        | SaaS Proxy   | OSS (LiteLLM etc.) | Bifrost Enterprise
In-VPC deployment                 | No           | Manual setup       | Terraform + Helm
Air-gapped support                | No           | No                 | Docker save/load
Cloud-native auth                 | API key only | API key only       | IRSA, Workload Identity, Azure WIF
Vault integration                 | No           | No                 | HashiCorp, AWS SM, GCP SM, Azure KV
RBAC + SSO                        | Limited      | No                 | Okta, Entra ID, OIDC
Audit logs (SOC 2 Type II, HIPAA) | Limited      | No                 | Immutable, compliance ready
Data stays in your network        | No           | Yes                | Yes, zero egress
P99 latency at 500 RPS            | ~50ms        | ~90.72s            | ~1.68s
Uptime SLA                        | Varies       | None               | 99.999%

[ USE CASES ]

Real-World Scenarios Where Deployment
Flexibility Changes the Game

Regulated Industries

Healthcare, Finance & Government

Compliance requires all AI traffic to stay within your VPC. No data can transit third-party proxies.

In-VPC deployment with complete network isolation, audit logs for HIPAA/SOC 2 Type II/GDPR, Vault for secrets, and zero data egress. All processing stays inside your controlled environment.

Defense & Air-Gapped

Zero Internet Access Environments

Your environment has no internet. You need a gateway that runs entirely offline with no phone-home.

Docker image export/import workflow, internal registry mirroring, no telemetry leakage. On-prem credential management with offline operation and manual update cycles.

Multi-Cloud Enterprise

AWS + GCP + Azure, Unified

Your org runs AWS for production, GCP for ML, Azure for business units. You need one gateway with consistent governance.

Clustered deployment across clouds with gossip-based state sync, cloud-native auth per environment (IRSA, Workload Identity, WIF), and unified rate limiting and budgets.

Gateway Migration

Migrating from LiteLLM or Other Proxies

You're hitting performance ceilings, missing enterprise features, or struggling with reliability on your current gateway.

Drop-in LiteLLM compatibility, 54x faster P99 latency, 68% less memory. Plus enterprise features like RBAC, guardrails, clustering, and audit logs that alternatives don't offer.

[ GET STARTED ]

Ready to Deploy Bifrost
in Your Environment?

Bifrost is open source and production-ready. Teams deploy in hours and scale without rethinking the architecture.

AICPA SOC
GDPR
ISO 27001
HIPAA

[ BIFROST FEATURES ]

Open Source & Enterprise

Everything you need to run AI in production, from free open source to enterprise-grade features.

01 Governance

SAML-based SSO, role-based access control, and policy enforcement for team collaboration.

02 Adaptive Load Balancing

Automatically optimizes traffic distribution across provider keys and models based on real-time performance metrics.

03 Cluster Mode

High availability deployment with automatic failover and load balancing. Peer-to-peer clustering where every instance is equal.

04 Alerts

Real-time notifications for budget limits, failures, and performance issues via Email, Slack, PagerDuty, Teams, webhooks, and more.

05 Log Exports

Export and analyze request logs, traces, and telemetry data from Bifrost with enterprise-grade data export capabilities for compliance, monitoring, and analytics.

06 Audit Logs

Comprehensive logging and audit trails for compliance and debugging.

07 Vault Support

Secure API key management with HashiCorp Vault, AWS Secrets Manager, Google Secret Manager, and Azure Key Vault integration.

08 VPC Deployment

Deploy Bifrost within your private cloud infrastructure with VPC isolation, custom networking, and enhanced security controls.

09 Guardrails

Automatically detect and block unsafe model outputs with real-time policy enforcement and content moderation across all agents.

[ SHIP RELIABLE AI ]

Try Bifrost Enterprise with a 14-day Free Trial

[quick setup]

Drop-in replacement for any AI SDK

Change just one line of code. Works with OpenAI, Anthropic, Vercel AI SDK, LangChain, and more.

import os
from anthropic import Anthropic

anthropic = Anthropic(
    api_key=os.environ.get("ANTHROPIC_API_KEY"),
    base_url="https://<bifrost_url>/anthropic",
)

message = anthropic.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello, Claude"}
    ]
)
Drop in once, run everywhere.

[ FAQ ]

Frequently Asked Questions

Can Bifrost run entirely inside our VPC?

Yes. Bifrost deploys entirely within your VPC on AWS, GCP, or Azure with complete network isolation. All LLM requests, API keys, prompts, and completions stay within your network perimeter. Combined with vault support, no secrets leave your infrastructure.

Does Bifrost support air-gapped environments?

Yes. Export the Bifrost Docker image on a connected machine using docker save, transfer the tarball to your air-gapped environment, and load it into your internal registry. No phone-home, no telemetry, fully offline operation.

What infrastructure-as-code options are available?

Bifrost provides a single Terraform module that targets AWS (EKS/ECS), GCP (GKE/Cloud Run), Azure (AKS), and generic Kubernetes. Helm charts are also available for Kubernetes deployments. Docker Compose and single binary are provided for on-premise and bare metal.

What are the minimum resource requirements?

A single Bifrost node requires 2 vCPU and 4GB RAM minimum. For production HA deployments, a 3-node cluster is recommended. The Go-native binary has a minimal footprint with 68% less memory usage than Python-based alternatives.