Set up AWS Bedrock through IAM, configure credentials, and integrate with Bifrost for cost tracking and multi-model governance. Complete in 10 minutes.
Bifrost supports AWS Bedrock models through AWS SDK and IAM authentication. Bedrock provides access to multiple foundation models from leading providers.
| Property | Details |
|---|---|
| Description | AWS Bedrock provides serverless access to foundation models including Claude, Llama, and others for text, image, and code generation. |
| Provider route on Bifrost | bedrock/<model> |
| Provider doc | AWS Bedrock Documentation |
| API endpoint for provider | bedrock.*.amazonaws.com |
| Supported endpoints | /v1/models, /v1/completions, /v1/chat/completions, /v1/responses, /v1/images/generations, /v1/images/edits, /v1/images/variations, /v1/embeddings, /v1/files, /v1/batches, /v1/count-tokens, /v1/rerank |
| Auth method | AWS IAM Credentials |
Use these AWS-hosted links for console access, API documentation, and authentication details.
Before you begin, you will need:
[ QUICK START ]
Visit aws.amazon.com and create a new account or sign in.
Go to aws.amazon.com and create a new account or sign in to your existing account.
Enable models you want to use in the Bedrock console.
Navigate to the Bedrock console and go to "Model access". Request access to the foundation models you want to use (Claude, Llama, etc).
Create a dedicated user with Bedrock permissions.
In the IAM console, create a new user and attach the BedrockFullAccess policy for development, or a more restrictive policy for production.
Generate access key ID and secret access key.
In the IAM user details, go to "Security credentials" and create a new access key. Copy both the Access Key ID and Secret Access Key immediately.
Authenticate with AWS SDK and invoke a model.
Use your AWS credentials with the Bedrock SDK:
import boto3 client = boto3.client( 'bedrock-runtime', region_name='us-east-1' ) response = client.invoke_model( modelId='anthropic.claude-v2', body=b'{"prompt": "Hello Bedrock!"}' ) print(response['body'].read())
[ MODELS ]
| Model | API ID | Provider | Best for |
|---|---|---|---|
| Jamba 1.5 Large | ai21.jamba-1-5-large-v1:0 | AI21 Labs | Complex reasoning across long documents (256K context). |
| Jamba 1.5 Mini | ai21.jamba-1-5-mini-v1:0 | AI21 Labs | Faster, lower-cost Jamba workloads. |
| Nova 2 Lite | amazon.nova-2-lite-v1:0 | Amazon | Low-latency multimodal tasks with Nova 2. |
| Nova 2 Sonic | amazon.nova-2-sonic-v1:0 | Amazon | Real-time speech and audio interactions. |
| Nova Lite | amazon.nova-lite-v1:0 | Amazon | Fast, cost-effective multimodal workloads. |
| Nova Micro | amazon.nova-micro-v1:0 | Amazon | Ultra-low-latency text generation. |
| Nova Premier | amazon.nova-premier-v1:0 | Amazon | Highest-capability Amazon Nova for complex tasks. |
| Nova Pro | amazon.nova-pro-v1:0 | Amazon | Balanced accuracy, speed, and cost across text, image, and video. |
| Nova Sonic | amazon.nova-sonic-v1:0 | Amazon | Speech-to-speech and conversational audio. |
| Nova Canvas | amazon.nova-canvas-v1:0 | Amazon | Image generation and editing. |
| Nova Reel | amazon.nova-reel-v1:0 | Amazon | Video generation from text and images. |
| Nova Multimodal Embeddings | amazon.nova-multimodal-embeddings-v1:0 | Amazon | Multimodal search and retrieval embeddings. |
| Titan Text Large | amazon.titan-text-express-v1 | Amazon | General-purpose text generation on Titan. |
| Titan Text Embeddings V2 | amazon.titan-embed-text-v2:0 | Amazon | Text embeddings for RAG and semantic search. |
| Titan Embeddings G1 - Text | amazon.titan-embed-text-v1 | Amazon | Legacy text embedding workloads. |
| Titan Multimodal Embeddings G1 | amazon.titan-embed-image-v1 | Amazon | Image and text combined embeddings. |
| Titan Image Generator G1 v2 | amazon.titan-image-generator-v2:0 | Amazon | Image generation with Titan. |
| Claude Opus 4.7 | anthropic.claude-opus-4-7 | Anthropic | Flagship coding, agents, and enterprise workflows (1M context). |
| Claude Opus 4.6 | anthropic.claude-opus-4-6-v1 | Anthropic | Top-tier reasoning and long-running agentic tasks. |
| Claude Sonnet 4.6 | anthropic.claude-sonnet-4-6 | Anthropic | Balanced performance for production agents and coding. |
| Claude Sonnet 4.5 | anthropic.claude-sonnet-4-5-20250929-v1:0 | Anthropic | Agents, coding, and computer use with strong benchmarks. |
| Claude Haiku 4.5 | anthropic.claude-haiku-4-5-20251001-v1:0 | Anthropic | Fast, cost-efficient Claude for high-volume workloads. |
| Claude Opus 4.5 | anthropic.claude-opus-4-5-20251101-v1:0 | Anthropic | Advanced reasoning with extended thinking support. |
| Claude Sonnet 4 | anthropic.claude-sonnet-4-20250514-v1:0 | Anthropic | Strong general-purpose Claude 4 generation. |
| Claude Opus 4.1 | anthropic.claude-opus-4-1-20250805-v1:0 | Anthropic | High-intelligence tasks requiring Opus-class capability. |
| Claude 3.5 Haiku | anthropic.claude-3-5-haiku-20241022-v1:0 | Anthropic | Fast Claude 3.5 tier for latency-sensitive apps. |
| Claude 3 Haiku | anthropic.claude-3-haiku-20240307-v1:0 | Anthropic | Lightweight Claude 3 for simple, fast tasks. |
| Claude Mythos Preview | anthropic.claude-mythos-preview | Anthropic | Preview model for early-access evaluation. |
| Command R+ | cohere.command-r-plus-v1:0 | Cohere | Complex RAG and multi-step tool use. |
| Command R | cohere.command-r-v1:0 | Cohere | RAG and conversational AI at lower cost than R+. |
| Rerank 3.5 | cohere.rerank-v3-5:0 | Cohere | Improving retrieval ranking in RAG pipelines. |
| Embed v4 | cohere.embed-v4:0 | Cohere | Latest-generation Cohere embeddings. |
| Embed English | cohere.embed-english-v3 | Cohere | English-only embedding workloads. |
| Embed Multilingual | cohere.embed-multilingual-v3 | Cohere | Multilingual embedding and search. |
| DeepSeek V3.2 | deepseek.v3-2-v1:0 | DeepSeek | Latest DeepSeek general and coding performance. |
| DeepSeek-V3.1 | deepseek.v3-1-v1:0 | DeepSeek | Strong open-weight-class performance on Bedrock. |
| DeepSeek-R1 | deepseek.r1-v1:0 | DeepSeek | Chain-of-thought reasoning for math, code, and logic. |
| Gemma 3 27B PT | google.gemma-3-27b-pt-v1:0 | Larger Gemma 3 pre-trained base workloads. | |
| Gemma 3 12B IT | google.gemma-3-12b-it-v1:0 | Instruction-tuned Gemma 3 for chat and assistants. | |
| Gemma 3 4B IT | google.gemma-3-4b-it-v1:0 | Compact Gemma 3 for edge and high-volume use. | |
| Llama 4 Maverick 17B Instruct | meta.llama4-maverick-17b-instruct-v1:0 | Meta | Latest Llama 4 family for general instruction following. |
| Llama 4 Scout 17B Instruct | meta.llama4-scout-17b-instruct-v1:0 | Meta | Efficient Llama 4 variant for exploration and routing. |
| Llama 3.3 70B Instruct | meta.llama3-3-70b-instruct-v1:0 | Meta | Strong open-model reasoning and coding (128K context). |
| Llama 3.2 90B Instruct | meta.llama3-2-90b-instruct-v1:0 | Meta | Multimodal-capable large Llama 3.2. |
| Llama 3.2 11B Instruct | meta.llama3-2-11b-instruct-v1:0 | Meta | Balanced Llama 3.2 for vision and text. |
| Llama 3.2 3B Instruct | meta.llama3-2-3b-instruct-v1:0 | Meta | Small multimodal Llama for low latency. |
| Llama 3.2 1B Instruct | meta.llama3-2-1b-instruct-v1:0 | Meta | On-device-class Llama 3.2 workloads. |
| Llama 3.1 405B Instruct | meta.llama3-1-405b-instruct-v1:0 | Meta | Largest Llama 3.1 for maximum capability. |
| Llama 3.1 70B Instruct | meta.llama3-1-70b-instruct-v1:0 | Meta | Production Llama 3.1 at scale. |
| Llama 3.1 8B Instruct | meta.llama3-1-8b-instruct-v1:0 | Meta | Cost-efficient Llama 3.1 inference. |
| Llama 3 70B Instruct | meta.llama3-70b-instruct-v1:0 | Meta | Llama 3 generation general workloads. |
| Llama 3 8B Instruct | meta.llama3-8b-instruct-v1:0 | Meta | Lightweight Llama 3 chat and completion. |
| MiniMax M2.5 | minimax.m2-5-v1:0 | MiniMax | Latest MiniMax general and agent workloads. |
| MiniMax M2.1 | minimax.m2-1-v1:0 | MiniMax | Prior-generation MiniMax at lower cost. |
| MiniMax M2 | minimax.m2-v1:0 | MiniMax | Entry MiniMax tier on Bedrock. |
| Mistral Large 3 | mistral.mistral-large-3-v1:0 | Mistral AI | Flagship Mistral for complex reasoning and agents. |
| Mistral Large | mistral.mistral-large-2407-v1:0 | Mistral AI | Prior large Mistral for multilingual tasks. |
| Mistral Small | mistral.mistral-small-2402-v1:0 | Mistral AI | Cost-efficient Mistral for high volume. |
| Ministral 3 8B | mistral.ministral-3-8b-v1:0 | Mistral AI | Compact Mistral 3 generation. |
| Ministral 3B | mistral.ministral-3b-v1:0 | Mistral AI | Ultra-efficient edge-style inference. |
| Ministral 14B 3.0 | mistral.ministral-14b-3-0-v1:0 | Mistral AI | Mid-size Ministral with strong efficiency. |
| Devstral 2 123B | mistral.devstral-2-123b-v1:0 | Mistral AI | Large code-agent and software engineering tasks. |
| Magistral Small 2509 | mistral.magistral-small-2509-v1:0 | Mistral AI | Reasoning-focused smaller Magistral tier. |
| Pixtral Large | mistral.pixtral-large-2502-v1:0 | Mistral AI | Vision-language and multimodal inputs. |
| Voxtral Mini 3B 2507 | mistral.voxtral-mini-3b-2507-v1:0 | Mistral AI | Compact audio-to-text transcription. |
| Voxtral Small 24B 2507 | mistral.voxtral-small-24b-2507-v1:0 | Mistral AI | Higher-quality speech understanding. |
| Mistral 7B Instruct | mistral.mistral-7b-instruct-v0:2 | Mistral AI | Legacy lightweight Mistral instruct model. |
| Mixtral 8x7B Instruct | mistral.mixtral-8x7b-instruct-v0:1 | Mistral AI | MoE instruct model for diverse tasks. |
| Kimi K2.5 | moonshot.kimi-k2-5-v1:0 | Moonshot AI | Latest Kimi for long-context and agent tasks. |
| Kimi K2 Thinking | moonshot.kimi-k2-thinking-v1:0 | Moonshot AI | Reasoning-heavy Kimi workloads. |
| NVIDIA Nemotron 3 Super 120B | nvidia.nemotron-super-3-120b-v1:0 | NVIDIA | Large Nemotron for enterprise agents. |
| Nemotron Nano 3 30B | nvidia.nemotron-nano-3-30b-v1:0 | NVIDIA | Mid-size Nemotron for balanced cost and quality. |
| NVIDIA Nemotron Nano 9B v2 | nvidia.nemotron-nano-9b-v2-v1:0 | NVIDIA | Efficient Nemotron for high throughput. |
| NVIDIA Nemotron Nano 12B v2 VL BF16 | nvidia.nemotron-nano-12b-v2-vl-bf16-v1:0 | NVIDIA | Vision-language Nemotron workloads. |
| gpt-oss-120b | openai.gpt-oss-120b-v1:0 | OpenAI | Large open-weight GPT-OSS on Bedrock. |
| gpt-oss-20b | openai.gpt-oss-20b-v1:0 | OpenAI | Smaller GPT-OSS for cost-sensitive workloads. |
| GPT OSS Safeguard 120B | openai.gpt-oss-safeguard-120b-v1:0 | OpenAI | Safety-classified large GPT-OSS variant. |
| GPT OSS Safeguard 20B | openai.gpt-oss-safeguard-20b-v1:0 | OpenAI | Safety-classified compact GPT-OSS variant. |
| Qwen3 Coder 480B A35B Instruct | qwen.qwen3-coder-480b-a35b-instruct-v1:0 | Qwen | Large MoE coding model on Bedrock. |
| Qwen3 VL 235B A22B | qwen.qwen3-vl-235b-a22b-v1:0 | Qwen | Vision-language Qwen at scale. |
| Qwen3 235B A22B 2507 | qwen.qwen3-235b-a22b-2507-v1:0 | Qwen | Flagship Qwen3 text reasoning. |
| Qwen3 Next 80B A3B | qwen.qwen3-next-80b-a3b-v1:0 | Qwen | Next-gen Qwen3 architecture. |
| Qwen3 32B | qwen.qwen3-32b-v1:0 | Qwen | Dense Qwen3 for production chat. |
| Qwen3 Coder Next | qwen.qwen3-coder-next-v1:0 | Qwen | Latest Qwen coding-focused model. |
| Qwen3-Coder-30B-A3B-Instruct | qwen.qwen3-coder-30b-a3b-instruct-v1:0 | Qwen | Mid-size Qwen coder for balanced spend. |
| Stable Image Conservative Upscale | stability.stable-image-conservative-upscale-v1:0 | Stability AI | Subtle image upscaling. |
| Stable Image Creative Upscale | stability.stable-image-creative-upscale-v1:0 | Stability AI | Creative detail enhancement when upscaling. |
| Stable Image Fast Upscale | stability.stable-image-fast-upscale-v1:0 | Stability AI | Quick upscaling pipelines. |
| Stable Image Control Sketch | stability.stable-image-control-sketch-v1:0 | Stability AI | Sketch-guided image generation. |
| Stable Image Control Structure | stability.stable-image-control-structure-v1:0 | Stability AI | Structure-preserving image edits. |
| Stable Image Erase Object | stability.stable-image-erase-object-v1:0 | Stability AI | Object removal from images. |
| Stable Image Inpaint | stability.stable-image-inpaint-v1:0 | Stability AI | Masked region inpainting. |
| Stable Image Outpaint | stability.stable-image-outpaint-v1:0 | Stability AI | Extending image borders. |
| Stable Image Remove Background | stability.stable-image-remove-background-v1:0 | Stability AI | Background removal. |
| Stable Image Search and Recolor | stability.stable-image-search-recolor-v1:0 | Stability AI | Semantic recoloring from prompts. |
| Stable Image Search and Replace | stability.stable-image-search-replace-v1:0 | Stability AI | Prompt-based object replacement. |
| Stable Image Style Guide | stability.stable-image-style-guide-v1:0 | Stability AI | Style-consistent generation. |
| Stable Image Style Transfer | stability.stable-image-style-transfer-v1:0 | Stability AI | Applying reference styles to images. |
| Pegasus v1.2 | twelvelabs.pegasus-v1-2-v1:0 | TwelveLabs | Video understanding and captioning. |
| Marengo Embed 3.0 | twelvelabs.marengo-embed-3-0-v1:0 | TwelveLabs | Latest video embedding search. |
| Marengo Embed v2.7 | twelvelabs.marengo-embed-v2-7-v1:0 | TwelveLabs | Prior Marengo video embeddings. |
| Palmyra X5 | writer.palmyra-x5-v1:0 | Writer | Latest Palmyra enterprise text model. |
| Palmyra X4 | writer.palmyra-x4-v1:0 | Writer | Prior Palmyra generation for business writing. |
| Palmyra Vision 7B | writer.palmyra-vision-7b-v1:0 | Writer | Vision-capable Palmyra for document AI. |
| GLM 5 | zai.glm-5-v1:0 | Z.AI | Latest GLM flagship on Bedrock. |
| GLM 4.7 | zai.glm-4-7-v1:0 | Z.AI | Strong GLM 4.7 general reasoning. |
| GLM 4.7 Flash | zai.glm-4-7-flash-v1:0 | Z.AI | Low-latency GLM 4.7 variant. |
Model IDs are in-region identifiers; geo and global inference prefixes (for example us., global.) may apply. Availability varies by AWS Region, confirm access in the Bedrock console and see the official models catalog for the latest list and pricing.
[ TROUBLESHOOTING ]
| Error | Likely Cause | What to Do |
|---|---|---|
AccessDenied | IAM user lacks Bedrock permissions. | Attach BedrockFullAccess or appropriate permissions to the IAM user. |
ModelNotFound | Model not enabled or wrong region. | Enable the model in Bedrock console. Verify you're in the correct region. |
ThrottlingException | Rate limit exceeded. | Implement exponential backoff. Use Bifrost for load distribution. |
[ PRODUCTION-READY ]
Bifrost is a drop-in replacement for AWS Bedrock SDKs. Update your base URL and keep your client code. Bifrost handles cost tracking, virtual keys, budgets, and intelligent failover.
Run the Bifrost gateway and configure your Bedrock credentials in the Web UI.
$ npx -y @maximhq/bifrost
✓ Bifrost started ├─ HTTP server listening on http://localhost:8080 ├─ Web UI available at http://localhost:8080 └─ Configure providers and virtual keys in the dashboard
Update your SDK to route through Bifrost's AWS-compatible gateway.
import boto3 # BEFORE # client = boto3.client('bedrock-runtime', region_name='us-east-1') # AFTER: route via Bifrost + virtual key client = boto3.client( 'bedrock-runtime', region_name='us-east-1', endpoint_url='http://localhost:8080/bedrock' ) response = client.invoke_model( modelId='anthropic.claude-v2', body=b'{"prompt": "Hello from Bifrost!"}' ) print(response['body'].read())
[ WHAT'S NEXT ]
You have your API key. Add governance, guardrails, and MCP controls for production.
[ BIFROST FEATURES ]
Everything you need to run AI in production, from free open source to enterprise-grade features.
01 Governance
SAML support for SSO and Role-based access control and policy enforcement for team collaboration.
02 Adaptive Load Balancing
Automatically optimizes traffic distribution across provider keys and models based on real-time performance metrics.
03 Cluster Mode
High availability deployment with automatic failover and load balancing. Peer-to-peer clustering where every instance is equal.
04 Alerts
Real-time notifications for budget limits, failures, and performance issues on Email, Slack, PagerDuty, Teams, Webhook and more.
05 Log Exports
Export and analyze request logs, traces, and telemetry data from Bifrost with enterprise-grade data export capabilities for compliance, monitoring, and analytics.
06 Audit Logs
Comprehensive logging and audit trails for compliance and debugging.
07 Vault Support
Secure API key management with HashiCorp Vault, AWS Secrets Manager, Google Secret Manager, and Azure Key Vault integration.
08 VPC Deployment
Deploy Bifrost within your private cloud infrastructure with VPC isolation, custom networking, and enhanced security controls.
09 Guardrails
Automatically detect and block unsafe model outputs with real-time policy enforcement and content moderation across all agents.
[ SHIP RELIABLE AI ]
Change just one line of code. Works with OpenAI, Anthropic, Vercel AI SDK, LangChain, and more.
[ FAQ ]
Yes. AWS Bedrock is an AWS service that requires an active AWS account. Start with the AWS free tier which includes Bedrock credits for new customers.
AWS Bedrock is available in us-east-1, us-west-2, eu-west-1, and other regions. Choose the region closest to your application for lowest latency.
Log in to the AWS IAM console, navigate to Users, select your user, go to Security credentials, and create a new access key. Store the credentials securely.
Yes. AWS Bedrock works with temporary security credentials from STS, which is ideal for applications running on EC2, Lambda, and other AWS services.
Use AWS Cost Explorer and CloudWatch to monitor Bedrock usage. For cross-provider cost tracking, route through Bifrost for unified dashboards.
On-demand pricing is pay-per-use. Provisioned throughput offers discounts for predictable, sustained usage. Use Bifrost to manage both modes across providers.