Process multiple types of content including images, audio, and text with AI models. Bifrost supports vision analysis, speech synthesis, and audio transcription across various providers.
alloy
- Balanced, natural voiceecho
- Deep, resonant voicefable
- Expressive, storytelling voiceonyx
- Strong, confident voicenova
- Bright, energetic voiceshimmer
- Gentle, soothing voiceProvider | Vision | Text-to-Speech | Speech-to-Text |
---|---|---|---|
OpenAI | ✅ GPT-4V, GPT-4o | ✅ TTS-1, TTS-1-HD | ✅ Whisper |
Anthropic | ✅ Claude 3 Sonnet/Opus | ❌ | ❌ |
Google Vertex | ✅ Gemini Pro Vision | ✅ | ✅ |
Azure OpenAI | ✅ GPT-4V | ✅ | ✅ Whisper |