Process multiple types of content including images, audio, and text with AI models. Bifrost supports vision analysis, speech synthesis, and audio transcription across various providers.
Provider | Vision | Text-to-Speech | Speech-to-Text |
---|---|---|---|
OpenAI | ✅ GPT-4V, GPT-4o | ✅ TTS-1, TTS-1-HD | ✅ Whisper |
Anthropic | ✅ Claude 3 Sonnet/Opus | ❌ | ❌ |
Google Vertex | ✅ Gemini Pro Vision | ✅ | ✅ |
Azure OpenAI | ✅ GPT-4V | ✅ | ✅ Whisper |