Parasail provider summary
Bifrost routes requests to Parasail through OpenAI-compatible endpoints and filters out request parameters that Parasail does not accept.
Parasail supports:
- OpenAI-compatible chat and responses with SSE streaming
- Tool calling — function definitions and execution
- Reasoning via standard reasoning_effort
- Responses API fallback to Chat Completions
| Property | Details |
|---|---|
| Description | OpenAI-compatible high-performance inference. |
| Provider route on Bifrost | parasail/<model> |
| Authentication | API key (Bearer) |
| Supported endpoints | /v1/chat/completions, /v1/responses, /v1/models |
Authentication
Configure your Parasail API key in Bifrost. Requests use Authorization: Bearer <key>.
Supported operations
Text Completions, Embeddings, Image Generation, Speech, Transcriptions, Files, and Batch return UnsupportedOperationError. See Supported operations in Bifrost docs.
| Operation | Non-streaming | Streaming | Upstream endpoint |
|---|---|---|---|
| Chat Completions | Yes | Yes | /v1/chat/completions |
| Responses API | Yes | Yes | /v1/chat/completions |
| List Models | Yes | — | /v1/models |
| Text Completions | No | No | - |
| Embeddings | No | No | - |
| Image Generation | No | No | - |
| Speech (TTS) | No | No | - |
| Transcriptions (STT) | No | No | - |
| Files | No | No | - |
| Batch | No | No | - |
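A client can check the table above before sending a request, so unsupported operations fail early instead of returning UnsupportedOperationError from the gateway. The sketch below is a minimal illustration; the operation keys and the `is_supported` helper are hypothetical names, not part of Bifrost.

```python
# Support matrix copied from the table above: operation -> (non-streaming, streaming).
# The keys are hypothetical labels chosen for this sketch.
PARASAIL_SUPPORT = {
    "chat_completions": (True, True),
    "responses": (True, True),
    "list_models": (True, False),  # streaming does not apply to List Models
    "text_completions": (False, False),
    "embeddings": (False, False),
    "image_generation": (False, False),
    "speech": (False, False),
    "transcriptions": (False, False),
    "files": (False, False),
    "batch": (False, False),
}


def is_supported(operation: str, stream: bool = False) -> bool:
    """Return True if the table lists the operation as supported in that mode."""
    non_streaming, streaming = PARASAIL_SUPPORT.get(operation, (False, False))
    return streaming if stream else non_streaming


print(is_supported("chat_completions", stream=True))
print(is_supported("embeddings"))
```

Unknown operation names are treated as unsupported, which keeps the check conservative.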
API reference
OpenAI-compatible endpoints routed to Parasail via Bifrost.
1) Chat Completions
Standard OpenAI chat parameters. See Chat Completions in Bifrost docs and OpenAI Chat Completions.
Filtered parameters
| Parameter | Reason | Notes |
|---|---|---|
| prompt_cache_key | Not supported | Removed for Parasail compatibility |
| verbosity | Anthropic-specific | Removed for Parasail compatibility |
| store | Not supported | Removed for Parasail compatibility |
| service_tier | Not supported | Removed for Parasail compatibility |
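The effect of the filtering in the table above can be sketched in Python. This is an illustration of the documented behavior only, not Bifrost's implementation; `filter_for_parasail` is a hypothetical helper name.

```python
# Parameters the table above lists as removed before the request reaches Parasail.
FILTERED_PARAMS = frozenset({"prompt_cache_key", "verbosity", "store", "service_tier"})


def filter_for_parasail(request: dict) -> dict:
    """Return a copy of an OpenAI-style chat request without the filtered keys.

    Hypothetical helper: it mirrors the documented behavior, not Bifrost's code.
    """
    return {key: value for key, value in request.items() if key not in FILTERED_PARAMS}


request = {
    "model": "parasail/parasail-llama-33-70b-fp8",
    "messages": [{"role": "user", "content": "Hello"}],
    "store": True,
    "service_tier": "auto",
    "reasoning_effort": "high",
}
upstream = filter_for_parasail(request)
print(sorted(upstream))  # ['messages', 'model', 'reasoning_effort']
```

The original request is left unmodified, and parameters such as reasoning_effort pass through.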
Reasoning uses reasoning_effort (e.g. high). Bifrost converts the internal Reasoning structure to Parasail's string format.
curl -X POST http://localhost:8080/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "parasail/parasail-llama-33-70b-fp8",
"messages": [{"role": "user", "content": "Hello"}],
"reasoning_effort": "high",
"stream": true
}'
2) Responses API
Converted internally to Chat Completions. See Responses API in Bifrost docs.
ResponsesRequest → ChatRequest → ChatCompletion → ResponsesResponse
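The conversion chain above can be sketched in Python. The field mapping here is an assumption made for illustration (a string input becomes one user message, max_output_tokens maps to max_tokens), and the function names are hypothetical; the actual conversion in Bifrost may differ and covers more fields.

```python
def responses_to_chat(req: dict) -> dict:
    """ResponsesRequest -> ChatRequest (assumed field mapping, for illustration)."""
    raw_input = req["input"]
    # A plain string is treated as one user message; a list is assumed to
    # already hold role/content message dicts.
    if isinstance(raw_input, str):
        messages = [{"role": "user", "content": raw_input}]
    else:
        messages = list(raw_input)
    chat = {"model": req["model"], "messages": messages}
    if "max_output_tokens" in req:
        chat["max_tokens"] = req["max_output_tokens"]  # assumed equivalent
    return chat


def chat_to_responses(completion: dict) -> dict:
    """ChatCompletion -> ResponsesResponse (simplified to the text output)."""
    text = completion["choices"][0]["message"]["content"]
    return {
        "object": "response",
        "model": completion["model"],
        "output": [
            {
                "type": "message",
                "role": "assistant",
                "content": [{"type": "output_text", "text": text}],
            }
        ],
    }


chat_request = responses_to_chat(
    {"model": "parasail/parasail-llama-33-70b-fp8", "input": "Hello", "max_output_tokens": 1024}
)
# Stand-in for the upstream reply; a real one comes from /v1/chat/completions.
fake_completion = {
    "model": chat_request["model"],
    "choices": [{"message": {"role": "assistant", "content": "Hi there"}}],
}
response = chat_to_responses(fake_completion)
print(chat_request["messages"])
print(response["output"][0]["content"][0]["text"])
```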
curl -X POST http://localhost:8080/v1/responses \
-H "Content-Type: application/json" \
-d '{
"model": "parasail/parasail-llama-33-70b-fp8",
"input": "Hello",
"max_output_tokens": 1024
}'
3) List Models
Lists available Parasail models with capabilities and context information. See List Models in Bifrost docs.
curl http://localhost:8080/v1/models
Unsupported features
| Feature | Reason |
|---|---|
| Text completions | Not offered by Parasail API |
| Embeddings | Not offered by Parasail API |
| Image generation | Not offered by Parasail API |
| Speech/TTS | Not offered by Parasail API |
| Transcription/STT | Not offered by Parasail API |
| Batch operations | Not offered by Parasail API |
| File management | Not offered by Parasail API |
Implementation caveats
| Caveat | Impact | Severity |
|---|---|---|
| Cache control stripped | Cache control directives removed from messages; prompt caching does not work | Medium |
| Parameter filtering | prompt_cache_key, verbosity, store, service_tier removed | Low |
| User field size limit | User identifiers longer than 64 characters silently dropped | Low |
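Because over-long user identifiers are dropped silently, a client that relies on the user field may want to shorten them before sending. The sketch below is one possible client-side guard, not something Bifrost provides; `safe_user_id` is a hypothetical helper, and hashing is only one choice (plain truncation would also satisfy the limit).

```python
import hashlib

MAX_USER_LEN = 64  # limit stated in the caveats table


def safe_user_id(user_id: str) -> str:
    """Return an identifier of at most 64 characters.

    Short IDs pass through unchanged; longer ones are replaced by their
    SHA-256 hex digest, which is always exactly 64 characters.
    """
    if len(user_id) <= MAX_USER_LEN:
        return user_id
    return hashlib.sha256(user_id.encode("utf-8")).hexdigest()


short_id = safe_user_id("user-123")
long_id = safe_user_id("tenant-a/" + "x" * 100)
print(short_id, len(long_id))  # user-123 64
```

Hashing keeps the mapping stable, so the same long identifier always produces the same 64-character value, at the cost of the value no longer being human-readable.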