Parasail Provider on Bifrost

Parasail is an OpenAI-compatible provider for high-performance inference. Bifrost delegates to the OpenAI implementation with standard parameter filtering, streaming, and tool calling.

Parasail provider summary

Bifrost routes requests to Parasail through its OpenAI-compatible implementation and filters out parameters that the upstream API does not accept.

Parasail supports:

  • OpenAI-compatible chat and responses with SSE streaming
  • Tool calling — function definitions and execution
  • Reasoning via standard reasoning_effort
  • Responses API fallback to Chat Completions

| Property | Details |
| --- | --- |
| Description | OpenAI-compatible high-performance inference. |
| Provider route on Bifrost | parasail/<model> |
| Authentication | API key (Bearer) |
| Supported endpoints | /v1/chat/completions, /v1/responses, /v1/models |
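
The parasail/<model> route joins a provider prefix and an upstream model name. The following is a minimal sketch of how a client could split such a route, assuming the first slash is the separator. The helper name split_route is hypothetical and is not part of Bifrost.

```python
def split_route(route: str) -> tuple[str, str]:
    """Split '<provider>/<model>' at the first slash (hypothetical helper)."""
    provider, sep, model = route.partition("/")
    if not sep or not provider or not model:
        raise ValueError(f"expected '<provider>/<model>', got {route!r}")
    return provider, model


print(split_route("parasail/parasail-llama-33-70b-fp8"))
```

Splitting at the first slash only means a model name that itself contains a slash stays intact.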

Authentication

Configure your Parasail API key in Bifrost. Requests use Authorization: Bearer <key>.
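
As an illustration of the Bearer scheme described above, this sketch builds the headers an upstream request would carry. The function name upstream_headers and the environment variable PARASAIL_API_KEY are assumptions made for the example, not names defined by Bifrost.

```python
import os


def upstream_headers(api_key: str) -> dict[str, str]:
    """Build request headers that use Bearer authentication."""
    if not api_key:
        raise ValueError("API key must not be empty")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }


# PARASAIL_API_KEY is an assumed variable name; the fallback is a placeholder.
headers = upstream_headers(os.environ.get("PARASAIL_API_KEY", "sk-example"))
print(headers["Authorization"].split()[0])  # prints the scheme, not the key
```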

Supported operations

Text Completions, Embeddings, Image Generation, Speech, Transcriptions, Files, and Batch return UnsupportedOperationError. See Supported operations in Bifrost docs.

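| Operation | Non-streaming | Streaming | Upstream endpoint |
| --- | --- | --- | --- |
| Chat Completions | Yes | Yes | /v1/chat/completions |
| Responses API | Yes | Yes | /v1/chat/completions |
| List Models | Yes | - | /v1/models |
| Text Completions | No | No | - |
| Embeddings | No | No | - |
| Image Generation | No | No | - |
| Speech (TTS) | No | No | - |
| Transcriptions (STT) | No | No | - |
| Files | No | No | - |
| Batch | No | No | - |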
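
The support table can be read as a lookup from operation to upstream endpoint. Here is a minimal sketch of that lookup. The operation keys and the exception class are local stand-ins, not Bifrost's own types, and treating List Models as non-streaming is an assumption.

```python
# Supported operations and their upstream endpoints, copied from the table.
SUPPORTED = {
    "chat_completions": {"streaming": True, "upstream": "/v1/chat/completions"},
    "responses": {"streaming": True, "upstream": "/v1/chat/completions"},
    # Assumption: List Models has no streaming mode.
    "list_models": {"streaming": False, "upstream": "/v1/models"},
}


class UnsupportedOperationError(Exception):
    """Local stand-in for the error Bifrost returns; not Bifrost's own type."""


def upstream_endpoint(operation: str, stream: bool = False) -> str:
    """Return the upstream path for an operation, or raise if unsupported."""
    entry = SUPPORTED.get(operation)
    if entry is None:
        raise UnsupportedOperationError(f"parasail does not support {operation}")
    if stream and not entry["streaming"]:
        raise UnsupportedOperationError(f"{operation} has no streaming mode")
    return entry["upstream"]


print(upstream_endpoint("responses", stream=True))
```

Note that both Chat Completions and the Responses API resolve to the same upstream path.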

API reference

OpenAI-compatible endpoints routed to Parasail via Bifrost.

1) Chat Completions

Standard OpenAI chat parameters. See Chat Completions in Bifrost docs and OpenAI Chat Completions.

Filtered parameters

| Parameter | Reason | Notes |
| --- | --- | --- |
| prompt_cache_key | Not supported | Removed for Parasail compatibility |
| verbosity | Anthropic-specific | Removed for Parasail compatibility |
| store | Not supported | Removed for Parasail compatibility |
| service_tier | Not supported | Removed for Parasail compatibility |
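
The effect of this filtering can be sketched as a dictionary filter over the request body. This is an illustration of the behavior in the table, not Bifrost's actual code, and filter_params is a hypothetical name.

```python
# Parameters removed before the request is forwarded, per the table above.
FILTERED_PARAMS = ("prompt_cache_key", "verbosity", "store", "service_tier")


def filter_params(request: dict) -> dict:
    """Return a copy of the request without the filtered parameters."""
    return {k: v for k, v in request.items() if k not in FILTERED_PARAMS}


request = {
    "model": "parasail/parasail-llama-33-70b-fp8",
    "messages": [{"role": "user", "content": "Hello"}],
    "store": True,
    "service_tier": "default",
}
print(sorted(filter_params(request)))
```

The original dictionary is left unchanged, so a caller can still log what was sent before filtering.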

Reasoning uses reasoning_effort (e.g. high). Bifrost converts the internal Reasoning structure to Parasail's string format.

curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "parasail/parasail-llama-33-70b-fp8",
    "messages": [{"role": "user", "content": "Hello"}],
    "reasoning_effort": "high",
    "stream": true
  }'
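
The same streaming request can be sketched in Python with only the standard library. The URL and body mirror the curl example. The SSE handling assumes the OpenAI-style stream format ("data:" lines ending with a "[DONE]" sentinel), and the function names are hypothetical.

```python
import json
import urllib.request

BIFROST_URL = "http://localhost:8080/v1/chat/completions"  # as in the curl example


def build_request(prompt: str) -> urllib.request.Request:
    """Build the POST request without sending it."""
    body = {
        "model": "parasail/parasail-llama-33-70b-fp8",
        "messages": [{"role": "user", "content": prompt}],
        "reasoning_effort": "high",
        "stream": True,
    }
    return urllib.request.Request(
        BIFROST_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_sse_line(line: bytes):
    """Return the decoded chunk for a 'data:' line, or None for anything else."""
    text = line.decode("utf-8").strip()
    if not text.startswith("data:"):
        return None
    payload = text[len("data:"):].strip()
    if payload == "[DONE]":  # assumed end-of-stream sentinel
        return None
    return json.loads(payload)


def stream_chat(prompt: str) -> str:
    """Send the request and concatenate the streamed content deltas."""
    parts = []
    with urllib.request.urlopen(build_request(prompt)) as resp:
        for line in resp:
            chunk = parse_sse_line(line)
            if chunk and chunk.get("choices"):
                delta = chunk["choices"][0].get("delta", {})
                parts.append(delta.get("content") or "")
    return "".join(parts)
```

Calling stream_chat("Hello") requires a Bifrost instance listening on localhost:8080 with a Parasail key configured.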

2) Responses API

Converted internally to Chat Completions. See Responses API in Bifrost docs.

ResponsesRequest → ChatRequest → ChatCompletion → ResponsesResponse

curl -X POST http://localhost:8080/v1/responses \
  -H "Content-Type: application/json" \
  -d '{
    "model": "parasail/parasail-llama-33-70b-fp8",
    "input": "Hello",
    "max_output_tokens": 1024
  }'
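
The first step of this conversion (ResponsesRequest → ChatRequest) can be sketched as a field mapping. The mapping below is a simplified assumption for illustration only: Bifrost's real conversion is not shown in this guide and likely handles many more fields. The function name responses_to_chat is hypothetical.

```python
def responses_to_chat(req: dict) -> dict:
    """Map a minimal Responses-style request onto Chat Completions fields."""
    raw = req["input"]
    # A plain string becomes a single user message; a list is assumed to
    # already hold role/content items.
    if isinstance(raw, str):
        messages = [{"role": "user", "content": raw}]
    else:
        messages = list(raw)
    # Assumption: 'instructions' maps to a leading system message.
    if "instructions" in req:
        messages.insert(0, {"role": "system", "content": req["instructions"]})
    chat = {"model": req["model"], "messages": messages}
    # Assumption: the output token limit maps to max_completion_tokens.
    if "max_output_tokens" in req:
        chat["max_completion_tokens"] = req["max_output_tokens"]
    return chat


print(responses_to_chat({
    "model": "parasail/parasail-llama-33-70b-fp8",
    "input": "Hello",
    "max_output_tokens": 1024,
}))
```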

3) List Models

Lists available Parasail models with capabilities and context information. See List Models in Bifrost docs.

curl http://localhost:8080/v1/models
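
A client that wants only the Parasail entries could filter the returned list by route prefix. This sketch assumes the response follows the OpenAI list shape and that model IDs carry the provider prefix. The sample payload is hypothetical and does not reproduce real Bifrost output.

```python
import json

# Hypothetical payload in the OpenAI list format; real IDs and fields may differ.
SAMPLE = json.loads("""
{"object": "list", "data": [
  {"id": "parasail/parasail-llama-33-70b-fp8", "object": "model"},
  {"id": "openai/gpt-4o", "object": "model"}
]}
""")


def models_for(payload: dict, provider: str) -> list[str]:
    """Return the model IDs that start with '<provider>/'."""
    prefix = provider + "/"
    return [
        m["id"]
        for m in payload.get("data", [])
        if m.get("id", "").startswith(prefix)
    ]


print(models_for(SAMPLE, "parasail"))
```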

Unsupported features

| Feature | Reason |
| --- | --- |
| Text completions | Not offered by Parasail API |
| Embeddings | Not offered by Parasail API |
| Image generation | Not offered by Parasail API |
| Speech/TTS | Not offered by Parasail API |
| Transcription/STT | Not offered by Parasail API |
| Batch operations | Not offered by Parasail API |
| File management | Not offered by Parasail API |

Implementation caveats

| Caveat | Impact | Severity |
| --- | --- | --- |
| Cache control stripped | Cache control directives removed from messages; prompt caching does not work | Medium |
| Parameter filtering | prompt_cache_key, verbosity, store, service_tier removed | Low |
| User field size limit | User identifiers longer than 64 characters silently dropped | Low |
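
To see what the cache control and user field caveats mean for a request, the following sketch mimics them on a request dictionary. It is an approximation under stated assumptions: cache directives are modeled as a cache_control key on messages or content parts, and apply_caveats is a hypothetical name, not Bifrost code.

```python
MAX_USER_LEN = 64  # limit stated in the caveats table


def strip_cache_control(message: dict) -> dict:
    """Copy a message without cache_control keys (assumed directive name)."""
    msg = {k: v for k, v in message.items() if k != "cache_control"}
    content = msg.get("content")
    if isinstance(content, list):
        msg["content"] = [
            {k: v for k, v in part.items() if k != "cache_control"}
            if isinstance(part, dict) else part
            for part in content
        ]
    return msg


def apply_caveats(request: dict) -> dict:
    """Return a copy of the request as it would look after both caveats."""
    out = dict(request)
    user = out.get("user")
    if isinstance(user, str) and len(user) > MAX_USER_LEN:
        del out["user"]  # dropped silently, no error raised
    if "messages" in out:
        out["messages"] = [strip_cache_control(m) for m in out["messages"]]
    return out


request = {
    "user": "u" * 65,
    "messages": [{"role": "user", "content": [
        {"type": "text", "text": "Hello", "cache_control": {"type": "ephemeral"}},
    ]}],
}
print(apply_caveats(request))
```

Because the user field is dropped without an error, a client that relies on it for attribution may want to check the length before sending.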
