
[ Provider Guide ]

Perplexity Provider on Bifrost

Perplexity is an OpenAI-compatible API with built-in web search, citations, and reasoning. Bifrost maps search parameters, reasoning effort, and extended usage fields.

Perplexity provider summary

Bifrost uses the OpenAI chat format as the foundation and adds Perplexity-specific search parameters and usage tracking.

  • OpenAI-compatible chat and responses with SSE streaming
  • Web search via request body or extra_params
  • Citations, search_results, and videos preserved in responses
  • Extended usage: citation tokens, search queries, reasoning tokens, cost

| Property | Details |
| --- | --- |
| Description | OpenAI-compatible API with web search and reasoning. |
| Provider route on Bifrost | perplexity/<model> |
| Authentication | API key (Bearer) |
| Upstream endpoint | /chat/completions |

Authentication

Configure your Perplexity API key in Bifrost. Requests use Authorization: Bearer <key>.

Supported operations

Unsupported operations return UnsupportedOperationError. See Supported operations in Bifrost docs.

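| Operation | Non-streaming | Streaming | Upstream endpoint |
| --- | --- | --- | --- |
| Chat Completions | Yes | Yes | /chat/completions |
| Responses API | Yes | Yes | /chat/completions |
| Text Completions | No | No | - |
| Embeddings | No | No | - |
| Image Generation | No | No | - |
| Speech (TTS) | No | No | - |
| Transcriptions (STT) | No | No | - |
| Files | No | No | - |
| Batch | No | No | - |
| List Models | No | No | - |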

API reference

1) Chat Completions

See Chat Completions in Bifrost docs.

Dropped parameters

| Parameter | Notes |
| --- | --- |
| tools / tool_choice | Function calling not available |
| stop | Stop sequences not supported |
| logit_bias, logprobs, top_logprobs | Not supported |
| seed, parallel_tool_calls, service_tier | Silently dropped |
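
Because these parameters are dropped rather than rejected, a client-side check can make the loss visible before a request is sent. The following is a minimal sketch, not Bifrost's own code: the helper name strip_dropped_params is hypothetical, and the set of names is copied from the table above.

```python
from typing import List, Tuple

# Parameter names taken from the "Dropped parameters" table.
# Illustrative only: this mimics the documented behaviour on the client side.
DROPPED_PARAMS = {
    "tools", "tool_choice", "stop",
    "logit_bias", "logprobs", "top_logprobs",
    "seed", "parallel_tool_calls", "service_tier",
}


def strip_dropped_params(request: dict) -> Tuple[dict, List[str]]:
    """Return (request without dropped params, sorted names that were removed)."""
    forwarded = {k: v for k, v in request.items() if k not in DROPPED_PARAMS}
    dropped = sorted(k for k in request if k in DROPPED_PARAMS)
    return forwarded, dropped


request = {
    "model": "perplexity/sonar",
    "messages": [{"role": "user", "content": "Hi"}],
    "stop": ["\n\n"],
    "seed": 42,
}
forwarded, dropped = strip_dropped_params(request)
print(dropped)  # ['seed', 'stop']
```

Logging the returned names is one way to catch code that still relies on function calling or stop sequences when it is pointed at a Perplexity model.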

Search parameters

Pass these parameters in the request body when calling the Gateway, or via extra_params when using the SDK.

| Parameter | Type | Description |
| --- | --- | --- |
| search_mode | string | Search mode: web, academic, news, etc. |
| language_preference | string | Language preference (e.g. en, fr) |
| search_domain_filter | string[] | Restrict search to specific domains |
| return_images | boolean | Include images in search results |
| return_related_questions | boolean | Return related questions |
| search_recency_filter | string | hour, day, week, month, year |
| search_after_date_filter | string | Results after date (ISO) |
| search_before_date_filter | string | Results before date (ISO) |
| last_updated_after_filter | string | Content updated after date |
| last_updated_before_filter | string | Content updated before date |
| disable_search | boolean | Disable web search entirely |
| enable_search_classifier | boolean | Enable search classifier |
| top_k | integer | Top-k results to use |

Reasoning: reasoning.effort maps to reasoning_effort (low, medium, high). minimal becomes low. reasoning.max_tokens is dropped.
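
The reasoning rule can be written out as a small function. This is a sketch of the documented mapping only, not Bifrost's implementation; the name map_reasoning is hypothetical, and passing other effort values through unchanged is an assumption, since the source does not say how they are treated.

```python
from typing import Optional


def map_reasoning(reasoning: Optional[dict]) -> dict:
    """Translate an OpenAI-style `reasoning` object into Perplexity's field."""
    if not reasoning:
        return {}
    effort = reasoning.get("effort")
    if effort is None:
        return {}
    if effort == "minimal":
        # Perplexity supports only low/medium/high, so minimal becomes low.
        effort = "low"
    # reasoning.max_tokens is deliberately ignored: it is dropped.
    # Assumption: any other effort value is passed through unchanged.
    return {"reasoning_effort": effort}


print(map_reasoning({"effort": "minimal", "max_tokens": 2048}))
# {'reasoning_effort': 'low'}
```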

curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "perplexity/sonar",
    "messages": [{"role": "user", "content": "What is the latest news?"}],
    "search_mode": "web",
    "language_preference": "en",
    "return_images": true,
    "return_related_questions": true,
    "search_domain_filter": ["news.example.com"],
    "search_recency_filter": "week"
  }'
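
The same request can be built from Python with only the standard library. This is a minimal sketch that assumes the local gateway URL used in the curl example; the helper name build_search_request is hypothetical. The request is constructed but not sent, so the snippet runs without a gateway.

```python
import json
import urllib.request

# Same local gateway address as the curl example above.
BIFROST_URL = "http://localhost:8080/v1/chat/completions"


def build_search_request(question: str, **search_params) -> urllib.request.Request:
    """Build a POST request with search parameters in the request body."""
    body = {
        "model": "perplexity/sonar",
        "messages": [{"role": "user", "content": question}],
        **search_params,  # e.g. search_mode, search_recency_filter
    }
    return urllib.request.Request(
        BIFROST_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_search_request(
    "What is the latest news?",
    search_mode="web",
    search_domain_filter=["news.example.com"],
    search_recency_filter="week",
)

# To send it against a running gateway:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```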

2) Responses API

Responses API requests are converted internally to Chat Completions requests; the responses include citations and search metadata. See Responses API in Bifrost docs.

| Parameter | Transformation |
| --- | --- |
| max_output_tokens | Direct pass-through to max_tokens |
| temperature, top_p | Direct pass-through |
| instructions | Converted to system message (prepended) |
| reasoning.effort | Mapped to reasoning_effort |
| text.format | Passed through as response_format |
| input (string/array) | Converted to messages |

curl -X POST http://localhost:8080/v1/responses \
  -H "Content-Type: application/json" \
  -d '{
    "model": "perplexity/sonar",
    "instructions": "You are a helpful assistant with web search capabilities",
    "input": "What is the latest news in technology?",
    "search_mode": "news",
    "return_images": true
  }'
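
The transformation table can be read as a single conversion function. The sketch below illustrates those rules under stated assumptions and is not Bifrost's implementation: the name responses_to_chat is hypothetical, array input items are assumed to already be role/content messages, and copying the remaining fields (such as search parameters) unchanged is inferred from the curl example rather than stated in the table.

```python
def responses_to_chat(req: dict) -> dict:
    """Convert a Responses API body into a Chat Completions body."""
    messages = []
    if req.get("instructions"):
        # instructions become a system message, prepended to the rest
        messages.append({"role": "system", "content": req["instructions"]})
    inp = req.get("input")
    if isinstance(inp, str):
        messages.append({"role": "user", "content": inp})
    elif isinstance(inp, list):
        # Assumption: array items are already {"role": ..., "content": ...}
        messages.extend(inp)

    chat = {"model": req["model"], "messages": messages}
    if "max_output_tokens" in req:
        chat["max_tokens"] = req["max_output_tokens"]
    for key in ("temperature", "top_p"):
        if key in req:
            chat[key] = req[key]
    effort = (req.get("reasoning") or {}).get("effort")
    if effort:
        chat["reasoning_effort"] = "low" if effort == "minimal" else effort
    fmt = (req.get("text") or {}).get("format")
    if fmt is not None:
        chat["response_format"] = fmt

    # Assumption: other fields (e.g. search parameters) are copied unchanged.
    handled = {"model", "instructions", "input", "max_output_tokens",
               "temperature", "top_p", "reasoning", "text"}
    for key, value in req.items():
        if key not in handled:
            chat[key] = value
    return chat


chat = responses_to_chat({
    "model": "perplexity/sonar",
    "instructions": "You are a helpful assistant with web search capabilities",
    "input": "What is the latest news in technology?",
    "search_mode": "news",
    "return_images": True,
})
print([m["role"] for m in chat["messages"]])  # ['system', 'user']
```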

Unsupported features

| Feature | Reason |
| --- | --- |
| Text completions | Not offered by Perplexity API |
| Embeddings | Not offered by Perplexity API |
| Image generation | Not offered by Perplexity API |
| Speech/TTS | Not offered by Perplexity API |
| Transcription/STT | Not offered by Perplexity API |
| Batch operations | Not offered by Perplexity API |
| File management | Not offered by Perplexity API |
| List models | Not offered by Perplexity API |

Implementation caveats

| Caveat | Impact | Severity |
| --- | --- | --- |
| No tool support | tools and tool_choice silently dropped; function calling unavailable | High |
| Reasoning effort mapping | minimal maps to low (only low/medium/high supported) | Medium |
| Reasoning max tokens dropped | reasoning.max_tokens has no effect | Low |
| Stop sequences not supported | stop parameter silently dropped | Low |


[quick setup]

Drop-in replacement for any AI SDK

Change just one line of code. Works with OpenAI, Anthropic, Vercel AI SDK, LangChain, and more.

import os
from anthropic import Anthropic

anthropic = Anthropic(
    api_key=os.environ.get("ANTHROPIC_API_KEY"),
    base_url="https://<bifrost_url>/anthropic",
)

message = anthropic.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello, Claude"}
    ]
)

Drop in once, run everywhere.