Replicate on Bifrost: Models, Predictions, Setup, and Mappings

Replicate provider summary

All operations create predictions via /v1/predictions or deployment endpoints
Model-specific fields via extra_params (flattened into prediction input)
Sync mode uses the Prefer: wait header (up to 60s); async mode polls every 2s
List Models returns account deployments only, not all public models

Property	Details
Description	Prediction-based multimodal inference.
Provider route on Bifrost	replicate/<model>
Authentication	API token (Bearer)

Model identification

A Replicate model can be specified in three ways. See Model Identification in the Bifrost docs.

1. Version ID

curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "replicate/5c7d5dc6dd8bf75c1acaa8565735e7986bc5b66206b55cca93cb72c9bf15ccaa",
    "messages": [{"role": "user", "content": "Hello"}]
  }'

2. Model name (owner/model-name)

curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "replicate/meta/llama-2-7b-chat",
    "messages": [{"role": "user", "content": "Hello"}]
  }'

3. Deployment (aliases in key config)

{
  "provider": "replicate",
  "value": "your-api-key",
  "aliases": {
    "my-model": "owner/my-deployment-name"
  }
}

curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "replicate/my-model",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
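To make the three identifier forms above concrete, here is a minimal Python sketch that classifies a model string. It is an illustration, not Bifrost's actual parser: the function name classify_model is hypothetical, and the rule that a version ID is a 64-character lowercase hex string is an assumption drawn from the example above.

```python
import re
from typing import Optional, Tuple

# Assumption: a version ID is 64 lowercase hex characters, as in the example above.
VERSION_ID = re.compile(r"^[0-9a-f]{64}$")


def classify_model(model: str, aliases: Optional[dict] = None) -> Tuple[str, str]:
    """Return (kind, target) for a string such as 'replicate/meta/llama-2-7b-chat'."""
    aliases = aliases or {}
    provider, _, name = model.partition("/")
    if provider != "replicate" or not name:
        raise ValueError(f"not a replicate model string: {model!r}")
    if name in aliases:
        # Alias from the key config resolves to a deployment.
        return ("deployment", aliases[name])
    if VERSION_ID.match(name):
        return ("version", name)
    if name.count("/") == 1:
        # owner/model-name form.
        return ("model", name)
    raise ValueError(f"unrecognized model identifier: {name!r}")


if __name__ == "__main__":
    key_aliases = {"my-model": "owner/my-deployment-name"}
    print(classify_model("replicate/meta/llama-2-7b-chat"))
    print(classify_model("replicate/my-model", key_aliases))
```

Aliases are checked first in this sketch so that a configured alias always wins over the other two forms; whether Bifrost uses the same precedence is not stated in this section.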

Prediction modes

Sync: Send the Prefer: wait header with the request. Bifrost blocks until the prediction completes or the wait times out (60s by default), then falls back to polling.

Async (default): Bifrost polls the prediction URL every 2 seconds. Status progresses starting → processing → succeeded / failed / canceled.
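The async loop described above can be sketched roughly as follows. This is not Bifrost's implementation: poll_prediction, the fetch callback, and the max_polls cap are hypothetical names introduced for illustration. Only the 2-second interval and the status values come from this section.

```python
import time
from typing import Callable

# Terminal statuses listed in this section.
TERMINAL = {"succeeded", "failed", "canceled"}


def poll_prediction(
    fetch: Callable[[], dict],
    interval: float = 2.0,
    max_polls: int = 150,  # hypothetical safety cap, not a documented setting
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch() until the prediction reaches a terminal status."""
    for _ in range(max_polls):
        prediction = fetch()
        if prediction.get("status") in TERMINAL:
            return prediction
        # Still starting or processing: wait before the next poll.
        sleep(interval)
    raise TimeoutError(f"prediction not finished after {max_polls} polls")


if __name__ == "__main__":
    # Simulated upstream responses instead of a real HTTP call.
    fake = iter(["starting", "processing", "succeeded"])
    print(poll_prediction(lambda: {"status": next(fake)}, sleep=lambda _: None))
```

Passing fetch and sleep as parameters keeps the loop independent of any HTTP client and makes it easy to exercise without a running gateway.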

Supported operations

Operation	Non-streaming	Streaming	Upstream
Chat Completions	Yes	Yes	/v1/predictions
Responses API	Yes	Yes	/v1/predictions
Text Completions	Yes	Yes	/v1/predictions
Image Generation	Yes	Yes	/v1/predictions
Image Edit	Yes	Yes	/v1/predictions
Video Generation	Yes	—	/v1/predictions
Files	Yes	—	/v1/files
List Models	Yes	—	/v1/deployments
Image Variation	No	No	-
Embeddings	No	No	-
Speech (TTS)	No	No	-
Transcriptions (STT)	No	No	-
Batch	No	No	-

List Models returns account-specific deployments only, not all public Replicate models.

API reference

1) Chat Completions

System messages map to system_prompt; image URLs map to image_input. For models that do not support a system prompt field, the system text is prepended to the user prompt instead. See Chat Completions in the Bifrost docs.

curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "replicate/meta/llama-2-7b-chat",
    "messages": [{"role": "user", "content": "Hello"}],
    "temperature": 0.7,
    "top_k": 50,
    "repetition_penalty": 1.1
  }'
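The message mapping described in this section (system messages to system_prompt, image URLs to image_input, with a prepend fallback) might look like the sketch below. Several details are assumptions rather than documented behavior: the function name, the supports_system_prompt flag, joining text parts with newlines, treating image_input as a list, and detecting base64 images by a data: URL prefix.

```python
def chat_to_replicate_input(messages: list, supports_system_prompt: bool = True) -> dict:
    """Flatten OpenAI-style chat messages into a Replicate-style input dict."""
    system_parts, prompt_parts, image_urls = [], [], []
    for msg in messages:
        content = msg.get("content", "")
        # Normalize plain-string content to the content-part list form.
        parts = [{"type": "text", "text": content}] if isinstance(content, str) else content
        for part in parts:
            if part.get("type") == "text":
                target = system_parts if msg.get("role") == "system" else prompt_parts
                target.append(part["text"])
            elif part.get("type") == "image_url":
                url = part["image_url"]["url"]
                # Assumption: base64 images arrive as data: URLs and are skipped.
                if not url.startswith("data:"):
                    image_urls.append(url)

    system = "\n".join(system_parts)
    prompt = "\n".join(prompt_parts)
    result = {}
    if system and supports_system_prompt:
        result["system_prompt"] = system
    elif system:
        # Fallback: prepend system text to the user prompt.
        prompt = f"{system}\n\n{prompt}"
    result["prompt"] = prompt
    if image_urls:
        result["image_input"] = image_urls
    return result


if __name__ == "__main__":
    msgs = [
        {"role": "system", "content": "Be brief."},
        {"role": "user", "content": "Hello"},
    ]
    print(chat_to_replicate_input(msgs))
    print(chat_to_replicate_input(msgs, supports_system_prompt=False))
```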

2) Responses API

Requests are converted to predictions; OpenAI gpt-5-structured models may use the native Responses format. Status mapping: succeeded → completed, failed → failed, processing → in_progress.

ResponsesRequest → ReplicatePredictionRequest → BifrostResponsesResponse
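The status translation listed above is a small lookup. The sketch below covers only the three statuses this section documents; the function name is hypothetical, and how other prediction statuses (for example starting or canceled) are translated is not stated here, so the sketch raises instead of guessing.

```python
# Only the three mappings documented in this section.
STATUS_MAP = {
    "succeeded": "completed",
    "failed": "failed",
    "processing": "in_progress",
}


def to_responses_status(prediction_status: str) -> str:
    """Translate a Replicate prediction status to a Responses API status."""
    try:
        return STATUS_MAP[prediction_status]
    except KeyError:
        # Undocumented statuses are rejected rather than mapped by assumption.
        raise ValueError(f"no documented mapping for {prediction_status!r}") from None


if __name__ == "__main__":
    for status in ("processing", "succeeded", "failed"):
        print(status, "->", to_responses_status(status))
```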

3) Text Completions

curl -X POST http://localhost:8080/v1/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "replicate/meta/llama-2-7b",
    "prompt": "Once upon a time",
    "max_tokens": 100,
    "temperature": 0.8,
    "top_k": 40
  }'

4) Image Generation

Bifrost	Replicate input
prompt	prompt
n	number_of_images
aspect_ratio	aspect_ratio
resolution	resolution
output_format	output_format
quality	quality
background	background
seed	seed
negative_prompt	negative_prompt
num_inference_steps	num_inference_steps
input_images	input_images (mapped by model)

Flux input image field mapping

Field	Models
image_prompt	flux-1.1-pro, flux-1.1-pro-ultra, flux-pro, flux-1.1-pro-ultra-finetuned
input_image	flux-kontext-pro, flux-kontext-max, flux-kontext-dev
image	flux-dev, flux-fill-pro, flux-dev-lora, flux-krea-dev
input_images	All other models (default)
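The Flux table above amounts to a lookup from model name to input field, with input_images as the default. Here is a minimal sketch of that lookup. Matching on the last path segment of the model name is an assumption made for illustration, and input_image_field is a hypothetical helper, not a Bifrost function.

```python
# Field names and model lists copied from the table above.
FIELD_BY_MODEL = {
    "flux-1.1-pro": "image_prompt",
    "flux-1.1-pro-ultra": "image_prompt",
    "flux-pro": "image_prompt",
    "flux-1.1-pro-ultra-finetuned": "image_prompt",
    "flux-kontext-pro": "input_image",
    "flux-kontext-max": "input_image",
    "flux-kontext-dev": "input_image",
    "flux-dev": "image",
    "flux-fill-pro": "image",
    "flux-dev-lora": "image",
    "flux-krea-dev": "image",
}


def input_image_field(model: str) -> str:
    """Return the Replicate input field that carries input images for a model."""
    # Assumption: match on the final path segment, e.g. 'owner/flux-dev' -> 'flux-dev'.
    name = model.rsplit("/", 1)[-1]
    return FIELD_BY_MODEL.get(name, "input_images")


if __name__ == "__main__":
    for m in ("black-forest-labs/flux-kontext-pro", "black-forest-labs/flux-schnell"):
        print(m, "->", input_image_field(m))
```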

curl -X POST http://localhost:8080/v1/images/generations \
  -H "Content-Type: application/json" \
  -d '{
    "model": "replicate/black-forest-labs/flux-schnell",
    "prompt": "A serene mountain landscape at sunset",
    "aspect_ratio": "16:9",
    "output_format": "webp",
    "num_inference_steps": 4,
    "seed": 42
  }'

5) Image Edit

Same input image field mapping as image generation. POST /v1/images/edits.

curl -X POST http://localhost:8080/v1/images/edits \
  -F 'model=replicate/black-forest-labs/flux-fill-pro' \
  -F 'image[]=@image.png' \
  -F 'prompt=Replace the sky with a starry night'

6) Files API

Supports upload, list, retrieve, and delete. Content download requires the signed URL parameters (owner, expiry, signature) in the request body.

curl -X POST http://localhost:8080/v1/files \
  -F "file=@document.pdf" \
  -F "filename=my-document.pdf"

7) List Models

Returns the deployments for your account. Use the form replicate/my-org/my-deployment as the model ID.

curl "http://localhost:8080/v1/models?limit=20"

8) Video Generation

Parameter	Type	Required	Notes
model	string	Yes	owner/model or version ID
prompt	string	Yes	Text description
input_reference	string	No	Reference image → image or input_reference by model
seconds	string	No	Duration → duration
seed	int	No	Reproducibility
negative_prompt	string	No	What to avoid

curl -X POST http://localhost:8080/v1/videos \
  -H "Content-Type: application/json" \
  -d '{
    "model": "replicate/minimax/video-01",
    "prompt": "A cat walking through a garden",
    "seconds": "5"
  }'

Retrieve: GET /v1/videos/{id}, which maps upstream to /v1/predictions/{id}. Download: GET /v1/videos/{id}/content.
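The parameter table in this section implies a small renaming step before the prediction is created. The sketch below shows one way to express it. The function name and the reference_field argument are hypothetical: the table says the reference image goes to image or input_reference depending on the model but does not list which models use which, so the caller chooses. The seconds value is passed through unchanged because this section does not say whether it is converted.

```python
def video_to_replicate_input(request: dict, reference_field: str = "image") -> dict:
    """Rename Bifrost video parameters to Replicate-style input fields."""
    if reference_field not in ("image", "input_reference"):
        raise ValueError("reference_field must be 'image' or 'input_reference'")
    out = {"prompt": request["prompt"]}
    if "seconds" in request:
        # seconds -> duration; value passed through as-is (no type conversion assumed).
        out["duration"] = request["seconds"]
    if "input_reference" in request:
        out[reference_field] = request["input_reference"]
    for key in ("seed", "negative_prompt"):
        if key in request:
            out[key] = request[key]
    return out


if __name__ == "__main__":
    print(video_to_replicate_input({
        "model": "replicate/minimax/video-01",
        "prompt": "A cat walking through a garden",
        "seconds": "5",
    }))
```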

Extra parameters

Non-standard fields are flattened into the prediction input object. Discover schemas on replicate.com or via the model version OpenAPI schema.

{
  "model": "replicate/stability-ai/sdxl",
  "prompt": "A photo of an astronaut",
  "guidance_scale": 7.5,
  "num_inference_steps": 50,
  "scheduler": "DPMSolverMultistep"
}
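As a rough picture of what "flattened into the prediction input" means for a request like the one above, consider this sketch. It assumes every field except model ends up in the input object, and it applies the n → number_of_images rename from the image generation table. The function name and the exact field handling are illustrative assumptions, not Bifrost's code.

```python
# Rename taken from the image generation mapping table.
RENAMES = {"n": "number_of_images"}


def build_prediction_input(request: dict) -> dict:
    """Flatten request fields (standard and model-specific) into one input dict."""
    prediction_input = {}
    for key, value in request.items():
        if key == "model":
            # The model selects the upstream target; it is not part of the input.
            continue
        prediction_input[RENAMES.get(key, key)] = value
    return prediction_input


if __name__ == "__main__":
    print(build_prediction_input({
        "model": "replicate/stability-ai/sdxl",
        "prompt": "A photo of an astronaut",
        "guidance_scale": 7.5,
        "num_inference_steps": 50,
        "scheduler": "DPMSolverMultistep",
    }))
```

Because the fields are passed through by name, a misspelled or unsupported parameter would reach the model unchanged, which is why checking the model's schema first is worthwhile.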

Unsupported features

Feature	Reason
Image variation	Not supported via Replicate provider
Embeddings	Not offered
Speech/TTS	Not offered
Transcription/STT	Not offered
Batch	Not offered
Video list / remix / delete	Not supported by Replicate

Implementation caveats

Caveat	Impact	Severity
System prompt field support	Unsupported models prepend system text to user prompt	Medium
Input image field mapping	Flux models use image_prompt, input_image, or image	Medium
Image content in chat	Only non-base64 image URLs extracted to image_input	Low
Model-specific parameters	Each model has unique schema; use extra_params	Medium
