Compare llama-4-maverick-17b-128e-instruct-fp8 with other models

Select another model to compare pricing, limits, and capabilities with llama-4-maverick-17b-128e-instruct-fp8.

Models

llama-4-maverick-17b-128e-instruct-fp8

novita

Context Length

1049K

Max Output

Input Cost

$0.27/M

Output Cost

$0.85/M

Mode

Chat

Max Input Tokens

1049K

Max Tokens

Provider

Novita

System Messages

Yes

[ WE'RE OPEN SOURCE ]

Scale with the Fastest LLM Gateway

Built for enterprise-grade reliability, governance, and scale. Deploy in seconds.

or, Get started here

Comparison Insights

Comprehensive analysis based on the latest model metadata from the comparison table above.

What should I know about llama-4-maverick-17b-128e-instruct-fp8?

Overview

llama-4-maverick-17b-128e-instruct-fp8 is a chat model provided by Novita.
This model offers an exceptional context window of 1049K tokens, making it ideal for processing extensive documents, long conversations, or large codebases.

Pricing

Input processing costs $0.27 per million tokens.
Output generation costs $0.85 per million tokens.

Output Capabilities

The model can generate up to 8K tokens in a single response.

What capabilities does llama-4-maverick-17b-128e-instruct-fp8 support?

Includes vision capabilities to process and analyze images alongside text inputs.
Supports system messages for customizing model behavior and setting operational parameters.

llama-4-maverick-17b-128e-instruct-fp8 Pricing Overview

At $0.27 per 1M input tokens and $0.85 per 1M output tokens, llama-4-maverick-17b-128e-instruct-fp8 ranks 952 out of 2637 chat models by input cost. It is more affordable compared to the median of $0.50 for chat models, and is cheaper than 64% of models in this category.