
grok-2-vision Cost Calculator - xAI

Calculate the cost of using grok-2-vision from xAI for your AI applications

Mode: Chat

Max input: 32,768 tokens
Max output: 32,768 tokens

Cost Breakdown

Input Cost: $0.002000
Output Cost: $0.010000
Image Cost: $0.00000200
Total Cost: $0.012002

Pricing Details

Input: $0.00000200 per token
Output: $0.00001000 per token
Input Image: $0.00000200 per image
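
The arithmetic behind the cost breakdown can be sketched in a few lines of Python. This is only an illustration: `estimate_cost` is a hypothetical helper, not part of any xAI or Bifrost API, and the usage of 1,000 input tokens, 1,000 output tokens, and one image is an assumption chosen because it reproduces the totals shown above.

```python
from decimal import Decimal

# Per-unit rates copied from the Pricing Details above (USD).
INPUT_RATE = Decimal("0.00000200")   # per input token
OUTPUT_RATE = Decimal("0.00001000")  # per output token
IMAGE_RATE = Decimal("0.00000200")   # per input image


def estimate_cost(input_tokens: int, output_tokens: int, images: int = 0) -> dict:
    """Return a USD cost breakdown (hypothetical helper for illustration)."""
    input_cost = INPUT_RATE * input_tokens
    output_cost = OUTPUT_RATE * output_tokens
    image_cost = IMAGE_RATE * images
    return {
        "input": input_cost,
        "output": output_cost,
        "image": image_cost,
        "total": input_cost + output_cost + image_cost,
    }


# Assumed usage: 1,000 input tokens, 1,000 output tokens, 1 image.
breakdown = estimate_cost(1_000, 1_000, images=1)
print(f"Total: ${breakdown['total']:.6f}")  # prints "Total: $0.012002"
```

`Decimal` is used instead of floats so that very small per-token rates add up without rounding noise.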

Model Specifications

Capabilities

Function Calling
Vision
Web Search

Limits

Max Input Tokens: 32,768
Max Output Tokens: 32,768
Max Tokens: 32,768

About grok-2-vision

grok-2-vision is a chat model with vision support offered by xAI. This page lists its pricing, technical specifications, and capabilities to help you estimate the cost of using grok-2-vision in your applications.

Pricing Information

Input Cost: $2.00 per 1M tokens
Output Cost: $10.00 per 1M tokens
Image Input Cost: $0.000002 per image

Note: Use the interactive calculator above to estimate costs for your specific usage patterns.

Technical Specifications

Maximum Input Tokens: 32,768
Maximum Output Tokens: 32,768
Maximum Total Tokens: 32,768

Pro Tip

Use the token limits shown above to check whether a request fits the model's capacity. This model accepts up to 32,768 input tokens and generates up to 32,768 output tokens, with a maximum total of 32,768 tokens.
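
A pre-flight check against these limits might look like the sketch below. `check_limits` is a hypothetical helper, and treating the total limit as a budget shared by input and output tokens is an assumption; the page does not say how the total limit is enforced.

```python
from typing import List

# Limits copied from the Technical Specifications above.
MAX_INPUT_TOKENS = 32_768
MAX_OUTPUT_TOKENS = 32_768
MAX_TOTAL_TOKENS = 32_768


def check_limits(input_tokens: int, output_tokens: int) -> List[str]:
    """Return a list of limit violations; an empty list means the request fits."""
    problems = []
    if input_tokens > MAX_INPUT_TOKENS:
        problems.append(f"input exceeds {MAX_INPUT_TOKENS:,} tokens")
    if output_tokens > MAX_OUTPUT_TOKENS:
        problems.append(f"output exceeds {MAX_OUTPUT_TOKENS:,} tokens")
    # Assumption: the total limit covers input and output combined.
    if input_tokens + output_tokens > MAX_TOTAL_TOKENS:
        problems.append(f"input + output exceeds {MAX_TOTAL_TOKENS:,} tokens")
    return problems


print(check_limits(1_000, 1_000))   # fits within every limit
print(check_limits(30_000, 5_000))  # each part fits, but the sum does not
```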

Model Capabilities

Function Calling - Execute custom functions and tools
Vision - Process and understand images
Web Search - Access real-time web information

When should you use grok-2-vision?

grok-2-vision is best suited for the following scenarios:

  • Agentic systems with function or tool calling
  • Workflow automation and API orchestration
  • Multimodal applications requiring image processing
  • Content analysis across multiple media types

When should you avoid grok-2-vision?

  • High-volume text generation where output cost dominates
  • Streaming or verbose response workloads
  • Complex multi-step reasoning or planning tasks
  • Very large documents or long conversational histories

How does grok-2-vision compare to similar models?

This model sits in the middle of its category in terms of pricing and capabilities, making it a balanced option for general workloads.

Understanding grok-2-vision pricing

  • grok-2-vision is a general-purpose AI model provided by xAI.
  • Input tokens are priced at $2.00 per 1M tokens.
  • Output tokens are priced at $10.00 per 1M tokens.
  • Image input is priced at $0.000002 per image.
  • The model supports a maximum input capacity of 32,768 tokens.
  • Maximum output length is 32,768 tokens.
  • Output tokens cost five times as much as input tokens, so capping response length, along with trimming prompts, helps manage costs.
  • The model includes vision capabilities for processing and analysing images.
  • Supports function calling for executing custom functions and tools.
  • Includes real-time web search capabilities for accessing current information.
  • xAI offers grok-2-vision for general-purpose AI workloads.
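
To see how the gap between input and output pricing plays out, the sketch below computes the share of a request's token cost that comes from output. `output_share` is a hypothetical helper and the token counts in the loop are made-up examples; image cost is left out for simplicity.

```python
# Rates copied from the list above (USD per 1M tokens).
INPUT_PER_MILLION = 2.00
OUTPUT_PER_MILLION = 10.00


def output_share(input_tokens: int, output_tokens: int) -> float:
    """Fraction of the token cost attributable to output tokens."""
    input_cost = input_tokens * INPUT_PER_MILLION / 1_000_000
    output_cost = output_tokens * OUTPUT_PER_MILLION / 1_000_000
    total = input_cost + output_cost
    return output_cost / total if total else 0.0


# Hypothetical request shapes: (input tokens, output tokens).
for inp, out in [(1_000, 1_000), (5_000, 1_000), (10_000, 500)]:
    print(f"{inp:>6} in / {out:>5} out -> output share {output_share(inp, out):.0%}")
```

With equal input and output counts, output accounts for five sixths of the token cost; the two costs only break even when the prompt is five times longer than the response.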

How to Use This Calculator

Step 1: Enter the number of input tokens you expect to use. Input tokens include your prompt, system messages, and any context you provide to the model.

Step 2: Specify the number of output tokens you anticipate. Output tokens are the text generated by the model in response to your input.

Step 3: Review the cost breakdown to see the total estimated cost for your usage. The calculator automatically updates as you adjust the token counts.
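
The same three steps can be applied to a whole workload rather than a single request. The sketch below is one way to do that, assuming the per-1M-token rates listed on this page; `monthly_cost` and the traffic figures are hypothetical, and image cost is ignored.

```python
# Rates from the Pricing Information section (USD per 1M tokens).
INPUT_PER_MILLION = 2.00
OUTPUT_PER_MILLION = 10.00


def monthly_cost(requests_per_day: int, avg_input_tokens: int,
                 avg_output_tokens: int, days: int = 30) -> float:
    """Project a monthly token cost in USD from average request sizes."""
    # Steps 1 and 2: expected input and output tokens per request.
    per_request = (avg_input_tokens * INPUT_PER_MILLION
                   + avg_output_tokens * OUTPUT_PER_MILLION) / 1_000_000
    # Step 3: scale the per-request estimate to the whole period.
    return per_request * requests_per_day * days


# Hypothetical workload: 1,000 requests per day, 1,000 tokens in and out.
print(f"${monthly_cost(1_000, 1_000, 1_000):,.2f} per 30 days")
```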