Automatic Token and Cost Tracking
When you log LLM generations with the Maxim SDK, token usage and costs are captured automatically from the model response. Include the usage object, with prompt_tokens, completion_tokens, and total_tokens, when recording the result:
```typescript
generation.result({
  id: "chatcmpl-123",
  object: "chat.completion",
  created: Date.now(),
  model: "gpt-4o",
  choices: [{
    index: 0,
    message: {
      role: "assistant",
      content: "Response content here"
    },
    finish_reason: "stop"
  }],
  usage: {
    prompt_tokens: 100,
    completion_tokens: 50,
    total_tokens: 150
  }
});
```
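If you assemble this payload from raw token counts, it is worth keeping the three usage fields consistent. The following helper is a hypothetical sketch (not part of the Maxim SDK) that builds a usage object matching the shape above, deriving total_tokens from the other two fields:

```typescript
// Hypothetical helper, not a Maxim SDK API: shapes raw token counts
// into the usage object expected when recording a generation result.
interface Usage {
  prompt_tokens: number;
  completion_tokens: number;
  total_tokens: number;
}

function buildUsage(promptTokens: number, completionTokens: number): Usage {
  return {
    prompt_tokens: promptTokens,
    completion_tokens: completionTokens,
    // Derive the total so the three fields can never disagree.
    total_tokens: promptTokens + completionTokens,
  };
}
```

Deriving the total rather than accepting it as a third argument avoids logging inconsistent counts when a provider response is transformed before logging.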
For framework integrations like LangChain, LlamaIndex, or Google ADK, token usage is tracked automatically when you use the Maxim tracer. For example, with LangChain:
```python
from maxim.logger.langchain import MaximLangchainTracer

langchain_tracer = MaximLangchainTracer(logger)
response = llm.invoke(messages, config={"callbacks": [langchain_tracer]})
```
This automatically captures latency, token usage, and costs for all LLM calls without additional instrumentation.
Custom Metric Tracking via SDK
For more granular control, you can log token usage and cost metrics explicitly at different levels using the addMetric method.
Track metrics at the trace level:
```typescript
trace.addMetric('cost_usd', 0.05);
trace.addMetric('tokens_total', 1420);
```
Track metrics at the generation level:
```typescript
generation.addMetric('tokens_in', 312);
generation.addMetric('tokens_out', 87);
generation.addMetric('ttft_ms', 180.5); // Time to first token
generation.addMetric('tps', 15.8);      // Tokens per second
```
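If you compute the tps value yourself before logging it, one common definition is completion tokens divided by total generation time. That formula is an assumption for illustration, not a Maxim-defined metric:

```typescript
// Illustrative only: one way to derive a tokens-per-second value
// before passing it to addMetric. Assumes totalLatencyMs covers the
// full generation, including time to first token.
function tokensPerSecond(completionTokens: number, totalLatencyMs: number): number {
  return completionTokens / (totalLatencyMs / 1000);
}
```

For example, 87 completion tokens generated over 5.5 seconds yields roughly 15.8 tokens per second, matching the value logged above.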
Track metrics at the session level for aggregates across multiple interactions:
```typescript
session.addMetric('traces_count', 4);
session.addMetric('user_messages_count', 2);
```
Configuring Custom Token Pricing
To ensure cost calculations reflect your actual expenses (such as negotiated enterprise rates), configure custom pricing structures:
1. Navigate to Settings > Models > Pricing
2. Enter a model name pattern (string or regex) that matches your model names
3. Enter your cost per 1,000 input tokens and per 1,000 output tokens
Custom pricing supports OpenAI, Microsoft Azure, Groq, HuggingFace, Together AI, Google Cloud, and Amazon Bedrock models
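To make the per-1,000-token math concrete, here is a sketch of how a pricing rule like the one configured above could be applied. The regex pattern, the rates, and the null fallback are all illustrative assumptions, not Maxim's actual implementation (the platform applies pricing server-side):

```typescript
// Illustrative pricing lookup: match a model name against configured
// patterns and compute cost from per-1K rates. Rates below are examples.
interface PricingRule {
  pattern: RegExp;     // model name pattern, as entered in Settings
  inputPer1K: number;  // USD per 1,000 input tokens
  outputPer1K: number; // USD per 1,000 output tokens
}

const rules: PricingRule[] = [
  { pattern: /^gpt-4o/, inputPer1K: 0.0025, outputPer1K: 0.01 },
];

function costUsd(
  model: string,
  promptTokens: number,
  completionTokens: number
): number | null {
  const rule = rules.find((r) => r.pattern.test(model));
  if (!rule) return null; // no custom rule: standard pricing applies
  return (
    (promptTokens / 1000) * rule.inputPer1K +
    (completionTokens / 1000) * rule.outputPer1K
  );
}
```

With these example rates, 1,000 input and 1,000 output tokens on a gpt-4o model would cost $0.0125.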
Applying Pricing to Model Configs
1. Go to Settings > Models > Model Configs
2. Select a model config to edit
3. Locate the Pricing structure section
4. Choose your pricing structure from the dropdown
Applying Pricing to Log Repositories
1. Open Logs from the sidebar
2. Select the log repository you want to configure
3. Find the Pricing structure section
4. Choose your pricing structure from the dropdown
When no custom rates exist, standard pricing applies by default.
Setting Up Cost and Token Alerts
Monitor token usage and costs in real time by configuring alerts:

1. Navigate to your log repository and select the Alerts tab
2. Click Create alert and select Log metrics as the alert type
3. Configure thresholds for:
   - Token Usage: Alert when consumption exceeds limits (e.g., trigger when hourly usage exceeds 1 million tokens)
   - Cost: Alert when expenses exceed budgets (e.g., trigger when daily costs exceed $100)
4. Select notification channels (Slack or PagerDuty)
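The threshold logic behind the two example alerts above can be sketched as follows. Maxim evaluates alert rules on the platform side; this is only an illustration of the conditions, with the thresholds hard-coded to the example values:

```typescript
// Illustrative threshold check mirroring the example alert rules:
// hourly usage over 1M tokens, or daily cost over $100.
function triggeredAlerts(hourlyTokens: number, dailyCostUsd: number): string[] {
  const alerts: string[] = [];
  if (hourlyTokens > 1_000_000) alerts.push("token-usage");
  if (dailyCostUsd > 100) alerts.push("cost");
  return alerts;
}
```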
Dashboard Visibility
Once logging is set up, you can view aggregated token and cost data in your log repository dashboard, including:
- Total usage over time
- Cost per trace
- Token counts for each log entry (visible in the logs table)
- Latency and performance metrics