✨ MCP client, Live dashboard, Vertex AI evals, and more

Feature spotlight

🔌 MCP Clients on Maxim

Maxim now supports the Model Context Protocol (MCP), enabling your agents to interact with external tools, access real-time data, and perform actions. Here's what you can do with MCP clients:

  • Connect to popular MCP providers like Composio and Gumloop, or use your own custom MCP server.
  • Automatically import tools from the MCP servers directly into your workspace.
  • Execute tool calls from MCP servers directly in your AI interactions and test runs via the prompt playground.
  • Monitor connection status and logs for easy debugging.

Your AI agents can now send emails, create GitHub issues, search the web, and more, all through natural language.

Add MCP Client
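As a rough sketch of what happens under the hood, here is how a client can discover and invoke tools on an MCP server using the official mcp Python SDK. The server command, tool name, and arguments below are illustrative placeholders, not Maxim's internals:

```python
# Hypothetical sketch: connect to an MCP server over stdio, list its tools,
# and invoke one. Server command, tool name, and arguments are placeholders.
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

server = StdioServerParameters(
    command="npx",
    args=["-y", "@modelcontextprotocol/server-github"],  # example MCP server
)

async def main() -> None:
    async with stdio_client(server) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()

            # Discover tools: the step Maxim automates when it imports
            # tools from a connected MCP server into your workspace.
            tools = await session.list_tools()
            print([tool.name for tool in tools.tools])

            # Execute a tool call, as the prompt playground does on your behalf.
            result = await session.call_tool(
                "create_issue",
                {"owner": "acme", "repo": "demo", "title": "Bug report"},
            )
            print(result.content)

asyncio.run(main())
```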

📊 Live Dashboards

Monitor how your application's quality scores change across experiments and in production. Build dashboards with custom charts tailored to your needs, and gain full control over the analysis of your logs and performance metrics. Key features:

  • Custom logs dashboard: Visualize production logs using custom charts. Filter logs by errors, performance metrics (cost, latency, etc.), and quality metrics (clarity, tone, etc.).
  • Test run comparison: Create live dashboards for your test runs, allowing you to track the live trends of various evaluation metrics (bias, toxicity, etc.) across different runs.

This feature provides a centralized view of your application's performance for better analysis and decision-making.

Live Dashboards in action

🔄 Prompt Partials

Prompt partials are versioned, reusable text blocks you can directly reference in the prompt playground. Key benefits:

  • Reusability: Store commonly used content snippets as partials and reference them in prompts using {{partials.name.version}}, saving time and effort.
  • Independent iteration: Update a partial once, without modifying every prompt that references it, keeping prompts consistent across Maxim.

Learn how to use partials to make prompt iteration faster and cleaner.
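For example, a playground prompt might pull a shared tone guideline from a partial like this (the partial name and version are illustrative):

```
You are a support assistant for Acme Corp.

{{partials.tone_guidelines.v2}}

Answer the user's question using the context below.
```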

🚀 Vertex AI provider and evals

Vertex AI is now supported as a provider on Maxim, bringing our total number of providers to 13! With this integration, we’ve added 15 new evaluators from Vertex AI to the Evaluator Store, making it easier to run more advanced and detailed evaluations across a range of tasks.

Use Vertex AI evals on Maxim

🧩 Snowflake data connector

We are excited to introduce our new Snowflake data connector, built on our 100% OpenTelemetry compatibility. It lets you seamlessly stream all incoming logs directly into your Snowflake cluster. Here's what you can expect:

  • Structured timeline: Get a well-organized timeline of your logs for easy tracking and analysis.
  • Full log fidelity: Access complete and detailed logs, ensuring no data is lost or overlooked.

Follow this video to start streaming your logs to Snowflake.
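If your application already emits telemetry through OpenTelemetry, a minimal Python exporter setup looks roughly like this; the endpoint URL and auth header are placeholders, so consult Maxim's docs for the actual ingestion details:

```python
# Hypothetical sketch: export spans to an OTLP-compatible endpoint with the
# OpenTelemetry Python SDK. Endpoint and auth header are placeholders.
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter

exporter = OTLPSpanExporter(
    endpoint="https://<your-otlp-endpoint>/v1/traces",  # placeholder
    headers={"authorization": "Bearer <api-key>"},      # placeholder
)
provider = TracerProvider()
provider.add_span_processor(BatchSpanProcessor(exporter))
trace.set_tracer_provider(provider)

tracer = trace.get_tracer("my-app")
with tracer.start_as_current_span("llm-call") as span:
    span.set_attribute("model", "gpt-4.1")
    span.set_attribute("latency_ms", 420)
```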

🧠 GPT-4.1

OpenAI's GPT-4.1 model is now available on Maxim. Leverage its improved reasoning and lower latency to design custom evaluators and run smarter prompt experiments.

Start using this model via the OpenAI or Azure provider: go to Settings > Models, select the OpenAI or Azure provider, and add GPT-4.1.
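If you want to try the model directly while designing evaluators, here is a minimal sketch using the OpenAI Python SDK; the prompt is illustrative:

```python
# Minimal sketch: call GPT-4.1 through the OpenAI Python SDK.
# Assumes OPENAI_API_KEY is set in the environment; prompt is illustrative.
from openai import OpenAI

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4.1",
    messages=[
        {"role": "user", "content": "Score this answer for clarity on a 1-5 scale: ..."}
    ],
)
print(response.choices[0].message.content)
```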

Customer story

👾 Thoughtful’s journey with Maxim AI

Thoughtful is redefining AI companionship with T, an AI-powered emotional support companion designed to help users navigate life’s challenges with clarity and confidence.

As T’s capabilities expanded, managing a growing network of prompts across teams became difficult. Iteration was slow, reliance on engineering was a bottleneck, and testing new updates lacked structure.

Using Maxim, Thoughtful centralized prompt management, streamlined dataset-driven evaluations, and enabled product teams to push updates without engineering support. The result: faster development, higher AI quality, and a smoother path from ideation to production. Read the full customer story here.

Upcoming releases

🤖 Agentic mode

Simulate full agent behavior directly in the playground and test runs, with automatic tool calling. This is ideal for testing multi-step agentic flows without manually invoking the LLM at every step.
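To make "automatic tool calling" concrete, here is a generic tool-calling loop using the OpenAI Python SDK. This is a simplified illustration of the pattern, not Maxim's implementation; the tool and prompt are placeholders:

```python
# Generic agent loop sketch: the model decides when to call tools, we execute
# them and feed results back until it produces a final answer.
import json

from openai import OpenAI

client = OpenAI()

def get_weather(city: str) -> str:
    """Stub tool standing in for a real integration."""
    return f"Sunny in {city}"

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

messages = [{"role": "user", "content": "What's the weather in Paris?"}]
while True:
    reply = client.chat.completions.create(
        model="gpt-4.1", messages=messages, tools=tools
    ).choices[0].message
    if not reply.tool_calls:  # no more tool calls: final answer reached
        print(reply.content)
        break
    messages.append(reply)
    for call in reply.tool_calls:
        args = json.loads(call.function.arguments)
        messages.append({
            "role": "tool",
            "tool_call_id": call.id,
            "content": get_weather(**args),
        })
```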

📝 Human annotation flows: Revamped

We’ve redesigned the human annotation UI for both test runs and logs, making it faster and more intuitive to annotate your AI system’s output. Feedback from single or multiple annotators is consolidated into a single dashboard, eliminating the need to open each record individually. Curate your datasets using human feedback directly from this unified interface.

Knowledge nuggets

💡 Agentic evaluation series

AI agents are evolving from simple workflows to autonomous systems that interact with tools, plan actions, and adapt to user needs. Evaluating these systems requires looking beyond traditional metrics to measure real-world performance.

Our three-part series explores how to properly assess and improve AI agents through both pre-release simulations and post-release monitoring. We cover critical evaluation metrics like task success, agent trajectory, and tool usage accuracy, alongside practical frameworks for continuous improvement.

Whether you're building customer support, travel booking, or coding assistants, these evaluation approaches will help ensure your agents deliver consistent value. Check out our complete series on Agent Evaluation to build more reliable AI systems.

Build, test, and scale powerful AI systems using Maxim’s latest updates.