
# Measure Tool Call Accuracy

> Ensuring your prompt selects the correct tool call (function) is crucial for building reliable and efficient AI workflows. Maxim's playground lets you attach your tools (API, code, or schema) and measure tool call accuracy for agentic systems.

export const MaximPlayer = ({url}) => {
  return <iframe className="border-background-highlight-secondary h-full w-full rounded-md border-2 aspect-video" src={url} allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowFullScreen></iframe>;
};

<MaximPlayer url="https://www.youtube.com/embed/LGItsF0y5qk" />

Tool calling is a core part of any agentic AI workflow. Maxim's playground lets you test whether the LLM chooses the right tools and whether those tools execute successfully.

By experimenting in the [playground](/prompt-engineering/tool-calls), you can verify that your prompt calls the right tools in specific scenarios and that executing those tools leads to the right responses.

To test tool call accuracy at scale across all your use cases, run experiments using a dataset and evaluators as shown below.
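As a rough illustration of what such a dataset could contain, the sketch below builds a small CSV with the `input` and `expected tool calls` columns used in the steps that follow. The tool names (`get_weather`, `convert_currency`) and the `name`/`arguments` JSON shape are assumptions made for this example; check the dataset screenshots in the steps for the exact format your workspace expects.

```python
import csv
import io
import json

# Hypothetical test cases: the tool names and the {"name", "arguments"}
# shape are illustrative assumptions, not a confirmed Maxim schema.
ROWS = [
    {
        "input": "What's the weather in Paris today?",
        "expected tool calls": [
            {"name": "get_weather", "arguments": {"city": "Paris"}}
        ],
    },
    {
        "input": "Convert 100 USD to EUR",
        "expected tool calls": [
            {
                "name": "convert_currency",
                "arguments": {"amount": 100, "from": "USD", "to": "EUR"},
            }
        ],
    },
]


def build_dataset_csv(rows):
    """Serialize rows to CSV text, storing expected tool calls as a JSON string."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["input", "expected tool calls"])
    writer.writeheader()
    for row in rows:
        writer.writerow(
            {
                "input": row["input"],
                "expected tool calls": json.dumps(row["expected tool calls"]),
            }
        )
    return buffer.getvalue()


csv_text = build_dataset_csv(ROWS)
```

Storing the expected calls as a JSON string keeps each test case on a single row, which makes the file easy to review before uploading it as a dataset.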

## Measure Tool Call Accuracy Across Your Test Cases

<Steps>
  <Step title="Prepare your dataset">
    Set up your dataset with `input` and `expected tool calls` columns.

    <img src="https://mintcdn.com/maximai/YdQNCf1tftKyYOR4/images/docs/evaluate/how-to/evaluate-prompts/run-tool-call/create-dataset-tool.png?fit=max&auto=format&n=YdQNCf1tftKyYOR4&q=85&s=e1756a7c037d81fba224234d8ef324b8" alt="Dataset creation" width="1668" height="1472" data-path="images/docs/evaluate/how-to/evaluate-prompts/run-tool-call/create-dataset-tool.png" />
  </Step>

  <Step title="Define expected tool calls">
    For each input, add a JSON value listing one or more tool calls, along with their arguments, that you expect from the assistant.

    <img src="https://mintcdn.com/maximai/YdQNCf1tftKyYOR4/images/docs/evaluate/how-to/evaluate-prompts/run-tool-call/dataset-tool.png?fit=max&auto=format&n=YdQNCf1tftKyYOR4&q=85&s=31ddda282789df397c462d68435470a7" alt="Dataset example" width="2312" height="792" data-path="images/docs/evaluate/how-to/evaluate-prompts/run-tool-call/dataset-tool.png" />
  </Step>

  <Step title="Initiate prompt testing">
    Trigger a test on the prompt that has the tools attached.

    <img src="https://mintcdn.com/maximai/YdQNCf1tftKyYOR4/images/docs/evaluate/how-to/evaluate-prompts/run-tool-call/prompt-with-tools-test.png?fit=max&auto=format&n=YdQNCf1tftKyYOR4&q=85&s=4792b2400f0d97a8fea09ce187445fad" alt="Trigger test" width="2314" height="1570" data-path="images/docs/evaluate/how-to/evaluate-prompts/run-tool-call/prompt-with-tools-test.png" />
  </Step>

  <Step title="Select your test dataset">
    Select your dataset from the dropdown.

    <img src="https://mintcdn.com/maximai/YdQNCf1tftKyYOR4/images/docs/evaluate/how-to/evaluate-prompts/run-tool-call/select-dataset-with-tools.png?fit=max&auto=format&n=YdQNCf1tftKyYOR4&q=85&s=30b2b37d96a6fad9123791b999a9f7c6" alt="Prompt with tool instructions" width="1164" height="1390" data-path="images/docs/evaluate/how-to/evaluate-prompts/run-tool-call/select-dataset-with-tools.png" />
  </Step>

  <Step title="Choose the accuracy evaluator">
    Select the tool call accuracy evaluator under statistical evaluators and trigger the run. If the evaluator is not available in your workspace, add it from the evaluator store.

    <img src="https://mintcdn.com/maximai/YdQNCf1tftKyYOR4/images/docs/evaluate/how-to/evaluate-prompts/run-tool-call/tool-call-accuracy-eval.png?fit=max&auto=format&n=YdQNCf1tftKyYOR4&q=85&s=abc5ea6703e700b1b7414ba1ab9158cf" alt="Tool call accuracy evaluator" width="1164" height="956" data-path="images/docs/evaluate/how-to/evaluate-prompts/run-tool-call/tool-call-accuracy-eval.png" />
  </Step>

  <Step title="Review accuracy scores">
    Once the test run completes, each entry receives a tool call accuracy score of 0 or 1, based on the assistant's output.
  </Step>

  <Step title="Analyze detailed message logs">
    To inspect the messages for an entry, click the entry and then open the `messages` tab.

    <img src="https://mintcdn.com/maximai/YdQNCf1tftKyYOR4/images/docs/evaluate/how-to/evaluate-prompts/run-tool-call/message-details-with-tool.png?fit=max&auto=format&n=YdQNCf1tftKyYOR4&q=85&s=f47baa95e542892120f74d9d7dde3ba5" alt="Messages including tool" width="1510" height="1650" data-path="images/docs/evaluate/how-to/evaluate-prompts/run-tool-call/message-details-with-tool.png" />
  </Step>
</Steps>
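To build intuition for the 0 or 1 score, here is a minimal sketch of how a binary tool call comparison could work. It is not Maxim's evaluator implementation. The function names, the `name`/`arguments` shape, and the choice to ignore call order and argument key order are all assumptions made for illustration.

```python
import json


def normalize(calls):
    """Map each call to a comparable (name, canonical-arguments-JSON) pair.

    Sorting the pairs makes the comparison independent of call order, and
    sort_keys=True makes it independent of argument key order. Both are
    assumptions of this sketch.
    """
    return sorted(
        (call["name"], json.dumps(call.get("arguments", {}), sort_keys=True))
        for call in calls
    )


def tool_call_accuracy(expected, actual):
    """Return 1 if the actual calls match the expected calls exactly, else 0."""
    return int(normalize(expected) == normalize(actual))


# Hypothetical expected and actual calls for one test case.
expected_calls = [
    {"name": "get_weather", "arguments": {"city": "Paris", "unit": "C"}}
]
actual_calls = [
    {"name": "get_weather", "arguments": {"unit": "C", "city": "Paris"}}
]

score = tool_call_accuracy(expected_calls, actual_calls)
```

Under these assumptions, a wrong tool name, a missing call, or any differing argument value yields 0, which is why inspecting the `messages` tab is useful for diagnosing a failed entry.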
