# Agent Evals

> Test Agents using datasets to evaluate performance across examples

After testing in the playground, use test runs to evaluate your Agent across multiple test cases and confirm that it performs consistently.

<Steps>
  <Step title="Create a Dataset">
    Add test cases by creating a [Dataset](/library/datasets/import-or-create-datasets#create-datasets-using-templates). This example uses a Dataset of product images for which the Agent generates descriptions.

    <img src="https://mintcdn.com/maximai/xP8Yv_VjZfosz7nU/images/docs/evaluate/how-to/prompt-chains/prompt-chain-test-run-dataset.png?fit=max&auto=format&n=xP8Yv_VjZfosz7nU&q=85&s=b0ecbe447207041aa33b304e91b1f696" alt="Dataset with product images for testing" width="2894" height="1638" data-path="images/docs/evaluate/how-to/prompt-chains/prompt-chain-test-run-dataset.png" />
  </Step>

  <Step title="Build your Agent">
    Create an Agent that processes your test examples. In this case, the agent generates product descriptions, translates them to multiple languages, and formats them to match specific requirements.

    <img src="https://mintcdn.com/maximai/xP8Yv_VjZfosz7nU/images/docs/evaluate/how-to/prompt-chains/product-description-generator-and-translator-chain.png?fit=max&auto=format&n=xP8Yv_VjZfosz7nU&q=85&s=d525855e99de9d1a452bd70a9bbfe865" alt="Agent for product description generation" width="2886" height="1632" data-path="images/docs/evaluate/how-to/prompt-chains/product-description-generator-and-translator-chain.png" />
  </Step>

  <Step title="Start a test run">
    Open the test configuration by clicking `Test` in the top right corner.
  </Step>

  <Step title="Select your dataset">
    Select your dataset from the dropdown.

    <img src="https://mintcdn.com/maximai/ycSaiGU7LF8Hg48q/images/docs/evaluate/how-to/prompt-chains/prompt-chain-test-run-trigger-sheet.png?fit=max&auto=format&n=ycSaiGU7LF8Hg48q&q=85&s=c63d89bebe17b8de3ab2365632861986" alt="Test configuration with dataset and evaluator options" width="1398" height="1722" data-path="images/docs/evaluate/how-to/prompt-chains/prompt-chain-test-run-trigger-sheet.png" />
  </Step>

  <Step title="Configure your test">
    Select [Evaluators](/library/evaluators/pre-built-evaluators/overview) to measure the quality of outputs and map the evaluator variables to the dataset columns.

    For details, see [mapping evaluator variables](/library/evaluators/variables-mapping#prompt-variable-mapping).

    <img src="https://mintcdn.com/maximai/M3QOzTePU6PF_7d4/images/docs/evaluate/how-to/prompt-chains/prompt-chain-select-evaluator.png?fit=max&auto=format&n=M3QOzTePU6PF_7d4&q=85&s=430ccb601760953949f092a566991a93" alt="Evaluator selection in the test configuration" width="1394" height="1722" data-path="images/docs/evaluate/how-to/prompt-chains/prompt-chain-select-evaluator.png" />

    <Info>
      You can create and use [Presets](/offline-evals/via-ui/advanced/presets) for your test runs to save time and avoid repeating the same configuration.
    </Info>
  </Step>

  <Step title="Review results">
    Monitor the [test run](/offline-evals/concepts#test-runs) to analyze the performance of your Agent across all inputs.

    <img src="https://mintcdn.com/maximai/xP8Yv_VjZfosz7nU/images/docs/evaluate/how-to/prompt-chains/prompt-chain-test-run-report.png?fit=max&auto=format&n=xP8Yv_VjZfosz7nU&q=85&s=bb17d96da4ba35b0de8afd7935a07e79" alt="Test run results showing performance metrics" width="2314" height="806" data-path="images/docs/evaluate/how-to/prompt-chains/prompt-chain-test-run-report.png" />
  </Step>
</Steps>
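
To make the steps above concrete, here is a minimal conceptual sketch of what a test run does: it feeds each Dataset row to the Agent, resolves each evaluator's variables from the mapped columns, and aggregates the scores. This is an illustration only and does not use the Maxim SDK. Every name in it (`run_agent`, `length_evaluator`, `EvaluatorConfig`, `run_test`, the `agent_output` key, and the column names) is hypothetical.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable

# Hypothetical stand-ins for illustration; none of these names come from Maxim.
Row = dict[str, str]


def run_agent(row: Row) -> str:
    """Placeholder Agent: returns a canned description for the row's image."""
    return f"Description of {row['image_url']}"


def length_evaluator(output: str, expected: str) -> float:
    """Toy evaluator: 1.0 if the output is at most twice the expected length."""
    return 1.0 if len(output) <= 2 * len(expected) else 0.0


@dataclass
class EvaluatorConfig:
    name: str
    fn: Callable[..., float]
    # Evaluator variable name -> dataset column name, or "agent_output"
    # for the text the Agent produced (an assumed convention).
    variable_mapping: dict[str, str]


def run_test(dataset: list[Row], agent: Callable[[Row], str],
             evaluators: list[EvaluatorConfig]):
    """Run the agent on every row, score each output, and average the scores."""
    per_row = []
    for row in dataset:
        values = {**row, "agent_output": agent(row)}
        scores = {}
        for ev in evaluators:
            # Resolve each evaluator variable from its mapped source.
            kwargs = {var: values[src] for var, src in ev.variable_mapping.items()}
            scores[ev.name] = ev.fn(**kwargs)
        per_row.append(scores)
    summary = {ev.name: mean(s[ev.name] for s in per_row) for ev in evaluators}
    return per_row, summary


dataset = [
    {"image_url": "a.png", "expected_description": "A red ceramic mug"},
    {"image_url": "b.png", "expected_description": "Lamp"},
]
evaluators = [
    EvaluatorConfig(
        name="length_ok",
        fn=length_evaluator,
        variable_mapping={"output": "agent_output", "expected": "expected_description"},
    )
]
per_row, summary = run_test(dataset, run_agent, evaluators)
print(summary)
```

The explicit `variable_mapping` mirrors the mapping step in the test configuration: the evaluator declares the variables it needs, and the mapping decides which Dataset column or Agent output fills each one.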
