Why Run Comparison Experiments
- Decide between Prompt versions and models by comparing differences in their outputs.
- Analyze scores across all test cases in your Dataset for the evaluation metrics you choose.
- Compare results side by side for quick decision making, with a detailed view for every entry.
Run a Comparison Report
Select Prompt versions
Select the Prompt versions you want to compare. These can be entirely different Prompts or different versions of the same Prompt.
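To make the comparison concrete, here is a minimal sketch in plain Python (not this platform's API) of two Prompt versions that differ in their instructions; the names and templates are hypothetical.

```python
# Hypothetical Prompt versions to compare; names and templates are illustrative only.
prompt_versions = {
    "support-bot-v1": {
        "system": "You are a helpful support agent. Answer briefly.",
        "template": "Customer question: {question}",
    },
    "support-bot-v2": {
        "system": "You are a helpful support agent. Cite the provided context in your answer.",
        "template": "Context: {context}\n\nCustomer question: {question}",
    },
}
```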

Choose test Dataset
Select the Dataset to test against. Learn more about how to create a Dataset.
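Conceptually, a test Dataset is a list of test cases, each holding the input variables the Prompt expects and, optionally, a reference answer. A rough sketch with made-up field names (not the platform's schema):

```python
# Hypothetical test cases; field names are illustrative only.
dataset = [
    {"question": "How do I reset my password?",
     "expected": "Use the 'Forgot password' link on the login page."},
    {"question": "Can I change my billing email?",
     "expected": "Yes, under Settings > Billing."},
]
```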

Configure context evaluation
Optionally, select the context you want to evaluate if the retrieval pipelines differ and need to be compared.
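When the variants use different retrieval pipelines, the context each one sees can itself be evaluated, which helps separate retrieval problems from prompt problems. A hedged sketch with stand-in retrieval functions (your real pipelines would go here):

```python
# Hypothetical retrievers standing in for two different retrieval pipelines.
def retrieve_v1(question: str) -> list[str]:
    return ["FAQ: password resets", "FAQ: account access"]        # e.g. keyword search

def retrieve_v2(question: str) -> list[str]:
    return ["Help center: resetting your password step by step"]  # e.g. vector search

# Evaluating each variant's retrieved context separately makes it clear
# whether output differences come from retrieval or from the Prompt itself.
contexts = {
    "support-bot-v1": retrieve_v1("How do I reset my password?"),
    "support-bot-v2": retrieve_v2("How do I reset my password?"),
}
```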

Map evaluator variables
Once evaluators are selected, you can map the variable values to fit your needs. All built-in variables are mapped automatically, but you can change the mapping if necessary. Learn more about mapping evaluator variables.
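In essence, mapping evaluator variables means telling each Evaluator which field of the run or test case it should read. A minimal sketch, assuming a simple dict-based evaluator interface that is not the platform's actual API:

```python
# Hypothetical evaluator that expects "output" and "reference" variables.
def exact_match(variables: dict) -> float:
    return 1.0 if variables["output"].strip() == variables["reference"].strip() else 0.0

# The mapping decides which run / test-case fields feed each evaluator variable.
run_result = {"completion": "Use the 'Forgot password' link on the login page."}
test_case = {"expected": "Use the 'Forgot password' link on the login page."}

variables = {
    "output": run_result["completion"],  # built-in variables are typically mapped automatically
    "reference": test_case["expected"],  # but you can point them at different fields if needed
}
print(exact_match(variables))  # 1.0
```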

Review summary results
Once the run completes, you will see summary details for each Evaluator. Below that, charts compare latency, cost, and tokens used across the Prompt versions.
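The summary view is essentially an aggregation per Prompt version: mean evaluator scores plus average latency, cost, and token usage. A rough sketch of that aggregation, with made-up run records:

```python
from statistics import mean

# Hypothetical per-test-case results for each variant; numbers are made up.
runs = {
    "support-bot-v1": [
        {"score": 0.8, "latency_ms": 420, "cost_usd": 0.0012, "tokens": 310},
        {"score": 0.6, "latency_ms": 390, "cost_usd": 0.0011, "tokens": 295},
    ],
    "support-bot-v2": [
        {"score": 0.9, "latency_ms": 510, "cost_usd": 0.0016, "tokens": 380},
        {"score": 0.9, "latency_ms": 530, "cost_usd": 0.0017, "tokens": 402},
    ],
}

# Print one summary row per Prompt version, mirroring the report's summary charts.
for version, results in runs.items():
    print(
        version,
        f"score={mean(r['score'] for r in results):.2f}",
        f"latency={mean(r['latency_ms'] for r in results):.0f}ms",
        f"cost=${mean(r['cost_usd'] for r in results):.4f}",
        f"tokens={mean(r['tokens'] for r in results):.0f}",
    )
```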




