1
Create and publish a Prompt
Start by creating a new Prompt in the playground. Configure your messages and settings, then publish the version when you’re ready for testing.

2
Configure and trigger a test
Choose a Dataset with your test cases and add Evaluators to measure response quality. You can mix and match evaluators to check for accuracy, toxicity, and more.
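Conceptually, a test run pairs every Dataset row with every Evaluator you selected. The following is a minimal sketch of that idea in plain Python, not the product's API: the names `dataset`, `exact_match`, `no_blocked_words`, `run_test`, and `fake_model` are all hypothetical, and the two evaluators are simple stand-ins for accuracy and toxicity checks.

```python
from typing import Callable

# Hypothetical test cases: each row pairs an input with an expected output.
dataset = [
    {"input": "What is 2 + 2?", "expected": "4"},
    {"input": "Capital of France?", "expected": "Paris"},
]

def exact_match(expected: str, actual: str) -> float:
    """Accuracy stand-in: 1.0 when the answers match, ignoring case and outer whitespace."""
    return 1.0 if expected.strip().lower() == actual.strip().lower() else 0.0

# Illustrative word list only; a real toxicity evaluator is far more involved.
BLOCKLIST = {"idiot", "stupid"}

def no_blocked_words(expected: str, actual: str) -> float:
    """Toxicity stand-in: 0.0 when the response contains a blocklisted word."""
    words = {w.strip(".,!?").lower() for w in actual.split()}
    return 0.0 if words & BLOCKLIST else 1.0

# Mix and match evaluators by adding or removing entries here.
evaluators: dict[str, Callable[[str, str], float]] = {
    "exact_match": exact_match,
    "no_blocked_words": no_blocked_words,
}

def run_test(generate: Callable[[str], str], dataset: list, evaluators: dict) -> list:
    """Run every test case through `generate` and score it with every evaluator."""
    results = []
    for case in dataset:
        actual = generate(case["input"])
        scores = {name: fn(case["expected"], actual) for name, fn in evaluators.items()}
        results.append({**case, "actual": actual, "scores": scores})
    return results

def fake_model(prompt: str) -> str:
    """Placeholder for the published Prompt version; returns canned answers."""
    return {"What is 2 + 2?": "4", "Capital of France?": "Lyon"}.get(prompt, "")

results = run_test(fake_model, dataset, evaluators)
```

Keeping evaluators as independent functions with a shared signature is what makes them easy to combine: each one scores the same response on a different dimension.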

3
Review test results
Once the test completes, you get a report on how your Prompt performed. The report shows:
- Overall quality scores across your test cases
- Which inputs performed best and worst
- Side-by-side comparisons of expected vs. actual outputs
- Detailed evaluator feedback on specific responses
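To make the report contents concrete, here is a small sketch of how such a summary could be derived from per-case results. It is an illustration under assumptions, not the product's report format: the `results` records, their field names, and the `summarize` helper are hypothetical, and the scores are made-up sample values.

```python
from statistics import mean

# Hypothetical per-case results: one score and one feedback note per test case.
results = [
    {"input": "What is 2 + 2?", "expected": "4", "actual": "4",
     "score": 1.0, "feedback": "Exact match."},
    {"input": "Capital of France?", "expected": "Paris", "actual": "Lyon",
     "score": 0.0, "feedback": "Wrong city."},
    {"input": "Opposite of hot?", "expected": "cold", "actual": "Cold.",
     "score": 0.5, "feedback": "Correct word, extra punctuation."},
]

def summarize(results: list) -> dict:
    """Compute the overall score and pick the best and worst performing cases."""
    ranked = sorted(results, key=lambda r: r["score"], reverse=True)
    return {
        "overall": mean(r["score"] for r in results),
        "best": ranked[0],
        "worst": ranked[-1],
    }

report = summarize(results)

print(f"Overall score: {report['overall']:.2f}")
for r in results:
    # Side-by-side view of expected vs. actual output, with evaluator feedback.
    print(f"{r['input']!r}: expected={r['expected']!r} actual={r['actual']!r} "
          f"score={r['score']} ({r['feedback']})")
```

Sorting by score surfaces the weakest inputs first, which is usually where the next Prompt revision should focus.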
