1

Create and publish a Prompt

Start by creating a new Prompt in the playground. Configure your messages and settings, then publish the version when you’re ready for testing.

2

Configure and trigger a test

Choose a Dataset with your test cases and add Evaluators to measure response quality. You can mix and match evaluators to check for accuracy, toxicity, and more.
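
The platform runs this step for you, but a rough sketch can help when deciding which evaluators to combine. The Python below is an illustration only, not the product's API: `TestCase`, `exact_match`, `no_blocked_terms`, `run_test`, and `stub_prompt` are hypothetical names, and the keyword check is a crude stand-in for a real toxicity evaluator.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class TestCase:
    """One dataset row: an input and the output we expect for it."""
    input: str
    expected: str


# Assumption: an evaluator maps (expected, actual) to a score in [0.0, 1.0].
Evaluator = Callable[[str, str], float]


def exact_match(expected: str, actual: str) -> float:
    """Accuracy stand-in: 1.0 when the outputs match, ignoring case and padding."""
    return 1.0 if expected.strip().lower() == actual.strip().lower() else 0.0


def no_blocked_terms(expected: str, actual: str) -> float:
    """Toxicity stand-in: 0.0 if the output contains a blocked word, else 1.0."""
    blocked = {"idiot", "stupid"}  # hypothetical word list
    return 0.0 if set(actual.lower().split()) & blocked else 1.0


def run_test(
    cases: List[TestCase],
    evaluators: Dict[str, Evaluator],
    generate: Callable[[str], str],
) -> List[dict]:
    """Run every test case through the prompt and score it with every evaluator."""
    results = []
    for case in cases:
        actual = generate(case.input)
        scores = {name: ev(case.expected, actual) for name, ev in evaluators.items()}
        results.append(
            {"input": case.input, "expected": case.expected,
             "actual": actual, "scores": scores}
        )
    return results


def stub_prompt(text: str) -> str:
    """Stands in for the published Prompt version; returns canned answers."""
    answers = {"Capital of France?": "Paris", "2 + 2?": "5"}
    return answers.get(text, "")


dataset = [TestCase("Capital of France?", "Paris"), TestCase("2 + 2?", "4")]
evaluators = {"accuracy": exact_match, "toxicity": no_blocked_terms}
results = run_test(dataset, evaluators, stub_prompt)
```

Keeping each evaluator as an independent function is what makes mixing and matching cheap: adding a new quality check means adding one entry to the `evaluators` mapping.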

3

Review test results

Once the test is complete, you’ll get a report on your Prompt’s performance. The report shows:

  • Overall quality scores across your test cases
  • Which inputs performed best and worst
  • Side-by-side comparisons of expected vs. actual outputs
  • Detailed evaluator feedback on specific responses

Use the report to quickly identify where your Prompt performs well and where it needs improvement.
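
To make the report contents concrete, here is a minimal sketch of how per-case evaluator scores could roll up into overall scores, best and worst inputs, and an expected-versus-actual view. It is not the product's implementation: the `results` rows, their field names, and the `summarize` helper are hypothetical sample data and code.

```python
from statistics import mean

# Hypothetical per-case results; scores are assumed to lie in [0.0, 1.0].
results = [
    {"input": "Capital of France?", "expected": "Paris", "actual": "Paris",
     "scores": {"accuracy": 1.0, "toxicity": 1.0}},
    {"input": "2 + 2?", "expected": "4", "actual": "5",
     "scores": {"accuracy": 0.0, "toxicity": 1.0}},
    {"input": "Opposite of hot?", "expected": "cold", "actual": "cold",
     "scores": {"accuracy": 1.0, "toxicity": 1.0}},
]


def summarize(results):
    """Aggregate per-case scores into a small report dictionary."""
    names = sorted(results[0]["scores"])
    # Overall quality: mean of each evaluator's score across all test cases.
    overall = {n: mean(r["scores"][n] for r in results) for n in names}
    # Rank cases by their average score across evaluators (lowest first).
    ranked = sorted(results, key=lambda r: mean(r["scores"].values()))
    # Side-by-side view of cases where the output differs from the expectation.
    mismatches = [
        (r["input"], r["expected"], r["actual"])
        for r in results
        if r["expected"] != r["actual"]
    ]
    return {
        "overall": overall,
        "worst": ranked[0]["input"],
        "best": ranked[-1]["input"],
        "mismatches": mismatches,
    }


report = summarize(results)
for inp, expected, actual in report["mismatches"]:
    print(f"{inp!r}: expected {expected!r}, got {actual!r}")
```

In this sample data, the case with the lowest average score is the one whose output differs from the expected value, which is the kind of input worth revisiting first.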