Creating golden datasets is essential for scaling your application effectively. Maxim allows you to curate high-quality datasets directly from human annotations as your application evolves.

Follow these steps to curate dataset entries from human annotations:

1. Set up a test run: Run a test on a prompt or workflow and send the results to human raters for annotation. Learn more about human-in-the-loop evaluation in our evaluation guide.

2. Access the test run report: Once human ratings have been collected, navigate to the test run report.

3. Find the human evaluation card: Locate the human evaluation card in the summary section; it shows each rater’s email and completion status.

4. View detailed ratings: Click the “View Details” button next to a completed rater’s email to access their detailed ratings.

5. Review evaluation data: Review the ratings, comments, and human-corrected outputs where available.

6. Select entries to preserve: Use the row checkboxes to select the entries you want to keep, then click the “Add to Dataset” button at the top.

7. Map data to dataset columns: Select your target Dataset and map the relevant data to the appropriate columns. For example, map human-corrected outputs to the ground-truth column in your golden dataset.
  • Uncheck any columns you don’t want to include in the dataset.
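To make the final mapping step concrete, here is a minimal sketch in Python of how selected annotation entries might be mapped into golden-dataset columns. The field names (`input`, `model_output`, `corrected_output`, `rater_comment`) are hypothetical illustrations, not Maxim’s actual export schema.

```python
# Hypothetical annotation entries as selected from a test run report.
# Field names are illustrative only, not Maxim's actual export schema.
annotations = [
    {
        "input": "Summarize the refund policy.",
        "model_output": "Refunds are allowed within 30 days.",
        "corrected_output": "Refunds are allowed within 30 days of purchase, with a receipt.",
        "rater_comment": "Missing the receipt requirement.",
    },
    {
        "input": "What is the support email?",
        "model_output": "support@example.com",
        "corrected_output": None,  # rater approved the model output as-is
        "rater_comment": "Correct.",
    },
]


def to_dataset_row(entry):
    """Map one annotation entry to golden-dataset columns.

    The human-corrected output (when present) becomes the ground truth;
    otherwise the approved model output is used.
    """
    return {
        "input": entry["input"],
        "ground_truth": entry["corrected_output"] or entry["model_output"],
    }


golden_rows = [to_dataset_row(e) for e in annotations]
for row in golden_rows:
    print(row)
```

The key design choice mirrored here is the fallback: entries a rater corrected contribute the correction as ground truth, while entries approved unchanged contribute the original model output.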