Human annotation is critical to improving AI quality. Getting human raters to provide feedback on various dimensions helps you measure the current state of the system and improve it over time. Maxim's human-in-the-loop pipeline allows team members as well as external raters, such as subject matter experts, to annotate AI outputs.
The workflow involves the following steps:

1. Creating human evaluators
2. Selecting evaluators for test runs
3. Configuring evaluation settings
4. Collecting ratings
5. Reviewing rating results
When you click the Trigger test run button, if any human evaluators were chosen, you will see a popover to set up the human evaluation.
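If you trigger test runs programmatically, the same setup information has to come from somewhere in your code. The sketch below is only an illustration of what that configuration might capture; the class and field names are assumptions, not Maxim's SDK surface, so check the SDK reference for the real API.

```python
# Illustrative sketch only: the class and field names below are assumptions
# about what the human evaluation setup captures, not Maxim's actual SDK API.
from dataclasses import dataclass, field
from typing import List


@dataclass
class HumanEvaluationSetup:
    """Settings captured by the human evaluation popover (hypothetical shape)."""

    evaluator_names: List[str]                              # human evaluators attached to the run
    rater_emails: List[str] = field(default_factory=list)   # external raters such as SMEs, if any
    instructions: str = ""                                   # guidance shown to raters


setup = HumanEvaluationSetup(
    evaluator_names=["helpfulness", "tone"],
    rater_emails=["sme@example.com"],
    instructions="Rate each output; rewrite it if it is not up to the mark.",
)
```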
To collect ratings on the Maxim dashboard, click the Select rating button in the relevant evaluator column. A popover will show all the evaluators that need ratings. Add comments for each rating; if the output is not up to the mark, submit a rewritten output. Once ratings start coming in, the human evaluation status moves from Pending to In-progress on the test run summary.
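Each submission through this popover boils down to one small record per evaluator per entry. A minimal, hypothetical representation of that record (field names are assumptions, not Maxim's data model):

```python
# Hypothetical shape of a single rating collected through the popover;
# field names are assumptions, not Maxim's data model.
from dataclasses import dataclass
from typing import Optional


@dataclass
class HumanRating:
    entry_id: str                            # the test run entry being rated
    evaluator: str                           # e.g. "helpfulness"
    rater: str                               # email of the person who rated
    score: float                             # value on the evaluator's rating scale
    comment: str = ""                        # note attached to the rating
    rewritten_output: Optional[str] = None   # supplied when the output is not up to the mark
```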
Human raters can go through the query, retrieved context, output, and expected output (if applicable) for each entry and then provide their ratings for each evaluation metric. They can also add comments or rewrite the output for a particular entry. On completing the ratings for an entry, they can save and proceed, and these values will start reflecting in the Maxim test run report.
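Conceptually, the report rolls these per-entry ratings up per evaluator. A minimal sketch of that roll-up, assuming each rating is a plain mapping with evaluator and score fields (the mean aggregation shown here is an assumption, not necessarily how Maxim summarizes scores):

```python
from collections import defaultdict
from statistics import mean


def report_summary(ratings):
    """Average score per human evaluator across all rated entries.

    Each rating is assumed to be a mapping such as:
    {"evaluator": "helpfulness", "score": 4.0, "rater": "sme@example.com"}
    """
    by_evaluator = defaultdict(list)
    for rating in ratings:
        by_evaluator[rating["evaluator"]].append(rating["score"])
    return {name: mean(scores) for name, scores in by_evaluator.items()}
```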
Once a rater has submitted all of their ratings, you will see a Completed status next to that rater's email. To view the detailed ratings by a particular individual, click the View details button and go through the table provided.
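To tie the status labels together: a rater roughly shows Pending before any rating, In-progress while ratings are partial, and Completed once every assigned entry is rated. The exact transition rules and data shapes below are assumptions that mirror the dashboard labels, not Maxim's implementation.

```python
from collections import defaultdict


def rater_statuses(assigned_entries, ratings):
    """Derive a Pending / In-progress / Completed label per rater.

    assigned_entries: {rater_email: [entry_id, ...]} entries each rater should rate
    ratings: iterable of dicts like {"rater": "sme@example.com", "entry_id": "e1", ...}
    The transition rules here are an assumption mirroring the dashboard labels.
    """
    rated = defaultdict(set)
    for rating in ratings:
        rated[rating["rater"]].add(rating["entry_id"])

    statuses = {}
    for rater, entries in assigned_entries.items():
        done = len(rated[rater] & set(entries))
        if done == 0:
            statuses[rater] = "Pending"
        elif done < len(entries):
            statuses[rater] = "In-progress"
        else:
            statuses[rater] = "Completed"
    return statuses
```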