Use human evaluation, or ratings, to assess the quality of your logs.
**Navigate to repository**

Open the repository whose logs you want to evaluate.

**Access evaluation configuration**

Click the **Configure evaluation** button in the top right corner of the page and choose the **Setup evaluation configuration** option. This opens the evaluation configuration sheet.

**Select human evaluators**

Scroll to the **Human Evaluation** section. In the dropdown under **Select evaluators**, choose the Human Evaluators to use for your evaluation. This determines which evaluations will run on your logs. Next, set up filtering criteria to decide which logs should be evaluated, since manually evaluating every log quickly becomes impractical; a sketch of such criteria follows below.
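To make the filtering step concrete, here is a minimal sketch of the kind of criteria you might express. The field names and structure are illustrative assumptions; the configuration sheet exposes these choices as UI controls, not code.

```python
# Hypothetical filtering criteria -- an assumed shape, not the
# platform's actual schema. Only logs matching these conditions
# would be sent for human annotation.
filter_criteria = {
    "sample_rate": 0.1,  # annotate roughly 10% of matching logs
    "conditions": [
        {"field": "model", "op": "equals", "value": "gpt-4o"},
        {"field": "latency_ms", "op": "greater_than", "value": 2000},
        {"field": "tags", "op": "contains", "value": "production"},
    ],
}
```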
**Save configuration**

Click the **Save configuration** button.

**Access annotation queue**

Click the **Configure evaluation** button in the top right corner of the page again, but choose the **View annotation queue** option this time. You will be taken to the annotation queue page.

**Set up queue logic**
Find the **Set up queue logic** button and click it to define the logic for the queue, then click the **Save queue logic** button to save. A sketch of what this logic might encode follows below.
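Queue logic governs how pending logs are distributed to annotators. As a rough sketch, assuming hypothetical option names, it might encode decisions like these:

```python
# Hypothetical queue logic -- option names are assumptions used to
# illustrate the decisions a queue typically encodes.
queue_logic = {
    "annotators_per_log": 2,      # each log is rated by two people
    "assignment": "round_robin",  # spread logs evenly across annotators
    "order": "newest_first",      # surface the most recent traces first
}
```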
<Icon icon="notebook-pen" /> **Add to annotation queue**

Click the **Add to annotation queue** button and you're done! While annotating, use the **Save and next** button to move to the next log/entry and score it.
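Conceptually, each "Save and next" captures one annotation per user and evaluator. The record below is a sketch with assumed field names, shown only to clarify what scoring a log produces:

```python
# Hypothetical shape of a single human annotation -- assumed field
# names, not the stored format.
annotation = {
    "log_id": "trace_123",     # assumed identifier for the trace/log
    "evaluator": "Clarity",    # the Human Evaluator being applied
    "user": "alice",           # the annotator who gave the rating
    "score": 4,                # the rating itself
    "comment": "Concise, but misses one detail",  # optional note
    "rewritten_output": None,  # optional corrected output
}
```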
Once annotation is underway, open a log to review its results. The log's sheet has a **Details** and an **Evaluation** tab. The **Evaluation** tab displays all the evaluations that happened on that trace. We focus on Human Evaluators here; to make sense of the other evaluators in this sheet, refer to Auto Evaluation -> Making sense of evaluations on logs.
The trace evaluation overview tab shows the average score of each Human Evaluator and, if present, the Rewritten Outputs submitted by each individual user.
For each evaluator, you see the **Score** (avg.) and the **Result** (whether that evaluator's evaluation passed or failed). This tab also breaks down the scores and their corresponding comments, if any, given by each user, giving you a granular view of the evaluation as well.
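To clarify how the displayed average relates to individual ratings, here is a minimal sketch that aggregates per-user scores into a per-evaluator average. The records and the pass/fail threshold are assumptions for illustration:

```python
from collections import defaultdict
from statistics import mean

# Hypothetical annotations for one trace -- field names and values
# are illustrative, not the platform's data model.
annotations = [
    {"evaluator": "Clarity", "user": "alice", "score": 4},
    {"evaluator": "Clarity", "user": "bob", "score": 5},
    {"evaluator": "Accuracy", "user": "alice", "score": 2},
]

PASS_THRESHOLD = 3  # assumed cutoff for a "passed" result

scores = defaultdict(list)
for a in annotations:
    scores[a["evaluator"]].append(a["score"])

for evaluator, vals in scores.items():
    avg = mean(vals)
    result = "passed" if avg >= PASS_THRESHOLD else "failed"
    print(f"{evaluator}: score (avg.) = {avg:.1f} -> {result}")
```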