The need for human evaluation
While machine learning models can provide a baseline evaluation, they may not capture the nuances of human perception, simply because they lack the ability to understand the context and emotions behind some scenarios. Humans in these scenarios can also provide better comments and insights, which makes it essential to have humans be part of the evaluation process. Human evaluation on logs is very similar to how human annotation is done on test runs; in fact, the Human Evaluators used in test runs are also used here. Let's see how we can set up a human evaluation pipeline for our logs.

Before you start

You need to have your logging set up to capture interactions between your LLM and users before you can evaluate them. To do so, you need to integrate the Maxim SDK into your application.

Also, if you do not have a Human Evaluator created in your workspace, please create one by navigating to the Evaluators tab from the sidebar, as we will need it to set up the human evaluation pipeline.
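For reference, wiring up the SDK for logging typically looks like the minimal sketch below. This is an illustrative Python example only: the class and helper names used here (Maxim, Config, LoggerConfig, TraceConfig, and the trace input/output helpers) are assumptions, so check the Maxim SDK reference for the exact API in your language and SDK version.

```python
# Minimal sketch of capturing LLM interactions as logs with the Maxim SDK.
# All class/method names below are illustrative assumptions -- confirm the
# exact API in the Maxim SDK reference for your language and version.
from maxim import Maxim, Config
from maxim.logger import LoggerConfig, TraceConfig

# Authenticate and point the logger at the log repository you want to evaluate.
maxim = Maxim(Config(api_key="YOUR_MAXIM_API_KEY"))
logger = maxim.logger(LoggerConfig(id="YOUR_LOG_REPOSITORY_ID"))

# Wrap each user interaction in a trace so it shows up as a log entry.
trace = logger.trace(TraceConfig(id="trace-123", name="chat-turn"))
trace.set_input("What is human evaluation?")   # assumed helper for the user message
trace.set_output("Human evaluation is ...")    # assumed helper for the model response
trace.end()
```

Once interactions like this start flowing into your repository, they appear as logs that the human evaluation pipeline described below can pick up.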
Setting up human evaluation
1
Navigate to repository
Navigate to the repository where you want to set up human evaluation on logs.
2
Access evaluation configuration
Click on Configure evaluation in the top right corner of the page and choose the Setup evaluation configuration option. This will open the evaluation configuration sheet.
3
Select human evaluators
Focus on the Human Evaluation section. Under the Select evaluators dropdown, choose the Human Evaluators you want to use for the evaluation. This sets up what evaluation we want to run on our logs. Next, we need to set up filtering criteria to determine which logs should be evaluated, since evaluating every log by hand quickly becomes unmanageable.
We covered the Auto evaluation section above; you can learn more there about using other types of evaluators on your logs.
4
Save configuration
Before we set up the filtering criteria, we need to save this configuration. Do this by clicking on the Save configuration button.
5
Access annotation queue
To get to the filtering criteria, click on Configure evaluation in the top right corner of the page again, but choose the View annotation queue option this time. You will be taken to the annotation queue page.
6
Set up queue logic

You will see a Set up queue logic button. Click on it to set up the logic for the queue, then click on the Save queue logic button to save.
Manually add logs to the queue by:
- Selecting the logs you want to add to the queue by clicking the checkboxes at the left of each log
- Clicking on the <Icon icon="notebook-pen" /> Add to annotation queue button, and you're done!
Viewing annotations
There are 3 places where annotations can be viewed:

The annotation queue page
Here, each added log has its human evaluators' scores displayed. The score shown is the average of all the annotations done for that evaluator by different users. On editing a score, the individual score, comment, and rewritten output (if any) of the user editing it are shown, with the ability to edit all of them.
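As a quick illustration of that averaging (plain arithmetic, not the platform's actual implementation), suppose three users rate the same log for one evaluator:

```python
# Hypothetical per-user ratings for one log against a single Human Evaluator.
ratings = {"alice": 4, "bob": 5, "carol": 3}

# The annotation queue displays the mean of these ratings for that evaluator.
displayed_score = sum(ratings.values()) / len(ratings)
print(displayed_score)  # 4.0
```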
Annotating the logs
On opening the annotation queue page, you will see a list of logs that have been added to the queue, each with a Select rating dropdown beside it. Clicking the Select rating dropdown opens a modal where you can select a rating for the log and optionally add a comment or provide a rewritten output if necessary. Click the Save and next button to move to the next log/entry and score it.

The logs table
Similar to how evaluator scores are shown for auto evaluation, human evaluator scores are also shown in the logs table (again, the average score is shown here).
The trace details sheet (under the Evaluation tab)
On opening any trace, you will find a Details tab and an Evaluation tab. The Evaluation tab displays all the evaluations that happened on the trace. We will focus on the Human Evaluators here; to make sense of the other evaluators in this sheet, refer to Auto Evaluation -> Making sense of evaluations on logs.
The trace evaluation overview tab shows the average score of each Human Evaluator and the rewritten outputs, if present, from each individual user.

For each Human Evaluator, this tab shows the Score (avg.) and the Result (whether that particular evaluator's evaluation passed or failed). We also see a breakdown of the scores and their corresponding comments, if any, given by each user, giving you a granular view of the evaluation as well.
