Curating from Production Logs
The production log curation workflow in Maxim follows these steps:
- Select relevant logs: Navigate to your log repository (preferably production) and use filters to identify high-quality examples, edge cases, or specific scenarios you want to preserve for testing
- Initiate dataset creation: Select the logs you want to curate and click the “Add to Dataset” button in the top right corner
- Choose or create dataset: Either add to an existing dataset or create a new one using Maxim’s pre-built templates (like “Dataset testing”) or custom column structures
- Map log fields to dataset columns: Configure how log data maps to your dataset structure (e.g., Input field to Input column, Output to Output column, custom fields to reference data columns)
- Finalize and access: Click “Add to Dataset” and receive a notification when processing is complete
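The filtering and field-mapping steps above happen in the Maxim UI, but the underlying idea can be sketched in plain Python. This is a minimal illustration, not Maxim's SDK or log schema: the log field names (`input`, `output`, `metadata.intent`, `rating`), the column names, and the helpers `get_path` and `curate` are all hypothetical.

```python
from typing import Any, Callable

# Hypothetical log records; the field names are illustrative, not Maxim's schema.
logs = [
    {"input": "Reset my password", "output": "Use the reset link.",
     "metadata": {"intent": "account"}, "rating": 5},
    {"input": "Cancel order 17", "output": "",
     "metadata": {"intent": "orders"}, "rating": 1},
]

# Dataset column name -> dot-separated path of the log field that fills it.
FIELD_MAP = {
    "Input": "input",
    "Output": "output",
    "Reference Data": "metadata.intent",
}

def get_path(record: dict, path: str) -> Any:
    """Follow a dot-separated path through nested dicts; return None if missing."""
    value: Any = record
    for key in path.split("."):
        if not isinstance(value, dict) or key not in value:
            return None
        value = value[key]
    return value

def curate(logs: list[dict], field_map: dict[str, str],
           keep: Callable[[dict], bool]) -> list[dict]:
    """Keep the logs that pass the filter and project them onto dataset columns."""
    return [
        {column: get_path(log, path) for column, path in field_map.items()}
        for log in logs
        if keep(log)
    ]

# Filter step: keep only logs that were rated highly (an assumed quality signal).
rows = curate(logs, FIELD_MAP, keep=lambda log: log["rating"] >= 4)
print(rows)
```

Keeping the mapping as data (`FIELD_MAP`) rather than hard-coding it mirrors the mapping step in the workflow: the same logs can feed datasets with different column structures by swapping the mapping.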
Benefits of Production-Based Datasets
Curating from production logs provides several advantages:
- Real user queries and interactions rather than hypothetical scenarios
- Edge cases and failure modes discovered in production
- Distribution of queries that matches actual usage patterns
- Continuously evolving test coverage as your application grows
Curating from Human Annotations
To create golden datasets with verified correct outputs:
- Set up test runs and send results to human raters for annotation
- Review completed ratings including comments and human-corrected outputs
- Select high-quality annotated entries using row checkboxes
- Map human-corrected outputs to ground truth columns in your golden dataset
- Selectively include only the columns relevant to your evaluation needs
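As a rough sketch of the annotation steps above, the snippet below turns annotated test-run entries into golden dataset rows. Everything in it is an assumption made for illustration: the entry fields (`model_output`, `human_corrected_output`, `human_rating`, `comment`), the column names, the rating threshold, and the `build_golden_rows` helper are hypothetical and do not come from Maxim's API. It also assumes one particular selection rule: a human-corrected output always becomes the ground truth, and an uncorrected model output is accepted only when the rater scored it highly.

```python
from typing import Optional

# Hypothetical annotated entries from a test run (field names are illustrative).
annotated = [
    {"input": "What is the refund window?", "model_output": "14 days",
     "human_corrected_output": "30 days", "human_rating": 2,
     "comment": "Wrong policy"},
    {"input": "Do you ship to Canada?", "model_output": "Yes, we ship to Canada.",
     "human_corrected_output": None, "human_rating": 5, "comment": ""},
    {"input": "Is there a student discount?", "model_output": "No idea.",
     "human_corrected_output": None, "human_rating": 1, "comment": "Unhelpful"},
]

def ground_truth(entry: dict, min_rating: int) -> Optional[str]:
    """Prefer the human-corrected output; otherwise accept a highly rated model output."""
    corrected = entry.get("human_corrected_output")
    if corrected:
        return corrected
    if entry.get("human_rating", 0) >= min_rating:
        return entry["model_output"]
    return None  # no verified answer available, so the entry is skipped

def build_golden_rows(entries: list[dict], columns: list[str],
                      min_rating: int = 4) -> list[dict]:
    """Map verified answers to the ground-truth column and keep only chosen columns."""
    rows = []
    for entry in entries:
        expected = ground_truth(entry, min_rating)
        if expected is None:
            continue
        full_row = {
            "Input": entry["input"],
            "Expected Output": expected,
            "Comment": entry.get("comment", ""),
        }
        rows.append({column: full_row[column] for column in columns})
    return rows

# Include only the columns relevant to the evaluation.
golden = build_golden_rows(annotated, columns=["Input", "Expected Output"])
print(golden)
```

The third entry is dropped because it has neither a correction nor a high rating, which corresponds to leaving its row checkbox unselected.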