Fully flexible data curation flows

While curating and refining test datasets from logs and test runs, you can now reference and modify any data point from a trace or test run entry, without being limited to predefined fields like input or output. Use Maxim’s DSL in the selection dropdown to:

Map any value from traces or sessions, including tags, tool calls, retrieval steps, generations, or other nodes, directly to columns in your test dataset.
Curate datasets using test run metadata such as evaluation scores, evaluator reasoning, human rater comments, and corrected outputs.

For advanced use cases, you can also write custom code snippets to extract specific information from log and test run parameters, so that only the data you consider high-quality and relevant is added to your datasets.
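As a rough illustration of what such an extraction snippet might look like, here is a minimal Python sketch. It is not Maxim's actual snippet interface: the entry shape (`input`, `output`, `tags`, `evaluations`, `human_feedback`), the `extract_row` function name, and the `MIN_SCORE` threshold are all assumptions made up for this example. It keeps an entry only when every evaluator score clears the threshold, and prefers a human-corrected output when one exists.

```python
from typing import Any, Optional

# Hypothetical quality threshold; not a Maxim default.
MIN_SCORE = 0.8


def extract_row(entry: dict[str, Any]) -> Optional[dict[str, Any]]:
    """Map one log or test run entry to a dataset row, or None to skip it.

    The entry layout used here is an assumed shape for illustration only.
    """
    # Collect evaluator scores; entries with no scores are skipped.
    scores = [e["score"] for e in entry.get("evaluations", []) if "score" in e]
    if not scores or min(scores) < MIN_SCORE:
        return None

    # Prefer the human-corrected output over the raw model output.
    feedback = entry.get("human_feedback") or {}
    expected = feedback.get("corrected_output") or entry.get("output")

    return {
        "input": entry.get("input"),
        "expected_output": expected,
        "customer_tier": entry.get("tags", {}).get("customer_tier"),
    }


# Sample entries in the assumed shape.
good_entry = {
    "input": "How do I reset my password?",
    "output": "Click 'Forgot password'.",
    "tags": {"customer_tier": "pro"},
    "evaluations": [{"name": "clarity", "score": 0.9}],
    "human_feedback": {"corrected_output": "Go to Settings, then Reset password."},
}
low_score_entry = {
    "input": "Hi",
    "output": "Hello",
    "evaluations": [{"name": "clarity", "score": 0.4}],
}

row = extract_row(good_entry)
skipped = extract_row(low_score_entry)
```

Returning `None` for entries that fail the filter keeps the selection logic and the column mapping in one place, so a low-scoring entry never reaches the dataset.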

Refined human evaluation flow on test runs and logs

Synthetic Data Generation
