Explore key concepts in AI evaluation, including evaluators, datasets, and custom tools for assessing model performance and output quality.
| Evaluator type | Description |
| --- | --- |
| AI | Uses AI models to assess outputs |
| Programmatic | Applies predefined rules or algorithms |
| Statistical | Uses statistical methods for evaluation |
| Human | Involves human judgment and feedback |
| API-based | Leverages external APIs for assessment |
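As a concrete illustration of the programmatic type, the sketch below shows a minimal rule-based check that an output is valid JSON. The `validate_json` function name and the `{"score", "reason"}` return shape are assumptions for illustration, not the platform's built-in evaluator API.

```python
import json


def validate_json(output: str) -> dict:
    """Minimal programmatic evaluator: passes only if the output parses as JSON.

    Illustrative sketch; not the platform's actual validJson implementation.
    """
    try:
        json.loads(output)
        return {"score": 1, "reason": "Output is valid JSON"}
    except json.JSONDecodeError as err:
        return {"score": 0, "reason": f"Invalid JSON: {err}"}


# Example usage
print(validate_json('{"status": "ok"}'))  # score 1
print(validate_json("not json"))          # score 0
```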
Evaluators such as `validJson`, `validURL`, etc. help validate your responses. Evaluators use the `{{input}}`, `{{output}}`, and `{{expectedOutput}}` variables, which pull the relevant data from the dataset columns or from the response of the run to execute the evaluator. The following variables are available:
- `{{input}}`: Input from the dataset
- `{{expectedOutput}}`: Expected output from the dataset
- `{{expectedToolCalls}}`: Expected tool calls from the dataset
- `{{scenario}}`: Scenario from the dataset
- `{{expectedSteps}}`: Expected steps from the dataset
- `{{output}}`: Generated output of the endpoint/prompt/no-code agent
- `{{context}}`: Context to evaluate
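To show how these variables are resolved, here is a minimal sketch of substituting dataset columns and the run's generated output into an AI evaluator prompt template. The template text, the dictionary keys, and the `render_evaluator_prompt` helper are hypothetical; the platform performs this substitution for you at run time.

```python
# Hypothetical evaluator prompt template that uses the variables listed above.
EVALUATOR_TEMPLATE = (
    "Given the input:\n{{input}}\n\n"
    "Expected output:\n{{expectedOutput}}\n\n"
    "Actual output:\n{{output}}\n\n"
    "Rate how well the actual output matches the expected output from 1 to 5."
)


def render_evaluator_prompt(template: str, values: dict) -> str:
    """Replace each {{variable}} placeholder with its value (illustrative only)."""
    for name, value in values.items():
        template = template.replace("{{" + name + "}}", str(value))
    return template


# Example: values pulled from a dataset row and the run's generated output.
values = {
    "input": "Summarize the refund policy.",
    "expectedOutput": "Refunds are issued within 30 days of purchase.",
    "output": "You can get a refund up to 30 days after buying.",
}

print(render_evaluator_prompt(EVALUATOR_TEMPLATE, values))
```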