A comprehensive guide to supported third-party evaluation metrics for assessing AI model outputs
Required: actual output
Score range: 0 (safe) to 1 (flagged)
Required: input, output, expected output
Required: input, output, retrieved context
Required: input, output, expected output
Required: input, expected output, retrieved context
Required: input, output, expected output, retrieved context
Required: input, output, expected output, retrieved context
Required: input, retrieved context
Required: input, output, retrieved context
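The requirement lists above lend themselves to a pre-flight check before a record is sent to an evaluator. The following is a minimal sketch under stated assumptions: the evaluator names (`evaluator_a`, `evaluator_b`) and the snake_case field keys are hypothetical placeholders, not names used by any provider. Only the field combinations are taken from the lists above.

```python
# Hypothetical mapping: the evaluator names are placeholders; the field sets
# mirror two of the requirement lists above.
REQUIRED_FIELDS = {
    "evaluator_a": {"input", "output", "expected_output"},
    "evaluator_b": {"input", "output", "retrieved_context"},
}


def missing_fields(evaluator: str, record: dict) -> set:
    """Return the required fields that are absent or empty in `record`."""
    required = REQUIRED_FIELDS[evaluator]
    return {name for name in required if not record.get(name)}


record = {"input": "What is the capital of France?", "output": "Paris"}
print(sorted(missing_fields("evaluator_a", record)))
```

Running a check like this before submission can surface an incomplete record locally, before it reaches the evaluator.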
prediction: Generated answer
question: Original question
context: Relevant context

prediction: Generated answer
question: Original question
context: Relevant context

prediction: Generated answer
question: Original question
context: Relevant context

prediction: Generated answer
question: Original question
context: Relevant context

prediction: Summary output
context: Source content

prediction: Summary output
context: Source content

prediction: Candidate summary
baselinePrediction: Baseline summary
context: Source content
instruction: Prompt

Note: Some evaluators may have usage limits or require specific subscription levels with the respective providers.
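As an illustration of the last parameter list above, the sketch below assembles its four fields (prediction, baselinePrediction, context, instruction) into a single dictionary. The function name `build_comparison_payload` is hypothetical, and the sketch does not call any provider's API. It only shows how the listed field names might fit together.

```python
def build_comparison_payload(prediction, baseline_prediction, context, instruction):
    """Assemble the four listed fields into one dict; reject empty values.

    Hypothetical helper for illustration only, not part of any provider SDK.
    """
    payload = {
        "prediction": prediction,                    # candidate summary
        "baselinePrediction": baseline_prediction,   # baseline summary
        "context": context,                          # source content
        "instruction": instruction,                  # prompt
    }
    empty = [key for key, value in payload.items() if not value]
    if empty:
        raise ValueError(f"Missing values for: {', '.join(empty)}")
    return payload
```

Failing fast on empty values keeps an incomplete comparison from being submitted, which matters if evaluator calls are subject to the usage limits noted above.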