Input
- Required Inputs:
session
: Complete interaction log between user and agent showing all steps taken
Output
Result
: Binary score (0 or 1)
Reasoning
: Detailed explanation
Interpretation
- 1: The task accomplished its intended goal
- 0: The task failed or couldn’t be completed
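
As a rough illustration of how a caller might consume this output, the sketch below models the two output fields and maps the binary score to a label. The names `EvaluationOutput` and `interpret` are hypothetical and are not part of any documented API; the only thing taken from the text above is the shape of the output (a 0/1 `Result` plus a free-text `Reasoning`).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvaluationOutput:
    """Hypothetical container mirroring the Result and Reasoning fields."""

    result: int     # binary score: 0 or 1
    reasoning: str  # explanation accompanying the score

    def __post_init__(self) -> None:
        # Reject anything other than the two documented score values.
        if self.result not in (0, 1):
            raise ValueError(f"result must be 0 or 1, got {self.result!r}")


def interpret(output: EvaluationOutput) -> str:
    """Map the binary score to the interpretation listed above."""
    return "success" if output.result == 1 else "failure"


# Illustrative values only; a real evaluator would produce these from a session.
example = EvaluationOutput(result=1, reasoning="The agent completed the requested task.")
print(interpret(example))
```

Validating the score at construction time keeps any downstream aggregation (for example, averaging scores across sessions) from silently including out-of-range values.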