Input
- Required Inputs:
  - session: Complete interaction log between user and agent showing all steps taken
  - expected_steps: List of required steps (order flexible)
Output
- Result: Binary score (0 or 1)
- Reasoning: Detailed explanation of step completion
Interpretation
- 1: All expected steps were completed, in any order
- 0: At least one expected step is missing, or a dependency between steps was violated
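The scoring rule above can be sketched in a few lines of Python. This is an illustrative stand-in, not the evaluator's implementation: how the evaluator actually matches steps against the session is not described here. The sketch assumes the session is a list of log strings and that a step counts as completed when its text appears in some log entry. The names `score_step_completion`, `StepCompletionResult`, and the `depends_on` mapping are hypothetical, introduced only to show one way "dependency violations" could be checked.

```python
from dataclasses import dataclass


@dataclass
class StepCompletionResult:
    result: int     # binary score: 1 or 0
    reasoning: str  # explanation of which steps were found, missing, or out of order


def score_step_completion(session, expected_steps, depends_on=None):
    """Score a session against expected steps (order flexible).

    session: list of log entries (strings), an assumed representation.
    expected_steps: list of step descriptions to look for.
    depends_on: hypothetical mapping of step -> steps that must occur earlier.
    """
    depends_on = depends_on or {}

    # Record the index of the first log entry that mentions each step
    # (case-insensitive substring match, a simplifying assumption).
    first_seen = {}
    for step in expected_steps:
        for index, entry in enumerate(session):
            if step.lower() in entry.lower():
                first_seen[step] = index
                break

    missing = [step for step in expected_steps if step not in first_seen]

    # A dependency is violated when a step first appears before its prerequisite.
    violations = [
        (step, prereq)
        for step, prereqs in depends_on.items()
        for prereq in prereqs
        if step in first_seen
        and prereq in first_seen
        and first_seen[prereq] > first_seen[step]
    ]

    if missing or violations:
        parts = []
        if missing:
            parts.append("missing steps: " + ", ".join(missing))
        if violations:
            described = ", ".join(f"{s} before {p}" for s, p in violations)
            parts.append("dependency violations: " + described)
        return StepCompletionResult(0, "; ".join(parts))

    return StepCompletionResult(1, "all expected steps completed")


session = [
    "agent: verify identity of caller",
    "agent: reset password",
    "agent: send confirmation email",
]
expected_steps = ["reset password", "verify identity", "send confirmation"]

outcome = score_step_completion(session, expected_steps)
print(outcome.result, outcome.reasoning)
```

Because order is flexible, listing `expected_steps` in a different order than the session does not change the score; only a missing step or an entry in the hypothetical `depends_on` mapping can bring it to 0.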