Evaluates whether an agent has completed all required steps in exactly the specified order, ensuring strict sequential compliance.
session
: Complete interaction log between user and agent showing all steps taken
expected_steps
: Ordered list of required steps to be verified in sequence
Result
: Binary score (0 or 1)
Reasoning
: Detailed explanation of step completion
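
The strict-order check described above can be illustrated with a minimal sketch. It is not the evaluator's actual implementation. It assumes the session has already been reduced to a flat list of step labels, that unrelated actions between required steps are ignored, and that a repeated required step counts as a violation. The names `StepOrderResult` and `check_step_order` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class StepOrderResult:
    score: int       # 1 if every expected step occurred in order, else 0
    reasoning: str   # human-readable explanation of the verdict


def check_step_order(observed_steps, expected_steps):
    """Hypothetical strict-order check over pre-extracted step labels."""
    expected = list(expected_steps)
    required = set(expected)

    # Keep only observed steps that are part of the requirement.
    # Assumption: unrelated actions in between do not break the sequence.
    relevant = [step for step in observed_steps if step in required]

    if relevant == expected:
        return StepOrderResult(
            1, "All expected steps were completed in the specified order."
        )

    missing = [step for step in expected if step not in relevant]
    if missing:
        return StepOrderResult(0, f"Missing steps: {missing}")

    return StepOrderResult(
        0,
        f"Steps out of order or repeated: observed {relevant}, "
        f"expected {expected}",
    )


if __name__ == "__main__":
    expected = ["login", "search", "checkout"]
    print(check_step_order(["login", "browse", "search", "checkout"], expected))
    print(check_step_order(["search", "login", "checkout"], expected))
```

In practice, extracting step labels from a free-form session log is the hard part and would typically need a model or parser of its own. This sketch covers only the final comparison that yields the binary score and the reasoning text.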