Evaluates a generated SQL query’s correctness by comparing its structure, semantics, and execution plan against a reference query. It goes beyond simple string matching.
output (str): The generated SQL query.
expectedOutput (str): The reference (gold standard) SQL query.
Result (float): A score between 0 and 1.
1: The generated query is semantically equivalent to the reference query.
0: The generated query is completely different or invalid.
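
The input and output contract above can be illustrated with a small stand-in scorer. This is only a sketch under stated assumptions, not the evaluator's actual algorithm: the function name `toy_sql_score`, the extra `conn` parameter, the use of an in-memory SQLite database, and the token-overlap fallback are all hypothetical choices made for illustration. It shows why a score based on behavior differs from plain string matching: two differently written queries that return the same rows score 1, and an invalid query scores 0.

```python
import re
import sqlite3
from difflib import SequenceMatcher


def _tokens(sql: str) -> list[str]:
    # Lowercase and split into words and punctuation so formatting differences vanish.
    return re.findall(r"[a-z_0-9]+|[^\sa-z_0-9]", sql.lower())


def toy_sql_score(output: str, expectedOutput: str, conn: sqlite3.Connection) -> float:
    """Hypothetical stand-in scorer; NOT the evaluator's real implementation."""
    try:
        got = conn.execute(output).fetchall()
    except sqlite3.Error:
        return 0.0  # the generated query is invalid
    want = conn.execute(expectedOutput).fetchall()
    if sorted(got, key=repr) == sorted(want, key=repr):
        return 1.0  # same rows on this database, treated here as equivalent
    # Assumption: fall back to token overlap, capped below 1 for non-equivalent queries.
    ratio = SequenceMatcher(None, _tokens(output), _tokens(expectedOutput)).ratio()
    return min(ratio, 0.99)


# Tiny example database (made-up data, for illustration only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, age INTEGER)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, 25), (2, 40)])

reference = "SELECT id FROM users WHERE age > 30"
score_same = toy_sql_score("select id from users where 30 < age", reference, conn)
score_partial = toy_sql_score("SELECT id FROM users", reference, conn)
score_invalid = toy_sql_score("SELEC id FROM users", reference, conn)
```

Comparing result rows on a single database is a weak proxy for semantic equivalence, since two queries can agree on one dataset and differ on another. The evaluator described above also compares structure and execution plans, which this sketch does not attempt.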