Input
output (str): The generated structured text (e.g., code, XML).
expectedOutput (str): The reference structured text.
Output
Result (float): A normalized distance score between 0 and 1.
Interpretation
0: The tree structures are identical.
1: The tree structures are completely different.
The score captures syntactic and structural similarity, which is often more important than lexical similarity for code or structured data.
Formula
This is a distance metric for structured text. Lower scores indicate greater structural similarity.
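The exact formula is not given here. One plausible form, assuming unit-cost edit operations and normalization by the combined node count of both trees (both are assumptions, not confirmed by this page), is:

```latex
\[
\mathrm{score}(T_1, T_2) = \frac{\operatorname{TED}(T_1, T_2)}{|T_1| + |T_2|}
\]
```

Here $T_1$ and $T_2$ are the trees parsed from output and expectedOutput, $\operatorname{TED}$ is the minimum number of node edits, and $|T|$ is the number of nodes in $T$. Deleting every node of $T_1$ and inserting every node of $T_2$ is always a valid edit sequence, so $\operatorname{TED}(T_1, T_2) \le |T_1| + |T_2|$ and the score stays between 0 and 1.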
How It Works
Both texts are parsed into trees (e.g., ASTs for code). The metric computes the minimum number of node edit operations (insertions, deletions, and relabelings) needed to transform one tree into the other, optionally normalized by tree size.
Use Cases
- Evaluating code generation models
- Assessing structural correctness of generated XML, JSON, or other structured data
- Plagiarism detection in code
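
The procedure described under How It Works can be illustrated with a minimal sketch. It is not the metric's actual implementation: it assumes XML input, compares element tags only (text and attributes are ignored), uses unit costs for every operation, and normalizes by the combined node count of both trees. The names `to_tree`, `size`, `forest_distance`, and `tree_edit_score` are hypothetical. The memoized recursion is a simple textbook formulation suited to small trees, not an optimized algorithm.

```python
from functools import lru_cache
import xml.etree.ElementTree as ET


def to_tree(elem):
    """Convert an XML element into a hashable (label, children) tuple."""
    return (elem.tag, tuple(to_tree(child) for child in elem))


def size(forest):
    """Count the nodes in a forest (a tuple of trees)."""
    return sum(1 + size(children) for _, children in forest)


@lru_cache(maxsize=None)
def forest_distance(f1, f2):
    """Unit-cost edit distance between two ordered forests."""
    if not f1:
        return size(f2)  # insert every remaining node of f2
    if not f2:
        return size(f1)  # delete every remaining node of f1
    (label1, kids1), (label2, kids2) = f1[-1], f2[-1]
    return min(
        # Delete the root of the rightmost tree in f1.
        forest_distance(f1[:-1] + kids1, f2) + 1,
        # Insert the root of the rightmost tree in f2.
        forest_distance(f1, f2[:-1] + kids2) + 1,
        # Match the two roots, paying 1 for a relabel if the tags differ.
        forest_distance(kids1, kids2)
        + forest_distance(f1[:-1], f2[:-1])
        + (label1 != label2),
    )


def tree_edit_score(output, expected_output):
    """Normalized distance in [0, 1]; 0 means identical tag structure."""
    t1 = to_tree(ET.fromstring(output))
    t2 = to_tree(ET.fromstring(expected_output))
    # Assumed normalization: divide by the combined node count.
    total = size((t1,)) + size((t2,))
    return forest_distance((t1,), (t2,)) / total


if __name__ == "__main__":
    # One node (<c/>) must be deleted; the trees have 3 + 2 = 5 nodes.
    print(tree_edit_score("<a><b/><c/></a>", "<a><b/></a>"))  # 0.2
```

For source code rather than XML, the same distance function could be fed trees built from a language parser; only `to_tree` would need to change.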