Input
output
(str): The generated multi-sentence text
expectedOutput
(str): The reference multi-sentence text
Output
Result
(float): A score between 0 and 1.
Interpretation
- Higher scores (closer to 1): Stronger document-level structural similarity
- Lower scores (closer to 0): Weak structural similarity across sentences
How It Works
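ROUGE-Lsum splits both texts into sentences. For each sentence in the reference, it computes the longest common subsequence (LCS) against the sentences in the generated text and merges those matches. The matches are summed over all reference sentences and normalized by the text lengths, so the score captures matching subsequences across the entire document.
Example (Conceptual)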
- Reference has 3 sentences; candidate has 3 sentences
- For each reference sentence, compute its LCS against the candidate sentences, sum the matched tokens, and normalize by the text lengths
- Final score reflects overall structural similarity across sentences
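The steps above can be sketched in plain Python. This is a minimal illustration, not the implementation behind this metric. It rests on several assumptions: sentences are separated by newlines, tokens are lowercased whitespace-separated words, and precision and recall are combined into an F1 score. The names `split_sentences`, `lcs_indices`, and `rouge_lsum` are hypothetical. Matches are capped by how often each token occurs in the generated text, which keeps the result between 0 and 1.

```python
from collections import Counter


def split_sentences(text):
    """Assumed preprocessing: one sentence per line, lowercase whitespace tokens."""
    return [line.lower().split() for line in text.splitlines() if line.strip()]


def lcs_indices(ref, cand):
    """Return indices of `ref` tokens that lie on one LCS of `ref` and `cand`."""
    rows, cols = len(ref), len(cand)
    table = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(1, rows + 1):
        for j in range(1, cols + 1):
            if ref[i - 1] == cand[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    # Backtrack through the table to recover the matched reference positions.
    matched, i, j = [], rows, cols
    while i > 0 and j > 0:
        if ref[i - 1] == cand[j - 1]:
            matched.append(i - 1)
            i, j = i - 1, j - 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return matched


def rouge_lsum(output, expected_output):
    """Sketch of a summary-level LCS F1 score between generated and reference text."""
    cand_sents = split_sentences(output)
    ref_sents = split_sentences(expected_output)
    cand_len = sum(len(s) for s in cand_sents)
    ref_len = sum(len(s) for s in ref_sents)
    if cand_len == 0 or ref_len == 0:
        return 0.0

    # Each generated token may be credited at most as often as it occurs.
    budget = Counter(tok for sent in cand_sents for tok in sent)
    hits = 0
    for ref in ref_sents:
        # Union of reference positions matched by any candidate sentence.
        union = set()
        for cand in cand_sents:
            union.update(lcs_indices(ref, cand))
        for idx in sorted(union):
            if budget[ref[idx]] > 0:
                budget[ref[idx]] -= 1
                hits += 1

    if hits == 0:
        return 0.0
    precision = hits / cand_len
    recall = hits / ref_len
    return 2 * precision * recall / (precision + recall)
```

Under these assumptions, calling `rouge_lsum(output, expectedOutput)` with two identical texts returns 1.0, and two texts that share no tokens return 0.0.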
This is a Similarity Metric
Use Cases
- Evaluating multi-sentence abstractive summaries
- Document-level machine translation assessment