RAG Evaluation: A Complete Guide for 2025
TL;DR
* RAG systems combine retrieval and generation; evaluation must assess both components.
* Retrieval quality hinges on recall, precision, relevance, and timeliness of sources.
* Generation quality depends on grounding, faithfulness, clarity, and low hallucination rates.
* Evaluation verdicts become more reliable when you combine methods: LLM-as-a-judge, programmatic checks, and human review.
* Use Maxim’s offline