Input

  • output (str): The generated text (set of items)
  • expectedOutput (str): The reference text (set of items)

Output

  • Result (float): A score between 0 and 1.

Interpretation

  • Higher scores (closer to 1): Strong balance of precision and recall
  • Lower scores (closer to 0): Either precision or recall (or both) are weak

Formula

\mathrm{F1} = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}

Where:

\mathrm{Precision} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FP}}, \quad \mathrm{Recall} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}}
This is a Similarity Metric
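A minimal sketch of the formula above, assuming the two strings are split on whitespace into sets of items (the exact tokenization used by a given library may differ):

```python
def f1_score(output: str, expected_output: str) -> float:
    """Set-based F1 between a generated string and a reference string."""
    # Assumption: items are whitespace-separated tokens, compared as sets.
    predicted = set(output.split())
    reference = set(expected_output.split())
    if not predicted or not reference:
        return 0.0

    tp = len(predicted & reference)  # true positives: items in both sets
    if tp == 0:
        return 0.0

    precision = tp / len(predicted)  # TP / (TP + FP)
    recall = tp / len(reference)     # TP / (TP + FN)
    return 2 * precision * recall / (precision + recall)
```

For example, `f1_score("the cat sat", "the cat ran")` gives precision 2/3 and recall 2/3, so F1 is 2/3; a perfect match scores 1.0 and fully disjoint sets score 0.0.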

Use Cases

  • Evaluating classifiers with class imbalance
  • Information extraction and NER
  • Scenarios where both precision and recall matter