BLEU score

BLEU (Bilingual Evaluation Understudy) is a precision-focused metric used predominantly in machine translation.

It quantifies the quality of machine-generated text by comparing it with one or more reference translations. The core of the BLEU calculation is n-gram precision: the fraction of n-grams (contiguous sequences of n words) in the machine-translated text that also appear in the references, with each n-gram's count clipped to the maximum number of times it occurs in any single reference.
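As a rough illustration of the n-gram precision described above, here is a minimal Python sketch. The function names `ngrams` and `modified_precision` are hypothetical, and whitespace tokenization is an assumption made for brevity; real toolkits tokenize more carefully.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams in a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(candidate, references, n):
    """Clipped n-gram precision of a candidate against several references.

    Each candidate n-gram count is clipped to the largest count of that
    n-gram found in any single reference.
    """
    cand_counts = Counter(ngrams(candidate, n))
    if not cand_counts:
        return 0.0  # candidate is shorter than n tokens

    # Largest count of each n-gram across the individual references.
    max_ref_counts = Counter()
    for ref in references:
        for gram, count in Counter(ngrams(ref, n)).items():
            max_ref_counts[gram] = max(max_ref_counts[gram], count)

    clipped = sum(min(count, max_ref_counts[gram])
                  for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())


# A degenerate candidate that repeats one common word.
candidate = "the the the the the the the".split()
references = [
    "the cat is on the mat".split(),
    "there is a cat on the mat".split(),
]
print(modified_precision(candidate, references, 1))  # 2 clipped matches out of 7
```

Without clipping, the repeated "the" would score a unigram precision of 7/7; clipping to the two occurrences in the first reference brings it down to 2/7.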

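Precision alone would reward very short outputs, since a candidate made of a few safe words can match the references almost perfectly. To counter this, BLEU multiplies the score by a brevity penalty whenever the candidate is shorter than the reference. Despite its widespread use, BLEU has no recall component, so it does not directly measure how much of the reference content the candidate covers.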
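The brevity penalty and the way it scales the final score can be sketched as follows. This is a minimal sketch, not a reference implementation: `brevity_penalty` and `combine_bleu` are hypothetical names, uniform weights over the n-gram orders are assumed, and breaking ties between equally close reference lengths toward the shorter one is an assumed convention.

```python
import math


def brevity_penalty(candidate_len, reference_lens):
    """Return 1.0 for candidates longer than the closest reference,
    otherwise exp(1 - r / c), where c is the candidate length and
    r is the closest reference length."""
    if candidate_len == 0:
        return 0.0
    # Assumption: ties between equally close references go to the shorter one.
    closest = min(reference_lens, key=lambda r: (abs(r - candidate_len), r))
    if candidate_len > closest:
        return 1.0
    return math.exp(1 - closest / candidate_len)


def combine_bleu(precisions, bp):
    """Brevity penalty times the geometric mean of the n-gram precisions
    (uniform weights assumed)."""
    if min(precisions) == 0:
        return 0.0  # log(0) is undefined; unsmoothed BLEU is zero here
    log_mean = sum(math.log(p) for p in precisions) / len(precisions)
    return bp * math.exp(log_mean)


# A 5-token candidate scored against a 10-token reference is penalized.
print(brevity_penalty(5, [10]))   # exp(1 - 10/5) = exp(-1)
print(brevity_penalty(12, [10]))  # no penalty for longer candidates
```

The penalty depends only on lengths, so it cannot tell which reference content is missing; it only discourages outputs that are too short.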

More details here:

https://aman.ai/primers/ai/evaluation-metrics/#bleu
https://aclanthology.org/P02-1040.pdf
https://en.wikipedia.org/wiki/BLEU