CIDEr score

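CIDEr (Consensus-based Image Description Evaluation) is a metric for evaluating image captioning models. It scores a generated caption against a set of human-written reference captions for the same image.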

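Like the BLEU score, it measures n-gram overlap between a generated caption and the reference captions. Unlike BLEU, it weights each n-gram by TF-IDF and compares the captions with cosine similarity, so n-grams that are common across the whole dataset count for little and n-grams that are distinctive for this image count for more. This reflects the concept of consensus: a good caption should agree with what most human annotators chose to say about the image, in both wording and content.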
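The TF-IDF and cosine-similarity idea can be sketched in a few lines of Python. This is a simplified illustration, not the reference implementation: it assumes whitespace tokenization, skips the stemming step described in the paper, and leaves out the length penalty and count clipping of the CIDEr-D variant. The function names (`ngram_counts`, `cider_sketch`) and the toy captions are made up for this example.

```python
import math
from collections import Counter


def ngram_counts(sentence, n):
    """Count the n-grams of a lowercased, whitespace-tokenized sentence."""
    tokens = sentence.lower().split()
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(g, 0.0) for g, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a > 0 and norm_b > 0 else 0.0


def cider_sketch(candidate, references, corpus_references, max_n=4):
    """Simplified CIDEr-style score for one candidate caption.

    candidate: generated caption (str)
    references: reference captions for the same image (list of str)
    corpus_references: reference sets for every image in the dataset
        (list of lists of str), used only to compute IDF weights
    """
    num_images = len(corpus_references)
    total = 0.0
    for n in range(1, max_n + 1):
        # Document frequency: in how many images' references each n-gram occurs.
        doc_freq = Counter()
        for refs in corpus_references:
            seen = set()
            for ref in refs:
                seen.update(ngram_counts(ref, n))
            doc_freq.update(seen)

        def tfidf(sentence):
            counts = ngram_counts(sentence, n)
            length = sum(counts.values())
            # n-grams never seen in the corpus are treated as having df = 1.
            return {
                g: (c / length) * math.log(num_images / max(1, doc_freq[g]))
                for g, c in counts.items()
            }

        cand_vec = tfidf(candidate)
        sims = [cosine(cand_vec, tfidf(ref)) for ref in references]
        total += sum(sims) / len(sims)  # average over references
    return total / max_n  # uniform average over n-gram orders


# Toy dataset with two images and one reference caption each.
corpus = [["a dog runs on the grass"], ["two people ride bikes downtown"]]
same = cider_sketch("a dog runs on the grass", corpus[0], corpus)
partial = cider_sketch("a dog runs", corpus[0], corpus)
unrelated = cider_sketch("two people ride bikes downtown", corpus[0], corpus)
```

In this toy setup a candidate identical to the reference gets the highest score, a partial match lands in between, and a caption sharing no n-grams with the references scores zero. Real evaluations would normally use an established implementation over the full dataset, since the IDF weights depend on all reference captions.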

More details here:

https://arxiv.org/abs/1411.5726
https://www.youtube.com/watch?v=3nZF99Z4CIc
https://oecd.ai/en/catalogue/metrics/consensus-based-image-description-evaluation-(cider)