Glossary

ChrF3 (Character F-score metric)

ChrF3 (character n-gram F-score with β = 3) is an automatic evaluation metric used to measure the quality of machine translation (MT) output. Unlike word-based metrics such as BLEU, ChrF3 compares text at the character level, which makes it less dependent on language-specific tokenization and more sensitive to partial word matches.

Description

The ChrF3 metric calculates an F-score, a weighted harmonic mean of precision and recall, over the character n-grams shared between the machine-translated output and one or more human reference translations. The "3" in the name is the weight β = 3, which makes recall count more heavily than precision. Because it works at the character level, ChrF3 is especially useful for morphologically rich languages or languages with flexible word order, where exact word matches may not fully capture translation accuracy.
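The calculation described above can be illustrated with a minimal sketch. It is a simplified approximation, not a reference implementation: the function names (char_ngrams, chrf), the single-reference signature, the removal of spaces and the maximum n-gram length of 6 are assumptions made for illustration. Established tools such as sacreBLEU handle whitespace, multiple references and edge cases differently, so their scores will not match this sketch exactly.

```python
from collections import Counter


def char_ngrams(text, n):
    """Count character n-grams of length n (spaces removed: an assumption)."""
    chars = text.replace(" ", "")
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def chrf(hypothesis, reference, max_n=6, beta=3.0):
    """Simplified character n-gram F-score; beta=3 weights recall over precision."""
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp = char_ngrams(hypothesis, n)
        ref = char_ngrams(reference, n)
        if not hyp or not ref:
            # Skip n-gram orders that one of the texts is too short to produce.
            continue
        overlap = sum((hyp & ref).values())  # clipped count of matching n-grams
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)  # average precision over n-gram orders
    r = sum(recalls) / len(recalls)        # average recall over n-gram orders
    if p + r == 0:
        return 0.0
    return (1 + beta ** 2) * p * r / (beta ** 2 * p + r)


if __name__ == "__main__":
    print(chrf("the cat sat on the mat", "the cat sat on the mat"))
    print(chrf("the cats sat on a mat", "the cat sat on the mat"))
```

In this sketch an output identical to the reference scores 1.0 and an output sharing no characters with it scores 0.0. Because β = 3 favours recall, an output that covers the whole reference but adds extra text is penalised less than one that omits part of the reference.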

This approach helps evaluators identify subtle differences in spelling, morphology or structure that word-based metrics might miss. As a result, ChrF3 is often used alongside BLEU, TER (Translation Edit Rate) and COMET to provide a more balanced and comprehensive view of machine translation quality. Within RWS Language Weaver, automated evaluation metrics like ChrF3 are part of a Human + Technology framework that combines statistical precision with expert human insight to continuously improve translation quality.