ChrF3 (Character F-score metric)
Description
The ChrF3 metric calculates an F-score over character n-grams shared between the machine-translated output and one or more human reference translations. The F-score combines precision and recall, and the "3" refers to the weighting parameter β = 3, which makes recall count more heavily than precision rather than weighting the two equally. Because it works at the character level, ChrF3 is especially useful for morphologically rich languages or languages with flexible word order, where exact word matches may not fully capture translation accuracy.
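The calculation described above can be sketched in a few lines of Python. This is a simplified illustration, not the reference implementation: the function names `char_ngrams` and `chrf` are hypothetical, and the choices to strip spaces, use n-gram orders 1 to 6, skip orders longer than either string, and score against a single reference are assumptions made for brevity. Production tools may handle whitespace, multiple references and smoothing differently.

```python
from collections import Counter


def char_ngrams(text: str, n: int) -> Counter:
    """Count character n-grams of length n (spaces removed: an assumption)."""
    chars = text.replace(" ", "")
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def chrf(hypothesis: str, reference: str, max_n: int = 6, beta: float = 3.0) -> float:
    """Simplified sentence-level chrF against one reference, in the range 0..1."""
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp = char_ngrams(hypothesis, n)
        ref = char_ngrams(reference, n)
        if not hyp or not ref:
            # One of the strings is shorter than n characters; skip this order.
            continue
        # Clipped overlap: each n-gram counts at most as often as it occurs in both.
        overlap = sum((hyp & ref).values())
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    # Average precision and recall over the n-gram orders that were scored.
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    # F-beta: with beta = 3, recall is weighted more heavily than precision.
    return (1 + beta ** 2) * p * r / (beta ** 2 * p + r)
```

With this sketch, `chrf("the cat", "the cat")` returns 1.0 and two strings with no characters in common score 0.0. A hypothesis that omits part of the reference, such as `chrf("the cat", "the cat sat")`, has perfect precision but incomplete recall, so it scores lower with `beta=3.0` than with `beta=1.0`.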
This approach helps evaluators identify subtle differences in spelling, morphology or structure that word-based metrics might miss. As a result, ChrF3 is often used alongside BLEU, TER (Translation Edit Rate) and COMET to provide a more balanced and comprehensive view of machine translation quality. Within RWS Language Weaver, automated evaluation metrics like ChrF3 are part of a Human + Technology framework that combines statistical precision with expert human insight to continuously improve translation quality.