- MT-Telescope is an open source tool that provides unique granularity and quantitative insights into the quality performance of MT systems
- Typically, MT quality measurement metrics such as COMET, BLEU, or METEOR provide an overall quality score for a data set
Unbabel, an AI-powered Language Operations platform that helps businesses deliver multilingual support at scale, announced the launch of MT-Telescope – a new tool that enables developers and users of Machine Translation (MT) systems to deeply analyze and understand MT quality performance. Building on Unbabel’s automated quality measurement framework COMET, MT-Telescope is an open source tool that provides unique granularity and quantitative insights into the quality performance of MT systems.
Alon Lavie, VP of Language Technologies at Unbabel said, “At Unbabel, we constantly work on developing, training, maintaining, and deploying MT systems at a rapid pace and to high quality standards. This challenging need drives our research and development objectives, especially in the domain of quality analysis and evaluation. MT-Telescope helps our LangOps specialists and development teams make smarter decisions for customers about which MT system better suits their needs, and enables the MT research community to easily use best practice analysis methods and tools to rigorously benchmark their advances.”
Typically, MT quality measurement metrics such as COMET, BLEU, or METEOR provide an overall quality score for a data set. MT-Telescope takes this quality scoring a step further, exposing the underlying factors behind performance and zooming in to a fine-grained analysis of translation accuracy down to individual words, terms, and sentences.
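The contrast between a single corpus-level score and segment-level inspection can be sketched with a few lines of Python. The scores below are invented for illustration and this is not MT-Telescope's actual API; it only shows why one aggregate number can hide poor individual translations:

```python
from statistics import mean

# Hypothetical segment-level quality scores (COMET-style, higher is better)
# for one MT system on a small evaluation set.
segment_scores = [0.91, 0.88, 0.42, 0.95, 0.37, 0.89]

# A single corpus-level number hides how unevenly quality is distributed.
corpus_score = mean(segment_scores)
print(f"corpus score: {corpus_score:.3f}")

# Segment-level inspection exposes the weak translations dragging quality down.
weak = [(i, s) for i, s in enumerate(segment_scores) if s < 0.5]
print(f"weak segments (index, score): {weak}")
```

A respectable-looking average here coexists with two segments scoring below 0.5, which is exactly the kind of detail a fine-grained analysis surfaces.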
“Our research shows that one of the biggest needs in applying machine translation is insight into its usability, an area where current methods fall short,” comments Dr. Arle Lommel, senior analyst at CSA Research. “Guidance-focused evaluation that focuses on how well MT suits particular use cases will help extend the technology to new areas and increase acceptance of machine translation-based workflows.”
MT-Telescope has an intuitive visual browser interface that lets non-technical users compare two MT systems and assess which is the better fit for their objectives. MT-Telescope’s visualizations provide comparison across three key areas:
- A comparison of quality scores for subsets of the data, such as named entities (e.g. product or brand names), terminology (e.g. distinct phrases), or segment length (i.e. the length of the translated sentence)
- A side-by-side error analysis of each MT system, allowing for substantive contrastive comparisons
- A visualization of the distribution of quality scores between the two systems
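The first kind of comparison – quality scores bucketed by a data subset such as segment length – can be sketched in plain Python. The system names, scores, and bucket edges below are invented for illustration and do not reflect MT-Telescope's actual interface:

```python
from statistics import mean

# Hypothetical per-segment (source_length, quality_score) pairs for two MT systems.
system_a = [(5, 0.92), (12, 0.85), (30, 0.61), (48, 0.55), (8, 0.90)]
system_b = [(5, 0.88), (12, 0.87), (30, 0.74), (48, 0.70), (8, 0.86)]

def bucket_scores(pairs, edges=(10, 25)):
    """Average segment scores within short/medium/long length buckets."""
    buckets = {"short": [], "medium": [], "long": []}
    for length, score in pairs:
        if length <= edges[0]:
            buckets["short"].append(score)
        elif length <= edges[1]:
            buckets["medium"].append(score)
        else:
            buckets["long"].append(score)
    return {name: mean(scores) for name, scores in buckets.items() if scores}

for name, pairs in [("A", system_a), ("B", system_b)]:
    print(name, bucket_scores(pairs))
```

In this toy data, system A wins on short segments while system B wins on long ones – precisely the kind of trade-off that a single corpus-level score would hide and that a subset-level comparison makes visible.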