WWW pages of 3rd European Master School on Language and Speech

Using Language Models to Assist in the Correction of Machine Translation Output

Beatrice Alex
(University of Edinburgh)

Machine translation (MT) systems are known for making many translation errors. Spotting such errors can be a time-consuming and labor-intensive process, which makes automatic evaluation and correction of MT output very desirable to both system developers and end users. Papineni et al. (2001) have developed an MT evaluation method which automatically measures the perplexity of a target text against n-gram language models of ideal translations. Building on this novel approach of using statistical language models for MT evaluation, the main aim of this project is to automatically spot sentences containing translation errors in the output of a commercial MT system (translating English input text into German) by means of n-gram language models built from a German newspaper corpus. The method aims to differentiate between good-quality and bad-quality translated sentences by assigning them probabilities learned from data. The probabilities assigned to a set of known good translations (produced by human translators) will be used as a reference point. Issues such as sentence length and the occurrence of unseen events in the test data will be addressed.
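The idea of scoring sentences with an n-gram model can be illustrated with a minimal sketch. It is not the project's implementation: the bigram order, the add-one smoothing (one simple way to handle unseen events), the per-word averaging (one simple way to compensate for sentence length), the thresholding rule, and all function names and toy data are assumptions made for illustration.

```python
import math
from collections import Counter

def train_bigram(sentences):
    """Count unigrams and bigrams over tokenized sentences with boundary markers."""
    unigrams, bigrams = Counter(), Counter()
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["</s>"]
        unigrams.update(padded)
        bigrams.update(zip(padded, padded[1:]))
    return unigrams, bigrams

def per_word_logprob(tokens, unigrams, bigrams):
    """Average add-one-smoothed bigram log probability per predicted token.

    Dividing by the number of predicted tokens keeps long sentences from
    being penalized merely for their length.
    """
    vocab_size = len(unigrams) + 1  # assumption: one extra slot for unseen words
    padded = ["<s>"] + tokens + ["</s>"]
    total = 0.0
    for prev, word in zip(padded, padded[1:]):
        # Counter returns 0 for unseen events, so smoothing keeps the log finite.
        total += math.log((bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size))
    return total / (len(padded) - 1)

def flag_suspect(candidates, references, unigrams, bigrams):
    """Return candidates scoring below the worst-scoring reference translation."""
    threshold = min(per_word_logprob(r, unigrams, bigrams) for r in references)
    return [c for c in candidates if per_word_logprob(c, unigrams, bigrams) < threshold]

# Hypothetical toy data standing in for the newspaper corpus and the MT output.
corpus = [
    ["der", "hund", "bellt"],
    ["die", "katze", "trinkt", "milch"],
    ["der", "hund", "trinkt", "wasser"],
]
unigrams, bigrams = train_bigram(corpus)
good = ["der", "hund", "bellt"]
bad = ["bellt", "hund", "der"]
suspects = flag_suspect([good, bad], [good], unigrams, bigrams)
```

In this toy setting the scrambled sentence receives a lower per-word score than the well-formed one and is the only sentence flagged. A realistic setup would need a far larger corpus and a more careful smoothing scheme.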