The Operation Sequence Model – Combining N-Gram-based and Phrase-based Statistical Machine Translation
Abstract
In this article, we present a novel machine translation model, the operation sequence model (OSM), that combines the benefits of phrase-based and N-gram-based SMT and remedies their drawbacks. The model is based on a joint source-channel probability model which represents the translation process as a linear sequence of operations. The sequence includes not only translation operations but also reordering operations. As in N-gram-based SMT, the model i) is based on minimal translation units, ii) takes both source and target information into account, iii) does not make a phrasal independence assumption, and iv) avoids the spurious phrasal segmentation problem. As in phrase-based SMT, the model i) has the ability to memorize lexical reordering triggers, ii) builds the search graph dynamically, and iii) decodes with large translation units during search. The unique properties of the model are i) its strong coupling of reordering and translation, where translation and reordering decisions are conditioned on n previous translation and reordering decisions, and ii) the ability to perform long-range reorderings. Using BLEU as a metric of translation accuracy, we found that our system performs significantly better than state-of-the-art phrase-based systems (Moses and Phrasal) and N-gram-based systems (Ncode) on standard translation tasks. We also conduct a study where we look only at the reordering component of our model, comparing it to the Moses lexical reordering model, by integrating the OSM model into Moses. We found that the OSM model outperforms lexicalized reordering on all translation tasks and that using both models in tandem gives the best results.

Published
2024-12-05
Section
Long Paper
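To make the abstract's central idea concrete, here is a minimal sketch of how a bilingual sentence pair can be encoded as one linear sequence of translation and reordering operations and scored with an n-gram (Markov) model over that sequence. The operation names (`GEN`, `INSERT_GAP`, `JUMP_BACK`) are simplified stand-ins for the paper's operation inventory, and all probabilities are invented purely for illustration; this is not the authors' implementation.

```python
# Toy OSM-style sketch: a sentence pair as a single operation sequence,
# scored by a bigram model over operations. All names and probabilities
# here are illustrative assumptions, not the paper's actual model.
import math
from collections import defaultdict

# German "er hat es gelesen" -> English "he has read it": the object
# pronoun must be reordered, so the sequence interleaves translation
# operations with reordering operations (gap insertion and a jump back).
ops = [
    "GEN(er,he)",         # translate a minimal unit
    "GEN(hat,has)",
    "INSERT_GAP",         # open a gap on the source side
    "GEN(gelesen,read)",
    "JUMP_BACK(1)",       # return to the open gap
    "GEN(es,it)",
]

# Bigram model P(op_i | op_{i-1}); unseen pairs get a small floor.
# Every value below is made up for illustration only.
bigram = defaultdict(lambda: 1e-4)
bigram[("<s>", "GEN(er,he)")] = 0.5
bigram[("GEN(er,he)", "GEN(hat,has)")] = 0.4
bigram[("GEN(hat,has)", "INSERT_GAP")] = 0.35
bigram[("INSERT_GAP", "GEN(gelesen,read)")] = 0.4
bigram[("GEN(gelesen,read)", "JUMP_BACK(1)")] = 0.5
bigram[("JUMP_BACK(1)", "GEN(es,it)")] = 0.45

def score(sequence):
    """Log-probability of an operation sequence under the bigram model."""
    logp = 0.0
    prev = "<s>"
    for op in sequence:
        logp += math.log(bigram[(prev, op)])  # condition on previous op
        prev = op
    return logp

print(score(ops))
```

Because translation and reordering operations live in one sequence, the model conditions each decision on the n previous translation *and* reordering decisions jointly, which is the coupling the abstract highlights.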