Lecture Notes in Computer Science, 2008, Volume 5351/2008, 809-818, DOI: 10.1007/978-3-540-89197-0_75

A Syntactic-based Word Re-ordering for English-Vietnamese Statistical Machine Translation System

Hong-Nhung Nguyen Thi and Dien Dinh

Abstract

In machine translation, re-ordering words from the source language into target-language order is one of the major steps affecting system performance. Among the many approaches to this problem, the syntactic approach is an effective method for handling word order in a statistical machine translation (SMT) system. In this paper, we introduce a word re-ordering approach for an English-Vietnamese SMT system that makes use of syntactic rules extracted from parse trees. Our word re-ordering rule set includes rules for noun phrases, verb phrases and adjective phrases. According to the experimental results, the noun phrase rules are the most significant of all. Compared with the MOSES phrase-based SMT system [1], these rules improve the BLEU score by 3.24 on our test corpus. We also conduct further experiments using different combinations of rules to study their effectiveness, and find that translation performance on each corpus can be tuned by choosing a different combination.

Keywords  Statistical machine translation - word re-ordering - parse tree - syntactic-based word re-ordering rule
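
As a rough illustration of what a noun-phrase re-ordering rule applied to a parse tree might look like, the sketch below moves adjectives after the nouns they precede, reflecting the general fact that Vietnamese adjectives follow the noun. This is not the rule set from the paper, which the abstract does not list. The tree encoding, the function names (is_leaf, reorder_np, reorder, leaves) and the single rule are assumptions made for illustration; the tags are Penn Treebank style.

```python
# Hypothetical tree encoding (an assumption, not the paper's format):
#   leaf          -> (pos_tag, word)
#   internal node -> (label, [children])

NOUN_TAGS = ("NN", "NNS")


def is_leaf(node):
    """A leaf carries a word (a string) in its second slot."""
    return isinstance(node[1], str)


def reorder_np(children):
    """Illustrative rule: JJ+ NN+  ->  NN+ JJ+ among the children of an NP."""
    out, i = [], 0
    while i < len(children):
        # j marks the end of a run of adjective leaves starting at i.
        j = i
        while j < len(children) and is_leaf(children[j]) and children[j][0] == "JJ":
            j += 1
        # k marks the end of a run of noun leaves starting at j.
        k = j
        while k < len(children) and is_leaf(children[k]) and children[k][0] in NOUN_TAGS:
            k += 1
        if j > i and k > j:
            out.extend(children[j:k])  # nouns first
            out.extend(children[i:j])  # adjectives after, original order kept
            i = k
        else:
            out.append(children[i])
            i += 1
    return out


def reorder(node):
    """Apply the NP rule bottom-up over the whole tree."""
    if is_leaf(node):
        return node
    label, children = node
    children = [reorder(child) for child in children]
    if label == "NP":
        children = reorder_np(children)
    return (label, children)


def leaves(node):
    """Read the words off the tree from left to right."""
    if is_leaf(node):
        return [node[1]]
    return [word for child in node[1] for word in leaves(child)]


tree = ("S", [
    ("NP", [("PRP", "I")]),
    ("VP", [("VBD", "bought"),
            ("NP", [("DT", "a"), ("JJ", "red"), ("NN", "car")])]),
])

print(" ".join(leaves(reorder(tree))))  # I bought a car red
```

In a pre-ordering setup of this kind, the re-ordered English sentence would then be passed to the phrase-based system in place of the original, so that source and target word order are closer before alignment and decoding.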
