This paper proposes a novel method for phrase-based statistical machine translation that uses a pivot language.
To translate between languages $L_s$ and $L_t$ with limited bilingual resources, we bring in a third language, $L_p$, called the pivot language. For the language pairs $L_s$-$L_p$ and $L_p$-$L_t$, there exist large bilingual corpora. Using only $L_s$-$L_p$ and $L_p$-$L_t$ bilingual corpora, we can build a translation model for $L_s$-$L_t$. The advantage of this method lies in the fact that we can perform translation between $L_s$ and $L_t$ even if there is no bilingual corpus available for this language pair. Using BLEU as a metric, our pivot language approach significantly outperforms the standard model trained on a small bilingual corpus. Moreover, with a small $L_s$-$L_t$ bilingual corpus available, our method can further improve translation quality by using the additional $L_s$-$L_p$ and $L_p$-$L_t$ bilingual corpora.
Keywords: Pivot language, Phrase-based statistical machine translation, Scarce bilingual resources
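As an illustration of how an $L_s$-$L_t$ translation model might be derived from $L_s$-$L_p$ and $L_p$-$L_t$ resources alone, the sketch below combines two phrase tables by summing over shared pivot phrases. The abstract does not spell out the combination rule, so the formula used here, $p(t \mid s) \approx \sum_{p} p(p \mid s)\, p(t \mid p)$, is an assumption. So are the function name `triangulate`, the nested-dictionary table layout, and the toy Spanish/English/French entries, none of which come from the paper.

```python
from collections import defaultdict


def triangulate(src_pivot, pivot_tgt):
    """Build a source-to-target phrase table through a pivot language.

    src_pivot: {source phrase: {pivot phrase: p(pivot | source)}}
    pivot_tgt: {pivot phrase: {target phrase: p(target | pivot)}}

    Assumption (not taken from the paper): the target phrase is
    conditionally independent of the source phrase given the pivot
    phrase, so p(t | s) is approximated by sum_p p(p | s) * p(t | p).
    """
    src_tgt = {}
    for s, pivots in src_pivot.items():
        scores = defaultdict(float)
        for p, prob_sp in pivots.items():
            # Pivot phrases missing from the second table contribute nothing.
            for t, prob_pt in pivot_tgt.get(p, {}).items():
                scores[t] += prob_sp * prob_pt
        if scores:
            src_tgt[s] = dict(scores)
    return src_tgt


# Hypothetical toy tables: Spanish (source), English (pivot), French (target).
src_pivot = {"casa": {"house": 0.8, "home": 0.2}}
pivot_tgt = {
    "house": {"maison": 0.9, "domicile": 0.1},
    "home": {"maison": 0.5, "foyer": 0.5},
}
table = triangulate(src_pivot, pivot_tgt)
```

Source phrases whose pivot translations never appear in the second table receive no entry at all, which is one reason a small direct $L_s$-$L_t$ corpus, where available, could still add coverage.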