The accurate translation of collocations, or multi-word units, is essential for high-quality machine translation. However,
many collocations do not translate compositionally, thus requiring individual entries in the bilingual lexicon. We present
a technique for collocation extraction from large corpora that takes into account the dispersion of the collocations throughout
the corpus. The resulting ranking more accurately reflects how likely each collocation is to occur in a wide variety of texts; collocations
that are specific to a particular text are less useful for lexicon development. Once the collocations are extracted, appropriate
bilingual lexical entries can be developed by lexicographers.
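
The idea of dispersion-aware ranking can be illustrated with a minimal sketch. The abstract does not specify the dispersion measure or the scoring formula, so everything here is an assumption made for illustration: candidates are restricted to adjacent word pairs, dispersion is approximated by the fraction of texts in which a pair occurs, and the hypothetical function `rank_collocations` multiplies raw frequency by that fraction.

```python
from collections import Counter


def rank_collocations(documents):
    """Rank adjacent word pairs by frequency weighted by dispersion.

    documents: list of token lists, one per text.
    Returns (bigram, score) pairs sorted by descending score.
    NOTE: the scoring formula is an illustrative assumption,
    not the measure used in the paper.
    """
    total_freq = Counter()  # occurrences across the whole corpus
    doc_freq = Counter()    # number of texts containing the bigram
    for tokens in documents:
        bigrams = list(zip(tokens, tokens[1:]))
        total_freq.update(bigrams)
        doc_freq.update(set(bigrams))  # count each text at most once
    n_docs = len(documents)
    # Assumed score: raw frequency scaled by the fraction of texts
    # in which the bigram appears.
    scores = {bg: total_freq[bg] * doc_freq[bg] / n_docs for bg in total_freq}
    # Sort by descending score, breaking ties alphabetically.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


docs = [
    "strong tea is strong tea".split(),
    "he drinks strong tea".split(),
    "quantum foam quantum foam quantum foam".split(),
]
for bigram, score in rank_collocations(docs)[:3]:
    print(bigram, round(score, 2))
```

In this toy corpus, "strong tea" and "quantum foam" each occur three times, but "strong tea" appears in two texts and "quantum foam" in only one, so the former ranks higher. This mirrors the point that text-specific collocations are less useful for lexicon development.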