Abstract

Despite intensive research on linguistic techniques in information retrieval, few large-scale search engines have taken full advantage of these techniques. This paper presents the integration of various linguistic techniques in one of the largest search engines on the Internet. The techniques include language identification, offensive content filtering, phrasing and anti-phrasing, normalization, and clustering. We describe some of the challenges of Internet search and discuss our experiences with these techniques.
