Conventional document search techniques are limited to matching individual keywords or phrases against source documents.
As a result, they miss documents that contain semantically similar terms and therefore achieve relatively low recall. At the same time, processing capabilities and tools for syntactic and semantic analysis of language have advanced to the
point where index-time linguistic analysis of source documents is both feasible and practical. In this paper, we introduce
document dimensions, a means of classifying or grouping terms discovered in documents. Using an enhanced version of Jakarta Lucene[1], we demonstrate
that supplementing keyword analysis with some syntactic and semantic information can indeed enhance the quality of information
retrieval results.