Lecture Notes in Computer Science, 2007, Volume 4509/2007, 476-488, DOI: 10.1007/978-3-540-72665-4_41

Fuzzy Clustering for Topic Analysis and Summarization of Document Collections

René Witte and Sabine Bergler

Abstract

Large document collections, such as those delivered by Internet search engines, are difficult and time-consuming for users to read and analyse. The detection of common and distinctive topics within a document set, together with the generation of multi-document summaries, can greatly ease the burden of information management. We show how this can be achieved with a clustering algorithm based on fuzzy set theory, which (i) is easy to implement and integrate into a personal information system, (ii) generates a highly flexible data structure for topic analysis and summarization, and (iii) also delivers excellent performance.
