Clustering and Classification

Improving Quality of Search Results Clustering with Approximate Matrix Factorisations

Stanislaw Osinski (1)

(1) Poznan Supercomputing and Networking Center, ul. Noskowskiego 10, 61-704, Poznan, Poland
Abstract
In this paper we show how approximate matrix factorisations can be used to organise document summaries returned by a search engine into meaningful thematic categories. We compare four different factorisations (SVD, NMF, LNMF and K-Means/Concept Decomposition) with respect to topic separation capability, outlier detection and label quality. We also compare our approach with two other clustering algorithms: Suffix Tree Clustering (STC) and Tolerance Rough Set Clustering (TRC). For our experiments we use the standard merge-then-cluster approach, with the Open Directory Project web catalogue as the source of human-clustered document summaries.

Contact information: Stanislaw Osinski
Email: stanislaw.osinski@man.poznan.pl