14. Pattern-Oriented Hierarchical Clustering
Tadeusz Morzy, Marek Wojciechowski and Maciej Zakrzewicz
Institute of Computing Science, Poznan University of Technology, ul. Piotrowo 3a, 60-965 Poznan, Poland
Abstract
Clustering is a data mining method that consists of discovering interesting data distributions in very large databases.
Applications of clustering include customer segmentation, catalog design, store layout, and stock market segmentation.
In this paper, we consider the problem of discovering similarity-based clusters in a large database of event sequences. We
introduce a hierarchical algorithm that uses sequential patterns found in the database to efficiently generate both the clustering
model and data clusters. The algorithm iteratively merges smaller, similar clusters into bigger ones until the requested number
of clusters is reached. In the absence of a well-defined metric space, we propose a similarity measure for use in
cluster merging. The advantage of the proposed measure is that no additional access to the source database is needed to evaluate
the inter-cluster similarities.
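The merging loop described above can be illustrated with a minimal sketch. The abstract does not define the similarity measure, so the sketch assumes a Jaccard coefficient over the sets of sequence identifiers that support each cluster's patterns. The names `jaccard` and `merge_clusters`, the input layout (a mapping from a pattern to the identifiers of the sequences containing it), and the early stop when no two clusters overlap are all hypothetical choices made for illustration, not the authors' published procedure. Under that assumption, similarities are computed from the stored identifier sets alone, without rereading the source sequences.

```python
from itertools import combinations


def jaccard(a, b):
    """Assumed similarity: shared sequence ids divided by all sequence ids."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def merge_clusters(pattern_support, k):
    """Greedily merge the most similar pair of clusters until k remain.

    pattern_support: dict mapping a pattern label to the set of ids of
    the sequences that contain it (hypothetical input layout).
    Returns a list of (patterns, sequence_ids) tuples.
    """
    # Start with one cluster per pattern.
    clusters = [({p}, set(ids)) for p, ids in pattern_support.items()]

    while len(clusters) > max(k, 1):
        # Find the most similar pair using only the stored id sets.
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            score = jaccard(clusters[i][1], clusters[j][1])
            if best is None or score > best[0]:
                best = (score, i, j)

        score, i, j = best
        if score == 0.0:
            break  # assumption: stop when no two clusters share a sequence

        merged = (clusters[i][0] | clusters[j][0],
                  clusters[i][1] | clusters[j][1])
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)

    return clusters


# Illustrative data only: four made-up patterns and their supporting ids.
example = {
    "a->b": {1, 2, 3},
    "a->c": {2, 3, 4},
    "x->y": {7, 8},
    "x->z": {8, 9},
}
result = merge_clusters(example, 2)
```

With this made-up input, the two `a` patterns are merged first and the two `x` patterns second, leaving two clusters. The quadratic pair search is kept for readability; a practical implementation would likely cache pairwise similarities between iterations.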
This work was partially supported by grant no. KBN 43-1309 from the State Committee for Scientific Research (KBN), Poland.