Lecture Notes in Computer Science, 2007, Volume 4702/2007, 479-486, DOI: 10.1007/978-3-540-74976-9_49

Matching Partitions over Time to Reliably Capture Local Clusters in Noisy Domains

Frank Höppner and Mirko Böttcher

Abstract

When searching for small clusters, it is difficult to distinguish between incidental agglomerations of noisy points and true local patterns. We present the PAMALOC algorithm, which addresses this problem by exploiting the temporal information contained in most business data sets. The algorithm detects local patterns in noisy data sets more reliably than is possible when the temporal information is ignored. This is achieved by making use of the fact that noise does not reproduce its incidental structure over time, whereas even small patterns do. In particular, we developed a method to track clusters over time based on an optimal match of data partitions between time periods.
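
To illustrate what an optimal match of partitions between two time periods can look like, the following minimal Python sketch pairs the clusters of one period with those of the next so that the total distance between matched cluster centroids is smallest. It is not the PAMALOC algorithm itself: the abstract does not state the matching criterion, so the use of centroids, Euclidean distance, equal cluster counts, and exhaustive search over permutations are all assumptions made for illustration, and the name match_partitions is hypothetical.

```python
from itertools import permutations
from math import dist


def match_partitions(centroids_a, centroids_b):
    """Pair clusters of period A with clusters of period B.

    Assumption (not from the paper): each cluster is summarised by its
    centroid, and the best match minimises the summed Euclidean distance
    between paired centroids. Exhaustive search, so only suitable for a
    handful of clusters.
    """
    if len(centroids_a) != len(centroids_b):
        raise ValueError("this sketch assumes equally many clusters in both periods")

    n = len(centroids_a)
    best_perm, best_cost = None, float("inf")
    # Try every one-to-one assignment and keep the cheapest one.
    for perm in permutations(range(n)):
        cost = sum(dist(centroids_a[i], centroids_b[perm[i]]) for i in range(n))
        if cost < best_cost:
            best_perm, best_cost = perm, cost
    # pairs[k] = (index of cluster in period A, index of its match in period B)
    return list(enumerate(best_perm)), best_cost


# Hypothetical centroids for two consecutive time periods.
period_1 = [(0.0, 0.0), (5.0, 5.0), (9.0, 1.0)]
period_2 = [(5.2, 4.9), (8.8, 1.1), (0.1, -0.1)]

pairs, total_cost = match_partitions(period_1, period_2)
print(pairs)  # [(0, 2), (1, 0), (2, 1)]
```

Under these assumptions, a cluster whose best partner in the next period is still far away would be a candidate for an incidental agglomeration of noise, while a cluster that finds a close partner period after period behaves like a reproducible local pattern.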
