We describe a novel approach for clustering collections of sets, and its application to the analysis and mining of categorical
data. By “categorical data,” we mean tables with fields that cannot be naturally ordered by a metric – e.g., the names of
producers of automobiles, or the names of products offered by a manufacturer. Our approach is based on an iterative method
for assigning and propagating weights on the categorical values in a table; this facilitates a type of similarity measure
arising from the co-occurrence of values in the dataset. Our techniques can be studied analytically in terms of certain types
of non-linear dynamical systems.
Key words:Clustering – Data mining – Categorial data – Dynamical systems – Hypergraphs
Received February 15, 1999 / Accepted August 15, 1999