Unsupervised Learning: Self-aggregation in Scaled Principal Component Space*
Chris Ding, Xiaofeng He (4), Hongyuan Zha (5) and Horst Simon (4)
(4) NERSC Division, Lawrence Berkeley National Laboratory, University of California, Berkeley, CA 94720
(5) Department of Computer Science and Engineering, Pennsylvania State University, University Park, PA 16802
Abstract
We demonstrate that data clustering amounts to a dynamic process of self-aggregation in which data objects move towards each other to form clusters, revealing the inherent pattern of similarity. Self-aggregation is governed by connectivity and occurs in a space obtained by a nonlinear scaling of principal component analysis (PCA). The method combines dimensionality reduction and clustering in a single framework. It applies to both square similarity matrices and rectangular association matrices.
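
The abstract does not spell out the scaling, so the following is only a minimal sketch under a stated assumption: that the nonlinear scaling is the degree normalization D^{-1/2} W D^{-1/2} of a square similarity matrix W, with the leading eigenvectors rescaled by D^{-1/2}. The function name `scaled_pc_coordinates`, the toy matrix, and the value of `eps` are hypothetical choices for illustration. On this toy input, objects that are strongly connected to each other land on the same point in the scaled space, which is the self-aggregation behavior described above.

```python
import numpy as np

def scaled_pc_coordinates(W, k):
    """Return k leading scaled principal components of a similarity matrix W.

    Assumption (not stated in the abstract): the scaling is the degree
    normalization D^{-1/2} W D^{-1/2}, where D is the diagonal matrix of
    row sums of W.
    """
    W = np.asarray(W, dtype=float)
    d = W.sum(axis=1)                                # degree of each object
    d_inv_sqrt = 1.0 / np.sqrt(d)
    W_hat = W * np.outer(d_inv_sqrt, d_inv_sqrt)     # D^{-1/2} W D^{-1/2}
    eigvals, eigvecs = np.linalg.eigh(W_hat)         # eigenvalues in ascending order
    Z = eigvecs[:, -k:][:, ::-1]                     # k leading eigenvectors
    return Z * d_inv_sqrt[:, None]                   # rescale rows by D^{-1/2}

# Toy similarity matrix: two tightly connected groups (4 and 3 objects)
# with a weak connection eps between the groups.
n_a, n_b, eps = 4, 3, 0.05
W = np.full((n_a + n_b, n_a + n_b), eps)
W[:n_a, :n_a] = 1.0
W[n_a:, n_a:] = 1.0

Q = scaled_pc_coordinates(W, k=2)

# Spread of each group along every coordinate, and distance between groups.
within_a = np.ptp(Q[:n_a], axis=0).max()
within_b = np.ptp(Q[n_a:], axis=0).max()
between = np.linalg.norm(Q[0] - Q[-1])
print("within-group spread:", within_a, within_b)
print("between-group distance:", between)
```

In this idealized example the within-group spread is at the level of floating-point round-off while the two groups stay clearly separated. Real similarity data would only aggregate approximately.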
* LBNL Tech Report 49048, October 5, 2001. Supported by the Department of Energy (Office of Science, through an LBNL LDRD) under contract DE-AC03-76SF00098.