Principal component analysis (PCA) is a statistical technique to identify the dependency structure of multivariate stochastic
observations. PCA is frequently used in data mining applications. This paper considers PCA in the context of emerging
network-based computing environments. It offers a technique to perform PCA on distributed and heterogeneous data sets with
relatively small communication overhead. The technique is evaluated on several data sets, including a data set for
a web mining application. This approach is likely to facilitate the development of distributed clustering, associative link
analysis, and other heterogeneous data mining applications that frequently use PCA.