Self-Tuning Clustering: An Adaptive Clustering Method for Transaction Data
Ching-Huang Yun (7), Kun-Ta Chuang (8) and Ming-Syan Chen (7)
(7) Department of Electrical Engineering, National Taiwan University, Taipei, Taiwan, ROC
(8) Graduate Institute of Communication Engineering, National Taiwan University, Taipei, Taiwan, ROC
Abstract
In this paper, we devise an efficient algorithm for clustering market-basket data items. Market-basket data analysis has been
well studied in association rule mining, which discovers the set of large items, that is, the items frequently purchased
among all transactions. In essence, clustering divides a set of data items into groups such that items in the same group
are as similar to one another as possible. In view of the nature of clustering market-basket data, we present a measurement,
called the small-large (SL) ratio, which is in essence the ratio of the number of small items to that of large items.
Clearly, the smaller the SL ratio of a cluster, the more similar to one another the items in the cluster are. Then, by
utilizing a self-tuning technique for adaptively tuning the input and output SL ratio thresholds, we develop an efficient
clustering algorithm, algorithm STC (standing for Self-Tuning Clustering), for clustering market-basket data. The objective
of algorithm STC is: “Given a database of transactions, determine a clustering such that the average SL ratio is minimized.”
We conduct several experiments on real data and a synthetic workload for performance studies. Our experimental results show
that, by utilizing the self-tuning technique to adaptively minimize the input and output SL ratio thresholds, algorithm STC
performs very well. Specifically, algorithm STC not only incurs an execution time significantly smaller than that of prior
works but also produces clustering results of very good quality.
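As a rough illustration of the SL ratio described above, the following Python sketch counts small and large items within one cluster and averages the ratio over a clustering. It is not algorithm STC itself and does not implement the self-tuning of thresholds. The abstract does not state how an item is classified, so the rule used here is an assumption: an item is treated as "large" in a cluster when the fraction of that cluster's transactions containing it reaches a hypothetical threshold `min_support`, and as "small" otherwise. The function names `sl_ratio` and `average_sl_ratio` are also illustrative.

```python
from collections import Counter


def sl_ratio(cluster, min_support=0.5):
    """Ratio of the number of small items to the number of large items in one cluster.

    Assumption (not from the paper): an item is "large" if it appears in at
    least `min_support` of the cluster's transactions, and "small" otherwise.
    """
    if not cluster:
        raise ValueError("cluster must contain at least one transaction")
    # Count the number of transactions in which each item occurs.
    counts = Counter(item for transaction in cluster for item in set(transaction))
    n = len(cluster)
    large = {item for item, c in counts.items() if c / n >= min_support}
    small = set(counts) - large
    if not large:
        # No large items at all: treat the cluster as maximally dissimilar.
        return float("inf")
    return len(small) / len(large)


def average_sl_ratio(clustering, min_support=0.5):
    """Average SL ratio over all clusters, the quantity the stated objective minimizes."""
    return sum(sl_ratio(c, min_support) for c in clustering) / len(clustering)


tight = [{"a", "b"}, {"a", "b"}]                  # every item is large
loose = [{"a", "b"}, {"a", "b", "c"}, {"a", "d"}]  # "c" and "d" are small
print(sl_ratio(tight), sl_ratio(loose), average_sl_ratio([tight, loose]))
```

Under these assumptions, the cluster whose transactions share all of their items has the lower SL ratio, matching the observation that a smaller SL ratio indicates more similar items.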
Keywords: Data mining, clustering market-basket data, small-large ratios, adaptive self-tuning