Lecture Notes in Computer Science, 1999, Volume 1574/1999, 54-64, DOI: 10.1007/3-540-48912-6_8

LGen — A Lattice-Based Candidate Set Generation Algorithm for I/O Efficient Association Rule Mining

Chi Lap Yip, K. K. Loo, Ben Kao, David Cheung and C. K. Cheng

Abstract

Most algorithms for association rule mining are variants of the basic Apriori algorithm [2]. One characteristic of these Apriori-based algorithms is that candidate itemsets are generated in rounds, with the size of the itemsets incremented by one per round. The number of database scans required by Apriori-based algorithms thus depends on the size of the largest large itemsets. In this paper we devise a more general candidate set generation algorithm, LGen, which generates candidate itemsets of multiple sizes during each database scan. We show that, given a reasonable set of suggested large itemsets, LGen can significantly reduce the number of I/O passes required. In the best cases, only two passes are sufficient to discover all the large itemsets irrespective of the size of the largest ones.

Keywords  Data mining - association rules - lattice - Apriori - LGen
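
The round-by-round behaviour of the Apriori baseline described in the abstract can be illustrated with a minimal sketch. This is not code from the paper, and it does not implement LGen, whose procedure the abstract does not spell out. The function names (apriori_gen, apriori), the min_support parameter (taken here as an absolute count), and the toy transactions in the usage note are all assumptions made for illustration.

```python
from itertools import combinations


def apriori_gen(large_prev):
    """Build size-(k+1) candidates from the large k-itemsets (join, then prune)."""
    large_prev = set(large_prev)
    candidates = set()
    for a in large_prev:
        for b in large_prev:
            union = a | b
            # Join: keep unions that are exactly one item bigger than a.
            if len(union) != len(a) + 1:
                continue
            # Prune: every k-subset of the candidate must itself be large.
            if all(frozenset(s) in large_prev
                   for s in combinations(union, len(a))):
                candidates.add(union)
    return candidates


def apriori(transactions, min_support):
    """Return (large itemsets with their counts, number of database scans).

    min_support is an absolute count (an assumption of this sketch).
    """
    transactions = [frozenset(t) for t in transactions]
    # Round 1 candidates: every single item seen in the data.
    candidates = {frozenset([item]) for t in transactions for item in t}
    large = {}
    passes = 0
    while candidates:
        passes += 1  # one full scan of the database per round
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        level = {c: n for c, n in counts.items() if n >= min_support}
        large.update(level)
        # Candidates for the next round are exactly one item larger.
        candidates = apriori_gen(level.keys())
    return large, passes
```

For example, calling apriori([{"a", "b", "c"}, {"a", "b", "c"}, {"a", "b"}, {"c", "d"}], 2) performs three scans, because the largest large itemset {a, b, c} has three items. This is the dependence on itemset size that the abstract says LGen aims to break by generating candidates of several sizes within a single scan.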
