Most algorithms for association rule mining are variants of the basic Apriori algorithm [2]. One characteristic of these Apriori-based algorithms is that candidate itemsets are generated in rounds, with the size of
the itemsets incremented by one per round. The number of database scans required by Apriori-based algorithms thus depends
on the size of the largest large itemsets. In this paper we devise a more general candidate set generation algorithm, LGen,
which generates candidate itemsets of multiple sizes during each database scan. We show that, given a reasonable set of suggested
large itemsets, LGen can significantly reduce the number of I/O passes required. In the best cases, only two passes are sufficient
to discover all the large itemsets irrespective of the size of the largest ones.
Keywords: Data mining, association rules, lattice, Apriori, LGen
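
To illustrate why the number of database scans in Apriori-based algorithms grows with the size of the largest large itemset, the following is a minimal sketch of level-wise candidate generation. It is a textbook-style simplification, not the implementation from [2] and not LGen; the function name `apriori`, the use of an absolute support count `min_support`, and the in-memory list of transactions are assumptions made for illustration.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Level-wise mining sketch.

    Returns (large, passes): a dict mapping each large itemset to its
    support count, and the number of scans over the transactions.
    """
    transactions = [frozenset(t) for t in transactions]
    large = {}
    passes = 0
    k = 1
    # Round 1 candidates are all single items. The item universe is
    # collected up front for brevity and is not counted as a pass.
    candidates = {frozenset([item]) for t in transactions for item in t}

    while candidates:
        # One full scan counts the support of every size-k candidate.
        passes += 1
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        current = {c: n for c, n in counts.items() if n >= min_support}
        large.update(current)

        # Build size-(k+1) candidates from the size-k large itemsets,
        # keeping only those whose size-k subsets are all large.
        k += 1
        candidates = set()
        for a, b in combinations(current, 2):
            union = a | b
            if len(union) == k and all(
                frozenset(s) in current for s in combinations(union, k - 1)
            ):
                candidates.add(union)

    return large, passes


if __name__ == "__main__":
    db = [{"a", "b", "c"}, {"a", "b", "c"}, {"a", "b"}, {"c"}]
    large, passes = apriori(db, min_support=2)
    print(passes)  # 3 scans: the largest large itemset {a, b, c} has size 3
```

Because each round handles exactly one itemset size, the loop cannot finish in fewer scans than the size of the largest large itemset. Generating candidates of several sizes within a single scan, as the abstract describes for LGen, is what removes this dependency.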