Abstract

Decision tree learners inspect the marginal class distributions of numerical attributes to infer a predicate that can serve as a decision node in the tree. Because such discretization techniques examine only the marginal distributions, they may fail completely to predict the class correctly, even in cases for which a decision tree with a 100% classification rate exists. In this paper, an objective function-based clustering algorithm is modified to yield a discretization of numerical variables that overcomes these problems. The underlying clustering algorithm is the fuzzy c-means algorithm, which is modified to (a) take the class information into account and (b) organize all cluster prototypes in a regular grid, such that the grid rather than the individual clusters is optimized.
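
The failure mode described in the abstract can be illustrated with a small sketch. It is not taken from the paper: the checkerboard data set, the use of information gain as the marginal split criterion, and the names `best_marginal_gain` and `two_level_tree` are all assumptions chosen for illustration. On this data, every single-attribute threshold has zero information gain, although a two-level decision tree classifies all points correctly.

```python
import math
from itertools import product


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    h = 0.0
    for c in set(labels):
        p = labels.count(c) / n
        h -= p * math.log2(p)
    return h


def best_marginal_gain(values, labels):
    """Largest information gain of any single threshold on one attribute.

    Only the marginal class distribution along this attribute is used,
    which is the assumption under test here.
    """
    n = len(labels)
    base = entropy(labels)
    best = 0.0
    for t in sorted(set(values))[1:]:
        left = [c for v, c in zip(values, labels) if v < t]
        right = [c for v, c in zip(values, labels) if v >= t]
        gain = base - len(left) / n * entropy(left) - len(right) / n * entropy(right)
        best = max(best, gain)
    return best


# Hypothetical checkerboard (XOR-like) data on a 4 x 4 grid in the unit square.
coords = [0.125, 0.375, 0.625, 0.875]
points = list(product(coords, coords))
labels = [int((x > 0.5) != (y > 0.5)) for x, y in points]

gain_x = best_marginal_gain([p[0] for p in points], labels)
gain_y = best_marginal_gain([p[1] for p in points], labels)


def two_level_tree(x, y):
    """A hand-written depth-2 tree that splits at 0.5 on both attributes."""
    if x > 0.5:
        return 0 if y > 0.5 else 1
    return 1 if y > 0.5 else 0


accuracy = sum(two_level_tree(x, y) == c for (x, y), c in zip(points, labels)) / len(points)

print("best gain on x:", gain_x)
print("best gain on y:", gain_y)
print("accuracy of two-level tree:", accuracy)
```

Because no threshold on either attribute looks useful in isolation, a learner that discretizes each attribute from its marginal class distribution has no basis for choosing the cut at 0.5. A discretization that considers both attributes jointly, such as the grid of cluster prototypes the abstract describes, can in principle find it.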
