We studied three methods to improve identification of difficult small classes by balancing an imbalanced class distribution through
data reduction. The new method, the neighborhood cleaning rule (NCL), outperformed simple random selection and one-sided selection
in experiments with ten data sets. All reduction methods improved identification of small classes (20–30%), but the differences
among the methods were not statistically significant. However, significant differences in accuracies, true-positive rates and true-negative rates obtained with
the 3-nearest neighbor method and C4.5 from the reduced data favored NCL. The results suggest that NCL is a useful method
for improving the modeling of difficult small classes and for building classifiers that identify these classes in real-world
data.