Abstract

Handling missing attribute values is an important issue for classifier learning, since missing attribute values in either training data or test (unseen) data affect the prediction accuracy of learned classifiers. In many real KDD applications, attributes with missing values are very common. This paper studies the robustness of four recently developed committee learning techniques (Boosting, Bagging, SASC, and SASCMB) relative to C4.5 for tolerating missing values in test data. Boosting is found to have a similar level of robustness to C4.5 for tolerating missing values in test data, in terms of average error over a representative collection of natural domains under investigation. Bagging performs slightly better than Boosting, while SASC and SASCMB perform better than both in this regard, with SASCMB performing best.
