A Case Study for Learning from Imbalanced Data Sets

Aijun An, Nick Cercone and Xiangji Huang

Department of Computer Science, University of Waterloo, Waterloo, Ontario N2L 3G1, Canada
Abstract
We present our experience in applying a rule induction technique to an extremely imbalanced pharmaceutical data set. We focus on using a variety of performance measures to evaluate a number of rule quality measures. We also investigate whether simply changing the distribution skew in the training data can improve predictive performance. Finally, we propose a method for adjusting the learning algorithm for learning in an extremely imbalanced environment. Our experimental results show that this adjustment improves predictive performance for rule quality formulas in which rule coverage makes positive contributions to the rule quality value.

Keywords: Machine learning - Imbalanced data sets - Rule quality
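The abstract mentions changing the distribution skew in the training data and judging results with several performance measures. The sketch below illustrates one common way to do this, random undersampling of the majority class, together with precision and recall for the minority class. It is an assumption for illustration only: the abstract does not say which resampling scheme or which measures the authors used, and the names `undersample_majority`, `precision_recall`, and the `ratio` parameter are hypothetical.

```python
import random
from collections import Counter


def undersample_majority(examples, labels, minority_label, ratio, seed=0):
    """Keep every minority example and at most `ratio` majority
    examples per minority example (hypothetical helper, not the paper's method)."""
    rng = random.Random(seed)
    pairs = list(zip(examples, labels))
    minority = [p for p in pairs if p[1] == minority_label]
    majority = [p for p in pairs if p[1] != minority_label]
    # Number of majority examples to retain after reducing the skew.
    keep = min(len(majority), int(ratio * len(minority)))
    combined = minority + rng.sample(majority, keep)
    rng.shuffle(combined)
    xs = [x for x, _ in combined]
    ys = [y for _, y in combined]
    return xs, ys


def precision_recall(y_true, y_pred, positive):
    """Precision and recall for the class `positive`."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


if __name__ == "__main__":
    # Toy data: 10 positives among 1000 examples (1% minority class).
    examples = list(range(1000))
    labels = [1 if i < 10 else 0 for i in examples]
    xs, ys = undersample_majority(examples, labels, minority_label=1, ratio=2.0)
    print(Counter(ys))
    # A classifier that always predicts the majority class finds no positives.
    print(precision_recall(labels, [0] * len(labels), positive=1))
```

On such skewed data a classifier that always predicts the majority class is correct on 99% of the examples yet has zero recall on the minority class, which is why measures other than plain accuracy are informative in this setting.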


Contact Information

Aijun An
Email: aan@uwaterloo.ca

Nick Cercone
Email: ncercone@uwaterloo.ca

Xiangji Huang
Email: jhuang@uwaterloo.ca