Distributed Pasting of Small Votes
N. V. Chawla (6), L. O. Hall (6), K. W. Bowyer (7), T. E. Moore Jr. (6) and W. P. Kegelmeyer (8)
(6) Department of Computer Science and Engineering, University of South Florida, 4202 E. Fowler Avenue, Tampa, Florida 33620, USA
(7) Department of Computer Science and Engineering, University of Notre Dame, 384 Fitzpatrick Hall, Notre Dame, IN 46556, USA
(8) Biosystems Research Department, Sandia National Labs, P.O. Box 969, MS 9951, Livermore, CA 94551-0969, USA
Abstract
Bagging and boosting are two popular ensemble methods that achieve better accuracy than a single classifier. These techniques
have limitations on massive datasets, as the size of the dataset can be a bottleneck. Voting many classifiers built on small
subsets of data (“pasting small votes”) is a promising approach for learning from massive datasets. Pasting small votes can
utilize the power of boosting and bagging, and potentially scale up to massive datasets. We propose a framework for building
hundreds or thousands of such classifiers on small subsets of data in a distributed environment. Experiments show this approach
is fast, accurate, and scalable to massive datasets.
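
The core idea of voting many classifiers built on small subsets can be illustrated with a minimal sketch. It is not the authors' implementation: the nearest-centroid base learner, the uniform random sampling of subsets, the synthetic two-class data, and all function names (`train_centroid`, `paste_small_votes`, `vote`, `make_data`) are assumptions chosen only to keep the example self-contained.

```python
import random
from collections import Counter


def train_centroid(samples):
    """Fit a nearest-centroid model on a list of (features, label) pairs.

    Assumed base learner for illustration; the abstract does not name one.
    """
    sums, counts = {}, Counter()
    for x, y in samples:
        counts[y] += 1
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
    # One mean vector per class label seen in this subset.
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def predict_centroid(model, x):
    """Return the label whose centroid is closest to x (squared distance)."""
    def dist2(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(model, key=lambda y: dist2(model[y]))


def paste_small_votes(data, n_classifiers, subset_size, rng):
    """Train each classifier on its own small random subset of the data.

    Uniform sampling without replacement is an assumption of this sketch.
    """
    return [train_centroid(rng.sample(data, subset_size))
            for _ in range(n_classifiers)]


def vote(ensemble, x):
    """Combine the ensemble's predictions by simple majority vote."""
    return Counter(predict_centroid(m, x) for m in ensemble).most_common(1)[0][0]


def make_data(n, rng):
    """Synthetic 2-D data: class 0 centred at (0, 0), class 1 at (3, 3)."""
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        center = 0.0 if label == 0 else 3.0
        data.append(([rng.gauss(center, 1.0), rng.gauss(center, 1.0)], label))
    return data


if __name__ == "__main__":
    rng = random.Random(0)
    data = make_data(2000, rng)
    # 25 classifiers, each seeing only 40 of the 2000 examples.
    ensemble = paste_small_votes(data, n_classifiers=25, subset_size=40, rng=rng)
    correct = sum(vote(ensemble, x) == y for x, y in data)
    print(f"ensemble accuracy: {correct / len(data):.3f}")
```

Because each classifier depends only on its own subset, the training calls are independent of one another and could in principle be assigned to separate processors. This sketch runs them sequentially and does not attempt to model the distributed framework itself.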