Welcome!
To use the personalized features of this site, please log in or register.
If you have forgotten your username or password, we can help.
My Menu
Saved Items

Part I Invited Papers

When are k-nearest neighbor and back propagation accurate for feasible sized sets of examples?

Eric B. Baum1

(1)  NEC Research Institute, 4 Independence Way, 08540 Princeton, NJ
Abstract
We first review in pedagogical fashion previous results which gave lower and upper bounds on the number of examples needed for training feedforward neural networks when valid generalization is desired. Experimental tests of generalization versus number of examples are then presented for random target networks and examples drawn from a uniform distribution. The experimental results are roughly consistent with the following heuristic: if a database of M examples is loaded onto a W weight net (for MGtW), one expects to make a fraction epsiv=W/M errors in classifying future examples drawn from the same distribution. This is consistent with our previous bounds, but if reliable strengthens them in that: (1) the bounds had large numerical constants and log factors, all of which are set equal one in the heuristic, (2) previous lower bounds on number of examples needed were valid only in a distribution independent context, whereas the experiments were conducted for a uniform distribution, and (3) the previous lower bound was valid for nets with one hidden layer only. These experiments also seem to indicate that networks with two hidden layers have Vapnik-Chervonenkis dimension roughly equal to their total number of weights.
We then consider the convergence of the k-nearest neighbor algorithm to a classifier making a fraction epsiv of errors when examples are drawn from the uniform distribution on S n , the unit sphere in n dimensions, and classified according to a simple target function. We prove that if the target function is a single half space, then for k appropriately chosen (k ap n/epsiv 2 ln(epsiv–1)), k nearest neighbor yields an epsiv accurate classifier using a database of M=O(n/epsiv2 ln(epsiv–1)) classified examples. However, when the target function is a union of two half spaces, k nearest neighbor requires a number of examples exponential in n to achieve high accuracy.
Fulltext Preview (Small, Large)
Image of the first page of the fulltext

References secured to subscribers.



Export this chapter
Export this chapter as RIS | Text
 
Remote Address: 38.107.191.113 • Server: mpweb06
HTTP User Agent: CCBot/1.0 (+http://www.commoncrawl.org/bot.html)