
Fast Outlier Detection in High Dimensional Spaces

Fabrizio Angiulli and Clara Pizzuti

ISI-CNR, c/o DEIS, Università della Calabria, 87036 Rende, CS, Italy
Abstract
In this paper we propose a new definition of distance-based outlier that considers, for each point, the sum of the distances from its k nearest neighbors, called its weight. Outliers are the points with the largest weights. To compute these weights, we find the k nearest neighbors of each point efficiently by linearizing the search space through the Hilbert space-filling curve. The algorithm consists of two phases. The first provides an approximate solution, within a small factor, after executing at most d + 1 scans of the data set with low time complexity, where d is the number of dimensions of the data set. During each scan, the number of candidate points that may belong to the solution set is considerably reduced. The second phase returns the exact solution with a single scan that further examines a small fraction of the data set. Experimental results show that the algorithm always finds the exact solution during the first phase after d̄ ≪ d + 1 steps, and that it scales linearly in both the dimensionality and the size of the data set.
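
To make the weight definition concrete, the following is a minimal brute-force sketch in Python. It is not the Hilbert-curve algorithm described in the abstract: it compares every pair of points and is meant only to illustrate what is being computed. The function names `knn_weights` and `top_outliers`, the Euclidean metric, and the parameter `n` (the number of outliers to report) are assumptions made for this illustration.

```python
import math


def knn_weights(points, k):
    """Return, for each point, the sum of distances to its k nearest neighbors.

    Brute-force illustration (quadratic in the number of points); Euclidean
    distance is an assumption of this sketch.
    """
    weights = []
    for i, p in enumerate(points):
        # Distances from p to every other point, smallest first.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        weights.append(sum(dists[:k]))
    return weights


def top_outliers(points, k, n):
    """Return the indices of the n points with the largest weights."""
    weights = knn_weights(points, k)
    order = sorted(range(len(points)), key=lambda i: weights[i], reverse=True)
    return order[:n]


# Four points on a unit square plus one point far away from them.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (10.0, 10.0)]
print(top_outliers(points, k=2, n=1))  # prints [4]
```

In this toy data set each corner of the unit square has weight 2 for k = 2, while the isolated point has a weight of roughly 26, so it is ranked first. The method in the paper is designed to avoid this all-pairs comparison.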

Contact Information Fabrizio Angiulli
Email: angiulli@isi.cs.cnr.it

Contact Information Clara Pizzuti
Email: pizzuti@isi.cs.cnr.it