Detecting outliers which are grossly different from or inconsistent with the remaining dataset is a major challenge in real-world
KDD applications. Existing outlier detection methods are ineffective on scattered real-world datasets due to implicit data
patterns and parameter-setting issues. To address these issues, we define a novel Local Distance-based Outlier Factor (LDOF) that measures the outlier-ness of objects in scattered datasets. LDOF uses the
location of an object relative to its neighbours to determine the degree to which the object deviates from its neighbourhood. We present
theoretical bounds on LDOF’s false-detection probability. Experimentally, LDOF compares favorably to classical KNN- and LOF-based
outlier detection. In particular, it is less sensitive to parameter values.
Keywords: local outlier, scattered data, k-distance, KNN, LOF, LDOF
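
The abstract describes LDOF only in words, so the following is a minimal sketch of one way to score an object by its location relative to its neighbours. It assumes that the factor is the ratio of the average distance from an object to its k nearest neighbours to the average pairwise distance among those neighbours, and that distances are Euclidean. The function name ldof_scores, the parameter names, and the handling of zero inner distance are illustrative assumptions, not details stated above.

```python
import numpy as np


def ldof_scores(X, k):
    """Sketch of a local distance-based outlier factor (assumed formulation).

    For each point p: (mean distance from p to its k nearest neighbours)
    divided by (mean pairwise distance among those k neighbours).
    Larger values suggest p lies further outside its neighbourhood.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    if not 2 <= k < n:
        raise ValueError("k must satisfy 2 <= k < number of points")

    # Full pairwise Euclidean distance matrix (O(n^2) memory; fine for a sketch).
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    scores = np.empty(n)
    for p in range(n):
        # Indices of the k nearest neighbours of p, excluding p itself.
        order = np.argsort(dist[p])
        nbrs = order[order != p][:k]

        # Average distance from p to its neighbours.
        knn_dist = dist[p, nbrs].mean()

        # Average distance over all ordered pairs of distinct neighbours;
        # the diagonal of the sub-matrix is zero, so it adds nothing to the sum.
        inner_dist = dist[np.ix_(nbrs, nbrs)].sum() / (k * (k - 1))

        # Assumption: if all neighbours coincide, report an infinite score.
        scores[p] = knn_dist / inner_dist if inner_dist > 0 else np.inf
    return scores
```

Under this assumed formulation, a point inside a roughly uniform region scores near 1 and a point far from its neighbours scores well above 1, so objects can be ranked by score rather than separated by a fixed distance threshold.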