Get 20M+ Full-Text Papers For Less Than $1.50/day. Start a 14-Day Trial for You or Your Team.

Learn More →

Toward Generalizing the Unification with Statistical Outliers: The Gradient Outlier Factor Measure

Toward Generalizing the Unification with Statistical Outliers: The Gradient Outlier Factor Measure Toward Generalizing the Unification with Statistical Outliers: The Gradient Outlier Factor Measure FABRIZIO ANGIULLI and FABIO FASSETTI, University of Calabria, Italy In this work, we introduce a novel definition of outlier, namely the Gradient Outlier Factor (or GOF), with the aim to provide a definition that unifies with the statistical one on some standard distributions but has a different behavior in the presence of mixture distributions. Intuitively, the GOF score measures the probability to stay in the neighborhood of a certain object. It is directly proportional to the density and inversely proportional to the variation of the density. We derive formal properties under which the GOF definition unifies the statistical outlier definition and show that the unification holds for some standard distributions, while the GOF is able to capture tails in the presence of different distributions even if their densities sensibly differ. Moreover, we provide a probabilistic interpretation of the GOF score, by means of the notion of density of the data density. Experimental results confirm that there are scenarios in which the novel definition can be profitably employed. To the best of our knowledge, except for distance-based outlier, no other data mining outlier definition has a so http://www.deepdyve.com/assets/images/DeepDyve-Logo-lg.png ACM Transactions on Knowledge Discovery from Data (TKDD) Association for Computing Machinery

Toward Generalizing the Unification with Statistical Outliers: The Gradient Outlier Factor Measure

Loading next page...
 
/lp/association-for-computing-machinery/toward-generalizing-the-unification-with-statistical-outliers-the-q4fW0AOSKg

References

References for this paper are not available at this time. We will be adding them shortly, thank you for your patience.

Publisher
Association for Computing Machinery
Copyright
Copyright © 2016 by ACM Inc.
ISSN
1556-4681
DOI
10.1145/2829956
Publisher site
See Article on Publisher Site

Abstract

Toward Generalizing the Unification with Statistical Outliers: The Gradient Outlier Factor Measure FABRIZIO ANGIULLI and FABIO FASSETTI, University of Calabria, Italy In this work, we introduce a novel definition of outlier, namely the Gradient Outlier Factor (or GOF), with the aim to provide a definition that unifies with the statistical one on some standard distributions but has a different behavior in the presence of mixture distributions. Intuitively, the GOF score measures the probability to stay in the neighborhood of a certain object. It is directly proportional to the density and inversely proportional to the variation of the density. We derive formal properties under which the GOF definition unifies the statistical outlier definition and show that the unification holds for some standard distributions, while the GOF is able to capture tails in the presence of different distributions even if their densities sensibly differ. Moreover, we provide a probabilistic interpretation of the GOF score, by means of the notion of density of the data density. Experimental results confirm that there are scenarios in which the novel definition can be profitably employed. To the best of our knowledge, except for distance-based outlier, no other data mining outlier definition has a so

Journal

ACM Transactions on Knowledge Discovery from Data (TKDD)Association for Computing Machinery

Published: Jan 29, 2016

There are no references for this article.