Outlier Detection

Posted on 2022-06-16 | In paper note

Statistical methods

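fit one model (e.g., a Gaussian) to the distribution of all data, and treat points with low likelihood under that model as outliers
fit two models, one to the distribution of non-outliers and one to the distribution of outliers, and assign each point to the more likely model
Grubbs’ test, a hypothesis test for a single outlier in approximately Gaussian data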
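
A minimal sketch of the single-Gaussian idea, using only the Python standard library. The function names, the sample readings, and the 2.5 threshold are illustrative assumptions, not part of the original note. The second function computes only the Grubbs statistic G = max|x_i - mean| / s; a full Grubbs’ test would also compare G against a critical value derived from the t-distribution, which is not done here.

```python
import statistics


def gaussian_outliers(values, z_threshold=2.5):
    """Return indices whose z-score under a fitted Gaussian exceeds the threshold."""
    mean = statistics.mean(values)
    std = statistics.stdev(values)  # sample standard deviation (n - 1 denominator)
    if std == 0:
        return []  # all values identical, nothing stands out
    return [i for i, v in enumerate(values) if abs(v - mean) / std > z_threshold]


def grubbs_statistic(values):
    """Return G, the largest absolute deviation from the mean in units of s."""
    mean = statistics.mean(values)
    std = statistics.stdev(values)
    return max(abs(v - mean) for v in values) / std


# Hypothetical sensor readings with one obviously suspicious value at the end.
readings = [10.1, 9.8, 10.3, 9.9, 10.0, 10.2, 9.7, 10.1, 9.9, 25.0]
suspects = gaussian_outliers(readings)
```

Note that with n samples the largest possible z-score is (n - 1) / sqrt(n), so on very small samples a high threshold such as 3 can never fire; that is one reason the threshold here is an assumption to tune rather than a fixed rule.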

Distance based methods

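the density within a neighborhood: a point with few neighbors nearby is likely to be an outlier
the distance to the nearest neighbor (or the k-th nearest neighbor): a large distance suggests an outlier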
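
Both distance based scores can be sketched with brute-force pairwise distances. This is an illustrative sketch under assumptions: the function names, the radius, the value of k, and the sample points are all made up for the example, and the O(n^2) loops are only suitable for small data sets.

```python
import math


def neighborhood_counts(points, radius):
    """Density score: number of other points within `radius` of each point."""
    return [
        sum(1 for j, q in enumerate(points) if j != i and math.dist(p, q) <= radius)
        for i, p in enumerate(points)
    ]


def knn_distance_scores(points, k=1):
    """Distance score: distance from each point to its k-th nearest neighbor."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])  # requires k <= len(points) - 1
    return scores


# Four points in a tight group and one point far away (hypothetical data).
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]

counts = neighborhood_counts(points, radius=0.5)
scores = knn_distance_scores(points, k=2)

# The point with the largest k-NN distance is the strongest outlier candidate.
most_isolated = max(range(len(points)), key=scores.__getitem__)
```

A low count or a large k-th neighbor distance both point to the same isolated point here; on real data the two scores can disagree, and the choice of radius or k matters.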

Learning based methods

clustering: the smallest cluster is likely to contain outliers
one-class classifier (e.g., one-class SVM)
binary classifier (e.g., naive Bayes for spam filtering, weighted binary SVM)
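
The clustering idea can be sketched with a small k-means written in plain Python. This is a hedged illustration, not a recommended implementation: the function names, the farthest-point initialisation (chosen only to keep the example deterministic), the fixed iteration count, and the sample points are all assumptions. The one-class and binary classifier approaches would normally rely on a machine learning library and are not shown.

```python
import math


def kmeans_labels(points, k, iterations=20):
    """Assign each point to one of k clusters with a basic Lloyd-style k-means."""
    # Farthest-point initialisation: start from the first point, then repeatedly
    # add the point farthest from the centroids chosen so far.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid for every point.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels


def smallest_cluster(points, k=2):
    """Return indices of the points in the smallest cluster (outlier candidates)."""
    labels = kmeans_labels(points, k)
    sizes = {c: labels.count(c) for c in set(labels)}
    smallest = min(sizes, key=sizes.get)
    return [i for i, label in enumerate(labels) if label == smallest]


# Five points near the origin and two far away (hypothetical data).
points = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (0.3, 0.2), (0.2, 0.2),
          (6.0, 6.0), (6.2, 5.9)]
candidates = smallest_cluster(points, k=2)
```

The smallest cluster is only a set of candidates: if the data has no outliers, some cluster will still be the smallest, so the result usually needs a size or distance check before anything is discarded.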