Outlier
According to [[Douglas Hawkins]]:
"An outlier is an observation which deviates so much from the other observations as to arouse suspicions that it was generated by a different mechanism."
Types of outliers
Mild outlier
Severe outlier
Outlier detection
Univariate and multivariate outlier detection
Given a data sample, we can often assume that it was generated by one basic mechanism (e.g. a probability distribution such as the logistic one). Sometimes, however, the same data set is generated by two or more mechanisms, so that it contains observations from two completely different classes of similar objects. Viewed through a single generating mechanism, the other class then appears as a large group of outliers.
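The effect of a second generating mechanism can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not a prescribed method: both mechanisms are taken to be normal distributions with arbitrarily chosen parameters, and the robust score (distance from the median scaled by the median absolute deviation) with a cut-off of 3.5 is one common convention. The function names are hypothetical.

```python
import random
import statistics


def simulate_two_mechanisms(n_main=200, n_other=20, seed=0):
    """Draw a sample from two mechanisms (assumed normal, for illustration)."""
    rng = random.Random(seed)
    main = [rng.gauss(0.0, 1.0) for _ in range(n_main)]     # dominant mechanism
    other = [rng.gauss(10.0, 1.0) for _ in range(n_other)]  # second mechanism
    return main, other


def robust_scores(values):
    """Distance of each value from the median, in units of the scaled MAD."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    scale = 1.4826 * mad  # makes the MAD comparable to a standard deviation
    return [abs(v - med) / scale for v in values]


main, other = simulate_two_mechanisms()
scores = robust_scores(main + other)

threshold = 3.5  # assumed cut-off; other values are also in use
flagged_main = sum(s > threshold for s in scores[:len(main)])
flagged_other = sum(s > threshold for s in scores[len(main):])

print(f"flagged from dominant mechanism: {flagged_main} of {len(main)}")
print(f"flagged from second mechanism:   {flagged_other} of {len(other)}")
```

Under these assumptions the observations from the second mechanism are flagged as a group, even though each of them is a perfectly ordinary draw from its own distribution.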
Univariate outlier detection
Outliers in boxplots
The boxplot is a graphical display used in exploratory data analysis, in which outliers are marked individually. A boxplot can show both types of outliers described above, mild and severe.
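The text does not state the thresholds that separate mild from severe outliers. A minimal sketch, assuming the widely used boxplot convention of fences at 1.5 and 3 times the interquartile range (IQR) beyond the quartiles, could look like the following. The function name `classify_outliers` and the quartile method (`inclusive`) are illustrative choices; other quartile definitions give slightly different fences.

```python
import statistics


def classify_outliers(values):
    """Split values into mild and severe outliers using IQR-based fences.

    Assumed convention: mild outliers lie beyond 1.5 * IQR from the
    quartiles, severe outliers beyond 3 * IQR.
    """
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    inner_low, inner_high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outer_low, outer_high = q1 - 3.0 * iqr, q3 + 3.0 * iqr

    mild, severe = [], []
    for v in values:
        if v < outer_low or v > outer_high:
            severe.append(v)
        elif v < inner_low or v > inner_high:
            mild.append(v)
    return {"mild": mild, "severe": severe}


data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 20, 40]
result = classify_outliers(data)
print(result)
```

For this sample the quartiles are 3.5 and 8.5, so the inner upper fence is 16 and the outer upper fence is 23.5. The value 20 falls between the two fences, and 40 lies beyond the outer one.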
Multivariate outlier detection
Outliers can be detected by computing the distance between each observation and the central point of the data, by means of an iterative algorithm: