
Grid-Based Method For Distance-Based Outlier Detection in Data Mining

Outlier detection is an important data mining task with a variety of applications, including the detection of credit card fraud, criminal activity, and unusual trends in datasets. Its goal is to find unusual behaviour in a given data collection. Detected anomalies can be used to forecast upcoming events, to clarify the consequences of a situation, or to improve the affected system.

Distance-Based Outlier:

Statistics is one of the fields in which outlier identification is actively researched. Hawkins gives an intuitive characterisation of an outlier (the Hawkins outlier): an outlier is an observation that differs so significantly from other observations that it raises the suspicion that it was produced by a different mechanism.
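The Hawkins definition is intuitive rather than computational. One common way to turn it into a distance-based score, shown here only as a minimal sketch, is to rank each point by the distance to its k-th nearest neighbour: the larger that distance, the more isolated the point. The function name `kth_neighbor_distance`, the value of `k`, and the sample points are illustrative assumptions and do not come from the article.

```python
import math

def kth_neighbor_distance(points, k):
    """Return, for every point, the distance to its k-th nearest other point.

    Assumption: Euclidean distance and a brute-force scan over all pairs.
    """
    scores = []
    for i, p in enumerate(points):
        # Distances from p to every other point, smallest first.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores

# Four points on a unit square plus one point far away (made-up data).
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (8.0, 8.0)]
scores = kth_neighbor_distance(points, k=2)

# The point with the largest score is the strongest outlier candidate.
most_outlying = max(range(len(points)), key=scores.__getitem__)
```

This brute-force scan compares every pair of points, so its cost grows quadratically with the size of the dataset.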



Grid-Based Method for Distance-Based Outlier Detection:

A grid-based outlier detection algorithm prunes away the portions of the dataset that are safe, that is, known to contain no outliers. The points that differ from the rest of the data can then be located at a later stage with a nearest-neighbour strategy applied only to what remains. Rather than running the distance-based nearest-neighbour search over all of the given data, the method uses a simple grid structure to filter out the safe regions first, which lowers the overall cost of computation.
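The pruning idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the article's exact algorithm: a point is treated as an outlier when fewer than `min_neighbors` other points lie within `radius` of it, and the cell side is set to `radius / (2 * sqrt(d))` so that any two points in the same or adjacent cells are guaranteed to be within `radius` of each other. The names `grid_outliers`, `radius`, and `min_neighbors` are hypothetical.

```python
import math
from collections import defaultdict
from itertools import product

def grid_outliers(points, radius, min_neighbors):
    """Return sorted indices of points with fewer than min_neighbors
    other points within radius, using a grid to skip safe cells."""
    dim = len(points[0])
    # With this side length, points in the same or adjacent cells
    # are at most `radius` apart.
    side = radius / (2 * math.sqrt(dim))

    # Step 1: assign every point to a grid cell.
    cells = defaultdict(list)
    for idx, p in enumerate(points):
        cells[tuple(math.floor(c / side) for c in p)].append(idx)

    # Step 2: prune cells whose own and adjacent population already
    # guarantees enough neighbours for every point they contain.
    offsets = list(product((-1, 0, 1), repeat=dim))
    candidates = []
    for key, members in cells.items():
        nearby = sum(
            len(cells.get(tuple(k + o for k, o in zip(key, off)), ()))
            for off in offsets
        )
        if nearby - 1 < min_neighbors:  # minus 1: exclude the point itself
            candidates.extend(members)

    # Step 3: exact distance check, but only for the surviving candidates.
    outliers = []
    for i in candidates:
        count = sum(
            1 for j, q in enumerate(points)
            if j != i and math.dist(points[i], q) <= radius
        )
        if count < min_neighbors:
            outliers.append(i)
    return sorted(outliers)

# A dense 5x5 cluster plus one isolated point (made-up data).
data = [(0.1 * i, 0.1 * j) for i in range(5) for j in range(5)] + [(10.0, 10.0)]
result = grid_outliers(data, radius=1.0, min_neighbors=3)
```

In this example all 25 cluster points are discarded in the pruning step, so the exact distance check runs for the single isolated point only.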



Algorithm for Grid-Based Mining Stream Outlier:

This algorithm can be broken down into three simple steps.

Requirements for algorithm:

Let N be the number of grid cells, together with the following:

 

Steps:

These are the steps to follow to determine the outlier degree.
