Data Mining For Intrusion Detection and Prevention

Last Updated : 22 Sep, 2021

The security of our computer systems and data is at continual risk. The extensive growth of the Internet and the increasing availability of tools and tricks for intruding and attacking networks have prompted intrusion detection and prevention to become a critical component of networked systems.

Intrusion

An intrusion is unauthorized access by an intruder who steals valuable resources and misuses them, for example by means of worms and viruses. Prevention techniques such as user authentication and sharing information in encrypted form exist, but they are not sufficient on their own because systems are becoming more complex day by day, so an additional layer of security controls is needed.

Intruder

An intruder is an entity that tries to gain unauthorized access to a system or a network. A successful intrusion can corrupt the data held on that system and disrupt the environment of that network.

Intruders are mainly of two types:

Masquerader (Outside Intruder) – has no authority to use the network or system
Misfeasor (Inside Intruder) – has authorized access, but only to limited applications

Intrusion detection system

An Intrusion Detection System (IDS) is a device or an application that monitors traffic, detects unusual activity, and reports its findings to an administrator, but cannot itself take action to prevent that activity. It helps protect the confidentiality, integrity, and availability of data and information systems from internet attacks. As networks expand dynamically, the risk of malicious intrusions grows with them.

The main types of attacks detected by intrusion detection systems:

Scanning attacks
Denial of service (DoS) attacks
Penetration attacks

Fig.2 Architecture of IDS

Intrusion prevention system

An Intrusion Prevention System (IPS) is essentially an extension of the Intrusion Detection System. It protects the system from suspicious activities, viruses, and threats, and once any unwelcome activity is identified, the IPS also takes action against it, such as closing access points and reconfiguring firewalls.

The majority of intrusion detection and prevention systems use either signature-based detection or anomaly-based detection.

1. Signature-Based – A signature-based system is a pattern-matching system. It monitors the packets on the network and compares them against a library of signatures, or attack patterns, taken from known attacks. When a packet matches a signature, the system detects the intrusion, alerts the administrator, and can take preventive action such as blocking the IP address or deactivating the user account so that it can no longer access the application. Example: antivirus software.

Advantage

Effective at detecting known attacks.

Disadvantage

Fails to identify new or unknown attacks.
Requires regular updates of the signature library as new attacks appear.
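The matching step described for signature-based detection can be sketched in a few lines of Python. This is a minimal illustration, not how any particular product works: the `Signature` class, the two entries in `SIGNATURES`, and the `inspect_packet` helper are all hypothetical names invented for this example, and the match is a plain substring search.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass(frozen=True)
class Signature:
    name: str
    pattern: bytes  # byte sequence that identifies a known attack


# Hypothetical signature library; the two entries are illustrative only.
SIGNATURES = [
    Signature("sql-injection", b"' OR '1'='1"),
    Signature("path-traversal", b"../../"),
]


def match_signatures(payload: bytes, signatures: List[Signature] = SIGNATURES) -> List[str]:
    """Return the names of all signatures whose pattern occurs in the payload."""
    return [sig.name for sig in signatures if sig.pattern in payload]


def inspect_packet(src_ip: str, payload: bytes, blocked: Set[str]) -> List[str]:
    """Check one packet against the library and block the sender on a match."""
    hits = match_signatures(payload)
    if hits:
        blocked.add(src_ip)  # prevention step: block the offending IP address
    return hits
```

A payload that matches no entry passes through untouched, which is exactly why this approach cannot catch an attack whose signature has not yet been added to the library.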

2. Anomaly-Based – An anomaly-based system watches for abnormal activity. The system is first trained on a suitable baseline of normal behavior, and observed activity is then compared against that baseline. Any activity that crosses the baseline is treated as suspicious: an alert is sent to the administrator, and the system can block access to the target host immediately.

Advantage

Ability to detect unknown attacks.

Disadvantage

Higher complexity; attacks can sometimes be difficult to detect, and there is a greater chance of false alarms.
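The baseline idea behind anomaly-based detection can be sketched as follows. This is a simplified illustration under stated assumptions: a single numeric feature (for example, requests per minute), a baseline summarized by its mean and standard deviation, and a hypothetical `BaselineDetector` class with an assumed default threshold of three standard deviations. Real systems model many features at once.

```python
from statistics import mean, stdev
from typing import List


class BaselineDetector:
    """Flags values that deviate too far from a baseline learned on normal traffic."""

    def __init__(self, threshold: float = 3.0):
        # Allowed deviation in standard deviations (assumed value for this sketch).
        self.threshold = threshold
        self.mu = 0.0
        self.sigma = 1.0

    def fit(self, normal_values: List[float]) -> None:
        """Learn the baseline from observations assumed to be attack-free."""
        self.mu = mean(normal_values)
        # Fall back to 1.0 so constant training data does not cause division by zero.
        self.sigma = stdev(normal_values) or 1.0

    def score(self, value: float) -> float:
        """Distance of a value from the baseline, in standard deviations."""
        return abs(value - self.mu) / self.sigma

    def is_anomalous(self, value: float) -> bool:
        return self.score(value) > self.threshold
```

The threshold is where the false-alarm trade-off shows up: lowering it catches more unusual activity but also flags more legitimate traffic.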

Data mining is the process of extracting patterns from huge datasets by combining techniques from statistics and artificial intelligence with database management. The rest of this article looks at a few ways in which data mining can be used in intrusion detection systems (IDS) and intrusion prevention systems (IPS).

How does data mining help in Intrusion detection and prevention

Modern network technologies require a high level of security controls to ensure safe and trusted communication of information between the user and a client. An intrusion detection system protects the system when traditional technologies fail. Data mining is the extraction of appropriate features from a large amount of data, and it supports various learning algorithms, both supervised and unsupervised. Intrusion detection is essentially a data-centric process, so with the help of data mining algorithms an IDS can learn from past intrusions, improve its performance with experience, and find unusual activities. Data mining also helps in exploring rapidly growing databases and gathering only the valid information by improving segmentation, which helps organizations plan in real time and save time. Its applications include detecting anomalous behavior, fraud and abuse, and terrorist activities, and investigating crimes through lie detection. Below is a list of areas in which data mining technology can be applied to intrusion detection.

Using data mining algorithms for developing a new model for IDS: Data mining algorithms can give an IDS model a higher efficiency rate and fewer false alarms, and they can be used for both signature-based and anomaly-based detection. In signature-based detection, training data is labeled as either “normal” or “intrusion.” A classifier can then be derived to detect known intrusions. Research in this area has included the application of classification algorithms, association rule mining, and cost-sensitive modeling. Anomaly-based detection builds models of normal behavior and automatically detects significant deviations from it. Methods include the application of clustering, outlier analysis, classification algorithms, and statistical approaches. The techniques used must be efficient and scalable, and able to handle network data of high volume, dimensionality, and heterogeneity.
Analysis of stream data: Analyzing stream data means analyzing data continuously as it arrives, whereas data mining is mostly applied to static data because of its complex calculations and high processing time. Due to the dynamic nature of intrusions and malicious attacks, it is all the more important to perform intrusion detection in the data stream environment. Moreover, an event can be normal on its own but considered malicious when viewed as part of a sequence of events. Thus, it is necessary to study which sequences of events are frequently encountered together, find sequential patterns, and identify outliers. Other data mining methods for finding evolving clusters and building dynamic classification models in data streams are also necessary for real-time intrusion detection.
Distributed data mining: It is used to analyze data that is inherently distributed across various databases, which makes integrated processing of the data difficult. Intrusions may be launched from several different locations and targeted at many different destinations. Distributed data mining methods can be used to analyze network data from several network locations in order to detect these distributed attacks.
Visualization tools: These tools represent the data in the form of graphs, which helps the user gain a visual understanding of the data. They are also used for viewing any anomalous patterns detected. Such tools may include features for viewing associations, discriminative patterns, clusters, and outliers. Intrusion detection systems should also have a graphical user interface that allows security analysts to pose queries regarding the network data or intrusion detection results.
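The point made under stream analysis, that an event can be harmless alone yet suspicious as part of a sequence, can be illustrated with a small sketch. It learns which short event sequences occur in historical sessions and then reports sequences in a live stream that were never (or rarely) seen. The function names, the event labels used with them, and the choice of three-event windows are assumptions made for this example, not part of any specific IDS.

```python
from collections import Counter, deque
from typing import Deque, Iterable, Iterator, List, Tuple


def ngrams(events: Iterable[str], n: int) -> Iterator[Tuple[str, ...]]:
    """Yield every run of n consecutive events using a sliding window."""
    window: Deque[str] = deque(maxlen=n)
    for event in events:
        window.append(event)
        if len(window) == n:
            yield tuple(window)


def learn_patterns(sessions: Iterable[List[str]], n: int = 3) -> Counter:
    """Count how often each n-event sequence occurs in historical sessions."""
    counts: Counter = Counter()
    for session in sessions:
        counts.update(ngrams(session, n))
    return counts


def rare_sequences(stream: Iterable[str], patterns: Counter,
                   n: int = 3, min_count: int = 1) -> Iterator[Tuple[str, ...]]:
    """Yield sequences from a stream that were seen fewer than min_count times."""
    for seq in ngrams(stream, n):
        if patterns[seq] < min_count:
            yield seq
```

Because `ngrams` is a generator that holds only the last `n` events, the same code can consume an unbounded stream one event at a time. For instance, if history contains sessions with a single failed login followed by a successful one, a stream with three failed logins in a row would be reported even though each individual event appears in normal traffic.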