Mathematical Algorithms & Anomaly DETECTION Peer Reviewed Journal

Pages: 19 (5790 words)  ·  Style: IEEE  ·  Bibliography Sources: 25  ·  File: .docx  ·  Level: Doctorate  ·  Topic: Mathematics


[. . .] This means that an attacker who gains access to the monitored system can disable or mislead the intrusion detection system. Moreover, although the data used by such systems is usually context-rich, it comes with added costs: host access is required, distributed clients must be configured, and huge and potentially critical datasets must be gathered from the hosts and managed [1]. In contrast, network-based IDSs are often located on separate devices. They usually sit upstream in the network architecture and are designed to monitor many separate systems on the same network. Network-based IDSs are usually fully isolated from the systems they monitor, so they are much less likely to be accessed and interfered with than HIDSs. However, because such systems are isolated, they gather very little information on the systems they monitor, which makes it difficult to detect changes or abnormal events.

Hybrid distributed intrusion detection systems combine data (both network and host-based data) into a single system. This provides more visibility of the monitored system. The combined data is usually fed into a single decision making algorithm or alerting process.

Some hybrid distributed intrusion detection systems utilize virtual resources (cloud computing systems) to monitor multiple components of a network at the same time. This comes at the cost of visibility and capability, but it provides isolation in case the host is compromised. Moreover, monitoring at an upstream level covers multiple systems on a network, giving a clear picture of what is happening across the network. Virtual intrusion detection systems, more commonly called virtual machine monitor IDSs (VMM IDSs), generally run outside the monitored host but on the same physical machine [11].

Some of the intrusion detection systems discussed above have been used in cloud systems. The cloud systems generally included the use of both host-based and network-based data. And while the systems generally rely on virtual machines, they do not necessarily utilize the same methods as virtual machine monitor IDSs.

Hybrid intrusion detection systems can be classified as either OS-level intrusion detection systems or program-level intrusion detection systems. The program-level ones generally monitor a single application using information such as dynamic or static control flow, system calls invoked, byte code, source code, or any other information on the state of the application. Most program-level hybrid intrusion detection systems focus on malware detection and vulnerability detection. By extension, they also detect anomalies and intrusions in applications.

The OS-level intrusion detection systems monitor entire networks and exist to identify abnormal patterns at the operating system level. This often entails gathering data from file system monitoring, system calls invoked, Windows Registry data, system logs, and/or other sources. Many such systems use system calls very efficiently to detect abnormal behavior. While program-level intrusion detection systems use the system call patterns of a single application, OS-level intrusion detection systems usually combine the patterns of different applications so that the system calls of all applications and processes can be monitored. The traces of system calls are used to identify repeated patterns and enable the detection of anomalies [12].
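As an illustration of how repeated patterns in system call traces can support anomaly detection, the following minimal sketch builds a profile of short call sequences (n-grams) from normal traces and scores a new trace by the fraction of its sequences never seen before. The window length, the function names, and the example traces are assumptions made for this sketch and are not taken from [12].

```python
from typing import Iterable, List, Set, Tuple

Window = Tuple[str, ...]

def ngrams(trace: List[str], n: int) -> Iterable[Window]:
    """Yield every contiguous window of n system calls in the trace."""
    for i in range(len(trace) - n + 1):
        yield tuple(trace[i:i + n])

def build_profile(normal_traces: List[List[str]], n: int = 3) -> Set[Window]:
    """Collect all n-grams observed in traces assumed to be normal."""
    profile: Set[Window] = set()
    for trace in normal_traces:
        profile.update(ngrams(trace, n))
    return profile

def mismatch_rate(trace: List[str], profile: Set[Window], n: int = 3) -> float:
    """Fraction of the trace's n-grams that do not occur in the profile."""
    windows = list(ngrams(trace, n))
    if not windows:
        return 0.0  # trace shorter than the window: nothing to compare
    unseen = sum(1 for w in windows if w not in profile)
    return unseen / len(windows)

# Hypothetical traces, invented for illustration only.
normal = [
    ["open", "read", "read", "close"],
    ["open", "read", "write", "close"],
]
profile = build_profile(normal, n=3)

benign_rate = mismatch_rate(["open", "read", "read", "close"], profile)
suspicious_rate = mismatch_rate(["open", "mmap", "execve", "close"], profile)
```

A trace whose windows all appear in the profile gets a rate of 0.0, while a trace made up entirely of unseen sequences gets 1.0. In practice a threshold between the two would have to be tuned on real data.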

Lastly, with regard to IDS location, side-channel intrusion detection systems, which use physical characteristics such as timings, vibrations, electromagnetic radiation, and power consumption, are gaining popularity in cybersecurity intrusion detection research [13]; this research focuses particularly on physical host-level data [14, 15]. The main advantage of side-channel detection systems is their isolation from the hosts, which hinders attackers from tampering with the IDSs.

1.3. Detection Techniques

Signature matching detection identifies attacks by matching data packets against predefined attack signature samples. The matching process usually takes a long time, which gives attackers ample time to do damage [16]. Moreover, only known, predefined attack signatures can be detected with this technique, which is a serious problem considering the growing amount of self-modifying malware and the pattern detection evasion methods used by attackers.
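The limitation described above can be sketched in a few lines. The example below assumes a toy signature database of byte patterns (the names and patterns are invented for illustration and do not come from any real IDS rule set) and shows that a trivially modified payload no longer matches.

```python
from typing import Dict, List

# Hypothetical signature database: name -> byte pattern to search for.
SIGNATURES: Dict[str, bytes] = {
    "example-shell-download": b"wget http://",
    "example-sql-injection": b"' OR '1'='1",
}

def match_signatures(payload: bytes,
                     signatures: Dict[str, bytes] = SIGNATURES) -> List[str]:
    """Return the names of all signatures whose pattern occurs in the payload."""
    return [name for name, pattern in signatures.items() if pattern in payload]

# A payload containing a known pattern is detected...
known = b"GET /login?user=admin' OR '1'='1 HTTP/1.1"
known_hits = match_signatures(known)

# ...but a slightly altered variant of the same attack is missed.
variant = b"GET /login?user=admin' OR '2'='2 HTTP/1.1"
variant_hits = match_signatures(variant)
```

Here `known_hits` contains the SQL injection signature while `variant_hits` is empty, which is the evasion problem that motivates anomaly based detection.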

Anomaly based detection improves on signature matching detection in the sense that it reduces the need to define and update predefined attack signature databases in the IDS. It uses statistical methods to model normal behavior and to flag variations from it, which allows for the detection of previously undefined or unknown attacks. However, while such systems are conceptually sound, they have not been widely adopted because they usually report relatively high numbers of false alarms, which makes them difficult to use [17, 16]. Reducing the number of false positives is thus key to making anomaly based detection IDSs practical and convenient to use.
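One simple way to picture a statistical model of normal behavior is a z-score test against a baseline. This is only an illustrative sketch: the choice of a z-score, the threshold of 3, and the request counts are assumptions made here, not the method of the cited works.

```python
import statistics
from typing import List, Tuple

def fit_baseline(values: List[float]) -> Tuple[float, float]:
    """Estimate the mean and sample standard deviation of normal behavior."""
    return statistics.mean(values), statistics.stdev(values)

def is_anomalous(value: float, mean: float, std: float,
                 threshold: float = 3.0) -> bool:
    """Flag a value lying more than `threshold` standard deviations from the mean."""
    if std == 0:
        return value != mean
    return abs(value - mean) / std > threshold

# Hypothetical requests-per-minute observed during normal operation.
baseline = [98, 102, 100, 101, 99, 100, 103, 97]
mean, std = fit_baseline(baseline)

typical = is_anomalous(104, mean, std)       # within 3 standard deviations
attack_like = is_anomalous(150, mean, std)   # far outside the baseline
rare_benign = is_anomalous(107, mean, std)   # rare, but not necessarily an attack
```

The last case shows where false alarms come from: a value of 107 is flagged even though a rare burst of legitimate traffic could easily produce it.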

According to Rehak [18], anomaly based detection false positives can be divided into unstructured and structured false positives. The former are random noise resulting from the stochasticity of network traffic, while the latter are the result of regular but abnormal behavior, found particularly on smaller network hosts such as DNS and mail servers. This work proposes the use of mathematical algorithms to reduce unstructured false positives and thereby assist with anomaly detection.

1.4. Classification of False Positives

For the purpose of this work, it is assumed that the network anomaly IDS monitors several types of network events (for example, HTTP connections, NetFlow [19], etc.) generated by different hosts in a network. The IDS maintains an internal scoring system that assigns either a one or a zero to each event, where zero indicates a normal event and one indicates a possible attack. It is also assumed that malicious activities have different statistical characteristics from ordinary ones [16]. As indicated before, anomaly detection IDSs produce a huge number of false positives because most rare events are not the result of attacks. According to Rehak [18], false positives are categorized into structured and unstructured false positives.

· Unstructured false positives are often short term and are usually distributed across network hosts in proportion to each host's share of traffic volume. They are usually triggered by behaviors that are uniformly distributed (e.g. browsing the web) and are modeled in this work as white noise (zero mean and finite variance) added to the output of an anomaly detector. Thus, if an event $x_i$ has an anomaly score of $y_i$, the score is given by

$$y_i = g(x_i) + \eta_i \qquad (1)$$

where $g(x_i)$ represents the real anomaly score for the event and $\eta_i$ represents the white noise that hides the true anomaly value.

· Structured false positives are often long-term, regular but abnormal behaviors, found particularly on smaller network hosts. They are flagged as anomalies because they differ from the behavior of most hosts. Examples of structured false positives include uncommon network APIs making regular calls and software updates by uncommon applications [16]. Because structured false positives often come from only a few network hosts and their behavior is regular, they can usually be identified and eliminated using white lists. However, the white lists usually have to be very specific to the structured false positives in question, which makes them hard to create prior to deployment. The mixture distribution below defines these false positives


where each mixture component $j$ corresponds to one structured false positive and has its own weight. Each structured false positive value has variance when compared to unstructured false positive values, but the means of the components generally differ from one another.
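The white list approach to structured false positives can be sketched as a filter keyed on the specific host and behavior. The alert fields, addresses, and behavior labels below are hypothetical and serve only to show why such lists must be very specific and are hard to prepare before deployment.

```python
from typing import List, NamedTuple, Set, Tuple

class Alert(NamedTuple):
    host: str        # address of the host that generated the event
    behavior: str    # hypothetical label for the kind of behavior observed
    score: float     # anomaly score reported by the detector

# Hypothetical white list of (host, behavior) pairs known to be benign.
WHITELIST: Set[Tuple[str, str]] = {
    ("10.0.0.53", "dns-zone-transfer"),
    ("10.0.0.25", "nightly-update-check"),
}

def filter_alerts(alerts: List[Alert],
                  whitelist: Set[Tuple[str, str]] = WHITELIST) -> List[Alert]:
    """Drop alerts whose exact (host, behavior) pair is white-listed."""
    return [a for a in alerts if (a.host, a.behavior) not in whitelist]

alerts = [
    Alert("10.0.0.53", "dns-zone-transfer", 0.93),   # regular but unusual: suppressed
    Alert("10.0.0.77", "dns-zone-transfer", 0.91),   # same behavior, other host: kept
    Alert("10.0.0.25", "nightly-update-check", 0.88),
]
remaining = filter_alerts(alerts)
```

Only the alert from the host that is not listed survives. An entry keyed on the behavior alone would also have suppressed it, which is the reason the entries need to be this narrow.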

2. Proposed Method

The proposed LAMS (local adaptive multivariate smoothing) method replaces the anomaly detector’s output with the mean anomaly score of comparable previous events, where the similarity (context) between two events $x$ and $x'$ is captured by a kernel $K(x, x')$. This smooths the anomaly detector’s output and thus significantly reduces the rate of unstructured false alarms. Mathematically, the smoothing can be written as

$$\hat{y}(x) = \frac{\sum_{i=1}^{n} K(x, x_i)\, y_i}{\sum_{i=1}^{n} K(x, x_i)} \qquad (3)$$

where $\{x_i\}_{i=1}^{n}$ is the set of observed network events, $\{y_i\}_{i=1}^{n}$ is the corresponding set of anomaly detector outputs, and $\hat{y}(x)$ is the expected anomaly score of event $x$. The space $\mathcal{X}$ on which the smoothing is defined can be arbitrary. However, Euclidean space with the Gaussian kernel, defined as $K(x, x') = \exp\left(-\|x - x'\|^2 / (2h^2)\right)$, is the most commonly used combination. In this definition, $h$ parameterizes the kernel’s width.

The estimator (3) is commonly referred to as the Nadaraya-Watson estimator; it is a non-parametric estimator of the conditional expectation of a random variable [9, 10, 16].
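A minimal sketch of the Nadaraya-Watson smoothing step is given below. It assumes events are represented as fixed-length feature vectors and that the Gaussian kernel takes the common form exp(-||x - x'||^2 / (2h^2)); the function names, feature values, and scores are invented for illustration and are not the authors' implementation.

```python
import math
from typing import List, Sequence

def gaussian_kernel(x: Sequence[float], x_prime: Sequence[float], h: float) -> float:
    """Gaussian similarity between two events; h sets the kernel width (assumed form)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, x_prime))
    return math.exp(-sq_dist / (2.0 * h * h))

def smooth_score(x: Sequence[float],
                 past_events: List[Sequence[float]],
                 past_scores: List[float],
                 h: float = 1.0) -> float:
    """Kernel-weighted mean of the anomaly scores of previous events similar to x."""
    weights = [gaussian_kernel(x, xi, h) for xi in past_events]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("no previous event is close enough to x for this kernel width")
    return sum(w * y for w, y in zip(weights, past_scores)) / total

# Hypothetical history: two events near the origin and one far away.
past_events = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0)]
past_scores = [0.1, 0.3, 0.9]

# The query lies midway between the two nearby events, so the smoothed score
# is essentially their average; the distant event has a negligible weight.
smoothed = smooth_score((0.05, 0.0), past_events, past_scores, h=0.5)
```

With these values `smoothed` is approximately 0.2. A smaller `h` makes the estimate depend on fewer, closer events, while a larger `h` averages over a wider context.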

The statements below explain how smoothing reduces unstructured and structured false alarms:

· Unstructured false alarms are significantly reduced by the estimator defined above, because it averages the scores of similar events. Because the scores of unstructured false alarms take the form $y_i = g(x_i) + \eta_i$, it has been proven, under several assumptions, that the smoothed estimate $\hat{y}(x_i)$ converges to $g(x_i)$ as the number of observed events grows (Devroye et al. [20]). This can be expressed mathematically as

$$\lim_{n \to \infty} \hat{y}(x_i) = g(x_i)$$
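The averaging effect can be checked numerically with a small simulation. The sketch below assumes one-dimensional events drawn uniformly from [0, 1], a hypothetical true score g(x) = x, Gaussian white noise, and a Gaussian kernel of width h = 0.1; all of these are choices made for illustration only.

```python
import math
import random
from typing import Dict, List

def g(x: float) -> float:
    """Hypothetical true anomaly score, chosen only for this demonstration."""
    return x

def nw_estimate(x: float, xs: List[float], ys: List[float], h: float) -> float:
    """Kernel-weighted mean of noisy scores around x (Gaussian kernel, width h)."""
    weights = [math.exp(-((x - xi) ** 2) / (2.0 * h * h)) for xi in xs]
    return sum(w * y for w, y in zip(weights, ys)) / sum(weights)

def simulate(n: int, x: float = 0.5, h: float = 0.1,
             sigma: float = 1.0, seed: int = 0) -> float:
    """Absolute error of the smoothed score at x when n noisy events are available."""
    rng = random.Random(seed)
    xs = [rng.random() for _ in range(n)]
    ys = [g(xi) + rng.gauss(0.0, sigma) for xi in xs]  # y_i = g(x_i) + noise
    return abs(nw_estimate(x, xs, ys, h) - g(x))

# Error of the smoothed score for increasingly large event histories.
errors: Dict[int, float] = {n: simulate(n) for n in (50, 500, 5000)}
```

Although each individual score carries noise with a standard deviation of 1.0, the error of the smoothed score with several thousand events is typically a small fraction of that. Any single run remains random, so the errors need not shrink monotonically.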




Cite This Peer Reviewed Journal:

APA Format

Mathematical Algorithms & Anomaly DETECTION.  (2019, May 31).  Retrieved January 27, 2020, from
