Designing Lightweight Intrusion Detection Systems

This chapter presents some of the challenges in designing efficient and lightweight intrusion detection systems, with the goal of providing high accuracy, a low false alarm rate, and a reduced number of features.

EN: Smells like an "in the weeds" approach to a very specific task ... will probably elide much.


An "intrusion detection system" (IDS) for a computer is very much like a security guard at a physical location: its task is to permit access to those authorized to enter, but to raise an alarm when access is attempted by an unauthorized party. An IDS may also detect instances in which a recognized user engages in activities for which he is not authorized (in that sense, he is equivalent to an unauthorized user).

Preventing unauthorized access, or reacting to it after the fact, is often closely linked to the IDS, and some solutions include these functions as well, but they are complementary to the act of detecting intrusion.

A core assumption of IDS is that intrusive activities are noticeably different from normal activities, and thus detectable. For example, a legitimate user will typically enter his password once to access the system, though he may fat-finger it and have to try a couple of times; but after the third failed attempt, there is an increasing likelihood that the individual is not an authorized user, and is attempting to intrude into a system (function or data) to which he is not authorized.
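The failed-login heuristic above can be sketched as a simple per-user counter. This is a minimal illustration: the three-attempt threshold comes from the example in the text, but the function and variable names are hypothetical.

```python
# Minimal sketch of the failed-login heuristic: the three-attempt
# threshold follows the text; all names here are illustrative.
FAILED_ATTEMPT_THRESHOLD = 3

def record_login_attempt(counts, user, success):
    """Track consecutive failed logins; return True when an alarm should fire."""
    if success:
        counts[user] = 0          # a successful login resets the counter
        return False
    counts[user] = counts.get(user, 0) + 1
    return counts[user] > FAILED_ATTEMPT_THRESHOLD

counts = {}
alarms = [record_login_attempt(counts, "alice", ok)
          for ok in (False, False, False, False)]
# alarms -> [False, False, False, True]
```

Note that a real IDS would also age the counter over time, so that occasional typos spread across a day do not accumulate into an alarm.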

Intrusion is of concern because the actions of unauthorized users often cause loss of integrity to the system or data, breach of confidentiality, and potential damage to the organization (for example, embezzlement or fraud).

An IDS may be a combination of software and hardware. Most operate in real time to detect breaches as soon as they occur, though some are designed to run in arrears to discover breaches that occurred in the past (for forensic reasons, or because real time analysis would debilitate system performance).

A number of desirable qualities of IDS are listed: effective detection, flexible tolerances, automation, minimal system impact, efficiency, no false alarms, etc.


Anomaly detection assumes that an intrusion will always reflect some deviation from normal patterns. Anomalies can be detected in the data and files on the system (static) or in the functions that are executed by the user (dynamic). This technique depends upon establishing a baseline profile of normal behavior and comparing any specific incident to that profile to identify it as abnormal. The disadvantage lies in the definition of the baseline: if too tight, there will be many false alarms; if too loose, intrusions may not be detected.

Misuse detection is based on a knowledge of system vulnerabilities and known attack patterns: a person who has several failed login attempts, or who attempts to gain access to certain system files or execute certain commands, would seem to be attempting an attack on the system. The primary disadvantage of this method is that it is based on known attacks, and does not account for the inventiveness of hackers in coming up with new ways to attack a system.
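Misuse detection amounts to matching an audit trail against known attack signatures. A minimal sketch, with invented signatures (the event names and patterns here are purely illustrative, not from the chapter):

```python
# Known attack signatures: named, contiguous sequences of audit events.
# Both signatures below are invented for illustration.
SIGNATURES = {
    "brute_force": ["login_fail", "login_fail", "login_fail"],
    "shadow_read": ["login_ok", "read:/etc/shadow"],
}

def contains_signature(events, pattern):
    """True if `pattern` occurs as a contiguous subsequence of `events`."""
    n = len(pattern)
    return any(events[i:i + n] == pattern for i in range(len(events) - n + 1))

def detect_misuse(events):
    """Return the names of all known attacks whose signature appears.
    The weakness noted above applies: a novel attack matches nothing."""
    return [name for name, pat in SIGNATURES.items()
            if contains_signature(events, pat)]

trail = ["login_fail", "login_fail", "login_fail",
         "login_ok", "read:/etc/shadow"]
detect_misuse(trail)  # -> ['brute_force', 'shadow_read']
```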


The author goes into a painful amount of detail as to the difference between a host-based system (which monitors activity on a single server) versus a network-based system (which monitors all the systems on a given network). He seems to lean toward the latter, as it provides a consistent and comprehensive approach, but notes that host-based systems can be more precise in detecting specific activity on a specific system, and are worthwhile to have in place for highly sensitive systems (payroll, accounting, marketing, etc.).


Data mining uses "semi-automated" methods to discover patterns and associations and develop a statistical model of normal behavior that can be "different" (more precise or comprehensive?) than the model that a human being could conceive or perceive.

This approach has become more widely used as information systems have grown and the amount of data (both resident on systems and collected as users navigate among systems) has become unwieldy for human analysis.
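One way to make the "mined" model concrete: tally which command transitions occur in many recorded user sessions, then score a new session by how many of its transitions the model has never seen. This is my own toy example of the idea, not the statistical model the author describes.

```python
from collections import Counter

def mine_normal_model(sessions):
    """'Mine' a model of normal behavior: frequencies of consecutive
    command pairs across many recorded user sessions."""
    model = Counter()
    for session in sessions:
        model.update(zip(session, session[1:]))
    return model

def surprise(model, session):
    """Count transitions in `session` never seen in the mined model."""
    return sum(1 for pair in zip(session, session[1:]) if model[pair] == 0)

normal_sessions = [["ls", "cd", "ls", "cat"], ["cd", "ls", "cat"]]
model = mine_normal_model(normal_sessions)
surprise(model, ["ls", "cd", "ls"])        # 0: consistent with the model
surprise(model, ["rm", "-rf", "chmod"])    # 2: unfamiliar activity
```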


More granular details about the importance of selecting features and developing algorithms that are efficient - do the least amount of processing on the least amount of data, yet sufficient in both regards to do an accurate and reliable job.
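Since the chapter doesn't say which selection method is used, here is one crude stand-in for the idea of "least data, still sufficient": rank feature columns by variance and drop the uninformative ones before detection runs. The variance criterion is my assumption, not the author's method.

```python
import statistics

def select_features(records, keep=2):
    """Rank feature columns by variance and keep the `keep` most variable.
    A crude filter illustrating feature reduction; the variance
    criterion is an assumed stand-in, not the chapter's algorithm."""
    n_features = len(records[0])
    ranked = sorted(
        range(n_features),
        key=lambda i: statistics.pvariance([r[i] for r in records]),
        reverse=True,
    )
    return sorted(ranked[:keep])

data = [(1, 100, 5), (1, 200, 6), (1, 150, 5), (1, 300, 7)]
select_features(data)  # column 0 is constant, so columns 1 and 2 survive
```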

I'm skipping the mathematical stuff - they provide an equation, and then refer to an experiment performed on about a terabyte of data (over 5 million records), applying 41 separate factors. The result was an execution time of about 14 seconds with 98% accuracy.

They don't go into detail about what the factors were, so I'm not sure what's to be learned by this - except perhaps to use their figures as a benchmark or suggestion.


The author concludes that the best approach to intrusion detection would employ a combination of the two techniques (anomaly and misuse detection), although data mining produces the best "empirical results."