Misuse and Anomaly Detection

Sampath Kannan, Wenke Lee, Insup Lee, Diana Spears, Oleg Sokolsky, William Spears, Linda Zhao

Network Intrusion Detection Systems (NIDS) • An important defense for protecting sensitive information and resources on the network. • Usually provide the following functions: • Observe traffic and extract features. • Pattern-match against a database of “attack signatures” to detect misuse (intrusion). • Observe statistical properties and check them against specifications of correct behavior to detect anomalies.
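The two detection styles listed above can be contrasted in a minimal sketch. Everything here is an assumption made for illustration: the signature names and regular expressions are hypothetical, and a simple standard-deviation test stands in for "specifications of correct behavior".

```python
import re
import statistics

# Hypothetical signature database: signature name -> regex over a payload string.
SIGNATURES = {
    "path-traversal": re.compile(r"\.\./\.\./"),
    "sql-injection": re.compile(r"(?i)union\s+select"),
}

def detect_misuse(payload):
    """Return the names of all signatures that match the payload."""
    return [name for name, pattern in SIGNATURES.items() if pattern.search(payload)]

def detect_anomaly(baseline, observed, threshold=3.0):
    """Flag `observed` if it lies more than `threshold` standard deviations
    from the mean of the baseline measurements (an illustrative stand-in
    for a real specification of normal behavior)."""
    mean = statistics.mean(baseline)
    stdev = statistics.stdev(baseline)
    if stdev == 0:
        return observed != mean
    return abs(observed - mean) / stdev > threshold
```

Misuse detection can only report what is already in the database, while the anomaly check can flag unseen behavior at the cost of false positives.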

Shortcomings of Current NIDS • New attack strategies arise constantly, and attack signature databases become obsolete rapidly. • The volume and interleaving of traffic at the network backbone make complex signature recognition infeasible.

Shortcomings cont’d • Anomaly detection algorithms are primitive. We want techniques that are more scalable yet more sophisticated. • We want to reduce the number of false positives in anomaly detection to make it useful.

Our Approach • Use Machine Learning, Data Mining, and Case-Based Reasoning techniques to learn new intruder models on the fly. • Build a taxonomy of possible anomalies; extract relevant features; use statistical and machine learning techniques to reduce false-alarm rate.

Our Approach – Cont’d • Apply sophisticated algorithms designed in the resource-constrained data stream model to NIDS. • Integrate all of these modules into a MaC-based system architecture.

Existing Infrastructure • Monitoring and Checking (MaC) architecture for run-time monitoring. • User-specified instrumentation of running programs to extract important state changes (Primitive Event Definition Language (PEDL)). • User-specified conversion of these low-level events to abstract events relevant to properties (Meta Event Definition Language (MEDL)). • Checker for processing the abstract event stream to monitor correctness.
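The three stages named above (primitive events, abstract events, checker) can be sketched as a small pipeline. This is plain Python written for illustration only; it is not PEDL or MEDL syntax, and the variable name `auth_ok`, the event names, and the monitored property are all hypothetical.

```python
def abstract_events(primitive_events):
    """Convert low-level state changes into abstract events.

    Each primitive event is a tuple (timestamp, variable name, new value).
    Assumption: the hypothetical variable 'auth_ok' records login outcomes.
    """
    for time, name, value in primitive_events:
        if name == "auth_ok" and value is False:
            yield (time, "login_failed")
        elif name == "auth_ok" and value is True:
            yield (time, "login_ok")
        # All other state changes are irrelevant to the property and dropped.

def check(events, max_failures=3):
    """Return the timestamps at which the property 'at most max_failures
    consecutive failed logins' is violated."""
    alarms = []
    failures = 0
    for time, event in events:
        if event == "login_failed":
            failures += 1
            if failures > max_failures:
                alarms.append(time)
        elif event == "login_ok":
            failures = 0
    return alarms
```

For example, `check(abstract_events(trace))` consumes the trace lazily, so the checker sees only the abstract event stream and never the raw state changes.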

Existing Infrastructure – Cont’d • An experimental test-bed to test performance of Intrusion and Anomaly Detection Systems. • Enhancement of a similar set-up from MIT Lincoln Labs from the 90’s. • Models hacker profiles and taxonomy of attacks and generates “realistic” normal and attack traffic. • Metrics for evaluating potency of attacks.

Using MaC for NIDS • Need multiple Primitive Event Definition Languages (PEDLs) to model different algorithmic techniques for extracting abstract events. • Need dynamically changeable properties as machine learning approaches discover new attack signatures. • Need integration module that combines the results of various modules.

Inferring Mixtures of Markov Chains A theoretical result ... Batu, Guha, Kannan

An example • Network traffic log … each party behaves like a Markov Chain • Some parties are malicious • Can you tease out the malicious chains from a single common log?

Another Example: Browsing habits • You read sports and cartoons. You’re equally likely to read both. You do not remember what you read last. • You’d expect a “random” sequence SCSSCSSCSSCCSCCCSSSSCSC…

Suppose there are two • I like health, entertainment, and fashion • I always read entertainment first, health next and fashion last • The sequence would be EHFEHFEHFEHFEHFEHFEHF…

Two readers, one log file • If there is one log file… • Assume there is no correlation between us. • SECHSSFECSHFESCSSHCFCESCHCCFSESHFESSHFE… • Is there enough information to tell that there are two people browsing? • What are they browsing? • How are they browsing?

Clues in stream? • Yes! (under model assumptions). • H, E, and F have a special relationship. • They cannot belong to different (uncorrelated) people. • Not clear about S and C ... Could be 3 uncorrelated persons. • SECHSSFECSHFESCSSHCFCESCHCCFSESHFESSHFE…
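The clue can be made visible by projecting the log onto a subset of symbols. The sketch below is an illustration of the observation on this slide, not the authors' inference algorithm; the helper names are hypothetical.

```python
STREAM = "SECHSSFECSHFESCSSHCFCESCHCCFSESHFESSHFE"

def project(stream, symbols):
    """Keep only the characters of `stream` that belong to `symbols`."""
    return "".join(ch for ch in stream if ch in symbols)

def has_unique_successors(sequence):
    """True if every symbol in `sequence` is always followed by the same symbol."""
    successor = {}
    for current, nxt in zip(sequence, sequence[1:]):
        # setdefault records the first successor seen and returns it afterwards.
        if successor.setdefault(current, nxt) != nxt:
            return False
    return True

ehf = project(STREAM, "EHF")  # a perfectly regular E, H, F cycle
sc = project(STREAM, "SC")    # no such regularity
```

Restricted to E, H, and F, the log repeats the same three-step cycle, which would be very unlikely if those symbols came from uncorrelated readers. Restricted to S and C, no symbol has a fixed successor, so the log alone does not say whether S and C come from one reader or two.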

Markov Chains as Stochastic Sources • [Figure: a seven-state Markov chain with a transition probability on each edge.] • Output sequence: 1 4 7 7 1 2 5 7 ...
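A Markov chain acts as a stochastic source by emitting the state it visits at each step. The sketch below assumes a chain stored as a dictionary of transition probabilities; the three-state chain is hypothetical and is not the one drawn on the slide.

```python
import random

def sample_chain(transitions, start, length, rng):
    """Emit `length` states (length >= 1) from a Markov chain.

    `transitions` maps each state to a dict {next_state: probability}.
    """
    state = start
    output = [state]
    for _ in range(length - 1):
        next_states = list(transitions[state])
        weights = [transitions[state][s] for s in next_states]
        state = rng.choices(next_states, weights=weights, k=1)[0]
        output.append(state)
    return output

# Hypothetical three-state chain; the probabilities are illustrative only.
CHAIN = {
    1: {2: 0.4, 3: 0.6},
    2: {1: 1.0},
    3: {1: 0.5, 3: 0.5},
}
```

Calling `sample_chain(CHAIN, 1, 8, random.Random(0))` yields an output sequence of eight states; a different seed gives a different sequence from the same source.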

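Markov chains on S, E, C, H, F • Modeled by … • [Figure: a two-state chain on S and C in which every transition, including each self-loop, has probability 1/2; and a three-state cycle E → H → F → E in which every transition has probability 1.]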

Problem Statement (informal) • Two or more probabilistic processes • We are observing interleaved behavior • We do not know which state belongs to which process – cold start.

The Problem • MC1: ... 1 3 2 5 1 4 • MC2: ... 2 6 7 3 1 • Observe: ... 2 6 1 3 2 7 5 3 1 4 1 ... • Infer: MC1, MC2, & mixing parameters
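Reading the slide's sequences as MC1 = 1 3 2 5 1 4, MC2 = 2 6 7 3 1, and the observation as 2 6 1 3 2 7 5 3 1 4 1, the observed stream is an order-preserving merge of the two outputs. The sketch below only checks that forward direction; it is not the inference procedure, which must recover the chains from the observation alone. The function name is hypothetical.

```python
from functools import lru_cache

def is_interleaving(observed, first, second):
    """True if `observed` can be split into two order-preserving
    subsequences equal to `first` and `second`."""
    if len(observed) != len(first) + len(second):
        return False

    @lru_cache(maxsize=None)
    def fits(i, j):
        # i symbols of `first` and j symbols of `second` are already matched.
        k = i + j
        if k == len(observed):
            return True
        if i < len(first) and first[i] == observed[k] and fits(i + 1, j):
            return True
        return j < len(second) and second[j] == observed[k] and fits(i, j + 1)

    return fits(0, 0)

MC1 = (1, 3, 2, 5, 1, 4)
MC2 = (2, 6, 7, 3, 1)
OBSERVED = (2, 6, 1, 3, 2, 7, 5, 3, 1, 4, 1)
```

Memoizing on the pair (i, j) keeps the check polynomial even when both sequences share symbols, as they do here.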

For our problem we assume: • The stream is polynomially long in the number of states of each Markov chain (a long stream may be needed). • C : maximum cover time • Q : upper bound on the denominator of any probability • Nonzero probabilities are bounded away from 0. • Space available is some small polynomial in #states. • Under these assumptions, we can identify individual chains if their state spaces are disjoint.
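Taking cover time in its usual sense (the expected number of steps a chain needs to visit every one of its states), the quantity C can be estimated by simulation. The sketch below is an assumption-laden illustration: the function names are hypothetical, and the two chains are the browsing examples from the earlier slides.

```python
import random

def steps_to_cover(transitions, start, rng, max_steps=100_000):
    """Walk the chain from `start` and return the number of steps taken
    until every state has been visited at least once."""
    unvisited = set(transitions) - {start}
    state = start
    steps = 0
    while unvisited:
        if steps >= max_steps:
            raise RuntimeError("chain not covered within max_steps")
        next_states = list(transitions[state])
        weights = [transitions[state][s] for s in next_states]
        state = rng.choices(next_states, weights=weights, k=1)[0]
        unvisited.discard(state)
        steps += 1
    return steps

def estimate_cover_time(transitions, start, trials=1000, seed=0):
    """Average steps_to_cover over independent walks from `start`."""
    rng = random.Random(seed)
    total = sum(steps_to_cover(transitions, start, rng) for _ in range(trials))
    return total / trials

# The two browsing chains from the earlier slides.
READER_SC = {"S": {"S": 0.5, "C": 0.5}, "C": {"S": 0.5, "C": 0.5}}
READER_EHF = {"E": {"H": 1.0}, "H": {"F": 1.0}, "F": {"E": 1.0}}
```

The deterministic E, H, F cycle is covered in exactly two steps from any state. For the S and C chain the number of steps is random, and the maximum cover time is the largest such expectation over all starting states.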

Research Directions • Many exciting directions • Our research team has expertise in network security, machine learning, AI, real-time systems, and algorithm design • We expect interesting synergies between these strengths.
