2008 IEEE International Conference on Network Protocols. A Model-Based Approach to Security Flaw Detection of Network Protocol Implementation. Yating Hsu, Guoqiang Shu and David Lee, Ohio State University. 20090901 by Mike Hsiao.
Papers • Detecting Communication Protocol Security Flaws by Formal Fuzz Testing and Machine Learning • 28th IFIP WG6.1 International Conference on Formal Techniques for Networked and Distributed Systems (FORTE 2008) • Guoqiang Shu, Yating Hsu, and David Lee • The authors show the feasibility of their approach in this paper. • A Model-Based Approach to Security Flaw Detection of Network Protocol Implementation • 16th IEEE International Conference on Network Protocols, 2008 • Yating Hsu, Guoqiang Shu, and David Lee • The authors investigate a general theory and algorithms for their approach and report experimental results.
Outline A Model-Based Approach to Security Flaw Detection of Network Protocol Implementation • Motivation • Formal Model • Formal Protocol Synthesis • Fuzz Testing Strategy • Experimental Results • Related Works • Conclusion and Future Work
Motivation • Faults can also be introduced during system implementation, so detecting protocol implementation flaws is indispensable. • Most existing approaches resort to random or manual testing. • Much effort has been devoted to analyzing network protocol specifications with formal techniques. • In this paper the authors propose a model-based approach for security flaw detection of protocol implementations: • 1) synthesize an abstract behavioral model (FSM) from a protocol implementation • 2) use it to guide the testing process for detecting security and reliability flaws
Background • A major cause of flaws in protocol implementations is improper handling (e.g., incorrect assumptions) of input data. • buffer overflow, enforcing a lower (unsafe) version, illegitimate queries, … • Given a model, the authors want to construct input sequences that either trigger the implementation's insecure interactions or bring it to unreliable states. • Challenge 1: Black-box implementation • Software security testing methods often look for suspicious operations in source code and derive input data to reach them; a black-box implementation offers no source code to inspect. • Challenge 2: Lack of formal specification • A complete and machine-understandable protocol specification is rarely available, which makes formal protocol testing infeasible.
Fuzz Testing • Fuzz testing works by mutating a portion of the normal input data at the ingress interface of a protocol component in order to reveal unwanted behaviors. • Their key idea • 1) obtain an approximate behavior model of a protocol implementation under test • Such a model is based on presumed knowledge of protocol messages, while its primary function is to describe the states and transitions in a session. • 2) use it to guide input selection for fault coverage • They can design test sequences that achieve formally defined fault coverage criteria with regard to the specification, while intelligently choosing mutated inputs based on the special context in the model.
Workflow of their approach • After network traces are represented by abstract input/output message symbols, a formal behavioral model is synthesized to capture the design aspects of the protocol. • Fault coverage criteria based on a behavioral model have fundamental advantages over criteria derived from protocol message syntax. • For instance, an ACK packet in TCP can trigger different processing logic depending on the current state of the receiver's congestion control mechanism. • Their tool automatically constructs a number of input sequences that cause the implementations to crash under a set of input fuzzing functions.
Formal Model • All valid input messages: MSGI • All valid output messages: MSGO • An implementation B is defined by a function fB: MSGI* → MSGO* • Let AI and AO be finite sets of input and output symbols, respectively. • Denote two abstraction functions α: MSGI → AI and β: MSGO → AO, which map messages to symbols. • Given AI and AO, an approximate model of B is an FSM Mx = <S, s0, AI, AO, fnext, foutput>. • S and s0 are the state set and the initial state. • fnext: S×AI → S; foutput: S×AI → AO. • A trace produced by an FSM is a sequence of input/output message pairs, i.e., tr = {<x1,y1>, <x2,y2>, …}.
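The definitions on this slide can be sketched as a small Mealy-style machine. This is only an illustration: the class name `FSM`, the toy abstraction function `alpha`, and the message and symbol names are assumptions made for the example, not taken from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class FSM:
    """Approximate model Mx = <S, s0, AI, AO, fnext, foutput>."""
    s0: str
    fnext: dict = field(default_factory=dict)    # (state, input symbol) -> next state
    foutput: dict = field(default_factory=dict)  # (state, input symbol) -> output symbol

    def add(self, src, a, out, dst):
        """Register one transition src --a/out--> dst."""
        self.fnext[(src, a)] = dst
        self.foutput[(src, a)] = out

    def run(self, inputs):
        """Return the trace [(x1, y1), (x2, y2), ...] produced from s0."""
        state, trace = self.s0, []
        for a in inputs:
            trace.append((a, self.foutput[(state, a)]))
            state = self.fnext[(state, a)]
        return trace


def alpha(msg):
    """Hypothetical abstraction function: the first token of a text message is its symbol."""
    return msg.split()[0]


# Illustrative two-step session; all names here are made up for the sketch.
m = FSM("s0")
m.add("s0", "VER", "VER_OK", "s1")
m.add("s1", "USR", "USR_OK", "s2")
trace = m.run([alpha("VER 1 MSNP9"), alpha("USR 2 alice")])
```

Keeping fnext and foutput as dictionaries keyed by (state, symbol) mirrors the two functions in the tuple and allows a partial model, where unobserved transitions are simply absent.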
Flaw Detection Problem • The flaw detection process involves three parts • 1) A black-box protocol implementation B. • 2) A predefined predicate GOAL: MSGO* → {true, false} that returns true when an input sequence applied to B from its initial state results in a flaw. • 3) An input sequence selection or construction strategy that reduces the search space to a subset Φ ⊆ MSGI*. • The strategy intelligently and systematically uses a message fuzzing function Z: MSGI* → MSGI* to select a small number of sequences seq ∈ Φ such that GOAL(fB(seq)) = true. • In their experiment, the GOAL predicate checks whether a special output symbol denoting a protocol system crash is found.
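The three parts can be put together in a toy search loop. Everything below is a hypothetical stand-in: `toy_implementation` plays the role of fB, `fuzz_last` is one possible Z, and the "CRASH" symbol and message names are assumptions chosen for the sketch.

```python
def goal(outputs):
    """GOAL: true when the special crash symbol appears among the outputs."""
    return "CRASH" in outputs


def toy_implementation(seq):
    """Hypothetical stand-in for fB: crashes on an over-long message after login."""
    outputs, logged_in = [], False
    for msg in seq:
        if msg == "LOGIN":
            logged_in = True
            outputs.append("OK")
        elif logged_in and len(msg) > 8:
            outputs.append("CRASH")
            break
        else:
            outputs.append("ERR")
    return outputs


def fuzz_last(seq):
    """One possible Z: mutate only the last message (here, repeat its text 4 times)."""
    return seq[:-1] + [seq[-1] * 4]


def find_flaws(phi, f_b):
    """Return every sequence in the reduced search space phi that satisfies GOAL."""
    return [seq for seq in phi if goal(f_b(seq))]


normal_sequences = [["PING"], ["LOGIN", "PING"]]
phi = [fuzz_last(seq) for seq in normal_sequences]   # reduced search space
flaws = find_flaws(phi, toy_implementation)
```

Only the second sequence crashes the toy implementation, because the over-long message matters only after the login step. That is the kind of state dependence the model is meant to expose.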
Passive Synthesis with Partial FSM Reduction • Given a set of traces, ideally the authors want to find a minimized FSM that contains all the traces and only these traces. • However, the problem is NP-hard. • They present an algorithm that constructs in polynomial time a reduced FSM (not necessarily minimized) that contains the given traces and only these traces. • A large number of traces are gathered and preprocessed by removing the session-dependent fields. • Each trace is preprocessed to identify possible loops. • Construct a tree FSM MT that accepts all the resulting traces and only these traces. • Merge equivalent states in the tree FSM and obtain a reduced and equivalent FSM Mx.
Construction Steps (1) • The execution of the target protocol implementation B is monitored for a period of time and all input/output traces are recorded. • Each trace is a sequence of message pairs <xi,yi>, where xi ∈ MSGI and yi ∈ MSGO. • Session-related fields (timestamp, nonce, etc.) should be abstracted away from the messages.
Construction Steps (2) • Since the lengths of the monitored traces are finite, it is impossible to identify loops exactly. • For a trace uababcv, assume that • the sub-trace ab is a loop and • the trace is s0 u s1 a s2 b s1 a s2 b s1 c s3 v s4 • the loop is repeated from state s1. • Each trace is processed to remove all the loops while recording their location and content. • After Step (4), the loops are restored to obtain an FSM model for flaw detection.
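The slides do not say how loops are recognized, so the following is only a heuristic sketch under one assumption: a block of symbols that repeats back-to-back at least twice is treated as a loop. The function name `remove_loops` is hypothetical.

```python
def remove_loops(trace):
    """Strip consecutively repeated blocks; return (reduced trace, recorded loops)."""
    reduced, loops, i = [], [], 0
    while i < len(trace):
        # Try the shortest candidate block first.
        for k in range(1, (len(trace) - i) // 2 + 1):
            block = trace[i:i + k]
            if trace[i + k:i + 2 * k] == block:
                # Record where the loop sits in the reduced trace, then skip all copies.
                loops.append((len(reduced), block))
                while trace[i:i + k] == block:
                    i += k
                break
        else:
            # No repeated block starts here: keep the symbol.
            reduced.append(trace[i])
            i += 1
    return reduced, loops


reduced, loops = remove_loops(list("uababcv"))
```

For the trace uababcv this keeps u, c, v and records that the loop ab sits at position 1, which is the information needed to restore the loop after Step (4).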
Construction Steps (3) • Given a loop-free set of traces from Step (2), construct a tree FSM.
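A tree FSM over loop-free traces is essentially a prefix tree. The sketch below assumes each trace element is a single hashable symbol (in practice it would be an input/output pair) and uses integer state names; `build_tree_fsm` is a hypothetical name.

```python
def build_tree_fsm(traces):
    """Build a prefix-tree FSM: tree[state] maps a trace symbol to the child state."""
    tree = {0: {}}                          # state 0 is the initial state s0
    for trace in traces:
        state = 0
        for symbol in trace:
            if symbol not in tree[state]:   # unseen continuation: grow a new branch
                new_state = len(tree)
                tree[new_state] = {}
                tree[state][symbol] = new_state
            state = tree[state][symbol]
    return tree


# Four short loop-free traces over the symbols a, b, d, e.
traces = [list("abda"), list("abde"), list("ede"), list("eda")]
tree = build_tree_fsm(traces)
```

Traces that share a prefix share a path, so the tree accepts exactly the recorded traces and their prefixes.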
Construction Steps (4) • Given a large number of monitored traces, the resulting tree FSM can be big. • They merge equivalent states in the tree FSM MT to reduce its size.
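One way to sketch the merge is to walk the tree bottom-up and treat two states as equivalent when their sub-trees are identical. That equivalence test is an assumption made for illustration; the paper's exact notion may differ. The tree literal is a small hypothetical example.

```python
def merge_equivalent(tree, root=0):
    """Merge states with identical sub-trees; return (dag, representative of root)."""
    canon = {}   # sub-tree signature -> representative state
    dag = {}     # representative state -> {symbol: representative child}

    def visit(state):
        # Children are reduced first, so signatures compare whole sub-trees.
        children = {sym: visit(child) for sym, child in tree[state].items()}
        signature = tuple(sorted(children.items()))
        if signature not in canon:
            canon[signature] = state
            dag[state] = children
        return canon[signature]

    start = visit(root)
    return dag, start


# Hypothetical tree FSM accepting the traces abda, abde, ede, eda.
tree = {0: {"a": 1, "e": 6}, 1: {"b": 2}, 2: {"d": 3}, 3: {"a": 4, "e": 5},
        4: {}, 5: {}, 6: {"d": 7}, 7: {"e": 8, "a": 9}, 8: {}, 9: {}}
dag, start = merge_equivalent(tree)
```

Here the ten tree states collapse to five, and the result is a DAG rather than a tree because merged states are shared by several paths. Since only identical sub-trees are merged, the set of accepted traces does not change.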
Example (1/2) • 1. Record traces: tr1={a,b,d,a}, tr2={c,c,a,b,d,e}, tr3={e,a,b,a,b,a,b,d,e}, tr4={e,d,a} • 2. Remove loops: tr1={a,b,d,a}, tr2={a,b,d,e}, tr3={e,d,e}, tr4={e,d,a} • 3. Construct a tree FSM
Example (2/2) • 4. Bottom-up, for each sub-tree with height 1, mark equivalent states with the same color, then remove them. • [Figure: the resulting DAG FSM, and the final FSM after the loops are added back]
Coverage Criteria • Given a fuzzing function Z: MSGI* → MSGI*. • In their definition, the fuzzing function mutates only a single input message from a sequence (typically the last one). • The mutated input corresponds to a transition in Mx. • Given a test suite, transition coverage measures how many transitions are tested using this fuzzing function. • Formally, given a set of input sequences of the following form whose last message is to be fuzzed: Seqk: Ik1 Ik2 … IkL(k)-1 IkL(k), and a synthesized model Mx = <S, s0, AI=α(MSGI), AO, fnext, foutput>, the transition coverage function is defined over the transitions of Mx that the fuzzed last messages exercise.
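The exact formula is not reproduced in this transcript, so the sketch below is an assumed reading: coverage is the fraction of transitions in Mx that are exercised by the last (fuzzed) symbol of some sequence. The function name, state names, and symbols are hypothetical.

```python
def transition_coverage(fnext, s0, sequences):
    """Fraction of model transitions reached by the last (fuzzed) symbol of each sequence."""
    covered = set()
    for seq in sequences:
        state = s0
        for a in seq[:-1]:              # replay the unfuzzed prefix on the model
            state = fnext[(state, a)]
        covered.add((state, seq[-1]))   # the transition the fuzzed message targets
    return len(covered & set(fnext)) / len(fnext)


# Hypothetical model with four transitions, given as (state, symbol) -> next state.
fnext = {("s0", "a"): "s1", ("s1", "b"): "s2", ("s1", "c"): "s0", ("s2", "a"): "s0"}
coverage = transition_coverage(fnext, "s0", [["a"], ["a", "b"], ["a", "c"]])
```

The sequences are lists of abstract input symbols, i.e. the images of the messages under α. Two sequences that end with the same symbol in different states count as covering different transitions.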
Experimental Results • MSN IM protocol (version MSNP9) • A text-based protocol. • Only the login/logout procedure is modeled. • A SOCKS v5 proxy is used to intercept incoming and outgoing messages. • The tree FSM has 28 input symbols, 14 output symbols and 94 states. • The reduced FSM has a total of 14 states and 48 transitions.
Fuzz Operators • The authors develop 26 single-message fuzzing functions, which fall into four categories: • Fuzzing data fields (16 functions) • Deleting some or all fields, inserting and repeating fields, and altering the value of a field. • Fuzzing message type (4 functions) • Change the message type to a different (defined or undefined) one. • Repeat or remove the message type. • Intra-session message reordering (3 functions) • Twist the order of the normal message exchange by inserting, repeating, or dropping an intercepted message. • Transition substitution (3 functions) • Substitute an input message with another one that is likely to trigger a different transition from the current state to another state. • Almost all fuzzing functions find crashes.
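A few operators from the first three categories might look like the following for a text-based protocol. The message layout (space-separated fields with the message type first), the function names, and the sample messages are assumptions for illustration, not the authors' 26 functions.

```python
def delete_field(msg, index):
    """Data-field fuzzing: drop one field of a space-separated text message."""
    fields = msg.split(" ")
    return " ".join(fields[:index] + fields[index + 1:])


def repeat_field(msg, index, times=3):
    """Data-field fuzzing: repeat one field several times."""
    fields = msg.split(" ")
    return " ".join(fields[:index] + [fields[index]] * times + fields[index + 1:])


def change_type(msg, new_type):
    """Message-type fuzzing: replace the leading type token (defined or undefined)."""
    _, _, rest = msg.partition(" ")
    return f"{new_type} {rest}" if rest else new_type


def drop_message(session, index):
    """Intra-session fuzzing: remove one intercepted message from the exchange."""
    return session[:index] + session[index + 1:]


fuzzed = [
    delete_field("USR 2 alice", 1),
    repeat_field("USR 2 alice", 2, times=2),
    change_type("USR 2 alice", "XYZ"),
]
shortened = drop_message(["VER", "USR", "OUT"], 1)
```

Transition substitution is left out because it needs the synthesized model to pick a replacement message for the current state.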
Result and Analysis -1 • They find a total of 89 crash instances in the aMSN program and 61 crash instances in GAIM. • aMSN and GAIM are open-source MSN clients. • Each instance represents a unique way of crashing the protocol implementation.
Result and Analysis -2 • They measure the progress of finding new crash instances as the transition coverage metric increases. • Because each pass of execution is guaranteed to test a new transition, the ladder-shaped progress is not surprising. • P.S. Only 35% of the transitions have been covered, since the synthesized model is not restricted to the login/logout phase.
Result and Analysis -3 • Certain types of messages can serve multiple purposes and are processed by different logic (source code): some are immune to the fuzzing functions while others are not. • Messages with different roles are reflected exactly in Mx by multiple transitions with the same input symbol.
Result and Analysis -4 • Comparison with message type coverage (a syntax-based method)
Related Works • The idea of using input fuzzing to detect flaws in communicating systems has been investigated for more than 20 years, and white-box approaches have been predominant. • symbolic execution and binary interception • Existing work on black-box fuzzing is mostly ad hoc, and each effort is coupled with a specific protocol. • D. Song [4] proposes using predicate formulas to characterize an implementation and then using them for error detection and protocol fingerprinting. [4] Towards Automatic Discovery of Deviations in Binary Implementations with Applications to Error Detection and Fingerprint Generation, USENIX Security, 2007.
D. Song: Towards Automatic Discovery of Deviations in Binary Implementations with Applications to Error Detection and Fingerprint Generation, USENIX Security, 2007. • Their approach works on binaries directly, without access to the source code. • By automatically building symbolic formulas from the implementation, their approach is precisely faithful to the implementation. • By solving formulas created from two different implementations of the same specification, their approach significantly reduces the number of inputs needed to find deviations. • Automatic deviation discovery is a challenging task, since deviations usually happen in corner cases. • At a high level, they build two formulas, f1 and f2, which capture how each program processes a single input. Then they check whether the formula (f1 ∧ ¬f2) ∨ (¬f1 ∧ f2) is satisfiable.
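The satisfiability check asks whether some input is accepted by exactly one of the two implementations. The sketch below swaps the symbolic solver for brute-force enumeration over a tiny domain, purely to show what the formula means; `f1` and `f2` are hypothetical predicates, not formulas extracted from real binaries.

```python
def f1(x):
    """Hypothetical implementation 1: accepts lengths 0..100 inclusive."""
    return 0 <= x <= 100


def f2(x):
    """Hypothetical implementation 2: has an off-by-one and rejects 100."""
    return 0 <= x < 100


def deviations(g1, g2, domain):
    """Inputs satisfying (g1 and not g2) or (not g1 and g2), found by enumeration."""
    return [x for x in domain if (g1(x) and not g2(x)) or (not g1(x) and g2(x))]


found = deviations(f1, f2, range(-5, 200))
```

A non-empty result means the formula is satisfiable on this domain, and each element is a concrete input on which the two implementations disagree. Enumeration only works for toy domains, which is why a solver is needed for real message formats.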
Conclusion • This paper proposes a model-based methodology for detecting security and reliability flaws in network protocol implementations. • By using an automatically synthesized formal protocol behavioral model to guide input selection, their method changes the random and manual nature of black-box protocol security testing. • This approach is more effective than the existing syntax-based security testing methods. • A key to flaw detection is to improve the quality of the synthesized specification.
Comments • I did not see any obvious guidance from the model in building the fuzzing functions. • They show that the syntax-based security testing methods are less effective than the model-based method. • The structure of a protocol model need not match the diagram in the specification. • Detecting a flaw is much easier than detecting an attack. • Crash = flaw; then ? = attack.