Anomaly Detection Using Call Stack Information

Security Reading Group, July 2, 2004. Henry Feng, Oleg Kolesnikov, Prahlad Fogla, Wenke Lee, Weibo Gong. Presenter: Jonathan McCune

Overview • Introduction • Related work • VtPath • Experiments • Exploits • Comparisons • Conclusion

Introduction • Runtime, training-phase based, anomaly detection system • Uses call stack and program counter in addition to system call hooks • Novel features based on virtual paths • Compared experimentally and analytically with many other approaches

Related work • Static analysis • Wagner, et al. (abstract stack) • Callgraph • Run-time, training based: • This work (VtPath) • Sekar, et al. (FSA) (last week’s SRG) • N-gram, Var-gram

“Intrusion detection via static analysis.” Wagner, et al. • Static analysis of program source code • NDFA from control-flow graph • Cannot predict branches statically • Impossible path problem • Abstract stack model • Pushdown automaton • Large automata can be resource intensive • Claimed zero false positives

“Detecting Manipulated Remote Call Streams.” Giffin, et al. • Static analysis of binary executables • Tied to platform, not programming language • Insertion of “null calls” mitigates the impossible path problem • Unusual performance analysis

“A fast automaton-based method for detecting anomalous program behavior.” Sekar, et al. • Compact deterministic FSA from system call analysis of running programs • Suffers from false positives, but not as a result of non-determinism • Impossible paths not addressed • DLLs not adequately addressed
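A minimal sketch of the FSA idea, under simplifying assumptions: each state is the program counter (PC) at which a system call was made, and each learned transition is a (previous PC, system call, current PC) triple. The names train_fsa, first_anomaly, and the "START" marker are hypothetical, and the exact transition labeling in Sekar, et al. may differ.

```python
def train_fsa(traces):
    """Learn (prev_pc, syscall, pc) transitions from training traces.

    Each trace is a list of (pc, syscall) pairs, where pc is the program
    counter at which the system call was made.
    """
    transitions = set()
    for trace in traces:
        prev_pc = "START"  # hypothetical marker for the initial state
        for pc, syscall in trace:
            transitions.add((prev_pc, syscall, pc))
            prev_pc = pc
    return transitions


def first_anomaly(transitions, trace):
    """Return the index of the first unseen transition, or None if all are known."""
    prev_pc = "START"
    for i, (pc, syscall) in enumerate(trace):
        if (prev_pc, syscall, pc) not in transitions:
            return i
        prev_pc = pc
    return None
```

Because only the PC of the system call is recorded, two calls into the same library routine from different call sites look identical to this model, which is one reading of the DLL limitation noted on this slide.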

Virtual Path (VtPath) • Run-time analysis • Learns correct behavior via training executions • Utilizes return address information • Generates abstract path and compares with learned behavior • Similar to abstract stack of Wagner, et al., but avoids the pushdown automaton

Virtual Path Example (figure) • Call stack at one system call: main(), foo(), fooA(), fooB(), fooC() • Call stack at the next system call: main(), foo(), foo1(), foo2(), foo3()
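A rough sketch of how a virtual path between two consecutive system calls could be derived from their call stacks. It assumes the function names on this slide form two stacks (main, foo, fooA, fooB, fooC and main, foo, foo1, foo2, foo3) and uses those names as stand-ins for return addresses. The function virtual_path and the "exit"/"entry" labels are hypothetical, not the authors' notation.

```python
def virtual_path(prev_stack, cur_stack):
    """Return the virtual path between two consecutive system calls.

    Each stack is a list of frames (stand-ins for return addresses),
    outermost frame first.
    """
    # Frames shared by both stacks are still live and are not part of the path.
    common = 0
    while (common < len(prev_stack) and common < len(cur_stack)
           and prev_stack[common] == cur_stack[common]):
        common += 1
    # Unwind the frames that returned, innermost first, then enter the new ones.
    exits = [("exit", frame) for frame in reversed(prev_stack[common:])]
    entries = [("entry", frame) for frame in cur_stack[common:]]
    return tuple(exits + entries)


path = virtual_path(
    ["main", "foo", "fooA", "fooB", "fooC"],
    ["main", "foo", "foo1", "foo2", "foo3"],
)
```

Here path lists the exits from fooC, fooB, and fooA followed by the entries into foo1, foo2, and foo3. With real return addresses the shared foo frame would also differ between the two stacks, since the two calls are made from different call sites.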

VtPath Training Phase • Two hash tables: • RA (return address) table • VP (virtual path) table • RAs and VPs gradually added during normal program execution • Use NULL entries at beginning and end of paths
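The training phase might be sketched as follows, assuming each trace is a list of (stack, syscall) pairs with the stack given as a tuple of return addresses, outermost first. The layout of both tables, the use of None for the NULL entry, and the names train and virtual_path are assumptions made for illustration.

```python
NULL = None  # assumed stand-in for the NULL entry at the start and end of a run


def virtual_path(prev_stack, cur_stack):
    """Return (addresses exited, addresses entered) between two system calls."""
    common = 0
    for a, b in zip(prev_stack, cur_stack):
        if a != b:
            break
        common += 1
    return (tuple(reversed(prev_stack[common:])), tuple(cur_stack[common:]))


def train(traces):
    """Build the RA and VP tables from normal executions."""
    ra_table = {}     # return address -> system calls seen with it as the last address
    vp_table = set()  # every virtual path observed during training
    for trace in traces:
        prev_stack = (NULL,)
        for stack, syscall in trace:
            for ra in stack:
                ra_table.setdefault(ra, set())
            ra_table[stack[-1]].add(syscall)
            vp_table.add(virtual_path(prev_stack, stack))
            prev_stack = stack
        # Close the run with a path back to the NULL entry.
        vp_table.add(virtual_path(prev_stack, (NULL,)))
    return ra_table, vp_table
```

Both tables only grow as more normal runs are fed in, which matches the gradual addition described on this slide.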

VtPath Online Detection Phase • Stack anomaly – if virtual stack list unavailable (common during buffer overflow attacks) • Return address anomaly – if virtual stack list {a_0, a_1, …, a_n} contains an a_i not in RA table • System call anomaly – if the last return address, a_n, is not associated with the current system call • Virtual path anomaly – if virtual path at current system call is not in VP table
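The four checks could be applied in order at each system call, roughly as sketched here. The table layout (RA table as a dict from return address to the system calls seen with it as the last address, VP table as a set of paths) and the names check_syscall and virtual_path are assumptions, not the authors' implementation.

```python
def virtual_path(prev_stack, cur_stack):
    """Return (addresses exited, addresses entered) between two system calls."""
    common = 0
    for a, b in zip(prev_stack, cur_stack):
        if a != b:
            break
        common += 1
    return (tuple(reversed(prev_stack[common:])), tuple(cur_stack[common:]))


def check_syscall(ra_table, vp_table, prev_stack, stack, syscall):
    """Return the name of the first anomaly found, or None if the call looks normal."""
    if not stack:
        # The stack walk failed, e.g. because the stack was smashed.
        return "stack anomaly"
    if any(ra not in ra_table for ra in stack):
        return "return address anomaly"
    if syscall not in ra_table[stack[-1]]:
        return "system call anomaly"
    if virtual_path(prev_stack, stack) not in vp_table:
        return "virtual path anomaly"
    return None
```

Ordering the checks from cheapest to most specific means the virtual path comparison only runs once every address on the stack is already known.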

VtPath – Impossible Path Problem • Return addresses are part of virtual path information • Modifying return address saved on the stack will cause the program to return to a different location • The next system call is likely to trigger a virtual path anomaly

VtPath - Implementation Issues • Non-standard control flows • Signals (sigreturn system call) • Treat each one like a separate program invocation • setjmp()/longjmp() and function pointers • Hard to handle statically; handled at runtime if trained properly • Dynamically linked libraries • Relative loading positions can change, invalidating PC values from training runs • Use a “block” model during training to capture file name and block length, ignoring start address • Block anomaly – block lookup fails because attacker is trying to load a malicious DLL
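One way to picture the block model is as a translation from absolute addresses to position-independent keys. In this sketch the Block record, the (file name, block length, offset) key, and the function names are all hypothetical; the slide only states that file name and block length are captured and that the start address is ignored.

```python
from typing import NamedTuple


class Block(NamedTuple):
    file_name: str
    start: int    # load address; may differ between runs
    length: int


def normalize(address, loaded_blocks):
    """Map an absolute address to (file_name, length, offset), or None if unmapped."""
    for block in loaded_blocks:
        if block.start <= address < block.start + block.length:
            return (block.file_name, block.length, address - block.start)
    return None


def is_block_anomaly(address, loaded_blocks, known_blocks):
    """True if the address falls in no block or in a block never seen in training.

    known_blocks is a set of (file_name, length) pairs recorded during training.
    """
    key = normalize(address, loaded_blocks)
    return key is None or key[:2] not in known_blocks
```

Because the key omits the start address, a library loaded at a different position in a later run still produces the same key for the same code location.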

Experiments – VtPath vs FSA • Convergence times similar, but • FSA generates more transitions than VtPath (less efficient, less precise) • In practice, multiple levels of DLL functions are called quite frequently – VtPath more effective • False Positives nearly identical • VtPath executes faster, uses more memory • Common exploits detected by both

Exploits • Authors developed two masked mimicry attacks detectable only by VtPath • Impossible Path Execution (IPE) Attack 1 • Exploits parallel structure of privileged / unprivileged code within same function • IPE Attack 2: • F() called twice from within same function with different arguments • Overflow local variable to change option passed in

Attack 1

Attack 2

Comparison of SysCall-based Anomaly Detection Schemes • State-based / Information captured • Data/heap values least useful (transient) • Code segment of some value • Syscalls and callstack most useful • More information = runtime overhead • False positives • How well is normal behavior captured? • Only relevant for dynamic systems • Proportional to resolution of program analysis • Detection capability • More granularity = better detection capability • SysCalls from invalid points vs. statistical regularity of training data vs. attacks’ deviation from perceived normal

Comparison of SysCall-based Anomaly Detection Schemes • Space requirement • System call sequences • Number of NDFA transitions (Wagner) • Transitions in automaton • Convergence (training) time • Cover most possible states / transitions • VtPath requires more data than FSA • Static techniques have big advantage here • Runtime overheads • SysCall interception • Hash lookup for valid state / return address / virtual path

Conclusions • Using call stack (VtPath) has value • There is no magic bullet • Everything is “complementary” • Impossible path execution (IPE) is an important class of attack • Use of FSA to analyze more than two consecutive SysCalls could improve VtPath

Discussion!
