ON OPTIMAL AV SYSTEM STRATEGIES AGAINST OBFUSCATED MALWARE
Anshuman Singh*, Bin Mai^, Arun Lakhotia*, Andrew Walenstein*
* University of Louisiana at Lafayette
^ Northwestern State University

ASIA 2009
Why Heterogeneous AV Systems?
• Precise viral detection is undecidable
• Have to use approximate methods
  • String scanning
  • Near-exact identification
  • X-ray scanning
  • Code emulation
  • Heuristic analysis
  • Integrity checking
• Need to use a combination of the above methods to reduce false positives and false negatives

Composition Experiences
• Need to choose among approximate methods due to efficiency considerations
• Choices reflect composition ideas in AV
• Which choices do we use?
• We were experimenting with composition
• We were building tools for analysis and detection
• Could compose and integrate in a variety of ways
• Ran into some interesting questions along the way…

Normalizer/Filter Composition
• Mutant Normalizer:
  • Converts mutants/variants to a single form
  • Performs an ordinary scan on the normalized form
• Concern:
  • Relatively expensive
• Solution:
  • Apply only to files likely to be mutants
  • We developed a fast filter to identify likely mutants
• Question:
  • How to integrate the Normalizer/Filter into an AV system?

Composition: Filter/Normalizer
[Diagram] Input → FILTER
• Possible mutant → NORMALIZING SCANNER → Output
• Non-mutated → ORDINARY SCANNER → Output

Commercial AV Systems
[Diagram] Possible malware → Heuristic Analyzer
• Packed → Code emulator → Output
• Unpacked → Near-exact identification → Output

Composition Choices
• Parallel: Input → Classifier 1 and Classifier 2 → Combiner → Output
• Sequence: Input → Classifier 1 → Classifier 2 → Output
• Select: Input → Selector → Classifier 1 or Classifier 2 → Output
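
The three composition choices can be sketched as higher-order functions over classifiers. This is only an illustration: the slides do not define the combiner or the sequence semantics, so the sketch assumes a classifier is a function from bytes to a boolean alarm, that the combiner is any boolean function of the two verdicts, and that in a sequence the second classifier runs only when the first raises no alarm. The toy classifiers (`signature_scan`, `size_heuristic`, `looks_packed`) are hypothetical stand-ins, not real detection methods.

```python
from typing import Callable

# Assumption: a classifier maps file contents to True (alarm) or False (no alarm).
Classifier = Callable[[bytes], bool]


def parallel(c1: Classifier, c2: Classifier,
             combiner: Callable[[bool, bool], bool]) -> Classifier:
    """Run both classifiers on the input and merge their verdicts."""
    return lambda data: combiner(c1(data), c2(data))


def sequence(c1: Classifier, c2: Classifier) -> Classifier:
    """Run c1 first; c2 runs only if c1 raised no alarm (assumed semantics)."""
    return lambda data: c1(data) or c2(data)


def select(selector: Classifier, c1: Classifier, c2: Classifier) -> Classifier:
    """Route the input to exactly one classifier, chosen by the selector."""
    return lambda data: c1(data) if selector(data) else c2(data)


# Hypothetical toy classifiers, for illustration only.
def signature_scan(data: bytes) -> bool:
    return b"EVIL" in data


def size_heuristic(data: bytes) -> bool:
    return len(data) > 16


def looks_packed(data: bytes) -> bool:
    return data.startswith(b"PK")


either = parallel(signature_scan, size_heuristic, lambda a, b: a or b)
both = parallel(signature_scan, size_heuristic, lambda a, b: a and b)
chained = sequence(signature_scan, size_heuristic)
routed = select(looks_packed, size_heuristic, signature_scan)
```

In the select composition only one classifier ever sees a given file, which is why the routing accuracy of the selector matters in the rest of the analysis.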

Important Questions
• Compositions raise important questions
  • How do we know a composition is any good?
  • How do we tune the composed system for optimal performance?
  • If two classifiers are optimally tuned, will their composition also be optimal?
• Game theory can yield insight into such composition problems

Game Theory: General Approach
• Game Theory:
  • Aid to analyze strategic choices of adversaries
• Basic idea:
  • Model adversary interaction as a game
  • Associate payoffs (costs/benefits) with each outcome
  • Maximize payoff to derive optimal strategies

MA-SA game: players and roles
• Malware Author (MA)
  • Presents infected files to the system
  • Wants to attack the system to obtain a positive payoff
• Security Analyst (SA)
  • Attempts to provide an optimal system, including:
    • Detect and thwart MA's malware
    • Minimize the AV system's total cost

The MA-SA game: players' strategies
• SA's strategies
  • C : Single classifier architecture
  • S2C : Selector and two classifier architecture
• MA's strategies
  • UM : Unobfuscated malware
  • OM : Obfuscated malware

Obfuscated malware
Two types:
• Packed
  • Compressed : malicious code is hidden as compressed data
  • Encrypted : malicious code is hidden as encrypted data
• Semantic code obfuscation
  • Used by metamorphic viruses
  • More effective but more expensive to implement than packing

MA-SA game in strategic form
             MA: UM     MA: OM
SA: C        UMC        OMC
SA: S2C      UMS2C      OMS2C
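
MA-SA game: configurations
• UMC : unobfuscated malware (UM) against the single classifier architecture (C)
• OMC : obfuscated malware (OM) against the single classifier architecture (C)
• UMS2C : unobfuscated malware (UM) against the selector and two classifier architecture (S2C)
• OMS2C : obfuscated malware (OM) against the selector and two classifier architecture (S2C)
• Each configuration y has an expected payoff to each player x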

Classifier Parameter
• Many classifiers have a tunable parameter
• Parameter trades off between
  • PD : true positive rate
  • PF : false positive rate
• ROC curve gives the relation between PD and PF
• Typical ROC curves follow a power function
  • PD = PF^r, 0 < r < 1
• r can be used as a model parameter, i.e., SA chooses r as part of the game strategy
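
The power-function ROC model can be sketched in a few lines. The function name `detection_rate` is hypothetical; the sketch only evaluates PD = PF^r for 0 < r < 1 and shows that, at a fixed false positive rate, a smaller r corresponds to a higher detection rate.

```python
def detection_rate(pf: float, r: float) -> float:
    """True positive rate PD on the power-function ROC curve PD = PF**r."""
    if not 0.0 <= pf <= 1.0:
        raise ValueError("pf must lie in [0, 1]")
    if not 0.0 < r < 1.0:
        raise ValueError("r must lie strictly between 0 and 1")
    return pf ** r


# At the same false positive rate, a smaller r gives a higher detection rate.
pd_weak = detection_rate(0.25, 0.9)
pd_strong = detection_rate(0.25, 0.5)
```

Because 0 < r < 1, the curve always lies on or above the diagonal PD = PF, so a classifier in this model is never worse than random guessing.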
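
Selector Parameters (tN and tM)
[Diagram]
• Input (Normal) → Selector: one branch is taken with probability tN, the other with probability 1-tN
• Input (Malware) → Selector: one branch is taken with probability tM, the other with probability 1-tM
• Each branch leads to Classifier 1 or Classifier 2, which produces the Output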

Outcomes, benefits and costs
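
SA's payoff in UMC and OMC
[Decision tree] Input file → Classifier
• Normal file (probability 1-λ)
  • Detected (probability PF): payoff v - c
  • Missed (probability 1-PF): payoff v
• Malware (probability λ)
  • Detected (probability PD): payoff -c
  • Missed (probability 1-PD): payoff -d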
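
SA's expected payoff in UMC & OMC
[Decision tree] Same tree as for SA's payoff, with the classifier outcomes labelled Alarm / No Alarm
• Each leaf contributes its payoff times the probabilities along its path, e.g. (v - c) · PF · (1-λ) for an alarm on a normal file
• The expected payoff is the sum of the four leaf contributions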
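
The expected payoff can be computed directly by enumerating the four leaves of the tree. This is a minimal sketch: the function name is hypothetical, and the numeric values used below are illustrative only and do not come from the slides. The slide defining v, c and d is not recoverable here; the comments assume v is the value of a normal file, c the cost incurred on an alarm, and d the damage from missed malware.

```python
def sa_expected_payoff(lam: float, pf: float, pd: float,
                       v: float, c: float, d: float) -> float:
    """Sum of (path probability * payoff) over the four leaves of the tree."""
    leaves = [
        ((1 - lam) * pf, v - c),       # normal file, alarm
        ((1 - lam) * (1 - pf), v),     # normal file, no alarm
        (lam * pd, -c),                # malware, alarm
        (lam * (1 - pd), -d),          # malware, no alarm
    ]
    return sum(prob * payoff for prob, payoff in leaves)


# Illustrative (made-up) numbers: 10% malware, PF = 0.05, PD = 0.9.
example = sa_expected_payoff(lam=0.1, pf=0.05, pd=0.9, v=10.0, c=1.0, d=50.0)
```

Expanding the sum algebraically gives v - (d+v)λ - c(1-λ)PF + (d-c)λPD, which is the form used when looking for the optimum.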

Optimal solution
• The expected payoff for SA: v - (d+v)λ - c(1-λ)PF + (d-c)λPD
• Maximizing the above gives the optimal solution for SA, which depends on the ratio of normal files to malware and on the damage-to-cost ratio
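
The maximization can be sketched as follows. This is a derivation under stated assumptions, not necessarily the form shown on the original slide: it assumes the power-function ROC curve PD = PF^r, assumes d > c, and introduces the symbol names ρ = (1-λ)/λ for the ratio of normal files to malware and δ = d/c for the damage-to-cost ratio (both names are assumptions).

```latex
\begin{align*}
E_{SA}(P_F) &= v - (d+v)\lambda - c(1-\lambda)P_F + (d-c)\lambda P_F^{\,r} \\
\frac{dE_{SA}}{dP_F} &= -c(1-\lambda) + r(d-c)\lambda P_F^{\,r-1} = 0 \\
P_F^{*} &= \left[\frac{r(d-c)\lambda}{c(1-\lambda)}\right]^{\frac{1}{1-r}}
         = \left[\frac{r(\delta-1)}{\rho}\right]^{\frac{1}{1-r}},
\qquad P_D^{*} = \left(P_F^{*}\right)^{r}
\end{align*}
```

Since 0 < r < 1 and d > c, the second derivative r(r-1)(d-c)λ P_F^{r-2} is negative, so the stationary point is a maximum. The expression is an interior solution only when the bracketed term is at most 1; otherwise the maximum on [0, 1] is at P_F = 1.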
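
MA's payoff in OMS2C
[Decision tree] OM → Selector
• With probability tN → Lenient Classifier
  • Detected (probability PDL): payoff µl - Δ
  • Missed (probability 1-PDL): payoff µh - Δ
• With probability 1-tN → Stringent Classifier
  • Detected (probability PDS): payoff µl - Δ
  • Missed (probability 1-PDS): payoff µh - Δ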

MA's expected payoff in OMS2C
• If the expected payoff to MA is negative, MA will choose not to obfuscate
• This occurs when the cost of obfuscation is greater than a threshold
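
The threshold can be worked out from the OMS2C decision tree. The sketch below reads µl and µh as MA's payoffs when the malware is detected and missed, and Δ as the cost of obfuscation; these readings are inferred from the tree and are assumptions, as is the shorthand \bar{P}_D for the average detection rate seen by obfuscated malware.

```latex
\begin{align*}
E_{MA} &= t_N\left[P_{DL}(\mu_l-\Delta) + (1-P_{DL})(\mu_h-\Delta)\right]
        + (1-t_N)\left[P_{DS}(\mu_l-\Delta) + (1-P_{DS})(\mu_h-\Delta)\right] \\
       &= \mu_h - \Delta - (\mu_h-\mu_l)\,\bar{P}_D,
\qquad \bar{P}_D = t_N P_{DL} + (1-t_N)P_{DS} \\
E_{MA} < 0 \;&\Longleftrightarrow\; \Delta > \mu_h - (\mu_h-\mu_l)\,\bar{P}_D
\end{align*}
```

Under this reading, raising either detection rate lowers the threshold, so a smaller obfuscation cost is already enough to make obfuscation unprofitable.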
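
SA's payoff in UMS2C
[Decision tree] Input file → Selector → Classifier
• Normal file (probability 1-λ)
  • Likely normal (probability tN) → Lenient Classifier
    • Detected (probability PFL): payoff v - c
    • Missed (probability 1-PFL): payoff v
  • Likely malware (probability 1-tN) → Stringent Classifier
    • Detected (probability PFS): payoff v - c
    • Missed (probability 1-PFS): payoff v
• Malware (probability λ)
  • Likely normal (probability 1-tM) → Lenient Classifier
    • Detected (probability PDL): payoff -c
    • Missed (probability 1-PDL): payoff -d
  • Likely malware (probability tM) → Stringent Classifier
    • Detected (probability PDS): payoff -c
    • Missed (probability 1-PDS): payoff -d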
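
The eight leaves of the UMS2C tree can be enumerated in the same way as for the single classifier. This is a sketch with a hypothetical function name and illustrative numbers that do not come from the slides; the branch probabilities follow the tree above (tN routes a normal file to the lenient classifier, tM routes malware to the stringent classifier).

```python
def sa_expected_payoff_s2c(lam: float, t_n: float, t_m: float,
                           pfl: float, pfs: float, pdl: float, pds: float,
                           v: float, c: float, d: float) -> float:
    """Sum of (path probability * payoff) over the eight leaves of the tree."""
    leaves = [
        ((1 - lam) * t_n * pfl, v - c),             # normal, lenient, alarm
        ((1 - lam) * t_n * (1 - pfl), v),           # normal, lenient, no alarm
        ((1 - lam) * (1 - t_n) * pfs, v - c),       # normal, stringent, alarm
        ((1 - lam) * (1 - t_n) * (1 - pfs), v),     # normal, stringent, no alarm
        (lam * (1 - t_m) * pdl, -c),                # malware, lenient, detected
        (lam * (1 - t_m) * (1 - pdl), -d),          # malware, lenient, missed
        (lam * t_m * pds, -c),                      # malware, stringent, detected
        (lam * t_m * (1 - pds), -d),                # malware, stringent, missed
    ]
    return sum(prob * payoff for prob, payoff in leaves)


# Illustrative (made-up) numbers: the stringent classifier detects more
# malware at the price of more false alarms.
example_s2c = sa_expected_payoff_s2c(
    lam=0.1, t_n=0.8, t_m=0.7,
    pfl=0.01, pfs=0.10, pdl=0.70, pds=0.95,
    v=10.0, c=1.0, d=50.0,
)
```

A useful sanity check is that when both classifiers share the same operating point, the selector has no effect and the result reduces to the single-classifier expected payoff.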

Optimal PDS and PDL for SA in UMS2C
• SA's expected payoff in UMS2C can be maximized w.r.t. PFS and PFL to obtain the optimal values of PDS and PDL
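
One way to carry out this maximization is sketched below. It is an illustration under an added assumption, not necessarily the result shown on the original slide: both classifiers are assumed to follow the same power-function ROC curve, so that P_{DL} = P_{FL}^r and P_{DS} = P_{FS}^r, and d > c is assumed. The expected payoff is obtained by summing the eight leaves of the UMS2C tree.

```latex
\begin{align*}
E_{SA} &= v - (d+v)\lambda
        - c(1-\lambda)\left[t_N P_{FL} + (1-t_N)P_{FS}\right]
        + (d-c)\lambda\left[(1-t_M)P_{FL}^{\,r} + t_M P_{FS}^{\,r}\right] \\
\frac{\partial E_{SA}}{\partial P_{FL}} = 0
  \;&\Rightarrow\;
  P_{FL}^{*} = \left[\frac{r(d-c)\lambda(1-t_M)}{c(1-\lambda)\,t_N}\right]^{\frac{1}{1-r}},
  \qquad P_{DL}^{*} = \left(P_{FL}^{*}\right)^{r} \\
\frac{\partial E_{SA}}{\partial P_{FS}} = 0
  \;&\Rightarrow\;
  P_{FS}^{*} = \left[\frac{r(d-c)\lambda\,t_M}{c(1-\lambda)(1-t_N)}\right]^{\frac{1}{1-r}},
  \qquad P_{DS}^{*} = \left(P_{FS}^{*}\right)^{r}
\end{align*}
```

These are interior solutions; whenever a bracketed term exceeds 1, the corresponding rate is capped at 1.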

Optimal Solution for SA in OMS2C
• Maximizing SA's expected payoff subject to a constraint gives the optimal values of PDS and PDL
• These values ensure MA's obfuscation is ineffective

Insights
• When the selector's accuracy increases, it is optimal for a security analyst to maintain a less stringent classifier for malware and a more stringent classifier for normal files
• The AV designer is always advised to deter “gaming” of the selector by decreasing the spread in the detection rates of the classifiers
• The minimal cost of developing obfuscated malware sufficient to game the selector increases with the accuracy of the selector

Conclusions
• Introduced a way to analyse AV systems using Game Theory
• Showed it may lead to interesting, possibly counter-intuitive results
• Mathematically derived optimal configurations

Thanks!