
Program Analysis and Testing using Satisfiability Modulo Theories Yandex 2 October 2012, Moscow



Presentation Transcript


  1. Program Analysis and Testing using Satisfiability Modulo Theories. Yandex, 2 October 2012, Moscow. Nikolaj Bjørner, Senior Researcher, Microsoft Research

  2. Agenda
     • Context: Software Engineering Research @ Microsoft
     • Application: Fuzzing and Test Case Generation
     • Application: Program Verification & Bit-precise Analysis
     • Application: String Analysis - Formal Language Theory for Security
     • Technology: Z3 - An Efficient SMT Solver - Basics and Research
     • Propaganda: Software Engineering Research Tools

  3. Context: Team. Z3, An Efficient SMT Solver. Leonardo de Moura, Nikolaj Bjørner, Christoph Wintersteiger

  4. Context: Group. Research in Software Engineering: Improve Software Development Productivity

  5. Context: Organization. Microsoft Research: Biology; Communication & Collaboration; Computational Linguistics; Systems and Networking; Economics; Education; Gaming; Graphics and Multimedia; Theory; Hardware and Devices; Health and Well-being; Human-computer Interaction; Information Retrieval & Management; Machine Learning; Security and Privacy; Social Science; Software Engineering

  6. Context: Company. Microsoft Research Labs. Research: 1%; R & D ~40000

  7. Application: Fuzzing and Testing. Fuzzing and Test Case Generation.
     • Dr. Strangelove? Bug: ***433. "2/29/2012 3:41 PM Edited by ***** SubStatus -> Local Fix. I think the fuzzers are starting to become sentient. We must crush them before it is too late. In this case, the fuzzer figured out that if [X was between A and B then Y would get set to Z triggering U and V to happen……] ….. And if this fuzzer asks for the nuclear launch codes, don't tell it what they are …"
     • SAGE: Internal. For Security Fuzzing. Runs on x86 instructions. Finding security bugs before the hackers (black hat).
     • Pex: External. For Developers. Runs on .NET code. Try it on: http://pex4fun.com

  8. Application: Fuzzing and Testing. SAGE by numbers
     • 100s CPU-years: largest dedicated fuzz lab in the world
     • 100s apps: fuzzed using SAGE
     • 100s previously unknown bugs found
     • Billion+ computers updated with bug fixes
     • Millions of $ saved for Users and Microsoft
     • 10s of related tools (incl. Pex), 100s DART citations
     • 3+ Billion constraints: largest usage for any SMT solver
     Adapted from [Patrice Godefroid, ISSTA 2010]

  9. Application: Fuzzing and Testing. Test case generation. We want a trace where the loop is executed twice.
     Program:
         unsigned GCD(x, y) {
           requires(y > 0);
           while (true) {
             unsigned m = x % y;
             if (m == 0) return y;
             x = y;
             y = m;
           }
         }
     SSA (path constraint): (y0 > 0) and (m0 = x0 % y0) and not (m0 = 0) and (x1 = y0) and (y1 = m0) and (m1 = x1 % y1) and (m1 = 0)
     Solver (model): x0 = 2, y0 = 4, m0 = 2, x1 = 4, y1 = 2, m1 = 0
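The path constraint on this slide can be checked with a few lines of code. The sketch below is only an illustration: it replaces the SMT solver with an exhaustive search over small inputs, and the function names (`gcd_trace_length`, `path_condition`, `solve`) and the search bound are assumptions introduced here, not part of the talk.

```python
from itertools import product

def gcd_trace_length(x, y):
    """Run the GCD loop from the slide; return (result, loop iterations)."""
    assert y > 0  # mirrors requires(y > 0)
    iterations = 0
    while True:
        iterations += 1
        m = x % y
        if m == 0:
            return y, iterations
        x, y = y, m

def path_condition(x0, y0):
    """SSA path constraint for 'the loop body executes exactly twice'."""
    if not y0 > 0:
        return False
    m0 = x0 % y0
    if m0 == 0:          # not (m0 = 0)
        return False
    x1, y1 = y0, m0      # (x1 = y0) and (y1 = m0)
    m1 = x1 % y1
    return m1 == 0       # (m1 = 0)

def solve(bound=8):
    """Brute-force stand-in for a solver (assumption: small search bound)."""
    for x0, y0 in product(range(bound), repeat=2):
        if path_condition(x0, y0):
            return x0, y0
    return None

# The model shown on the slide satisfies the constraint and yields two iterations.
print(path_condition(2, 4), gcd_trace_length(2, 4))
```

Any input returned by `solve` drives the program down the requested path; a real SMT solver finds such a model symbolically instead of by enumeration.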

  10. Application: Fuzzing and Testing. Test Case Generation Procedure (diagram of a loop). Labels: seed; Test Inputs; Run Test and Monitor; Execution Path; Path Condition; Known Paths; Unexplored path; Constraint System; Solve; New input
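The loop named on this slide (run, monitor the path condition, pick an unexplored path, solve, obtain a new input) can be sketched on a toy program. Everything below is a hypothetical illustration: the program under test, the predicate-recording `branch` helper, and the brute-force `solve` over a small integer domain are assumptions standing in for the instrumentation and the SMT solver used by real tools.

```python
def run(x):
    """Run a toy program under test and record its path condition
    as a list of (predicate, outcome) pairs."""
    path = []

    def branch(pred):
        outcome = pred(x)
        path.append((pred, outcome))
        return outcome

    result = "ok"
    if branch(lambda v: v > 10):
        if branch(lambda v: v % 7 == 0):
            result = "bug"
    return result, path

def solve(constraints, domain=range(64)):
    """Brute-force stand-in for the constraint solver (assumed small domain)."""
    for candidate in domain:
        if all(pred(candidate) == want for pred, want in constraints):
            return candidate
    return None

def generate_tests(seed):
    """Explore paths: run, record the path, flip one branch, solve, repeat."""
    known_paths = set()
    worklist = [seed]
    tests = []
    while worklist:
        x = worklist.pop()
        result, path = run(x)
        signature = tuple(outcome for _, outcome in path)
        if signature in known_paths:
            continue  # this execution path was already covered
        known_paths.add(signature)
        tests.append((x, result))
        # Flip each branch in turn, keeping the prefix that leads to it.
        for i, (pred, outcome) in enumerate(path):
            flipped = path[:i] + [(pred, not outcome)]
            new_input = solve(flipped)
            if new_input is not None:
                worklist.append(new_input)
    return tests

print(generate_tests(0))
```

Starting from the seed 0, the loop reaches the nested branch by solving the flipped constraints, which random inputs would hit only by chance.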

  11. Application: Scalable bit-precise analysis. What is wrong here?
     Book: Kernighan and Ritchie. Function: itoa (integer to ascii). -INT_MIN = INT_MIN
         void itoa(int n, char* s) {
           if (n < 0) {
             *s++ = '-';
             n = -n;
           }
           // Add digits to s
           ….
     Package: java.util.Arrays. Function: binary_search. (INT_MAX+1)/2 + (INT_MAX+1)/2 = INT_MIN
         int binary_search(int[] arr, int low, int high, int key) {
           while (low <= high) {
             // Find middle value
             int mid = (low + high) / 2;
             int val = arr[mid];
             if (val == key) return mid;
             if (val < key) low = mid+1;
             else high = mid-1;
           }
           return -1;
         }
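Both bugs come from 32-bit two's-complement wrap-around. The sketch below simulates that arithmetic in Python, whose integers are unbounded; `wrap32`, `div_trunc`, and the two midpoint functions are names introduced here for illustration, and the overflow-safe midpoint is the commonly used fix rather than something stated on the slide.

```python
INT_MIN, INT_MAX = -2**31, 2**31 - 1

def wrap32(v):
    """Reduce an integer to a signed 32-bit two's-complement value."""
    return (v + 2**31) % 2**32 - 2**31

def div_trunc(a, b):
    """Integer division truncating toward zero, as in C and Java."""
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q

def midpoint_naive(low, high):
    """(low + high) / 2 with 32-bit wrap-around, as in the slide's code."""
    return div_trunc(wrap32(low + high), 2)

def midpoint_safe(low, high):
    """low + (high - low) / 2 cannot overflow when 0 <= low <= high."""
    return wrap32(low + div_trunc(wrap32(high - low), 2))

# itoa: negating INT_MIN yields INT_MIN again, so n stays negative.
print(wrap32(-INT_MIN) == INT_MIN)
# binary_search: two large indices produce a negative midpoint.
print(midpoint_naive(2**30, 2**30), midpoint_safe(2**30, 2**30))
```

A bit-precise analysis models exactly this modular behaviour, which is why it can find inputs that an analysis over unbounded integers would miss.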

  12. Application: Scalable bit-precise analysis. Bit-precise analysis (bit-vector example figures omitted).
     • Vector Segments: Concatenation, Extraction (e.g. [4:2])
     • Bit-wise operations: Bit-wise and
     • Modular arithmetic: Addition

  13. Application: Verification. Hypervisor Verification (2007 – 2010) with VCC. Layers: Hypervisor, Hardware. Partners: European Microsoft Innovation Center; Microsoft Research; Microsoft's Windows Division; Universität des Saarlandes. Co-funded by the German Ministry of Education and Research. http://www.verisoftxt.de

  14. Application: Verification Microsoft Verifying C Compiler

  15. Application: Verification. SAT/SMT progress driven by applications: VCC Performance Trends, Nov 08 – Mar 09 (chart). Annotated events: Modification in invariant checking; Switch to Z3 v2; Z3 v2 update; Attempt to improve Boogie/Z3 interaction; Switch to Boogie2

  16. Application: Verification. Verification Attempt Time vs. Satisfaction and Productivity. By Michal Moskal (VCC Designer and Software Verification Expert). Language quiz: "loose" or "lose"?

  17. Application: Verification The Importance of Speed

  18. Application: Verification. Building Verve (build diagram; 9 person-months). Labels: Source file: Kernel.cs; Compilation tool: C# compiler; Kernel.obj (x86); Verified: Nucleus.bpl (x86); Verification tool: Boogie/Z3, TAL checker; Translator/Assembler; Linker/ISO generator; Verve.iso. Safe to the Last Instruction, Jean Yang & Chris Hawblitzel, PLDI 2010

  19. Application: String Analysis. Why string analysis? (motivating scenario)
     Tomcat v. < 6.0.18: req = http://www.x.com/%c0%ae%c0%ae/%c0%ae%c0%ae/private/
     • security check: req must not contain "../"
     • dir = utf8decode("%c0%ae%c0%ae/%c0%ae%c0%ae/private/") = "../../private/"
     • access granted to "../../private/"
     Analysis question: Does utf8decode reject overlong UTF-8 encodings such as "%C0%AE" for '.'?
     Windows 2000 vulnerability: http://www.sans.org/security-resources/malwarefaq/wnt-unicode.php
     Apache Tomcat vulnerability: http://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2008-2938
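The scenario can be reproduced with a small decoder. The `utf8decode` below is a hypothetical, deliberately simplified decoder (1- and 2-byte sequences only) written for this illustration; it is not Tomcat's code. With `strict=False` it accepts overlong encodings, which is the behaviour the analysis question asks about.

```python
from urllib.parse import unquote_to_bytes

def utf8decode(data, strict):
    """Decode 1- and 2-byte UTF-8 sequences (enough for this example).
    With strict=False, overlong 2-byte encodings of ASCII are accepted."""
    out = []
    i = 0
    while i < len(data):
        b0 = data[i]
        if b0 < 0x80:
            out.append(chr(b0))
            i += 1
        elif 0xC0 <= b0 <= 0xDF and i + 1 < len(data) and 0x80 <= data[i + 1] <= 0xBF:
            code = ((b0 & 0x1F) << 6) | (data[i + 1] & 0x3F)
            if strict and code < 0x80:
                raise ValueError("overlong encoding")
            out.append(chr(code))
            i += 2
        else:
            raise ValueError("unsupported or malformed sequence")
    return "".join(out)

path = "%c0%ae%c0%ae/%c0%ae%c0%ae/private/"
raw = unquote_to_bytes(path)

print("../" in path)                   # the security check sees no "../"
print(utf8decode(raw, strict=False))   # the lenient decoder produces one anyway
```

The check runs on the encoded request while the file access uses the decoded string, so a lenient decoder lets the traversal through; a strict decoder rejects the input instead.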

  20. Application: String Analysis. Relativized Formal Language Theory.
     • String transformation: Symbolic Word Transducers ≅ Classical Word Transducers modulo Th(Σ). Classical Word Transducers (e.g. decoding automata, rational transductions). Classical I/O Automata (e.g. Mealy machine).
     • Regex matching: Symbolic Word Acceptors ≅ Classical Word Acceptors modulo Th(Σ). Classical Word Acceptors (NFA, DFA).

  21. Application: String Analysis Rex & Bek – Symbolic RegEx & Transducers Margus Veanes

  22. Application: String Analysis. Symbolic Finite Transducer (SFT)
     • Classical transducer modulo a rich label theory
     • Core Idea: represent labels with guarded transformation functions
     • Separation of concerns: finite graph / theory of labels
     Concrete transitions (p to q): 1920 transitions, '\x80'/"\xC2\x80" … '\x7FF'/"\xDF\xBF"
     Symbolic transition (p to q): guard and bitvector operations, $\lambda x.\; 80_{16} \le x \le \mathrm{7FF}_{16} \;/\; [\mathrm{C0}_{16} \mid x_{10,6},\; 80_{16} \mid x_{5,0}]$
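The single symbolic transition on this slide can be written as a guard plus an output function and compared against the concrete transitions it replaces. The function names `guard` and `output` are introduced here for illustration; the bit ranges follow the slide's label, reading x[10:6] as bits 10 down to 6 of x.

```python
def guard(x):
    """Guard of the symbolic transition: 0x80 <= x <= 0x7FF."""
    return 0x80 <= x <= 0x7FF

def output(x):
    """Output of the symbolic transition: [0xC0 | x[10:6], 0x80 | x[5:0]]."""
    return bytes([0xC0 | ((x >> 6) & 0x1F), 0x80 | (x & 0x3F)])

# Expanding the symbolic transition gives one concrete transition per code point.
concrete = {x: output(x) for x in range(0x800) if guard(x)}

print(len(concrete))                         # number of concrete transitions
print(concrete[0x80], concrete[0x7FF])       # the two endpoints named on the slide
```

One guarded label stands for all 1920 concrete edges, and each of them agrees with the standard two-byte UTF-8 encoding of that code point.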

  23. Technology: SMT, Satisfiability Modulo Theories. Is Formula F Satisfiable (over Theory of Reals)? Answers: sat (with Solution/Model), unsat (with Proof). Example queries from SAGE: Is execution path P feasible? Is assertion X violated? The model is the WITNESS.

  24. Technology: SMT, Satisfiability Modulo Theories. Theories: Arithmetic; Array Theory; Uninterpreted Functions

  25. Technology: Job Shop Scheduling. Machines, Jobs, Tasks. P = NP? Laundry

  26. Technology: Job Shop Scheduling. Constraints:
     • Precedence: between two tasks of the same job
     • Resource: Machines execute at most one job at a time
     (Schedule figure with jobs 1 to 4 omitted.)

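  27. Technology: Job Shop Scheduling. Constraints and encoding (formulas not preserved in the transcript):
     • Variables: start time of job 2 on machine 3; duration of job 2 on machine 3
     • Precedence: (formula not preserved)
     • Resource: (formula not preserved). Not convex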
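A common way to encode these two constraint kinds uses one start-time variable per task. The sketch below is an assumption-laden illustration: the names `t` (start time) and `DURATION`, the two-job, two-machine instance, and the exhaustive search standing in for an SMT solver are all introduced here and are not taken from the slide. Precedence becomes a single inequality, while the resource constraint is a disjunction of two inequalities, which is what makes the feasible region non-convex.

```python
from itertools import product

# Hypothetical instance: DURATION[(job, machine)]; each job uses machine 1, then machine 2.
DURATION = {(1, 1): 2, (1, 2): 1, (2, 1): 3, (2, 2): 1}
JOBS = (1, 2)
MACHINES = (1, 2)

def feasible(t, max_time):
    """Check precedence, deadline, and resource constraints for start times t."""
    for j in JOBS:
        # Precedence: the task on machine 1 finishes before the task on machine 2 starts.
        if t[j, 1] + DURATION[j, 1] > t[j, 2]:
            return False
        # Deadline: the job's last task finishes by max_time.
        if t[j, 2] + DURATION[j, 2] > max_time:
            return False
    for m in MACHINES:
        # Resource: on each machine, one job finishes before the other starts (a disjunction).
        one_first = t[1, m] + DURATION[1, m] <= t[2, m]
        two_first = t[2, m] + DURATION[2, m] <= t[1, m]
        if not (one_first or two_first):
            return False
    return True

def solve(max_time):
    """Brute-force stand-in for the solver: try every integer start-time vector."""
    keys = sorted(DURATION)
    for starts in product(range(max_time + 1), repeat=len(keys)):
        t = dict(zip(keys, starts))
        if feasible(t, max_time):
            return t
    return None

print(solve(5), solve(6))
```

For this instance no schedule fits in 5 time units and one exists for 6; an SMT solver would reach the same answers by case-splitting on the disjunctions rather than enumerating start times.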

  28. Technology: Job Shop Scheduling

  29. Technology: Job Shop Scheduling. Case split on each resource disjunction. Efficient solvers for each case: Floyd-Warshall algorithm; Ford-Fulkerson algorithm

  30. Technology: Microsoft Tools using Z3
     • Z3 is used by many research groups. More than 19k downloads
     • Z3 places 1st in most categories in SMT competitions
     • Z3 used to check Azure Firewall Policies (SecGuru)
     • Z3 solved more than 3 billion constraints created by SAGE, checking Win8 and Office
     • Z3 ships in Windows Server with the Static Driver Verifier
     • Other tools: HAVOC, Vigilante

  31. Technology: Research Areas. Practical problems often have structure that can be exploited.
     • Complexity landscape: Undecidable (FOL + LIA); Semi Decidable (FOL); NEXPTIME (EPR); PSPACE (QBF); NP (SAT)
     • Research topics: Algorithms; Data structures; Heuristics; Decidable Fragments; Essentially Uninterpreted Formulas; Generalized array theory; Quantified Bit-Vector Logic
     Logic is "The Calculus of Computer Science" (Zohar Manna)

  32. Technology: Little Engines of Proof. Freely available from http://research.microsoft.com/projects/z3

  33. Technology: Research around Z3
     Decision Procedures
     • Modular Difference Logic is Hard. TR 08. B, Blass, Gurevich, Musuvathi.
     • Linear Functional Fixed-points. CAV 09. B & Hendrix.
     • A Priori Reductions to Zero for Strategy-Independent Gröbner Bases. SYNASC 09. M & Passmore.
     • Efficient, Generalized Array Decision Procedures. FMCAD 09. M & B.
     • Quantifier Elimination as an Abstract Decision Procedure. IJCAR 10. B.
     • Cutting to the Chase. CADE 11. Jovanović, M.
     • Polynomials. IJCAR 12. Jovanović, M.
     Combining Decision Procedures
     • Model-based Theory Combination. SMT 07. M & B.
     • Proofs, Refutations and Z3. IWIL 08. M & B.
     • On Locally Minimal Nullstellensatz Proofs. SMT 09. M & Passmore.
     • A Concurrent Portfolio Approach to SMT Solving. CAV 09. Wintersteiger, Hamadi & M.
     • Conflict Directed Theory Resolution. Cambridge Univ. Press 12. M & B.
     Quantifiers, quantifiers, quantifiers
     • Efficient E-matching for SMT Solvers. CADE 07. M & B.
     • Relevancy Propagation. TR 07. M & B.
     • Deciding Effectively Propositional Logic using DPLL and substitution sets. IJCAR 08. M & B.
     • Engineering DPLL(T) + saturation. IJCAR 08. M & B.
     • Complete instantiation for quantified SMT formulas. CAV 09. Ge & M.
     • On deciding satisfiability by DPLL(Γ + T) and unsound theorem proving. CADE 09. Bonacina, M & Lynch.
     • Generalized PDR. SAT 12. Hoder & B.

  34. Introductory Background Reading September 2011

  35. Technology: Mile High, Modern SAT/SMT search (diagram). Models: literal assignments. Proofs: Conflict Clauses. Transitions between them: Propagate, Backjump, Conflict Resolution

  36. Technology: Core Engine in Z3, Modern DPLL/CDCL (customized). Model, Proof, Conflict Resolution [Nieuwenhuis, Oliveras, Tinelli J.ACM 06]. One SAT expert to another: "It took me a year to understand the Mini-SAT FUIP code". Mate Soos to Niklas Sörensson over ice-cream at SAT 2012 in Trento

  37. Technology: Mile High, Modern SMT procedures (annotated diagram)
     • Models: values to satisfy formula. A way to certify satisfiability
     • Proofs: Conflict Lemmas. A way to certify unsatisfiability
     • Propagate: Efficient indexing for propagating consequences
     • Backjump: Efficiently backtrack to an equi-satisfiable state
     • Conflict Resolution: Learn a new fact that prunes as many dead branches as possible
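To make the Propagate and backtrack steps concrete, here is a minimal sketch of the classic DPLL search for propositional clauses. It is a simplification chosen for brevity: it backtracks chronologically and learns nothing, so it does not implement the backjumping and conflict-lemma learning of the CDCL engines described on the slide. Clauses are lists of non-zero integers (negative means negated), a representation assumed here.

```python
def unit_propagate(clauses, assignment):
    """Assign literals forced by unit clauses; return None on a conflict."""
    assignment = dict(assignment)
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            unassigned = []
            satisfied = False
            for lit in clause:
                var, want = abs(lit), lit > 0
                if var not in assignment:
                    unassigned.append(lit)
                elif assignment[var] == want:
                    satisfied = True
                    break
            if satisfied:
                continue
            if not unassigned:
                return None  # every literal is false: conflict
            if len(unassigned) == 1:
                lit = unassigned[0]
                assignment[abs(lit)] = lit > 0  # forced assignment
                changed = True
    return assignment

def dpll(clauses, assignment=None):
    """Return a satisfying assignment (a model) or None if unsatisfiable."""
    assignment = unit_propagate(clauses, assignment or {})
    if assignment is None:
        return None
    variables = {abs(lit) for clause in clauses for lit in clause}
    free = sorted(variables - set(assignment))
    if not free:
        return assignment  # all clauses satisfied: this is the model
    var = free[0]
    for value in (True, False):  # decide, then backtrack chronologically
        result = dpll(clauses, {**assignment, var: value})
        if result is not None:
            return result
    return None

print(dpll([[1, 2], [-1, 3], [-2, -3]]))
print(dpll([[1, 2], [1, -2], [-1, 2], [-1, -2]]))
```

A returned assignment certifies satisfiability; certifying unsatisfiability would additionally require recording the conflicts as a proof, which this sketch omits.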

  38. Technology: Research, Solving Horn Clauses
     mc(x) = x-10 if x > 100
     mc(x) = mc(mc(x+11)) if x ≤ 100
     assert (x ≤ 101 ⇒ mc(x) = 91)
     (Horn clause encoding over the relation mc: arguments not preserved in the transcript.)
     Solver finds solution for mc
     Kryštof Hoder & Nikolaj Bjørner, SAT 2012. Bjørner, McMillan, Rybalchenko, SMT 2012
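The property on this slide can be sanity-checked by plain execution. The sketch below runs the McCarthy 91 function on a bounded range and also tests one candidate solution, `candidate`, against the two defining equations. The candidate and the bounded range are assumptions made for this illustration; a finite test like this is not a proof, whereas the solver on the slide derives a solution symbolically for all inputs.

```python
def mc(x):
    """McCarthy 91 function, as defined on the slide."""
    return x - 10 if x > 100 else mc(mc(x + 11))

def candidate(x):
    """Assumed closed-form solution for mc: 91 up to 101, x - 10 above."""
    return 91 if x <= 101 else x - 10

def check(lo=-50, hi=200):
    """Bounded check of the assertion and of the candidate solution."""
    for x in range(lo, hi):
        # The asserted property: x <= 101 implies mc(x) = 91.
        if x <= 101 and mc(x) != 91:
            return False
        # The candidate satisfies both defining equations.
        if x > 100 and candidate(x) != x - 10:
            return False
        if x <= 100 and candidate(x) != candidate(candidate(x + 11)):
            return False
        # The candidate agrees with the executed function.
        if candidate(x) != mc(x):
            return False
    return True

print(check())
```

The second and third checks mirror what a Horn clause solver must establish: an interpretation of mc that is closed under the recursive definition and implies the assertion.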

  39. Technology Research: SolvingR Efficiently A key idea: Use partial solution to guide the search Feasible Region Starting search Partial solution: What is the core? Can we extend it to ? DejanJojanovich & Leonardo de Moura, IJCAR 2012

  40. Propaganda .com

  41. Propaganda: Core Expertise. Empirical Software Engineering; Foundations: Logic; Program Analysis: Performance, Reliability, Security; Programming Languages: Design & Implementation

  42. http://rise4fun.com/z3py

  43. Propaganda Academic Interns

  44. Summary
     • An outline of Z3, an efficient SMT solver: efficient logic solver for SE tools tackling intractable problems. http://research.microsoft.com/projects/z3
     • Software Engineering Research @ Microsoft: http://rise4fun.com
     • Academic internships: http://research.microsoft.com/en-us/jobs/intern
     • Contact: http://research.microsoft.com/~nbjorner, nbjorner@microsoft.com
