Modular Machine Code Verification

PhD Thesis Defense. Modular Machine Code Verification. Zhaozhong Ni. Advisor: Zhong Shao. Committee: Zhong Shao, Paul Hudak, Carsten Schürmann, David Walker. Department of Computer Science, Yale University. Nov. 29, 2006.

Presentation Transcript


  1. PhD Thesis Defense: Modular Machine Code Verification. Zhaozhong Ni. Advisor: Zhong Shao. Committee: Zhong Shao, Paul Hudak, Carsten Schürmann, David Walker. Department of Computer Science, Yale University. Nov. 29, 2006

  2. 19 Lines of Code on Every PC

         swapcontext:
             ; store old context
             mov eax, [esp+4]
             mov [eax+0], OK
             mov [eax+4], ebx
             mov [eax+8], ecx
             mov [eax+12], edx
             mov [eax+16], esi
             mov [eax+20], edi
             mov [eax+24], ebp
             mov [eax+28], esp
             ; load new context
             mov eax, [esp+8]
             mov esp, [eax+28]
             mov ebp, [eax+24]
             mov edi, [eax+20]
             mov esi, [eax+16]
             mov edx, [eax+12]
             mov ecx, [eax+8]
             mov ebx, [eax+4]
             mov eax, [eax+0]
             ret
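The effect of these 19 instructions can be sketched as a small C model of the machine state. This is an illustrative simulation under stated assumptions, not the formal semantics used in the thesis: the names `machine_t`, `load32`, `store32`, and `model_swapcontext`, the 64-word toy memory, and the numeric value given to `OK` are all hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy machine: 8 registers and 64 words of memory. */
enum { EAX, EBX, ECX, EDX, ESI, EDI, EBP, ESP, NREG };
enum { MEM_WORDS = 64 };
#define OK 1u /* assumed value; the slide only names the constant */

typedef struct {
    uint32_t reg[NREG];
    uint32_t mem[MEM_WORDS]; /* byte address a lives at mem[a / 4] */
} machine_t;

static uint32_t load32(const machine_t *m, uint32_t addr) {
    assert(addr % 4 == 0 && addr / 4 < MEM_WORDS);
    return m->mem[addr / 4];
}

static void store32(machine_t *m, uint32_t addr, uint32_t v) {
    assert(addr % 4 == 0 && addr / 4 < MEM_WORDS);
    m->mem[addr / 4] = v;
}

/* Executes the body of swapcontext and its final ret.
   Returns the address that ret jumps to. */
uint32_t model_swapcontext(machine_t *m) {
    uint32_t *r = m->reg;

    /* store old context: the eax slot gets OK, ebx..esp go to offsets 4..28 */
    r[EAX] = load32(m, r[ESP] + 4);
    store32(m, r[EAX] + 0, OK);
    for (int i = EBX; i <= ESP; i++)
        store32(m, r[EAX] + 4u * (uint32_t)i, r[i]);

    /* load new context: esp first, eax last because eax holds the base */
    r[EAX] = load32(m, r[ESP] + 8);
    for (int i = ESP; i >= EBX; i--)
        r[i] = load32(m, r[EAX] + 4u * (uint32_t)i);
    r[EAX] = load32(m, r[EAX] + 0);

    /* ret: pop the return address from the new stack */
    uint32_t target = load32(m, r[ESP]);
    r[ESP] += 4;
    return target;
}
```

In this model the saved `eax` slot of the old context always holds `OK`, so a thread that is later resumed from that context sees `OK` as the return value of its own earlier call.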

  3. 19 Lines of Code in Every ms • swapcontext: • Runs thousands of times per second • Used by assembly, C, MSIL, JVML, etc. • Basis of multi-tasking, OS, and software • Safety and correctness taken for granted

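  4. 19 Lines of Code Looks Simple • swapcontext: … call swapcontext … [Figure: effect of swapcontext. Registers eax to esp hold a1 to a8; the stack holds the return pointer retp and the arguments old and new; the old context receives OK, a2, …, a8; the new context supplies b1 to b8 and a stack with return pointer retp′.]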

  5. 19 Lines of Code Proven Hard • swapcontext: simple code, complex reasoning! • stack / heap / memory mutation • procedure call / first-class code pointer • protection / polymorphism • Lacks specification and verification that are • formal (machine-checkable in a sound logic) • general (allows all possible uses of a context) • realistic (usable from the assembly and C levels)

  6. Outline • Introduction • The XCAP Framework • Mini Thread Library • Connect XCAP to TAL • Conclusion

  7. Software Reliability • Bugs are costly • Especially important for • mission-critical software • consumer electronics software • internet software

  8. Test-Patch Approach • Works most of the time • Gives no guarantee • Could make things worse [Flowchart with nodes: test, debug, pre-release? (yes / no), create patch]

  9. Language-based Approach • Uses types and other formal specifications • Excludes all bugs in certain categories: illegal command, overflow, dangling pointer, etc. • Successful and popular: ML, Java, C#, etc. • Reached virtual machine code level: JVML, MSIL, TIL, TAL, etc. • Meta-theorems can make guarantees

  10. Traditional Assumptions • Types are for application software: you cannot write an OS without (void *) • Types are for high-level languages: not much to talk about 89 84 24 07 5B CD 15 • Types are only for “no blue screen”: how about “variable x is a prime number”? • Type safety is bad for performance: turn off array-bound checking before release

  11. Program Specification • syntactic types • machine-logical specifications • meta-logical specifications

          bool prime (int n) {
            assert (n > 0);
            for (int i = 2; i < n; i ++)
              // n mod 2,…,i-1 ≠ 0
              if (n % i == 0) return false;
            // n mod 2,…,n-1 ≠ 0
            return true;
          }
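The two comments in the slide's function are a loop invariant and a postcondition. As a rough sketch, they can be turned into run-time assertions in C. The names `no_divisor_below` and `prime_checked` are hypothetical, and the precondition is tightened to `n > 1` as an assumption, because with `n > 0` the input 1 would pass the loop untouched and be reported as prime.

```c
#include <assert.h>
#include <stdbool.h>

/* Invariant helper: true when no d in [2, upto) divides n. */
static bool no_divisor_below(int n, int upto) {
    for (int d = 2; d < upto; d++)
        if (n % d == 0)
            return false;
    return true;
}

/* Primality test with the slide's comments checked at run time. */
bool prime_checked(int n) {
    assert(n > 1); /* assumed precondition, stricter than the slide's n > 0 */
    for (int i = 2; i < n; i++) {
        assert(no_divisor_below(n, i)); /* n mod 2,...,i-1 != 0 */
        if (n % i == 0)
            return false;
    }
    assert(no_divisor_below(n, n)); /* n mod 2,...,n-1 != 0 */
    return true;
}
```

Run-time checks like these only test the executions that actually happen; the point of the slide is that a logical specification lets the same facts be proved once, for all inputs.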

  12. Machine Code Verification • Motivations • everything goes down to binary • high-level safety efforts lost in compilation • critical code directly written in low level • Challenges • Expressiveness • Modularity • Goals • both user and system level code • modular specification + certification

  13. Proof-Carrying Code • Proposed 10 years ago [Necula & Lee] • machine code • machine-checkable proof [Diagram: Code, Specification, Proof, Meta theory, Checker]

  14. Foundational PCC • Proposed by [Appel] [Diagram: Code, Specification, Proof, Meta theory, Checker; labels: mathematical logic theory, mathematical logic checker]

  15. Approaches to PCC • Type-based PCC: TAL [Morrisett98], Touchstone PCC [Colby00], Syntactic FPCC [Hamid02], FTAL [Crary03], LTAL [Chen03], … • Strengths: modular, generates proofs easily, type safety • Logic-based PCC: Original PCC [Necula98], Semantic FPCC [Appel01], CAP [Yu03], Open Verifier [Chang05], CCAP/CMAP [Yu04, Feng05], … • Strengths: expressive, advanced properties, good interoperability

  16. PCC After 10 Years • In principle, can verify any machine code! • In reality, many programs are not verified. • For some code, we do not know HOW! [Diagram: Code, Specification, Proof, Meta theory, Checker]

  17. User-level Code: List Append Adapted from [Reynolds02] ……

  18. User-level Code: List Append Adapted from [Reynolds02] ……

  19. User-level Code: List Append Adapted from [Reynolds02]

  20. ECP Problem w. Hoare Logic • Embedded code pointers (ECP) • Examples: computed GOTOs, higher-order functions, indirect jumps, continuations, return addresses • “… are difficult to describe in … Hoare logic” [Reynolds02] • Previous approaches • Ignore ECP [Necula98, Yu04] • Limit ECP specifications to types [Hamid04] • Sacrifice modularity [Yu03] • Use complex indexed semantic models [Appel01]
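To make the notion of an embedded code pointer concrete, here is a minimal C sketch of two of the listed examples: a code pointer stored inside a data structure and reached through an indirect call, and a higher-order function. All names (`unary_fn`, `thunk_t`, `run_thunk`, `apply_twice`, `add_one`, `twice`) are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*unary_fn)(int);

/* A data record that embeds a code pointer next to ordinary data. */
typedef struct {
    unary_fn code;
    int arg;
} thunk_t;

static int add_one(int x) { return x + 1; }
static int twice(int x) { return 2 * x; }

/* Indirect call: the jump target is read from memory at run time. */
int run_thunk(const thunk_t *t) {
    assert(t != NULL && t->code != NULL);
    return t->code(t->arg);
}

/* Higher-order function: the code pointer arrives as an argument. */
int apply_twice(unary_fn f, int x) {
    return f(f(x));
}
```

A specification for `run_thunk` has to say something about the code that `t->code` points to without knowing which function it is, which is the difficulty the slide attributes to Hoare logic.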

  21. Outline • Introduction • The XCAP Framework • Mini Thread Library • Connect XCAP to TAL • Conclusion

  22. The XCAP Framework [POPL’06] • A logic-based PCC framework • modular verification of machine code • supports ECP without compromise • Supports both system and user code • Consists of • target machine (not fixed) • assertion language (consistency) • inference rules (soundness)

  23. Target Machine

  24. Dynamic Semantics

  25. Certified Assembly Programming [Yu03, Hamid04, Yu04, Feng05] • Hoare logic in CPS • Uses general predicate logic for assertions [example formula not preserved] • Mechanized in a proof assistant (Coq) • Extensions made: CCAP, CMAP, etc.

  26. How CAP Certifies Instructions

  27. How CAP Certifies Programs

  28. The ECP Problem cptr(f, a) = ?

  29. Previous Approach • Internalize Hoare derivation for ECP: circularity! • Stratification [OHearn97, Naumann01] • Works for simple cases • Hard for assembly • Hard for polymorphism • Step-Indexing [Appel01, Appel02, Schneck03] • Works for polymorphism • Heavyweight • Not standard Hoare logic

  30. CAP’s Approach • Specify ECP by checking against code spec • Verify all code specs are indeed valid • Modularity problem

  31. The XCAP Approach • Specify ECP independent of code spec • Check ECP against global code spec • Verify global code spec is indeed valid

  32. Extended Propositions

  33. XCAP Rules

  34. How XCAP Works with ECP (SEQ) (ECP) (JMP) (JD)

  35. Verification of append()

  36. Impredicative Polymorphisms • Important for ECP • Naïve interpretation function fails

  37. New Interpretation • Interpretation • Soundness of interpretation • Consistency

  38. Recursive Specification • Simple recursive data structures • linked list, queue, stack, tree, etc. • supported via inductive definition of Prop • Complex recursive structures with ECP • object (self refers to the entire object) • threading invariant (each thread assumes others) • Recursive specification

  39. Memory Mutation • Strong update • special conjunction (p * q) in separation logic • directly definable in Prop and PropX • explicit alias control, popular in system level • Weak update (general reference) • mutable reference (int ref) in ML • managed data pointers (int __gc*) in .NET • rely on GC to recycle memory • popular in user level
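The difference between the two kinds of update can be sketched in C. This is only an analogy under stated assumptions: C has no static tracking of what a cell holds, so the hypothetical `tag` field below stands in for the assertion that a logic would track, and all names (`weak_update`, `cell_t`, `strong_update_to_ptr`, `read_int`) are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Weak update: the cell always holds an int, so every alias stays valid. */
void weak_update(int *ref, int v) {
    *ref = v;
}

/* Strong update: the same cell changes what kind of value it holds. */
typedef enum { HOLDS_INT, HOLDS_PTR } tag_t;

typedef struct {
    tag_t tag; /* stand-in for the assertion describing the cell */
    union {
        int i;
        const char *p;
    } u;
} cell_t;

void strong_update_to_ptr(cell_t *c, const char *p) {
    c->tag = HOLDS_PTR;
    c->u.p = p;
}

/* A reader that still assumes the old description of the cell. */
int read_int(const cell_t *c) {
    assert(c->tag == HOLDS_INT);
    return c->u.i;
}
```

After `strong_update_to_ptr`, any alias that still calls `read_int` would trip the assertion, which is why strong update needs explicit alias control while weak update does not.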

  40. Weak Update • Reference cell • Interpretation • Record macro

  41. Implementation in Coq • PropX can share similar tactics with Prop

  42. Outline • Introduction • The XCAP Framework • Mini Thread Library • Connect XCAP to TAL • Conclusion

  43. Why Thread Library? • Concurrent verification • primitives’ correctness is assumed • primitives are not really “primitive”! • poor portability due to lack of formal spec • Core of OS kernel • assignment 1 of OS course • written in C and Assembly • requires both safety and efficiency

  44. A Mini Thread Library • Modeled after Pth • Non-preemptive user-level threads • Written in (a subset of) x86 assembly

  45. Threading Model

  46. Modules and Interfaces

  47. Verify That 19 Lines of Code • Step 1: specify machine context • Step 2: specify function call/return • Step 3: specify swapcontext() • Step 4: prove it!

  48. Machine Context • typedef struct mctx_st *mctx_t; struct mctx_st { int eax, ebx, ecx, edx, esi, edi, ebp, esp; }; [Figure labels: mctx, retv, public, bx, cx, private, dx, cs, si, di, bp, sp, ret]
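The struct layout has to agree with the offsets used by the assembly (`[eax+0]` through `[eax+28]`). A minimal sketch of that check in C follows, assuming 32-bit fields (`int32_t` replaces the slide's `int` so the offsets do not depend on the compiler); the helper `mctx_slot` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fixed-width variant of the slide's machine context. */
struct mctx_st {
    int32_t eax, ebx, ecx, edx, esi, edi, ebp, esp;
};
typedef struct mctx_st *mctx_t;

/* Compile-time checks that fields sit where the assembly expects them. */
_Static_assert(offsetof(struct mctx_st, eax) == 0, "eax at [ctx+0]");
_Static_assert(offsetof(struct mctx_st, ebx) == 4, "ebx at [ctx+4]");
_Static_assert(offsetof(struct mctx_st, ecx) == 8, "ecx at [ctx+8]");
_Static_assert(offsetof(struct mctx_st, edx) == 12, "edx at [ctx+12]");
_Static_assert(offsetof(struct mctx_st, esi) == 16, "esi at [ctx+16]");
_Static_assert(offsetof(struct mctx_st, edi) == 20, "edi at [ctx+20]");
_Static_assert(offsetof(struct mctx_st, ebp) == 24, "ebp at [ctx+24]");
_Static_assert(offsetof(struct mctx_st, esp) == 28, "esp at [ctx+28]");
_Static_assert(sizeof(struct mctx_st) == 32, "eight 4-byte slots");

/* Hypothetical helper: read the slot the assembly addresses as [ctx + off]. */
int32_t mctx_slot(const struct mctx_st *ctx, size_t off) {
    int32_t v;
    assert(off % 4 == 0 && off < sizeof *ctx);
    memcpy(&v, (const unsigned char *)ctx + off, sizeof v);
    return v;
}
```

If a field were reordered or widened, the static assertions would fail at compile time instead of silently breaking the hand-written assembly.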

  49. Function Call / Return [Figure: stack layout, in slide order: excess space, local storage, esp, return address, argument 1, argument 2, …, argument n, caller frames]
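The stack layout in this figure can be sketched as a toy model in C. It assumes the usual 32-bit x86 C convention (arguments pushed right to left, then the return address pushed by `call`), which matches swapcontext reading its two arguments at `[esp+4]` and `[esp+8]`. The names `toy_stack_t`, `push32`, `model_call`, `arg_at_entry`, and `model_ret` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

enum { STACK_WORDS = 32 };

/* Toy downward-growing stack; esp is a byte offset into word[]. */
typedef struct {
    uint32_t word[STACK_WORDS];
    uint32_t esp;
} toy_stack_t;

static void push32(toy_stack_t *s, uint32_t v) {
    assert(s->esp >= 4);
    s->esp -= 4;
    s->word[s->esp / 4] = v;
}

/* Caller side: push arguments right to left, then the return address. */
void model_call(toy_stack_t *s, uint32_t ret_addr, const uint32_t *args, int n) {
    for (int i = n - 1; i >= 0; i--)
        push32(s, args[i]);
    push32(s, ret_addr);
}

/* Callee side at entry: argument k (1-based) is found at [esp + 4*k]. */
uint32_t arg_at_entry(const toy_stack_t *s, int k) {
    uint32_t addr = s->esp + 4u * (uint32_t)k;
    assert(k >= 1 && addr / 4 < STACK_WORDS);
    return s->word[addr / 4];
}

/* ret: pop the return address; the arguments stay for the caller to remove. */
uint32_t model_ret(toy_stack_t *s) {
    assert(s->esp / 4 < STACK_WORDS);
    uint32_t target = s->word[s->esp / 4];
    s->esp += 4;
    return target;
}
```

An empty stack in this model starts with `esp` equal to `STACK_WORDS * 4`, one past the highest word.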
