Efficient Memory Shadowing for 64-bit Architectures

Efficient Memory Shadowing for 64-bit Architectures. Qin Zhao (MIT), Derek Bruening (VMware), Saman Amarasinghe (MIT). ISMM 2010, Toronto, Canada, June 6, 2010. Dynamic program analysis helps understand program behavior for optimization, debugging, security, and memory management. Shadow memory tools maintain meta-data for every memory location and update it on every memory operation.


Presentation Transcript


  1. Efficient Memory Shadowing for 64-bit Architectures • Qin Zhao (MIT), Derek Bruening (VMware), Saman Amarasinghe (MIT) • ISMM 2010, Toronto, Canada • June 6, 2010

  2. Dynamic Program Analysis • Understand program behavior: optimization, debugging, security, memory management • Shadow memory tools: maintain meta-data for every memory location; update meta-data on every memory operation

  3. Examples • Memory Error Detection: MemCheck [VEE’07], Purify [USENIX’92], Dr. Memory • Dynamic Information Flow Tracking: LIFT [MICRO’06], TaintTrace [ISCC’06] • Multi-threaded Program Analysis: Eraser [TOCS’97], Helgrind • Memory Usage Analysis: CETS [ISMM’10], Staleness

  4. Shadow Memory System • Shadow Memory Manager • Meta-data for application memory • Memory mapping scheme (addrA → addrS) • DMS (Direct Mapping) • SMS (Segmented Mapping) • Instrumentor • Every memory operation • Address calculation • Meta-data update • Expensive • MemCheck (~25x) • ~12x for addrA → addrS • [Figure: application memory regions (a.out, heap, libc, stack) and their corresponding shadow memory regions]

  5. Direct Mapping Scheme (DMS) • Single memory region for entire address space. • Translation: lea [addr] → %r1; add %r1, disp → %r1 • Issue: address conflict between memA and memS • [Figure: application and shadow regions; chart: slowdown relative to native execution]
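The two-instruction translation on this slide amounts to adding one constant to every application address. A minimal sketch of that idea and of the conflict it can cause, assuming a hypothetical displacement value and half-open `(start, end)` regions, neither of which is given on the slide:

```python
# Hypothetical constant; the slide does not give a concrete displacement.
DISP = 0x2000_0000

def dms_translate(addr_a: int, disp: int = DISP) -> int:
    """Direct mapping: shadow address = application address + constant."""
    return addr_a + disp

def dms_conflicts(app_regions, disp: int = DISP) -> bool:
    """Return True if any shadow region overlaps an application region.

    app_regions is a list of half-open (start, end) address ranges.
    """
    shadows = [(s + disp, e + disp) for s, e in app_regions]
    return any(s < app_end and app_start < e
               for s, e in shadows
               for app_start, app_end in app_regions)
```

With a single region at `(0, 0x1000)` there is no conflict, but adding an application region at `(0x2000_0000, 0x2000_1000)` makes `dms_conflicts` return `True`, which is the memA/memS conflict the slide names.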

  6. Segmented Mapping Scheme (SMS) • Shadow segment per application segment • Translation: • Segment lookup (address indexing) • Address translation • lea [addr] → %r1; mov %r1 → %r2; shr %r2, 16 → %r2; add %r1, disp[%r2] → %r1 • [Figure: segment table mapping addrA in application segments App 1 and App 2 to addrS in shadow segments Shd 1 and Shd 2; chart: slowdown relative to native execution]
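The four-instruction sequence on this slide can be read as a table lookup followed by an add. A minimal sketch, assuming a Python dict stands in for the segment table and that the table values below are purely illustrative:

```python
UNIT_SHIFT = 16  # matches `shr %r2, 16`: one table entry per 64 KB unit

def sms_translate(addr_a: int, disp_table: dict) -> int:
    """Segmented mapping: look up a per-unit displacement, then add it."""
    index = addr_a >> UNIT_SHIFT          # segment lookup (address indexing)
    return addr_a + disp_table[index]     # address translation

# Illustrative table: two application units with different displacements.
table = {0x0040: 0x1000_0000, 0x7FFF: -0x2000_0000}
print(hex(sms_translate(0x0040_1234, table)))  # 0x10401234
```

Unlike the direct scheme, each unit can carry its own displacement, so shadow segments can be placed wherever free space exists.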

  7. Shadow Memory Mapping • Scaling to 64-bit Architecture • DMS • Infeasible due to memory layout • [Figure: 64-bit memory layout: user space (a.out through stack) up to 2^47, unusable space, kernel space (vsyscall) up to 2^64]

  8. Shadow Memory Mapping • Scaling to 64-bit Architecture • DMS • Infeasible due to memory layout • Single-Level SMS • Too big (~4 billion entries)
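The "~4 billion entries" figure can be reproduced with a short calculation. This sketch assumes a 48-bit address space and one table entry per 64 KB unit (the `shr %r2, 16` granularity from the SMS slide); the 32-bit line is added only for comparison:

```python
ADDRESS_BITS = 48   # assumption: 48-bit virtual address space
UNIT_SHIFT = 16     # assumption: one entry per 64 KB unit

entries_64bit = 1 << (ADDRESS_BITS - UNIT_SHIFT)
entries_32bit = 1 << (32 - UNIT_SHIFT)

print(entries_64bit)  # 4294967296, about 4 billion entries
print(entries_32bit)  # 65536 entries for a 32-bit address space
```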

  9. Shadow Memory Mapping • Scaling to 64-bit Architecture • DMS • Infeasible due to memory layout • Single-Level SMS • Too big (~4 billion entries) • Multi-Level SMS • Even more expensive • [Chart: slowdown relative to native execution]

  10. Umbra (CGO’10) • Scaling to 64-bit Architecture • Single-Level SMS is too big but sparse • Umbra (CGO’10) • Eliminate empty entries • Compact table • Walk the table to find the entry

  11. Umbra (CGO’10) • Reference Uni-Cache • Software cache per instr per thread • Segment tag & displacement • Check uni-cache before table walk • 99.97% hit ratio • tag = addrA & mask; if (cachetag != tag) { … /* table walk */ } addrS = addrA + cachedisp; • [Chart: slowdown relative to native execution]
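The uni-cache check on this slide can be modelled as a one-entry cache in front of a compact table. A minimal sketch, assuming 64 KB tag granularity and a linear table walk; the class and field names are hypothetical and do not come from Umbra:

```python
UNIT_MASK = ~0xFFFF  # assumption: a tag names a 64 KB-aligned unit

class CompactTable:
    """List of (start, end, disp) segments with no empty entries."""
    def __init__(self, segments):
        self.segments = list(segments)
        self.walks = 0                      # counts slow-path lookups

    def walk(self, addr_a):
        self.walks += 1
        for start, end, disp in self.segments:
            if start <= addr_a < end:
                return disp
        raise KeyError(hex(addr_a))

class UniCache:
    """One cached (tag, disp) pair, kept per instruction per thread."""
    def __init__(self, table):
        self.table, self.tag, self.disp = table, None, 0

    def translate(self, addr_a):
        tag = addr_a & UNIT_MASK
        if self.tag != tag:                 # miss: walk the table
            self.disp = self.table.walk(addr_a)
            self.tag = tag
        return addr_a + self.disp           # hit: one compare and one add

table = CompactTable([(0x400000, 0x410000, 0x1000_0000)])
cache = UniCache(table)
first = cache.translate(0x401234)           # miss, walks the table
second = cache.translate(0x401238)          # hit, no walk
```

After the two calls above, `table.walks` is 1, since the second reference reuses the cached tag and displacement.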

  12. EMS64: Key Idea • Umbra: tag = addrA & mask; if (cachetag != tag) { … /* table walk (0.03%) */ } addrS = addrA + cachedisp; • EMS64 • Speculatively use a disp without check • Smart shadow memory placement • Notified by memory access violation fault for incorrect displacement
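The speculation idea can be sketched by dropping the tag check and letting a fault trigger the table walk instead. This is a simplified model, not the authors' implementation: a Python exception stands in for the hardware access violation, and the sketch assumes the shadow placement already guarantees that a wrong displacement never lands on valid memory. All names are hypothetical.

```python
class AccessViolation(Exception):
    """Stands in for the fault raised on an unmapped or reserved address."""

class Ems64Site:
    """One instrumented memory reference with a speculative displacement."""
    def __init__(self, segments, shadow_ranges, disp=0):
        self.segments = segments            # (app_start, app_end, disp) triples
        self.shadow_ranges = shadow_ranges  # valid (start, end) shadow ranges
        self.disp = disp
        self.faults = 0

    def _access(self, addr_s):
        if not any(s <= addr_s < e for s, e in self.shadow_ranges):
            raise AccessViolation(hex(addr_s))
        return addr_s

    def shadow_access(self, addr_a):
        try:
            return self._access(addr_a + self.disp)    # no tag check
        except AccessViolation:
            self.faults += 1                            # plays the signal handler
            self.disp = next(d for s, e, d in self.segments if s <= addr_a < e)
            return self._access(addr_a + self.disp)     # retry with fixed disp

segments = [(0x1000, 0x2000, 0x10000), (0x7000, 0x8000, -0x1000)]
shadow_ranges = [(0x11000, 0x12000), (0x6000, 0x7000)]
site = Ems64Site(segments, shadow_ranges, disp=0x10000)
```

Here `site.shadow_access(0x1004)` succeeds on the fast path, while `site.shadow_access(0x7004)` faults once, corrects the displacement, and then later accesses to the same segment run without faults.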

  13. EMS64: Example 0: Application A0 2: Shadow S0 6: Shadow S1 7: Application A1 9: Reserved 10: Shadow S2 11: Application A2 12: Unavailable 13: Unavailable/Reserved 14: Unavailable 15: Unavailable/Reserved Displacement: {-1, 2}

  14. EMS64: Potential Problem 0: Application A0 2: Shadow S0 6: Shadow S1 7: Application A1 9: Reserved 10: Shadow S2 11: Application A2 12: Unavailable 13: Unavailable/Reserved 14: Unavailable 15: Unavailable/Reserved Displacement: {-1, 2}

  15. EMS64: Final Solution 0: Application A0 1: Reserved 2: Shadow S0 4: Reserved 5: Reserved 6: Shadow S1 7: Application A1 8: Reserved 9: Reserved 10: Shadow S2 11: Application A2 12: Unavailable/Reserved 13: Unavailable/Reserved 14: Unavailable 15: Unavailable/Reserved Displacement: {-1, 2}
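The reserved slots in this layout can be derived mechanically: every slot reachable from an application slot through a displacement other than its own, or from a shadow slot through any displacement, must stay unused. A short sketch that recomputes them from the slide's numbers (the function name is hypothetical):

```python
N = 16
APP = {0: 2, 7: -1, 11: -1}   # application slot -> displacement (from the slide)
DISPS = set(APP.values())     # {-1, 2}
SHADOW = {(a + d) % N for a, d in APP.items()}   # slots 2, 6, 10

def required_reserved(app, shadow, disps, n):
    """Slots that a wrong displacement could reach."""
    from_app = {(a + d) % n for a, own in app.items() for d in disps if d != own}
    from_shadow = {(s + d) % n for s in shadow for d in disps}
    return from_app | from_shadow

reserved = required_reserved(APP, SHADOW, DISPS, N)
print(sorted(reserved))  # [1, 4, 5, 8, 9, 12, 13, 15]
```

The result matches the slots marked Reserved or Unavailable/Reserved above, and none of them coincides with an application or shadow slot.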

  16. Slot Finding Problem • Given n slots: • k Application slots • x Empty slots • y Reserved slots • Find k S-slots. • For each slot Ai, there is one associated slot Si with displacement di, where di = Si - Ai. • For each slot Ai and each existing displacement dj where di ≠ dj, slot ((Ai + dj) mod n) is an R-slot or an E-slot. • For each slot S and any existing valid displacement di, slot ((S + di) mod n) is an R-slot or an E-slot. • Legend: Ai = application slot, Si = shadow slot, Ei = empty slot, Ri = reserved slot • [Figure: example row of slots A0, A1, E0 to E4, R0 to R2, S0, S1]
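The three constraints above translate directly into a validity check, and for small n a placement can be found by exhaustive search. The sketch below uses brute force purely for illustration; it is not the placement algorithm from the talk, and treating unavailable slots as acceptable targets for wrong displacements is an assumption based on the "Unavailable/Reserved" slots in the earlier example.

```python
from itertools import product

def is_valid(n, assign, unavailable=frozenset()):
    """Check the slot constraints for a mapping {app_slot: displacement}."""
    apps = set(assign)
    disps = set(assign.values())
    shadows = {(a + d) % n for a, d in assign.items()}
    # Each application slot needs its own shadow slot on free, usable space.
    if len(shadows) != len(apps) or shadows & (apps | set(unavailable)):
        return False
    # Slots reached by a wrong displacement must not hold A- or S-slots.
    wrong = {(a + d) % n for a, own in assign.items() for d in disps if d != own}
    wrong |= {(s + d) % n for s in shadows for d in disps}
    return not (wrong & (apps | shadows))

def find_shadow_slots(n, app_slots, unavailable=frozenset()):
    """Brute-force search; returns {app_slot: displacement} or None."""
    apps = sorted(app_slots)
    for disps in product(range(1, n), repeat=len(apps)):
        assign = dict(zip(apps, disps))
        if is_valid(n, assign, unavailable):
            return assign
    return None

unavailable = {12, 13, 14, 15}
found = find_shadow_slots(16, [0, 7, 11], unavailable)
```

The layout from the example slides, `{0: 2, 7: -1, 11: -1}`, passes `is_valid`, and the search returns some valid assignment for the same application slots, though not necessarily the same one.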

  17. Slot Finding Problem • Given n slots: • k Application slots • x Empty slots • y Reserved slots • Can we find k S-slots? • Depends on the layout! • Guaranteed to find them, for a 48-bit address space, if • Application memory < 250 GB • Proof • x ≥ 8k^2 + 2k + 1 • We can always find an Si for Ai if #E-slots > #conflicts

  18. Implementation & Optimization • Implementation • Shadow memory allocation • Add signal handler • Remove reference uni-cache check • Optimization • Restore uni-cache checks for instructions that access multiple segments, e.g., references from memcpy • When the number of access violations exceeds 2 • lea [addr] → %r1; add %r1, unicachedisp → %r1
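The optimization can be read as a per-instruction fault counter that switches a reference back to the checked path once it faults too often. A minimal sketch under that reading; the class and attribute names are hypothetical:

```python
FAULT_LIMIT = 2  # from the slide: restore the check when violations exceed 2

class Site:
    """Per-instruction state: speculative by default, checked after repeated faults."""
    def __init__(self):
        self.faults = 0
        self.checked = False

    def on_access_violation(self):
        self.faults += 1
        if self.faults > FAULT_LIMIT:
            self.checked = True   # the uni-cache tag check would be re-inserted here

site = Site()
site.on_access_violation()
site.on_access_violation()
still_speculative = not site.checked
site.on_access_violation()
```

After the third violation `site.checked` becomes `True`, which models an instruction such as a `memcpy` reference that keeps crossing segments and is cheaper to run with the check than with repeated faults.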

  19. Experimental Results • [Chart: slowdown relative to native execution]

  20. Thank You • Download • http://people.csail.mit.edu/qin_zhao/umbra/ • Q & A
