1 / 34

Enhancing Parallel Programming Efficiency with Octet: A Framework for Runtime Support

The Octet framework introduces a robust runtime support system for parallel programming, addressing key challenges in shared memory environments, including performance, correctness, and concurrency errors. It employs mechanisms like atomicity checking, data race detection, and record/replay capabilities to enhance reliability and scalability in multi-threaded applications. By tracking and controlling cross-thread dependences through happens-before edges, Octet offers significant performance improvements while facilitating the enforcement of deterministic execution and transactional memory, thus enabling developers to express parallelism more effectively.

dylan
Télécharger la présentation

Enhancing Parallel Programming Efficiency with Octet: A Framework for Runtime Support

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Michael Bond MilindKulkarni Man Cao Minjia Zhang MeisamFathiSalmi SwarnenduBiswas AritraSengupta Jipeng Huang • Purdue Octet: Capturing and Controlling Cross-Thread Dependences Efficiently • Ohio State

  2. Parallel programming is mainstream Sharedmemory with locks Challenge: performance & correctness

  3. Need practical runtime support Help express parallelism better Eliminate concurrency errors Diagnose production bugs Deal with nondeterminism

  4. Need practical runtime support • Transactional memory • DRF/SC enforcement • Deterministic execution • Atomicity checking • Data race detection • Record & replay

  5. Need practical runtime support • Atomicity checking • Data race detection • Record & replay • Transactional memory • DRF/SC enforcement • Deterministic execution • Track dependences • Control dependences o.f = … … = o.f

  6. Need practical runtime support • Atomicity checking • Data race detection • Record & replay • Transactional memory • DRF/SC enforcement • Deterministic execution • Track dependences • Control dependences o.f = … … = o.f

  7. Need practical runtime support • Atomicity checking • Data race detection • Record & replay • Transactional memory • DRF/SC enforcement • Deterministic execution • Track dependences • Control dependences o.f = … … = o.f • Commodity (software-only) approaches • slow programs by several times

  8. Need practical runtime support • Atomicity checking • Data race detection • Record & replay • Transactional memory • DRF/SC enforcement • Deterministic execution • Track dependences • Control dependences check o.f = … check … = o.f • Commodity (software-only) approaches • slow programs by several times

  9. Need practical runtime support • Atomicity checking • Data race detection • Record & replay • Transactional memory • DRF/SC enforcement • Deterministic execution • Track dependences • Control dependences check o.f = … check … = o.f Any access could race add synchronization at every access

  10. Octet • Framework for runtimesupport • HB edges  all dependences • Atomicity of analysis & access • Concurrency control mechanism • Synchronization  cross-thread dependence •  Qualitative performance improvement

  11. Octet • Framework for runtimesupport • HB edges  all dependences • Atomicity of analysis & access • Concurrency control mechanism • Synchronization  cross-thread dependence •  Qualitative performance improvement Proofs!

  12. Octet tracks ownership • Each object’s stateЄ { WrExT, RdExT, RdShc }

  13. o’s state = WrExT1 T1 T2 write check wro.f Time

  14. o’s state = WrExT1 T1 T2 write check read check wro.f Time

  15. o’s state = WrExT1 T1 T2 write check read check wro.f Time

  16. o’s state = WrExT1 T1 T2 write check read check wro.f Time safe point

  17. o’s state = WrExT1 T1 T2 write check read check wro.f Implicit safe point Time safe point

  18. o’s state = WrExT1 T1 T2 write check read check wro.f Time safe point

  19. o’s state = RdExT2 T1 T2 write check read check wro.f Time safe point

  20. o’s state = RdExT2 T1 T2 write check read check wro.f Time safe point rdo.f

  21. o’s state = RdExT2 T1 T2 T3 T4 write check wro.f read check safe point rdo.f

  22. o’s state = RdExT2 T1 T2 T3 T4 write check wro.f read check safe point rdo.f read check

  23. o’s state = RdShc T1 T2 T3 T4 write check wro.f read check safe point rdo.f read check rdo.f

  24. o’s state = RdShc T1 T2 T3 T4 write check wro.f read check safe point rdo.f read check read check rdo.f

  25. o’s state = RdShc T1 T2 T3 T4 write check wro.f read check safe point rdo.f read check read check rdo.f rdo.f

  26. o’s state = RdShc T1 T2 T3 T4 Sharing detection [von Praun & Gross ’01] Comparison in our paper Distributed shared memory Shasta [Scales et al. ’96] Biased locking [Kawachiya et al. ’02] [Russell & Detlefs ’06] [Hindman & Grossman ’06] write check wro.f read check safe point rdo.f read check read check rdo.f rdo.f

  27. Practical runtime support • Atomicity checking • Data race detection • Record & replay • Transactional memory • DRF/SC enforcement • Deterministic execution • Track dependences • Control dependences Framework for runtime support Concurrency control mechanism Octet

  28. Dependence recorder records happens-before edges T1 T2 T3 T4 write check wro.f read check safe point rdo.f read check read check rdo.f rdo.f

  29. Implementation in Jikes RVM • Publicly available • http://jikesrvm.org/Research+Archive Parallel programs DaCapo Benchmarks 2006& 2009 SPEC JBB 2000& 2005 • Parallel platform • 32cores (AMD Opteron 6272)

  30. Octet helps enable practical runtime support for reliable, scalable concurrency • Framework for runtime support • HB edges  all dependences • Atomicity of analysis & access • Concurrency control mechanism • Synchronization  cross-thread dependence • Qualitative performance improvement

More Related