
Transactional Memory Supporting Large Transactions

Hardware-based Transactional Memory Supporting Large Transactions. Anvesh Komuravelli, Abe Othman, Kanat Tangwongsan.


Presentation Transcript


  1. Hardware-based Transactional Memory Supporting Large Transactions. Anvesh Komuravelli, Abe Othman, Kanat Tangwongsan

  2. Concurrent Programs: handle with care • Thread 1: obj.x = 7; find_primes(); /* intrusion test */ if (obj.x != 7) fireMissiles(); • Thread 2: do_stuff(); obj.x = 42; • Lock-based approaches: lock_acquire(critical_zone); lock_release(critical_zone); • Problems: deadlock, starvation, complex programs
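The lock-based approach on this slide can be sketched roughly as follows. This is only an illustration in Python: `Obj`, the `fired` list, and the choice to have both threads take `critical_zone` are assumptions, `find_primes()` is omitted, and `fireMissiles()` is reduced to recording a flag.

```python
import threading

class Obj:
    def __init__(self):
        self.x = 0

obj = Obj()
critical_zone = threading.Lock()   # stands in for the slide's critical_zone
fired = []                         # non-empty only if the intrusion test trips

def thread_1():
    with critical_zone:            # lock_acquire(...) / lock_release(...)
        obj.x = 7
        # find_primes() would run here; thread 2 cannot write obj.x meanwhile
        if obj.x != 7:             # intrusion test
            fired.append(True)     # stand-in for fireMissiles()

def thread_2():
    with critical_zone:            # assumed: thread 2 takes the same lock
        obj.x = 42

t1 = threading.Thread(target=thread_1)
t2 = threading.Thread(target=thread_2)
t1.start()
t2.start()
t1.join()
t2.join()
```

Because both threads hold the same lock around their accesses, the intrusion test never observes the write of 42. The cost is the set of problems the slide lists: every thread must remember to take the lock, and taking several locks in different orders can deadlock.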

  3. Transactional Memory • Thread 1: x_begin(); obj.x = 7; find_primes(); /* intrusion test */ if (obj.x != 7) fireMissiles(); x_finish(); • Thread 2: do_stuff(); obj.x = 42; • Atomicity in the face of concurrency • Consistency across the whole system • Isolation from other transactions • Programmer: enclose instructions in a transaction • System: execute transactions concurrently and, on a conflict, do something intelligent (e.g., abort, restart)
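The "execute concurrently, abort and restart on conflict" idea can be illustrated with a small software analogy. The sketch below is not how hardware transactional memory is built, and `VersionedCell` and `atomically` are hypothetical names; it only shows optimistic execution followed by validation at commit time.

```python
import threading

class VersionedCell:
    """A shared value with a version number bumped on every commit."""
    def __init__(self, value):
        self.value = value
        self.version = 0
        self._commit_lock = threading.Lock()

def atomically(cell, update):
    """Run update(old) -> new as a transaction; restart if a commit intervened."""
    attempts = 0
    while True:
        attempts += 1
        seen_version = cell.version        # roughly x_begin(): take a snapshot
        seen_value = cell.value
        new_value = update(seen_value)     # speculative work, no lock held
        with cell._commit_lock:            # roughly x_finish(): validate, commit
            if cell.version == seen_version:
                cell.value = new_value
                cell.version += 1
                return attempts
        # Another transaction committed first: abort and restart.

cell = VersionedCell(0)

def worker():
    for _ in range(500):
        atomically(cell, lambda v: v + 1)

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

No increment is lost even though the update itself runs without a lock; conflicting attempts are simply retried.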

  4. Different strokes for different folks: Challenges & Opportunities • Common case: 98% of transactions fit in L1 => handle them in hardware • Fast • Easy conflict detection • Easy commit and abort • What to do with the remaining 2%? • Goal: hide platform/resource limitations from programmers

  5. VTM – Virtual Transactional Memory • On overflow, use process’s virtual memory • Tracking at cache-line granularity • Per process state (tag and store virtual addresses) • Flatten nested transactions • Implemented in specialized hardware (dedicated cache, search logic, …) • Drawbacks? • Modifications to hardware. Costly?

  6. XTM – eXtended Transactional Memory • “Complete TM Virtualization without complex hardware” • Page table per transaction • Allows arbitrary nesting – no flattening • The only hardware support – raise an exception on overflow • Drawbacks? • Page granularity on overflows • Potentially higher memory usage than VTM • Software commit is costlier than VTM’s hardware commit – can stall other transactions of the process
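One reason page granularity counts as a drawback is that coarser tracking can report conflicts that finer tracking would not. A small worked example, assuming 64-byte cache lines and 4 KiB pages (both sizes, and the two addresses, are illustrative assumptions):

```python
CACHE_LINE_BYTES = 64    # assumed cache-line size
PAGE_BYTES = 4096        # assumed page size

def conflicts(write_addr, read_addr, granularity):
    """Two accesses conflict if they fall in the same tracking unit."""
    return write_addr // granularity == read_addr // granularity

a = 0x1000   # written by transaction A
b = 0x1F00   # read by transaction B: same page as a, different cache line

line_conflict = conflicts(a, b, CACHE_LINE_BYTES)   # line-granularity tracking
page_conflict = conflicts(a, b, PAGE_BYTES)         # page-granularity tracking
```

Here the two accesses touch unrelated data, so tracking by cache line sees no conflict, while tracking by page treats them as conflicting.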

  7. Comparing the approaches

  8. An observation • Small transactions get things done in the hardware • Large transactions spill the buffers and TM switches to virtual mode • What about transactions whose size varies? • What if everything fits in the buffers again? • Can we switch back to hardware mode?

  9. Towards improving virtualization • Permissions-only cache – significantly reduces the chance of overflowing the buffers • At the cost of a little extra hardware • Large transactions, already assumed to be infrequent, become rarer still • Large transactions are serialized and handled one at a time
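Serializing large transactions can be pictured as a single system-wide slot that an overflowed transaction must hold. The sketch below is a hypothetical illustration of that policy only; `OverflowGate` and its methods are invented names, not part of any proposal described on the slides.

```python
class OverflowGate:
    """Admits at most one overflowed ("large") transaction at a time."""
    def __init__(self):
        self.holder = None

    def try_overflow(self, tx_id):
        # Admit the transaction if the slot is free or it already holds it.
        if self.holder is None or self.holder == tx_id:
            self.holder = tx_id
            return True
        return False   # another large transaction is active: stall or abort

    def finish(self, tx_id):
        # Free the slot when the large transaction commits or aborts.
        if self.holder == tx_id:
            self.holder = None

gate = OverflowGate()
first = gate.try_overflow("T1")    # T1 overflows its buffers and is admitted
second = gate.try_overflow("T2")   # T2 also overflows but has to wait
gate.finish("T1")
third = gate.try_overflow("T2")    # admitted once T1 is done
```

The policy is simple, but a transaction that overflows while the slot is taken makes no progress until the current holder finishes.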

  10. Towards improving virtualization

  11. Do we always have only a few large transactions? • For now: yes • In the future: maybe not • I/O and blocking system calls might need to be atomic • How do the approaches discussed earlier fare? • VTM – complex hardware • XTM – complications with OS and page granularity • OneTM – can lead to starvation!

  12. TokenTM • Uses tokens to monitor memory blocks • To read, you get a token • To write, you need to get every token • Rigorous bookkeeping – blocks are tracked in caches, memory and disk • Handles large transactions gracefully • Except for conflicts, transaction speed is unaffected by large transactions in other threads
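The read-one-token, write-every-token rule can be sketched as below. The token count per block and the `Block` class are assumptions for illustration; the actual bookkeeping across caches, memory and disk is far more involved than this.

```python
TOKENS_PER_BLOCK = 4   # assumed; kept small for illustration

class Block:
    """Tracks which transactions hold tokens for one memory block."""
    def __init__(self):
        self.held = {}   # tx_id -> number of tokens held

    def free_tokens(self):
        return TOKENS_PER_BLOCK - sum(self.held.values())

    def acquire_read(self, tx):
        # A reader needs a single token.
        if self.held.get(tx, 0) >= 1:
            return True
        if self.free_tokens() >= 1:
            self.held[tx] = 1
            return True
        return False

    def acquire_write(self, tx):
        # A writer needs every token, so no other transaction may hold one.
        mine = self.held.get(tx, 0)
        if mine + self.free_tokens() == TOKENS_PER_BLOCK:
            self.held[tx] = TOKENS_PER_BLOCK
            return True
        return False

    def release(self, tx):
        # Return all of tx's tokens at commit or abort.
        self.held.pop(tx, None)

blk = Block()
read_a = blk.acquire_read("A")            # readers can share the block
read_b = blk.acquire_read("B")
write_c_early = blk.acquire_write("C")    # blocked: A and B hold tokens
blk.release("A")
blk.release("B")
write_c_late = blk.acquire_write("C")     # all tokens free: write proceeds
read_a_again = blk.acquire_read("A")      # blocked: C holds every token
```

Multiple readers coexist, while a writer conflicts with any other token holder, which is how the scheme detects conflicts per block without bounding transaction size.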

  13. TokenTM Downsides • Small transactions suffer(?) • L1-cache-sized transactions can work at hardware speed… but: • Need flash-clear and flash-OR circuits in the L1 cache • Requires a very involved ad hoc representation • …or taking a 3% overhead hit • Optimizes the rare large case to the detriment of the frequent small case?

  14. Conclusion • Sun Research’s Transactional Memory Spotlight: More recent proposals for “unbounded” HTM aim to overcome these disadvantages, but Sun Labs researchers came to the conclusion that the proposals were sufficiently complex and risky that they were unlikely to be adopted in mainstream commercial processor designs in the near future.
