
Transactional Memory

Transactional Memory. The Transactional Koolaid Acid Test. Herlihy & Moss -or-. Presented by Chris Rossbach. Compulsory Outline Slide. Motivation Transactional Memory Concept/Design Evaluation Discussion Conclusion/Questions. Motivation--Lock-based synchronization is hard.


Presentation Transcript


  1. Transactional Memory The Transactional Koolaid Acid Test Herlihy & Moss -or- Presented by Chris Rossbach

  2. Compulsory Outline Slide • Motivation • Transactional Memory Concept/Design • Evaluation • Discussion • Conclusion/Questions

  3. Motivation--Lock-based synchronization is hard • Priority Inversion • Convoying • Deadlock • Locks don’t compose • Coarse-grain locks lose opportunities for parallelism • Fine-grain locking can be fast, but is hard to get right • More recently, the impending CMP era increases the need to exploit thread-level parallelism

  4. Transactional Memory as a programming model
     void my_noncomposable_func() {
         acquire_lock(&bad_dangerous_lock);
         // enjoy the mutual exclusion!
         party_on_my_data_structure();
         release_lock(&bad_dangerous_lock);
     }
     BECOMES:
     void my_new_improved_func() {
         begin_transaction();
         // enjoy the mutual exclusion!
         party_on_my_data_structure();
         end_transaction();
     }
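To make the composition point concrete, here is a minimal sketch in C. It is not the hardware mechanism from the talk: `begin_transaction()` and `end_transaction()` are emulated with one global spin lock plus a per-thread nesting counter, both of which are assumptions made only so the example runs. The `account`, `withdraw`, `deposit` and `transfer` names are hypothetical. What it shows is the programming model: two operations that are each atomic can be wrapped in an outer transaction and stay atomic as a unit, with no lock ordering for the caller to get wrong.

```c
#include <stdatomic.h>

/* Assumption: one global spin lock stands in for the transactional hardware.
   It gives the same atomicity guarantee, but none of the parallelism. */
static atomic_flag tx_global = ATOMIC_FLAG_INIT;

/* Assumption: flat nesting, tracked per thread, so transactions compose. */
static _Thread_local int tx_depth = 0;

static void begin_transaction(void)
{
    if (tx_depth++ == 0) {
        /* Only the outermost begin takes the lock. */
        while (atomic_flag_test_and_set_explicit(&tx_global,
                                                 memory_order_acquire)) {
            /* spin until the current owner finishes */
        }
    }
}

static void end_transaction(void)
{
    if (--tx_depth == 0) {
        /* Only the outermost end makes the changes visible to others. */
        atomic_flag_clear_explicit(&tx_global, memory_order_release);
    }
}

struct account { long balance; };   /* hypothetical data structure */

static void withdraw(struct account *a, long n)
{
    begin_transaction();
    a->balance -= n;
    end_transaction();
}

static void deposit(struct account *a, long n)
{
    begin_transaction();
    a->balance += n;
    end_transaction();
}

/* Composes two atomic operations into one larger atomic operation. */
static void transfer(struct account *from, struct account *to, long n)
{
    begin_transaction();
    withdraw(from, n);
    deposit(to, n);
    end_transaction();
}
```

With per-account locks, `transfer` would have to know which locks `withdraw` and `deposit` take and in what order; here it only needs the outer begin/end pair.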

  5. Transactions in Hardware • Serializable • Atomic • Obey ACI properties (not D!) ISA Enhancements: • LT - transactional load • LTX - transactional load before ST • ST - transactional store • COMMIT - make transactional changes permanent • VALIDATE - check current transaction status for violations/conflicts • ABORT - discard transactional changes
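The semantics of these operations can be sketched as a small single-threaded software model in C. Everything here is an assumption made for illustration: the `struct tx` write buffer, the `tx_*` function names, the fixed buffer size, and the choice to treat a full buffer as a failed transaction. Conflict detection by other processors is not modelled; the `ok` flag only stands in for the status that VALIDATE and COMMIT would report.

```c
#include <stdbool.h>
#include <stddef.h>

#define TX_MAX_WRITES 8   /* assumption: tiny fixed-size write buffer */

struct tx {
    bool   ok;                    /* false once the transaction has failed */
    size_t n;                     /* number of buffered writes */
    long  *addr[TX_MAX_WRITES];   /* locations written tentatively */
    long   val[TX_MAX_WRITES];    /* tentative values for those locations */
};

static void tx_begin(struct tx *t)
{
    t->ok = true;
    t->n = 0;
}

/* LT: transactional load; sees this transaction's own tentative writes. */
static long tx_lt(const struct tx *t, const long *p)
{
    for (size_t i = 0; i < t->n; i++)
        if (t->addr[i] == p)
            return t->val[i];
    return *p;
}

/* ST: transactional store; buffered, memory is untouched until commit. */
static void tx_st(struct tx *t, long *p, long v)
{
    for (size_t i = 0; i < t->n; i++) {
        if (t->addr[i] == p) {
            t->val[i] = v;
            return;
        }
    }
    if (t->n == TX_MAX_WRITES) {  /* assumption: overflow fails the tx */
        t->ok = false;
        return;
    }
    t->addr[t->n] = p;
    t->val[t->n] = v;
    t->n++;
}

/* VALIDATE: report whether the transaction can still succeed. */
static bool tx_validate(const struct tx *t)
{
    return t->ok;
}

/* ABORT: discard all tentative changes. */
static void tx_abort(struct tx *t)
{
    t->n = 0;
    t->ok = false;
}

/* COMMIT: make tentative changes permanent; returns false on failure. */
static bool tx_commit(struct tx *t)
{
    bool ok = t->ok;
    if (ok)
        for (size_t i = 0; i < t->n; i++)
            *t->addr[i] = t->val[i];
    t->n = 0;
    return ok;
}
```

The property to notice is that a tentative store changes nothing in memory until `tx_commit` succeeds, which is what lets an abort be a purely local operation.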

  6. Shared Counter Example
     LOCK-BASED
     my_silly_func() {
         spin_lock(&shrc_lock);
         shared_counter++;
         spin_unlock(&shrc_lock);
     }
     1:  lock; decb shrc_lock
         jns 3f
     2:  pause
         cmpb $0, shrc_lock
         jle 2b
         jmp 1b
     3:  mov $3, shared_counter
         add $3, 1
         mov shared_counter, $3
         movb $1, shrc_lock
     TRANSACTIONAL
     my_silly_func() {
         atomic {
             shared_counter++;
         }
     }
         LTX R1, shared_counter
         VALIDATE
         ADD R1, 1
         ST shared_counter, R1
         COMMIT
     Is this actually better?
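For comparison, the closest portable analog of the transactional counter on hardware without these instructions is a compare-and-swap retry loop, sketched below with C11 atomics. This is a swapped-in technique, not the mechanism from the talk: it only covers a single word, and the explicit retry loop is an assumption (the slide's instruction sequence does not show what happens when the commit fails). The function name `counter_increment` is hypothetical.

```c
#include <stdatomic.h>

/* Static storage, so the counter starts at zero. */
static atomic_long shared_counter;

/* Optimistically read, compute, then try to publish; retry on conflict.
   Returns the value this call installed. */
static long counter_increment(void)
{
    /* Plays the role of the transactional load. */
    long seen = atomic_load_explicit(&shared_counter, memory_order_relaxed);

    /* Plays the role of store + commit: succeeds only if nobody changed
       the counter since we read it. On failure, `seen` is refreshed with
       the current value and the new sum is recomputed on the next pass. */
    while (!atomic_compare_exchange_weak_explicit(
               &shared_counter, &seen, seen + 1,
               memory_order_acq_rel, memory_order_relaxed)) {
        /* conflict detected, loop and try again */
    }
    return seen + 1;
}
```

Like the transactional version, no thread ever holds a lock here, so a descheduled thread cannot block the others; unlike it, the approach does not extend to updates that span several memory locations.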

  7. Implementation • Generalization of LL/SC mechanisms • Extend cache coherence protocol: if you can detect access conflicts (think MSI or variants), you can detect TX conflicts • Need an additional TX cache with augmented state [EMPTY, NORMAL, XCOMMIT, XABORT], exclusive with the non-TX cache • Per-processor TACTIVE and TSTATUS flags • Commit/Abort are local to a cache; writes are buffered in the cache until commit
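Why commit and abort can be local to a cache is easier to see with a sketch of the four tags. The model below assumes the tag semantics described in the Herlihy and Moss paper, which the slide does not spell out: a transactional write keeps two entries for the line, the old value tagged XCOMMIT (discard on commit) and the new value tagged XABORT (discard on abort). The `struct tx_entry` layout and the function names are hypothetical.

```c
#include <stddef.h>

enum tx_tag { EMPTY, NORMAL, XCOMMIT, XABORT };

struct tx_entry {
    enum tx_tag tag;
    int  line;    /* hypothetical identifier for the cached line */
    long value;
};

/* Commit: old copies are dropped, tentative copies become the real data. */
static void tx_cache_commit(struct tx_entry *e, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (e[i].tag == XCOMMIT)
            e[i].tag = EMPTY;
        else if (e[i].tag == XABORT)
            e[i].tag = NORMAL;
    }
}

/* Abort: tentative copies are dropped, old copies are restored. */
static void tx_cache_abort(struct tx_entry *e, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (e[i].tag == XABORT)
            e[i].tag = EMPTY;
        else if (e[i].tag == XCOMMIT)
            e[i].tag = NORMAL;
    }
}

/* Look up the committed (NORMAL) value of a line.
   Returns 0 and fills *out on a hit, -1 on a miss. */
static int tx_cache_lookup(const struct tx_entry *e, size_t n,
                           int line, long *out)
{
    for (size_t i = 0; i < n; i++) {
        if (e[i].tag == NORMAL && e[i].line == line) {
            *out = e[i].value;
            return 0;
        }
    }
    return -1;
}
```

Both operations are a pass over tags with no data movement and no bus traffic, which is the sense in which they are local to the cache.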

  8. Evaluation • Shared Counter Benchmark: Performance / Bus Usage • TTS = test-and-test-and-set, MCS = queueing lock, QOSB = queueing lock, LL/SC = load-locked/store-conditional
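For readers unfamiliar with the simplest baseline in that list, a test-and-test-and-set lock can be sketched with C11 atomics as follows. This is a generic textbook form, not the code used in the evaluation; the `tts_lock` type and function names are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct tts_lock { atomic_bool held; };

static void tts_acquire(struct tts_lock *l)
{
    for (;;) {
        /* Test: spin on a plain read, which can be served from the local
           cache and generates no bus traffic while the lock is held. */
        while (atomic_load_explicit(&l->held, memory_order_relaxed)) {
            /* busy-wait */
        }
        /* Test-and-set: only now attempt the expensive atomic write. */
        if (!atomic_exchange_explicit(&l->held, true, memory_order_acquire))
            return;   /* previous value was false, so we own the lock */
    }
}

static void tts_release(struct tts_lock *l)
{
    atomic_store_explicit(&l->held, false, memory_order_release);
}
```

The read-before-write step is what separates TTS from a plain test-and-set lock, and it is the reason bus usage is worth reporting alongside raw performance.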

  9. Why Isn’t Everyone using this?

  10. Why Isn’t Everyone using this? • Benchmarks: shared counter, linked list, producer/consumer!? • How do you virtualize this? Cache overflow? • What happens on a context switch? • What happens on an interrupt? • Is this really easier to program?

  11. Conclusion I hereby conclude. Questions?
