
Tornado: Maximizing Locality and Concurrency in a Shared Memory Multiprocessor Operating System


Presentation Transcript


  1. Tornado: Maximizing Locality and Concurrency in a Shared Memory Multiprocessor Operating System • Ben Gamsa, Orran Krieger, Jonathan Appavoo, Michael Stumm • Presented by: Priya Limaye

  2. Locality • What is Locality of reference?

  3. Locality • What is Locality of reference? • Example: sum = 0; for (int i = 0; i < 10; i++) { sum = sum + number[i]; }

  4. Locality • What is Locality of reference? • Temporal Locality: recently accessed data and instructions are likely to be accessed again in the near future. • Example: sum = 0; for (int i = 0; i < 10; i++) { sum = sum + number[i]; }

  5. Locality • What is Locality of reference? • Spatial Locality: data and instructions close to recently accessed data and instructions are likely to be accessed in the near future. • Example: sum = 0; for (int i = 0; i < 10; i++) { sum = sum + number[i]; }

  6. Locality • What is Locality of reference? • Recently accessed data and instructions and nearby data and instructions are likely to be accessed in the near future. • Grab a larger chunk than you immediately need • Once you’ve grabbed a chunk, keep it
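
The "grab a larger chunk" point can be illustrated with a minimal sketch. It assumes a matrix stored in row-major order; the function names are hypothetical and do not come from the slides. Both functions compute the same sum, but the first walks consecutive addresses while the second jumps `cols` elements between accesses, so on large matrices the first typically makes better use of each fetched cache line.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sums a rows x cols matrix stored in row-major order, row by row.
// Consecutive accesses touch adjacent addresses (good spatial locality).
long sum_row_by_row(const std::vector<int>& m, std::size_t rows, std::size_t cols) {
    assert(m.size() == rows * cols);
    long sum = 0;
    for (std::size_t r = 0; r < rows; ++r)
        for (std::size_t c = 0; c < cols; ++c)
            sum += m[r * cols + c];
    return sum;
}

// Same result, column by column. Consecutive accesses are cols elements
// apart, so each fetched cache line is used less before it is evicted.
long sum_column_by_column(const std::vector<int>& m, std::size_t rows, std::size_t cols) {
    assert(m.size() == rows * cols);
    long sum = 0;
    for (std::size_t c = 0; c < cols; ++c)
        for (std::size_t r = 0; r < rows; ++r)
            sum += m[r * cols + c];
    return sum;
}
```

For a 2 x 3 matrix holding 1 through 6, both functions return 21; only the access order differs.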

  7. Locality in multiprocessor • Computation depends on data local to the processor • Each processor uses data from its own cache • Once data is brought into the cache, it stays there

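  8. Locality in multiprocessor • [Diagram: two CPUs, each with its own cache, above a shared memory that holds a counter]

  9. Counter: Shared • [Diagram: two CPUs; memory holds the counter, value 0]

  10. Counter: Shared • [Diagram: one CPU has the counter (0) in its cache; memory holds 0]

  11. Counter: Shared • [Diagram: one CPU holds the counter with value 1; memory holds 1]

  12. Counter: Shared • Read: OK • [Diagram: both CPUs hold the counter (1) in their caches; memory holds 1]

  13. Counter: Shared • Invalidate • [Diagram: one CPU holds the counter with value 2, the other CPU no longer holds a copy; memory holds 2]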

  14. Comparing counter • The shared counter scales well with old architecture • It performs worse with a shared memory multiprocessor
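
A minimal sketch of the shared counter from the slides above, assuming C++11 atomics. The type and function names are hypothetical. Every increment from any CPU writes the same memory location, which is why the cache line holding it keeps being invalidated in the other CPUs' caches.

```cpp
#include <atomic>
#include <cassert>

// One counter shared by all CPUs. Each increment is a write to the same
// location, so the line holding `value` bounces between caches.
struct SharedCounter {
    std::atomic<long> value{0};

    void increment() { value.fetch_add(1, std::memory_order_relaxed); }
    long read() const { return value.load(std::memory_order_relaxed); }
};

// Usage sketch: two increments, as in the slide sequence (0 -> 1 -> 2).
inline void shared_counter_example() {
    SharedCounter counter;
    counter.increment();
    counter.increment();
    assert(counter.read() == 2);
}
```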

  15. Counter: Array • Sharing requires moving data back and forth between CPU caches • Split the counter into an array • Each CPU gets its own counter

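  16. Counter: Array • [Diagram: two CPUs; memory holds a two-element counter array: 0, 0]

  17. Counter: Array • [Diagram: one CPU holds its own element with value 1; array in memory: 1, 0]

  18. Counter: Array • [Diagram: each CPU holds its own element with value 1; array in memory: 1, 1]

  19. Counter: Array • Read Counter: add all counters (1 + 1) • [Diagram: the reading CPU obtains 2; array in memory: 1, 1]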
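
The array counter can be sketched as follows, under the assumption of two CPUs (as in the diagrams) and with the caller passing its CPU index explicitly. All names are hypothetical. Note that the elements sit next to each other in memory, so they will usually share a cache line; the false-sharing slides that follow pick up on exactly that.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>

// One counter element per CPU; a read adds all elements together.
struct ArrayCounter {
    static constexpr std::size_t kNumCpus = 2;  // assumption: two CPUs, as in the slides
    std::array<std::atomic<long>, kNumCpus> slots;

    ArrayCounter() {
        for (auto& s : slots) s.store(0, std::memory_order_relaxed);
    }

    // Each CPU updates only its own element.
    void increment(std::size_t cpu) {
        assert(cpu < kNumCpus);
        slots[cpu].fetch_add(1, std::memory_order_relaxed);
    }

    // Reading the counter means summing every per-CPU element.
    long read() const {
        long total = 0;
        for (const auto& s : slots) total += s.load(std::memory_order_relaxed);
        return total;
    }
};
```

After `increment(0)` and `increment(1)`, `read()` returns 2, matching the "1 + 1" on slide 19.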

  20. Counter: Array • This solves the problem • What about performance?

  21. Comparing counter • The array counter does not perform better than the ‘shared counter’.

  22. Counter: Array • This solves the problem • What about performance? • What about false sharing?

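  23. Counter: False Sharing • [Diagram: two CPUs; memory holds both counters together: 0,0]

  24. Counter: False Sharing • [Diagram: one CPU has both counters (0,0) in its cache; memory: 0,0]

  25. Counter: False Sharing • Sharing • [Diagram: both CPUs have both counters (0,0) in their caches; memory: 0,0]

  26. Counter: False Sharing • Invalidate • [Diagram: one CPU holds 1,0, the other CPU no longer holds a copy; memory: 1,0]

  27. Counter: False Sharing • Sharing • [Diagram: both CPUs hold 1,0 in their caches; memory: 1,0]

  28. Counter: False Sharing • Invalidate • [Diagram: one CPU holds 1,1, the other CPU no longer holds a copy; memory: 1,1]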

  29. Solution? • Use a padded array • Different elements map to different cache lines

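  30. Counter: Padded Array • [Diagram: two CPUs; memory holds two separately placed counters: 0 and 0]

  31. Counter: Padded Array • Updates are independent of each other • [Diagram: each CPU holds its own counter with value 1; memory: 1 and 1]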
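
A minimal sketch of the padded array, assuming 64-byte cache lines (the slides do not state a line size) and two CPUs. The names are hypothetical. Aligning each element to the line size also rounds its size up to a full line, so no two counters can land in the same line.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

constexpr std::size_t kCacheLineBytes = 64;  // assumption: 64-byte cache lines

// alignas pads each slot out to a whole cache line.
struct alignas(kCacheLineBytes) PaddedSlot {
    std::atomic<long> value{0};
};
static_assert(sizeof(PaddedSlot) == kCacheLineBytes, "one slot per cache line");

struct PaddedArrayCounter {
    static constexpr std::size_t kNumCpus = 2;  // assumption: two CPUs, as in the slides
    PaddedSlot slots[kNumCpus];

    // Each CPU writes only its own cache line.
    void increment(std::size_t cpu) {
        assert(cpu < kNumCpus);
        slots[cpu].value.fetch_add(1, std::memory_order_relaxed);
    }

    // A read still has to visit every per-CPU slot.
    long read() const {
        long total = 0;
        for (const auto& s : slots) total += s.value.load(std::memory_order_relaxed);
        return total;
    }
};
```

The trade-off is space: each counter now occupies a full line instead of a few bytes, in exchange for updates that do not invalidate other CPUs' lines.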

  32. Comparing counter • The padded array works better

  33. Locality in OS • Serious performance impact • Difficult to retrofit • Tornado • Ground up design • Object Oriented approach – Natural locality

  34. Tornado • Object Oriented Approach • Clustered Objects • Protected Procedure Call • Semi-automatic garbage collection • Simplified locking protocol

  35. Object Oriented Approach • [Diagram: a Process Table with entries Process 1, Process 2, …]

  36. Object Oriented Approach • [Diagram: a request for Process 1 takes a lock on the Process Table]

  37. Object Oriented Approach • [Diagram: a request for Process 2 arrives while the Process Table is locked for Process 1]

  38. Object Oriented Approach • [Diagram: a second lock, for the Process 2 request, is shown on the same Process Table]

  39. Object Oriented Approach • class ProcessTableEntry { data; lock; code }

  40. Object Oriented Approach • Each resource is represented by a different object • Requests to virtual resources are handled independently • No shared data structure access • No shared locks
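
The `ProcessTableEntry` idea from slide 39 can be sketched as a class that carries its own data, lock, and code. This is an illustration only: the fields (`pid_`, `state_`) and methods are hypothetical and are not taken from Tornado's source.

```cpp
#include <cassert>
#include <mutex>
#include <string>

// One object per process. Its lock protects only this entry, so requests
// for different processes never contend on a table-wide lock.
class ProcessTableEntry {
public:
    explicit ProcessTableEntry(int pid) : pid_(pid) { assert(pid >= 0); }

    void set_state(const std::string& state) {
        std::lock_guard<std::mutex> guard(lock_);  // locks this entry only
        state_ = state;
    }

    std::string state() const {
        std::lock_guard<std::mutex> guard(lock_);
        return state_;
    }

    int pid() const { return pid_; }

private:
    const int pid_;                 // data
    mutable std::mutex lock_;       // lock
    std::string state_ = "ready";   // data guarded by lock_
};
```

Updating the entry for process 1 takes only that entry's lock; the entry for process 2 can be read or updated at the same time.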

  41. Object Oriented Approach • [Diagram: Page Fault Exception, Process]

  42. Object Oriented Approach • [Diagram: Page Fault Exception, Process, two Regions]

  43. Object Oriented Approach • [Diagram: Page Fault Exception, Process, two Regions, each with an FCM] • FCM: File Cache Manager

  44. Object Oriented Approach • Search for responsible region • [Diagram: Page Fault Exception, Process, HAT, two Regions, each with an FCM] • HAT: Hardware Address Translation • FCM: File Cache Manager

  45. Object Oriented Approach • [Diagram: Page Fault Exception, Process, two Regions, each with an FCM, COR, DRAM] • FCM: File Cache Manager • COR: Cached Object Representative • DRAM: Memory manager
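
One plausible reading of the page-fault diagrams, sketched as code. The slides only name the objects, so the call order, every method name, and the 4096-byte page size are assumptions made for illustration. In this sketch the Process finds the responsible Region, the Region asks its FCM for a frame, the FCM on a miss gets a frame from DRAM and asks the COR to fill it, and the HAT records the mapping.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

using Addr = std::uint64_t;
constexpr Addr kPageSize = 4096;  // assumption

struct DRAM {  // memory manager: hands out page frames
    Addr next_frame = 0;
    Addr allocate_frame() { return next_frame++; }
};

struct COR {  // Cached Object Representative: would fill a frame from the file
    int fills = 0;
    void fill(Addr /*frame*/, Addr /*file_offset*/) { ++fills; }  // I/O elided
};

struct FCM {  // File Cache Manager: tracks which file pages are resident
    DRAM& dram;
    COR& cor;
    std::map<Addr, Addr> resident;  // file offset -> frame
    FCM(DRAM& d, COR& c) : dram(d), cor(c) {}
    Addr get_frame(Addr file_offset) {
        auto it = resident.find(file_offset);
        if (it != resident.end()) return it->second;  // already cached
        Addr frame = dram.allocate_frame();
        cor.fill(frame, file_offset);
        resident[file_offset] = frame;
        return frame;
    }
};

struct HAT {  // Hardware Address Translation: virtual page -> frame
    std::map<Addr, Addr> mappings;
    void map(Addr vpage, Addr frame) { mappings[vpage] = frame; }
};

struct Region {  // page-aligned address range backed by one FCM
    Addr start, length;
    FCM& fcm;
    bool contains(Addr a) const { return a >= start && a < start + length; }
    void handle_fault(Addr addr, HAT& hat) {
        assert(contains(addr));
        Addr vpage = addr / kPageSize * kPageSize;
        hat.map(vpage, fcm.get_frame(vpage - start));
    }
};

struct Process {
    HAT hat;
    std::vector<Region> regions;
    // Search for the responsible region and forward the fault to it.
    bool page_fault(Addr addr) {
        for (Region& r : regions)
            if (r.contains(addr)) { r.handle_fault(addr, hat); return true; }
        return false;  // no region covers this address
    }
};
```

Because each fault touches only the objects for one process, one region, and one file, faults in unrelated regions share no data structures or locks in this sketch.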

  46. Object Oriented Approach • Multiple implementations for system objects • Dynamically change the objects used for a resource • Provides the foundation for other Tornado features

  47. Clustered Objects • Improve locality for widely shared objects • Appears as a single object • Composed of multiple component objects • Has representatives (‘reps’) for processors • Defines the degree of clustering • Common clustered object reference for clients

  48. Clustered Objects

  49. Clustered Objects : Implementation

  50. Clustered Objects : Implementation • A translation table per processor • Located at the same virtual address on every processor • Each entry is a pointer to a rep • A clustered object reference is just a pointer into the table • ‘reps’ are created on demand when first accessed • Special global miss handling object
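
The translation-table mechanism can be sketched as follows. This is a simplified model, not Tornado's implementation: the reference is a table index rather than a virtual address, the miss handler is passed in explicitly instead of being a global object, and all class names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical rep: the per-processor piece of a clustered counter.
struct CounterRep {
    long local = 0;
};

// Stands in for miss handling: creates reps on demand and can reach all of them.
class CounterRoot {
public:
    CounterRep* create_rep() {
        reps_.push_back(std::unique_ptr<CounterRep>(new CounterRep()));
        return reps_.back().get();
    }
    long total() const {
        long sum = 0;
        for (const auto& r : reps_) sum += r->local;
        return sum;
    }
    std::size_t rep_count() const { return reps_.size(); }

private:
    std::vector<std::unique_ptr<CounterRep>> reps_;
};

// One table per processor. The same reference (index) is used on every
// processor, but each table resolves it to that processor's own rep.
class TranslationTable {
public:
    explicit TranslationTable(std::size_t entries) : slots_(entries, nullptr) {}

    CounterRep* lookup(std::size_t ref, CounterRoot& root) {
        assert(ref < slots_.size());
        if (slots_[ref] == nullptr)          // miss: first access on this processor
            slots_[ref] = root.create_rep();
        return slots_[ref];
    }

private:
    std::vector<CounterRep*> slots_;
};
```

Clients on two processors use the same reference and see one logical counter, while each update goes to a processor-local rep; no rep exists for a processor until it first touches the object.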
