
Heap Shape Scalability Scalable Garbage Collection on Highly Parallel Platforms

Kathy Barabash, Erez Petrank. Computer Science Department, Technion, Israel.


Presentation Transcript


  1. Heap Shape Scalability: Scalable Garbage Collection on Highly Parallel Platforms. Kathy Barabash, Erez Petrank. Computer Science Department, Technion, Israel

  2. Outline • Is tracing GC ready for the many-core? • How is the heap shape related? • Evaluating the heap shape scalability: Idealized Trace Utilization • Improving the heap shape scalability: Solution 1, Reshaping with Shortcut References; Solution 2, Tracing with Speculative Roots • Related work & conclusion ISMM 2010

  3. Is Tracing GC Ready for Many-core? • GC tracing: traverse lots of objects • Sequential trace: each live object is touched (BFS, DFS) • Parallel trace: load balancing • 1K cores really soon [Figure: heap object graph with objects a to m reachable from the roots]

  4. Can Heaps Spoil the Scalability? • 4M live objects in a single linked list • Sequential trace: 4M steps • Parallel trace: not any faster [Figure: roots pointing to a single linked list 1, 2, 3, ... in the heap; figure labels: 4M, 4K]

  5. Deep Object Graphs Can be Evil • Definition: Object Depth is the length of the minimal path from some root object; Object-Graph Depth is the maximal live object depth • Example: [Figure: heap with objects at depths 0, 1, 2, 3] • How deep are the object graphs of Java programs? SpecJVM, DaCapo, SpecJBB, measured with an instrumented BFS trace
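The two definitions on slide 5 can be computed with a plain breadth-first traversal. The sketch below is only an illustration, not the instrumented tracer used in the talk; the adjacency-map heap representation (`refs`) and the function names are assumptions made here.

```python
from collections import deque


def object_depths(roots, refs):
    """Map every reachable object to its depth; root objects are at depth 0."""
    depth = {root: 0 for root in roots}
    queue = deque(depth)
    while queue:
        obj = queue.popleft()
        for child in refs.get(obj, ()):
            if child not in depth:  # BFS: the first visit is the shortest path
                depth[child] = depth[obj] + 1
                queue.append(child)
    return depth


def object_graph_depth(roots, refs):
    """Maximal depth over all live (reachable) objects."""
    return max(object_depths(roots, refs).values(), default=0)


# Hypothetical toy heap: r -> a, b; a -> c; b -> c; c -> d
toy_refs = {"r": ["a", "b"], "a": ["c"], "b": ["c"], "c": ["d"]}
print(object_depths(["r"], toy_refs))
print(object_graph_depth(["r"], toy_refs))
```

Because BFS reaches each object first along a shortest path, the recorded depth is the minimal path length from a root, as the definition requires.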

  6. Object-Graph Depths of Java Benchmarks

  7. Object-Graph Depths of Java Benchmarks

  8. Object-Graph Depths of Java Benchmarks

  9. Not all Deep Object Graphs are Evil • Object graph: 1K same-sized linked lists of 4K objects each • Sequential trace: 4M steps • Parallel trace: scales well for up to 1K processors [Figure: roots pointing to 1K linked lists, each with objects 1, 2, 3, ..., 4K]

  10. Deep and Narrow Object Graphs are Evil • Definition: Object Depths Distribution is the number of objects at each depth • Example: [Figure: small heap with the object count at each depth] • Graphical representation (object-graph shape): plot of # objects against depth
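The object depths distribution on slide 10 is a histogram over BFS depths. A minimal sketch, assuming the heap is given as an adjacency map (the names `depth_distribution`, `deep_narrow`, and `wide` are hypothetical):

```python
from collections import Counter, deque


def depth_distribution(roots, refs):
    """Return a list whose entry d is the number of live objects at depth d."""
    depth = {root: 0 for root in roots}
    queue = deque(depth)
    while queue:
        obj = queue.popleft()
        for child in refs.get(obj, ()):
            if child not in depth:
                depth[child] = depth[obj] + 1
                queue.append(child)
    counts = Counter(depth.values())
    return [counts[d] for d in range(max(counts, default=-1) + 1)]


# Deep and narrow: one linked list of 8 objects (one object per depth).
deep_narrow = {i: [i + 1] for i in range(7)}
# Same object count, but four independent lists of 2 objects each.
wide = {0: [1], 2: [3], 4: [5], 6: [7]}

print(depth_distribution([0], deep_narrow))
print(depth_distribution([0, 2, 4, 6], wide))
```

A shape that stays at one object per depth leaves a parallel tracer nothing to share, whereas a wide shape offers several objects per level.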

  11. Object-Graph Shapes of Java Benchmarks [Figure: # objects vs. depth for jython and xalan]

  12. Object-Graph Shapes of Java Benchmarks [Figure: # objects (log 10) vs. depth (log 10) for db, jython, jess, bloat, jack, javac, lusearch, mtrt, hsqldb, xalan, antlr, pmd]

  13. The Idealized Trace Utilization • Simulate the idealized traversal by N threads: perfect load balancing; perfect cache behavior; BFS traversal; single time tick object scan • During the traversal, count: objects available to be scanned at every time tick; processor slots (some are busy and some are wasted) • At the end, report the utilization: ITU = (Total Scanned Objects / Total Processor Slots) * 100%

  14. Idealized Trace Utilization Example • 4 tracers (Core 1 to Core 4) • Time ticks: 1 2 3 4 5 6 7 8 • Scanned objects (cumulative): 2 5 9 11 12 13 14 15 • ITU = (Total Scanned Objects / Total Processor Slots) * 100% = 15 / (8*4) * 100% = 47%
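The idealized traversal can be simulated in a few lines. This is a sketch under stated assumptions rather than the authors' simulator: the heap is an adjacency map, objects discovered during a tick become available only at the next tick, and the 15-object graph below is made up here so that it also takes 8 ticks on 4 tracers (it is not the heap drawn on the slide).

```python
from collections import deque


def idealized_trace_utilization(roots, refs, n_tracers):
    """Return ITU as a fraction: scanned objects / (time ticks * tracers)."""
    marked = set(roots)
    frontier = deque(dict.fromkeys(roots))  # objects available to be scanned
    scanned = ticks = 0
    while frontier:
        ticks += 1
        # Perfect load balancing: up to n_tracers objects are scanned per tick.
        batch = [frontier.popleft() for _ in range(min(n_tracers, len(frontier)))]
        for obj in batch:
            scanned += 1
            for child in refs.get(obj, ()):
                if child not in marked:
                    marked.add(child)
                    frontier.append(child)  # available from the next tick on
    return scanned / (ticks * n_tracers) if ticks else 0.0


# Hypothetical 15-object heap: per-tick scans are 2, 3, 4, 2, 1, 1, 1, 1.
example = {
    "r0": ["a", "b"], "r1": ["c"],
    "a": ["d", "e"], "b": ["f"], "c": ["g"],
    "d": ["h"], "e": ["i"],
    "h": ["j"], "j": ["k"], "k": ["l"], "l": ["m"],
}
print(idealized_trace_utilization(["r0", "r1"], example, 4))

# A single linked list keeps only one tracer busy at any time.
linked_list = {i: [i + 1] for i in range(999)}
print(idealized_trace_utilization([0], linked_list, 4))
```

On the single linked list the utilization is 1/N for N tracers, which is the situation slide 4 describes; several independent lists of equal length keep every tracer busy.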

  15. Graphical Representation • Step 1: simulate and compute • Step 2: draw the graph [Figure: # objects vs. depth]

  16. Worst Case ITU for Java Benchmarks

  17. Average ITU for Java Benchmarks

  18. What's Next? • Problematic heaps exist: javac, mtrt, pmd, bloat, xalan • Can we improve the trace scalability without modifying the benchmarks? Reshape with Shortcut References; Trace with Speculative Roots

  19. Reshape with Shortcut References • Sequential trace: 16K steps • New references are added: invisible to the program; useful for the tracers • Parallel trace: scales for 4 processors [Figure: a 16K-object linked list in the heap, with shortcut references into the list at 4K-object intervals]

  20. Evaluation Prototype • Devise a shortcut strategy: where shortcuts are needed • When the program is stopped for GC: compute the Idealized Trace Utilization; run the shortcut-adding algorithm; compute the ITU for the modified heap • Report: ITU improvement; number of shortcuts added

  21. Shortcut Strategy and Parameters • Identify candidate subgraphs: with at least size objects; with a depth-to-size ratio no less than ratio • Add a shortcut to the root of the subgraph, leading to the objects length pointers away • The next shortcut is introduced not closer than distance pointers away [Figure: chain of objects 1 to 9 annotated with Size=5, Depth=4, Ratio=0.8, Distance (2), Length (4)]
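To make the four parameters concrete, here is a heavily simplified sketch restricted to a single linked list. It is one possible reading of the slide, not the prototype's algorithm: the candidate test uses depth divided by object count (matching Size=5, Depth=4, Ratio=0.8 on the slide), and the placement rule (a shortcut from every `distance`-th object to the object `length` pointers ahead) is an assumption made here. All function names are hypothetical.

```python
from collections import deque


def is_candidate(num_objects, depth, size, ratio):
    """Assumed candidate test: big enough and deep relative to its size."""
    return num_objects >= size and depth / num_objects >= ratio


def add_chain_shortcuts(refs, n, length, distance):
    """Copy a chain 0 -> 1 -> ... -> n-1 and add shortcut references.

    Assumed placement: object i gets a shortcut to object i + length,
    for i = 0, distance, 2 * distance, ...
    """
    reshaped = {obj: list(children) for obj, children in refs.items()}
    added = 0
    for src in range(0, n - length, distance):
        reshaped.setdefault(src, []).append(src + length)
        added += 1
    return reshaped, added


def graph_depth(root, refs):
    """Object-graph depth measured by BFS from a single root."""
    depth = {root: 0}
    queue = deque([root])
    while queue:
        obj = queue.popleft()
        for child in refs.get(obj, ()):
            if child not in depth:
                depth[child] = depth[obj] + 1
                queue.append(child)
    return max(depth.values())


n = 16
chain = {i: [i + 1] for i in range(n - 1)}
if is_candidate(n, graph_depth(0, chain), size=5, ratio=0.8):
    reshaped, added = add_chain_shortcuts(chain, n, length=4, distance=2)
    print(added, graph_depth(0, chain), graph_depth(0, reshaped))
```

The shortcuts leave reachability unchanged but make the graph shallower, so more objects are available to the tracers at each step.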

  22. Results for SpecJVM mtrt • Parameters: Size=50, Ratio=0.2, Length=50, Distance=25 • ~500K live objects • Max shortcuts: 110 • Avg shortcuts: 94

  23. Results for DaCapo xalan • Parameters: Size=50, Ratio=0.2, Length=50, Distance=25 • ~400K live objects • Max shortcuts: 888 • Avg shortcuts: 536

  24. Results for DaCapo bloat • Parameters: Size=50, Ratio=0.2, Length=50, Distance=25 • ~400K live objects • Max shortcuts: 940 • Avg shortcuts: 378

  25. Results for DaCapo pmd • Parameters: Size=600, Ratio=0.1, Length=120, Distance=40 • ~434K live objects • Max shortcuts: 5,874 • Avg shortcuts: 432

  26. Results for SpecJVM javac • Parameters: Size=500, Ratio=0.1, Length=100, Distance=50 • ~383K live objects • Max shortcuts: 292 • Avg shortcuts: 16

  27. Trace with Speculative Roots • Sequential trace: 16M steps • Helper tracers: pick random roots; trace using custom colors • Parallel trace: scales for 4 processors [Figure: roots and heap; figure labels: 4M, 4K]

  28. Speculative Trace • Helper tracer: pick up the root; pick up the color, e.g. red; trace, and if a blue object is discovered, mark blue as reachable from red • Regular trace: trace from the root, and if a blue object is discovered, mark blue as live • Complete trace: all colors reachable from live colors are marked live; all objects marked by live colors survive the collection
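The three phases can be sketched as a sequential simulation. This is an illustration under assumptions, not the prototype: the helpers run before the regular trace instead of concurrently, the speculative roots are drawn uniformly from the not-yet-colored objects, and the heap is an adjacency map. All names are hypothetical.

```python
import random
from collections import defaultdict

LIVE = 0  # color used by the regular trace


def speculative_trace(roots, refs, all_objects, n_helper_roots, seed=0):
    """Return the set of objects that survive the colored trace."""
    rng = random.Random(seed)
    color_of = {}                # object -> color that marked it
    reaches = defaultdict(set)   # color -> colors discovered while tracing it

    def trace(start, color):
        stack = [start]
        while stack:
            obj = stack.pop()
            other = color_of.get(obj)
            if other is not None:
                if other != color:
                    reaches[color].add(other)  # e.g. "blue reachable from red"
                continue
            color_of[obj] = color
            stack.extend(refs.get(obj, ()))

    # Helper tracers: each picks a random unmarked object and a fresh color.
    for color in range(1, n_helper_roots + 1):
        candidates = [o for o in all_objects if o not in color_of]
        if not candidates:
            break
        trace(rng.choice(candidates), color)

    # Regular trace: colors it runs into are recorded as reachable from LIVE.
    for root in roots:
        trace(root, LIVE)

    # Completion: every color reachable from a live color is live.
    live_colors, pending = set(), [LIVE]
    while pending:
        color = pending.pop()
        if color not in live_colors:
            live_colors.add(color)
            pending.extend(reaches[color])

    return {obj for obj, color in color_of.items() if color in live_colors}


# Hypothetical heap: a -> b -> c is live; d -> e and f -> b are dead.
heap = {"a": ["b"], "b": ["c"], "d": ["e"], "f": ["b"]}
objects = ["a", "b", "c", "d", "e", "f"]
print(sorted(speculative_trace(["a"], heap, objects, n_helper_roots=2)))
```

Every live object survives, but a dead object that shares a color with a live one (for example `f`, if a helper starts there and then colors `b`) survives as well; this is the floating garbage discussed on the following slides.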

  29. Evaluation Prototype • 4 regular tracers, 4 helper tracers • Speculative roots: random unmarked objects • ITU before and after the colored trace • Useful helper work: live objects colored by live colors • Wasted helper work: dead objects colored by dead colors • Floating garbage: dead objects colored by live colors [Figure: heap object graph with objects a to m]

  30. Limit the floating garbage • Cap the number of objects colored by a single color: helpers must save discovered but not traced objects; the trace completion phase takes care of the saved fronts • Make the random root choices smarter: to avoid choosing dead objects; to reach deeper parts of the live object graph • Filter for the recursive objects: objects with referents of their own type

  31. Results • Lots of floating garbage, even with the filter • Hard to find good roots: progressively harder as the live objects are getting marked • Trace completion phase is complex: can defeat the purpose • Modest improvement in the Idealized Trace Utilization scores

  32. Results for DaCapo xalan: worst case ITU improvement, with the random choices filter

  33. Results for DaCapo bloat: worst case ITU improvement, with the random choices filter

  34. Related Work • Parallel garbage collection folklore: there are heap structures that can foil any clever load balancing scheme • Siebert (ISMM'08): reported object graph depths for SpecJVM benchmarks; proposed an upper bound on the worst case scalability as a way to compute RT guarantees for the GC tracing • Random tracing was originally proposed by Click

  35. Summary • Studied the heap shape properties of Java benchmarks: out of twenty considered benchmarks, five had non-scalable heap shapes during the run • Devised a measure to quantify the heap shape scalability: Idealized Trace Utilization • Proposed, prototyped and evaluated two approaches to improve the tracing scalability: Reshaping with Shortcuts appears to be more promising than Tracing from Speculative Roots

  36. Thank You!