
Overview


koto

Presentation Transcript


  1. Overview • Assignment 5: hints • Garbage collection • Assignment 4: solution

  2. A5 Ex1 - Barriers • Explain the difference between a read and a write barrier • Show the instrumented code generated by a compiler for p.next = q • Which barrier to use for: • Copying GC • Mark & Sweep GC

  3. A5 Ex2 – Copying collectors • Compacting and copying GCs change object addresses at each collection step. • Show how to solve the movement problem (for the 2 GC types).

  4. A5 Ex2 – Copying collectors Mark & Sweep vs. Copying GCs: • Give a rough implementation of the collection and allocation for the Copying GC • Which collector has the faster allocation? • Give an estimate of the collection cycle cost (M = heap size, R = live objects)
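One possible shape of the "rough implementation" asked for above is sketched below. It is not the course's reference solution: the names `Obj`, `SemiSpaceHeap`, `forward` and the object-count capacity are assumptions made for illustration. Allocation only appends to the active space, and collection copies just the objects reachable from the roots, so its work grows with R rather than with M.

```python
class Obj:
    """Heap object: reference fields plus a forwarding address (hypothetical layout)."""
    def __init__(self, n_fields):
        self.fields = [None] * n_fields
        self.forward = None  # set once the object has been copied


class SemiSpaceHeap:
    def __init__(self, capacity):
        self.capacity = capacity  # objects per semispace (assumed unit)
        self.to_space = []        # active space; allocation appends here
        self.roots = []           # root references

    def allocate(self, n_fields):
        # Bump allocation: advance the end of the active space.
        if len(self.to_space) >= self.capacity:
            self.collect()
            if len(self.to_space) >= self.capacity:
                raise MemoryError("heap exhausted")
        obj = Obj(n_fields)
        self.to_space.append(obj)
        return obj

    def _copy(self, obj, new_space):
        # Copy an object once; later visits return the forwarding address.
        if obj is None:
            return None
        if obj.forward is None:
            clone = Obj(len(obj.fields))
            clone.fields = list(obj.fields)  # still old addresses, fixed when scanned
            obj.forward = clone
            new_space.append(clone)
        return obj.forward

    def collect(self):
        new_space = []
        self.roots = [self._copy(r, new_space) for r in self.roots]
        scan = 0
        while scan < len(new_space):  # breadth-first scan of copied objects
            obj = new_space[scan]
            obj.fields = [self._copy(f, new_space) for f in obj.fields]
            scan += 1
        self.to_space = new_space     # the old space is discarded wholesale
```

After `collect()`, only `heap.roots` holds valid references; anything still pointing into the old space is stale, which is the movement problem raised on the previous slide.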

  5. A5 Ex3 - Mark & Sweep • Phase 1: mark every reachable object • Phase 2: remove non-reachable objects [Figure: example object graph with nodes 0 to 7]

  6. Pointer Rotation - Introduction • Recursive traversal is very expensive • heap: list with 10’000 elements • PROCEDURE Traverse(root: Node); VAR cnt: INTEGER; • Stack size: 10’000 * 16 = 160’000 bytes

  7. Pointer Rotation - Generic Case [Figure: pointers p and q before and after a rotation]

  8. Pointer rotation example [Figure: pointers P, Q and R over nodes 0 to 3]

  9. Pointer Rotation • Deutsch-Schorr-Waite (1967) • Stores information in the data structure • memory efficient • iterative • structures are temporarily inconsistent • non-concurrent • non-incremental
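The properties listed above can be seen in a small sketch of pointer-reversal marking. This is the common two-pointer (binary node) variant with a per-node state counter, not necessarily the generic rotation shown in the slides; the `Node` layout and the `state` field are assumptions. The path back to the root is kept in the reversed pointers instead of a recursion stack, which is why the structure is temporarily inconsistent during the traversal.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.left = None
        self.right = None
        self.marked = False
        self.state = 0  # 0: nothing visited, 1: left handled, 2: right handled


def schorr_waite_mark(root):
    """Mark everything reachable from root without a stack; returns visit order."""
    visited = []
    if root is None:
        return visited
    prev, cur = None, root
    cur.marked = True
    visited.append(cur.name)
    while True:
        if cur.state == 0:
            cur.state = 1
            nxt = cur.left
            if nxt is not None and not nxt.marked:
                cur.left = prev            # reverse: left now points to the parent
                prev, cur = cur, nxt
                cur.marked = True
                visited.append(cur.name)
        elif cur.state == 1:
            cur.state = 2
            nxt = cur.right
            if nxt is not None and not nxt.marked:
                cur.right = prev           # reverse: right now points to the parent
                prev, cur = cur, nxt
                cur.marked = True
                visited.append(cur.name)
        else:
            if prev is None:
                break                      # back at the root, all done
            parent = prev
            if parent.state == 1:          # we descended through parent.left
                grand = parent.left
                parent.left = cur          # restore the original pointer
            else:                          # we descended through parent.right
                grand = parent.right
                parent.right = cur
            cur, prev = parent, grand
    return visited
```

Because pointers are reversed while the traversal is in progress, no other thread may follow them in the meantime, which matches the non-concurrent and non-incremental points on the slide.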

  10. Input Grammar • EBNF: Graph := noOfNodes { Node }. Node := noOfEdges { destination }. • Implicit: each node is numbered starting from 0. • Example: 8 3 1 2 3 3 4 5 6 2 0 6 1 7 0 0 0 1 2

  11. Example • Input: 8 3 1 2 3 3 4 5 6 2 0 6 1 7 0 0 0 1 2 [Figure: the resulting graph with nodes 0 to 7]
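To make the input grammar concrete, the sketch below parses the example string into adjacency lists and runs a simple mark phase over it. The function names and the choice of node 0 as the root are assumptions for illustration; the assignment may specify the roots differently, and this mark phase uses an explicit stack rather than pointer rotation.

```python
def parse_graph(text):
    """Parse 'noOfNodes { noOfEdges { destination } }' into adjacency lists."""
    tokens = [int(t) for t in text.split()]
    pos = 0
    n_nodes = tokens[pos]
    pos += 1
    edges = []
    for _ in range(n_nodes):
        n_edges = tokens[pos]
        pos += 1
        edges.append(tokens[pos:pos + n_edges])
        pos += n_edges
    return edges


def reachable(edges, root=0):
    """Mark phase: set of nodes reachable from root (root=0 is an assumption)."""
    marked = set()
    stack = [root]
    while stack:
        node = stack.pop()
        if node in marked:
            continue
        marked.add(node)
        stack.extend(edges[node])
    return marked


graph = parse_graph("8 3 1 2 3 3 4 5 6 2 0 6 1 7 0 0 0 1 2")
# graph == [[1, 2, 3], [4, 5, 6], [0, 6], [7], [], [], [], [2]]
```

With this input every node is reachable from node 0, whereas a traversal started at a leaf such as node 4 marks only that node.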

  12. Overview • Assignment 5: hints • Garbage collection • Assignment 4: solution

  13. A4 Ex.1 – Loading Page Tables • The process's entire page table is loaded into hardware when the process is scheduled • Advantage: During the process execution, no more memory references are needed for the page table. • Disadvantage: If the page table is large, loading it at every context switch can hurt performance, as shown in our example.

  14. Ex.1 – Loading Page Tables • Compute the fraction of the CPU time devoted to loading the page tables if • 32-bit address space, 8 KB pages • each process runs for 100 msec • 8 KB pages → 13 bits for the offset → 2^19 entries in the page table • TLoad = 2^19 · 100 nsec = 52.4288 msec • TLoad / T = 0.52 • 52% of the CPU time is devoted to loading the page tables.
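The arithmetic on this slide can be checked with a few lines. The constant names are invented here; the 100 nsec per entry is the load time used in the slide's calculation.

```python
ADDRESS_BITS = 32
PAGE_SIZE = 8 * 1024            # 8 KB pages
LOAD_TIME_PER_ENTRY_NS = 100    # per-entry load time used on the slide
TIME_SLICE_MS = 100             # each process runs for 100 msec

offset_bits = PAGE_SIZE.bit_length() - 1          # log2(8192) = 13
entries = 2 ** (ADDRESS_BITS - offset_bits)       # 2^19 page table entries
load_ms = entries * LOAD_TIME_PER_ENTRY_NS / 1e6  # nsec -> msec
fraction = load_ms / TIME_SLICE_MS

print(f"{entries} entries, {load_ms} msec to load, fraction {fraction:.2f}")
```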

  15. A4 Ex.2 – Using TLBs • The time to read a word from • page table is 50 nsec • TLB is 10 nsec • What hit rate is needed to have a mean access time of 20 nsec? • 10 nsec + (1 - p) · 50 nsec = 20 nsec • p = 4 / 5 = 0.80 • TLB hit rate = 80%
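Solving the slide's equation for p in general form gives p = 1 - (T_mean - T_TLB) / T_PT. The helper below is a small sketch of that rearrangement; the function name and parameter names are made up for illustration.

```python
def required_hit_rate(t_tlb_ns, t_page_table_ns, t_mean_ns):
    """Solve t_tlb + (1 - p) * t_page_table = t_mean for the hit rate p."""
    return 1 - (t_mean_ns - t_tlb_ns) / t_page_table_ns


p = required_hit_rate(t_tlb_ns=10, t_page_table_ns=50, t_mean_ns=20)
print(f"required TLB hit rate: {p:.0%}")
```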

  16. Ex.2 – Using TLBs (cont) • How does a TLB function in a system with multiple processes? • Some systems have an instruction which clears all the validity bits. Linux uses this machine instruction to invalidate all TLB entries at a context switch. • Extend the TLB entries with a process identifier field, and add a register to hold the PID of the current process.

  17. A4 Ex.3 – Memory Size • The time to execute an instruction is 1 µsec, or 2001 µsec if a page fault occurs • A program has 15,000 page faults and an execution time of 60 sec • We double the memory size • the interval between the page faults is doubled, so the number of page faults is halved • T = Ninstr · 1 µsec + 15,000 · 2000 µsec = 60 sec • Ninstr · 1 µsec = 60,000,000 µsec - 30,000,000 µsec = 30,000,000 µsec • T' = 30,000,000 µsec + 7,500 · 2000 µsec = 30,000,000 µsec + 15,000,000 µsec = 45,000,000 µsec = 45 sec
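The same calculation as a short script, using only the numbers from the slide; the variable names are illustrative.

```python
TOTAL_US = 60 * 1_000_000   # measured run time: 60 sec in µsec
PAGE_FAULTS = 15_000
FAULT_OVERHEAD_US = 2000    # 2001 µsec with a fault minus 1 µsec without

# Time spent on the instructions themselves, excluding page fault overhead.
instr_us = TOTAL_US - PAGE_FAULTS * FAULT_OVERHEAD_US

# Doubling the interval between page faults halves their number.
new_faults = PAGE_FAULTS // 2
new_total_us = instr_us + new_faults * FAULT_OVERHEAD_US

print(f"new run time: {new_total_us / 1_000_000} sec")
```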

  18. A4 Ex.5 – The Aging Algorithm • Page0: 01101110 • Page1: 01001001 • Page2: 00110111 • Page3: 10001011 • Problems with this algorithm? • We lose the ability to distinguish references early in the tick interval from those occurring later. • Because the counters have a finite number of bits, it may happen that two pages have a counter value of 0 and we have no way of seeing which of these two pages was last referenced.
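For reference, a minimal sketch of the standard aging scheme applied to the four counters on this slide: at every clock tick each counter is shifted right by one bit and the page's referenced bit is inserted on the left, and the page with the smallest counter is the eviction candidate. The referenced bits used in the last line are hypothetical and not taken from the assignment.

```python
def age(counters, referenced, bits=8):
    """One clock tick: shift each counter right, put the R bit in the top position."""
    return [(c >> 1) | (r << (bits - 1)) for c, r in zip(counters, referenced)]


def victim(counters):
    """Index of the page with the smallest counter (first one wins on ties)."""
    return min(range(len(counters)), key=counters.__getitem__)


counters = [0b01101110, 0b01001001, 0b00110111, 0b10001011]  # Page0..Page3
print("evict page", victim(counters))

# Hypothetical next tick in which only pages 0 and 3 were referenced.
print([f"{c:08b}" for c in age(counters, [1, 0, 0, 1])])
```

The tie-breaking in `victim` illustrates the second problem listed above: once two counters are equal, the algorithm has no information left to prefer one page over the other.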

  19. A4 Ex.6 – Program Run Time • Application • TLB hit rate is 75% • number of memory accesses is 55,500,000 • Page fault rate 0.005 • System performance for this application • average TLB miss penalty is 130 nsec • average DRAM access time is 50 nsec • average disk access time is 9 msec • What is the application run time • on this system? • on a system with a better disk with an access time of 6 msec?

  20. Ex.6 – Program Run Time • T = pTLBmiss · Nacc · TTLBmiss + Nacc · TDRAM + pPF · Nacc · TDisk, with pTLBmiss = 1 - 0.75 = 0.25 • T = 4,578,750,000 nsec + 2,497,500 msec • T = 2,502,078.75 msec = 2,502.07875 sec ≈ 41.7 min • T' = 4,578.75 msec + 0.005 · 55,500,000 · 6 msec • T' = 4,578.75 msec + 1,665,000 msec • T' = 1,669,578.75 msec = 1,669.57875 sec ≈ 27.8 min • Reducing the disk access time by 33% reduces the run time by about 33% (for this scenario), because disk accesses dominate the run time.
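The run-time formula can be evaluated directly for both disks. The function and parameter names below are illustrative; the numbers are those given in the exercise.

```python
def run_time_ms(n_acc, tlb_hit_rate, tlb_miss_ns, dram_ns, pf_rate, disk_ms):
    """Run time in msec: TLB miss penalties + DRAM accesses + page fault disk time."""
    tlb_ns = (1 - tlb_hit_rate) * n_acc * tlb_miss_ns
    dram_total_ns = n_acc * dram_ns
    disk_total_ms = pf_rate * n_acc * disk_ms
    return (tlb_ns + dram_total_ns) / 1e6 + disk_total_ms


common = dict(n_acc=55_500_000, tlb_hit_rate=0.75, tlb_miss_ns=130,
              dram_ns=50, pf_rate=0.005)
t_slow = run_time_ms(disk_ms=9, **common)
t_fast = run_time_ms(disk_ms=6, **common)

print(f"9 msec disk: {t_slow / 60_000:.1f} min")
print(f"6 msec disk: {t_fast / 60_000:.1f} min")
print(f"run time reduction: {(t_slow - t_fast) / t_slow:.1%}")
```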
