
Concurrency case studies in UNIX

Presentation Transcript


1. Concurrency case studies in UNIX
   John Chapin · 6.894 · October 26, 1998

2. OS kernel inherently concurrent
   • From 60s: multiprogramming
     • Context switch on I/O wait
     • Reentrant interrupts
     • Threads simplified the implementation
   • 90s servers, 00s PCs: multiprocessing
     • Multiple CPUs executing kernel code

3. Thread-centric concurrency control
   • Single-CPU kernel:
     • Only one thread in kernel at a time
     • No locks
     • Disable interrupts to control concurrency
   • MP kernels inherit this mindset:
     1. Control concurrency of threads
     2. Add locks to objects only where required
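
   A minimal sketch of the two mindsets. The primitive names below (disable_interrupts, restore_interrupts, spinlock_t and friends) are stand-ins, not a real kernel API, and their bodies are no-ops only so the fragment is self-contained:

    /* Stand-ins so the sketch compiles; a real kernel provides these. */
    typedef int spinlock_t;                                /* hypothetical */
    static int  disable_interrupts(void)    { return 0; }  /* hypothetical */
    static void restore_interrupts(int s)   { (void)s; }   /* hypothetical */
    static void spin_lock(spinlock_t *l)    { (void)l; }   /* hypothetical */
    static void spin_unlock(spinlock_t *l)  { (void)l; }   /* hypothetical */

    /* Single-CPU style: the only other code that can touch shared kernel
     * data is an interrupt handler, so masking interrupts is enough. */
    static void up_style_update(volatile long *counter)
    {
        int s = disable_interrupts();
        (*counter)++;
        restore_interrupts(s);
    }

    /* MP style: another CPU may be executing kernel code at the same
     * time, so the shared object itself carries a lock. */
    struct counter {
        spinlock_t lock;
        long       value;
    };

    static void mp_style_update(struct counter *c)
    {
        spin_lock(&c->lock);
        c->value++;
        spin_unlock(&c->lock);
    }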

4. Case study: memory mapping
   • Background
   • Page faults
     • challenges
     • pseudocode
   • Page victimization
     • challenges
     • pseudocode
   • Discussion & design lessons

5. Other interesting patterns
   • nonblocking queues
   • asymmetric reader-writer locks
   • one lock/object, different lock for chain
   • priority donation locks
   • immutable message buffers
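
   For one of these, here is a minimal user-space sketch (plain pthreads, made-up struct names) of "one lock per object, a different lock for the chain": the chain lock protects only the list links, and the object is locked before the chain lock is dropped, which is roughly how the case study's lookup() can hand back a locked pfdat:

    #include <pthread.h>
    #include <stddef.h>

    struct object {
        struct object  *next;           /* protected by the chain lock */
        unsigned long   key;
        pthread_mutex_t lock;           /* protects everything below */
        int             status;
    };

    struct chain {
        pthread_mutex_t lock;           /* protects only the linked list */
        struct object  *head;
    };

    /* Returns the matching object already locked, or NULL. */
    struct object *chain_lookup(struct chain *c, unsigned long key)
    {
        pthread_mutex_lock(&c->lock);
        for (struct object *o = c->head; o != NULL; o = o->next) {
            if (o->key == key) {
                /* Lock the object before dropping the chain lock so it
                 * cannot be unlinked and freed out from under us. */
                pthread_mutex_lock(&o->lock);
                pthread_mutex_unlock(&c->lock);
                return o;
            }
        }
        pthread_mutex_unlock(&c->lock);
        return NULL;
    }

   Any code that removes objects must take the locks in the same chain-then-object order, or fall back to a trylock; the victimization code on slide 14 hits the mirror image of this ordering problem with segment and pfdat locks.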

6. Page mapping --- background
   [diagram: a process's virtual address space is described by a segment list; segments map pages to entries in the pfdat array, which describes physical memory]

7. Life cycle of a page frame
   [state diagram]
   • unallocated --Allocate--> invalid
   • invalid --> IO_pending --Read from disk--> valid
   • valid --Modify--> dirty
   • valid --Victimize--> unallocated
   • dirty --Victimize--> pushout --Write to disk--> unallocated
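
   Before the pseudocode, it may help to see one plausible C rendering of the structures the slides assume. Only status, file, pageNum, owner_list, and the life-cycle states above come from the slides; the remaining field names, and the use of pthread locks in place of kernel locks, are assumptions for illustration:

    #include <pthread.h>

    struct file;                        /* opaque here */
    struct segment;

    enum pf_status {                    /* life-cycle states from slide 7 */
        PF_UNALLOCATED,
        PF_INVALID,                     /* allocated, contents not yet read */
        PF_IO_PENDING,                  /* disk read in progress */
        PF_VALID,
        PF_DIRTY,                       /* modified since read */
        PF_PUSHOUT                      /* being written back to disk */
    };

    struct pfdat {                      /* one per physical page frame */
        pthread_mutex_t  lock;
        pthread_cond_t   status_changed;   /* backs wait()/notify_all() */
        enum pf_status   status;
        struct file     *file;          /* which file page this frame holds */
        unsigned long    pageNum;
        struct segment **owner_list;    /* segments currently mapping it */
        int              owner_count;
    };

    struct segment {                    /* one region of a virtual addr space */
        pthread_mutex_t  lock;
        struct file     *file;
        struct pfdat   **pages;         /* virtual page -> frame, or NULL */
    };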

8. Page fault challenges
   • Multiple processes fault to same file page
   • Multiple processes fault to same pfdat
   • Multiple threads of same process fault to same segment (low frequency)
   • Bidirectional mapping between segment pointers and pfdats
   • Stop only minimal process set during disk I/O
   • Minimize locking/unlocking on fast path

9. Page fault stage 1

    vfault(virtual_address addr)
      segment.lock();
      if ((pfdat = segment.fetch(addr)) == null)
        pfdat = lookup(s.file, s.pageNum(addr));
          /* returns locked pfdat */
        if (pfdat.status == PUSHOUT)
          /* do something complicated */
        install pfdat in segment;
        add segment to pfdat owner list;
      else
        pfdat.lock();

10. Page fault stage 2

    if (pfdat.status == IO_PENDING)
      segment.unlock();
      pfdat.wait();
      goto top of vfault;
    else if (pfdat.status == INVALID)
      pfdat.status = IO_PENDING;
      pfdat.unlock();
      fetch_from_disk(pfdat);
      pfdat.lock();
      pfdat.status = VALID;
      pfdat.notify_all();
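
   One plausible reading of pfdat.wait() / pfdat.notify_all() is a condition variable paired with the pfdat lock (the names follow the struct sketched after slide 7, not anything stated on the slides). pthread_cond_wait atomically drops the lock while sleeping and reacquires it before returning, so the retry at the top of vfault has to release it again before re-locking the segment:

    #include <pthread.h>

    /* Assumes the struct pfdat sketched after slide 7. */

    static void pfdat_wait(struct pfdat *p)
    {
        /* Caller holds p->lock; it is released while blocked and held
         * again when this returns. */
        pthread_cond_wait(&p->status_changed, &p->lock);
    }

    static void pfdat_notify_all(struct pfdat *p)
    {
        /* Called with p->lock held, after p->status has been set to VALID. */
        pthread_cond_broadcast(&p->status_changed);
    }

   Note that the usual "re-check the condition in a loop around cond_wait" discipline is provided here by the goto: the woken thread re-runs the whole fault and re-examines the status from scratch.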

11. Page fault stage 3

    segment.insert_TLB(addr, pfdat.paddr());
    pfdat.unlock();
    segment.unlock();
    restart application
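
   Putting the three stages together, a hedged skeleton of the whole fault path, with the "goto top of vfault" rewritten as a retry loop. segment_fetch_or_install(), fetch_from_disk(), segment_insert_tlb(), and vaddr_t are assumed helpers rather than names from the slides, the types are those sketched after slide 7, and the PUSHOUT case from stage 1 is left out, as it is on the slide:

    #include <pthread.h>

    typedef unsigned long vaddr_t;                      /* assumed */

    /* Assumed helpers: the stage-1 lookup/install that returns the pfdat
     * already locked, the disk read, and the stage-3 TLB insert. */
    struct pfdat *segment_fetch_or_install(struct segment *seg, vaddr_t addr);
    void fetch_from_disk(struct pfdat *p);
    void segment_insert_tlb(struct segment *seg, vaddr_t addr, struct pfdat *p);

    void vfault(struct segment *seg, vaddr_t addr)
    {
        for (;;) {                                      /* replaces the goto */
            pthread_mutex_lock(&seg->lock);
            struct pfdat *p = segment_fetch_or_install(seg, addr);  /* stage 1 */

            if (p->status == PF_IO_PENDING) {           /* stage 2: another
                                                           thread is reading it */
                pthread_mutex_unlock(&seg->lock);
                pthread_cond_wait(&p->status_changed, &p->lock);
                pthread_mutex_unlock(&p->lock);
                continue;                               /* re-run the whole fault */
            }
            if (p->status == PF_INVALID) {              /* stage 2: we do the read */
                p->status = PF_IO_PENDING;
                pthread_mutex_unlock(&p->lock);
                fetch_from_disk(p);                     /* only the segment lock is
                                                           held across the disk read */
                pthread_mutex_lock(&p->lock);
                p->status = PF_VALID;
                pthread_cond_broadcast(&p->status_changed);
            }

            segment_insert_tlb(seg, addr, p);           /* stage 3 */
            pthread_mutex_unlock(&p->lock);
            pthread_mutex_unlock(&seg->lock);
            return;                                     /* restart application */
        }
    }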

12. Page victimization challenges
   • Bidirectional mapping between segment pointers and pfdats
   • Stop no processes during batch writes
   • Deadlock caused by paging thread racing with faulting thread

13. Page victimization stage 1

    next_victim:
      pfdat p = choose_victim();
      p.lock();
      if (! (p.status == VALID || p.status == DIRTY))
        p.unlock();
        goto next_victim;

14. Page victimization stage 2

    foreach segment s in p.owner_list
      if (s.trylock() == ALREADY_LOCKED)
        p.unlock();
        /* do something! (p.r.d.) */
      remove p from s;   /* also deletes any TLB mappings */
      delete s from p.owner_list;
      s.unlock();

15. Page victimization stage 3

    if (p.status == DIRTY)
      p.status = PUSHOUT;
      schedule p for disk write;
      p.unlock();
      goto next_victim;
    else
      unbind(p.file, p.pageNum, p);
      p.status = UNALLOCATED;
      add_to_free_list(p);
      p.unlock();
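
   The same exercise for the victimizer, again as a hedged sketch over the structures sketched after slide 7. choose_victim(), segment_remove_page(), schedule_disk_write(), unbind(), and add_to_free_list() are assumed helpers, and the recovery at the trylock-failure point is deliberately the naive one, since deciding what should really happen there is discussion question (3):

    #include <pthread.h>

    /* Assumed helpers standing in for the slide's prose operations. */
    struct pfdat *choose_victim(void);
    void segment_remove_page(struct segment *s, struct pfdat *p); /* drops TLB entries too */
    void schedule_disk_write(struct pfdat *p);  /* completion presumably unbinds & frees */
    void unbind(struct file *f, unsigned long pageNum, struct pfdat *p);
    void add_to_free_list(struct pfdat *p);

    void victimize_one_frame(void)
    {
        for (;;) {                                      /* replaces next_victim: */
            struct pfdat *p = choose_victim();          /* stage 1 */
            pthread_mutex_lock(&p->lock);
            if (p->status != PF_VALID && p->status != PF_DIRTY) {
                pthread_mutex_unlock(&p->lock);
                continue;
            }

            int backed_off = 0;                         /* stage 2 */
            while (p->owner_count > 0 && !backed_off) {
                struct segment *s = p->owner_list[p->owner_count - 1];
                if (pthread_mutex_trylock(&s->lock) != 0) {
                    /* A faulting thread holds s->lock and may be waiting for
                     * p->lock, so blocking here could deadlock.  Backing off
                     * to another victim is the naive recovery that discussion
                     * question (3) asks you to scrutinize ("p.r.d."). */
                    backed_off = 1;
                } else {
                    segment_remove_page(s, p);
                    p->owner_count--;                   /* delete s from owner_list */
                    pthread_mutex_unlock(&s->lock);
                }
            }
            if (backed_off) {
                pthread_mutex_unlock(&p->lock);
                continue;
            }

            if (p->status == PF_DIRTY) {                /* stage 3 */
                p->status = PF_PUSHOUT;
                schedule_disk_write(p);
                pthread_mutex_unlock(&p->lock);
                continue;                               /* keep scanning */
            }
            unbind(p->file, p->pageNum, p);             /* remove from lookup() */
            p->status = PF_UNALLOCATED;
            add_to_free_list(p);
            pthread_mutex_unlock(&p->lock);
            return;
        }
    }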

16. Discussion questions (1)
   • Why have the IO_PENDING state; why not just keep the pfdat locked until its data is valid?
   • What happens when:
     • Some thread discovers IO_PENDING and blocks. Before it restarts, that page is victimized.
     • The page chosen as victim is being actively used by application code.

17. Discussion questions (2)
   • What mechanisms ensure that a page is read from disk only once, despite multiple processes faulting at the same time?
   • Why is it safe to skip checking for PUSHOUT in fault stage 2?
   • Write out the invariants that support your reasoning.

18. Discussion questions (3)
   • Louis Reasoner suggests releasing the segment lock at the end of fault stage 1 and reacquiring it for stage 3, to speed up parallel threads. What could go wrong?
   • At the point marked p.r.d. (victimization stage 2), Louis suggests: goto next_victim; What could go wrong?

19. Design lessons
   • Causes of complexity:
     • data structure traversed in multiple directions
     • high level of concurrency for performance
   • Symptoms of complexity:
     • nontrivial mapping from locks to objects
     • invariants relating thread, lock, and object states across multiple data structures

20. Loose vs tight concurrency
   • Loose
     • Separate subsystems connected by simple protocols
     • Use often, for performance or simplicity
   • Tight
     • Shared data structures with complex invariants
     • Only use where you have to
     • Minimize code and states involved

21. Page frame sample invariants

    All pfdat p:
      (p.status == UNALLOCATED)
        || lookup(p.file, p.pageNum) == p
        ; all processes will find same pfdat
      p.status != INVALID
        ; therefore only 1 process will read disk
      (p.status == UNALLOCATED || p.status == PUSHOUT)
        => p.owner_list empty
        ; therefore no TLB mappings to PUSHOUT,
        ; avoiding cache consistency problems
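
   As a closing illustration, these invariants can be written as runtime assertions over the structures sketched earlier. lookup() is assumed to be the (file, pageNum) hash used in fault stage 1, and the second check only makes sense for frames that are not still being set up by the faulter that allocated them, which per stages 1-2 is the only time INVALID is visible:

    #include <assert.h>

    struct pfdat *lookup(struct file *f, unsigned long pageNum);  /* assumed */

    /* Call with p->lock held, on a frame that is not mid-allocation. */
    void check_pfdat_invariants(struct pfdat *p)
    {
        /* 1. Every allocated frame is reachable through lookup(), so all
         *    processes faulting on the same file page find the same pfdat. */
        assert(p->status == PF_UNALLOCATED ||
               lookup(p->file, p->pageNum) == p);

        /* 2. No settled frame is INVALID, so only the one faulter that
         *    allocated the frame ever issues the disk read. */
        assert(p->status != PF_INVALID);

        /* 3. UNALLOCATED and PUSHOUT frames have no owners, hence no TLB
         *    mappings, which avoids cache-consistency problems during
         *    push-out. */
        if (p->status == PF_UNALLOCATED || p->status == PF_PUSHOUT)
            assert(p->owner_count == 0);
    }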
