
A Methodology for Creating Fast Wait-Free Data Structures



  1. Alex Kogan and Erez Petrank, Computer Science, Technion, Israel. A Methodology for Creating Fast Wait-Free Data Structures

  2. Concurrency & (non-blocking) synchronization
  • Concurrent data structures require fast and scalable synchronization
  • Non-blocking synchronization:
    • no thread is blocked waiting for another thread to complete
    • no locks / critical sections

  3. Lock-free (LF) algorithms
  • Among all threads trying to apply operations on the data structure, one will succeed
  • Opportunistic approach (sketched below):
    • read some part of the data structure
    • make an attempt to apply an operation
    • when the attempt fails, retry
  • Many scalable and efficient algorithms
  • Global progress, but all but one thread may starve
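
  To make the opportunistic pattern concrete, here is a minimal lock-free Treiber-style stack push in Java. This is an illustrative sketch, not code from the talk:

    import java.util.concurrent.atomic.AtomicReference;

    // Minimal Treiber-style lock-free stack: read, attempt, retry.
    class LockFreeStack<T> {
        private static class Node<T> {
            final T value;
            Node<T> next;
            Node(T value) { this.value = value; }
        }

        private final AtomicReference<Node<T>> top = new AtomicReference<>();

        public void push(T value) {
            Node<T> node = new Node<>(value);
            while (true) {
                Node<T> oldTop = top.get();          // read part of the structure
                node.next = oldTop;
                if (top.compareAndSet(oldTop, node)) // attempt to apply the operation
                    return;                          // some thread always succeeds (global progress)
                // CAS failed: another thread won; retry (this thread may starve)
            }
        }
    }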

  4. Wait-free (WF) algorithms
  • A thread completes its operation in a bounded number of steps
    • regardless of what other threads are doing
  • Particularly important property in several domains
    • e.g., real-time systems and operating systems
  • Commonly regarded as too inefficient and complicated to design

  5. The overhead of wait-freedom
  • Much of the overhead is because of helping
    • the key mechanism employed by most WF algorithms
    • controls the way threads help each other with their operations
  • Can we eliminate the overhead?
  • The goal: average-case efficiency of lock-freedom with the worst-case bound of wait-freedom

  6. Why is helping slow?
  • A thread helps others immediately when it starts its operation
  • All threads help others in exactly the same order → contention → redundant work
  • Each operation has to be applied exactly once
    • usually results in a higher number of expensive atomic operations

  7. Reducing the overhead of helping
  Main observation:
  • "Bad" cases happen, but are very rare
  • Typically a thread can complete without any help
    • if only it had a chance to do that…
  Main ideas:
  • Ask for help only when you really need it
    • i.e., after trying several times to apply the operation
  • Help others only after giving them a chance to proceed on their own
    • delayed helping

  8. Fast-path-slow-path methodology
  • Start an operation by running its (customized) lock-free implementation: the fast path
  • Upon several failures, switch to a (customized) wait-free implementation: the slow path
    • notify others that you need help
    • keep trying
  • Once in a while, threads on the fast path check whether their help is needed and provide it: delayed helping

  9. Fast-path-slow-path generic scheme (flowchart; rendered in Java below)
  • Start: do I need to help? If yes, help someone
  • Apply my op using the fast path (at most N times)
  • On success, return; otherwise apply my op using the slow path (until success), then return
  • Different threads may run on the two paths concurrently!
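
  A minimal Java rendering of this control flow. The helper names applyOp, helpIfNeeded, fastPathAttempt, and slowPath are ours, not the paper's:

    // Generic fast-path-slow-path skeleton (illustrative names).
    // MAX_FAILURES = N bounds attempts on the lock-free fast path;
    // helpIfNeeded() performs the occasional delayed-helping check.
    void applyOp(Op op) {
        helpIfNeeded();                      // "Do I need to help?" -> help someone
        for (int i = 0; i < MAX_FAILURES; i++) {
            if (fastPathAttempt(op))         // apply my op on the fast path
                return;                      // success: the common case
        }
        slowPath(op);                        // apply my op on the slow path (until success)
    }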

  10. Fast-path-slow-path: queue example
  • Fast path: a customized MS-queue (Michael-Scott lock-free queue)
  • Slow path: a customized KP-queue (Kogan-Petrank wait-free queue)

  11. Fast-path-slow-path: queue example: internal structures
  The state array holds one record per thread, indexed by thread ID (a Java sketch follows):
  • phase: counts the number of operations the thread has performed on the slow path (slide 12)
  • pending: is there a pending operation on the slow path? (slide 13)
  • enqueue, node: what is the pending operation? an enqueue of the given node, or a dequeue (slide 14)
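
  In Java, the per-thread record in the state array might look as follows. Field names follow the slides; the class name OpDesc, the Node type, and the constructor shape are assumptions:

    import java.util.concurrent.atomic.AtomicReferenceArray;

    class Node { final int value; volatile Node next; Node(int v) { value = v; } }

    // Immutable per-thread operation record (field names from the slides).
    class OpDesc {
        final long phase;      // # of ops this thread has run on the slow path
        final boolean pending; // is there a pending op on the slow path?
        final boolean enqueue; // true: pending op is an enqueue; false: a dequeue
        final Node node;       // node to enqueue; null for a dequeue

        OpDesc(long phase, boolean pending, boolean enqueue, Node node) {
            this.phase = phase;
            this.pending = pending;
            this.enqueue = enqueue;
            this.node = node;
        }
    }

    // One record per thread, indexed by thread ID:
    // AtomicReferenceArray<OpDesc> state = new AtomicReferenceArray<>(numThreads);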

  15. Fast-path-slow-path: queue example: internal structures
  Each thread also keeps a help record, in a helpRecords array indexed by thread ID (a Java sketch follows):
  • curTid: ID of the next thread that I will try to help (slide 16)
  • lastPhase: phase number of that thread at the time the record was created (slide 17)
  • nextCheck: decremented with every one of my operations; when this counter reaches 0, check whether my help is needed; HELPING_DELAY controls the frequency of these helping checks (slide 18)
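
  A matching Java sketch of the helping record. Field names follow the slides; the reset logic and the concrete HELPING_DELAY value are our assumptions:

    // Per-thread helping record (field names from the slides).
    class HelpRecord {
        static final long HELPING_DELAY = 3; // illustrative value

        int curTid;     // ID of the next thread I will try to help
        long lastPhase; // that thread's phase when this record was created
        long nextCheck; // decremented on every op; help is checked at 0

        // Re-arm the record to watch the next thread (round-robin).
        void reset(int nextTid, long phaseOfNextTid) {
            curTid = nextTid;
            lastPhase = phaseOfNextTid;
            nextCheck = HELPING_DELAY;
        }
    }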

  19. Fast-path-slow-path: queue example: fast path
  1. help_if_needed()
  2. int trials = 0; while (trials++ < MAX_FAILURES) { apply_op_with_customized_LF_alg; finish if succeeded }
  3. switch to the slow path
  • MAX_FAILURES controls the number of trials on the fast path
  • LF algorithm customization is required to synchronize operations run on the two paths (one attempt is sketched below)
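
  For the queue, a single fast-path trial could be an MS-queue-style enqueue step that reports failure instead of looping internally. A sketch, with the customization that synchronizes against the slow path omitted; the class and method names are ours:

    import java.util.concurrent.atomic.AtomicReference;

    class FastPathQueue {
        static class Node {
            final int value;
            final AtomicReference<Node> next = new AtomicReference<>();
            Node(int v) { value = v; }
        }

        private final AtomicReference<Node> tail;

        FastPathQueue() {
            tail = new AtomicReference<>(new Node(0)); // sentinel node
        }

        // One lock-free (MS-queue style) enqueue attempt; returns false if
        // a CAS failed, so the caller can retry and, after MAX_FAILURES
        // failures, switch to the slow path.
        boolean enqueueAttempt(Node node) {
            Node last = tail.get();
            Node next = last.next.get();
            if (last != tail.get()) return false;     // tail moved under us
            if (next != null) {                       // tail is lagging:
                tail.compareAndSet(last, next);       // help advance it
                return false;
            }
            if (!last.next.compareAndSet(null, node))
                return false;                         // lost the race to link
            tail.compareAndSet(last, node);           // swing tail (benign if it fails)
            return true;
        }
    }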

  20. Fast-path-slow-path: queue example: slow path
  1. my phase++
  2. announce my operation (in state)
  3. apply_op_with_customized_WF_alg (until finished)
  • WF algorithm customization is required to synchronize operations run on the two paths (see the sketch below)
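
  A sketch of the slow-path entry point for an enqueue, using the OpDesc/state structures from the earlier sketch; maxPhase() and help() are hypothetical helpers standing in for the customized wait-free algorithm:

    // Slow-path enqueue (sketch): announce the op, then run the
    // wait-free helping algorithm until some thread completes it.
    void slowPathEnqueue(int tid, Node node) {
        long phase = maxPhase() + 1;                          // 1. my phase++
        state.set(tid, new OpDesc(phase, true, true, node));  // 2. announce my op
        help(phase);                                          // 3. keep trying until
        // state.get(tid).pending becomes false (done by me or by a helper)
    }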

  21. Performance evaluation
  • 32-core Ubuntu server with OpenJDK 1.6
    • eight 2.3 GHz quad-core AMD Opteron 8356 processors
  • The queue is initially empty
  • Enqueue-dequeue benchmark: each thread iteratively performs an enqueue and then a dequeue (100k times; a harness sketch follows)
  • Measure completion time as a function of the number of threads
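
  A plausible Java harness for this benchmark. FPSPQueue is a stand-in name for the fast-path-slow-path queue; only the benchmark structure, not the exact code, comes from the slide:

    import java.util.concurrent.CyclicBarrier;

    // Enqueue-dequeue benchmark: each thread enqueues then dequeues,
    // 100,000 times; returns the completion time in nanoseconds.
    static long runBenchmark(int numThreads) throws Exception {
        FPSPQueue queue = new FPSPQueue(numThreads);     // initially empty
        CyclicBarrier start = new CyclicBarrier(numThreads + 1);
        Thread[] workers = new Thread[numThreads];
        for (int i = 0; i < numThreads; i++) {
            final int tid = i;
            workers[i] = new Thread(() -> {
                try {
                    start.await();                       // start together
                    for (int op = 0; op < 100_000; op++) {
                        queue.enqueue(tid, op);
                        queue.dequeue(tid);
                    }
                } catch (Exception e) { /* ignore in this sketch */ }
            });
            workers[i].start();
        }
        start.await();
        long begin = System.nanoTime();
        for (Thread w : workers) w.join();
        return System.nanoTime() - begin;
    }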

  22. Performance evaluation

  23. Performance evaluation [chart; parameters annotated: MAX_FAILURES, HELPING_DELAY]

  24. Performance evaluation

  25. The impact of the configuration parameters [charts varying MAX_FAILURES and HELPING_DELAY]

  26. The use of the slow path [chart; parameters annotated: HELPING_DELAY, MAX_FAILURES]

  27. Tuning the performance parameters
  • Why not just always use large values for both parameters (MAX_FAILURES, HELPING_DELAY)?
    • that would (almost) always eliminate the slow path
  • Lemma: the number of steps required for a thread to complete an operation on the queue in the worst case is O(MAX_FAILURES + HELPING_DELAY · n²)
  • Tradeoff between average-case performance and the worst-case completion-time bound (a worked instance follows)
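
  To get a feel for the tradeoff (illustrative numbers, not from the talk): with n = 32 threads, MAX_FAILURES = 10, and HELPING_DELAY = 3, the worst-case bound is on the order of 10 + 3 · 32² = 3,082 steps; doubling both parameters roughly doubles this worst-case bound while making the slow path, and helping, even rarer in the average case.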

  28. Summary
  • A novel methodology for creating fast wait-free data structures
    • key ideas: two execution paths + delayed helping
    • good performance when the fast path is extensively utilized
    • concurrent operations can proceed on both paths in parallel
  • Can be used in other scenarios
    • e.g., running real-time and non-real-time threads side by side

  29. Thank you! Questions?
