This paper explores the use of threads and events in interactive computing systems, highlighting their benefits and challenges. It analyzes two case studies regarding thread usage, revealing the ongoing struggles programmers face, such as limited scheduling support and priority inversion. The paper contrasts thread-based architectures with stage-driven event-based systems, evaluating their performance and ease of use. It also covers key concepts like thread switching, synchronization primitives, common concurrency problems, and categorization of thread usage, ultimately providing insights into the complexities of concurrent programming.
Concurrency, Threads, and Events Ken Birman (Based on a slide set prepared by Robbert van Renesse)
Summary Paper 1 • Using Threads in Interactive Systems: A Case Study (Hauser et al., 1993) • Analyzes two interactive computing systems • Classifies thread usage • Finds that programmers are still struggling • (pre-Java) • Limited scheduling support • Priority inversion
Summary Paper 2 • SEDA: An Architecture for Well-Conditioned, Scalable Internet Services (Welsh et al., 2001) • Analyzes threads vs event-based systems, finds problems with both • Suggests trade-off: stage-driven architecture • Evaluated for two applications • Easy to program and performs well
What is a thread? • A traditional “process” is an address space and a thread of control. • Now add multiple threads of control • Share address space • Individual program counters and stacks • Same as multiple processes sharing an address space.
Thread Switching • To switch from thread T1 to T2: • Thread T1 saves its registers (including pc) on its stack • Scheduler remembers T1’s stack pointer • Scheduler restores T2’s stack pointer • T2 restores its registers • T2 resumes • Two models: preemptive/non-preemptive
Thread Scheduler • Maintains the stack pointer of each thread • Decides what thread to run next • E.g., based on priority or resource usage • Decides when to pre-empt a running thread • E.g., based on a timer • May need to deal with multiple CPUs • But not usually • “fork” creates a new thread • Blocking or calling “yield” lets scheduler run
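The non-preemptive model can be sketched with Python generators standing in for threads. This is only an illustration, not how a real thread package is built: the `Scheduler` class, `worker` function, and round-robin policy are assumptions made for the example. Each generator keeps its own frame (playing the role of a private stack), `fork` adds a thread to the run queue, and `yield` hands control back to the scheduler.

```python
from collections import deque


class Scheduler:
    """Non-preemptive round-robin scheduler; each 'thread' is a generator."""

    def __init__(self):
        self.ready = deque()  # run queue of suspended threads

    def fork(self, gen_func, *args):
        # Creating the generator allocates its private frame (its "stack").
        self.ready.append(gen_func(*args))

    def run(self):
        while self.ready:
            thread = self.ready.popleft()
            try:
                next(thread)           # resume the thread until its next yield
            except StopIteration:
                continue               # thread finished; drop it
            self.ready.append(thread)  # thread yielded; put it back in the queue


def worker(name, steps, log):
    for i in range(steps):
        log.append((name, i))
        yield  # voluntarily give up the CPU


def demo():
    log = []
    scheduler = Scheduler()
    scheduler.fork(worker, "T1", 2, log)
    scheduler.fork(worker, "T2", 2, log)
    scheduler.run()
    return log
```

Calling `demo()` interleaves the two workers one step at a time, because a thread only loses the CPU where it yields. A preemptive scheduler would additionally need a timer to interrupt a thread that never yields.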
Synchronization Primitives • Semaphores • P(S): block if semaphore is “taken” • V(S): release semaphore • Monitors: • Only one thread active in a module at a time • Threads can block waiting for some condition using the WAIT primitive • Threads need to signal using NOTIFY or BROADCAST
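Both primitives have rough counterparts in Python's standard `threading` module, which makes for a compact sketch. In the example below, `Semaphore.acquire` and `Semaphore.release` play the roles of P and V, and a lock with two `Condition` objects approximates a monitor with WAIT and NOTIFY. The `BoundedBuffer` class and the helper functions are illustrative names, not taken from the papers.

```python
import threading


def count_with_semaphore(n_threads, n_increments):
    """Protect a shared counter with a binary semaphore."""
    sem = threading.Semaphore(1)
    total = [0]

    def work():
        for _ in range(n_increments):
            sem.acquire()   # P(S): block if the semaphore is taken
            total[0] += 1
            sem.release()   # V(S): release the semaphore

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total[0]


class BoundedBuffer:
    """Monitor-style buffer: one lock guards every method."""

    def __init__(self, capacity):
        self.items = []
        self.capacity = capacity
        self.lock = threading.Lock()
        self.not_full = threading.Condition(self.lock)
        self.not_empty = threading.Condition(self.lock)

    def put(self, item):
        with self.lock:
            while len(self.items) >= self.capacity:
                self.not_full.wait()      # WAIT until there is room
            self.items.append(item)
            self.not_empty.notify()       # NOTIFY one waiting consumer

    def get(self):
        with self.lock:
            while not self.items:
                self.not_empty.wait()     # WAIT until an item arrives
            item = self.items.pop(0)
            self.not_full.notify()        # NOTIFY one waiting producer
            return item


def demo_buffer():
    buf = BoundedBuffer(capacity=2)
    out = []
    consumer = threading.Thread(
        target=lambda: out.extend(buf.get() for _ in range(5)))
    consumer.start()
    for i in range(5):
        buf.put(i)
    consumer.join()
    return out
```

The `while` loops around `wait()` re-check the condition after waking, since another thread may have changed the buffer between the notify and the wake-up.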
Uses of threads • To exploit CPU parallelism • Run two CPUs at once in the same program • To exploit I/O parallelism • Run I/O while computing, or do multiple I/O • Listen to the “window” while also running code, e.g. allow commands during an interactive game • For program structuring • E.g., timers • To avoid deadlock in RPC-based applications
Hauser’s categorization • Defer Work: asynchronous activity • Print, e-mail, create new window, etc. • Pumps: pipeline components • Wait on input queue; send to output queue • E.g., slack process: add latency for buffering • Sleepers & one-shots • Periodic activity & timers
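A pump can be sketched as a thread that loops on "wait on input queue; send to output queue". The example below chains two such pumps using `queue.Queue` from the standard library. The `STOP` sentinel, the `pump` function, and the two transforms are assumptions made for the illustration.

```python
import queue
import threading

STOP = object()  # sentinel that tells a pump to shut down (illustrative)


def pump(inq, outq, transform):
    """Pipeline component: wait on the input queue, send to the output queue."""
    while True:
        item = inq.get()        # blocks until input is available
        if item is STOP:
            outq.put(STOP)      # pass the shutdown signal downstream
            return
        outq.put(transform(item))


def run_pipeline(values):
    q1, q2, q3 = queue.Queue(), queue.Queue(), queue.Queue()
    stages = [
        threading.Thread(target=pump, args=(q1, q2, lambda x: x + 1)),
        threading.Thread(target=pump, args=(q2, q3, lambda x: x * 2)),
    ]
    for t in stages:
        t.start()
    for v in values:
        q1.put(v)
    q1.put(STOP)

    results = []
    while True:
        item = q3.get()
        if item is STOP:
            break
        results.append(item)
    for t in stages:
        t.join()
    return results
```

With one thread per stage and FIFO queues, items leave the pipeline in the order they entered it.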
Categorization, cont’d • Deadlock Avoiders • Avoid deadlock through ordered acquisition of locks • When needing more locks, roll-back and re-acquire • Task Rejuvenation: recovery • Start new thread when old one dies, say because of uncaught exception
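Ordered lock acquisition and task rejuvenation can both be sketched in a few lines. The example assumes a global lock order based on `id()` and a hypothetical account-transfer workload; `run_with_rejuvenation` catches the exception inside the worker only so that the supervisor can tell that the thread died.

```python
import threading


def acquire_in_order(*locks):
    """Acquire locks in one global order so no cycle of waiters can form."""
    ordered = sorted(locks, key=id)
    for lock in ordered:
        lock.acquire()
    return ordered


def transfer(balances, locks, src, dst, amount):
    # Assumes src != dst. Both directions take the locks in the same order.
    held = acquire_in_order(locks[src], locks[dst])
    try:
        balances[src] -= amount
        balances[dst] += amount
    finally:
        for lock in reversed(held):
            lock.release()


def demo_transfers(rounds=1000):
    balances = {"a": 100, "b": 100}
    locks = {"a": threading.Lock(), "b": threading.Lock()}

    def move(src, dst):
        for _ in range(rounds):
            transfer(balances, locks, src, dst, 1)

    threads = [threading.Thread(target=move, args=("a", "b")),
               threading.Thread(target=move, args=("b", "a"))]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return balances


def run_with_rejuvenation(task, max_restarts):
    """Start a new thread whenever the old one dies from an exception."""
    restarts = 0
    while True:
        errors = []

        def body():
            try:
                task()
            except Exception as exc:  # record the death for the supervisor
                errors.append(exc)

        worker = threading.Thread(target=body)
        worker.start()
        worker.join()
        if not errors or restarts >= max_restarts:
            return restarts
        restarts += 1
```

If `transfer` instead locked `src` first and `dst` second, the two opposing threads in `demo_transfers` could each hold one lock and wait forever for the other.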
Categorization, cont’d • Serializers: event loop • for (;;) { get_next_event(); handle_event(); } • Concurrency Exploiters • Use multiple CPUs • Encapsulated Forks • Hidden threads used in library packages • E.g., menu-button queue
Common Problems • Priority Inversion • High priority thread waits for low priority thread • Solution: temporarily push priority up (rejected??) • Deadlock • X waits for Y, Y waits for X • Incorrect Synchronization • Forgetting to release a lock • Failed “fork” • Tuning • E.g. timer values in different environments
Problems he neglects • Implicit need for ordering of events • E.g. thread A is supposed to run before thread B does, but something delays A • Non-reentrant code • Languages lack “monitor” features and users are perhaps surprisingly weak at detecting and protecting concurrently accessed data
Criticism of Hauser • He assumes superb programmers and seems to believe that “most” programmers won’t use threads (his example systems are really platforms, not applications) • Systems are old and not representative • Pre-Java and C# • And now there are some tools that can help discover problems
What is an Event? • An object queued for some module • Operations: • create_event_queue(handler) → EQ • enqueue_event(EQ, event-object) • Invokes, eventually, handler(event-object) • Handler is not allowed to block • Blocking could cause entire system to block • But page faults, garbage collection, …
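The two operations can be sketched in Python as methods on a small dispatcher. The `EventSystem` class and its dictionary-based queue representation are assumptions made for this example; only the operation names come from the slide. Handlers run to completion one at a time, so a handler that blocked would stall every queue.

```python
from collections import deque


class EventSystem:
    def __init__(self):
        self.queues = []

    def create_event_queue(self, handler):
        eq = {"handler": handler, "events": deque()}
        self.queues.append(eq)
        return eq

    def enqueue_event(self, eq, event):
        eq["events"].append(event)

    def run(self):
        """Dispatch until every queue is empty."""
        progress = True
        while progress:
            progress = False
            for eq in self.queues:
                if eq["events"]:
                    # Eventually invokes handler(event-object).
                    eq["handler"](eq["events"].popleft())
                    progress = True


def demo():
    system = EventSystem()
    out = []
    printer = system.create_event_queue(out.append)
    # A handler communicates with another module only by enqueueing events.
    doubler = system.create_event_queue(
        lambda n: system.enqueue_event(printer, n * 2))
    for n in (1, 2, 3):
        system.enqueue_event(doubler, n)
    system.run()
    return out
```

Here `doubler` never calls `printer` directly; it only posts an event to the printer's queue, and the dispatcher delivers it later.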
Example Event System (Also common in telecommunications industry, where it’s called “workflow programming”)
Event Scheduler • Decides which event queue to handle next. • Based on priority, CPU usage, etc. • Never pre-empts event handlers! • No need for stack / event handler • May need to deal with multiple CPUs
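A priority-based event scheduler can be sketched as a loop that always serves the most urgent non-empty queue. The `PriorityEventScheduler` class is hypothetical, and the convention that a lower number means higher priority is an assumption of the example.

```python
from collections import deque


class PriorityEventScheduler:
    def __init__(self):
        self.queues = []  # entries are (priority, handler, events)

    def create_event_queue(self, handler, priority):
        entry = (priority, handler, deque())
        self.queues.append(entry)
        # Assumption: a lower number means a more urgent queue.
        self.queues.sort(key=lambda e: e[0])
        return entry

    def enqueue_event(self, entry, event):
        entry[2].append(event)

    def run(self):
        while True:
            for _, handler, events in self.queues:
                if events:
                    # The handler runs to completion; it is never pre-empted.
                    handler(events.popleft())
                    break  # re-scan from the most urgent queue
            else:
                return     # every queue is empty


def demo():
    log = []
    scheduler = PriorityEventScheduler()
    low = scheduler.create_event_queue(lambda e: log.append(("low", e)), 5)
    high = scheduler.create_event_queue(lambda e: log.append(("high", e)), 1)
    scheduler.enqueue_event(low, "a")
    scheduler.enqueue_event(low, "b")
    scheduler.enqueue_event(high, "x")
    scheduler.run()
    return log
```

Because handlers are never pre-empted, the scheduler needs no per-handler stack: a single call stack serves every handler in turn.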
Synchronization? • Handlers cannot block → no synchronization • Handlers should not share memory • At least not in parallel • All communication through events
Uses of Events • CPU parallelism • Different handlers on different CPUs • I/O concurrency • Completion of I/O signaled by event • Other activities can happen in parallel • Program structuring • Not so great… • But can use multiple programming languages!
Hauser’s categorization ?! • Defer Work: asynchronous activity • Send event to printer, etc • Pumps: pipeline components • Natural use of events! • Sleepers & one-shots • Periodic events & timer events
Categorization, cont’d • Deadlock Avoiders • Ordered lock acquisition still works • Task Rejuvenation: recovery • Watchdog events?
Categorization, cont’d • Serializers: event loop • Natural use of events and handlers! • Concurrency Exploiters • Use multiple CPUs • Encapsulated Events • Hidden events used in library packages • E.g., menu-button queue
Common Problems • Priority inversion, deadlock, etc. much the same with events
Threads vs. Events • Event-based systems use fewer resources • Better performance (particularly scalability) • Event-based systems harder to program • Have to avoid blocking at all cost • Block-structured programming doesn’t work • How to do exception handling? • In both cases, tuning is difficult
Both? • In practice, many kinds of systems need to support both threads and events • Threaded programs in Unix are the common example of these, because window systems use events • The programmer uses cthreads or pthreads • Major problem: the UNIX kernel interface wasn’t designed with threads in mind!
Why does this cause problems? • Many system calls block the “process” • File read or write, for example • And many libraries aren’t reentrant • So when the user employs threads • The application may block unexpectedly • Limited work-around: add “kernel threads” • And the user might stumble into a reentrancy bug
Events as seen in Unix • Window systems use small messages… • But the “old” form of events is signals • Kernel basically simulates an interrupt into the user’s address space • The “event handler” then runs… • But can it launch new threads? • Some system calls can return EINTR • Very limited options to “block” signals in critical sections
How do people work around this? • They try not to do blocking I/O • Use asynchronous system calls… or select… or some mixture of the two • Or try to turn the whole application into an event-driven one using pools of threads, in the SEDA model (more or less) • One dedicated thread per I/O “channel”, to turn signal-style events into events on the event queue for the processing stage
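The select-based workaround can be sketched with Python's standard `selectors` module, which wraps select and its platform-specific relatives. The `read_ready` function and its use of local socket pairs are assumptions made so the example is self-contained; a real server would register client connections instead.

```python
import selectors
import socket


def read_ready(payloads, timeout=1.0):
    """Read from several channels in one thread without blocking on any."""
    sel = selectors.DefaultSelector()
    writers = []
    received = {}
    for name, data in payloads.items():
        reader, writer = socket.socketpair()  # stands in for a real connection
        reader.setblocking(False)
        sel.register(reader, selectors.EVENT_READ, data=name)
        writer.sendall(data)
        writers.append(writer)

    while len(received) < len(payloads):
        events = sel.select(timeout)  # wait until some channel is readable
        if not events:
            break                     # timed out; give up on the rest
        for key, _ in events:
            # recv will not block here: the selector reported data waiting.
            received[key.data] = key.fileobj.recv(4096)
            sel.unregister(key.fileobj)
            key.fileobj.close()

    for writer in writers:
        writer.close()
    sel.close()
    return received
```

The single thread only ever calls `recv` on a channel the selector has reported as readable, which is what keeps the "process" from blocking on one slow channel. The sketch assumes each small payload arrives in a single `recv`.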
This can be hard, but it works • Must write the whole program and have a way to review any libraries it uses! • One learns, the hard way, that pretty much nothing else works • Unix programs built by inexperienced developers are often riddled with concurrency bugs!
SEDA • Mixture of models of threads and (small message-style) events • Events, queues, and “pools of event handling threads”. • Pools can be dynamically adjusted as need arises. • Similar to JavaBeans and EventListeners?
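The stage idea can be sketched as an event queue served by a small pool of threads, with stages connected only through their queues. This is a simplified illustration and not the SEDA implementation: the `Stage` class is hypothetical, and the pool size is fixed here, whereas SEDA adjusts it dynamically.

```python
import queue
import threading


class Stage:
    """An event queue plus a pool of event-handling threads."""

    def __init__(self, handler, workers, next_stage=None):
        self.events = queue.Queue()
        self.handler = handler
        self.next_stage = next_stage
        self.threads = [threading.Thread(target=self._loop, daemon=True)
                        for _ in range(workers)]
        for t in self.threads:
            t.start()

    def enqueue(self, event):
        self.events.put(event)

    def _loop(self):
        while True:
            event = self.events.get()
            try:
                result = self.handler(event)
                if self.next_stage is not None:
                    self.next_stage.enqueue(result)  # hand off via a queue
            finally:
                self.events.task_done()


def demo(requests):
    results = []
    lock = threading.Lock()

    def respond(text):
        with lock:
            results.append(text)

    send = Stage(respond, workers=1)
    parse = Stage(str.upper, workers=3, next_stage=send)
    for request in requests:
        parse.enqueue(request)
    parse.events.join()  # every request has been handed to the send stage
    send.events.join()   # every response has been recorded
    return sorted(results)
```

Inside a stage, handler code is written in ordinary threaded style; between stages, the queues give the event-style decoupling and a natural place to measure load.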
Authors: “Best of both worlds” • Ease of programming of threads • Or even better • Performance of events • Or even better
Threads Considered Harmful • Like goto, transfer to some entry in program • In any scope • Destroys structure of programs • Primitive Synchronization Primitives • Too low-level • Too coarse-grained • Too error-prone • Prone to over-specification
Example: create file • Create file • Read current directory (may be cached) • Update and write back directory • Write file
Thread Implementations • Serialize: op1; op2; op3; op4 • Simplest and most common • Use threads • Requires at least two semaphores! • Results in complicated program • Simplified threads • Create file and read directory in parallel • Barrier • Write file and write directory in parallel • Over-specification!
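The "simplified threads" variant can be sketched with two parallel phases separated by a barrier. The operations are simulated by appending names to a log, and `run_parallel` and `create_file_with_barrier` are illustrative names. Joining every thread of a phase before starting the next acts as the barrier.

```python
import threading


def run_parallel(*ops):
    threads = [threading.Thread(target=op) for op in ops]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # joining all threads of the phase acts as the barrier


def create_file_with_barrier(log):
    lock = threading.Lock()

    def step(name):
        def op():
            with lock:  # protect the shared log
                log.append(name)
        return op

    # Phase 1: create the file and read the directory in parallel.
    run_parallel(step("create file"), step("read directory"))
    # Phase 2: write the file and write the directory in parallel.
    run_parallel(step("write file"), step("update and write directory"))
    return log
```

The barrier makes every phase-2 operation wait for every phase-1 operation, even one it may not depend on (writing the file plausibly needs only the file to exist, not the directory read). That extra ordering is the over-specification the slide refers to.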
Event Implementation • Create a dummy handler that awaits file creation and directory read events and then send an event to update the directory. • Not great…
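The dummy handler can be sketched as a small piece of state that remembers which prerequisite events have arrived. The event names and the `create_file_events` function are assumptions made for this example.

```python
from collections import deque


def create_file_events():
    events = deque()
    log = []
    pending = {"file_created", "directory_read"}

    def join_handler(event):
        # Dummy handler: its only job is to wait for both prerequisites.
        pending.discard(event)
        if not pending:
            events.append("update_directory")

    handlers = {
        "file_created": join_handler,
        "directory_read": join_handler,
        "update_directory": lambda e: log.append("directory updated"),
    }

    events.extend(["file_created", "directory_read"])
    while events:
        event = events.popleft()
        log.append(event)
        handlers[event](event)
    return log
```

The dependency "update the directory after both events" is buried in the handler's private `pending` set rather than stated anywhere in the program structure, which is one way to read the slide's "Not great…".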
GOP: Discussion • Specifies dependencies at a high level • No semaphores, condition variables, etc. • No explicit threads or events • Can easily be supported by many languages • C, Java, etc. • Top-down specification • Compare with make, Prolog, theorem provers • Exception handling easily supported
Conclusion • Threads still problematic • As a code structuring mechanism • High resource usage • Events also problematic • Hard to code, but efficient • SEDA and GOP address shortcomings • But neither can be said to have taken hold
Issues not discussed • Kernel vs. User-level implementation • Shared memory and protection tradeoffs • Problems seen in demanding applications that may launch enormous numbers of threads