
Eliminating Receive Livelock in an Interrupt-Driven Kernel



    1. Eliminating Receive Livelock in an Interrupt-Driven Kernel Mogul and Ramakrishnan

    2. Questions
       - What are buffers for? Amortizing overhead; smoothing load.
       - What are interrupts for? Compare to polling: which has more latency? When is polling good?
       - What is scalability?

    3. Suppose you are an artist, with an agent who has some samples of your work. You periodically check with your agent to see if anyone wants to commission a work. What might be bad about this?

    4. Ask the agent to call immediately whenever anyone expresses interest. What might be bad about this?

    5. When done with one painting, poll for another. If no jobs are waiting, enable interrupts and wait. The first interrupt disables interrupts again.

    6. Introduction

    7. OSs were originally designed to handle devices that interrupt only once every few milliseconds: disks, slow network adapters. The world has changed; network adapters now interrupt much more often.
       - Many network applications are not flow-controlled. (Why?) Congestive collapse: no negative feedback loop, and maybe even a positive feedback loop. (Explain? Example: a call center as a positive feedback loop.)
       - Maybe we can't accommodate the load, but we should respond gracefully. Interrupt-driven systems tend to respond badly under load: tasks performed at interrupt level, by definition, have higher priority, so if all time is spent responding to interrupts, nothing else will happen. This is receive livelock.
       - Note that this definition of livelock is a little different than in other contexts. You can have livelock in a totally contained system: just an infinite loop across two or more threads, e.g. s1, s2, s3, s1, s2, s3, ... or s1, t5, s3, t9, s1, t5, s3, t9, ...

    8. Livelock Any system with an unbounded input rate and a non-zero per-input cost will eventually livelock. Turning interrupts off gives zero cost; a hardware-limited device bounds the input rate.
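
A quick way to see why (notation mine, not the paper's): if packets arrive at rate λ and each costs at least c seconds of interrupt-level work, interrupt handling consumes a fraction min(1, λ·c) of the CPU. Once λ ≥ 1/c, interrupt work alone saturates the machine and no cycles remain to deliver the packets already received, so useful throughput falls to zero. Disabling interrupts makes c effectively zero for new arrivals; a hardware-limited device bounds λ below 1/c.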

    9. But interrupts are very useful. Hybrid design:
       - Poll only when triggered by an interrupt; interrupt only when polling is suspended.
       - Augment with feedback control, dropping the packets in which the least work has been invested.
       - Connect the scheduling subsystem to the network subsystem, to give user tasks some CPU time even under overload.
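
A minimal single-threaded sketch of the hybrid's control flow, with a simulated device; the names (rx_interrupt, poll_pass, QUOTA) are hypothetical, not from the paper:

```c
/* Minimal sketch of the interrupt/polling hybrid. All names are
 * hypothetical; a real kernel does this at interrupt level. */
#include <stdbool.h>
#include <stdio.h>

#define QUOTA 5                    /* max packets processed per poll pass    */

static bool intr_enabled = true;   /* device interrupt mask (simulated)      */
static int  pending      = 0;      /* packets waiting in the device (sim.)   */

/* "Interrupt handler": disable further interrupts and request polling.      */
static bool rx_interrupt(void)
{
    if (!intr_enabled)
        return false;
    intr_enabled = false;          /* no more interrupts until polling idles  */
    return true;                   /* tell the caller to schedule the poller  */
}

/* Polling pass: process up to QUOTA packets, then yield to other work.      */
static void poll_pass(void)
{
    for (int i = 0; i < QUOTA && pending > 0; i++, pending--)
        printf("processed a packet (%d left)\n", pending - 1);

    if (pending == 0)
        intr_enabled = true;       /* idle again: fall back to interrupt mode */
}

int main(void)
{
    pending = 12;                  /* a burst arrives                         */
    if (rx_interrupt())            /* first packet raises the only interrupt  */
        while (!intr_enabled)      /* poll (with quota) until drained         */
            poll_pass();
    return 0;
}
```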

    10. Motivating Applications

    11. Motivating Applications
        - Host-based routing: many products based on Linux/UNIX; experimentation was also done on UNIX.
        - Passive network monitoring: simpler and cheaper to do with a general-purpose OS.
        - Network file service: can be swamped by NFS/RPC.
        - High-performance networking: even though it is flow-controlled, livelock might still be an issue.

    12. Requirements for Scheduling Network Tasks
        - Ideally, handle the worst-case load; but that is too expensive, so aim for graceful degradation.
        - Constant overhead: if overhead increases as offered load increases, it eventually consumes all CPU.
        - Throughput, defined as the rate delivered to the ultimate consumer: should keep up with offered load up to the MLFRR (maximum loss-free receive rate) and never drop below it. Must also allow transmission to continue.
        - Latency and jitter: even during high load, avoid long queues, and avoid bursty scheduling, which increases jitter. (Why is jitter bad?)
        - Fair allocation: must continue to process other tasks.

    13. Interrupt-Driven Scheduling and Its Consequences

    14. Problems Three kinds of problems:
        - Receive livelock under overload.
        - Increased latency for packet delivery or forwarding.
        - Starvation of transmission.
        What causes these problems? They arise because the interrupt subsystem is not a component of the scheduler.

    15. Description of an Interrupt-Driven System Based on 4.2BSD; others are similar.
        - The network interface signals packet arrival by raising an interrupt.
        - The interrupt handler in the device driver performs some initial processing, places the packet on a queue, and generates a software interrupt (at lower IPL) to do the rest. No scheduler participation.
        - Some amortization is done by batching of interrupts. How is batching different from polling?
        - But under heavy load, all time is still spent at device IPL: incoming packets are given absolute priority. This design was based on early adapters with little memory; it is not appropriate for modern devices.
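
A sketch of that two-level structure, loosely modeled on BSD's ipintrq/software-interrupt split (simulated in user space; the details are illustrative):

```c
/* Sketch of the classic BSD-style split receive path (simplified;
 * loosely modeled on 4.2BSD's ipintrq/netisr mechanism). */
#include <stdio.h>

#define QLEN 8
static int ipintrq[QLEN];           /* IP input queue (here: packet ids)      */
static int qhead, qtail, qcount;
static int soft_pending;            /* simulated software-interrupt flag      */

/* Runs at device IPL: minimal work, then hand off via software interrupt.   */
static void device_intr(int pkt)
{
    if (qcount == QLEN) {           /* queue full: drop, but the interrupt-   */
        printf("drop %d\n", pkt);   /* level work was already paid for        */
        return;
    }
    ipintrq[qtail] = pkt;
    qtail = (qtail + 1) % QLEN;
    qcount++;
    soft_pending = 1;               /* schednetisr(NETISR_IP), roughly        */
}

/* Runs at a lower software IPL -- preempted by every new device interrupt.  */
static void ip_softintr(void)
{
    while (qcount > 0) {
        int pkt = ipintrq[qhead];
        qhead = (qhead + 1) % QLEN;
        qcount--;
        printf("IP processing of %d\n", pkt);
    }
    soft_pending = 0;
}

int main(void)
{
    for (int pkt = 0; pkt < 3; pkt++)
        device_intr(pkt);           /* a small burst arrives back-to-back     */
    if (soft_pending)
        ip_softintr();              /* only runs once the device IPL is idle  */
    return 0;
}
```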

    16. Explain

    17. Receive Livelock A system can behave in one of three ways as load increases:
        - Ideal: throughput always matches offered load.
        - Realizable: throughput rises to the MLFRR, then stays constant.
        - Livelock: throughput goes down as offered load rises.
        What is the effect of better performance? What is the effect of batching? The fundamental problem is not performance, but priorities/scheduling.

    18. Receive Latency under Overload Interrupts are usually thought of as a way to reduce latency. But when a burst arrives, first comes link-level processing of the whole burst, then higher-level processing of each packet; this can amount to bad scheduling (e.g. an NFS RPC that requires the disk). Experiment:
        - Link-level processing at device IPL, including copying the packet into kernel buffers (no DMA).
        - Further processing following a software interrupt: locating the process, queuing the packet for delivery to it.
        - Awakening the user process, which copies the packet into its own buffer.

    19. Receive Latency under Overload Latency to deliver the first packet to the user application is almost linear in burst size: a 1-packet burst takes 1.23 ms, a 2-packet burst 1.54 ms, a 4-packet burst 2.02 ms, and a 16-packet burst 5.03 ms. Can we expect a burst to have no effect at all on latency?
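
Those numbers are consistent with a simple fixed-plus-per-packet model (a rough fit of mine, not from the paper): between the 1- and 16-packet bursts, latency grows by (5.03 − 1.23)/15 ≈ 0.25 ms per packet, giving latency ≈ 0.98 ms + 0.25 ms per packet in the burst. That model predicts 1.48 ms and 1.99 ms for the 2- and 4-packet bursts, close to the measured 1.54 ms and 2.02 ms: each packet's link-level processing adds roughly a quarter millisecond before the first packet can be delivered.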

    20. Starvation of Transmits under Load The context is routers/forwarding. Transmission is usually done at lower priority than receiving; the idea is to minimize packet loss during a burst. However, under load, starvation of transmission can occur.

    21. Avoiding Livelock Through Better Scheduling

    22. Avoiding Livelock Through Better Scheduling
        - Control the rate of interrupts.
        - Polling-based mechanisms to ensure fair allocation of resources.
        - Techniques to avoid unnecessary preemption of downstream packet processing.

    23. Limiting Interrupt Rate Minimize the work invested in packets that will be dropped. Disable interrupts when the system can't handle the load:
        - When the internal queue is full, disable; re-enable when buffer space becomes available, or after a delay. (Which is better, in general?)
        - Guarantee some progress for user-level code: time how long is spent in the packet-input code, and disable if it is too much. This can be simulated by using the clock interrupt to sample state. Related questions: How does the OS compute CPU usage? How about profiling?
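
A sketch of the queue-full feedback, assuming a hypothetical driver where the receive-interrupt path and the consumer share a fixed ring:

```c
/* Sketch of queue-full feedback: stop taking receive interrupts when the
 * input queue fills, resume when a consumer frees space. Names hypothetical. */
#include <stdbool.h>

#define QLEN 32
static int  queue[QLEN];
static int  head, tail, count;
static bool rx_intr_enabled = true;

static void enqueue_from_intr(int pkt)   /* called from the rx interrupt      */
{
    queue[tail] = pkt;
    tail = (tail + 1) % QLEN;
    if (++count == QLEN)
        rx_intr_enabled = false;  /* full: stop interrupting; NIC buffers/drops */
}

static int dequeue_from_consumer(void)   /* called from protocol processing   */
{
    int pkt = queue[head];
    head = (head + 1) % QLEN;
    count--;
    rx_intr_enabled = true;       /* space again: resume interrupts            */
    /* Alternative: re-enable only after a delay (timer), trading latency for
     * fewer enable/disable transitions under sustained overload.              */
    return pkt;
}

int main(void)
{
    for (int i = 0; i < QLEN && rx_intr_enabled; i++)
        enqueue_from_intr(i);            /* a burst fills the queue...         */
    dequeue_from_consumer();             /* ...then the consumer frees a slot  */
    return rx_intr_enabled ? 0 : 1;      /* interrupts are back on             */
}
```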

    24. Use of Polling When tasks behave unpredictably, use interrupts; when they behave predictably, use polling. Also poll to get fair allocation, by servicing sources round-robin (RR).

    25. Avoiding Preemption Livelock occurs because interrupts preempt everything else. The solution is to run the downstream processing at the same IPL: either run (almost) everything at a high IPL, or run (almost) everything at a low IPL. Which is better? The interrupt handler only sets a flag and schedules the polling thread; the polling thread re-enables interrupts only when it is done.
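
The flag-and-wakeup handshake can be sketched with a pthread condition variable standing in for the kernel's scheduler (all names hypothetical; compile with -lpthread):

```c
/* Sketch: the interrupt handler only sets a flag and wakes the polling
 * thread; the poller re-enables interrupts when it runs out of work. */
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  wake = PTHREAD_COND_INITIALIZER;
static bool poll_requested = false;
static bool intr_enabled   = true;
static int  backlog        = 0;      /* packets the "device" has queued       */

static void rx_interrupt(int npkts)  /* pretend this runs at device IPL       */
{
    pthread_mutex_lock(&lock);
    backlog += npkts;
    if (intr_enabled) {
        intr_enabled   = false;      /* mask the device...                    */
        poll_requested = true;       /* ...and schedule the poller            */
        pthread_cond_signal(&wake);
    }
    pthread_mutex_unlock(&lock);
}

static void *poller(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock);
    while (!poll_requested)
        pthread_cond_wait(&wake, &lock);
    while (backlog > 0) {            /* process to completion, unpreempted    */
        backlog--;
        pthread_mutex_unlock(&lock); /* real work happens outside the lock    */
        printf("packet done\n");
        pthread_mutex_lock(&lock);
    }
    intr_enabled = true;             /* idle again: unmask the device         */
    pthread_mutex_unlock(&lock);
    return NULL;
}

int main(void)
{
    pthread_t t;
    rx_interrupt(4);                 /* first interrupt: masks device, sets flag */
    rx_interrupt(3);                 /* device masked: just grows the backlog    */
    pthread_create(&t, NULL, poller, NULL);
    pthread_join(t, NULL);
    return 0;
}
```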

    26. Summary Avoid livelock by:
        - Using interrupts only to initiate polling.
        - Using round-robin polling to fairly allocate resources among sources.
        - Temporarily disabling input when feedback from a full queue, or a limit on CPU usage, indicates that other important tasks are pending.
        - Dropping packets early, rather than late, to avoid wasted work; once we decide to receive a packet, try to process it to completion.
        Maintain high performance by:
        - Re-enabling interrupts when no work is pending, to avoid polling overhead and to keep latency low.
        - Letting the receiving interface buffer bursts, to avoid dropping packets.
        - Eliminating the IP input queue and its associated overhead.

    27. Livelock in BSD-Based Routers

    28. Livelock in BSD-Based Routers An IP packet router built using Digital UNIX. Goals:
        - Obtain the highest possible maximum throughput, and maintain throughput even when overloaded.
        - Allocate sufficient CPU cycles to user-mode tasks.
        - Minimize latency.
        - Avoid degrading performance in other applications.

    29. Measurement Methodology A host-based router connecting two Ethernets. The source host generated UDP packets carrying 4 bytes of data. A slow Alpha host was used, to make livelock more evident. Both the pure kernel and the kernel plus a user-mode component (screend) were tested. Throughput (Y-axis) is the output rate.

    30. What's the MLFRR? Is it really? Where does livelock occur? Why is screend worse than the pure kernel?

    31. Why Livelock Occurs in the 4.2BSD Model Packets should be discarded as early as possible.

    32. Fixing the Livelock Problem Drivers register with the polling system. The polling system notices which interfaces need processing and calls their callbacks with a quota. The received-packet callback performs the IP processing.
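
A sketch of the registration-plus-quota scheme; the structure follows the slide, but every identifier (register_poll, rx_callback, struct netif) is made up for illustration:

```c
/* Sketch: drivers register callbacks, and the polling loop services
 * ready interfaces round-robin with a per-callback quota. */
#include <stdbool.h>
#include <stdio.h>

#define MAX_IF 4
#define QUOTA  10                        /* packets per callback invocation   */

struct netif {
    const char *name;
    int  backlog;                        /* packets awaiting processing       */
    bool needs_poll;                     /* set by the interrupt handler      */
};

static struct netif *ifs[MAX_IF];
static int nifs;

static void register_poll(struct netif *nif) { ifs[nifs++] = nif; }

/* Received-packet callback: process up to 'quota' packets to completion
 * (including IP forwarding), and report whether work remains.               */
static bool rx_callback(struct netif *nif, int quota)
{
    int n = nif->backlog < quota ? nif->backlog : quota;
    nif->backlog -= n;
    printf("%s: processed %d, %d left\n", nif->name, n, nif->backlog);
    return nif->backlog > 0;
}

static void poll_loop(void)
{
    bool work = true;
    while (work) {                       /* round-robin over ready interfaces */
        work = false;
        for (int i = 0; i < nifs; i++) {
            if (!ifs[i]->needs_poll)
                continue;
            ifs[i]->needs_poll = rx_callback(ifs[i], QUOTA);
            work |= ifs[i]->needs_poll;
        }
    }
}

int main(void)
{
    struct netif eth0 = { "eth0", 25, true };
    struct netif eth1 = { "eth1", 5,  true };
    register_poll(&eth0);
    register_poll(&eth1);
    poll_loop();                 /* eth1 drains in one round; eth0 takes three */
    return 0;
}
```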

    33. Results of Modifications Why is the slope gradual in one case but not so gradual in the other?

    34. Feedback from Full Queues Detect when the screend queue is full. The quota was 10; the screend queue held 32 entries, with 25% and 75% watermarks.
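
Concretely (my reading of the thresholds): with a 32-entry queue, the 75% high-water mark is 24 entries and the 25% low-water mark is 8, so the received-packet callbacks are presumably disabled once the screend queue reaches 24 and re-enabled when it drains to 8. The wide gap between the two provides hysteresis, preventing the system from toggling input on and off on every packet under sustained overload.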

    35. Choice of Quota Smaller quotas work better. (Why?)

    36. Overlap

    37. Sensitivity of Quota Peak rate slightly higher with larger quota.

    38. With screend

    39. Guaranteeing Progress for User-Level Processes

    40. Modification Use a performance counter to measure how many CPU cycles are spent per period in packet processing. If above some threshold, disable input handling.
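
A sketch of that cycle-budget check, with clock() standing in for the hardware cycle counter and a 1-second period assumed; the names and the 25% budget are illustrative:

```c
/* Sketch: limit packet processing to a CPU budget per timer period.
 * clock() stands in for the cycle counter; all names hypothetical. */
#include <stdbool.h>
#include <time.h>

#define PERIOD_BUDGET (CLOCKS_PER_SEC / 4)  /* 25% of an assumed 1 s period   */

static clock_t spent_this_period;           /* packet-processing time used    */
static bool    input_enabled = true;

static void process_packets(void)           /* wraps the packet-input code    */
{
    clock_t start = clock();
    for (volatile long i = 0; i < 1000000; i++)
        ;                                   /* stand-in for real packet work  */
    spent_this_period += clock() - start;
    if (spent_this_period > PERIOD_BUDGET)
        input_enabled = false;              /* over budget: stop taking input */
}

static void timer_tick(void)                /* runs once per period           */
{
    spent_this_period = 0;
    input_enabled = true;                   /* fresh budget each period       */
}

int main(void)
{
    timer_tick();                           /* start of a period              */
    while (input_enabled)
        process_packets();                  /* trips the limit, then stops;   */
    return 0;                               /* remaining time goes to users   */
}
```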

    41. Why the discrepancy? Why the dip?

    42. Future Work
        - Selective packet dropping: packets have different values.
        - Interactions with application-level scheduling: reduce latency for the currently scheduled process; during overload, favor packets destined for the current process, or run the process with the most work to do.

    43. Summary Must be able to discard input with zero or minimal overhead, and balance interrupts with polling. The authors felt the solutions were all a little ad hoc; perhaps a more general, end-to-end system could be created, which might eliminate the need for tuning.
