
Timing-Predictable Systems - Reconciling Predictability with Performance -


Presentation Transcript


  1. Timing-Predictable Systems - Reconciling Predictability with Performance - Lothar Thiele and Reinhard Wilhelm

  2. Quantified: Time • Embedded controllers with hard real-time characteristics must be guaranteed to finish their tasks within deadlines. • (static) Schedulability test must be performed. • needs (upper) bounds on the execution times of all tasks • Timing Predictability provides for precise bounds
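
A minimal sketch of what such a static schedulability test can look like, here classic response-time analysis for fixed-priority preemptive tasks (the task set, the numbers, and all names are illustrative assumptions, not from the talk). The C values are exactly the WCET upper bounds the talk asks for; each task's response time is iterated to a fixpoint and checked against its deadline:

    #include <stdio.h>

    /* One task: WCET upper bound C, period T, deadline D (microseconds).
       Tasks are sorted by priority, index 0 = highest. Numbers are made up. */
    typedef struct { const char *name; long C, T, D; } task_t;

    /* Classic response-time iteration:
       R = C_i + sum over higher-priority tasks j of ceil(R/T_j) * C_j      */
    static long response_time(const task_t *ts, int i) {
        long R = ts[i].C, prev = -1;
        while (R != prev && R <= ts[i].D) {   /* stop on fixpoint or deadline miss */
            prev = R;
            R = ts[i].C;
            for (int j = 0; j < i; j++)       /* interference from higher priorities */
                R += ((prev + ts[j].T - 1) / ts[j].T) * ts[j].C;
        }
        return R;
    }

    int main(void) {
        task_t ts[] = { {"ctrl", 200, 1000, 1000},
                        {"io",   400, 4000, 4000},
                        {"log",  900, 8000, 8000} };
        for (int i = 0; i < 3; i++) {
            long R = response_time(ts, i);
            printf("%s: R = %ld us  %s\n", ts[i].name, R,
                   R <= ts[i].D ? "schedulable" : "DEADLINE MISS");
        }
        return 0;
    }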

  3. Assumptions • Aiming at guarantees, i.e. need to consider all executions • Achieve Predictability not at (considerable) loss of Performance => completely (locally) deterministic systems are not the alternative • Systems too big for exhaustive approaches • Analytical approaches necessary

  4. Variability of Execution Times • is at the heart of timing unpredictability, • is introduced at all levels of granularity • Memory reference • Instruction execution • Function • Task • Distributed system of tasks • Service

  5. Access Times. Example: x = a + b; compiles to LOAD r2, _a; LOAD r1, _b; ADD r3,r2,r1. (Slide figure: access times of the LOAD instructions on the MPC 5xx and the PPC 755.)

  6. Timing Accidents and Penalties. Timing Accident – a cause for an increase of the execution time of an instruction. Timing Penalty – the associated increase. • Types of timing accidents, e.g. cache misses, pipeline stalls, branch mispredictions

  7. Deriving Run-Time Guarantees • Static Program Analysis derives Invariants about all execution states at a program point. • Derive Safety Properties from these invariants: certain timing accidents will never happen. Example: at program point p, instruction fetch will never cause a cache miss. • The more accidents excluded, the lower the upper bound. • (and the more accidents predicted, the higher the lower bound).
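
As a made-up illustration of such a safety property (assuming, say, 32-byte cache lines and GCC-style attributes): in the loop below, a must analysis can prove that all but the first access to buf hit the cache, so the cache-miss accident is excluded at that access for every later iteration.

    #include <stdint.h>

    #define N 8
    /* buf occupies a single 32-byte cache line (assumed line size). */
    static int32_t buf[N] __attribute__((aligned(32)));

    int32_t sum(void) {
        int32_t s = 0;
        for (int i = 0; i < N; i++)
            s += buf[i];   /* iteration 0 may miss; a must analysis can prove
                              iterations 1..7 hit, since the whole line is in
                              the cache and nothing evicts it inside the loop */
        return s;
    }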

  8. History-Sensitivity of Execution Times - Problem and Chance - Contribution of the execution of an instruction to a program's execution time • depends on the execution state, i.e., on the execution so far, • can be bounded if strong invariants about all execution states at this instruction are available.

  9. Bounds, Guarantees and Predictability. (Slide figure: a time axis with lower bound <= best case <= worst case <= upper bound; the spread between best and worst case is caused by interference and non-determinism in the design, while the gaps between the actual cases and the computed bounds are caused by limited analysis techniques.)

  10. Basic Notions. (Slide figure: on the time axis, lower bound, best case, worst case, upper bound; worst-case predictability is the distance between worst case and upper bound, best-case predictability the distance between best case and lower bound; the upper bound is the worst-case guarantee; the over-estimation is roughly uncertainty x penalties.) • Message: • make systems analysable • control the penalties

  11. Some published Results. (Slide chart: over-estimation of published WCET analyses versus the cache-miss penalty of the analysed processor, for Lim et al. 1995, Thesing et al. 2002, Souyris et al. 2005, and Tan 2007; while cache-miss penalties grew from about 4 to about 200 cycles, the reported over-estimations dropped from 30-50% down to 7-8%, with 20-30%, 15% and 15-25% in between.)

  12. System Characteristics and Degrees of Overestimation • Airbus A380 code: • real code, synthesized from SCADE • complex processor, PPC 755 • 15 – 25% overestimation • ARTIST2 WCET Tool Challenge • small benchmark programs • simple processors, ARM7 • 7 – 8% overestimation

  13. The Goal. Time predictability + performance: • minimize (upper bound – lower bound) • minimize WCET

  14. Compiler vs. Processor – an old battle. Compiler responsible: EPIC/VLIW, scratchpad memory. Properties: large focus, static information; complex algorithms: heuristics required; heap; predictability. Processor responsible: superscalar, caches. Properties: small focus, dynamic information; complex hardware: high energy costs; adaptability.

  15. Troublesome Architectural Features • Interference between architecture components • Branch prediction – instruction cache • Shared resources • Unified caches • Register overlays • Implicit actions (memory mapped registers) • Non-predictable variability • Memory access • Operation timing • Concurrency in combination with shared resources • Superscalarity • Out-of-order execution • Multi-threading (dyn. scheduled)

  16. Penalties for Memory Accesses (in #cycles for PowerPC 755). Remember: penalties have to be assumed for uncertainties! Tendency: increasing, since clock rates are growing faster than everything else (memory speed in particular).
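
A back-of-the-envelope illustration with assumed numbers (not the slide's table): if an analysis cannot exclude a cache miss for 50 of a task's memory accesses and the miss penalty is 40 cycles, the computed upper bound must include 50 x 40 = 2000 extra cycles, even if almost all of those accesses hit at run time; as penalties grow toward hundreds of cycles, the same number of unresolved accesses inflates the bound proportionally.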

  17. Cache Impact of Language Constructs • Pointer to data • Function pointer • Dynamic method invocation • Service demultiplexing (CORBA)
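
A made-up C fragment showing why such constructs hurt: the array accesses themselves are analysable, but if the analysis cannot enumerate the possible targets of the function pointer, it must treat the call as an unknown access with maximal cache damage (cf. slide 22):

    #include <stddef.h>

    typedef void (*handler_t)(void *);

    /* Dispatch table filled at run time, e.g. by a registration API. */
    static handler_t handlers[16];
    static void     *contexts[16];

    void dispatch(size_t id) {
        handler_t h = handlers[id];   /* data access at a statically known address ... */
        if (h)
            h(contexts[id]);          /* ... but the call target is not known: an
                                         analysis that cannot enumerate the possible
                                         handlers must assume an unknown instruction
                                         stream and discard its cache knowledge */
    }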

  18. Cache Analysis. How to statically precompute cache contents: Must Analysis: for each program point (and calling context), find out which blocks are in the cache every time program execution reaches this program point (through this context)

  19. Must-Cache Information. Must Analysis determines safe information about cache hits. Each predicted cache hit reduces the upper bound.

  20. Cache with LRU Replacement: Transfer for must. (Slide figure, ages from "young" to "old": on an access [s], the concrete set z, y, x, t becomes s, z, y, x and the concrete set z, s, x, t becomes s, z, x, t; the abstract must-cache {x}, {}, {s,t}, {y} becomes {s}, {x}, {t}, {y}: s gets age 0, blocks younger than s's previous maximal age are aged by one, the rest keep their age.)

  21. Cache Analysis: Join (must). (Slide figure: joining the abstract caches {a}, {}, {c,f}, {d} and {c}, {e}, {a}, {d} by "intersection + maximal age" yields {}, {}, {a,c}, {d}.) Interpretation: memory block a is definitively in the (concrete) cache => always hit

  22. Cache with LRU Replacement: Transfer for must under an unknown access, e.g. an unresolved data pointer. (Slide figure: under the access [?], the abstract cache {x}, {}, {s,t}, {y} becomes {}, {x}, {}, {s,t}.) If the address is completely undetermined, the same loss and no gain of information occurs in every cache set! Analogously for multiple unknown accesses, e.g. an unknown function pointer: assume maximal cache damage.
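
A minimal sketch of this abstract must-cache domain for one 4-way LRU set, covering the transfer of slides 20 and 22 and the join of slide 21 (the representation and all names are mine, not those of any particular WCET tool):

    #include <stdio.h>

    #define WAYS    4          /* associativity k of one cache set        */
    #define BLOCKS  8          /* size of the (toy) universe of blocks    */
    #define NOTIN   WAYS       /* "not known to be in the cache"          */

    /* Abstract must-cache for one set: for every block an upper bound on
       its LRU age, or NOTIN if it may be absent.                          */
    typedef struct { int age[BLOCKS]; } must_t;

    static void must_init(must_t *m) {
        for (int b = 0; b < BLOCKS; b++) m->age[b] = NOTIN;
    }

    /* Transfer for a known access to block b (slide 20): b gets age 0,
       blocks younger than b's previous age bound are aged by one.         */
    static void must_access(must_t *m, int b) {
        int old = m->age[b];
        for (int x = 0; x < BLOCKS; x++)
            if (x != b && m->age[x] < old)
                m->age[x] = (m->age[x] + 1 < WAYS) ? m->age[x] + 1 : NOTIN;
        m->age[b] = 0;
    }

    /* Transfer for an unknown access (slide 22): every known block ages
       by one, the oldest ones are lost.                                   */
    static void must_unknown(must_t *m) {
        for (int x = 0; x < BLOCKS; x++)
            if (m->age[x] < WAYS)
                m->age[x] = (m->age[x] + 1 < WAYS) ? m->age[x] + 1 : NOTIN;
    }

    /* Join at control-flow merges (slide 21): intersection + maximal age. */
    static must_t must_join(const must_t *a, const must_t *b) {
        must_t r;
        for (int x = 0; x < BLOCKS; x++)
            r.age[x] = (a->age[x] > b->age[x]) ? a->age[x] : b->age[x];
        return r;
    }

    /* Replaying slide 21: join of {a},{},{c,f},{d} with {c},{e},{a},{d}.   */
    int main(void) {
        enum { A, C, D, E, F };
        must_t l, r;
        must_init(&l); must_init(&r);
        l.age[A] = 0; l.age[C] = 2; l.age[F] = 2; l.age[D] = 3;
        r.age[C] = 0; r.age[E] = 1; r.age[A] = 2; r.age[D] = 3;
        must_t j = must_join(&l, &r);
        printf("a:%d c:%d d:%d e:%d f:%d\n",
               j.age[A], j.age[C], j.age[D], j.age[E], j.age[F]);
        return 0;   /* prints a:2 c:2 d:3 e:4 f:4, i.e. {}, {}, {a,c}, {d} */
    }

(The unused helpers must_access and must_unknown are the per-access transfers; a real analysis would apply them along every program path and call must_join wherever paths merge.)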

  23. Dynamic Method Invocation • Traversal of a data structure representing the class hierarchy • Corresponding worst-case execution time and resulting cache damage • Efficient implementation [WiMa] with table lookup needs 2 indirect memory references; if page faults cannot be excluded: 2 x pf = 4000 cycles!
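
As a sketch of the kind of table lookup meant here (hand-rolled C, not the [WiMa] scheme itself): the call goes through the object's vtable pointer and then through the method slot, i.e. two indirect memory references, each of which may miss the cache or, in the worst case, page-fault.

    #include <stdio.h>

    struct shape;                                   /* "class" with one virtual method */
    typedef double (*area_fn)(const struct shape *);

    struct vtable { area_fn area; };
    struct shape  { const struct vtable *vt; double a, b; };

    static double rect_area(const struct shape *s)   { return s->a * s->b; }
    static double circle_area(const struct shape *s) { return 3.14159265 * s->a * s->a; }

    static const struct vtable rect_vt   = { rect_area };
    static const struct vtable circle_vt = { circle_area };

    double area_of(const struct shape *s) {
        /* 1st indirect reference: load s->vt; 2nd: load s->vt->area;
           then an indirect call. If either load can page-fault, the
           worst case explodes (the 2 x pf on the slide).              */
        return s->vt->area(s);
    }

    int main(void) {
        struct shape r = { &rect_vt, 3.0, 4.0 };
        struct shape c = { &circle_vt, 2.0, 0.0 };
        printf("%.2f %.2f\n", area_of(&r), area_of(&c));
        return 0;
    }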

  24. System Layers • Distributed Operation • Inter-Task Level • Intra-Task Level • Hardware Platform. Cross-Layer Dependencies.

  25. System-Level Performance Methods. (Slide figure: for a quantity such as delay, measurement and simulation of the real system yield observations between its best case and worst case, while analysis provides best-case and worst-case bounds that enclose them.)

  26. Difficulties. (Slide figure: an input stream of events of different types feeding communicating, scheduled tasks.) Complex Input: timing (jitter, bursts, ...), different event types.

  27. Difficulties (continued). (Slide figure: the input stream is buffered and processed by tasks on a processor.) Complex Input: timing (jitter, bursts, ...), different event types. Variable Execution Demand: input (different event types), internal state (program, cache, ...). Variable Resource Availability: task scheduling, task communication.

  28. Why is Performance Analysis of Distributed Systems Difficult? • non-deterministic environment: unpredictable input streams, data-dependent behavior • interference between concurrent actions: multiple applications, sharing of limited resources, scheduling/arbitration mechanisms • local non-determinism: long-range dependencies, adaptive behavior (control loops)

  29. Case Study - Opportunities. (Slide figure: streams S1-S6 entering three ECU/CC nodes connected by a bus.) • 6 real-time input streams: with jitter, with bursts, deadline > period • 3 ECUs with their own CCs • 13 tasks & 7 messages, with different WCEDs • 2 scheduling policies: Earliest Deadline First (ECUs), Fixed Priority (ECUs & CCs) • Hierarchical scheduling: static & dynamic polling servers • Bus with TDMA: 4 time slots with different lengths (#1, #3 for CC1, #2 for CC3, #4 for CC3) • Total utilization: ECU1 59%, ECU2 87%, ECU3 67%, BUS 56%

  30. The Distributed System... (Slide figure: the mapping of tasks T1.1-T6.1 and messages C1.1-C5.1 for streams S1-S6 onto ECU1-ECU3 and the communication controllers CC1-CC3 with the TDMA bus, annotated with the FP, EDF and polling-server (PS) schedulers.)

  31. Input of Stream 3. (Slide figure: the same system diagram, highlighting the input of stream S3.)

  32. Output of Stream 3. (Slide figure: the same system diagram, highlighting the output of stream S3 after it has traversed the system.)

  33. Output with Greedy Shapers. (Slide figure: the same system diagram with greedy shapers inserted, and the resulting output of stream S3.)
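
A greedy shaper delays each event just long enough that its output conforms to a given arrival curve, which removes bursts before they propagate downstream. A minimal sketch for an affine curve sigma + rho*t, with all names and numbers chosen for illustration:

    #include <stdio.h>

    /* Greedy shaper for the arrival curve alpha(t) = sigma + rho*t:
       at most sigma events may pass back-to-back, and the long-term
       rate of the output is limited to rho events per time unit.      */
    typedef struct {
        double sigma, rho;   /* burst capacity and rate of the target curve */
        double tokens;       /* current bucket fill, 0..sigma               */
        double last;         /* time of the last update                     */
    } shaper_t;

    static void shaper_init(shaper_t *s, double sigma, double rho) {
        s->sigma = sigma; s->rho = rho; s->tokens = sigma; s->last = 0.0;
    }

    /* Earliest time >= arrival at which the event may leave the shaper
       so that the output still conforms to alpha (events leave in FIFO order). */
    static double shaper_release(shaper_t *s, double arrival) {
        double t = (arrival > s->last) ? arrival : s->last;
        s->tokens += (t - s->last) * s->rho;        /* refill since last event */
        if (s->tokens > s->sigma) s->tokens = s->sigma;
        s->last = t;
        if (s->tokens >= 1.0) {                     /* enough credit: pass through */
            s->tokens -= 1.0;
            return t;
        }
        double release = t + (1.0 - s->tokens) / s->rho;   /* wait for credit */
        s->tokens = 0.0;
        s->last = release;
        return release;
    }

    int main(void) {
        shaper_t sh;
        shaper_init(&sh, 2.0, 0.5);                 /* burst 2, rate 0.5 events/ms */
        double bursty[] = { 0.0, 0.1, 0.2, 0.3, 10.0 };
        for (int i = 0; i < 5; i++)
            printf("in %.1f -> out %.1f\n", bursty[i], shaper_release(&sh, bursty[i]));
        return 0;
    }

With this example input, the four-event burst arriving between t = 0.0 and t = 0.3 is spread out to releases at 0.0, 0.1, 2.0 and 4.0, which is the smoothing effect the slide's output curves show.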

  34. Open Cross-Layer Issues • Does it make sense to use preemptive scheduling (intra-task-level non-determinism increases, scheduling efficiency increases)? • Uncoordinated scheduling (static and dynamic scheduling) • Distributed (!) control on several layers (control loops, adaptive behavior)

  35. New Threats • Trend towards adaptive systems • adapt to varying processing/communication loads • adapt speed / switch off units for energy saving • multiple levels of control and estimation! • Increases long-range timing dependencies with non-deterministic behavior

  36. System Layers • Hardware • Compiler • Task level (cf. talk offered by Sebastian Altmeyer) • Distributed operation Layering Principle: Separation of Concerns

  37. Separation of Concerns • is the Design Principle • Virtualization & Abstraction are the means: • One processor is virtualized as often as there are tasks • Limited physical memory is abstracted to almost unlimited virtual memory • Time is abstracted to #transitions of some very abstract model, or even to orders of magnitude • Services are abstracted from their actual location by middleware. Very successful, but a disaster for predictability!

  38. Increasing Predictability • Architecture: reducing penalties, identifying architectures offering a good combination of predictability with performance • System layers: resource-aware abstraction with resource interfaces • Development process: reducing uncertainty, matching design with tools

  39. Resource-aware Abstraction with Resource Interfaces • Importing resource constraints into a layer • Slot assignment or available bandwidth for communication • Bounding resource consumption by design • RT CORBA limits service demultiplexing • Exporting information about resource consumption • Real-Time Scheduling needs upper bounds on tasks’ execution times and context-switch costs
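
One way such an interface could be written down between layers, with every field and number invented for illustration: the task layer exports WCET and context-switch bounds, the communication layer exports the TDMA slot it grants, and a coarse admission check combines the two:

    #include <stdio.h>
    #include <stdint.h>

    /* Exported by the intra-task layer: upper bounds from WCET analysis. */
    typedef struct {
        const char *task;
        uint32_t wcet_us;            /* upper bound on execution time      */
        uint32_t ctx_switch_us;      /* context-switch cost to charge      */
        uint32_t period_us;
    } task_demand_t;

    /* Imported from the communication layer: the TDMA slot this node owns. */
    typedef struct {
        uint32_t slot_us;            /* length of our slot                 */
        uint32_t cycle_us;           /* length of the whole TDMA cycle     */
    } bus_supply_t;

    /* A (very coarse) admission check: does the exported processor demand
       stay below full utilization, and does the per-cycle message load
       fit into the imported bus slot?                                      */
    static int admissible(const task_demand_t *ts, int n,
                          uint32_t msg_us_per_cycle, const bus_supply_t *bus) {
        double util = 0.0;
        for (int i = 0; i < n; i++)
            util += (double)(ts[i].wcet_us + ts[i].ctx_switch_us) / ts[i].period_us;
        double bus_util = (double)msg_us_per_cycle / bus->slot_us;
        return util <= 1.0 && bus_util <= 1.0;
    }

    int main(void) {
        task_demand_t ts[] = { {"ctrl", 200, 10, 1000}, {"io", 400, 10, 4000} };
        bus_supply_t  bus  = { 250, 1000 };
        printf("admissible: %d\n", admissible(ts, 2, 180, &bus));
        return 0;
    }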

  40. Dynamic System – Static Provisioning

  41. Architecture • Scratchpad memory • LRU caches • Statically scheduled multi-threading • Parallelism instead of speculation • Static decisions instead of dynamic decisions • Dealing with resources • Based on history
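
For instance (GCC-style section attributes; the section names and the linker-script mapping to scratchpad RAM are assumptions), timing-critical code and data can be placed into the scratchpad statically, so every access has one fixed, known latency instead of a cache-dependent one:

    #include <stdint.h>

    /* Assumed linker script maps these sections to on-chip scratchpad RAM. */
    #define SPM_DATA __attribute__((section(".spm_data")))
    #define SPM_CODE __attribute__((section(".spm_text")))

    /* Hot state of the control loop: one fixed-latency access per read/write. */
    SPM_DATA static int32_t state[16];

    /* Timing-critical routine fetched from scratchpad instead of cached flash. */
    SPM_CODE int32_t control_step(int32_t input) {
        int32_t acc = 0;
        for (int i = 0; i < 16; i++)
            acc += state[i];
        state[0] = input;
        return acc;
    }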

  42. Predictability of Memory Systems. (Slide figure: a spectrum from fully static and fully predictable to fully dynamic and unpredictable memory systems: no cache / scratchpad; SW-controlled cache; partially frozen PLRU cache; cache with LRU; PLRU cache; cache with FIFO or random replacement.) cf. talk offered by Jan Reineke

  43. A New Research Agenda • Architecture design: Beyond EPIC • Programming languages/constructs • Schedulability analysis for distributed systems • Predictable real-time middleware
