
OS II: Dependability & Trust – SWIFI-based OS Evaluations






Presentation Transcript


  1. OS II: Dependability & Trust – SWIFI-based OS Evaluations Prof. Neeraj Suri Stefan Winter Dept. of Computer Science TU Darmstadt, Germany Dependable Embedded Systems & SW Group www.deeds.informatik.tu-darmstadt.de

  2. So far: Verification & Validation • Testing techniques: static vs. dynamic, black-box vs. white-box • Last time: Fault Injection (FI) – applications, techniques, some FI tools • Today: Testing (SWIFI) of operating systems • WHERE: Error propagation in OSs [Johansson’05] • WHAT: Error selection for testing [Johansson’07] • WHEN: Injection trigger selection [Johansson’07] • Next lecture: Profiling the OS extensions (state change @ runtime)

  3. FI Recap Fault Injection (FI) is the process of either inserting bugs into your system or exposing your system to operational perturbations • FI applications for dependable system development • Defect Count Estimation (Fault Seeding) • Test Suite Evaluation (Mutation Testing) • Security Testing • Experimental Dependability Evaluations • FI techniques • Physical FI • HW FI • Simulated FI • SWIFI

  4. FI Recap (cont.) • Where to apply change (location, abstraction/system level) • What to inject (what should be injected/corrupted?) • Which trigger to use (event, instruction, timeout, exception, … ?) • When to inject (on first/second/… trigger event) • How often to inject (Heisen-/Bohrbugs) • … • What to record & interpret? For what purpose? • How is the system loaded at the time of the injection? • Applications running and their load (workload) • System resources • Real → realistic → synthetic workload

  5. Outline for today's lecture • Drivers - a major dependability issue in commodity OSs • An error propagation view • FI-based robustness evaluations of the kernel • Black box assumption • Fault representativeness vs. failure relevance • Design and implementation issues of a suitable FI framework • Fault modeling • Failure modeling • Workloads

  6. The problem: Drivers! • Device drivers • Numerous: 250 installed (100 active) drivers in XP/Vista • Large & complex: 70% of Linux code base • Immature: every day 25 new / 100 revised Vista driver versions • Access rights: kernel mode operation in monolithic OSs • Device drivers are the dominant cause of OS failures despite sustained testing efforts [Charts: causes of WinXP outages, causes of Win2k outages]

  7. The problem (cont.) • Problem statement: Driver failures lead to OS API failures • Mitigation approaches • Harden OS robustness • Improve driver reliability

  8. The problem (cont.) [Diagrams:] • The problem in terms of error propagation • The effect of robustness hardening in terms of error propagation • The effect of testing in terms of error propagation

  9. Issues with the driver testing approach What if the driver is not the root cause? What if we cannot remove defects (e.g. commercial OSs)?

  10. Issues with the hardening approach What if we cannot remove robustness vulnerabilities? More issues with the hardening approach in next week's lecture...

  11. FI-based robustness evaluations • Fault containment wrappers are expensive • Additional code is an additional source of bugs • Runtime overhead for error checks • Where should we add fault containment wrappers? • Where errors with critical effects are likely to occur • Where propagation is likely • Where critical errors propagate • How do we know where which errors propagate? • Propagation analysis (cf. PROPANE)

  12. Robustness Evaluations [Diagram: components A–F with error propagation paths; outcomes become increasingly bad as errors reach critical components, marked "!"]

  13. Robustness Evaluations • Experimental technique to ascertain “vulnerabilities” • Identify (potential) sources, error propagation & hot spots, etc. • Estimate their “effects” on applications • Component enhancement with “wrappers” • if (X > 100 && Y < 30) then Exception(); • Location of wrappers • Aspects • Metrics for error propagation profiles • Experimental analysis
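The wrapper check on this slide can be made concrete. Below is a minimal sketch in C of such a fault containment wrapper around a hypothetical OS service; the function names, the check thresholds, and the abort-on-violation policy are illustrative assumptions, not the framework's actual code.

    #include <stdio.h>
    #include <stdlib.h>

    /* Stand-in for the real, unwrapped service implementation. */
    static int os_service_impl(int x, int y) { return x + y; }

    /* Hypothetical fault containment wrapper (cf. the check above):
     * validate parameters at the component boundary before forwarding
     * the call, so a corrupted value is detected instead of propagating. */
    int os_service(int x, int y)
    {
        if (x > 100 && y < 30) {                  /* plausibility check */
            fprintf(stderr, "wrapper: bad args x=%d y=%d\n", x, y);
            abort();                              /* or return an error code */
        }
        return os_service_impl(x, y);
    }

    int main(void) { return os_service(5, 50) == 55 ? 0 : 1; }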

  14. System Model [Diagram: Applications on top of the Operating System on top of the Drivers]

  15. Device Driver • Model the interfaces (defined in C) • Exported (functions provided by the driver): dsx.1 … dsx.m • Imported (functions used by the driver): osx.1 … osx.n [Diagram: Driver X between the OS and the hardware device]

  16. Metrics Three metrics for profiling: • Propagation - how errors flow through the OS • Exposure - which OS services are affected • Diffusion - which drivers are the sources [Agenda: Impact analysis → Metrics → Case study (WinCE) → Results]

  17. Service Error Permeability 1. Service Error Permeability: • Measures one driver’s influence on one OS service • Used to study service-driver relations (formula reconstructed below)
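The formula image on this slide did not survive the transcript; the following LaTeX reconstruction follows the definitions in [Johansson'05] and should be read as a best-effort restatement. For driver x with imported OS services osx.i and exported driver services dsx.j:

    \[
      P^x_{i,j} \;=\; \Pr\bigl(\text{error in OS service } osx.i \;\big|\; \text{error in driver service } dsx.j\bigr)
    \]

estimated experimentally as

    \[
      \hat{P}^x_{i,j} \;=\; \frac{\#\text{ failures observed at } osx.i}{\#\text{ injections into } dsx.j}
    \]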

  18. OS Service Error Exposure 2. OS Service Error Exposure: • An application uses certain services • How are these services influenced by driver errors? • Used to compare services
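Again the slide's formula is missing; a plausible LaTeX reconstruction from [Johansson'05], where the exposure of OS service osx.i aggregates the permeability values across all drivers x and their services j:

    \[
      E_i \;=\; \sum_{x} \sum_{j} P^x_{i,j}
    \]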

  19. Driver Error Diffusion 3. Driver Error Diffusion: • Which driver affects the system the most? • Used to compare drivers
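Correspondingly, the diffusion formula can be reconstructed (again hedged, following [Johansson'05]) as the sum of driver x's permeability values over all OS services i and all of its driver services j:

    \[
      D_x \;=\; \sum_{i} \sum_{j} P^x_{i,j}
    \]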

  20. Case Study: Windows CE • Targeted drivers • Serial • Ethernet • FI at interface • Data level errors • Effects on OS services • 4 test applications [Diagram: experiment setup - Test App and Manager on the Host OS; Interceptor placed between the OS and the Target Driver among the installed drivers]

  21. Error Model • Data level errors in OS-Driver interface • Wrong values • Based on the C-type • Boundary • Special values • Offsets • Transient • First occurrence

  22. Impact Analysis • Impact ascertained via failure mode analysis • Failure classes: • Class NF: No visible effect • Class 1: Error, no violation • Class 2: Error, violation • Class 3: OS Crash/Hang

  23. Error Model
      LONG RegQueryValueEx([in]     HKEY    hKey,
                           [in]     LPCWSTR lpValueName,
                           [in]     LPDWORD lpReserved,
                           [out]    LPDWORD lpType,
                           [out]    LPBYTE  lpData,
                           [in/out] LPDWORD lpcbData);
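Using this interface as an example, here is a minimal sketch of how an interposed wrapper (cf. the Interceptor on slide 20) might inject a transient data-level error into one parameter. The hook mechanism, the wrapper name, and the choice of corrupting lpcbData with a special pointer value are illustrative assumptions.

    #include <windows.h>

    static int injected = 0;   /* transient: inject on first occurrence only */

    /* Hypothetical interceptor: the driver's import of RegQueryValueExW is
     * redirected here; the wrapper corrupts one parameter once and then
     * forwards the call unchanged. */
    LONG Intercept_RegQueryValueEx(HKEY hKey, LPCWSTR lpValueName,
                                   LPDWORD lpReserved, LPDWORD lpType,
                                   LPBYTE lpData, LPDWORD lpcbData)
    {
        if (!injected) {
            injected = 1;
            lpcbData = NULL;   /* DT-style special value for a pointer type */
        }
        return RegQueryValueExW(hKey, lpValueName, lpReserved,
                                lpType, lpData, lpcbData);
    }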

  24. Service Error Permeability • Ethernet driver • 42 imported svcs • 12 exported svcs • Most Class 1 • 3 Crashes (Class 3)

  25. OS Service Error Exposure • Serial driver • 50 imported svcs • 10 exported svcs • Clustering of failures

  26. Driver Error Diffusion • Higher diffusion for Ethernet • Most Class NF • Failures at boot-up

  27. Error Models: “What to Inject?” • FI’s effectiveness depends on the chosen error model (a) being representative of actual errors, and (b) effectively triggering “vulnerabilities”. • Comparative evaluation of the “effectiveness” of different error models: • Fewest injections? • Most failures? • Best “coverage”? • Propose a composite error model for enhancing FI effectiveness

  28. Chosen Drivers & Error Models Error Models: • Data-type (DT) • Bit-flips (BF) • Fuzzing (FZ)

  29. Error Models – Data-Type (DT) Errors int foo(int a, int b) {…} int ret = foo(0x45a209f1, 0x00000000);

  30. Error Models – Data-Type (DT) Errors int foo(int a, int b) {…} int ret = foo(0x45a209f1, 0x00000000); → inject 0x80000000 as the new value for a

  31. Error Models – Data-Type (DT) Errors int foo(int a, int b) {…} int ret = foo(0x80000000, 0x00000000); • Varied #cases depending on the data type • Requires tracking of the types for correct injection • Complex implementation but scales well (a case-table sketch follows below)
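A minimal sketch of what a DT case table for the C type int might look like, assuming the boundary/special-value/offset categories from slide 21; the concrete case list is illustrative. Note that case 0 reproduces the 0x80000000 substitution from slide 30 (INT_MIN on a 32-bit int).

    #include <limits.h>
    #include <stddef.h>

    /* Hypothetical DT case table for a parameter of type int; the framework
     * selects such a table based on the tracked type of each parameter. */
    static int dt_int_case(int original, size_t case_no)
    {
        switch (case_no) {
        case 0: return INT_MIN;       /* boundary (0x80000000 on 32-bit) */
        case 1: return INT_MAX;       /* boundary */
        case 2: return 0;             /* special value */
        case 3: return -1;            /* special value */
        case 4: return original + 1;  /* offset */
        case 5: return original - 1;  /* offset */
        default: return original;     /* no injection */
        }
    }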

  32. Error Models – Data-Type (DT) Errors [Table: injection cases per C data type]

  33. Error Models – Bit-Flip (BF) Errors int foo(int a, int b) {…} int ret = foo(0x45a209f1, 0x00000000);

  34. Error Models – Bit-Flip (BF) Errors int foo(int a, int b) {…} int ret = foo(0x45a209f1, 0x00000000); 01000101101000100000100111110001

  35. Error Models – Bit-Flip (BF) Errors int foo(int a, int b) {…} int ret = foo(0x45a209f1, 0x00000000); 01000101101000100000100111110001 → 01000101101000101000100111110001

  36. Error Models – Bit-Flip (BF) Errors int foo(int a, int b) {…} int ret = foo(0x45a289f1, 0x00000000); 01000101101000101000100111110001 • Typically 32 cases per parameter • Easy to implement (sketch below)
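A sketch of the corresponding injection for 32-bit parameters; flipping bit 15 of 0x45a209f1 yields the 0x45a289f1 used above.

    #include <stdint.h>

    /* Bit-flip injection: one experiment per bit position, hence the 32
     * cases per 32-bit parameter noted on this slide. */
    static uint32_t bf_case(uint32_t original, unsigned bit /* 0..31 */)
    {
        return original ^ (UINT32_C(1) << bit);
    }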

  37. Error Models – Fuzzing (FZ) Errors int foo(int a, int b) {…} int ret = foo(0x45a209f1, 0x00000000);

  38. Error Models – Fuzzing (FZ) Errors int foo(int a, int b) {…} int ret = foo(0x45a209f1, 0x00000000); → inject random value 0x17af34c2 for a

  39. Error Models – Fuzzing (FZ) Errors int foo(int a, int b) {…} int ret = foo(0x17af34c2, 0x00000000); • Selective #cases • Simple implementation (sketch below)
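A sketch of the corresponding fuzzing injection, assuming the simplest variant of replacing a 32-bit parameter with a pseudo-random value; the number of cases per parameter is a free experiment parameter (revisited on slide 50).

    #include <stdint.h>
    #include <stdlib.h>

    /* Fuzzing injection: replace the parameter with a random value.
     * rand() guarantees only 15 random bits, so three draws are combined
     * to cover all 32 bits of the parameter. */
    static uint32_t fz_case(void)
    {
        return ((uint32_t)rand() << 17) ^ ((uint32_t)rand() << 9)
             ^ (uint32_t)rand();
    }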

  40. Comparison Compare Error Models on: • Number of failures • Effectiveness • Experimentation Time • Identifying services • Error propagation

  41. Failure Classes & Driver Diffusion

  42. Failure Classes & Driver Diffusion • Driver Diffusion: a measure of a driver’s ability to spread errors (cf. the definition on slide 19, here evaluated per error model)

  43. Number of Failures (Class 3)

  44. Failure Classes & Driver Diffusion: Driver Diffusion (Class 3)

  45. Experimentation Time

  46. Identifying Services (Class 3) • Which OS services can cause Class 3 failures? • Which error model identifies most services (coverage)? • Is some model consistently better/worse? • Can we combine models?

  47. Identifying Services (Class 3 + 2) • Which OS services can cause Class 3 failures? • Which error model identifies most services (coverage)? • Is some model consistently better/worse? • Can we combine models?

  48. Bit-Flips: Sensitivity to Bit Position? [Chart: failures by flipped bit position, from MSB to LSB]

  49. Bit-Flips: Bit Position Profile [Chart: cumulative #services identified per bit position]

  50. Fuzzing – Number of injections?
