
Automated Test Case Generation to Validate Non-functional Software Requirements

Automated Test Case Generation to Validate Non-functional Software Requirements. Dissertation Proposal. Pingyu Zhang Spring 2013. Software Requirements. Software life cycle is bounded by requirements Functional – what a system must do

Presentation Transcript


  1. Automated Test Case Generation to Validate Non-functional Software Requirements Dissertation Proposal Pingyu Zhang Spring 2013

  2. Software Requirements • Software life cycle is bounded by requirements • Functional – what a system must do • Non-functional – how well are the functional requirements satisfied

  3. Example: A Fitness Tracking App myTracks from Google • Feature List • routing • tracking • sharing

  5. Example Cont. Functional Requirements • Routing – e.g. calculate a round route that includes the given points of interest (POIs). • Functional Validation – run a test with 3 POIs and check that the app generates a round route that includes all of them. Input: {Home, Avery Hall, Capitol}

  6. Example Cont. Non-functional Requirements • Performance Requirement • Non-functional Validation – check if the program can generate a round route in acceptable time Input: {Home, Avery Hall, Capitol} • response time < T • T = 5 seconds • Resp. Time = 3 seconds
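The check on slide 6 (response time < T) can be sketched as a small timing harness. This is only an illustration: `check_response_time` and `fake_route_request` are hypothetical names, and the sleeping stub stands in for the real route computation, which the slides do not show.

```python
import time


def check_response_time(operation, threshold_seconds):
    """Run operation once and return (elapsed seconds, elapsed < T)."""
    start = time.perf_counter()
    operation()
    elapsed = time.perf_counter() - start
    return elapsed, elapsed < threshold_seconds


def fake_route_request():
    # Hypothetical stand-in for "generate a round route for
    # {Home, Avery Hall, Capitol}"; it only sleeps briefly.
    time.sleep(0.01)


# T = 5 seconds, as on the slide.
elapsed, passed = check_response_time(fake_route_request, threshold_seconds=5.0)
print(f"response time = {elapsed:.3f}s, within T: {passed}")
```

A single run like this only validates one input; the later slides argue that the choice of input values matters as much as the threshold.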

  7. Example Cont. Non-functional Requirements Load Testing Input: {Home, Avery Hall, Capitol} Input: {Lincoln Children’s Zoo, Antelope Park, 48th & Normal, Holmes Park, 56th & Pioneer, 48th & Hwy2, 27th & Pioneer…}

  8. Example Cont. Load Testing: the Conventional Way • Goal: find performance faults • Process • Black box • Induce load through input rate, input size… • Check against performance requirement Load Testing Input: {Lincoln Children’s Zoo, Antelope Park, 48th & Normal, Holmes Park, 56th & Pioneer, 48th & Hwy2, 27th & Pioneer…} • response time < T

  9. Conventional Approach Missed: Highly Dependent on Input Values • Test 1 Input: 20 geocache locs randomly chosen from Geocache.com in 68508 • Test 2 Input: 20 locs generated with my load testing approach • T = 20 sec • Response Time: 128 sec • Response Time: 11 sec

  10. Conventional Approach Missed: Highly Dependent on Input Values • Response time to find a route for 20 POIs ranges from 11 seconds to 128 seconds (12X) (for 100 POIs the difference is 55X) • Depends on the location of the POIs – the particular input values can matter as much as the input size • Test 1 Input: 20 geocache locs randomly chosen from Geocache.com in 68508 • Test 2 Input: 20 locs generated with my load testing approach • Response Time: 128 sec • Response Time: 11 sec

  11. It Missed More Than That… • Highly dependent on input values • Response time for 20 POIs ranges from 11 sec to 128 sec • Increasing input size is too expensive • Increasing jzlib resp. time by 30 sec means going from a 1MB to a 75MB input • Increasing input size is doing more of the same • Increasing table size does not reveal new behavior in a query application • Missing other resources • Memory & energy constraints on mobile platforms • We want tests that induce high loads by selecting the right values, that exercise a diversity of paths, and that may target a variety of resources

  12. Example: A Fitness Tracking App myTracks from Google • Feature List • routing • tracking • sharing

  14. Example Cont. Functional Requirements • Tracking – e.g. work out with the app turned on and record the activity. • Functional Validation – check if the app captured the route, speed, elevation, etc. Input: location data obtained by calling GPS related APIs

  15. Example Cont. Non-functional Requirements • Tracking – e.g. work out with the app turned on and record the activity. • Non-functional Validation – check if the app can produce correct data under unusual conditions – tunnels, roofs, woods, etc. Input: location data obtained by calling GPS related APIs or

  16. Example Cont. Exception Handling for Unusual Conditions GPS API many more…

  17. Example Cont. Exception Handling for Unusual Conditions GPS API Exception Handling many more…

  18. Example Cont. To Validate Exception Handling Code GPS API Exception Handling

  19. Example Cont. To Validate Exception Handling Code Mocking Device • A mocking device to inject exceptions while executing tests GPS API Exception Handling

  20. Example Cont. To Validate Exception Handling Code Mocking Device • A mocking device to inject exceptions while executing tests • Capable of simulating the noisy nature of external resources GPS API Exception Handling simulate
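The mocking device described on slides 19 and 20 can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `NoisyGpsMock`, `GpsUnavailableError`, and `record_track` are hypothetical names, not part of the Android SDK or of the proposed tool, and the "keep the last good fix" policy is just one plausible handler to put under test.

```python
import random


class GpsUnavailableError(Exception):
    """Hypothetical exception standing in for a failed GPS API call."""


class NoisyGpsMock:
    """Mock location source that fails on some calls, driven by a seeded RNG."""

    def __init__(self, fixes, failure_rate, seed=0):
        self._fixes = list(fixes)
        self._failure_rate = failure_rate
        self._rng = random.Random(seed)  # seeded so a failing test can be replayed
        self._next = 0

    def get_location(self):
        # Inject an exception with probability failure_rate.
        if self._rng.random() < self._failure_rate:
            raise GpsUnavailableError("signal lost")
        fix = self._fixes[self._next % len(self._fixes)]
        self._next += 1
        return fix


def record_track(gps, samples):
    """Code under test: reuse the last good fix when the GPS call fails."""
    track, last_good = [], None
    for _ in range(samples):
        try:
            last_good = gps.get_location()
        except GpsUnavailableError:
            pass  # the exception handling path being validated
        if last_good is not None:
            track.append(last_good)
    return track


fixes = [(40.82, -96.70), (40.81, -96.70)]
print(record_track(NoisyGpsMock(fixes, failure_rate=0.5, seed=1), samples=6))
```

Unlike a mock that throws on every invocation, the failure rate here can be tuned between 0 and 1, which is one simple way to imitate an intermittently available resource.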

  21. Mocking Support in Android SDK • android.test.mock – throws exceptions on every invocation • [Figure: example posts labeled DIY, DIY, Official, DIY, Question, Complaint, Complaint, Complaint]

  22. Software Requirements • Software life cycle is bounded by requirements • Functional – what a system must do • Non-functional – how well are the functional requirements satisfied Why do non-functional requirements matter?

  23. Importance of Non-functional Requirements • Google – people move away from you if your website loads 250 milliseconds slower than your competitor's [New York Times, Feb 2012]. • Netflix – the entire API is re-designed to improve performance [Netflix Report, Mar 2012]. • Oracle – the cost of fixing a performance problem at the end of the development cycle accounts for 25% of total cost [Oracle Report, Jan 2013].

  24. Importance of Non-functional Requirements Cont. What if exception handling is not done correctly? • corrupted data • crashing app

  25. Exception Handling Is Not A Trivial Problem • 27% - poor exception handling code • 17% - interactions with external resources

  26. Non-functional Validation: State of practice • Not Enough Testing Resources? • Functional Only! • Enough Testing Resources? • Functional First! • Why Is That? • No Cost-effective Ways!

  27. We Propose to Improve Non-functional Validation • For load testing • automatically generate load tests by exhaustively traversing program paths • For exception handling • amplify existing tests to exhaustively explore new exceptional behaviors • Exhaustive white-box testing techniques can cost-effectively validate non-functional requirements.

  28. Research Progress So Far… Software Requirement Validation • Functional • Non-functional • Exception Handling • Load Testing • for single programs (ASE'11) • for software pipelines (ISSTA'12) • best paper award • ICSE'12 • Extension of ICSE'12 in preparation

  30. White-box Load Testing: Revisiting the objective • Highly dependent on input values • Response time for 20 POIs ranges from 11 sec to 128 sec • Increasing input size is too expensive • Increasing jzlib resp. time by 30 sec means going from a 1MB to a 75MB input • Increasing input size is doing more of the same • Increasing table size does not reveal new behavior in a query application • Missing other resources • Memory & energy constraints on mobile platforms • We want tests that induce high loads by selecting the right values, that exercise a diversity of paths, and that may target a variety of resources

  31. White-box Load Testing: Intuition • Pick the longest path [Figure: computational space]

  32. White-box Load Testing: Intuition • Brute force approach: traverse all paths and return the longest one. • But first, how to systematically traverse program paths? [Figure: computational space]

  33. Symbolic Execution (since 1976) • Goal: A test input for every program path • Use symbolic test generation to explore program paths • Widely used in automated software testing: DART, CUTE, EXE, JPF, … • Example: foo(int x, int y) { z = 2*x; if (z == y) if (x > y+8) print("Hi") } • Execution tree: branch on 2*x == y (T/F), then on x > y + 8 (T/F) • PC: 2x ≠ y, Input: x=0, y=1 • PC: 2x = y ∧ x ≤ y+8, Input: x=1, y=2 • PC: 2x = y ∧ x > y+8, Input: x=-10, y=-20
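The relation between path conditions and test inputs on slide 33 can be checked concretely. The sketch below assumes the branch conditions are 2*x == y and x > y + 8, which is the reading under which all three listed inputs satisfy their path conditions; `foo_path` is a hypothetical helper that returns the branch outcomes instead of printing.

```python
def foo_path(x, y):
    """Return the branch outcomes foo takes on concrete inputs (x, y)."""
    z = 2 * x
    if z == y:
        if x > y + 8:
            return ("T", "T")  # the path that reaches print("Hi")
        return ("T", "F")
    return ("F",)


# Each path condition (PC) as a predicate, paired with the input from the slide.
paths = [
    (lambda x, y: 2 * x != y, (0, 1)),
    (lambda x, y: 2 * x == y and x <= y + 8, (1, 2)),
    (lambda x, y: 2 * x == y and x > y + 8, (-10, -20)),
]

for pc, (x, y) in paths:
    # Prints the path taken and whether the input satisfies its PC.
    print((x, y), foo_path(x, y), pc(x, y))
```

A symbolic executor derives such inputs by handing each path condition to a constraint solver; here the inputs are simply taken from the slide and verified.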

  34. Finding Long Paths with Symbolic Execution • Brute force approach • Generate every path on N inputs • Return input for the longest path • Cannot scale • With 5 POIs, a full symbolic execution reveals 142,352 possible paths, and takes 171 min • With 6 POIs, full SE fails to finish in 4 hours [Figure: for N=5, bytecode count; <70ms; 0.43~0.5sec]

  35. Finding Long Paths with Symbolic Execution • Brute force approach • Generate every path on N inputs • Return input for the longest path • Cannot scale • With 5 POIs, a full symbolic execution reveals 142,352 possible paths, and takes 171 min • With 6 POIs, full SE fails to finish in 4 hours • Wasted Efforts • Longest paths are here [Figure: for N=5, bytecode count; <70ms; 0.43~0.5sec]
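The brute force idea (try everything, return the input for the longest path) can be shown on a toy program. This sketch is not the routing code from the talk: it uses insertion sort as a stand-in, enumerates concrete inputs rather than symbolic paths, and counts comparisons as a rough proxy for path length. All function names are hypothetical.

```python
from itertools import permutations


def insertion_sort_steps(values):
    """Sort a copy of values and count comparisons as a proxy for path length."""
    data, steps = list(values), 0
    for i in range(1, len(data)):
        j = i
        while j > 0:
            steps += 1
            if data[j - 1] <= data[j]:
                break
            data[j - 1], data[j] = data[j], data[j - 1]
            j -= 1
    return steps


def longest_path_input(n):
    """Brute force: try every ordering of n distinct values, keep a costliest one."""
    candidates = list(permutations(range(n)))
    best = max(candidates, key=insertion_sort_steps)
    return best, insertion_sort_steps(best), len(candidates)


best, steps, tried = longest_path_input(5)
print(f"tried {tried} inputs, longest path has {steps} comparisons, e.g. {best}")
```

Even for this toy, the number of candidates grows factorially with the input size, which mirrors the scalability problem reported on the slide.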

  36. Revisiting the objective • Not Scalable • We want tests that induce high loads by selecting the right values, that exercise a diversity of paths, and that may target a variety of resources.

  37. Adapting Symbolic Execution towards Load Sensitive Paths • Directed • Favor paths according to performance measure • Explore diverse paths

  38. Adapting Symbolic Execution towards Load Sensitive Paths • Incremental • Iterative-deepening • Directed • Favor paths according to performance measure • Explore diverse paths [Figure: lookAhead; Frontier Diversity Check]

  39. Adapting Symbolic Execution towards Load Sensitive Paths • Incremental • Iterative-deepening • Directed • Favor paths according to performance measure • Explore diverse paths [Figure: lookAhead; Frontier Diversity Check] • Step 1: Split the frontier • Step 2: Check gap > TH • C1, C2

  41. Adapting Symbolic Execution towards Load Sensitive Paths • Incremental • Iterative-deepening • Pruning on frontiers • Directed • Favor paths according to performance measure • Explore diverse paths [Figure: lookAhead; Frontier Diversity Check; Frontier Diversity Check]

  42. Adapting Symbolic Execution towards Load Sensitive Paths • Incremental • Iterative-deepening • Pruning on frontiers • Directed • Favor paths according to performance measure • Explore diverse paths [Figure: lookAhead; Frontier Diversity Check; Frontier Diversity Check; C1, C2]

  45. Adapting Symbolic Execution towards Load Sensitive Paths • Incremental • Iterative-deepening • Pruning on frontiers • Directed • Favor paths according to performance measure • Explore diverse paths [Figure: lookAhead; Frontier Diversity Check; Frontier Diversity Check; Frontier Diversity Check]
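The incremental, directed search outlined on these slides can be approximated by a short sketch. The slides do not spell out the algorithm, so the details below are assumptions: paths are modeled as tuples of 0/1 branch decisions, `step_cost` is a hypothetical performance measure, and the diversity check is interpreted as "sort the frontier by cost, split it at the largest cost gap, and prune the low-cost cluster only if that gap exceeds a threshold".

```python
from itertools import product


def directed_search(step_cost, depth, look_ahead, gap_threshold, max_frontier=8):
    """Iterative-deepening search for a costly path of branch decisions.

    step_cost(prefix, decision) returns the cost of taking `decision`
    (0 or 1) after the decisions in `prefix`. Requires look_ahead >= 1.
    """
    frontier = [((), 0)]  # (path of decisions, accumulated cost)
    explored = 0
    while len(frontier[0][0]) < depth:
        steps = min(look_ahead, depth - len(frontier[0][0]))
        extended = []
        # Extend every frontier path by `steps` further decisions.
        for path, cost in frontier:
            for suffix in product((0, 1), repeat=steps):
                p, c = path, cost
                for decision in suffix:
                    c += step_cost(p, decision)
                    p = p + (decision,)
                extended.append((p, c))
        explored += len(extended)
        # Directed: order the new frontier by the performance measure.
        extended.sort(key=lambda item: item[1], reverse=True)
        # Assumed diversity check: split at the largest cost gap.
        gaps = [extended[i][1] - extended[i + 1][1] for i in range(len(extended) - 1)]
        cut = max(range(len(gaps)), key=gaps.__getitem__) + 1
        if gaps[cut - 1] > gap_threshold:
            extended = extended[:cut]  # keep only the high-cost cluster
        frontier = extended[:max_frontier]  # cap the frontier size
    return frontier[0], explored


# Toy measure: taking branch 1 always costs 3, branch 0 costs 1.
best, explored = directed_search(lambda prefix, d: 3 if d == 1 else 1,
                                 depth=6, look_ahead=2, gap_threshold=1)
print(best, explored)
```

On this toy measure the search extends 12 paths instead of all 64. Pruning of this kind is a heuristic: a path that looks cheap at one frontier but becomes expensive later would be discarded, so the result is not guaranteed to be the longest path.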

  46. Implementation: Symbolic Load Generation (SLG) • Implemented as an extension to SPF • Record & replay of paths • Path Performance Measures • Response Time: weighted bytecode count (invoke: 10, others: 1) • Memory Usage: listens to object life cycle operations • Test Instantiation • Implemented new Yices Java API to work with JPF • Yices appears to be better than others (Choco, CVC3)
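The response time measure on slide 46 (invoke: 10, others: 1) amounts to a weighted sum over the executed instructions. The sketch below assumes that "invoke" covers the JVM invoke instruction family (invokevirtual, invokestatic, and so on) and that a path is available as a list of opcode names; both are assumptions for illustration, and `weighted_bytecode_count` is a hypothetical name.

```python
INVOKE_WEIGHT = 10   # weight given on the slide for invoke instructions
DEFAULT_WEIGHT = 1   # weight for every other instruction


def weighted_bytecode_count(trace):
    """Sum per-instruction weights over a trace of opcode names."""
    return sum(
        INVOKE_WEIGHT if opcode.startswith("invoke") else DEFAULT_WEIGHT
        for opcode in trace
    )


# Hypothetical trace of one short path: four plain instructions and one call.
trace = ["iload_1", "iconst_2", "imul", "invokestatic", "ireturn"]
print(weighted_bytecode_count(trace))  # 4 * 1 + 1 * 10 = 14
```

Weighting calls more heavily than plain instructions is a cheap way to approximate time without running the code on real hardware.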

  47. Dealing with Solver Limitations: Constraint Limited Load Generation (CLLG-k) • Challenge • Load tests traverse long paths, which means more constraints for the solver • Every SMT solver has a limit on the size of constraints it can handle efficiently • Constraint Limited Load Generation (CLLG-k) • Wrapper algorithm for SLG • Chains partial solutions together • Scalable but sacrifices test quality • Introduces a new parameter: maxSolverConstraints (k) [Figure: partial inputs generated by SLG with bound k; CLLG-k]

  48. Evaluation of SLG • Summary • Parameters & Environment • lookAhead=50 across programs, testSuiteSize=10 • 2.4GHz Intel Core 2 Duo, JVM 1.6, 2GB MEM

  49. RQ1: Jzlib for Response Time • Control treatment: Random • 3-hour cap enforced across runs [Figure annotations: 50MB 4.5X; 100MB 3.4X]

  50. RQ1: Jzlib for Response Time [Figure annotations: 25MB; 100MB]
