
Adaptive Random Test Case Prioritization



  1. Adaptive Random Test Case Prioritization Speaker: Bo Jiang* Co-authors: Zhenyu Zhang*, W.K.Chan†, T.H.Tse* *The University of Hong Kong †City University of Hong Kong

  2. Contents • Background • Motivation • Adaptive Random Test Case Prioritization • Experiments and Results Analysis • Related Works • Conclusion & Future work

  3. Regression Testing Techniques • Regression testing accounts for 50% of the cost of software maintenance. • [Diagram: program P with test suite T evolves into program P′ with test suite T′ through obsolete test case elimination, test case reduction, test case augmentation, test case selection, and test case prioritization]

  4. Test Case Prioritization • Definition • Test case prioritization permutes a test suite T for execution to meet a chosen testing goal. • Typical testing goals • Rate of code coverage • Rate of fault detection • Rate of requirement coverage • Merits • No impact on the fault detection ability

  5. Coverage-based Test Case Prioritization Techniques • Total-statement/function/branch • Highest code coverage first • Resolve ties randomly • Additional-statement/function/branch • Additional highest code coverage first • Reset when no more coverage can be achieved • Resolve ties randomly • Disadvantages • Hard to scale to larger programs
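As an illustrative sketch (not the authors' implementation), the "additional" strategy above is a short greedy loop; `coverage` below is a hypothetical map from each test case to the set of statements or branches it covers:

```python
import random

def additional_greedy(coverage):
    """Additional-coverage greedy prioritization: repeatedly pick the test
    case that covers the most not-yet-covered items, resetting the covered
    set when no remaining test case adds new coverage."""
    remaining = list(coverage)
    prioritized, covered = [], set()
    while remaining:
        gains = {t: len(coverage[t] - covered) for t in remaining}
        best_gain = max(gains.values())
        if best_gain == 0:
            covered = set()  # reset: no more coverage can be achieved
            continue
        # resolve ties randomly among test cases with the best gain
        best = random.choice([t for t, g in gains.items() if g == best_gain])
        prioritized.append(best)
        covered |= coverage[best]
        remaining.remove(best)
    return prioritized
```

The per-step scan over all remaining test cases is what makes this hard to scale to larger programs.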

  6. Contents • Background • Motivation • Adaptive Random Test Case Prioritization • Experiments and Results Analysis • Related Works • Conclusion & Future work

  7. Problem with Total Techniques • [APFD charts for GREP and FLEX] • Elbaum et al. @ TSE 2002

  8. Problem with Total (Greedy) Techniques • [APFD charts for GREP and FLEX] • The total strategy may NOT be effective for real-life programs. • Elbaum et al. @ TSE 2002

  9. Problems with Additional Techniques • [Chart: time used for prioritization by the random, total, and additional techniques on the Siemens and UNIX subjects]

  10. Problems with Additional Techniques • [Chart: time used for prioritization by the random, total, and additional techniques on the Siemens and UNIX subjects] • Additional techniques may NOT be efficient for real-life programs.

  11. Problems with Additional Techniques • [Chart: time used for prioritization by the random, total, and additional techniques on the Siemens and UNIX subjects] • Can we find a prioritization technique that is both effective and efficient for real-life programs?

  12. Adaptive Random Testing (ART) • Adaptive Random Testing (ART) • A technique for test case generation • Evenly spreads randomly generated test cases across the input domain. • In empirical studies, ART can detect failures using up to 50% fewer test cases than random testing.

  13. Fixed-Sized-Candidate-Set ART Algorithm • Randomly generate a test case and execute it.

  14. Fixed-Sized-Candidate-Set ART Algorithm • Randomly generate a set of candidate test cases.

  15. Fixed-Sized-Candidate-Set ART Algorithm • For each candidate test case, find its nearest neighbor within the executed test cases.

  16. Fixed-Sized-Candidate-Set ART Algorithm • Select the test case whose nearest neighbor is farthest away, and execute it.

  17. Fixed-Sized-Candidate-Set ART Algorithm • Randomly generate a set of candidate test cases.

  18. Fixed-Sized-Candidate-Set ART Algorithm • For each candidate test case, find its nearest neighbor within the executed test cases.

  19. Fixed-Sized-Candidate-Set ART Algorithm • For each candidate test case, find its nearest neighbor within the executed test cases.

  20. Fixed-Sized-Candidate-Set ART Algorithm • For each candidate test case, find its nearest neighbor within the executed test cases.

  21. Fixed-Sized-Candidate-Set ART Algorithm • For each candidate test case, find its nearest neighbor within the executed test cases.

  22. Fixed-Sized-Candidate-Set ART Algorithm • For each candidate test case, find its nearest neighbor within the executed test cases.

  23. Fixed-Sized-Candidate-Set ART Algorithm • Select the test case whose nearest neighbor is farthest away, and execute it.

  24. Fixed-Sized-Candidate-Set ART Algorithm • Repeat until a failure is encountered.
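The steps walked through on slides 13–24 amount to the following sketch over a one-dimensional numeric input domain; `is_failure` stands in for a hypothetical test oracle, and `k` is the fixed candidate-set size:

```python
import random

def fsc_art(is_failure, domain=(0.0, 1.0), k=10, max_tests=1000):
    """Fixed-size-candidate-set ART: from each batch of k random candidates,
    execute the one farthest from its nearest already-executed neighbor."""
    lo, hi = domain
    executed = [random.uniform(lo, hi)]  # step 1: one random test case
    if is_failure(executed[0]):
        return executed
    while len(executed) < max_tests:
        candidates = [random.uniform(lo, hi) for _ in range(k)]

        def min_dist(c):
            # distance from candidate c to its nearest executed neighbor
            return min(abs(c - e) for e in executed)

        best = max(candidates, key=min_dist)  # farthest-from-executed wins
        executed.append(best)
        if is_failure(best):
            break  # stop at the first failure
    return executed
```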

  25. Adaptive Random Testing (ART) • ART is based on the observation that failures tend to cluster in the input domain. • Intuitively, evenly spreading the test cases may increase the probability of exposing the first fault faster. • In test case prioritization, we also want to increase the rate of fault detection.

  26. Use ART directly for test case prioritization? • The variety of black-box input information makes it hard to define a general distance metric. • Video streams • Images • XML • … • The white-box coverage information of the previously executed test cases is readily available • Statement coverage • Branch coverage • Function coverage • …

  27. Distribution of Failures in Profile Space on LilyPond William Dickinson et al. @ FSE, 2001.

  28. MDS Display of Distribution of Failures in Profile Space on LilyPond Failures tend to cluster together. William Dickinson et al. @ FSE, 2001.

  29. MDS Display of Distribution of Failures in Profile Space on GCC William Dickinson et al. @ FSE, 2001.

  30. Distribution of Failures in Profile Space on GCC Failures tend to cluster together. William Dickinson et al. @ FSE, 2001.

  31. Use ART directly for test case prioritization? • The variety of black-box input information makes it hard to define a uniform distance metric. • Video streams • Images • XML • … • The white-box coverage information of the previously executed test cases is readily available • Statement coverage • Branch coverage • Function coverage • … Why NOT use such low-cost white-box information to evenly spread test cases across the code coverage space?

  32. Contents • Background • Motivation • Adaptive Random Test Case Prioritization • Experiments and Results Analysis • Related Works • Conclusion & Future work

  33. Adaptive Random Test Case Prioritization • Generate the candidate set • Randomly select a test case into the candidate set • If code coverage improves, continue; otherwise, stop. • Merits: no magic number, non-parametric • Select the farthest candidate from the prioritized set • Distance between test cases • Distance between a candidate test case and the already prioritized test cases • Repeat until all test cases are prioritized
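A minimal sketch of this loop, under two assumptions not fixed on this slide: a hypothetical `coverage` map from each test case to its set of covered items, and Jaccard distance with maxmin selection (one of the options discussed on the next slide):

```python
import random

def jaccard(a, b):
    """Jaccard distance between two coverage sets."""
    union = a | b
    return (1.0 - len(a & b) / len(union)) if union else 0.0

def art_prioritize(coverage):
    remaining = list(coverage)
    prioritized = []
    while remaining:
        # Grow the candidate set until accumulated coverage stops improving.
        candidates, covered = [], set()
        for t in random.sample(remaining, len(remaining)):
            if not candidates or coverage[t] - covered:
                candidates.append(t)
                covered |= coverage[t]
            else:
                break  # coverage no longer improves: stop
        if prioritized:
            # maxmin: pick the candidate whose nearest prioritized
            # test case is farthest away.
            chosen = max(candidates, key=lambda c: min(
                jaccard(coverage[c], coverage[p]) for p in prioritized))
        else:
            chosen = random.choice(candidates)
        prioritized.append(chosen)
        remaining.remove(chosen)
    return prioritized
```

Note the "no magic number" merit: the candidate-set size is determined by the coverage-improvement test rather than a fixed parameter.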

  34. Adaptive Random Test Case Prioritization • How to measure the distance between test cases • Jaccard distance • A general distance metric for binary data • Other distance metrics can be substituted. • How to select the test case from the candidate set that is farthest away from the already prioritized test cases? • Maximize the minimum distance (maxmin for short) • Chen et al. @ ASIAN '04, LNCS 2004 • Maximize the average distance (maxavg for short) • Ciupa et al. @ ICSE 2008 • Maximize the maximum distance (maxmax for short)

  35. Contents • Background • Motivation • Adaptive Random Test Case Prioritization • Experiments and Results Analysis • Related Works • Conclusion & Future Work

  36. Research Questions • Do different levels of coverage information have significant impact on ART techniques? • Do different definitions of test set distances have significant impacts on ART techniques? • Are ART techniques efficient?

  37. Subject Programs

  38. Techniques Studied in the Paper

  39. Experiment Setup • Dynamic coverage information collection • gcov tool • Effectiveness metric • APFD: the weighted average of the percentage of faults detected over the life of the test suite • Process • For each of the 11 subject programs, randomly select 20 test suites and repeat each ART technique 50 times.
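APFD is conventionally computed as 1 − (TF₁ + … + TFₘ)/(n·m) + 1/(2n), where TFᵢ is the position of the first test case revealing fault i, n is the suite size, and m the number of faults. A sketch, assuming a hypothetical `detects` map from each fault to the set of test cases that reveal it:

```python
def apfd(order, detects):
    """Average Percentage of Faults Detected for a prioritized sequence."""
    n, m = len(order), len(detects)
    first = []
    for fault, revealing in detects.items():
        # 1-based position of the first test case that detects this fault
        first.append(next(i for i, t in enumerate(order, 1) if t in revealing))
    return 1.0 - sum(first) / (n * m) + 1.0 / (2 * n)
```

Higher APFD means faults are detected earlier in the sequence, which is the testing goal being prioritized for.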

  40. Research Questions • Do different levels of coverage information have significant impact on ART techniques? • Do different definitions of test set distances have significant impacts on ART techniques? • Are ART techniques efficient?

  41. Do different levels of coverage information have significant impact on ART techniques? • Fix the other variable: definitions of test set distances. • Perform multiple comparisons between each pair of coverage-information levels and gather the statistics.

  42. Do different levels of coverage information have significant impact on ART techniques? • Fix the other variable: definitions of test set distances. • Perform multiple comparisons between each pair of coverage-information levels and gather the statistics. As confirmed by previous research: Branch > Statement > Function

  43. Research Questions • Do different levels of coverage information have significant impact on ART techniques? • Branch > Statement > Function • Do different definitions of test set distances have significant impacts on ART techniques? • Are ART techniques efficient?

  44. The Impact of Test Set Distance • Fix the other variable: definitions of coverage information • Perform multiple comparisons between each pair of test set distances and gather the statistics.

  45. The Impact of Test Set Distance • Fix the other variable: definitions of coverage information • Perform multiple comparisons between each pair of test set distances and gather the statistics. Max-Min > Max-Avg ≈ Max-Max

  46. Best ART Technique ART-br-maxmin is the best ART prioritization technique

  47. Research Questions • Do different levels of coverage information have significant impact on ART techniques? • Branch > Statement > Function • Do different definitions of test set distances have significant impacts on ART techniques? • Max-Min > Max-Avg > Max-Max • How does ART-br-maxmin compare with greedy? • Are ART techniques efficient?

  48. Multiple Comparisons for ART-br-maxmin on Siemens

  49. Multiple Comparisons for ART-br-maxmin on Siemens There is only a marginal difference between ART-br-maxmin and the traditional coverage-based techniques, and it is not statistically significant.

  50. Multiple Comparisons for ART-br-maxmin on UNIX
