This paper explores the challenges of test suite prioritization in software development, particularly under time constraints. Traditional prioritization methods often overlook the overall time budget, leading to inefficiencies. We introduce a time-aware prioritization approach that seeks to maximize fault detection within a limited time frame, using a variety of knapsack solvers. Our experimental results reveal the effectiveness and efficiency of the different solvers, demonstrating trade-offs between coverage and execution time. The study emphasizes choosing the right prioritizer based on the characteristics of the test suite.
Efficient Time-Aware Prioritization with Knapsack Solvers
Sara Alspaugh, Kristen R. Walcott, Mary Lou Soffa (University of Virginia)
Michael Belanich, Gregory M. Kapfhammer (Allegheny College)
ACM WEASELTech
Test Suite Prioritization
• Testing occurs throughout the software development life cycle
• Challenge: time consuming and costly
• Prioritization: reordering the test suite
  • Goal: find errors sooner in testing
  • Does not consider the overall time budget
• Alternative: time-aware prioritization
  • Goal 1: find errors sooner in testing
  • Goal 2: execute within the time constraint
Motivating Example
Original test suite with fault information (assume every test has the same execution time and each finds unique faults):
• T1: 4 faults, 2 min.
• T2: 1 fault, 2 min.
• T3: 2 faults, 2 min.
• T4: 6 faults, 2 min.
Prioritized test suite: T4 (6 faults), T1 (4 faults), T3 (2 faults), T2 (1 fault), each 2 min.
Testing time budget: 4 minutes
The Knapsack Problem for Time-Aware Prioritization
Maximize $\sum_{i=1}^{n} c_i x_i$, where $c_i$ is the code coverage of test $T_i$ and $x_i$ is either 0 or 1.
Subject to the constraint $\sum_{i=1}^{n} t_i x_i \leq t_{max}$, where $t_i$ is the execution time of test $T_i$ and $t_{max}$ is the time budget.
The Knapsack Problem for Time-Aware Prioritization
Assume the test cases cover unique requirements and the time budget is 4 min.:
• T1: 4 lines, 2 min.
• T2: 1 line, 2 min.
• T3: 2 lines, 2 min.
• T4: 5 lines, 2 min.
Starting from total value 0 with 4 min. remaining, selecting T4 gives total value 5 with 2 min. remaining; adding T1 gives total value 9 with 0 min. remaining.
The Extended Knapsack Problem
• The value of each test case depends on the test cases already in the prioritization
• Test cases may cover the same requirements
Using the same test suite and a 4 min. budget, selecting T4 (5 lines, 2 min.) gives total value 5 with 2 min. remaining. The UPDATE step then reduces T1's value from 4 lines to 0 lines because its lines are already covered, so selecting T3 (2 lines, 2 min.) gives total value 7 with 0 min. remaining.
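The contrast between the two formulations can be made concrete with a small sketch in Python. The test data below is hypothetical, modeled on the slide's example, with T4 covering every line that T1 covers. The greedy loop picks the test with the best marginal coverage-to-time ratio and re-evaluates the remaining tests after every pick, which is the UPDATE step above. With the overlapping data it selects T4 and T3 for a total value of 7; if the line sets were disjoint (the traditional assumption of the previous slide), the same loop would select T4 and T1 for a total value of 9.

```python
# Hypothetical test data modeled on the slide's example:
# name -> (set of covered lines, execution time in minutes).
# The line sets are assumptions for illustration; T4 overlaps all of T1's lines.
tests = {
    "T1": ({1, 2, 3, 4}, 2),
    "T2": ({5}, 2),
    "T3": ({6, 7}, 2),
    "T4": ({1, 2, 3, 4, 8}, 2),
}

def greedy_extended_knapsack(tests, budget):
    """Greedily pick tests by marginal coverage per unit time within the time budget."""
    chosen, covered, remaining = [], set(), budget
    candidates = dict(tests)
    while candidates:
        # UPDATE step: a test's value is the coverage it adds beyond what is already covered.
        best, best_ratio = None, 0.0
        for name, (lines, time) in candidates.items():
            if time > remaining:
                continue
            ratio = len(lines - covered) / time
            if ratio > best_ratio:
                best, best_ratio = name, ratio
        if best is None:
            break
        lines, time = candidates.pop(best)
        chosen.append(best)
        covered |= lines
        remaining -= time
    return chosen, len(covered)

print(greedy_extended_knapsack(tests, budget=4))  # (['T4', 'T3'], 7)
```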
Goals and Challenges
• Evaluate traditional and extended knapsack solvers for use in time-aware prioritization
  • Effectiveness: coverage-based metrics
  • Efficiency: time overhead and memory overhead
• How does overlapping code coverage affect the results of the traditional techniques?
• Is the cost of the extended knapsack algorithms worthwhile?
The Knapsack Solvers
• Random: select test cases at random
• Greedy by Ratio: order by coverage/time
• Greedy by Value: order by coverage
• Greedy by Weight: order by time
• Dynamic Programming: break the problem into sub-problems and use the sub-problem results to build the main solution (see the sketch below)
• Generalized Tabular: use large tables to store sub-problem solutions
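As an illustration of the dynamic programming solver in the list above, here is a minimal textbook 0/1 knapsack routine in Python, with coverage as the value and execution time as the weight. It assumes integer execution times and is only a compact sketch, not the implementation evaluated in the paper.

```python
def dp_knapsack(values, times, budget):
    """Classic 0/1 knapsack by dynamic programming.
    values = coverage per test, times = integer execution times, budget = time limit.
    Returns (best total value, indices of the chosen tests)."""
    n = len(values)
    # best[t] = (value, chosen indices) achievable with a time budget of t
    best = [(0, [])] * (budget + 1)
    for i in range(n):
        new_best = list(best)
        for t in range(times[i], budget + 1):
            # Either keep the previous solution for budget t, or add test i
            # to the best solution for the remaining budget t - times[i].
            cand_value = best[t - times[i]][0] + values[i]
            if cand_value > new_best[t][0]:
                new_best[t] = (cand_value, best[t - times[i]][1] + [i])
        best = new_best
    return best[budget]

# Example from the earlier slide: coverage values for T1..T4, 2-minute tests, 4-minute budget.
print(dp_knapsack([4, 1, 2, 5], [2, 2, 2, 2], 4))  # (9, [0, 3]) -> T1 and T4
```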
The Knapsack Solvers (continued)
• Core: compute the optimal fractional solution, then exchange items until the optimal integral solution is found
• Overlap-Aware: use a genetic algorithm to solve the extended knapsack problem for time-aware prioritization
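The overlap-aware solver is a genetic algorithm; its actual fitness function and operators are not reproduced here, so the following is only an illustrative sketch of the general idea: score an ordering by the unique coverage it accumulates before the budget expires, and search over orderings with a simple mutation operator. The `tests` dictionary is the hypothetical one from the earlier sketch.

```python
import random

def fitness(ordering, tests, budget):
    # Hypothetical fitness: unique coverage accumulated before the time budget runs out.
    covered, elapsed = set(), 0
    for name in ordering:
        lines, time = tests[name]
        if elapsed + time > budget:
            break
        covered |= lines
        elapsed += time
    return len(covered)

def mutate(ordering):
    # Swap two positions: a simple permutation mutation operator.
    a, b = random.sample(range(len(ordering)), 2)
    child = list(ordering)
    child[a], child[b] = child[b], child[a]
    return child

def evolve(tests, budget, generations=200, seed=0):
    # (1+1)-style evolutionary search over test orderings; only a stand-in for a full
    # genetic algorithm with a population, selection, and crossover.
    random.seed(seed)
    current = list(tests)
    random.shuffle(current)
    for _ in range(generations):
        child = mutate(current)
        if fitness(child, tests, budget) >= fitness(current, tests, budget):
            current = child
    return current
```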
The Scaling Heuristic
• Order the test cases by their coverage-to-execution-time ratio so that $c_1/t_1 \geq c_2/t_2 \geq \ldots \geq c_n/t_n$.
• If the scaling inequality, which relates $c_1$ and $t_1$ to $c_2$, $t_2$, and the time budget $t_{max}$, holds, then it is possible to find an optimal solution that includes $T_1$.
• Check the inequality for each test case in turn until it no longer holds.
• The test cases $T_1, \ldots, T_i$ that satisfy it belong in the final prioritization.
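A structural sketch of the heuristic in Python: sort by ratio, fix the longest prefix of tests for which the inequality keeps holding, and hand the smaller remaining instance to a solver. Since the exact inequality is not reproduced here, it is passed in as a placeholder predicate (`holds`), and the `tests` dictionary maps each name to a (coverage, time) pair; both are assumptions for illustration.

```python
def apply_scaling(tests, budget, holds):
    """Sketch of the scaling heuristic. `tests` maps name -> (coverage, time);
    `holds(ordered, i, remaining_budget)` is a placeholder for the scaling inequality."""
    # Sort by coverage-to-execution-time ratio, best first.
    ordered = sorted(tests.items(), key=lambda kv: kv[1][0] / kv[1][1], reverse=True)
    fixed, remaining_budget = [], budget
    for i, (name, (coverage, time)) in enumerate(ordered):
        # Stop at the first test that does not fit or fails the inequality.
        if time > remaining_budget or not holds(ordered, i, remaining_budget):
            break
        fixed.append(name)
        remaining_budget -= time
    # The fixed prefix goes into the prioritization; a solver handles the rest.
    rest = dict(ordered[len(fixed):])
    return fixed, rest, remaining_budget
```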
Implementation Details
Tool components: TestSuite (T), Test Transformer, Program Under Test (P), CoverageCalculator, Knapsack Solver, New TestSuite (T')
• Knapsack Solver parameters:
  1. Selected Solver
  2. Reduction Preference
  3. Knapsack Size
Evaluation Metrics
• Code coverage: percentage of requirements executed when the prioritization is run
  • Basic block coverage is used
• Coverage preservation: proportion of the code covered by the prioritization versus the code covered by the entire original test suite
• Order-aware coverage: considers the order in which test cases execute in addition to overall code coverage
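A minimal sketch of the first two metrics in Python, assuming each test's coverage is recorded as a set of basic-block identifiers (the data layout is an assumption for illustration; order-aware coverage is not sketched here).

```python
def code_coverage(selected, tests, all_blocks):
    # Percentage of all basic blocks executed by the selected prioritization.
    # `tests` maps test name -> set of covered basic-block identifiers.
    covered = set().union(*(tests[name] for name in selected))
    return 100.0 * len(covered) / len(all_blocks)

def coverage_preservation(selected, tests):
    # Proportion of the blocks covered by the full suite that the prioritization still covers.
    suite_covered = set().union(*tests.values())
    selected_covered = set().union(*(tests[name] for name in selected))
    return len(selected_covered) / len(suite_covered)
```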
Experiment Design
• Goals of the experiment:
  • Measure the efficiency of the algorithms and of scaling in terms of time and space overhead
  • Measure the effectiveness of the algorithms and of scaling in terms of the three coverage-based metrics
• Case studies: JDepend and Gradebook
• Knapsack size: 25, 50, and 75% of the execution time of the original test suite
Summary of Experimental Results
• Prioritizer effectiveness:
  • The overlap-aware solver had the highest overall coverage for each time limit
  • The Greedy by Value solver was good for Gradebook
  • All Greedy solvers were good for JDepend
• Prioritizer efficiency:
  • All algorithms took a small amount of time and memory, except Dynamic Programming, Generalized Tabular, and Core
  • The overlap-aware solver required hours to run
  • Generalized Tabular had prohibitively large memory requirements
  • The scaling heuristic reduced overhead in some cases
Conclusions
• The most sophisticated algorithm is not necessarily the most effective or the most efficient
• Trade-off: effectiveness versus efficiency
  • If effectiveness matters most: the overlap-aware prioritizer
  • If efficiency matters most: a low-overhead prioritizer
• The choice of prioritizer depends on the nature of the test suite
  • Time versus coverage of each test case
  • Coverage overlap between test cases
Future Research
• Use larger case studies with bigger test suites
• Use case studies written in other languages
• Evaluate other knapsack solvers, such as branch-and-bound and parallel solvers
• Incorporate other metrics, such as APFD
• Use synthetically generated test suites
Thank you! Questions? http://www.cs.virginia.edu/walcott/weasel.html