Comparison of Unit-Level Automated Test Generation Tools
Presentation Transcript
Comparison of Unit-Level Automated Test Generation Tools
Shuang Wang, co-authored with Jeff Offutt
April 4, 2009
Motivation
• We have more software but insufficient resources, so we need to be more efficient
• Frameworks like JUnit provide empty boxes; the hard question is what to put in them
• Automated test data generation tools:
  • Reduce time and effort
  • Are easier to maintain
  • Encapsulate knowledge of how to design and implement high-quality tests
What's available out there?
• Two commercial tools: AgitarOne and JTest
What are our criteria?
• Free
• Unit-level
• Automated test generation
• Java
Experiment Goals and Design
• Compare three unit-level automatic test data generators and evaluate them based on their mutation scores
• Subjects: three free automated testing tools: JCrasher, TestGen4J, and JUB
• Control groups: Edge Coverage and Random Test
• Metric: mutation score results
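The metric used throughout the talk is the mutation score: the percentage of non-equivalent mutants killed by a test set. A minimal sketch of the formula (the numbers below are made up for illustration, not from the study):

```java
public class MutationScore {
    // score = killed / (generated - equivalent), as a percentage
    static double score(int killed, int generated, int equivalent) {
        return 100.0 * killed / (generated - equivalent);
    }

    public static void main(String[] args) {
        // e.g. 30 mutants killed out of 110 generated, 10 of them equivalent
        System.out.println(score(30, 110, 10) + "%"); // 30.0%
    }
}
```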
Experiment Design
[Diagram: muJava generates mutants from the subject program P. Five test sets are run against the mutants to produce five mutation scores: JCrasher (JC), TestGen4J (TG), and JUB sets generated by the tools, plus manually built Random (Ram) and Edge Coverage (EC) sets.]
Subjects (Automatic Test Data Generators)
Control groups:
• Edge Coverage: one of the weakest and most basic test criteria
• Random Test: the "weakest effort" testing strategy
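A rough sketch of what the "weakest effort" control looks like: call the class under test with randomly chosen arguments and check an oracle. The `Counter` class and the fixed seed are illustrative assumptions, not code from the study:

```java
import java.util.Random;

// Illustrative class under test (hypothetical, not from the study)
class Counter {
    private int value;
    void add(int n) { value += n; }
    int get() { return value; }
}

public class RandomTestSketch {
    public static void main(String[] args) {
        Random rng = new Random(42);          // fixed seed for repeatability
        Counter c = new Counter();
        int expected = 0;
        for (int i = 0; i < 100; i++) {
            int n = rng.nextInt(1000) - 500;  // random argument in [-500, 500)
            c.add(n);
            expected += n;                    // oracle maintained by hand
        }
        // Even random tests need an oracle to detect a fault
        if (c.get() != expected) throw new AssertionError("fault detected");
        System.out.println("100 random calls passed");
    }
}
```

Note that the random inputs are easy to generate; the expected values (the oracle) are the part no generator gets for free.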
muJava
• Create mutants
• Run tests
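muJava creates mutants by applying mutation operators to the code under test; for example, its arithmetic operator replacement operator can turn `+` into `-`. A hand-written illustration of what such a mutant looks like (the `sum` method is hypothetical; muJava generates these variants automatically):

```java
// Original method (hypothetical example)
class MathOps {
    static int sum(int a, int b) { return a + b; }
}

// Hand-written copy of what an arithmetic-operator-replacement
// mutant of sum() would look like: + replaced by -
class MathOpsMutant {
    static int sum(int a, int b) { return a - b; }
}

public class MutantDemo {
    public static void main(String[] args) {
        // A test "kills" the mutant if it observes different results
        // from the original and the mutant on the same input
        int a = 3, b = 4;
        boolean killed = MathOps.sum(a, b) != MathOpsMutant.sum(a, b);
        System.out.println("mutant killed: " + killed); // true: 7 != -1
    }
}
```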
Results & Findings: Total % Killed [chart not captured in transcript]
Results & Findings: Efficiency [chart not captured in transcript]
Example
• For vendingMachine, except for edge coverage, the other four mutation scores were below 10%
• muJava creates dozens of mutants on its predicates, and the mostly random values created by the three generators have little chance of killing those mutants
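Why random values rarely kill predicate mutants can be seen with a small boundary example (the `dispense` predicate is hypothetical, written in the spirit of vendingMachine; the relational-operator mutation shown is the kind muJava produces):

```java
class Balance {
    // Original predicate: dispense only when change is strictly positive
    static boolean dispense(int change) { return change > 0; }

    // Relational-operator mutant: > replaced by >=
    static boolean dispenseMutant(int change) { return change >= 0; }
}

public class PredicateDemo {
    public static void main(String[] args) {
        // The two versions differ only at change == 0, so a uniformly
        // random int kills this mutant with probability about 2^-32
        int killing = 0;
        for (int change = -50; change <= 50; change++) {
            if (Balance.dispense(change) != Balance.dispenseMutant(change)) {
                killing++;
            }
        }
        System.out.println(killing + " of 101 sampled inputs kill the mutant"); // 1 of 101
    }
}
```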
Example
• Scores for BoundedStack were the second lowest for all the test sets except edge coverage
• Only two of its eleven methods take parameters; the three test generators depend largely on method signatures, so fewer parameters may mean weaker tests
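The signature-dependence point can be illustrated with a small stack class (hypothetical code in the spirit of BoundedStack, not the study's subject):

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative stack (hypothetical, not the BoundedStack from the study)
class TinyStack {
    private final List<Integer> items = new ArrayList<>();
    void push(int x) { items.add(x); }                    // parameter to vary
    int pop() { return items.remove(items.size() - 1); }  // no parameters
    boolean isEmpty() { return items.isEmpty(); }          // no parameters
}

public class SignatureDemo {
    public static void main(String[] args) {
        // A signature-driven generator can vary the argument to push(),
        // but pop() and isEmpty() give it nothing to vary: interesting
        // behavior (e.g. popping an empty stack) depends on the call
        // sequence, which such generators rarely explore.
        TinyStack s = new TinyStack();
        s.push(7);
        System.out.println(s.pop());     // 7
        System.out.println(s.isEmpty()); // true
    }
}
```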
Example
• JCrasher achieved the highest mutation score among the three generators
• JCrasher uses invalid values to attempt to "crash" the class under test
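A sketch of the crash-style testing idea: feed "hostile" values such as `null` and flag any unexpected runtime exception as a potential fault. The `NameFormatter` class is a hypothetical illustration, not JCrasher's actual output format:

```java
// Illustrative class under test (hypothetical)
class NameFormatter {
    static String initials(String name) {
        // Bug: no null check
        return name.substring(0, 1).toUpperCase();
    }
}

public class CrashStyleDemo {
    public static void main(String[] args) {
        // Crash-style test: an invalid input that exposes the missing check
        try {
            NameFormatter.initials(null);
            System.out.println("no crash");
        } catch (RuntimeException e) {
            // NullPointerException here reveals the fault
            System.out.println("crashed: " + e.getClass().getSimpleName());
        }
    }
}
```

Such invalid values happen to kill many mutants that "sensible" random inputs miss, which is consistent with JCrasher's higher score here.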
Conclusion
• By themselves, these three tools generate tests that are very poor at detecting faults
• Among publicly accessible tools, criteria-based testing is hardly used
• We need better automated test generation tools
Contact
Shuang Wang
Computer Science Department, George Mason University
SWANGB@gmu.edu