Improving Test Suites for Efficient Fault Localization By: Benoit Baudry, Franck Fleurey, Yves Le Traon, 2006 • Laleh Sh. Ghandehari • Feb. 28, 2011
Outline • Definition • Fault diagnosis algorithm • Test criterion for diagnosis • Automatic test suite optimization • Experimental validation
Definition • Test: the goal is to generate test data with high fault-revealing power. • Fault localization (diagnosis): uses all the information available from testing to locate the fault. • The more information testing provides, the more precise the diagnosis can be. • Test-for-diagnosis (TfD) criterion: evaluates the "fault-locating power" of test cases, that is, their capacity to help the fault localization task.
Fault diagnosis algorithm • Diagnosis accuracy: the number of statements that have to be examined before the fault is found. • Tarantula approach: faulty statements appear more frequently in the traces of failed test cases than in the traces of passed test cases. • The algorithm works on a diagnosis matrix.
Diagnosis matrix • [Figure: diagnosis matrix and diagnosis results]
Trust value • For a statement s: • Trust(s): the ratio between the percentage of passed test cases that execute s and the sum of the percentages of passed and failed test cases that execute s. • Trust(s) = %Passed(s) / (%Passed(s) + %Failed(s)) • Intensity(s) = Max(%Passed(s), %Failed(s)): the higher this value, the more accurate the trust value should be.
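The two formulas above can be sketched in a few lines of Python. This is only an illustration: the function name and parameters are hypothetical, and it assumes (as in Tarantula) that %Passed(s) is the fraction of all passed test cases that execute s, and likewise for %Failed(s).

```python
def trust_and_intensity(passed_exec, failed_exec, total_passed, total_failed):
    """Return (trust, intensity) for one statement s.

    passed_exec / failed_exec: number of passed / failed tests that execute s.
    total_passed / total_failed: number of passed / failed tests in the suite.
    Assumption: %Passed(s) = passed_exec / total_passed (same for failed).
    """
    pct_passed = passed_exec / total_passed if total_passed else 0.0
    pct_failed = failed_exec / total_failed if total_failed else 0.0
    denom = pct_passed + pct_failed
    # Trust is undefined for a statement that no test case executes.
    trust = pct_passed / denom if denom else None
    intensity = max(pct_passed, pct_failed)
    return trust, intensity
```

For example, a statement executed by 1 of 4 passed tests and by both of 2 failed tests gets a low trust (0.25 / 1.25 = 0.2) and an intensity of 1.0.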
Test suite criteria • Code coverage: the base criterion. • Test suite criteria for diagnosis: • N-coverage: at least N test cases cover each statement of the program. • Distinguishing statements: minimize the number of indistinguishable statements in the diagnosis matrix. • Indistinguishable statements: statements that have the same trust and intensity values.
Dynamic Basic Block • A set of statements of P that are covered by the same test cases of TS. • OR • A set of statements that have identical lines in the diagnosis matrix. • B(TS): the set of dynamic basic blocks in P, distinguished by TS. • Example: { (1,2), (3,7), (4), (5,6) }
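As a rough sketch of this definition, dynamic basic blocks can be computed by grouping statements by the set of test cases that cover them. The function name and the coverage data below are hypothetical, chosen only so that the result matches the example blocks on the slide; statements that no test covers are left out.

```python
from collections import defaultdict

def dynamic_basic_blocks(coverage):
    """Group statements that are covered by exactly the same test cases.

    coverage: dict mapping a test name to the set of statement ids it covers.
    Returns a sorted list of blocks, each block a tuple of statement ids.
    Statements covered by no test case are ignored in this sketch.
    """
    statements = set().union(*coverage.values())
    groups = defaultdict(list)
    for s in sorted(statements):
        # The "signature" of s is the set of tests whose trace contains s.
        signature = frozenset(t for t, covered in coverage.items() if s in covered)
        groups[signature].append(s)
    return sorted(tuple(block) for block in groups.values())

# Hypothetical coverage data for a 7-statement program.
example_coverage = {
    "t1": {1, 2, 3, 7},
    "t2": {1, 2, 4},
    "t3": {1, 2, 3, 5, 6, 7},
}
print(dynamic_basic_blocks(example_coverage))
# prints [(1, 2), (3, 7), (4,), (5, 6)]
```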
Test for diagnosis criterion • Ideal accuracy of diagnosis • Minimize the size of the DBBs (not decidable). • Maximize the number of DBBs; the maximum possible number is the number of static basic blocks.
Test for diagnosis criterion • A test suite satisfies the TfD criterion if it maximizes the number of dynamic basic blocks distinguished in the program under test.
Automatic test optimization • Bacteriologic approach • Mutation operator: Let T = [C1, .., Cn] be a test case composed of n values. Let Ci be a randomly selected value in T. The mutation operator replaces Ci by a randomly generated valid value C'i: • T = [C1, .., Ci, .., Cn] -> T' = [C1, .., C'i, .., Cn] • Fitness function: computes the quality of a test case for a particular criterion. Let S be a test suite and F(S) its quality; the fitness value of a test case t is f(t) = F(S U {t}) - F(S)
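The mutation operator and the fitness function above might look as follows in Python. This is a minimal sketch, not the authors' implementation: `random_valid_value` and `suite_quality` are hypothetical callbacks standing in for the input domain of the test case and for the criterion-specific quality of a suite.

```python
import random

def mutate(test_case, random_valid_value, rng=random):
    """Replace one randomly selected value Ci by a random valid value C'i.

    random_valid_value(i, rng) is a hypothetical callback that returns a
    valid value for position i. The original test case is left unchanged.
    """
    i = rng.randrange(len(test_case))
    mutant = list(test_case)
    mutant[i] = random_valid_value(i, rng)
    return mutant

def fitness(test_case, suite, suite_quality):
    """f(t) = F(S U {t}) - F(S), with the suite represented as a list."""
    return suite_quality(suite + [test_case]) - suite_quality(suite)
```

A test case whose fitness is 0 brings nothing new for the chosen criterion, which is why the optimization only keeps candidates with a positive value.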
Automatic test optimization • Input: an initial test suite. • Compute the fitness value of each test case. • Loop: • The mutation operator generates new test cases. • Compute the fitness value of each new test case; it is added to the solution if it improves the quality of the set. • Stopping criteria: a given fitness value is reached, or a given number of generations has elapsed.
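The loop described above can be sketched as follows. This is a simplified reading of the slide, not the full bacteriologic algorithm: the function names, the candidate pool, and the toy criterion (number of distinct values appearing in the suite) are all assumptions made for illustration, and the initial suite is assumed to be non-empty.

```python
import random

def optimize(initial_suite, mutate, suite_quality,
             target=None, max_generations=100, rng=random):
    """Grow a solution suite by keeping only test cases that improve it."""
    solution = []
    # Keep the initial test cases that have a positive fitness value.
    for t in initial_suite:
        if suite_quality(solution + [t]) > suite_quality(solution):
            solution.append(t)
    pool = list(initial_suite)
    for _ in range(max_generations):      # stopping criterion: generations
        if target is not None and suite_quality(solution) >= target:
            break                          # stopping criterion: fitness reached
        candidate = mutate(rng.choice(pool), rng)
        if suite_quality(solution + [candidate]) > suite_quality(solution):
            solution.append(candidate)
            pool.append(candidate)
    return solution

# Toy criterion (hypothetical): how many distinct values the suite contains.
def toy_quality(suite):
    return len(set().union(*suite))

# Toy mutation (hypothetical): replace one position by a digit in 0..9.
def toy_mutate(test_case, rng):
    i = rng.randrange(len(test_case))
    return test_case[:i] + (rng.randrange(10),) + test_case[i + 1:]
```

Calling `optimize([(0, 0)], toy_mutate, toy_quality, target=10, max_generations=10000)` keeps adding mutants until the toy suite contains all ten digits or the generation budget runs out.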
Automatic test optimization • Fitness function for a test suite S: • Statement coverage: F(S) = |C(S)|/|P|, where C(S) is the set of statements covered by S • TfD: F(S) = |B(S)|, the number of dynamic basic blocks distinguished by S • N-coverage criterion: produce N test suites that each cover all the statements and merge them; the runs differ because the algorithm is random.
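A small sketch of the two suite-level fitness functions may help show why they differ. The representation is an assumption: each test case is reduced to the set of statement ids it covers, and statements covered by no test are ignored when counting blocks.

```python
def covered_statements(suite_coverage):
    """C(S): union of the statements covered by each test case."""
    return set().union(*suite_coverage)

def coverage_fitness(suite_coverage, program_size):
    """Statement coverage: F(S) = |C(S)| / |P|."""
    return len(covered_statements(suite_coverage)) / program_size

def tfd_fitness(suite_coverage):
    """TfD: F(S) = |B(S)|, counted as the number of distinct test signatures."""
    signatures = set()
    for s in covered_statements(suite_coverage):
        # Indices of the test cases that cover statement s.
        signatures.add(frozenset(
            i for i, cov in enumerate(suite_coverage) if s in cov))
    return len(signatures)
```

With a 4-statement program, the suite `[{1, 2, 3, 4}]` already has coverage fitness 1.0 but a TfD fitness of only 1. Adding a test that covers `{1, 2}` leaves coverage at 1.0 and raises the TfD fitness to 2, so only the TfD criterion rewards the new test.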
Experimental validation • The initial test suite is optimized. • Faulty versions (mutants) of the program under test are generated. • For each mutant: • The test cases are executed, and verdicts and execution traces are collected. • The diagnosis matrix is built. • The diagnosis algorithm is executed.
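The per-mutant steps can be sketched as below, starting from traces and verdicts that have already been collected. The ranking rule (lowest trust first, ties broken by highest intensity, then by statement id) is an assumption in the spirit of Tarantula; the slides do not spell it out. The function names and the data are hypothetical.

```python
def diagnose(traces, verdicts):
    """Rank executed statements from most to least suspicious.

    traces: dict test name -> set of statement ids executed by the test.
    verdicts: dict test name -> True if the test passed, False if it failed.
    """
    n_pass = sum(1 for passed in verdicts.values() if passed)
    n_fail = len(verdicts) - n_pass
    keys = {}
    for s in set().union(*traces.values()):
        p = sum(1 for t, tr in traces.items() if s in tr and verdicts[t])
        f = sum(1 for t, tr in traces.items() if s in tr and not verdicts[t])
        pct_p = p / n_pass if n_pass else 0.0
        pct_f = f / n_fail if n_fail else 0.0
        trust = pct_p / (pct_p + pct_f)
        intensity = max(pct_p, pct_f)
        keys[s] = (trust, -intensity, s)
    return sorted(keys, key=keys.get)

def statements_examined(ranking, faulty_statement):
    """Diagnosis accuracy: statements inspected until the fault is reached."""
    return ranking.index(faulty_statement) + 1

# Hypothetical run of one mutant whose fault is in statement 4.
traces = {"t1": {1, 2, 3}, "t2": {1, 2, 4}, "t3": {1, 3, 4}}
verdicts = {"t1": True, "t2": True, "t3": False}
ranking = diagnose(traces, verdicts)
print(ranking, statements_examined(ranking, 4))
# prints [3, 4, 1, 2] 2
```

In this toy run, statements 3 and 4 have the same trust and intensity, so they are indistinguishable and the faulty statement is only reached second; a test suite that separates them would improve the diagnosis.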
Experimental validation • System under test: • 72 classes • 1478 lines • The initial test suite has 15 test cases. • The test suite distinguishes 113 DBBs. • 346 faulty programs • Run the optimization algorithm with the coverage fitness function 4 times and merge the results. • Run the optimization algorithm with the TfD fitness function; after around 150 iterations, the final test suite has 186 DBBs.
Results
Scalability • Test optimization is really time-consuming: • Using the diagnosis algorithm
Questions?
Thank you