This article explores McCabe’s Cyclomatic Complexity and its significance in software testing. We delve into the concept of linearly independent paths in flowgraphs, useful in defining test coverage. The relationship between closed loops, edges, and nodes is illustrated through examples. Additionally, we discuss test case generation strategies, including domain partitioning and boundary value analysis. Symbolic execution is highlighted as a method to find inputs that exercise specific paths within a program, enabling effective validation of software functionality.
McCabe’s Cyclomatic Complexity Number of “linearly independent paths” • useful in defining test coverage (See later) • Counts the number of closed loops in the graph • FA() = 0 • Fs(m1,m2) = m1 + m2 • FC(m1,m2) = m1 + m2 + 1 • Fl(m1) = m1 + 1 v(P) = #edges - #nodes +2 (Familiar?)
McCabe: Example Edges = 12 Nodes = 10 v = 12 - 10 + 2 = 4
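The formula v(P) = #edges - #nodes + 2 can be checked mechanically. The sketch below assumes a connected flowgraph given as a list of (source, target) pairs; the function name and the two small example graphs are hypothetical, since the graph on the slide is not reproduced here.

```python
def cyclomatic_complexity(edges):
    """Return v = #edges - #nodes + 2 for a connected flowgraph.

    `edges` is a list of (source, target) pairs; nodes are inferred from them.
    """
    nodes = {node for edge in edges for node in edge}
    return len(edges) - len(nodes) + 2


# Hypothetical example graphs (not the one from the slide).
if_then_else = [("entry", "then"), ("entry", "else"),
                ("then", "exit"), ("else", "exit")]
while_loop = [("entry", "test"), ("test", "body"),
              ("body", "test"), ("test", "exit")]

print(cyclomatic_complexity(if_then_else))
print(cyclomatic_complexity(while_loop))
```

Both examples have four edges and four nodes, so each has one decision and a complexity of 2.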
More generally... • Can define a set of prime flowgraphs • those which cannot be broken down by nesting • corresponding to the statements of the language • And a measure for each • Yields a Prime Decomposition Theorem: • “The decomposition of a flowgraph into primes is unique”
A more general approach to CFGs • For any language, a Prime Flowgraph is one which cannot be broken down by sequencing or nesting [Figure: example prime flowgraphs: if-then, repeat-until, cases, ...]
Hierarchical measures (again) • Define measure for each prime flowgraph • Define measure for sequencing • Define measure for nesting Eg. number of nodes: nd(P) = #nodes in P, for each prime
Example: Structuredness • Whether a program is structured can be seen as a measure as follows:
str(P) = 1 if P is one of the allowed primes, 0 otherwise
str(F1;...;Fn) = min(str(F1),...,str(Fn))
str(F(F1,...,Fn)) = min(str(F),str(F1),...,str(Fn))
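The three rules for str can be read as a recursion over the decomposition tree. The following is a minimal sketch under assumptions: the `Flowgraph` class, the `"seq"` marker for sequencing and the contents of `ALLOWED_PRIMES` are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field

# Assumption: which primes count as "allowed" is chosen per language.
ALLOWED_PRIMES = {"atomic", "if-then", "if-then-else", "while", "repeat-until"}


@dataclass
class Flowgraph:
    """A node of a (hypothetical) prime decomposition tree."""
    prime: str                       # name of the prime, or "seq" for sequencing
    parts: list = field(default_factory=list)


def structuredness(fg):
    """str(P): 1 if every prime in the decomposition is allowed, else 0."""
    if fg.prime == "seq":
        own = 1                      # sequencing itself adds no restriction
    else:
        own = 1 if fg.prime in ALLOWED_PRIMES else 0
    return min([own] + [structuredness(part) for part in fg.parts])


body = Flowgraph("if-then", [Flowgraph("atomic")])
good = Flowgraph("seq", [Flowgraph("while", [body]), Flowgraph("atomic")])
bad = Flowgraph("seq", [Flowgraph("two-exit-loop", [body]), Flowgraph("atomic")])

print(structuredness(good), structuredness(bad))
```

Taking the minimum means a single disallowed prime anywhere in the nesting makes the whole program unstructured.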
Linearly Independent Paths • The vector representation of a path is a vector which counts the number of occurrences of each edge. • A set of paths is l.i. if none can be represented as a linear combination of the others (in the vector representation).
[Figure: flowgraph with its edges numbered 1 to 12]
First number each edge. A path can be represented as a vector counting edges visited:
A = (1,0,1,0,1,1,0,1,0,0,0,1)
B = (1,0,1,0,1,0,1,0,1,1,1,1)
C = (1,0,1,0,1,0,1,0,0,0,1,1)
D = (0,1,0,1,1,1,0,1,0,0,0,1)
Now can add and subtract vectors: Eg. D-A = (-1,1,-1,1,0,0,0,0,0,0,0,0) [Figure: flowgraph with path E, edges labelled -1, 1, -1, 1] So E = B+D-A
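Linear independence of path vectors can be checked by computing the rank of the matrix they form. The sketch below uses the four vectors given for A, B, C and D, builds E = B + D - A, and uses a small exact Gaussian elimination; the helper name `rank` is an illustrative choice.

```python
from fractions import Fraction


def rank(vectors):
    """Rank of a list of equal-length vectors, by exact Gaussian elimination."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r


A = (1, 0, 1, 0, 1, 1, 0, 1, 0, 0, 0, 1)
B = (1, 0, 1, 0, 1, 0, 1, 0, 1, 1, 1, 1)
C = (1, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1, 1)
D = (0, 1, 0, 1, 1, 1, 0, 1, 0, 0, 0, 1)
E = tuple(b + d - a for a, b, d in zip(A, B, D))

print(E)
print(rank([A, B, C, D]), rank([A, B, C, D, E]))
```

The rank stays at 4 when E is added, so E is not linearly independent of A, B, C and D.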
How do we find test sets? • Given a test strategy it is not easy to find test cases that exercise the required paths • Even for Statement Coverage some parts of the code may be unreachable • A single path can achieve Branch Coverage for: while(...) do “some complex program” but unlikely to be possible in practice
Domain Partitioning What have we been doing? • Partitioning input space according to some property • Selecting Test case inputs which are representatives of each partition • Eg to ensure different paths executed • Assuming behaviour similar for all values of partition
Boundary Value Analysis • Also important to test software at the boundaries of the partitions. • Less than (or equal)? • length of list (or n-1)? • closure reversal (“not <” is not “>”)? • How do we identify boundaries?
Single variable case • Open and closed intervals [Figure: three intervals between min and max: both ends closed, half open, both ends open; partitions P1, P2, P3]
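For a single integer variable, the boundary test points of an interval can be generated directly from its end points and whether each end is open or closed. This is only a sketch: the function name, the restriction to integers and the choice of one point just inside and one just outside each boundary are assumptions.

```python
def boundary_points(lo, hi, lo_closed=True, hi_closed=True):
    """Return (inside, outside) integer test points for a non-empty interval.

    inside:  the smallest and largest values that belong to the interval
    outside: the nearest values just beyond each boundary
    """
    first = lo if lo_closed else lo + 1
    last = hi if hi_closed else hi - 1
    inside = sorted({first, last})
    outside = [first - 1, last + 1]
    return inside, outside


print(boundary_points(1, 10))                                    # both ends closed
print(boundary_points(1, 10, hi_closed=False))                   # half open
print(boundary_points(1, 10, lo_closed=False, hi_closed=False))  # both ends open
```

Testing both lists catches the typical off-by-one mistakes, such as writing < where <= was intended.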
Multiple variables • Input domains are multi-dimensional • Boundaries are hyperplanes • Can be open or closed at each intersection [Figure: two-dimensional input domain showing an open boundary, a closed boundary, an on point, an off point and an extreme point]
Finding Test Cases • CFGs model software • Test strategy to select paths to test • Data flow Analysis to choose “best” test paths • Now need to find test inputs which exercise those paths
Example • Find All DU paths for example program • Find test cases which execute the paths
ADUP: usage of p and q
[Figure: CFG with nodes 1 to 8, annotated with definitions (d) and uses (u) of p and q]
Program:
smallest(int p) (*p>2*)
{ int q = 2;
  while (p mod q > 0 AND q < sqrt p) do
    q := q+1;
  if (p mod q = 0)
    then print(q, ’is factor’)
    else print(p, ’is prime’);
}
All DU paths:
p: 123, 12343, 1235, 123435, 12357, 1234357
q: 23, 234, 235, 2356, 43, 434, 435, 4356
100% coverage
ADUP
p: 123, 1235, 123435, 12357, 1234357
q: 23, 234, 235, 2356, 43, 434, 435, 4356

Test         Test Input               Output         Subpaths subsumed
123578       p=3                      3 is prime     12357
12343578     p=5                      5 is prime     1234357
123568       p=4,6,8...               2 is sm fact   2356
1234343578   p=4,8,12... 9,10,..15    11 is prime    434
12343568     p=9,15,21..              3 is sm fact   4356
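Whether a test set covers all DU paths can be checked by tracing each test and looking for every DU path as a subpath. The sketch below assumes the node numbering 1 = entry, 2 = `q = 2`, 3 = while test, 4 = `q := q+1`, 5 = if test, 6 and 7 = the two prints, 8 = exit; `q < sqrt p` is evaluated as `q*q < p` to stay in integers. Under this numbering, tracing p=11 gives the path 1234343578.

```python
def smallest_trace(p):
    """Run the 'smallest' program on p > 2; return (node path, printed text)."""
    path, q = "12", 2
    while True:
        path += "3"                       # while predicate
        if p % q > 0 and q * q < p:       # q < sqrt p, in integer form
            path += "4"
            q += 1
        else:
            break
    path += "5"                           # if predicate
    if p % q == 0:
        path, out = path + "6", f"{q} is factor"
    else:
        path, out = path + "7", f"{p} is prime"
    return path + "8", out


DU_PATHS = ["123", "12343", "1235", "123435", "12357", "1234357",      # p
            "23", "234", "235", "2356", "43", "434", "435", "4356"]    # q


def uncovered(inputs):
    """DU paths that are not a subpath of any traced test path."""
    paths = [smallest_trace(p)[0] for p in inputs]
    return [du for du in DU_PATHS if not any(du in path for path in paths)]


for p in (3, 5, 4, 11, 9):
    print(p, *smallest_trace(p))
print(uncovered([3, 5, 4, 11, 9]), uncovered([3, 5, 4, 9]))
```

Dropping the p=11 test leaves the DU path 434 uncovered, because it needs two iterations of the loop.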
How were test cases found? • Required outcome at each predicate node • Consider all requirements together • Guess a value that will satisfy them • Can we improve on this?
Symbolic Execution • How to find test inputs to exercise a path? • Need certain choice at each predicate node • Give a symbolic value to each variable • Walk the path collecting requirements on symbolic input • Then have a set of inequalities to solve • Example: Find test cases for each path by symbolic execution:
Path 123578
Symbolic state (p, q): initially (X, Y); after q = 2: (X, 2); while (F); if (F)

Conditions                        Candidates
X mod 2 = 0 OR 2 ge sqrt X        X=4,6,8,... or X=3,4
X mod 2 > 0                       X=3,5,7,...

Solutions: X=3

smallest(p)
{ int q = 2;
  while (p mod q > 0 AND q < sqrt p) do
    q := q+1;
  if (p mod q = 0)
    then print(q, ’is factor’)
    else print(p, ’is prime’);
}
Path 12343578
Symbolic state (p, q): initially (X, Y); (X, 2) while (T); (X, 3) while (F); if (F): X is prime

Conditions                        Candidates
X mod 2 > 0                       X=3,5,7,...
2 < sqrt X                        X=5,6,7..
X mod 3 = 0 OR 3 ge sqrt(X)       X=3,6,9.. or X=3,4..9
X mod 3 > 0                       X=4,5,7,8,..

Solutions: X=5,7
Output: 5 is prime, 7 is prime
Path 123568
Symbolic state (p, q): initially (X, Y); (X, 2) while (F); if (T): Y is sm fact

Conditions                        Candidates
X mod 2 = 0 OR 2 ge sqrt X        X=4,6,8,.. or X=3,4
X mod 2 = 0                       X=4,6,8,..

Solutions: X=4,6,8..
Output: 2 is sm fact
Path 12343568
Symbolic state (p, q): initially (X, Y); (X, 2) while (T); (X, 3) while (F); if (T): Y is sm fact

Conditions                        Candidates
X mod 2 > 0                       X=3,5,7..
2 < sqrt X                        X=5,6,7..
X mod 3 = 0 OR 3 ge sqrt(X)       X=3,6,9.. or X=3,4..9
X mod 3 = 0                       X=3,6,9..

Solutions: X=9,15,21..
Path 12343435_8
Symbolic state (p, q): (X, 2) while (T); (X, 3) while (T); (X, 4) while (F); if (_): ???????

Conditions              Candidates        Solutions so far
X mod 2 > 0             X=3,5,7..
2 < sqrt X              X=5,6,7..         5,7,9,11,13..
X mod 3 > 0             X=4,5,7,8..       5,7,11,13,17
3 < sqrt X              X=10,11,12..      11,13,17,19..
X mod 4 = 0             X=4,8,12..        none from this
OR 4 ge sqrt(X)         X=3,4..16         11,13
X mod 4 ? 0             X=.....           must be false

Solutions: X=11,13
Output: 11 is prime, 13 is prime
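The worked paths above can be reproduced by collecting one condition per predicate node and then searching for inputs that satisfy all of them. In the sketch below, brute-force enumeration over a small range stands in for solving the inequalities, which is an assumption made for brevity; the function names are hypothetical, and `q < sqrt X` is written as `q*q < X`.

```python
def path_conditions(loop_iterations, if_outcome):
    """Conditions on the symbolic input X for one path through 'smallest'.

    loop_iterations: how many times the while test is true before it fails
    if_outcome:      required outcome of the final 'p mod q = 0' test
    """
    conditions, q = [], 2
    for _ in range(loop_iterations):
        conditions.append(lambda x, q=q: x % q > 0 and q * q < x)   # while (T)
        q += 1
    conditions.append(lambda x, q=q: not (x % q > 0 and q * q < x)) # while (F)
    if if_outcome:
        conditions.append(lambda x, q=q: x % q == 0)                # if (T)
    else:
        conditions.append(lambda x, q=q: x % q > 0)                 # if (F)
    return conditions


def solve(conditions, candidates):
    """Inputs from `candidates` that satisfy every collected condition."""
    return [x for x in candidates if all(c(x) for c in conditions)]


domain = range(3, 40)                              # precondition: p > 2
print(solve(path_conditions(0, False), domain))    # path 123578
print(solve(path_conditions(1, False), domain))    # path 12343578
print(solve(path_conditions(1, True), domain))     # path 12343568
print(solve(path_conditions(2, False), domain))    # two loop iterations, if (F)
print(solve(path_conditions(2, True), domain))     # two loop iterations, if (T)
```

The last call returns an empty list: with two loop iterations the if test cannot be true, so that path is infeasible.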
Difficulties with Symbolic Execution • Generally, many paths are not feasible • Conditions can become complex: • when complex expressions on rhs of assignments • then program variables are complex expressions in terms of the symbolic vars • Sets of conditions can be computationally complex to solve
Possible Solutions • Computational Complexity: • Use numerical methods to calculate the tests • Straight line equivalents • Program Instrumentation • Adaptive testing (later) • Complex predicates • Condition/Decision strategies (later) • Many Infeasible paths • Adaptive testing (later)
Straight Line equivalents • Construct the “straight line” program corresponding to the path required. • replace predicates with path constraints • a real valued expression which records the requirement as a minimisation • Solve the path constraints using numerical methods
Path Constraints • Eg. if(x = y) is replaced by c1:= abs(x-y) • and if(x>y) is replaced by c2 := x-y • Then we must minimise the ci • Can use numerical methods to do this
Program instrumentation • generally - a method to allow testing of a unit in place by augmenting program • Here - add function calls which record value of key variables • replace predicates with calls which guarantee correct path is taken • run program to generate conditions • Again use numerical methods to solve
Conditions and Decisions • Above strategies do not take account of predicates with more than one conjunct • There are more strategies which distinguish • Conditions - the individual clauses of predicate, from • Decisions - the outcome of evaluating the whole predicate
Condition Coverage • Achieve all possible combinations of simple Boolean conditions at each decision node • In critical real-time applications over half of statements may be Boolean expressions • Several variants of strategies which account for individual conditions
Example Condition Strategies • Decision coverage (DC) • every decision tested in each possible outcome • Condition/Decision coverage (C/DC) • as above plus, every condition in each decision tested in each possible outcome • Modified Condition/Decision (MC/DC) • as above plus, every condition shown to independently affect a decision outcome (by varying that condition only) • Multiple-condition coverage (M-CC) • all possible combinations of conditions within each decision taken
Modified Condition/Decision Coverage • Multiple-condition coverage is strongest but grows exponentially in # conditions • Modified C/D is linear like C/D • Eg. For A and B • (T,T) required to exercise decision true • (F,T) required for independence of A • (T,F) required for independence of B • (F,F) not required • MC/DC (among others) is required for flight-critical commercial avionics software
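The independence requirement can be made concrete by searching for pairs of test vectors that differ in exactly one condition and change the decision outcome. This is a minimal sketch; `mcdc_pairs` is a hypothetical helper, and it enumerates all combinations, which is only practical for small decisions.

```python
from itertools import product


def mcdc_pairs(decision, n):
    """For each condition index, list the pairs of inputs that differ only in
    that condition and produce different decision outcomes."""
    pairs = {i: [] for i in range(n)}
    for values in product([True, False], repeat=n):
        for i in range(n):
            if values[i]:                       # start from the True side only
                flipped = values[:i] + (False,) + values[i + 1:]
                if decision(*values) != decision(*flipped):
                    pairs[i].append((values, flipped))
    return pairs


pairs = mcdc_pairs(lambda a, b: a and b, 2)
required = {test for found in pairs.values() for pair in found for test in pair}

print(pairs)
print(sorted(required, reverse=True))
```

For A and B the only independence pairs are (T,T)/(F,T) for A and (T,T)/(T,F) for B, so (F,F) is never needed.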
Further Problems with Symb. Ex. • When loop conditions are input dependent • When array indices are input dependent • When external functions are called
Adaptive Testing The above approach has been in 4 stages: 1) Construct the control flow graph • a parsing problem - automatable • can also add “instrumentation” here 2) Choose the test paths • According to some test strategy • CFG - possibly with data flow considerations
Four stages (cont.) 3) Choose the test cases • by symbolic execution and simultaneous ineqs • or by backwards substitution • can reveal Infeasible paths requiring reverting to stage 2. 4) Execute the test cases • Only now do we execute the program • Adaptive testing merges stages 2), 3) and 4)
Problems with 4-stage approach • Infeasible paths (stage 3) require selection of new paths (return to stage 2) • Computational complexity of test case selection Adaptive testing develops test cases one at a time and uses result of previous test case execution to help select next test case
Inductive Strategies • Choose first test input x1 (perhaps at random) • Execute test and record path taken, p1 • Say k-1 tests have been done giving {(x1,p1),...(xk-1,pk-1)} • use some strategy to select xk Several such strategies exist.
Diagonalisation Important “method” in Mathematics: • Cantor’s uncountability of Reals • Gödel’s Incompleteness • Undecidability of Halting problem
For a list of lists, find a new list by choosing an element different from each on the diagonal:
A11, A12, A13, ...
A21, A22, A23, ...
A31, A32, A33, ...
...
New = B1, B2, B3, ... where B1 ≠ A11, B2 ≠ A22, B3 ≠ A33, ...
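The diagonal construction is easiest to see on lists of bits, where "different from the diagonal element" simply means flipping it. The table below is a made-up example and `diagonal_new` is a hypothetical name.

```python
def diagonal_new(lists):
    """Build a list whose i-th element differs from the i-th element of list i."""
    return [1 - row[i] for i, row in enumerate(lists)]


table = [[0, 1, 0],
         [1, 1, 0],
         [0, 0, 1]]
new = diagonal_new(table)

print(new, new in table)
```

The new list cannot equal any row, because it disagrees with row i at position i.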
Diagonalisation (2) • Each path pi gives a conjunctive predicate Pi • These predicates characterise a set of non-overlapping subdomains of the input space • We must find a new input xk not in any Pi • Let Pi be conjunction of Ci,1,Ci,2,...Ci,ki • For each i, choose xk to violate some Ci,j • eg. xk not in Ci,i
Path Prefix Strategy [Prather and Myers, IEEE Trans. SE-13(7) 1987] For Branch coverage • For a path p, define its reversible prefix q • the initial portion of p to the first decision node where the branches are not yet fully covered • A reversal of p is then any path with same reversible prefix but then a different continuation
Path Prefix Strategy (2) • Choose first input in some way and execute to give first path, p1 • Given p1,...,pk-1, let pi be path with shortest reversible prefix • Choose next input to give a reversal of pi • Execute and add the new path to set of paths
Path Prefix: earlier example • Choose first input p = 3 (say) • execution gives path p1 = 123578 • Reversible prefix = 123, Reversal = 1234.... • Deduce second input, p = 5 • execution gives path p2 = 12343578 • reversible prefix 123435 • path p1 also now has reversible prefix 1235 • choose the shorter, that of p1: Reversal = 12356 • Deduce 3rd input, p = 4 • execution gives path p3 = 123568 • All branches covered
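The walk-through above can be replayed in code. The sketch below is a simplified reading of the strategy, not the published algorithm: node numbering is assumed as in the CFG (3 and 5 are the decision nodes), and the step "deduce the next input" is replaced by a brute-force search over candidate inputs.

```python
def trace(p):
    """Node path of 'smallest' for input p > 2 (node numbering assumed)."""
    path, q = [1, 2], 2
    while True:
        path.append(3)
        if p % q > 0 and q * q < p:      # q < sqrt p, in integer form
            path.append(4)
            q += 1
        else:
            break
    path += [5, 6 if p % q == 0 else 7, 8]
    return tuple(path)


DECISIONS = {3: (4, 5), 5: (6, 7)}       # decision node -> its two successors


def reversible_prefix(path, covered):
    """Prefix up to the first decision node with an uncovered branch, or None."""
    for i, node in enumerate(path[:-1]):
        if node in DECISIONS and any((node, s) not in covered
                                     for s in DECISIONS[node]):
            return path[:i + 1]
    return None


def path_prefix_strategy(first_input, candidates):
    inputs, paths, covered = [], [], set()
    x = first_input
    while x is not None:
        path = trace(x)
        inputs.append(x)
        paths.append(path)
        covered |= {(a, b) for a, b in zip(path, path[1:]) if a in DECISIONS}
        options = [(reversible_prefix(p, covered), p) for p in paths]
        options = [(pre, p) for pre, p in options if pre is not None]
        if not options:
            break                        # all branches covered
        prefix, owner = min(options, key=lambda t: len(t[0]))
        n, taken = len(prefix), owner[len(prefix)]
        # Brute-force stand-in for the inversion problem: find a reversal.
        x = next((c for c in candidates
                  if trace(c)[:n] == prefix and trace(c)[n] != taken), None)
    return inputs, paths


inputs, paths = path_prefix_strategy(3, range(3, 50))
print(inputs)
print(paths)
```

Starting from p = 3, this search picks 5 and then 4, matching the three tests in the example.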
Problems with Path prefix • Still need to deduce input for new path • the inversion problem (later) • Still may get infeasible paths • absolute infeasibility - a path can never be executed • relative infeasibility - a path cannot be the continuation of any of the current reversible prefixes
Example of relative infeasibility
Conditionals in sequence:
simple(bool x, y)
  1: if (x = true) then S1 else S2;
  2: if (x xor y = true) then S3 else S4;
  3: if (x and y = true) then S5 else S6;
• in1 = (false,false), p1 = F,F,F
• reverse at 1 gives: in2 = (true,false), p2 = T,T,F
• reverse p1 at 2 gives F,F,T - infeasible
• reverse p2 at 2 gives T,T,T - infeasible
• but T,F,T is feasible, eg in3 = (true,true)
- # paths to node grows exponentially
- # previous nodes grows linearly
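Because `simple` has only two Boolean inputs, its feasible paths can be listed exhaustively. The sketch below represents a path as the triple of outcomes of the three conditionals; the function name `simple_path` is a hypothetical choice.

```python
from itertools import product


def simple_path(x, y):
    """Outcomes of the three conditionals of 'simple' for inputs (x, y)."""
    return (x, x != y, x and y)          # x != y is xor on Booleans


feasible = {simple_path(x, y): (x, y)
            for x, y in product([False, True], repeat=2)}

for path, inputs in feasible.items():
    print(path, "from", inputs)
print((False, False, True) in feasible, (True, True, True) in feasible)
```

Only four of the eight outcome triples occur, which is why both reversals of the last decision fail while T,F,T succeeds.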
The Inversion Problem • How do we find the input which reverses the decision at Pk? [Figure: input domain D restricted by P1&...&Pk-1, with x in the region where Pk holds and x’ in the region where not Pk holds]
The Inversion Problem (2) • Need to find x’ given x • Done by Back Substitution • execute with x recording all states for prefix • pick change of a variable to change Pk • substitute back through program logic to calculate required input • same as for 4 step approach but with actual values • For real-valued conditions can use grad(Pk) to cross boundary via normal