This presentation by Xia Cheng discusses a cutting-edge system designed to automate the generation of test data and the symbolic execution of programs. As the demand for automated testing grows due to the limitations of traditional approaches reliant on programmer intuition, this novel system addresses these challenges. It generates test data based on constraints, detects non-executable paths, and identifies program errors, improving the testing process’s reliability and efficiency. By employing symbolic execution and advanced constraint solving techniques, the system ultimately aids in better program validation and quality assurance.
A System to Generate Test Data and Symbolically Execute Programs • Lori A. Clarke • Presented by: Xia Cheng
Motivation • Program testing, and the need for automated systems to aid in this process, is growing in importance • The usual testing approach has limitations • A novel system that generates test data and symbolically executes programs contributes to this area
Limitations of the Usual Approach • Relies solely on the intuition of the programmer • Creation of program assertions • Human interaction • Results may be questionable • Flaws in the assertions or limitations in the theorem prover, human or machine
Goal of this work • Implement a system to aid in the selection of test data and the detection of program errors • Techniques used: • Symbolically execute programs • Generate test data as a solution to the set of path constraints
System Capabilities • Generates test data to drive execution down a program path • Detects nonexecutable program paths • Creates symbolic representations of the program’s output variables as functions of the program’s input variables. • Detects certain types of program errors.
Generating Test Data • Problem 1: • Finding input that executes an arbitrary specified statement in a program is analogous to the halting problem • Solution • Analyze only paths restricted to a maximum loop count or number of statements • The system requires that paths be completely specified, but leaves the criteria for path selection to the user
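To illustrate why a loop bound makes the problem tractable, here is a minimal Python sketch that enumerates the paths of a small control flow graph when no node may be visited more than a fixed number of times. The system described here leaves path selection to the user, so this enumeration is only an illustration. The function name `bounded_paths`, the graph `cfg`, and the bound `max_visits` are hypothetical.

```python
from collections import Counter

def bounded_paths(cfg, entry, exit_node, max_visits):
    """Enumerate entry-to-exit paths visiting no node more than max_visits times."""
    paths = []

    def walk(node, path, visits):
        visits[node] += 1
        path.append(node)
        if node == exit_node:
            paths.append(list(path))
        else:
            for successor in cfg.get(node, []):
                # The bound is what keeps a loop from producing endless paths.
                if visits[successor] < max_visits:
                    walk(successor, path, visits)
        path.pop()
        visits[node] -= 1

    walk(entry, [], Counter())
    return paths

# Hypothetical graph: node 2 is a loop header, node 3 the loop body, node 4 the exit.
cfg = {1: [2], 2: [3, 4], 3: [2], 4: []}

print(bounded_paths(cfg, 1, 4, 1))  # [[1, 2, 4]]
print(bounded_paths(cfg, 1, 4, 2))  # [[1, 2, 3, 2, 4], [1, 2, 4]]
```

Without the bound, the loop between nodes 2 and 3 would yield an unbounded number of paths; with it, the set is finite and each path can be analyzed individually.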
Generating Test Data • Problem 2 • Satisfying the conditional statements on a path requires that the system be capable of solving arbitrary systems of inequalities • Solution • In practice the inequalities are usually relatively simple and often linear • Conjugate gradient method
Generating Test Data • Problem 3 • Array subscripts may depend on input data, e.g. A(1)=10, A(2)=0, IF(A(J).LT.5.)… • Solution • Ignore input and output statements except for the read and write variable lists
System Overview • The subject program is represented by a directed graph called the control flow graph • To generate test data for a control path, the relationships among the variables must be determined • To generate the constraints, the path is symbolically executed • Whenever a conditional transfer of control is encountered, one or more constraints are generated
Generating Test Data • A solution to the set of constraints is test data that will drive execution down the given path • If the set of constraints is inconsistent, then the given path is nonexecutable • Artificial constraints are temporarily created to increase the chance of detecting common programming errors
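The relationship between constraints and test data can be sketched in Python as follows. This is not the system's inequality solver: it is a bounded exhaustive search used as a stand-in, and the name `find_test_data` and the search range are hypothetical. The two constraint sets are the ones from the SUB example in this presentation. Note that finding no solution inside the search box does not prove inconsistency in general; a real solver is needed for that.

```python
from itertools import product

def find_test_data(constraints, names, low=-5, high=5):
    """Return the first integer assignment in [low, high] that satisfies
    every constraint, or None if the search box contains no solution."""
    for values in product(range(low, high + 1), repeat=len(names)):
        env = dict(zip(names, values))
        if all(check(env) for check in constraints):
            return env
    return None

# Constraints of the consistent path 1-5, 7, 9 of SUB.
consistent = [
    lambda e: e["I1"] + 1 <= e["I2"],
    lambda e: e["I2"] - (e["I1"] + 1) > -1,
]

# Constraints of the inconsistent path 1-3, 6-9 of SUB.
inconsistent = [
    lambda e: e["I1"] + 1 > e["I2"],
    lambda e: e["I1"] + 1 - e["I2"] <= -1,
]

print(find_test_data(consistent, ["I1", "I2"]))    # {'I1': -5, 'I2': -4}
print(find_test_data(inconsistent, ["I1", "I2"]))  # None
```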
Symbolic execution • Variables are not assigned actual values but expressions denoting the evolution of the variables • Each constraint is passed to an inequality solver to check its consistency • For example
Symbolic execution • Input values: I1 -> J, I2 -> K
      SUBROUTINE SUB(J,K)
      J = J + 1
      IF (J.GT.K) GO TO 10
      J = K - J
      GO TO 20
   10 J = J - K
   20 IF (J.GT.-1) GO TO 30
      J = -J
   30 RETURN
      END
Symbolic execution • Control path (consistent): 1-5, 7, 9 • Constraint: I1+1 <= I2 • Assignment: J = I2-I1-1 • Constraint: I2-(I1+1) > -1
Symbolic execution • Control path (inconsistent): 1-3, 6-9 • Constraint: I1+1 > I2 • Assignment: J = I1+1-I2 • Constraint: I1+1-I2 <= -1 • The first constraint requires I1+1-I2 > 0, which contradicts the last one, so the path is nonexecutable
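The two walks through SUB can be reproduced with a small Python sketch. It is hard-coded for this one subroutine and is not a general interpreter. Statement numbers 1 to 9 follow the control paths quoted on the slides (1 is the SUBROUTINE line, 9 is RETURN). A symbolic value a*I1 + b*I2 + c is stored as the tuple (a, b, c), and each constraint is a pair (expression, operator) meaning "expression operator 0". These representations and the names `execute_path` and `satisfied` are assumptions made for illustration.

```python
ONE = (0, 0, 1)

def add(x, y):
    return tuple(p + q for p, q in zip(x, y))

def neg(x):
    return tuple(-p for p in x)

def execute_path(path):
    """Symbolically execute SUB along `path`, a list of statement numbers."""
    j, k = (1, 0, 0), (0, 1, 0)           # on entry J = I1 and K = I2
    constraints = []
    for position, stmt in enumerate(path):
        following = path[position + 1] if position + 1 < len(path) else None
        if stmt == 2:                      # J = J + 1
            j = add(j, ONE)
        elif stmt == 3:                    # IF (J.GT.K) GO TO 10
            taken = following == 6
            constraints.append((add(j, neg(k)), ">" if taken else "<="))
        elif stmt == 4:                    # J = K - J
            j = add(k, neg(j))
        elif stmt == 6:                    # 10 J = J - K
            j = add(j, neg(k))
        elif stmt == 7:                    # 20 IF (J.GT.-1) GO TO 30
            taken = following == 9
            constraints.append((add(j, ONE), ">" if taken else "<="))
        elif stmt == 8:                    # J = -J
            j = neg(j)
    return j, constraints

def satisfied(constraints, i1, i2):
    """Check concrete inputs against a list of path constraints."""
    for (a, b, c), op in constraints:
        value = a * i1 + b * i2 + c
        if not (value > 0 if op == ">" else value <= 0):
            return False
    return True

j_a, constraints_a = execute_path([1, 2, 3, 4, 5, 7, 9])
j_b, constraints_b = execute_path([1, 2, 3, 6, 7, 8, 9])
print(j_a, constraints_a)  # (-1, 1, -1) [((1, -1, 1), '<='), ((-1, 1, 0), '>')]
print(j_b, constraints_b)  # (-1, 1, -1) [((1, -1, 1), '>'), ((1, -1, 2), '<=')]
```

For the first path, `(1, -1, 1) <= 0` is I1+1 <= I2 and `(-1, 1, 0) > 0` is I2-(I1+1) > -1, matching the slides. For the second path, the two constraints require I1-I2 >= 0 and I1-I2 <= -2 at the same time, so no input satisfies them.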
Structure of the Analysis Program • The system consists of a preprocessor, symbolic execution, constraint simplification, and an inequality solver • Preprocessor
Preprocessor • Built on a data flow analysis program, DAVE • DAVE translates the subject program into a list of tokens • DAVE creates a database of information about each program unit • Symbol table, COMMON table, label table, statement flow table
Intermediate Code Phase • What does this phase do? • Before the subject program is analyzed the token list is translated into an intermediate code similar to an assembly language • The intermediate code for each statement is stored in a doubly linked list that is attached to the corresponding node of the control flow graph • Intermediate code representing a conditional statement is attached to the corresponding edge of the graph.
Intermediate Code Phase • For example
Intermediate Code Phase • Advantages • Allows the analysis to be more easily adapted to other languages • Makes it easy to fold constants and simplify the variable representation during analysis • Enables future optimization and detection of parallelism in the code
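As an illustration of why an assembly-like intermediate form makes constant folding easy, here is a minimal Python sketch. The instruction format `(op, dest, a, b)` and the opcodes `MOV`, `ADD`, `SUB`, `MUL` are hypothetical, not the system's actual intermediate code, and the sketch handles straight-line code only.

```python
import operator

OPS = {"ADD": operator.add, "SUB": operator.sub, "MUL": operator.mul}

def fold_constants(code):
    """Replace operations whose operands are known constants with MOV."""
    known = {}     # variable name -> constant value it currently holds
    folded = []
    for op, dest, a, b in code:
        # Substitute constants already known for variable operands.
        a = known.get(a, a) if isinstance(a, str) else a
        b = known.get(b, b) if isinstance(b, str) else b
        if op == "MOV" and isinstance(a, int):
            value = a
        elif op in OPS and isinstance(a, int) and isinstance(b, int):
            value = OPS[op](a, b)
        else:
            value = None
        if value is None:
            known.pop(dest, None)          # dest no longer holds a known constant
            folded.append((op, dest, a, b))
        else:
            known[dest] = value
            folded.append(("MOV", dest, value, None))
    return folded

code = [
    ("MOV", "T1", 2, None),
    ("MUL", "T2", "T1", 3),
    ("ADD", "J", "J", "T2"),
]
print(fold_constants(code))
# [('MOV', 'T1', 2, None), ('MOV', 'T2', 6, None), ('ADD', 'J', 'J', 6)]
```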
Path Selection • Static selection • A path is designated by a sequence of subprogram names, statement numbers, and loop counts. • Each path must satisfy the conditions: • It must be a control path • It can enter or return from a subprogram only when the corresponding code contains a procedure reference or return • Whenever a path enters a program unit the initial statement must be the first executable statement in the program unit • For example
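The first and third conditions can be checked mechanically. The Python sketch below covers a single program unit only and ignores subprogram entry and return. The graph is a hand-built control flow graph for the SUB example, and the name `is_control_path` is hypothetical.

```python
def is_control_path(cfg, entry, path):
    """True if `path` starts at the unit's first statement and every
    consecutive pair of statements is an edge of the control flow graph."""
    if not path or path[0] != entry:
        return False
    return all(b in cfg.get(a, ()) for a, b in zip(path, path[1:]))

# Control flow graph of SUB, using the statement numbers 1-9 from the slides.
sub_cfg = {1: [2], 2: [3], 3: [4, 6], 4: [5], 5: [7], 6: [7], 7: [8, 9], 8: [9], 9: []}

print(is_control_path(sub_cfg, 1, [1, 2, 3, 4, 5, 7, 9]))  # True
print(is_control_path(sub_cfg, 1, [1, 2, 3, 6, 7, 8, 9]))  # True
print(is_control_path(sub_cfg, 1, [1, 2, 3, 5, 7, 9]))     # False
```

The second path is a valid control path even though its constraints are inconsistent: being a control path is a property of the graph alone, while executability is decided later by the inequality solver.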
Path Selection • Interactive selection (human oriented) • The user designates the starting subprogram unit • The user chooses one of the exit nodes