PRECIS (PREdicate Clustering for Invariant Synthesis) generates program invariants by using program path information to guide the statistical analysis of dynamic data. Unlike purely dynamic tools, it keeps the context that path information provides; unlike purely symbolic approaches, it resolves pointer aliasing; and it scales better than static approaches because it avoids expensive tasks such as symbolic execution and theorem proving. This presentation outlines the algorithm's flow (instrumenting the program, inferring output functions by regression, and clustering paths to synthesize invariants) and reports invariant coverage on benchmark programs.
PRECIS: Inferring Invariants Using Program Path Guided Clustering
Parth Sagdeo, Viraj Athavale, Sumant Kowshik, Shobha Vasudevan
Coordinated Sciences Laboratory, University of Illinois at Urbana-Champaign

Introduction
PRECIS (PREdicate Clustering for Invariant Synthesis) is an invariant generation technique that uses program path information to guide the statistical analysis of dynamic data. In contrast to existing approaches, PRECIS:
• Gains context lacking in other dynamic tools such as DAIKON by using statically generated program path information.
• Resolves pointer aliasing, which is not possible through purely symbolic approaches.
• Remains more scalable than static approaches by not performing expensive tasks such as symbolic execution or theorem proving.

[Figure: Tool Flow. Stages: Data generation, Predicate Clustering, Invariant generation. Artifacts: Program source, Test cases, Trace data, Clusters with output functions, Invariants.]

The PRECIS Algorithm
The major steps in the PRECIS flow are to:
• Instrument the program to capture inputs, target outputs and path conditions.
• Use a regression strategy to infer, for each output, a linear relationship from the function's inputs to that output.
• Cluster program paths to generate succinct invariants.
  • Paths are examined as candidates for clustering with neighboring groups.
  • Among the neighboring groups, those that represent a common input-output behavior are clustered.
  • At the end of this process, each cluster represents a single input-output invariant common to all paths in the cluster.

Results
[Table: Summary of benchmark programs and PRECIS invariant coverage.]
[Figure: A graph of the dependence of invariant quality (path coverage) on the test suite size.]
PRECIS converges to 100% path coverage quickly, within 100 to 2000 test executions.
[Figure: DAIKON and PRECIS invariants, classified by rank.]
• Rank 0: True for the given trace data, but do not hold for the program in general.
• Rank 1: Provide useful information about outputs, but do not express outputs as an exact function of other variables.
• Rank 2: Express an output as an exact function of inputs, but hold only for a subset of program paths.
• Rank 3: Express the output function completely.

Applications
The invariants generated by PRECIS can be used for a variety of purposes:
• Program Correctness
• Developer Understanding
• Documentation

Experiments
Invariants generated for the Siemens benchmarks:
• replace.c: A string replacement and regex library
• schedule.c: A scheduling program
• BST.java: A binary search tree implementation
Each demonstrates PRECIS as a powerful approach to inferring a large number of useful invariants.

Example
Example Source Code: Computes the min and max of three ints. All inputs (a, b, c), predicates (p0, p1, p2, p3), and outputs (min, max) are instrumented.

    int a, b, c;
    int min = a, max = a;
    if (min > b)        //p0
        min = b;
    else if (max < b)   //p1
        max = b;
    if (min > c)        //p2
        min = c;
    else if (max < c)   //p3
        max = c;

Example Trace Data: The result after running a number of test vectors. The tuples are grouped by their predicate word (1X01, 1X00, and 0100).
Predicate Groups After Regression: The result after a linear regression is performed on the predicate groups with enough support: 1X01 and 0100.
Predicate Clusters after Merging: The two predicate clusters (in blue and purple) after the merging of predicate groups. Each represents a single invariant.
Resulting Invariants: The invariants generated for the two predicate clusters above. Each is of the form: (predicate word) => (output = linear combination of inputs)
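To make the instrumentation step concrete, here is a minimal sketch of how the min/max example could record one trace tuple per execution. It is an illustration under stated assumptions, not the PRECIS tool's actual instrumentation: the names `TraceTuple` and `trace_min_max` are hypothetical, and encoding an unevaluated predicate as 'X' is assumed from the predicate words (1X01, 1X00, 0100) shown in the example.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical trace record: inputs, predicate word, and outputs. */
typedef struct {
    int a, b, c;     /* inputs */
    char word[5];    /* p0..p3 as '0', '1', or 'X' (predicate not evaluated) */
    int min, max;    /* outputs */
} TraceTuple;

/* Instrumented version of the example: same control flow as the original
 * code, but each predicate's outcome is written into the predicate word. */
TraceTuple trace_min_max(int a, int b, int c)
{
    TraceTuple t = { a, b, c, "XXXX", a, a };

    t.word[0] = (t.min > b) ? '1' : '0';        /* p0 */
    if (t.word[0] == '1') {
        t.min = b;
    } else {
        t.word[1] = (t.max < b) ? '1' : '0';    /* p1, only reached if p0 is false */
        if (t.word[1] == '1')
            t.max = b;
    }

    t.word[2] = (t.min > c) ? '1' : '0';        /* p2 */
    if (t.word[2] == '1') {
        t.min = c;
    } else {
        t.word[3] = (t.max < c) ? '1' : '0';    /* p3, only reached if p2 is false */
        if (t.word[3] == '1')
            t.max = c;
    }

    assert(t.min <= t.max);   /* sanity check on the outputs */
    return t;
}
```

Calling `trace_min_max` over a set of test vectors and grouping the returned tuples by their `word` field would give predicate groups of the kind described under Example Trace Data.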
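The regression and merging steps can be sketched in the same spirit. The code below fits one output as a linear function of the inputs (a, b, c) by ordinary least squares, then provides helpers to decide whether a neighboring predicate group shares that function. Several details are assumptions rather than facts from the poster: treating two predicate words as "neighbors" when they differ in exactly one position, deciding "common input-output behavior" by checking that the fitted function predicts every tuple of the other group, and all function names (`fit_linear`, `predicts_all`, `is_neighbor`, `merge_words`).

```c
#include <assert.h>
#include <string.h>

#define NIN 3              /* inputs a, b, c */
#define NCOEF (NIN + 1)    /* intercept plus one coefficient per input */

static double absd(double x) { return x < 0 ? -x : x; }

/* Solve M x = v by Gaussian elimination with partial pivoting.
 * Returns 0 on success, -1 if the system is (numerically) singular. */
static int solve(double M[NCOEF][NCOEF], double v[NCOEF], double x[NCOEF])
{
    for (int col = 0; col < NCOEF; col++) {
        int piv = col;
        for (int r = col + 1; r < NCOEF; r++)
            if (absd(M[r][col]) > absd(M[piv][col]))
                piv = r;
        if (absd(M[piv][col]) < 1e-9)
            return -1;
        for (int j = 0; j < NCOEF; j++) {
            double tmp = M[col][j]; M[col][j] = M[piv][j]; M[piv][j] = tmp;
        }
        double tv = v[col]; v[col] = v[piv]; v[piv] = tv;
        for (int r = col + 1; r < NCOEF; r++) {
            double f = M[r][col] / M[col][col];
            for (int j = col; j < NCOEF; j++)
                M[r][j] -= f * M[col][j];
            v[r] -= f * v[col];
        }
    }
    for (int i = NCOEF - 1; i >= 0; i--) {   /* back substitution */
        double s = v[i];
        for (int j = i + 1; j < NCOEF; j++)
            s -= M[i][j] * x[j];
        x[i] = s / M[i][i];
    }
    return 0;
}

/* Least-squares fit of out ~ w[0] + w[1]*a + w[2]*b + w[3]*c over one
 * predicate group, via the normal equations. Returns -1 if the group's
 * data does not determine the coefficients (too little support). */
int fit_linear(int in[][NIN], const int out[], int n, double w[NCOEF])
{
    double M[NCOEF][NCOEF] = {{0}};
    double v[NCOEF] = {0};
    assert(n > 0);
    for (int k = 0; k < n; k++) {
        double row[NCOEF];
        row[0] = 1.0;
        for (int i = 0; i < NIN; i++)
            row[i + 1] = in[k][i];
        for (int i = 0; i < NCOEF; i++) {
            v[i] += row[i] * out[k];
            for (int j = 0; j < NCOEF; j++)
                M[i][j] += row[i] * row[j];
        }
    }
    return solve(M, v, w);
}

/* Returns 1 if the fitted function reproduces every tuple of a group. */
int predicts_all(const double w[NCOEF], int in[][NIN], const int out[], int n)
{
    for (int k = 0; k < n; k++) {
        double y = w[0];
        for (int i = 0; i < NIN; i++)
            y += w[i + 1] * in[k][i];
        if (absd(y - out[k]) > 1e-6)
            return 0;
    }
    return 1;
}

/* Assumed neighbor relation: words of equal length differing in one position. */
int is_neighbor(const char *w1, const char *w2)
{
    size_t n = strlen(w1);
    int diff = 0;
    if (strlen(w2) != n)
        return 0;
    for (size_t i = 0; i < n; i++)
        if (w1[i] != w2[i])
            diff++;
    return diff == 1;
}

/* Merge two equal-length words: positions that disagree become 'X'. */
void merge_words(const char *w1, const char *w2, char *merged)
{
    size_t n = strlen(w1);
    for (size_t i = 0; i < n; i++)
        merged[i] = (w1[i] == w2[i]) ? w1[i] : 'X';
    merged[n] = '\0';
}
```

In this sketch, a group with enough tuples (such as 1X01) is fitted with `fit_linear`; a low-support neighbor (such as 1X00) is merged into its cluster only if `is_neighbor` holds and `predicts_all` confirms the same output function, in which case `merge_words` yields the cluster's predicate word. For the output min, that would merge 1X01 and 1X00 while keeping 0100 separate.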