Explore groundbreaking results in program slicing theory for improved accuracy and efficiency. This study presents innovative methodologies for slicing programs, with a focus on data and control flow compression. Learn about the algorithm's two-stage process and enhancements to traditional slicing methods.
New Results in Program Slicing
Aharon Abadi, Ran Ettinger, and Yishai Feldman
IBM Haifa Research Lab
Context • The Programmer’s Apprentice • The Plan Calculus • Bogart • Midas • Sliding • Painless • Paz • Aderet
Improving Slice Accuracy by Compression of Data and Control Flow Paths Presented at ESEC/FSE 2009
Program Slicing [Figure: a program and its slice, each running from Start to the assignment x := exp; at that assignment both produce the same sequence of values]
Control-Flow Path Compression • Work in two stages: • Compute the ‘traditional’ slice (control dependences, data dependences) • Compute the necessary branches to prevent infeasible control paths [Figure: an unstructured assembly fragment (test X; if-zero-go-to A; . . . ; L: test Y; if-zero-go-to B; . . . ; go-to L; A:Z0; . . . ; B:) next to its slice, in which a go-to B replaces the loop at L]
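The first stage can be pictured as a backward closure over dependence edges. The sketch below is only an illustration of that standard idea, not the authors' implementation: the class name `DependenceSlice`, the statement numbering, and the hand-written dependence map are all assumptions made for the example, and the second stage (choosing the necessary branches) is not shown.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class DependenceSlice {
    // deps.get(n) lists the statements that n is data- or control-dependent on
    // (a hypothetical representation chosen for this sketch).
    static Set<Integer> backwardSlice(Map<Integer, List<Integer>> deps, int criterion) {
        Set<Integer> slice = new TreeSet<>();
        Deque<Integer> work = new ArrayDeque<>();
        work.push(criterion);
        while (!work.isEmpty()) {
            int n = work.pop();
            if (slice.add(n)) {                      // first visit: follow its dependences
                for (int d : deps.getOrDefault(n, List.of())) {
                    work.push(d);
                }
            }
        }
        return slice;
    }

    // Hand-written dependences for an assumed toy program:
    // 1: sum = 0   2: prod = 1   3: i = 1   4: while (i <= n)
    // 5: sum += i  6: prod *= i  7: i++     8: print(sum)
    static Map<Integer, List<Integer>> exampleDeps() {
        Map<Integer, List<Integer>> deps = new HashMap<>();
        deps.put(4, List.of(3, 7));
        deps.put(5, List.of(1, 5, 3, 7, 4));
        deps.put(6, List.of(2, 6, 3, 7, 4));
        deps.put(7, List.of(3, 7, 4));
        deps.put(8, List.of(1, 5));
        return deps;
    }

    public static void main(String[] args) {
        // Statements 2 and 6 (the prod computation) do not reach print(sum).
        System.out.println(backwardSlice(exampleDeps(), 8));
    }
}
```

In this toy case the closure from statement 8 leaves out the two statements that only feed `prod`.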
Control-Flow Path Compression • Limitations of previous approaches: • insert the entire loop; • add branches that are not from the program; or • do not preserve behavior • This algorithm: • preserves behavior • yields a sub-program • one version may turn conditional branches into unconditional ones (“rhetorization”) [Figure: the assembly fragment from the previous slide with the slices produced by the different approaches]
Data-Flow Path Compression
Start: R2:=0
       R7:=exp1
Loop:  R2:=R2 + 1
       compare R2, R9
       if-not-less-go-to Out
       use R7
       Temp:=R7          ; spill R7 to memory
       …                 ; code that uses all registers
       R7:=Temp          ; restore R7
       go-to Loop
Out:   R0:=R7 + 1
• The value of R7 does not depend on the loop
• Previous syntax-preserving algorithms insert the loop and the assignments inside it; the result is too large
• Slice for R0 at Out: R7:=exp1; Out: R0:=R7 + 1
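A rough Java simulation of this register example shows why the two-statement slice is enough. It is a sketch under assumptions: the class and method names are invented here, the elided "code that uses all registers" is replaced by a stand-in that clobbers R7, and the `use R7` statement is left out because it does not change any value.

```java
public class SpillExample {
    // Full program: R7 is spilled to Temp and restored on every iteration.
    static int full(int exp1, int r9) {
        int r2 = 0;
        int r7 = exp1;
        while (true) {
            r2 = r2 + 1;
            if (!(r2 < r9)) {
                break;                 // if-not-less-go-to Out
            }
            int temp = r7;             // Temp := R7 (spill)
            r7 = -1;                   // stand-in for code that uses all registers (assumption)
            r7 = temp;                 // R7 := Temp (restore)
        }
        return r7 + 1;                 // Out: R0 := R7 + 1
    }

    // Compressed slice: the copies through Temp carry exp1 unchanged,
    // so neither the loop nor the spill code is needed to compute R0.
    static int slice(int exp1) {
        int r7 = exp1;
        return r7 + 1;
    }

    public static void main(String[] args) {
        System.out.println(full(41, 5) + " " + slice(41));
    }
}
```

Both methods return the same R0 for any loop bound, which is the property the slide's compressed slice relies on.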
Control-Flow Path Compression [Figure: an unstructured example program over x and y (conditional gotos among labels A1 to A4, ending in A2: print(x)) shown next to its control-flow graph]
Compute the ‘Traditional’ Slice [Figure: the same example program and control-flow graph, with the statements of the traditional slice for print(x) marked]
Completing Control Flow Paths: Main Lemma • All paths from the same point in the slice enter the slice at a single point • precisely identifies the possible sets of branches that may be added to the slice • any path in the original program can be chosen • optimizations can be performed
Compute the Necessary Branches [Figure: the same example program and control-flow graph, with the goto statements added to the slice to complete its control-flow paths]
Data-Flow Path Compression [Figure: the register-spill program (Start: R2:=0 through Out: R0:=R7 + 1) shown next to its control-flow graph]
Data-Flow Path Compression • The Plan Calculus: The Programmer’s Apprentice, Rich and Waters, 1990 • R7, Temp carry the value of exp1 • Use data edges instead of variables • An out data port holds the last value; an in data port holds the next value [Figure: the register-spill program next to its plan, with data edges d1 and d2 carrying the values 0 and exp1]
[Figure: plan of the register-spill program from entry to exit, with data edges for R2, R9, R7, and R0 in place of the variables; the statements R7:=exp1 and Out: R0:=R7 + 1 are marked]
Decompression [Figure: the plan from entry to exit converted back to program text, mapping the remaining nodes to R7:=exp1, go-to Out, and Out: R0:=R7 + 1]
Properties of the Slices • Syntax preserving, possibly rhetorizing • Behavior preserving • Executable • For structured programs • At least as accurate as previous algorithms • Strictly smaller in interesting cases • For unstructured programs • Empirically shown to be superior • Modification of the algorithm guaranteed at least as accurate
Implementation • A family of slicing algorithms • rhetorizing (*RB, *RM) • strictly syntax-preserving (*PB, *PM) • amorphous (*AB, *AM) • adds new branches (not from the program) [Figure: example assembly fragments showing added branches such as go-to exit]
Empirical Study • Corpus of 15 manually-written assembly-language modules from a large mainframe product • 8578 non-comment source lines • Computed slices from all lines • 5801 non-empty slices
Related Work • BH: Ball & Horwitz 1993 • CF: Choi & Ferrante 1994 • Ag: Agrawal 1994 • KH: Kumar & Horwitz 2002 • HD: Harman & Danicic 1998 • HLB: Harman, Lakhotia & Binkley 2006
Conclusions • Two techniques for reducing slice size • Control-Flow Path Compression • Precise identification of all correct solutions • Shortest paths significantly improve slice accuracy • 17-22% improvement for 30-37% of the cases • Data-Flow Path Compression • Eliminates copy assignments • Yields significant improvement in a few cases • 24% improvement for 1% of the slices computed • Strictly smaller even for structured programs
Refactoring’s Rubicon: Extract Method • Automating Extract Method is Refactoring’s Rubicon (Fowler*) • The one that demonstrates “serious tool support” • Precondition for many other transformations • This Rubicon has not yet been crossed • Getting it right requires more analysis capability than is available in current IDEs
* http://www.martinfowler.com/articles/refactoringRubicon.html
Fowler’s Example (website)
Before:
void printOwing() {
  printBanner();
  //print details
  System.out.println("name: " + _name);
  System.out.println("amount " + getOutstanding());
}
After:
void printOwing() {
  printBanner();
  printDetails(getOutstanding());
}
void printDetails(double outstanding) {
  System.out.println("name: " + _name);
  System.out.println("amount " + outstanding);
}
A Case Study in Enterprise Refactoring • Converted a Java Servlet to use the MVC pattern* • Used as much automated support as available • The whole conversion could be described as a series of cataloged (“small”) refactorings • Most steps were inadequately supported by the IDE • Some were not supported at all
* Based on Alex Chaffee’s “Refactoring to Model-View-Controller” article (http://www.purpletech.com/articles/mvc/refactoring-to-mvc.html)
Case Study: Automation (1)
Fully Supported Refactorings                    Uses
Extract Method                                  3
Extract Temp                                    3
(Self) Encapsulate Field                        2
Replace Magic Number with Symbolic Constant     1
Inline Temp                                     1
Extract Superclass                              1
Delete Methods                                  1
Move Method                                     1
Total                                           13
Case Study: Automation (2)
Partial (*) or No (**) Support                  Uses
Extract Method *                                10
Substitute Expression **                        5
Replace Temp with Query *                       3
Replace Method with Method Object **            2
Substitute Statement **                         1
Extract Class *                                 1
Move Statement (or Swap Statements) **          1
Total                                           23
Currently Unsupported Cases of Extract Method • Extract multiple fragments • Extract a partial fragment • select sub-expressions as parameters • Extract loop with partial body • loop duplication with data flow • Extract code with conditional exits • Program slicing pulls related code together!
Traditional Slicing / Fine Slicing • slice (v.): to cut with or as if with a knife • slice (n.): a thin flat piece cut from something (Merriam-Webster)
Traditional Slicing • A (backward) slice of a given program with respect to selected “interesting” variables is a subprogram that computes the same values as the original program for the selected variables
Fine Slicing • A (backward) fine slice of a given program with respect to selected “interesting” variables and other “oracle” variables is a subprogram that computes the same values as the original program for the selected variables, given values for the oracle variables
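The two definitions can be compared on a toy program. The following is a hedged illustration only: the program, the class name `FineSliceDemo`, and the choice of `v` as the oracle variable are assumptions for the example, and the oracle is modeled simply as a queue of the values `v` takes in the original run.

```java
import java.util.ArrayDeque;
import java.util.Queue;

public class FineSliceDemo {
    // Original program: computes the sum of squares and counts negatives.
    static int[] original(int[] a) {
        int sum = 0;
        int negatives = 0;
        for (int i = 0; i < a.length; i++) {
            int v = a[i] * a[i];
            sum += v;
            if (a[i] < 0) negatives++;
        }
        return new int[] { sum, negatives };
    }

    // Traditional backward slice on sum: the statements that only affect
    // negatives are gone, but everything sum depends on is kept.
    static int sliceOnSum(int[] a) {
        int sum = 0;
        for (int i = 0; i < a.length; i++) {
            int v = a[i] * a[i];
            sum += v;
        }
        return sum;
    }

    // Fine slice on sum with v as an oracle variable: the computation of v
    // is cut away and its series of values is supplied from outside.
    static int fineSliceOnSum(int n, Queue<Integer> oracleV) {
        int sum = 0;
        for (int i = 0; i < n; i++) {
            sum += oracleV.remove();
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] a = { 3, -4, 5 };
        Queue<Integer> oracle = new ArrayDeque<>();
        for (int x : a) oracle.add(x * x);       // values v takes in the original run
        System.out.println(original(a)[0] + " " + sliceOnSum(a) + " "
                + fineSliceOnSum(a.length, oracle));
    }
}
```

All three versions agree on `sum`; the fine slice does so only because it is handed the values of the oracle variable, which matches the wording "given values for the oracle variables".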
Fine Slicing • A generalization of traditional program slicing • Fine slices can be precisely bounded • Slicing criteria include set of data and control dependences to ignore • Fine slices are executable and extractable • Complement slices (co-slices) are also fine slices • Oracle-based semantics for fine slices • Algorithm for computing data-structure representing the oracle • Forward fine slices are executable, may be slightly larger than traditional forward slices • Confines generalize blocks for unstructured programs
Extract Computation • A new refactoring • Extracts a fine slice into contiguous code • Computes the co-slice • Computation can then be extracted into a separate method using Extract Method • Passes necessary “oracle” variables between slice and co-slice • Generates new containers if series of values need to be passed
(a) Extract multiple fragments
User user = getCurrentUser(request);
if (user == null) {
  response.sendRedirect(LOGIN_PAGE_URL);
  return;
}
response.setContentType("text/html");
disableCache(response);
String albumName = request.getParameter("album");
PrintWriter out = response.getWriter();
(b) Extract a partial fragment
out.println(DOCTYPE_HTML);
out.println("<html>");
out.println("<head>");
out.println("<title>Error</title>");
out.println("</head>");
out.print("<body><p class='error'>");
out.print("Could not load album '" + albumName + "'");
out.println("</p></body>");
out.println("</html>");
(c) Extract loop with partial body
 1  out.println("<table border=0>");
 2  int start = page * 20;
 3  int end = start + 20;
 4  end = Math.min(end,
 5                 album.getPictures().size());
 6  for (int i = start; i < end; i++) {
 7    Picture picture = album.getPicture(i);
 8    printPicture(out, picture);
 9  }
10  out.println("</table>");
  2  int start = page * 20;
  3  int end = start + 20;
  4  end = Math.min(end,
  5                 album.getPictures().size());
***  Queue<Picture> pictures =
***      new LinkedList<Picture>();
  6  for (int i = start; i < end; i++) {
  7    Picture picture = album.getPicture(i);
***    pictures.add(picture);
  9  }
  1  out.println("<table border=0>");
  6  for (int i = start; i < end; i++)
  8    printPicture(out, pictures.remove());
 10  out.println("</table>");
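The transformation above can be checked on a small runnable model. This is a sketch under assumptions: `Picture`, `Album`, and `printPicture` are hypothetical stand-ins written for the example (the servlet's real classes are not shown in the slides), and `display` follows the extracted method that appears later in the deck.

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.Queue;

public class ExtractLoopDemo {
    // Hypothetical stand-ins for the servlet's Picture and Album types.
    static class Picture {
        final String name;
        Picture(String name) { this.name = name; }
    }

    static class Album {
        private final List<Picture> pictures = new ArrayList<>();
        Album(String... names) { for (String n : names) pictures.add(new Picture(n)); }
        List<Picture> getPictures() { return pictures; }
        Picture getPicture(int i) { return pictures.get(i); }
    }

    static void printPicture(PrintWriter out, Picture picture) {
        out.println("<tr><td>" + picture.name + "</td></tr>");
    }

    // Original code: selection and presentation share one loop.
    static void original(PrintWriter out, Album album, int page) {
        out.println("<table border=0>");
        int start = page * 20;
        int end = Math.min(start + 20, album.getPictures().size());
        for (int i = start; i < end; i++) {
            Picture picture = album.getPicture(i);
            printPicture(out, picture);
        }
        out.println("</table>");
    }

    // Co-slice: selects the pictures and passes them on in a container.
    static void extracted(PrintWriter out, Album album, int page) {
        int start = page * 20;
        int end = Math.min(start + 20, album.getPictures().size());
        Queue<Picture> pictures = new LinkedList<>();
        for (int i = start; i < end; i++) {
            pictures.add(album.getPicture(i));
        }
        display(out, start, end, pictures);
    }

    // Fine slice extracted as a method: presentation only.
    static void display(PrintWriter out, int start, int end, Queue<Picture> pictures) {
        out.println("<table border=0>");
        for (int i = start; i < end; i++) {
            printPicture(out, pictures.remove());
        }
        out.println("</table>");
    }

    static String render(boolean useExtracted, Album album, int page) {
        StringWriter sw = new StringWriter();
        PrintWriter out = new PrintWriter(sw);
        if (useExtracted) extracted(out, album, page); else original(out, album, page);
        out.flush();
        return sw.toString();
    }

    public static void main(String[] args) {
        Album album = new Album("a.jpg", "b.jpg", "c.jpg");
        System.out.println(render(false, album, 0).equals(render(true, album, 0)));
    }
}
```

The queue preserves the order in which pictures were selected, so the duplicated loop prints them exactly as the single loop did.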
(d) Extract code with conditional exits
if (album == null) {
  new ErrorPage("Could not load album '" + albumName + "'").printMessage(out);
  return;
}
//...

if (invalidAlbum(album, albumName, out))
  return;
//...

boolean invalidAlbum(Album album, String albumName, PrintWriter out) {
  boolean invalid = album == null;
  if (invalid) {
    new ErrorPage("Could not load album '" + albumName + "'").printMessage(out);
  }
  return invalid;
}
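A self-contained model of this extraction may help to see how the conditional exit is split: the test and the error output move into the new method, while the `return` stays with the caller. The sketch rests on assumptions: `Album` is an empty stand-in, the `ErrorPage` call is replaced by a plain `println`, and the album name is passed as a separate `String` so that the null case can be reported without dereferencing `album`.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

public class ConditionalExitDemo {
    // Hypothetical stand-in for the servlet's Album type.
    static class Album {}

    // Extracted computation: reports the error and tells the caller
    // whether it should exit.
    static boolean invalidAlbum(Album album, String albumName, PrintWriter out) {
        boolean invalid = album == null;
        if (invalid) {
            out.println("Could not load album '" + albumName + "'");
        }
        return invalid;
    }

    // Caller: the conditional exit itself remains here.
    static String handle(Album album, String albumName) {
        StringWriter sw = new StringWriter();
        PrintWriter out = new PrintWriter(sw);
        if (invalidAlbum(album, albumName, out)) {
            out.flush();
            return sw.toString();
        }
        out.println("album ok");                 // stands for the rest of the method
        out.flush();
        return sw.toString();
    }

    public static void main(String[] args) {
        System.out.print(handle(null, "holiday"));
        System.out.print(handle(new Album(), "holiday"));
    }
}
```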
Token Semantics
out.println("<table border=0>");
int start = page * 20;
int end = start + 20;
end = Math.min(end, album.getPictures().size());
for (int i = start; i < end; i++) {
  Picture picture = album.getPicture(i);
  printPicture(out, picture);
}
out.println("</table>");
[Figure: plan of this code from entry to exit, with data edges (page, start, end, album, i, out) and control edges (T/F on i < end) along which tokens flow]
Fine Slicing
out.println("<table border=0>");
int start = page * 20;
int end = start + 20;
end = Math.min(end, album.getPictures().size());
for (int i = start; i < end; i++) {
  Picture picture = album.getPicture(i);
  printPicture(out, picture);
}
out.println("</table>");
[Figure: plan of this code, with the printPicture call singled out]
The Fine Slice
out.println("<table border=0>");
for (int i = start; i < end; i++) {
  printPicture(out, picture);
}
out.println("</table>");
[Figure: plan of the fine slice; start, end, out, and picture enter as inputs]
Co-Slicing
out.println("<table border=0>");
int start = page * 20;
int end = start + 20;
end = Math.min(end, album.getPictures().size());
for (int i = start; i < end; i++) {
  Picture picture = album.getPicture(i);
  printPicture(out, picture);
}
out.println("</table>");
[Figure: plan of this code, with the part that remains outside the fine slice marked]
The Co-Slice
int start = page * 20;
int end = start + 20;
end = Math.min(end, album.getPictures().size());
for (int i = start; i < end; i++) {
  Picture picture = album.getPicture(i);
}
[Figure: plan of the co-slice; start, end, and picture leave as outputs]
Co-slice and fine slice [Figure: the plans of the co-slice and the fine slice side by side; start, end, and picture are passed from the co-slice to the fine slice]
Adding a Container
out.println("<table border=0>");
int start = page * 20;
int end = start + 20;
end = Math.min(end, album.getPictures().size());
Queue<Picture> pictures = new LinkedList<Picture>();
for (int i = start; i < end; i++) {
  Picture picture = album.getPicture(i);
  pictures.add(picture);
  printPicture(out, pictures.remove());
}
out.println("</table>");
[Figure: plan of this code, with new, add, and remove nodes for the pictures container]
The Fine Slice
void display(PrintWriter out, int start, int end, Queue<Picture> pictures) {
  out.println("<table border=0>");
  for (int i = start; i < end; i++) {
    printPicture(out, pictures.remove());
  }
  out.println("</table>");
}
[Figure: plan of the fine slice; out, start, end, and pictures enter as inputs]
Program with Container
out.println("<table border=0>");
int start = page * 20;
int end = start + 20;
end = Math.min(end, album.getPictures().size());
Queue<Picture> pictures = new LinkedList<Picture>();
for (int i = start; i < end; i++) {
  Picture picture = album.getPicture(i);
  pictures.add(picture);
  printPicture(out, pictures.remove());
}
out.println("</table>");
[Figure: plan of the program with the pictures container]
The Co-Slice
int start = page * 20;
int end = start + 20;
end = Math.min(end, album.getPictures().size());
Queue<Picture> pictures = new LinkedList<Picture>();
for (int i = start; i < end; i++) {
  Picture picture = album.getPicture(i);
  pictures.add(picture);
}
display(out, start, end, pictures);
[Figure: plan of the co-slice, ending in the call to display]
Conclusions • Fine slicing algorithm yields executable slices whose boundaries can be precisely controlled • Can be used to make any subset of a program executable by adding some control structures but not the data on which they depend • including forward slices, thin slices, barrier slices, chops, and barrier chops • Conjecture: the size of these executable programs will not be substantially larger
Conclusions • New Extract Computation refactoring is an important step towards the automation of Extract Method in difficult cases • Enables the automation of big refactorings from smaller building blocks • Uses new fine-slicing algorithm • Automatically computes complement slices (co-slices) • Automatically generates containers to pass series of values if necessary