
CS 795: Code Analysis Techniques

Slide Set 1. C. M. Overstreet, Old Dominion University. For CS 488, Fall 2007.


Presentation Transcript


  1. CS 795: Code Analysis Techniques. Slide Set 1. C. M. Overstreet, Old Dominion University. For CS 488, Fall 2007.

  2. Course objectives
     • At the end of the course, you will understand how analysis of source code can be used in several aspects of computing/software engineering:
        • What’s feasible, what isn’t
        • Basic algorithms
        • Basic problems
        • Basic application areas

  3. Lecture overview
     • Some motivation for code analysis
     • Some examples: some easy, some hard, some impossible
     • Some examples of where it is used
     • Assignment -- but not for CS 488

  4. Example 1: elimination of infinite loops
     • Infinite loops are usually undesirable (but often they are not. Why?)
     • Can you write a program that checks another program for infinite loops?
     • Initial idea: just consider while loops:
        • Check the loop body to see if any action modifies some loop control variable.
     • Does this work?
        • If no change, is the loop infinite?
        • What if the change is conditional?
     • Can one always do this? How? What about input data?
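The "initial idea" on this slide can be sketched in a few lines. This is only a heuristic written for illustration, assuming Python source and using the standard `ast` module; the function name `loops_with_unmodified_condition` and the `SAMPLE` program are made up for this example. It flags a while loop when no variable named in its condition is assigned in the body and the body contains no break or return.

```python
import ast

def loops_with_unmodified_condition(source):
    """Return line numbers of while loops whose body never assigns any
    variable named in the loop condition (a heuristic, not a proof)."""
    suspicious = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.While):
            continue
        # Variables read in the loop condition.
        control = {n.id for n in ast.walk(node.test) if isinstance(n, ast.Name)}
        # Variables assigned anywhere in the body (Store context covers
        # plain and augmented assignments).
        assigned = set()
        for stmt in node.body:
            for n in ast.walk(stmt):
                if isinstance(n, ast.Name) and isinstance(n.ctx, ast.Store):
                    assigned.add(n.id)
        # A break or return could still leave the loop, so skip those loops.
        has_exit = any(isinstance(n, (ast.Break, ast.Return))
                       for stmt in node.body for n in ast.walk(stmt))
        if not (control & assigned) and not has_exit:
            suspicious.append(node.lineno)
    return suspicious

# Hypothetical program under analysis (parsed only, never executed).
SAMPLE = """\
i = 0
while i < 10:
    print(i)
n = 5
while n > 0:
    n -= 1
k = 0
while k < 10:
    if k % 2 == 0:
        k += 1
"""

print(loops_with_unmodified_condition(SAMPLE))
```

Only the first loop is reported. The third loop is also infinite (k becomes 1 and is never changed again), but because the change is conditional the heuristic misses it, which is the slide's point about partial answers.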

  5. Example 1 (cont.)
     • Point: often partial solutions can be useful even if a general solution does not exist.
     • Question: What is the Halting Problem and why is it important?

  6. Static & Dynamic Analysis
     • From their beginning, compilers have tried to optimize code.
        • Done by static analysis
        • Figure things out by analyzing source code
     • In UNIX, lint is a very early static code analyzer
     • Many types of useful information about code can only be obtained by running the code and monitoring its behavior.
        • This is dynamic analysis

  7. Example 3: uninitialized variables
     • Uninitialized variables are usually undesirable.
     • Should compilers make them illegal?
     • Can compilers make them illegal?
     • Can you write a program which detects uninitialized variables? Some? All?
     • Always? Sometimes?
     • Using static analysis?
        • i.e., uninitialized vars detected at compile time
     • Using dynamic analysis?
        • i.e., uninitialized vars detected at run time
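To make the static/dynamic contrast concrete, here is a small sketch in Python (chosen for brevity; the slides do not prescribe a language). The function `scale`, the checker `maybe_uninitialized`, and the helper `detected_at_run_time` are all hypothetical names. The static check is a conservative "definitely assigned" analysis that handles only assignments, if/else, and return; the dynamic check simply runs the code and relies on Python raising `UnboundLocalError` when an unassigned local is read.

```python
import ast

# Hypothetical function under analysis: 'factor' is set on one path only.
SCALE_SRC = """\
def scale(value, use_factor):
    if use_factor:
        factor = 2.0
    return value * factor
"""

def maybe_uninitialized(func_source):
    """Static check: names read before being assigned on every path.
    Parameters count as assigned; global and builtin names are not modeled."""
    func = ast.parse(func_source).body[0]
    flagged = set()

    def reads(expr):
        return {n.id for n in ast.walk(expr)
                if isinstance(n, ast.Name) and isinstance(n.ctx, ast.Load)}

    def run(stmts, defined):
        for s in stmts:
            if isinstance(s, ast.Assign):
                flagged.update(reads(s.value) - defined)
                defined = defined | {t.id for t in s.targets
                                     if isinstance(t, ast.Name)}
            elif isinstance(s, ast.If):
                flagged.update(reads(s.test) - defined)
                # Keep only names assigned on both branches.
                defined = run(s.body, defined) & run(s.orelse, defined)
            elif isinstance(s, ast.Return) and s.value is not None:
                flagged.update(reads(s.value) - defined)
        return defined

    run(func.body, {a.arg for a in func.args.args})
    return flagged

# Build the real function from the same source for the dynamic check.
namespace = {}
exec(SCALE_SRC, namespace)
scale = namespace["scale"]

def detected_at_run_time(use_factor):
    """Dynamic check: run the code and observe whether the read fails."""
    try:
        scale(1.0, use_factor)
    except UnboundLocalError:
        return True
    return False
```

The static check reports `factor` without running anything, even for callers that always pass `use_factor=True`. The dynamic check reports a problem only on the runs that actually take the uninitialized path.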

  8. Key application area: Compiler optimization
     • Oldest application area
     • Crucial concern of compiler writers when the first compiler was written
        • What was the first compiler? What language did it compile?
        • When?
     • Standard techniques (from 1954) for loops:
        • Eliminate common subexpressions
           • if a*b occurs more than once, don’t evaluate it twice
        • Move code out of the loop if possible
           • j = 3*x + 2*y need not always be recomputed with each loop iteration
        • Make subscript evaluation more efficient
           • do an increment rather than a multiply
     • Usually only worry about loops. Why?
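The three loop techniques can be illustrated by applying them by hand. The two functions below are invented for this sketch (they are not from the course) and are written in Python for readability; a compiler would perform the same rewrites on its intermediate code. The matrix is assumed to be stored as a flat row-major list, so the subscript `r * cols + c` stands in for the address arithmetic a compiler generates.

```python
def row_sums_original(flat, rows, cols, a, b):
    sums = []
    for r in range(rows):
        total = 0
        for c in range(cols):
            scale = a * b + 1                  # invariant, recomputed every time
            # a * b appears again; the subscript needs a multiply.
            total += flat[r * cols + c] * scale + a * b
        sums.append(total)
    return sums

def row_sums_optimized(flat, rows, cols, a, b):
    sums = []
    ab = a * b           # common subexpression, evaluated once
    scale = ab + 1       # invariant code moved out of both loops
    index = 0            # running subscript replaces r * cols + c
    for r in range(rows):
        total = 0
        for c in range(cols):
            total += flat[index] * scale + ab
            index += 1   # increment instead of multiply
        sums.append(total)
    return sums

data = list(range(12))   # a 3 x 4 matrix in row-major order
print(row_sums_original(data, 3, 4, 2, 3) == row_sums_optimized(data, 3, 4, 2, 3))
```

The rewrite is safe here only because nothing in the loops changes `a`, `b`, or `cols`, and the elements are visited in storage order.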

  9. Compiler (cont.)
     • What’s hard about:
        • Common subexpression elimination?
        • Code motion?
        • Subscript simplification?
     • Several things:
        • Need to know nothing else changes the variables of interest
        • Problem significantly reduced if function calls are not present
        • Within an expression (order of expression evaluation?)
        • Within the loop body
        • With (or without) function calls, the alias problem

  10. The Alias Problem - 1
      • For several things we discuss, this is a problem.
      • An alias exists between two or more variables when the same memory location can be accessed with each variable. The variables then are said to be aliased.
         • Multiple names for the same thing/place.
      • Some programming languages explicitly allow the programmer to give two different names to the same memory location. (Why?)
      • Tricky problem; the existence of aliases may depend on the loader rather than the compiler. Why?
      • Some aliases are easily detected, some not.
      • Sometimes code logic relies on aliases.

  11. The Alias Problem - 2
      • How introduced?
         • Pointers
         • Array subscripts:
            • a[ i ] = x; a[ j ] = y; // does i == j?
            • a[ a[ i ] ] = x; a[ f[ j ] ] = y; // how hard to detect alias?
         • Parameters
            • what’s the difference between “call by name” and “call by value”?
         • Run-time stack when functions are called with uninitialized local variables
         • ...
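The array-subscript case can be tried directly. In this sketch (Python, with made-up function names), `store_then_read` mirrors the first subscript example on the slide. `store_then_read_indirect` is a simplified version of the second one: it uses a separate index table `idx` in place of `a[ i ]` and `f[ j ]`, which is an assumption made to keep the example short.

```python
def store_then_read(a, i, j, x, y):
    a[i] = x
    a[j] = y          # overwrites a[i] exactly when i == j
    return a[i]       # cannot be replaced by x unless i != j is known

def store_then_read_indirect(a, idx, i, j, x, y):
    a[idx[i]] = x
    a[idx[j]] = y     # aliases a[idx[i]] whenever idx[i] == idx[j]
    return a[idx[i]]

print(store_then_read([0, 0, 0], 0, 1, 10, 20))   # i != j: no alias
print(store_then_read([0, 0, 0], 1, 1, 10, 20))   # i == j: alias

table = [2, 0, 2]     # idx[0] == idx[2], so i = 0 and j = 2 alias
print(store_then_read_indirect([0, 0, 0], table, 0, 2, 10, 20))
```

In the indirect case the subscripts `i` and `j` differ, yet the two stores hit the same element. Deciding this statically requires knowing the run-time contents of `idx`.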

  12. Why Worry about Aliases?
      • Some of the analyses we’re interested in depend on how variables change values.
      • Two different dead code definitions:
         • Code never executed (how can this happen?)
         • Code whose execution has no external effect (except speed).
      • Dead code is not always undesirable. Why?
      • Example: x = 0; y = f[ a, b, c ]; cin >> x;
         • Is the first statement dead?
         • If so, it is said to be killed by the cin statement.
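A rough sketch of how a tool might find killed assignments like the one in the example, assuming straight-line Python code and the standard `ast` module. The name `dead_stores` and the `EXAMPLE` text (a Python rendering of the slide's three statements) are made up for illustration. The analysis assumes that a call such as `f(a, b, c)` neither reads nor writes `x` through an alias or a global, which is exactly the assumption the slide questions.

```python
import ast

def dead_stores(source):
    """Line numbers of simple assignments whose value is overwritten before
    being read. Top-level straight-line code only; values still unread at
    the end of the program are treated as used."""
    dead = []
    pending = {}   # variable -> line of its latest unread assignment
    for stmt in ast.parse(source).body:
        # Any read in this statement uses the pending value.
        for n in ast.walk(stmt):
            if isinstance(n, ast.Name) and isinstance(n.ctx, ast.Load):
                pending.pop(n.id, None)
        # An augmented assignment such as x += 1 also reads its target.
        if isinstance(stmt, ast.AugAssign) and isinstance(stmt.target, ast.Name):
            pending.pop(stmt.target.id, None)
        if isinstance(stmt, ast.Assign):
            for t in stmt.targets:
                if isinstance(t, ast.Name):
                    if t.id in pending:
                        dead.append(pending[t.id])   # killed before any read
                    pending[t.id] = stmt.lineno
    return dead

# Python rendering of the slide's example (parsed only, never executed).
EXAMPLE = """\
x = 0
y = f(a, b, c)
x = int(input())
"""

print(dead_stores(EXAMPLE))
```

If `f` could read `x` as a global, the first statement would not be dead, and a sound tool would have to either prove that `f` cannot or stay silent.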

  13. Back to loop analysis - 1
      • To move a statement from inside to outside a loop, we need to check that “key” values are not changed with each loop iteration.
      • Example: a statement like “x=a+b” inside a loop can sometimes be moved outside the loop if neither a nor b is reassigned in the loop.
         • True or False?
         • Does it depend on how x is referenced in the loop?
      • Byte war story!
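One way to explore the "True or False?" question is to hoist the statement by hand and compare results. The two functions below are a made-up illustration in Python: neither `a` nor `b` is reassigned in the loop, but `x` is read before it is assigned in each iteration.

```python
def loop_original(a, b, n):
    x = 0
    trace = []
    for _ in range(n):
        trace.append(x)   # reads x before this iteration's assignment
        x = a + b
    return x, trace

def loop_hoisted(a, b, n):
    x = 0
    trace = []
    x = a + b             # statement moved out of the loop
    for _ in range(n):
        trace.append(x)
    return x, trace

print(loop_original(1, 2, 3), loop_hoisted(1, 2, 3))
print(loop_original(1, 2, 0), loop_hoisted(1, 2, 0))
```

The two versions disagree in both calls: the first iteration of the original sees the old value of `x`, and when the loop runs zero times the original never assigns `x` at all. So the condition on `a` and `b` is necessary but not sufficient.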

  14. Back to loop analysis - 2
      • Why can aliases cause problems with code motion?
      • Why can aliases cause problems with common subexpression elimination?
      • Why can aliases cause problems with subscript analysis?

  15. Another example: side effects of function calls
      • Consider the statement: x = f[a,b] + a + g[a,c]; (assume this isn’t C or C++)
      • What if f calls a function with a side effect?
      • Considered to be a dangerous statement. "Careful" programmers avoid this type of coding.
         • Why?
      • What if a is passed to f by address?
      • What if c is global to f and may be changed by f?
      • I've used a language that, if requested, makes no distinction between array refs and function calls. Why might this be useful?
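The danger can be shown with a small experiment. This sketch uses Python, where operands are evaluated left to right, and models "a is passed to f by address" with a one-element list; the bodies of `f` and `g` are invented for the example. The second function spells out the result a right-to-left evaluation order would give.

```python
def f(cell, b):
    cell[0] += 1          # side effect: modifies the caller's variable a
    return cell[0] * b

def g(value, c):
    return value + c

def left_to_right(a, b, c):
    cell = [a]            # one-element list models passing a by address
    return f(cell, b) + cell[0] + g(cell[0], c)

def right_to_left(a, b, c):
    cell = [a]
    third = g(cell[0], c)     # evaluated first, sees the old a
    second = cell[0]
    first = f(cell, b)        # the side effect happens last
    return first + second + third

print(left_to_right(1, 2, 3), right_to_left(1, 2, 3))
```

The same source expression yields 11 or 9 depending on evaluation order, so a compiler that reorders operands, or a language that leaves the order unspecified, changes the program's meaning.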

  16. Parameter passing and aliases
      • Consider a function which modifies globals. What if one of those globals is also passed as a parameter to the function?
      • What if the same variable occurs more than once in a function call’s parameter list?
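Both questions can be tried with a short sketch. It is written in Python, using lists so that arguments behave like by-reference parameters; `add_twice`, `record`, and the global `ledger` are hypothetical names introduced only for this illustration.

```python
def add_twice(dst, src):
    """Intended effect: dst becomes dst + 2 * src, element by element."""
    for k in range(len(dst)):
        dst[k] += src[k]
    for k in range(len(dst)):
        dst[k] += src[k]      # src has already changed if it aliases dst

ledger = [0]                  # hypothetical global

def record(account, amount):
    ledger[0] += amount       # update through the global name
    account[0] += amount      # update through the parameter
    return account[0]

v, w = [1], [1]
add_twice(v, w)               # distinct arguments: v becomes [3]
u = [1]
add_twice(u, u)               # same variable twice: u becomes [4], not [3]
print(v, u)
```

Calling `record(ledger, 5)` has the same flavor: the global and the parameter name the same list, so the amount is added twice. Neither function looks wrong in isolation; the problem exists only at particular call sites.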

  17. Program Slice
      • Program slice:
         • Pick a variable and a particular statement in a program
         • Generate a subset of that program that produces exactly the same value for that variable in that statement.
         • This is called a slice
      • Assertion: programmers use slices when debugging.
      • See Weiser’s paper “Program Slicing”, IEEE Transactions on Software Engineering, July 1984.
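A minimal sketch of slicing, under strong simplifying assumptions: the program is straight-line Python made of plain assignments, the statement of interest is the end of the program, and only data dependence is followed (no control flow, no aliases). This is not Weiser's algorithm, only an illustration of the idea; `backward_slice` and `PROGRAM` are made-up names.

```python
import ast

def backward_slice(source, variable):
    """Line numbers of the assignments that can affect `variable` at the
    end of a straight-line program."""
    relevant = {variable}     # variables whose current value still matters
    kept = []
    for stmt in reversed(ast.parse(source).body):
        if not isinstance(stmt, ast.Assign):
            continue
        targets = {t.id for t in stmt.targets if isinstance(t, ast.Name)}
        if targets & relevant:
            kept.append(stmt.lineno)
            # The assigned names are now explained; their inputs matter instead.
            relevant -= targets
            relevant |= {n.id for n in ast.walk(stmt.value)
                         if isinstance(n, ast.Name)}
    return sorted(kept)

# Hypothetical program to slice (parsed only, never executed).
PROGRAM = """\
n = 10
total = 0
count = 0
total = total + n
count = count + 1
avg = total / count
label = "done"
"""

print(backward_slice(PROGRAM, "total"))
```

For `total` the slice keeps lines 1, 2, and 4 and drops everything about `count`, `avg`, and `label`, which is the kind of narrowing a programmer does mentally when asking "where did this value come from?".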

  18. Searching for papers - demo
      • www.acm.org, search for papers
         • program slice
         • alias problem
         • control flow graph
         • parse tree
