Finding Bugs in Dynamic Web Applications

Presentation Transcript

  1. Finding Bugs in Dynamic Web Applications • Shay Artzi, Adam Kiezun, Julian Dolby, Frank Tip, Danny Dig, Amit Paradkar, Michael D. Ernst • Presented by: Christopher Hamilton

  2. Introduction • Web script crashes and malformed dynamically generated Web pages hurt the usability of Web applications • Current tools for Web-page validation cannot handle the dynamically generated pages on today’s Internet

  3. The Problem • Bad scripts create syntactically malformed HTML • Less portable across browsers and new browser versions • Non-displayable HTML on separate executions • Browsers’ attempts to correct malformed HTML can lead to crashes and security problems • Browsers may discard important information • Search engines may have trouble indexing the pages

  4. More Problems • Dynamic Web page testing challenges • HTML validation tools only test static pages • The developer must perform both static testing and dynamic testing

  5. Previous Work • Dynamic test-generation tools (DART, CUTE, EXE) • Execute the application on concrete inputs • Create additional inputs by solving symbolic constraints derived from control-flow paths • Not practical with Web applications

  6. The Authors’ Goals • Present an automated technique for finding faults that manifest as Web application crashes or malformed HTML • Identify the minimal part of the input responsible for triggering a failure • Use an oracle to detect specification violations in the application’s output

  7. Apollo at a Glance • On each execution: • Combined concrete and symbolic execution and constraint solving • Program monitored to record path constraints capturing the outcome of control-flow predicates • Oracle determines whether a fatal failure or malformed HTML occurs • Automatic, iterative creation of new inputs to explore different execution paths

  8. PHP Scripting Language • Widely used in Web development (network interactions, database access, HTTP processing) • Object oriented (classes, interfaces, dynamically dispatched methods), similar to Java • Scripting features: dynamic typing and eval • Example program:

      <?php

      make_header(); // print HTML header

      // Make the $page variable easy to use
      if (!isset($_GET['page'])) $page = 0;
      else $page = $_GET['page'];

      // Bring up the report cards and stop processing
      if ($_GET['page2'] == 1337) {
        require('printReportCards.php');
        die(); // terminate the PHP program
      }

      // Validate and log the user into the system
      if ($_GET["login"] == 1) validateLogin();

      switch ($page)
      {
        case 0: require('login.php'); break;
        case 1: require('TeacherMain.php'); break;
        case 2: require('StudentMain.php'); break;
        default: die("Incorrect page number. Please verify.");
      }

      make_footer(); // print HTML footer
      // ...
      function validateLogin() {
        if (!isset($_GET['username'])) {
          echo "<j2> username must be supplied.</h2>\n";
          return;
        }
        $username = $_GET['username'];
        $password = $_GET['password'];
        if ($username == "john" && $password == "theTeacher")
          $page = 1;
        else if ($username == "john" && $password == "theStudent")
          $page = 2;
        else echo "<h2>Login error. Please try again</h2>\n";
      }

      function make_header() { // print HTML header
        // single-quoted so the double quotes in the DOCTYPE need no escaping
        print('
      <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
      "http://www.w3.org/TR/html4/strict.dtd">
      <HTML>
      <HEAD> <TITLE> Class Management </TITLE> </HEAD>
      <BODY>');
      }

      function make_footer() { // close HTML elements opened by header()
        print("
      </BODY>
      </HTML>");
      }
      ?>

  9. Failures in PHP Scripts • Execution failures • Missing an included file • Wrong MySQL query • Uncaught exceptions • Malformed HTML • Generated HTML page is not syntactically correct according to an HTML validation tool • Faults in the example program on slide 8: • 'printReportCards.php' is missing • make_footer() is not executed in certain situations, leaving HTML tags unclosed • validateLogin() generates an illegal <j2> tag

  10. Failure-Finding in PHP Applications • Concolic testing – execute the application on an initial input, then on additional inputs obtained by solving constraints derived from the exercised control-flow paths • Extensions • Validate the correctness of the program’s output • Handle isset, empty, require, etc., which require generating constraints that are absent in other object-oriented programming languages • Use a pre-specified set of values for database authentication • Simulate each user input by transforming code

  11. Transformation of Code • For each page (h) that contains N buttons • Add an additional input parameter p to the PHP program • Values of p range from 1 to N • Insert a switch statement that includes the appropriate PHP source file, depending on p • Required modifications are minimal and were performed by hand
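
The transformation on slide 11 can be pictured with a small dispatcher. This is only an illustrative Python sketch, not Apollo's PHP transformation: the handler functions and the `PAGES` table are hypothetical stand-ins for the PHP source files behind each button, and only the parameter name `p` comes from the slide.

```python
# Hypothetical page handlers standing in for the PHP files behind each button.
def login_page(params):
    return "<h2>Login</h2>"

def teacher_page(params):
    return "<h2>Teacher</h2>"

def student_page(params):
    return "<h2>Student</h2>"

# Button number (1..N) -> page that pressing the button leads to.
PAGES = {1: login_page, 2: teacher_page, 3: student_page}

def simulate_user_input(params):
    """Dispatch on the added input parameter p, mimicking the inserted switch."""
    p = int(params.get("p", 0))
    handler = PAGES.get(p)
    if handler is None:
        return None  # p outside 1..N: no button pressed, nothing is included
    return handler(params)
```

Because the button choice becomes an ordinary input parameter, the input generator can explore it like any other input.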

  12. The Failure Detection Algorithm

      parameters: Program P, oracle O
      result: Bug reports B
      B : setOf(⟨failure, setOf(pathConstraint), setOf(input)⟩)

      P′ ≔ simulateUserInput(P);
      B ≔ ∅;
      pcQueue ≔ emptyQueue();
      enqueue(pcQueue, emptyPathConstraint());
      while not empty(pcQueue) and not timeExpired() do
        pathConstraint ≔ dequeue(pcQueue);
        input ≔ solve(pathConstraint);
        if input ≠ ⊥ then
          output ≔ executeConcrete(P′, input);
          failures ≔ getFailures(O, output);
          foreach f in failures do
            merge ⟨f, pathConstraint, input⟩ into B;
          c1 ∧ … ∧ cn ≔ executeSymbolic(P′, input);
          foreach i = 1, …, n do
            newPC ≔ c1 ∧ … ∧ ci−1 ∧ ¬ci;
            enqueue(pcQueue, newPC);
      return B;

      A solution, if it exists, to such an alternative path constraint corresponds to an input that will execute the program along a prefix of the original execution path, and then take the opposite branch.
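
The loop on slide 12 can be tried out on a toy program. The following Python sketch rests on several assumptions that are not in the slides: `toy_program`, `oracle`, the brute-force `solve` over a small finite `DOMAIN`, the `seen` set that avoids re-queuing a path constraint, and `max_runs` in place of `timeExpired()` are all hypothetical. It also records branch outcomes during a single run instead of separating concrete and symbolic execution.

```python
from collections import deque
from itertools import product

# A constraint is (name, value, truth): "input[name] == value" must equal truth.
def negate(c):
    name, value, truth = c
    return (name, value, not truth)

def holds(c, inputs):
    name, value, truth = c
    return (inputs.get(name) == value) is truth

def solve(path_constraint, domain):
    """Brute-force solver: try every assignment over a small finite domain."""
    names = sorted(domain)
    for values in product(*(domain[n] for n in names)):
        inputs = dict(zip(names, values))
        if all(holds(c, inputs) for c in path_constraint):
            return inputs
    return None  # unsatisfiable (the "bottom" case in the pseudocode)

def toy_program(inputs, trace):
    """Toy stand-in for the PHP example; records each branch outcome in trace."""
    out = ["<html>"]

    def branch(name, value):
        taken = inputs.get(name) == value
        trace.append((name, value, taken))
        return taken

    if branch("login", 1):
        if not branch("username", "john"):
            out.append("<j2>username must be supplied.</h2>")  # malformed tag
    out.append("</html>")
    return "".join(out)

def oracle(output):
    """Toy oracle: flags the illegal <j2> tag."""
    return ["malformed HTML: <j2>"] if "<j2>" in output else []

def find_failures(program, oracle, domain, max_runs=50):
    bugs = {}            # failure -> list of (path constraint, input)
    queue = deque([()])  # start from the empty path constraint
    seen = {()}
    runs = 0
    while queue and runs < max_runs:
        pc = queue.popleft()
        inputs = solve(pc, domain)
        if inputs is None:
            continue
        runs += 1
        trace = []
        output = program(inputs, trace)
        for failure in oracle(output):
            bugs.setdefault(failure, []).append((tuple(trace), inputs))
        # Keep the prefix before conjunct i and negate conjunct i.
        for i in range(len(trace)):
            new_pc = tuple(trace[:i]) + (negate(trace[i]),)
            if new_pc not in seen:
                seen.add(new_pc)
                queue.append(new_pc)
    return bugs

DOMAIN = {"login": [0, 1], "username": ["", "john"]}
bugs = find_failures(toy_program, oracle, DOMAIN)
```

In this toy setting the first run (login = 0) exposes nothing; negating its only conjunct yields an input with login = 1 and no valid username, which reaches the malformed tag.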

  13. Example: Execution 1 (Expose Third Fault) • Runs the failure detection algorithm on the example program from slide 8 • The condition !isset($_GET['page']) is true – sets page = 0 • The conditions on page2 and login are false • Execution continues at case 0 of the switch (require('login.php')) • HTML validation tool determines output is illegal • Path constraint of this execution: NotSet(page) ∧ page2 ≠ 1337 ∧ login ≠ 1 • Alternative path constraints obtained by negating one conjunct after a shared prefix: • NotSet(page) ∧ page2 ≠ 1337 ∧ login = 1 • NotSet(page) ∧ page2 = 1337 • Set(page)

  14. Example: Execution 2 (The Opposite Path) • For the path constraint NotSet(page) ∧ page2 ≠ 1337 ∧ login = 1 • Constraint solver may produce page2 ← 0; login ← 1 • Branch outcomes marked on the code listing: true, true • HTML validation tool discovers the failure and generates a bug report, which is added to the output set of bug reports

  15. Minimization on Path Constraints • Eliminates irrelevant constraints • A solution for a shorter path constraint is a smaller input • Does not guarantee that the returned path constraint is the shortest one that exposes the failure • Simple, fast, and effective in practice • Differs from input minimization – operates on the path constraint that exposes the failure instead of on the input • Handles multiple path constraints that lead to the same failure

  16. Minimization Example • The HTML malformation from the previous example could have been reached along different execution paths • NotSet(page) ∧ page2 ≠ 1337 ∧ login = 1 • Set(page) ∧ page = 0 ∧ page2 ≠ 1337 ∧ login = 1 • Conjuncts common to both: page2 ≠ 1337 ∧ login = 1 • After removing the irrelevant conjunct: login = 1 (input: login ← 1)
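
The two steps visible in the minimization example (keep the conjuncts shared by all failing path constraints, then drop conjuncts that are not needed) can be sketched in Python. This is a simplified greedy sketch, not necessarily the exact procedure Apollo uses: `exposes_failure` is a hypothetical callback that would solve the shorter constraint, rerun the program, and ask the oracle whether the same failure occurs. Here it is replaced by a toy predicate.

```python
def minimize(path_constraints, exposes_failure):
    """Intersect the failing path constraints, then greedily drop conjuncts
    while the failure is still exposed."""
    # Step 1: keep only conjuncts that appear in every failing path constraint.
    first = path_constraints[0]
    common = [c for c in first if all(c in pc for pc in path_constraints[1:])]
    # Step 2: try removing each remaining conjunct in turn.
    result = list(common)
    for c in common:
        shorter = [d for d in result if d != c]
        if exposes_failure(shorter):
            result = shorter  # c was irrelevant to the failure
    return result

# Path constraints from the example, written as lists of conjuncts.
pc1 = ["NotSet(page)", "page2 != 1337", "login = 1"]
pc2 = ["Set(page)", "page = 0", "page2 != 1337", "login = 1"]

# Toy stand-in for "solve, execute, and check the oracle": assume the
# malformed page is produced whenever login = 1 is still required.
def toy_exposes_failure(pc):
    return "login = 1" in pc

minimal = minimize([pc1, pc2], toy_exposes_failure)
```

As on the slide, the result is not guaranteed to be the shortest failing path constraint; the greedy pass only removes conjuncts one at a time.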

  17. Apollo • User Input Simulator • Executor • Bug Finder (Oracle, Bug Report Repository, Input Minimizer) • Input Generator (Symbolic Driver, Constraint Solver, Value Generator)

  18. User Input Simulator • Performs a transformation of the program that models the user input.

  19. Executor: Shadow Interpreter • Shadow Interpreter – PHP interpreter modified to record path constraints and positional information • A symbolic variable is associated with each value • At branching points, the initially empty path constraint is extended with a conjunct corresponding to the branch taken in the execution • Records conditions for PHP-specific comparison operations (isset, empty, etc.), which can only be applied to one variable • Concrete values – influence control flow during execution • Symbolic values – record control-flow decisions at branching points
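
The idea of pairing each concrete value with a symbolic one and extending the path constraint at every branch can be sketched as follows. This is a hypothetical Python miniature, not the modified PHP interpreter: the `Sym` and `Recorder` classes and the `run` function are invented for illustration, and `run` mirrors only the first three branches of the example program on slide 8.

```python
class Sym:
    """A value carrying both a concrete value and a symbolic name."""
    def __init__(self, name, concrete):
        self.name = name
        self.concrete = concrete

class Recorder:
    def __init__(self):
        self.path_constraint = []  # initially empty

    def eq(self, sym, const):
        """Branch on sym == const: the concrete value decides the outcome,
        the symbolic name goes into the recorded conjunct."""
        taken = sym.concrete == const
        op = "==" if taken else "!="
        self.path_constraint.append(f"{sym.name} {op} {const}")
        return taken

    def isset(self, sym):
        """PHP-style isset: a condition over a single variable."""
        taken = sym.concrete is not None
        prefix = "Set" if taken else "NotSet"
        self.path_constraint.append(f"{prefix}({sym.name})")
        return taken

def run(params):
    rec = Recorder()
    page = Sym("page", params.get("page"))
    page2 = Sym("page2", params.get("page2"))
    login = Sym("login", params.get("login"))
    if not rec.isset(page):
        pass  # the example program sets $page = 0 here
    if rec.eq(page2, 1337):
        return rec.path_constraint  # the example program stops here
    rec.eq(login, 1)
    return rec.path_constraint
```

With an empty input, `run({})` records the same three conjuncts as the path constraint of Execution 1.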

  20. Executor: Database Manager • Database Manager • (Re)initializes the database used by a PHP application and restores it before each execution • Supplies additional information about username/password pairs

  21. Bug Finder • BugReport = path constraint + input inducing the failure • Failure = type of failure + corresponding message + PHP statement generating the bad HTML • Oracle – HTML validation tool (WDG and W3C) • InputMinimizer – uses the path-constraint minimization algorithm • Executes the program multiple times with multiple inputs that satisfy multiple constraints • Attempts to find the shortest path constraint resulting in the same failure characteristics
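
The bug-report structure described on this slide, together with the report type setOf(⟨failure, setOf(pathConstraint), setOf(input)⟩) from the algorithm, suggests a grouping like the one below. This is a hypothetical Python sketch: the class and field names are invented, and inputs are kept in a list because Python dictionaries are not hashable.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Failure:
    kind: str       # type of failure, e.g. "HTML error"
    message: str    # corresponding message
    statement: str  # PHP statement that generated the bad HTML

@dataclass
class BugReport:
    failure: Failure
    path_constraints: set = field(default_factory=set)
    inputs: list = field(default_factory=list)

def merge(reports, failure, path_constraint, inputs):
    """Record one observation, grouping observations by failure."""
    report = reports.setdefault(failure, BugReport(failure))
    report.path_constraints.add(path_constraint)
    report.inputs.append(inputs)
    return report

reports = {}
j2 = Failure("HTML error", "illegal <j2> tag", "echo in validateLogin()")
merge(reports, j2, ("NotSet(page)", "page2 != 1337", "login = 1"), {"login": 1})
merge(reports, j2, ("Set(page)", "page = 0", "page2 != 1337", "login = 1"),
      {"page": 0, "login": 1})
```

Grouping by failure is what lets the minimizer see several path constraints for the same fault.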

  22. Input Generator • Symbolic Driver – implements the combined concrete and symbolic failure detection algorithm • Selects the next input (coverage heuristic) • Creates additional inputs from each execution • Constraint Driver – implements lightweight symbolic execution • Constraints = equality or inequality • Choco constraint solver • Unconstrained parameters = random generation and constant mining

  23. Evaluation • How many faults can Apollo find, and of what varieties? • How effective is the fault localization technique compared to alternative approaches, in terms of the number and severity of discovered faults and the line coverage achieved? • How effective is minimization in reducing the size of input-parameter constraints and failure-inducing inputs?

  24. Experimentation

      <?php
      echo "<h2>WebChess ".$Version." Login</h2>";
      ?>
      <form method="post" action="mainmenu.php">
      <p>
      Nick: <input name="txtNick" type="text" size="15"/>
      <br />
      Password: <input name="pwdPassword" type="password" size="15"/>
      </p>
      <p>
      <input name="login" value="login" type="submit"/>
      <input name="newAccount" value="New Account" type="button" onClick="window.open('newuser.php', '_self')"/>
      </p>
      </form>

  25. Generation Strategies • Apollo compared to two other approaches • Halfond and Orso (Randomized) • Values chosen from constants appearing in the program source and from default values • Difficult: parameters’ names and types are not apparent • Infers names and types from dynamic traces • Minamide’s static analysis • Apollo’s own test input generation, as previously discussed

  26. Methodology • 10-minute runs on each program • Generation of hundreds of inputs • Ran both the Apollo and the Randomized test-input generation strategies • WDG offline HTML validation tool • Coverage = number of executed lines / total number of lines with executable PHP code in the application • Total = number of lines with PHP opcodes

  27. Results Classification • Execution crash: PHP interpreter terminates with exception • Execution error: PHP interpreter emits warning visible in generated HTML • Execution warning: PHP interpreter emits warning invisible to HTML output • HTML error: program generates HTML for which validation tool produces error report • HTML warning: program generates HTML for which validation produces a warning report

  28. Results Analysis • Apollo: average line coverage 58.0%, 214 faults found on the subject applications • Randomized: average line coverage 15.0%, 59 faults found on the subject applications • Annotations on the results chart: resulted in malformed HTML; tries to load two missing files; database related; unset time zone

  29. Results Analysis: Compared to Static Analysis • Minamide’s tool • Approximates the string output of a program with a context-free grammar • Able to discover unclosed tags • Intersects the grammar with a regular expression of matched pairs of delimiters • Covers phpwims and timeclock (web-based) • Apollo is more effective and efficient • 2.7 times more HTML validation faults • 83 additional execution faults • More scalable

  30. Results Analysis: Effects of Constraint Minimization • Reduces the size of inputs by up to a factor of 0.18 for more than 50% of faults

  31. Threats to Validity and Limitations • Threats to Validity • Construct • Malformed HTML = defect? • Line coverage = quality? • Minimization of path constraints? • Internal • Real, unseeded, and unknown faults? • External • Generalized beyond subject programs? • Reproducible? • Limitations • Simulating inputs based on static information (false positives…) • Limited tracking in native methods (C, input → output) • Limited sources of input parameters (only inputs from global arrays) • Web server integration limited (running as a stand-alone application)

  32. Future Work • Handle simulated user input dynamically • Create external language to model dependencies between inputs and outputs • Increase line coverage when executing native methods • Webserver integration

  33. Conclusion • Detection of run-time errors • HTML Validation tool as oracle • PHP specific issues • Simulation of interactive user input that occurs when HTML elements are activated • Automated analysis to minimize size of failure-inducing inputs • Apollo run on 4 open source programs • Over 50% line coverage • 214 faults over these applications • Minimized inputs 5.3 times smaller than nonminimized inputs