290 likes | 396 Vues
Explore the analysis of dynamic field access in JavaScript, focusing on points-to analysis, bug finding, and refactoring with a prototype-based object model. Learn about dynamic reads/writes, coercion, and non-standard scope rules. A proposed solution involves lightweight lattices for distinguishing strings in dynamic field accesses, such as Constant String, String Set, Length Interval, Character Inclusion, and more. Evaluation includes dynamic and static analysis benchmarked against popular JavaScript libraries and dynamic behavior. Static analysis highlights the limitations of simple constant propagation for dynamic field access.
E N D
String Analysis for Dynamic Field Access Esben Andreasen Magnus Madsen Department of Computer Science Aarhus University
Motivation • Static Analysis of JavaScript • type analysis, bug finding or refactoring • a key component is Points-To analysis • Analysis of JavaScript is a difficult • a flexible object model (prototype-based) • dynamic field access • coercions and eval • non-standard scope rules • ... Today
Static Field Access Receiver Field Name
Dynamic Field Access (DFA) where e is any expression
Dynamic Reads Object Array v = o["p"] v = o["p" + "q"] v = o[???] o length map pop push reduce reduceRight reverse shift slice some sort splice unshift __defineGetter__ __defineSetter__ __lookupGetter__ __lookupSetter__ constructor hasOwnProperty (4 more) concat every filter forEach indexOf join lastIndexOf
Dynamic Writes o["p"] = v o["p" + "q"] = v o[???] = v All fields of o may now point to v!
Dynamic Reads & Writes o[e1][e2] = v e.g. prototype e.g. toString All fields of Object may now point to v!
Spurious Event Handlers var elm = $("#button"); elm.onclick = function() {} elm[???] = function() {} The function is registered as all possible event handlers
Usage in Practice Survey by Richards et al. [1]: • 8.3% of all reads are dynamic • 10.3% of all writes are dynamic Dynamic field access is prevalent in libraries: • jQuery, Mootools and Prototype: 300+ DFAs [1]: Gregor Richards, Sylvain Lebresne, Brian Burg, and Jan Vitek. An Analysis of the Dynamic Behavior of JavaScript Programs. In PLDI, 2010.
Proposed Solution What we need: A way to distinguish strings flowing into dynamic field accesses. Solution: A set of light-weight lattices. • focus on concatenate, equaland join • compact and efficient • ideally O(1)time and space
(Simple) Lattices • Constant String (CS) single concrete string, e.g. "push". • String Set (SS) set of k strings, e.g. "pop"and "push". • Length Interval (LI) min. and max. length, e.g. the interval [3; 4] for the strings "pop" and "push".
Character Inclusion (CI) Sets of characters which may and mustoccur. Example for the strings "pop" and "push":
Prefixes and Suffixes • Prefix-Suffix Characters (PS) first and last character, e.g. for the strings "pop" and "push": and • Prefix Suffix Inclusion (PSI) may and must sets of characters for the first and last character. (like the character inclusion lattice.)
Index Predicate A boolean valued predicate which may/must hold for each of the first string indices. Examples: • isUppercase (useful for e.g. ”lastIndexOf”) • isUnderscore (useful for e.g. ”__defineGetter__”) • isDigit (useful for array indices)
Index Predicate: Concatenation + 1 0 0 1 0 1 1 0 0 0 1 0 0 1 1 1 0 0 0 may-case 1 0 0 1 1 1 1 0 0 0 1 0 0 1 0 1 1 0 0 0
String Hash • Pick a distributive hash function : where is some universe of a fixed size . • Distributive: • The lattice is the powerset lattice of . • Length Hash (LH) • Hashes the length instead of the string itself.
String Hash: Example foo = (the quick brown fox) 4 33 29 40 13 • String Constant • Prefix Suffix • Character Inclusion • Index Predicate
String Hash: Concatenation Example: Let and Assume a universe of size then: Easy to compute in iterations. Paper has solution in time.
Overview • The Hlattice is the (reduced) product of • the string set lattice (SS) • the character inclusion lattice (CI) • the string hash lattice (SH)
Evaluation Q1: How precise are the lattices, independent of any particular analysis, for reasoning about strings used in DFAs? Q2: To what degree does a more precise string lattice, for DFAs, improve overall precision and performance of a static analysis?
Evaluation: Dynamic Analysis We perform a recordand replay of several popular JavaScript libraries: Record: • The history of every string flowing to a DFA. • The field names of every receiver object at a DFA. Replay: • For every DFA merge the histories of strings and determine for each lattice if it has a false positive. (example next slide)
Dynamic Analysis: Example e = (c ? "a": "b") + "x" o = {"ax": 1, "bx": 2} v = o[e]; + What fields are read by evaluating o[e] in the abstract? join "x" "a" "b"
Evaluation: Dynamic Analysis DFAs with zero false positives Const. Str. Insufficient Hybrid Prefix/Suffic Incl.
Evaluation: Static Analysis Flow-sensitive dataflow analysis for JavaScript • inter-procedural, context-insensitive • instantiated with the constant and hybrid lattices Benchmarks • Mozilla Sunspider + Google Octane • Various GitHub projects
Summary • Dynamic field access is common in JavaScript. • Simple constant propagation is insufficient for reasoning about dynamic field accesses. • The proposed hybridlatticeimproves precision and performance for 7 out of 10 benchmark programs. Thank You!
Arrays var a = [1, 2, 3]; var i = 0; while (...) { var x = a[i++]; }
String Patterns "a|b|c|d".split("|")
Number- & Type String Lattices • Number String (N) A powerset lattice of the strings: {Infinity, -Infinity, NaN, 0, 1, ...} • Type String (T) A powerset lattice of the strings: {Boolean, Function, Object, String, Undefined}