Finding Dominators in Flowgraphs: Linear-Time Algorithm [1] and Experimental Study [2]
Loukas Georgiadis
[1] Joint work with Robert E. Tarjan.
[2] Joint work with Renato F. Werneck, Robert E. Tarjan, Spyridon Triantafyllis and David I. August.
Dominators in a Flowgraph
Flowgraph: G = (V, E, r); each v in V is reachable from r.
v dominates w if every path from r to w includes v.
Set of dominators: Dom(w) = {v | v dominates w}
Trivial dominators: for w ≠ r, w, r ∈ Dom(w)
Immediate dominator: idom(w) ∈ Dom(w) – {w}, and idom(w) is dominated by every v in Dom(w) – {w}
Goal: Find idom(v) for each v in V (immediate dominator tree).
Applications: Program optimization, code generation, circuit testing.
[Figure: flowgraph with root r in which v dominates w.]
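The definition above can be checked directly on small graphs. The following is a minimal sketch, not one of the algorithms from the talk: it uses the fact that v dominates w exactly when w becomes unreachable from r once v is deleted. The function names and the `succ` adjacency-dictionary format are assumptions made for illustration.

```python
def reachable(succ, root, removed=None):
    """Vertices reachable from root after deleting the vertex `removed`."""
    if root == removed:
        return set()
    seen = {root}
    stack = [root]
    while stack:
        u = stack.pop()
        for w in succ.get(u, ()):
            if w != removed and w not in seen:
                seen.add(w)
                stack.append(w)
    return seen


def dominators(succ, root):
    """Dom(w) for every reachable w, straight from the definition (O(n*m) time)."""
    vertices = reachable(succ, root)
    dom = {w: {w} for w in vertices}
    for v in vertices:
        # v dominates exactly the vertices that are cut off when v is deleted.
        for w in vertices - reachable(succ, root, removed=v):
            dom[w].add(v)
    return dom


def immediate_dominators(succ, root):
    """idom(w) for every w != root, derived from the Dom sets."""
    dom = dominators(succ, root)
    idom = {}
    for w, ds in dom.items():
        if w == root:
            continue
        strict = ds - {w}
        # The strict dominators of w form a chain; the one with the largest
        # Dom set is dominated by all the others, so it is idom(w).
        idom[w] = max(strict, key=lambda v: len(dom[v]))
    return idom
```

On the diamond-shaped graph `{0: [1, 2], 1: [3], 2: [3], 3: [4]}` with root 0, vertex 3 has only the trivial dominators 0 and 3, while vertex 4 is additionally dominated by 3.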
History
• 1979: Lengauer and Tarjan; O(m·α(m,n)) time.
• 1997: Alstrup, Harel, Lauridsen and Thorup; O(n+m) time for RAM.
• 1998: Buchsbaum, Kaplan, Rogers and Westbrook; claimed O(n+m) for Pointer Machine. (Corrected in 2004 to work in linear time for RAM.)
• 2004: Georgiadis and Tarjan
• We showed that the Buchsbaum et al. algorithm runs in O(m·α(m,n)) time.
• Based on Buchsbaum et al. we gave a linear-time algorithm for Pointer Machine, simpler than Alstrup et al. (no complicated data structures).
The Lengauer-Tarjan Algorithm: Semidominators
Depth-First Search → DFS Tree D
We refer to the vertices by their DFS numbers: v < w means v was visited by DFS before w.
Semidominator path (SDOM-path): P = (v0 = v, v1, v2, …, vk = w) such that vi > w for 1 ≤ i ≤ k-1.
Semidominator: sdom(w) = min {v | there is an SDOM-path from v to w}
[Figure: example flowgraph with root r and its DFS tree; vertices labeled by DFS numbers 1 to 8.]
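To make the semidominator definition concrete, here is a brute-force sketch (not the Lengauer-Tarjan method itself). For each w it searches backwards from w through vertices with larger DFS number; every predecessor met along the way is the start of an SDOM-path. The helper names and the `succ` dictionary format are illustrative assumptions, and results are reported as vertex labels but compared by DFS number.

```python
def preorder(succ, root):
    """DFS from root; returns (num, order): DFS numbers and visit order."""
    num, order = {}, []
    stack = [root]
    while stack:
        v = stack.pop()
        if v in num:
            continue
        num[v] = len(order)
        order.append(v)
        # Push successors in reverse so the first successor is visited first.
        for w in reversed(succ.get(v, ())):
            if w not in num:
                stack.append(w)
    return num, order


def semidominators(succ, root):
    """sdom(w) for every w != root, computed directly from the definition."""
    num, order = preorder(succ, root)
    pred = {v: [] for v in order}
    for u in order:
        for w in succ.get(u, ()):
            pred[w].append(u)
    sdom = {}
    for w in order[1:]:
        best = None
        seen = {w}
        stack = [w]
        while stack:
            x = stack.pop()
            for v in pred[x]:
                # v starts an SDOM-path to w: all vertices between v and w
                # on the path found so far have DFS number larger than w.
                if best is None or num[v] < num[best]:
                    best = v
                # v may serve as an interior vertex only if v > w.
                if num[v] > num[w] and v not in seen:
                    seen.add(v)
                    stack.append(v)
        sdom[w] = best
    return sdom
```

In the usage below, sdom("c") is "a" because of the path a, e, c, even though "a" does not dominate "c"; this is the case where sdom and idom differ.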
The Lengauer-Tarjan Algorithm
• Overview
• Carry out a DFS.
• Process the vertices in reverse preorder.
• For vertex w, compute sdom(w).
• Implicitly define idom(w).
• Explicitly define idom(w) by a preorder pass.
The Lengauer-Tarjan Algorithm: Evaluate minima on tree paths
Data Structure: Maintain a forest F that supports the operations:
link(v, w): Add the edge (v, w) to F.
eval(v): Let r = root of the tree that contains v in F. If v = r then return v. Otherwise return any vertex with minimum sdom among the vertices u that are proper descendants of r and ancestors of v.
Initially every vertex in V is a root in F.
Simple version: n links, m evals in O(m log n).
Sophisticated version: n links, m evals in O(m·α(m,n)).
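A minimal sketch of the simple version of link and eval, using path compression without balancing. The class name, the `key` mapping (standing in for the sdom values) and the recursive compression are assumptions chosen for brevity; link(v, w) assumes w is currently a root of F.

```python
class LinkEvalForest:
    """Forest F with link/eval; eval minimizes key[] along the path to the root."""

    def __init__(self, key):
        self.key = key                              # key[v]: value to minimize (sdom in LT)
        self.ancestor = {v: None for v in key}      # None marks a root of F
        self.label = {v: v for v in key}            # min-key vertex on the compressed path

    def link(self, v, w):
        """Add the edge (v, w): v becomes the parent of the root w."""
        self.ancestor[w] = v

    def _compress(self, v):
        a = self.ancestor[v]
        if self.ancestor[a] is not None:
            # First compress the path above v, then splice v below the root.
            self._compress(a)
            if self.key[self.label[a]] < self.key[self.label[v]]:
                self.label[v] = self.label[a]
            self.ancestor[v] = self.ancestor[a]

    def eval(self, v):
        """Return v if v is a root; otherwise a min-key vertex among the
        proper descendants of the root that are ancestors of v (v included)."""
        if self.ancestor[v] is None:
            return v
        self._compress(v)
        return self.label[v]
```

For example, with keys a=5, b=2, c=7, d=4 and the chain a, b, c, d built by three links, eval("d") returns "b". Recursion depth grows with path length, so a production version would compress iteratively.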
The Linear-Time Algorithm
Partition D into trivial and nontrivial microtrees. [Dixon and Tarjan ’97]
Nontrivial microtree: Maximal subtree of D of size ≤ g that contains at least one leaf of D.
Trivial microtree: Single internal vertex of D.
Core C: Tree D – nontrivial microtrees; has ≤ n/g leaves.
Line: Path (v1 = s, v2, …, vk = t) in C such that outdegreeC(vi) = 1 for 1 ≤ i ≤ k-1, and outdegreeC(vk) = 0 or > 1.
There are L ≤ 2n/g lines. Contract each line into a single vertex → tree C’ with L nodes.
[Figure: DFS tree D on vertices 1 to 22, partitioned with g = 3 into trivial and nontrivial microtrees; the contracted lines include {1, 2, 3}, {4, 7, 8} and {15, 17}.]
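The partition can be sketched with subtree sizes. This is one reading of the definition, assuming "subtree" means a vertex together with all of its descendants: a vertex roots a nontrivial microtree when its subtree has at most g vertices and its parent's subtree has more than g. The function name and the `parent` dictionary format are illustrative assumptions; vertices are assumed to be DFS numbers, so every child is larger than its parent.

```python
def partition_microtrees(parent, g):
    """Split a DFS tree into the core and the nontrivial microtrees.

    parent: maps each vertex (a DFS number) to its tree parent; root maps to None.
    Returns (core, micro): the set of core vertices (trivial microtrees) and a
    dict mapping every other vertex to the root of its nontrivial microtree.
    """
    size = {v: 1 for v in parent}
    for v in sorted(parent, reverse=True):      # children before parents
        p = parent[v]
        if p is not None:
            size[p] += size[v]

    core = {v for v in parent if size[v] > g}
    micro = {}
    for v in sorted(parent):                    # parents before children
        if size[v] <= g:
            p = parent[v]
            is_root = p is None or size[p] > g  # maximal: parent's subtree is too big
            micro[v] = v if is_root else micro[p]
    return core, micro
```

Each leaf of the core has a subtree of more than g vertices, and these subtrees are disjoint, which is where the bound of at most n/g core leaves comes from.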
The Linear-Time Algorithm
Extend the definition of semidominators for the vertices of the nontrivial microtrees [Buchsbaum et al.]:
Pushed external dominator path (PXDOM-path): P = (v0 = v, v1, v2, …, vk = w) such that vi ≥ root of microtree of w, for 1 ≤ i ≤ k-1.
Pushed external dominator: pxdom(w) = min {v | there is a PXDOM-path from v to w}
For any vertex w of the core C, pxdom(w) = sdom(w).
[Figure: a vertex w inside a microtree, with pxdom(w) and sdom(w) marked on the tree path above it.]
The Linear-Time Algorithm
• Overview
• Compute internal dominators in each nontrivial microtree.
• Compute pxdoms in each nontrivial microtree t by link and eval on C’ and Nearest Common Ancestor (NCA) queries on a tree built from the sdom values of the line that contains the parent of the root of t.
• Compute sdoms in each line l by a top-down pass using link and eval on C’ and contracting connected components in l.
• Remarks: link and eval run in linear time on C’. Buchsbaum et al. claimed that link and eval run in linear time on C, but the claim is false.
The Iterative Algorithm: Set-based
Dominators can be computed by solving iteratively the set of equations [Allen and Cocke, 1972]
Dom(v) = ( ⋂_{u ∈ pred(v)} Dom(u) ) ∪ {v}, v ≠ r
Initialization: Dom(r) = {r}; Dom(v) = ∅, v ≠ r
In the intersection we consider only the nonempty Dom(u).
Each Dom(v) set can be represented by an n-bit vector. Intersection ↔ bit-wise AND.
Requires n² space. Very slow in practice.
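A minimal sketch of the set-based iteration, with each Dom(v) held as a Python integer used as an n-bit vector so that intersection is a bit-wise AND. The function name, the `succ` dictionary format and the choice of DFS discovery order for the sweep are assumptions for illustration.

```python
def dominators_bitvector(succ, root):
    """Solve Dom(v) = (AND of nonempty Dom(u), u in pred(v)) | {v} by iteration."""
    # Vertices reachable from root, in DFS discovery order.
    order, seen, stack = [], {root}, [root]
    while stack:
        u = stack.pop()
        order.append(u)
        for w in succ.get(u, ()):
            if w not in seen:
                seen.add(w)
                stack.append(w)

    bit = {v: 1 << i for i, v in enumerate(order)}   # one bit per vertex
    pred = {v: [] for v in order}
    for u in order:
        for w in succ.get(u, ()):
            pred[w].append(u)

    dom = {v: 0 for v in order}                      # 0 encodes the empty set
    dom[root] = bit[root]
    changed = True
    while changed:
        changed = False
        for v in order[1:]:
            meet = None
            for u in pred[v]:
                if dom[u]:                           # skip still-empty sets
                    meet = dom[u] if meet is None else meet & dom[u]
            if meet is None:
                continue                             # no predecessor known yet
            new = meet | bit[v]
            if new != dom[v]:
                dom[v] = new
                changed = True

    # Decode the bit vectors back into vertex sets.
    return {v: {u for u in order if dom[v] & bit[u]} for v in order}
```

Storing n vectors of n bits each is the n² space cost mentioned above.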
The Iterative Algorithm: Tree-based
Efficient implementation [Cooper, Harvey and Kennedy 2000]
    dfs(r)
    T ← {r}
    changed ← true
    while (changed) do
        changed ← false
        for all v in V – r in reverse postorder do
            x ← nca(pred(v))
            if x ≠ parent(v) then
                parent(v) ← x
                changed ← true
            end
        done
    done
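The pseudocode above can be sketched in Python as follows. This is an illustrative reading, not the authors' implementation: T starts as the full DFS tree, and nca is computed by the two-finger walk that repeatedly moves the vertex with the larger DFS number to its parent (valid here because a tree parent always has a smaller DFS number than its child). All names and the `succ` dictionary format are assumptions.

```python
def idom_iterative_dfs(succ, root):
    """Tree-based iterative dominators, starting from the DFS tree."""
    # DFS: preorder numbers, DFS-tree parents and postorder list.
    num, parent, post = {root: 0}, {root: root}, []
    stack = [(root, iter(succ.get(root, ())))]
    while stack:
        v, it = stack[-1]
        for w in it:
            if w not in num:
                num[w] = len(num)
                parent[w] = v
                stack.append((w, iter(succ.get(w, ()))))
                break
        else:
            post.append(v)          # all successors of v are done
            stack.pop()

    pred = {v: [] for v in num}
    for u in num:
        for w in succ.get(u, ()):
            pred[w].append(u)

    def nca(a, b):
        """Nearest common ancestor of a and b in the current tree T."""
        while a != b:
            if num[a] > num[b]:
                a = parent[a]
            else:
                b = parent[b]
        return a

    rpo = [v for v in reversed(post) if v != root]
    changed = True
    while changed:
        changed = False
        for v in rpo:
            x = pred[v][0]
            for u in pred[v][1:]:
                x = nca(x, u)
            if x != parent[v]:
                parent[v] = x
                changed = True

    del parent[root]                # the root has no immediate dominator
    return parent
```

When the loop stops, the parent pointers of T are the immediate dominators.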
The Iterative Algorithm: Running Time
Each pairwise intersection takes O(n) time.
The number of iterations is ≤ d + 3. [Kam and Ullman ’76]
d = max #back-edges in any cycle-free path of G = O(n)
Running time = O(mn²)
This bound is tight, but very pessimistic in practice.
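The O(mn²) bound follows by multiplying the three quantities on this slide. A short derivation, assuming one sweep performs at most one pairwise intersection per edge:

```latex
\begin{align*}
\text{intersections per iteration} &\le \sum_{v \ne r} |\mathrm{pred}(v)| \le m, \\
\text{time per iteration} &= O(m) \cdot O(n) = O(mn), \\
\text{number of iterations} &\le d + 3 = O(n), \\
\text{total running time} &= O(mn) \cdot O(n) = O(mn^2).
\end{align*}
```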
The Iterative Algorithm: Generic Tree-based
    T ← T0   /* a spanning (sub)tree of G */
    changed ← true
    while (changed) do
        changed ← false
        for all v in V – r in order do
            x ← nca(pred(v))
            if x ≠ parent(v) then
                parent(v) ← x
                changed ← true
            end
        done
    done
Good choices (in practice): T0 = a Breadth-First Search (BFS) tree, order = BFS order
A Hybrid Algorithm
• Lemma: For any vertex w ≠ r, idom(w) = NCA(I, parent(w), sdom(w)).
• I = (immediate) dominator tree
• parent(w) = parent of w in the DFS tree D
• SEMI-NCA:
• Compute sdoms as in the simple version of LT.
• Construct I incrementally, applying the Lemma.
• (NCA calculations implemented naïvely)
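A compact sketch of SEMI-NCA along the lines of the two steps above: semidominators via the simple link/eval with path compression, then the dominator tree built in preorder, where the naive NCA walks up from parent(w) through already-computed idom pointers until it reaches a vertex no larger than sdom(w). Function and variable names are assumptions; internally vertices are DFS numbers 0..n-1.

```python
def semi_nca(succ, root):
    """Immediate dominators by SEMI-NCA; returns {vertex: idom(vertex)}."""
    # Step 0: DFS numbering. vertex[i] is the label of DFS number i.
    num, vertex, parent = {root: 0}, [root], [0]
    stack = [(root, iter(succ.get(root, ())))]
    while stack:
        v, it = stack[-1]
        for w in it:
            if w not in num:
                num[w] = len(vertex)
                vertex.append(w)
                parent.append(num[v])
                stack.append((w, iter(succ.get(w, ()))))
                break
        else:
            stack.pop()
    n = len(vertex)
    pred = [[] for _ in range(n)]
    for u in vertex:
        for w in succ.get(u, ()):
            pred[num[w]].append(num[u])

    # Step 1: semidominators, simple link/eval (path compression only).
    semi = list(range(n))
    label = list(range(n))
    ancestor = [None] * n

    def evaluate(v):
        if ancestor[v] is None:
            return v
        path, u = [], v
        while ancestor[ancestor[u]] is not None:
            path.append(u)
            u = ancestor[u]
        for x in reversed(path):            # compress top-down
            a = ancestor[x]
            if semi[label[a]] < semi[label[x]]:
                label[x] = label[a]
            ancestor[x] = ancestor[a]
        return label[v]

    for w in range(n - 1, 0, -1):           # reverse preorder
        for v in pred[w]:
            u = evaluate(v)
            if semi[u] < semi[w]:
                semi[w] = semi[u]
        ancestor[w] = parent[w]             # link(parent(w), w)

    # Step 2: idom(w) = NCA(I, parent(w), sdom(w)), computed naively.
    idom = [0] * n
    for w in range(1, n):                   # preorder
        x = parent[w]
        while x > semi[w]:
            x = idom[x]
        idom[w] = x
    return {vertex[w]: vertex[idom[w]] for w in range(1, n)}
```

In the usage example, sdom("c") is "a" but idom("c") is "r"; the walk in step 2 is what moves past "a".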
Experimental Results
• Algorithms
• SLT: simple version of Lengauer-Tarjan
• LT: almost-linear-time version of Lengauer-Tarjan
• IDFS: DFS tree-based iterative
• IBFS: BFS tree-based iterative
• SNCA: SEMI-NCA
Experimental Results
• Inputs
• Control-flow graphs from SPEC ’95 generated by the SUIF compiler (Stanford).
• > 4900 graphs, avg #vertices ~ 40, #edges ~ 55
• max #vertices ~ 2100, #edges ~ 3200
• Control-flow graphs from SPEC ’00 generated by the IMPACT compiler (UIUC).
• > 2000 graphs, avg #vertices ~ 25, #edges ~ 70
• max #vertices ~ 580, #edges ~ 3100
• VLSI circuits from the ISCAS ’89 suite.
• 50 graphs, avg #vertices ~ 3200, #edges ~ 5000
• max #vertices ~ 24000, #edges ~ 34000
Experimental Results
Times relative to BFS: geometric mean and geometric standard deviation

            IDFS          IBFS          LT            SLT           SNCA
            mean   dev    mean   dev    mean   dev    mean   dev    mean   dev
CIRCUITS    5.89   1.19   6.17   1.42   6.71   1.18   4.62   1.15   4.40   1.14
SUIF-INT    2.45   1.50   2.25   1.62   3.69   1.40   2.48   1.33   2.73   1.45
IMPACT      2.60   1.65   2.24   1.77   4.02   1.40   2.74   1.33   2.56   1.31
IMPACTP     2.58   1.63   2.25   1.82   3.84   1.44   2.61   1.30   2.52   1.29
Experimental Results

                      iterations         comparisons per vertex
            SDP(%)    IDFS     IBFS      IDFS   IBFS   LT     SLT    SNCA
CIRCUITS    76.7      2.8000   3.2000    32.6   39.3   12.0   9.9    8.9
IMPACT      73.4      2.0686   1.4385    30.9   28.0   15.6   12.8   11.1
IMPACTP     88.6      2.0819   1.5376    30.2   32.2   15.5   12.3   10.9
SUIF-INT    63.9      2.0009   1.6659    14.9   17.2   11.2   8.6    7.2

SDP = percentage of vertices v that have parent(v) = sdom(v)