1 / 34

Index Interactions in Physical Design Tuning Modeling, Analysis, and Applications

Index Interactions in Physical Design Tuning Modeling, Analysis, and Applications. Karl Schnaitter, UC Santa Cruz Neoklis Polyzotis, UC Santa Cruz Lise Getoor, Univ. of Maryland. VLDB 2009, Lyon, France. Index Selection. Index selection problem: Given a query workload

tallys
Télécharger la présentation

Index Interactions in Physical Design Tuning Modeling, Analysis, and Applications

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Index Interactions in Physical Design TuningModeling, Analysis, and Applications Karl Schnaitter, UC Santa Cruz Neoklis Polyzotis, UC Santa Cruz Lise Getoor, Univ. of Maryland VLDB 2009, Lyon, France

  2. Index Selection • Index selection problem: • Given a query workload • Choose indices that improve workload performance • Index “benefit” is a key concept • Informally, for an index i, • [benefit of i] = [exec cost without i] – [exec cost with i] • Does index benefit depend on other indices? • If so, this is called index interaction

  3. Related Work • Interactions are a key concern in physical tuning • [Whang et al. 1981] make assumptions implying that indices on different tables do not interact • [Finklestein et al. 1988] assume that indices do not interact if they are relevant to separate queries • [Bruno and Chaudhuri 2007] explicitly account for some interactions in on-line index selection • Many more… • These studies treat interactions as a secondary issue, and often rely on ad hoc assumptions

  4. Index Interactions • Let S be a set of indices relevant to a query Q cost(X) cost(X  {a}) benefit({a}, X) cost(X  {b}) Indices a,b are independent with respect to X cost(X  {a,b}) benefit({a}, X  {b})

  5. Index Interactions • Let S be a set of indices relevant to a query Q cost(X) cost(X  {a}) benefit({a}, X) cost(X  {b}) Indices a,bpositively interact with respect to X cost(X  {a,b}) benefit({a}, X  {b})

  6. Index Interactions • Let S be a set of indices relevant to a query Q cost(X) cost(X  {a}) benefit({a}, X) cost(X  {b}) Indices a,bnegatively interact with respect to X cost(X  {a,b}) benefit({a}, X  {b})

  7. Degree of Interaction • = degree of interaction between a,b with respect to X • = =

  8. Problem Statement • Which indices in S interact? • How strong are the interactions? • The Degree of Interaction Problem:

  9. Outline • Properties of Query Optimization • Degree of Interaction Algorithm • Applying Interaction Information

  10. Outline • Properties of Query Optimization • Degree of Interaction Algorithm • Applying Interaction Information

  11. Query Optimization • Computing doi(a,b) is not practical if the optimizer is totally arbitrary • Need to compute • In practice, query optimization is not arbitrary • E.g., we expect • We put mild assumptions on query optimization: • Plans are selected from some fixed space P • Optimizer chooses the cheapest feasible plan from P • Ties are broken consistently

  12. used in opt plan = 20 cost of plan a b c b c d = 50 = 45 a c b c c d = 65 = 50 c = 80 d = 80 Index Benefit Graph • An Index Benefit Graph (IBG) encodes the selection of optimal plans for a query • Introduced by [Frank, Omiecinski, and Navathe 1992] • Example IBG when S = {a,b,c,d} • There are 16 subsets of S • IBG has 8 nodes • But IBG can compute a b c d = 80

  13. Outline • Properties of Query Optimization • Degree of Interaction Algorithm • Applying Interaction Information

  14. Naive Algorithm • Recall that we want the degree of interaction between all pairs of indices in S • Each doi(a,b) may be computed directly Upon termination, T[a,b] = doi(a,b) for all a,b Can save time using an IBG as a cache of cost function Downside: iteration over all subsets of S

  15. The QINTERACT Algorithm Naive Algorithm (condensed) QINTERACT Algorithm QINTERACT algorithm processes two index sets per IBG node We should avoid evaluating doi(a,b,X) for all

  16. QINTERACT Example • Let’s calculate doi(a,b) on the graph below • What happens on iteration Y = {u} ? a b u v = 20 a b u v = 20 a u v = 30 b u v = 30 a u v = 30 b u v = 30 b v = 40 a u = 40 u v = 40 b v = 40 a u = 40 u v = 40 Y Y v = 50 u = 50 v = 50 u = 50

  17. c d = 65 Interleaved IBG Processing • In QINTERACT, the IBG is built, then analyzed • I.e., IBG construction and analysis is serial • We can discover interactions in a partial IBG a b c d = 20 a b c b c d = 50 = 45 . . . . . . a c b c = 80 = 50 c = 80 d = 80 • IBG construction and analysis may be interleaved • Improves accuracy of doi over time

  18. Outline • Properties of Query Optimization • Degree of Interaction Algorithm • Applying Interaction Information • Visualizing Index Interactions • Scheduling Index Creation

  19. Outline • Properties of Query Optimization • Degree of Interaction Algorithm • Applying Interaction Information • Visualizing Index Interactions • Scheduling Index Creation

  20. Visualizing Index Interactions • We can visualize the doi function as a graph • Nodes correspond to indices • Edge between a and b has weight doi(a,b) LI(SD,D) C(CK,NK) 0.02 0.02 0.02 TPC-H Query 7 0.03 LI(SK,SD,D,EP,OK) C(NK,CK) 0.01 0.02 0.04 0.09 0.01 LI(SD,Q) O(CK,OK) S(NK,N,SK) S(NK,SK) S(SK,NK)

  21. Interaction Graph • The connected components have special meaning

  22. Outline • Properties of Query Optimization • Degree of Interaction Algorithm • Applying Interaction Information • Visualizing Index Interactions • Scheduling Index Creation

  23. Scheduling Index Creation • Suppose we want to materialize new indices • In what order should they be created? Schedule = a,b,c Schedule = b,a,c Schedule = c,a,b Benefit a a,b a,b,c c a,c a,b,c b a,b a,b,c Materialized Indices Choose first schedule to maximize benefit over time (shaded area)

  24. Scheduling Index Creation • We define an optimization problem • M = preexisting indices • {a1, …, an} = new indices to create • Permute new indices as t1, …, tn to maximize • This problem is computationally hard • There is a connection to the Set Cover problem, since each new index “covers” more benefit

  25. Greedy Scheduling • We are tempted to use a greedy heuristic • This results in the third schedule Schedule = a,b,c Schedule = b,a,c Schedule = c,a,b Benefit a a,b a,b,c c a,c a,b,c b a,b a,b,c Materialized Indices Greedy schedule can be suboptimal by a factor of about (n – 1)

  26. Interaction-Aware Scheduling • Scheduling can use interaction graph Idea: First find optimal sub-schedules for each Ci Then choose the best interleaving of sub-schedules This heuristic avoids the pitfalls of greedy scheduling We can also show stronger performance guarantees

  27. Conclusions • Index interactions provide useful insights for physical design tuning • The doi metric is an effective characterization of interaction relationships • We can analyze interactions efficiently when the Index Benefit Graph has limited size • Future work?

  28. Thank You

  29. Performance Evaluation • QINTERACT implementation in Java • Uses JDBC to connect to IBM DB2 database • Experiments use 22 TPC-H benchmark queries • We generate indices based on the DB2 advisor • SALL = all indices recommended by DB2 • S1C = indices in SALL with first column only • We monitor the progress of the “serial” and “interleaved” approaches over time

  30. Experimental Results SALL index set 0.1 threshold S1C index set 0.1 threshold

  31. Applications • QINTERACT returns doi(a,b) for all a,b • We propose two applications of this information • Visualizing index interactions • Illustrates the global interactions as a graph • Useful when manually tuning the index set • Scheduling index construction • Want to choose when new indices will be created • Goal is to increase performance as quickly as possible • Knowledge of index interactions can help

  32. Problem Statement • Which indices in S interact? • How strong are the interactions? • The Degree of Interaction Problem: • It may be useful to ignore “minor” interactions • A threshold-based variant:

  33. Index Selection • Index selection problem: • We can quantify the benefit of an index: • Does benefit(a, X) depend on X? • If so, this is called index interaction

  34. Future Work • Expand our support for updates • Implementation of visualization tool • Experiments with materialization scheduling • Incremental updates to doi function • Exploring stronger assumptions on query optimization • Efficient upper bounds on doi function?

More Related