1 / 43

Materializing Views With Minimal Size To Answer Queries

Materializing Views With Minimal Size To Answer Queries. Rada Chirkova (North Carolina State University) and Chen Li (University of California, Irvine). Materializing Minimal-Size Views. Context: relational databases

kaylee
Télécharger la présentation

Materializing Views With Minimal Size To Answer Queries

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Materializing ViewsWith Minimal SizeTo Answer Queries Rada Chirkova (North Carolina State University) and Chen Li (University of California, Irvine)

  2. Materializing Minimal-Size Views • Context: relational databases • The problem: minimize the amount of data required to answer queries, by: • automatically designing new relations (views), and • precomputing and storing (materializing) the new relations • Central issue: inventing new views to materialize • Applications include: • Mediators in data-integration systems • “Database as a service” in enterprise computing Chirkova and Li Materializing Views with Minimal Size to Answer Queries 6/09/2003 2

  3. Example: Modified TPC-H Query Q(name,o_date,priority,comment,o_key,quantity, shipmode) :- customer(c_key,name,’building’), order(o_key,c_key,o_date,priority,comment), lineitem(lineno,o_key,quantity,shipmode). V1(name,o_date,priority,comment,o_key) :- customer(c_key,name,’building’), order(o_key,c_key,o_date,priority,comment), lineitem(lineno,o_key,quantity,shipmode). V2(o_key,quantity,shipmode) :- customer(c_key,name,’building’), order(o_key,c_key,o_date,priority,comment), lineitem(lineno,o_key,quantity,shipmode). Chirkova and Li Materializing Views with Minimal Size to Answer Queries 6/09/2003 3

  4. Partial Answer to the Query Q Name O_Date Priority Comment O_Key Quantity Shipmode Tom 3/14/95 0 close… 134721 26 REG AIR Tom 3/14/95 0 close… 134721 75 REG AIR Tom 3/14/95 0 close… 134721 43 AIR Jack 12/21/94 0 final… 571683 43 MAIL Jack 12/21/94 0 final… 571683 33 AIR Chirkova and Li Materializing Views with Minimal Size to Answer Queries 6/09/2003 4

  5. Minimal-Size Views for the Query Q Q(name,o_date,priority,comment,o_key,quantity, shipmode) :- customer(c_key,name,’building’), order(o_key,c_key,o_date,priority,comment), lineitem(lineno,o_key,quantity,shipmode). V1(name,o_date,priority,comment,o_key) :- customer(c_key,name,’building’), order(o_key,c_key,o_date,priority,comment), lineitem(lineno,o_key,quantity,shipmode). V2(o_key,quantity,shipmode) :- customer(c_key,name,’building’), order(o_key,c_key,o_date,priority,comment), lineitem(lineno,o_key,quantity,shipmode). Chirkova and Li Materializing Views with Minimal Size to Answer Queries 6/09/2003 5

  6. Questions • How do we know that views V1 and V2 are minimal-size views for the query Q? On what databases? • How to find a set of minimal-size views, given a set of queries and a database: • Is the problem decidable? For what inputs? • What is the complexity of the problem? • Are there good efficient algorithms for finding minimal-size views? Chirkova and Li Materializing Views with Minimal Size to Answer Queries 6/09/2003 6

  7. Preliminaries • Two queries are equivalent if they return the same answers on any database. • An equivalent rewriting of a query Q in terms of views V is a query that: • is defined using the relations in V only, and • is equivalent to Q • A conjunctive query (view) can be defined using only equality selections, projections, and joins • A disjunctive query (view) can be defined as a union of a finite number of conjunctive queries (views) Chirkova and Li Materializing Views with Minimal Size to Answer Queries 6/09/2003 7

  8. Problem Specification • Input: • Database instance D with schema R • Workload Q of queries on D • Output (optimal solution): a set Vof views, such that: • each query in Q has an equivalent rewriting in terms of V, and • the total size of the views, SViÎ Vsize(Vi), is minimalon D Chirkova and Li Materializing Views with Minimal Size to Answer Queries 6/09/2003 8

  9. Assumptions • Single database instance • Set semantics • Finite query workloads • Conjunctive queries • Disjunctive views and rewritings Chirkova and Li Materializing Views with Minimal Size to Answer Queries 6/09/2003 9

  10. Main Results • Decidability and upper bounds on the complexity of the problem • Relationship between: • a restriction on the language of the queries, and • the language of optimal views • Dynamic-programming algorithm for finding an optimal solution for conjunctive queries (restricted case) Chirkova and Li Materializing Views with Minimal Size to Answer Queries 6/09/2003 10

  11. Conjunctive Views and Rewritings Theorem. Given a query workload Q and a database D. It is possible to construct a finite search space of views that includes all views in all optimal solutions for Q on D. The number of views in the search space is at most doubly-exponential in the size of the input query workload Q. Corollary. The problem of finding a minimal-size conjunctive viewset is decidable for finite workloads of conjunctive queries, assuming all rewritings are conjunctive. Chirkova and Li Materializing Views with Minimal Size to Answer Queries 6/09/2003 11

  12. Self-Joins in Queries Q1(X,Y) :- p(X,Z), p(Z,T), s(Z,Y). // self-join Q2(X,Y) :- p(X,Z), r(Z,T), s(Z,Y). // no self-joins • Result 1. For some databases and queries, there is a set of disjunctive views that is better than any conjunctive solution. • Example for a single query with self-joins • Result 2. The problem of finding an optimal solution in the space of disjunctive views is decidable, assuming conjunctive rewritings. • Result 3. It is not necessary to consider disjunctive rewritings. • Result 4. The size of the search space of views is at most triply-exponential in the size of the input query workload. Chirkova and Li Materializing Views with Minimal Size to Answer Queries 6/09/2003 12

  13. Queries Without Self-Joins: The Problem Is in NP Chirkova and Li Materializing Views with Minimal Size to Answer Queries 6/09/2003 13

  14. Queries Without Self-Joins: The Problem Is in NP disjunctive views Chirkova and Li Materializing Views with Minimal Size to Answer Queries 6/09/2003 13

  15. Queries Without Self-Joins: The Problem Is in NP disjunctive views conjunctive views Chirkova and Li Materializing Views with Minimal Size to Answer Queries 6/09/2003 13

  16. Queries Without Self-Joins: The Problem Is in NP disjunctive views conjunctive views subexpression views Chirkova and Li Materializing Views with Minimal Size to Answer Queries 6/09/2003 13

  17. Queries Without Self-Joins: The Problem Is in NP disjunctive views conjunctive views subexpression views full-reducer views Chirkova and Li Materializing Views with Minimal Size to Answer Queries 6/09/2003 13

  18. 1. Conjunctive Views Are Enough Theorem. Given a database D and a set of queries Qwithout self-joins. Suppose a set Vof disjunctive views is a solution for (D,Q). Then there exists another solution V’ for (D,Q), such that: • all views in V’ are conjunctive, and • size (V’) £ size (V). Corollary. For any database and any set of queries without self-joins, some optimal disjunctive solution is a set of conjunctive views. Chirkova and Li Materializing Views with Minimal Size to Answer Queries 6/09/2003 14

  19. What We Have Shown disjunctive views conjunctive views Chirkova and Li Materializing Views with Minimal Size to Answer Queries 6/09/2003 15

  20. Idea of the Proof • Given:Q(…) :- S1(…), S2(…), …, Sn(…); rewriting P of Q that usesV: V = V1È V2È … È Vt • Then there exists: V’ = V’1È V’2È … È V’t such that: • for some mapping m, each V’i is an image of Vi,and • each V’i alone can replace any Vj in the rewriting of Q Chirkova and Li Materializing Views with Minimal Size to Answer Queries 6/09/2003 16

  21. Details of the Proof (1) • P º Q, P = P1È P2 È ... È Ps • There exists a conjunctive query Pi:Piº Q • Pi(…) :- Vi1(…), …, Vij(…), …, Vim(…), G(…). • Fix any Vijin Pi; consider, in P, Pr(…) :- Vij(…), …, Vij(…), …, Vij(…), G(…). • Because Pr is contained in Q, there exists a mapping bfrom Q to the expansion of Pr • We can always change b, to redirect all subgoals of Q that map into subgoals of Vij in Pr Chirkova and Li Materializing Views with Minimal Size to Answer Queries 6/09/2003 17

  22. Details of the Proof (1) • P º Q, P = P1È P2 È ... È Ps • There exists a conjunctive query Pi:Piº Q • Pi(…) :- Vi1(…), …, Vij(…), …, Vim(…), G(…). • Fix any Vijin Pi; consider, in P, Pr(…) :- Vij(…), …, Vij(…), …, Vij(…), G(…). • Because Pr is contained in Q, there exists a mapping bfrom Q to the expansion of Pr • We can always change b, to redirect all subgoals of Q that map into subgoals of Vij in Pr Chirkova and Li Materializing Views with Minimal Size to Answer Queries 6/09/2003 17

  23. Details of the Proof (1) • P º Q, P = P1È P2 È ... È Ps • There exists a conjunctive query Pi:Piº Q • Pi(…) :- Vi1(…), …, Vij(…), …, Vim(…), G(…). • Fix any Vijin Pi; consider, in P, Pr(…) :- Vij(…), …, Vij(…), …, Vij(…), G(…). • Because Pr is contained in Q, there exists a mapping bfrom Q to the expansion of Pr • We can always change b, to redirect all subgoals of Q that map into subgoals of Vij in Pr Chirkova and Li Materializing Views with Minimal Size to Answer Queries 6/09/2003 17

  24. Details of the Proof (1) • P º Q, P = P1È P2 È ... È Ps • There exists a conjunctive query Pi:Piº Q • Pi(…) :- Vi1(…), …, Vij(…), …, Vim(…), G(…). • Fix any Vijin Pi; consider, in P, Pr(…) :- Vij(…), …, Vij(…), …, Vij(…), G(…). • Because Pr is contained in Q, there exists a mapping bfrom Q to the expansion of Pr • We can always change b, to redirect all subgoals of Q that map into subgoals of Vij in Pr Chirkova and Li Materializing Views with Minimal Size to Answer Queries 6/09/2003 17

  25. Details of the Proof (2) • We can always change b, to redirect all subgoals of Q that map into subgoals of more than oneVij in Pr • Then, we can replace Pr with P’r: Pr(…) :- Vij(…), …, Vij(…), …, Vij(…), G(…). P’r(…):- Vij(…), G(…). • And P’rº Q Chirkova and Li Materializing Views with Minimal Size to Answer Queries 6/09/2003 18

  26. Details of the Proof (3) • Changing b, to redirect all subgoals of Q that map into subgoals of Vij in Pr : Q(…) :- …, Sk(…,W,…), … Prexp(…) :- …, Sk(…,Y’,…), …, Sk(…,Y,…), … Pr(…) :- Vij(…), Vij(…), …, Vij(…), G(…) Chirkova and Li Materializing Views with Minimal Size to Answer Queries 6/09/2003 19

  27. Details of the Proof (3) • Changing b, to redirect all subgoals of Q that map into subgoals of Vij in Pr : Q(…) :- …, Sk(…,W,…), … Prexp(…) :- …, Sk(…,Y’,…), …, Sk(…,Y,…), … Pr(…) :- Vij(…), Vij(…), …, Vij(…), G(…) Chirkova and Li Materializing Views with Minimal Size to Answer Queries 6/09/2003 19

  28. Details of the Proof (3) • Changing b, to redirect all subgoals of Q that map into subgoals of Vij in Pr : Q(…) :- …, Sk(…,W,…), … Prexp(…) :- …, Sk(…,Y’,…), …, Sk(…,Y,…), … Pr(…) :- Vij(…), Vij(…), …, Vij(…), G(…) b Chirkova and Li Materializing Views with Minimal Size to Answer Queries 6/09/2003 19

  29. Details of the Proof (3) • Changing b, to redirect all subgoals of Q that map into subgoals of Vij in Pr : Q(…) :- …, Sk(…,W,…), … Prexp(…) :- …, Sk(…,Y’,…), …, Sk(…,Y,…), … Pr(…) :- Vij(…), Vij(…), …, Vij(…), G(…) b Chirkova and Li Materializing Views with Minimal Size to Answer Queries 6/09/2003 19

  30. Details of the Proof (3) • Changing b, to redirect all subgoals of Q that map into subgoals of Vij in Pr : Q(…) :- …, Sk(…,W,…), … Prexp(…) :- …, Sk(…,Y’,…), …, Sk(…,Y,…), … Pr(…) :- Vij(…), Vij(…), …, Vij(…), G(…) b Chirkova and Li Materializing Views with Minimal Size to Answer Queries 6/09/2003 19

  31. Details of the Proof (3) • Changing b, to redirect all subgoals of Q that map into subgoals of Vij in Pr : Q(…) :- …, Sk(…,W,…), … Prexp(…) :- …, Sk(…,Y’,…), …, Sk(…,Y,…), … Pr(…) :- Vij(…), Vij(…), …, Vij(…), G(…) b Chirkova and Li Materializing Views with Minimal Size to Answer Queries 6/09/2003 19

  32. Details of the Proof (3) • Changing b, to redirect all subgoals of Q that map into subgoals of Vij in Pr : Q(…) :- …, Sk(…,W,…), … Prexp(…) :- …, Sk(…,Y’,…), …, Sk(…,Y,…), … Pr(…) :- Vij(…), Vij(…), …, Vij(…), G(…) b’ b Chirkova and Li Materializing Views with Minimal Size to Answer Queries 6/09/2003 19

  33. Details of the Proof (4) • Thus, we can replace Pr with P’r: Pr(…) :- Vij(…), …, Vij(…), …, Vij(…), G(…). P’r(…):- Vij(…), G(…). • And P’rº Q Chirkova and Li Materializing Views with Minimal Size to Answer Queries 6/09/2003 20

  34. 2. Subexpression Views Are Enough Theorem. Given a database D and a set of queries Q without self-joins. Suppose a set Vof disjunctive views is a solution for (D,Q). Then there exists another solution V’ for (D,Q), such that: • all views in V’ are conjunctive subexpression-type, and • size (V’) £ size (V). Corollary. For any database and set of queries without self-joins, some optimal disjunctive solution is a set of conjunctive subexpression-type views. The size of the search space of views is at most singly-exponential in the size of the input query workload Chirkova and Li Materializing Views with Minimal Size to Answer Queries 6/09/2003 21

  35. 3. Full-Reducer Views Are Enough A view V is a full-reducer view for a query Q if V and Q have the same body. Theorem. Given a database D and a single queryQ without self-joins. Suppose a set Vof disjunctive views is a solution for (D,Q). Then there exists another solution V’ for (D,Q), such that: • all views in V’ are conjunctive full-reducer views for Q, and • size (V’) £ size (V). Corollary. For any database and any query without self-joins, some optimal disjunctive solution is a set of conjunctive full-reducer views. Chirkova and Li Materializing Views with Minimal Size to Answer Queries 6/09/2003 22

  36. Using Full-Reducer Views To Rewrite Sets of Queries For query workloads with more than one query, we can merge optimal full-reducer views for individual queries in the workload - and the number of subgoals in the merged views never exceeds the number of subgoals in full-reducer views. Chirkova and Li Materializing Views with Minimal Size to Answer Queries 6/09/2003 23

  37. What We Have Shown disjunctive views conjunctive views subexpression views full-reducer views Chirkova and Li Materializing Views with Minimal Size to Answer Queries 6/09/2003 24

  38. The Problem Is in NP Theorem. Given a database instance, for any finite workload of conjunctive queries without self-joins, the problem of finding a minimal-size disjunctive viewset is in NP. Chirkova and Li Materializing Views with Minimal Size to Answer Queries 6/09/2003 25

  39. Generating Minimal-Size Views • Input: a conjunctive query without self-joins and a database • Output: a minimal-size disjunctive viewset for the query on the database • Method: produce a minimal-size set of conjunctive full-reducer views, • by doing exhaustive search in the space of the views • using a dynamic-programming algorithm (cf. query optimization in System R) • The algorithm returns an optimal solution • Can be modified to work for non-singleton query workloads Chirkova and Li Materializing Views with Minimal Size to Answer Queries 6/09/2003 26

  40. Heuristics for Generating Views • Consider only those views that “cover” up to a fixed number of subgoals of the query • Consider only those views that have up to a fixed number of head attributes • Apply the algorithm separately to several subsets of subgoals of the query, then combine the solutions Chirkova and Li Materializing Views with Minimal Size to Answer Queries 6/09/2003 27

  41. Main Results • Decidability and upper bounds on the complexity of the problem • Relationship between: • a restriction on the language of the queries, and • the language of optimal views • Dynamic-programming algorithm for finding an optimal solution for conjunctive queries (restricted case) Chirkova and Li Materializing Views with Minimal Size to Answer Queries 6/09/2003 28

  42. Some Directions of Future Work • Rewriting queries in more expressive languages: • built-in predicates • disjunctive queries • … • Using more expressive languages of views and rewritings • Maximally-contained rewritings of queries in terms of views Chirkova and Li Materializing Views with Minimal Size to Answer Queries 6/09/2003 29

  43. Reference Jia Li, Rada Chirkova, and Chen Li. Minimizing Data-Communication Costs by Decomposing Query Results in Client-Server Environments. UCI ICS Technical Report, 2003. http://www-db.ics.uci.edu/pages/raccoon/ Chirkova and Li Materializing Views with Minimal Size to Answer Queries 6/09/2003 30

More Related