1 / 114

CS 245: Database System Principles

CS 245: Database System Principles. Notes 6: Query Processing Hector Garcia-Molina. Query Processing. Q  Query Plan. Focus: Relational System. Others?. Query Processing. Q  Query Plan. Example. Select B,D From R,S Where R.A = “c”  S.E = 2  R.C=S.C.

boyd
Télécharger la présentation

CS 245: Database System Principles

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. CS 245: Database System Principles Notes 6: Query Processing Hector Garcia-Molina Notes 6

  2. Query Processing Q  Query Plan Notes 6

  3. Focus: Relational System • Others? Query Processing Q  Query Plan Notes 6

  4. Example Select B,D From R,S Where R.A = “c”  S.E = 2  R.C=S.C Notes 6

  5. R A B C S C D E a 1 10 10 x 2 b 1 20 20 y 2 c 2 10 30 z 2 d 2 35 40 x 1 e 3 45 50 y 3 Notes 6

  6. Answer B D 2 x R A B C S C D E a 1 10 10 x 2 b 1 20 20 y 2 c 2 10 30 z 2 d 2 35 40 x 1 e 3 45 50 y 3 Notes 6

  7. How do we execute query? - Do Cartesian product - Select tuples - Do projection One idea Notes 6

  8. RXS R.A R.B R.C S.C S.D S.E a 1 10 10 x 2 a 1 10 20 y 2 . . C 2 10 10 x 2 . . Notes 6

  9. Bingo! Got one... RXS R.A R.B R.C S.C S.D S.E a 1 10 10 x 2 a 1 10 20 y 2 . . C 2 10 10 x 2 . . Notes 6

  10. Relational Algebra - can be used to describe plans... Ex: Plan I B,D sR.A=“c” S.E=2  R.C=S.C X R S Notes 6

  11. Relational Algebra - can be used to describe plans... Ex: Plan I B,D sR.A=“c” S.E=2  R.C=S.C X R S OR: B,D [sR.A=“c” S.E=2  R.C = S.C (RXS)] Notes 6

  12. Another idea: Plan II B,D sR.A = “c”sS.E = 2 R S natural join Notes 6

  13. R S A B C s (R) s(S) C D E a 1 10 A B C C D E 10 x 2 b 1 20 c 2 10 10 x 2 20 y 2 c 2 10 20 y 2 30 z 2 d 2 35 30 z 2 40 x 1 e 3 45 50 y 3 Notes 6

  14. Plan III Use R.A and S.C Indexes (1) Use R.A index to select R tuples with R.A = “c” (2) For each R.C value found, use S.C index to find matching tuples Notes 6

  15. Plan III Use R.A and S.C Indexes (1) Use R.A index to select R tuples with R.A = “c” (2) For each R.C value found, use S.C index to find matching tuples (3) Eliminate S tuples S.E  2 (4) Join matching R,S tuples, project B,D attributes and place in result Notes 6

  16. R S A B C C D E a 1 10 10 x 2 b 1 20 20 y 2 c 2 10 30 z 2 d 2 35 40 x 1 e 3 45 50 y 3 A C I1 I2 Notes 6

  17. =“c” <c,2,10> R S A B C C D E a 1 10 10 x 2 b 1 20 20 y 2 c 2 10 30 z 2 d 2 35 40 x 1 e 3 45 50 y 3 A C I1 I2 Notes 6

  18. =“c” <c,2,10> <10,x,2> R S A B C C D E a 1 10 10 x 2 b 1 20 20 y 2 c 2 10 30 z 2 d 2 35 40 x 1 e 3 45 50 y 3 A C I1 I2 Notes 6

  19. =“c” <c,2,10> <10,x,2> check=2? output: <2,x> R S A B C C D E a 1 10 10 x 2 b 1 20 20 y 2 c 2 10 30 z 2 d 2 35 40 x 1 e 3 45 50 y 3 A C I1 I2 Notes 6

  20. =“c” <c,2,10> <10,x,2> check=2? output: <2,x> next tuple: <c,7,15> R S A B C C D E a 1 10 10 x 2 b 1 20 20 y 2 c 2 10 30 z 2 d 2 35 40 x 1 e 3 45 50 y 3 A C I1 I2 Notes 6

  21. Overview of Query Optimization Notes 6

  22. SQL query parse parse tree convert answer logical query plan execute apply laws statistics Pi “improved” l.q.p pick best estimate result sizes {(P1,C1),(P2,C2)...} l.q.p. +sizes estimate costs consider physical plans {P1,P2,…..} Notes 6

  23. Example: SQL query SELECT title FROM StarsIn WHERE starName IN ( SELECT name FROM MovieStar WHERE birthdate LIKE ‘%1960’ ); (Find the movies with stars born in 1960) Notes 6

  24. Example: Parse Tree <Query> <SFW> SELECT <SelList> FROM <FromList> WHERE <Condition> <Attribute> <RelName> <Tuple> IN <Query> title StarsIn <Attribute> ( <Query> ) starName <SFW> SELECT <SelList> FROM <FromList> WHERE <Condition> <Attribute> <RelName> <Attribute> LIKE <Pattern> name MovieStar birthDate ‘%1960’ Notes 6

  25. Example: Generating Relational Algebra title  StarsIn <condition> <tuple> IN name <attribute> birthdate LIKE ‘%1960’ starName MovieStar Fig. 7.15: An expression using a two-argument , midway between a parse tree and relational algebra Notes 6

  26. Example: Logical Query Plan title starName=name  StarsIn name birthdate LIKE ‘%1960’ MovieStar Fig. 7.18: Applying the rule for IN conditions Notes 6

  27. Example: Improved Logical Query Plan title Question: Push project to StarsIn? starName=name StarsIn name birthdate LIKE ‘%1960’ MovieStar Fig. 7.20: An improvement on fig. 7.18. Notes 6

  28. Example: Estimate Result Sizes Need expected size StarsIn MovieStar P s Notes 6

  29. Example: One Physical Plan Parameters: join order, memory size, project attributes,... Hash join SEQ scan index scan Parameters: Select Condition,... StarsIn MovieStar Notes 6

  30. Example: Estimate costs L.Q.P P1 P2 …. Pn C1 C2 …. Cn Pick best! Notes 6

  31. Textbook outline Chapter 15 5 Algebra for queries [bags vs sets] - Select, project, join, …. [project list a,a+b->x,…] - Duplicate elimination, grouping, sorting 15.1 Physical operators - Scan,sort, … 15.2 - 15.6 Implementing operators + estimating their cost [Ch 5] [15.1] [15.2-15.6] Notes 6

  32. Chapter 16 16.1[16.1] Parsing 16.2[16.2] Algebraic laws 16.3[16.3] Parse tree -> logical query plan 16.4[16.4] Estimating result sizes 16.5-7[16.5-7] Cost based optimization Notes 6

  33. Reading textbook - Chapters 15, 16 Optional: • Sections 15.7, 15.8, 15.9 [15.7, 15.8] • Sections 16.6, 16.7 [16.6, 16.7] Optional: Duplicate elimination operator grouping, aggregation operators Notes 6

  34. Query Optimization - In class order • Relational algebra level • Detailed query plan level Notes 6

  35. Query Optimization - In class order • Relational algebra level • Detailed query plan level • Estimate Costs • without indexes • with indexes • Generate and compare plans Notes 6

  36. Relational algebra optimization • Transformation rules (preserve equivalence) • What are good transformations? Notes 6

  37. Rules:Natural joins & cross products & union R S = S R (R S) T = R (S T) Notes 6

  38. Note: • Carry attribute names in results, so order is not important • Can also write as trees, e.g.: T R R S S T Notes 6

  39. Rules:Natural joins & cross products & union R S = S R (R S) T = R (S T) R x S = S x R (R x S) x T = R x (S x T) R U S = S U R R U (S U T) = (R U S) U T Notes 6

  40. Rules: Selects sp1p2(R) = sp1vp2(R) = Notes 6

  41. Rules: Selects sp1p2(R) = sp1vp2(R) = sp1 [ sp2 (R)] [ sp1 (R)] U [ sp2 (R)] Notes 6

  42. Bags vs. Sets R = {a,a,b,b,b,c} S = {b,b,c,c,d} RUS = ? Notes 6

  43. Bags vs. Sets R = {a,a,b,b,b,c} S = {b,b,c,c,d} RUS = ? • Option 1 SUM • RUS = {a,a,b,b,b,b,b,c,c,c,d} • Option 2 MAX • RUS = {a,a,b,b,b,c,c,d} Notes 6

  44. Option 2 (MAX) makes this rule work:sp1vp2 (R) = sp1(R) U sp2(R) Example: R={a,a,b,b,b,c} P1 satisfied by a,b; P2 satisfied by b,c Notes 6

  45. Option 2 (MAX) makes this rule work:sp1vp2 (R) = sp1(R) U sp2(R) Example: R={a,a,b,b,b,c} P1 satisfied by a,b; P2 satisfied by b,c sp1vp2 (R) = {a,a,b,b,b,c}sp1(R) = {a,a,b,b,b}sp2(R) = {b,b,b,c}sp1(R) U sp2 (R) = {a,a,b,b,b,c} Notes 6

  46. “Sum” option makes more sense: Senators (……) Rep (……) T1 = pyr,state Senators; T2 = pyr,state Reps T1 Yr State T2 Yr State 97 CA 99 CA 99 CA 99 CA 98 AZ 98 CA Union? Notes 6

  47. Executive Decision -> Use “SUM” option for bag unions -> Some rules cannot be used for bags Notes 6

  48. Rules: Project Let: X = set of attributes Y = set of attributes XY = X U Y pxy (R) = Notes 6

  49. Rules: Project Let: X = set of attributes Y = set of attributes XY = X U Y pxy (R) = px [py (R)] Notes 6

  50. Rules: Project Let: X = set of attributes Y = set of attributes XY = X U Y pxy (R) = px [py (R)] Notes 6

More Related