460 likes | 567 Vues
This text delves into the intricacies of query processing in SQL, focusing on the use of relational algebra to optimize execution plans. It highlights the steps for executing a sample query involving two relations, R and S, and elaborates on basic operations such as selection, projection, and joins. The discussion emphasizes the performance implications of various approaches and introduces formal relational query languages, demonstrating how mathematical foundations impact practical SQL execution. Overall, it presents a comprehensive overview for understanding SQL query execution efficiency.
E N D
CS4432: Database Systems II query processing
Query Processing Query in SQL Query Plan in Algebra query processing
Example Data: relation R (A, B, C) relation S (C, D, E) Query: SELECT B, D FROM R, S WHERE R.A = “c” and S.E = 2 and R.C=S.C query processing
SELECT B, D FROM R, S WHERE R.A = “c” and S.E = 2 and R.C=S.C Answer B D 2 x R A B C S C D E a 1 10 10 x 2 b 1 20 20 y 2 c 2 10 30 z 2 d 2 35 40 x 1 e 3 45 50 y 3 query processing
How do we execute query? - Form Cartesian product of all tables in FROM-clause - Select tuples that match WHERE-clause - Project columns that occur in SELECT-clause One idea query processing
Bingo! Got one... R X S R.A R.B R.C S.C S.D S.E a 1 10 10 x 2 a 1 10 20 y 2 . . C 2 10 10 x 2 . . query processing
But ? • Performance would be unacceptable! • We need a better approach to reasoning about queries, their execution orders and their respective costs query processing
Formal Relational Query Languages • Two mathematical Query Languages form basis for “real” languages (e.g. SQL), and for implementation: • Relational Calculus: Lets users describe what they want, rather than how to compute it. (Non-operational, declarative.) • Relational Algebra: More operational, very useful for representing execution plans. query processing
Relational Algebra • Tuple : ordered set of data values • Relation: a set of tuples • Algebra: formal mathematical system consisting of a set of objects and operations on those objects • Relational algebra: Algebra whose objects are relations and operators transform relations into other relations query processing
Relational Algebra ? Query: SELECT B, D FROM R, S WHERE R.A = “c” and S.E = 2 and R.C=S.C query processing
Relational Algebra - can be used to describe plans... Ex: Plan I B,D sR.A=“c” S.E=2 R.C=S.C X R S OR: B,D [sR.A=“c” S.E=2 R.C = S.C (RXS)] query processing
R1 Example Instances “Sailors” and “Reserves” relations S1 S2 query processing
Relational Algebra • Basic operations: • Selection ( ) Selects a subset of rows from relation. • Projection ( ) Deletes unwanted columns from relation. • Cross-product( ) Allows us to combine two relations. • Set-difference ( ) Tuples in reln. 1, but not in reln. 2. • Union( ) Tuples in reln. 1 and in reln. 2. • Additional operations: • Intersection, join, division, renaming: Not essential ! • Algebra is “closed”: Since each operation returns a relation, operationscan be composed ! query processing
Projection query processing
Selection query processing
Union, Intersection, Set Difference • Operate on two union-compatible relations: • Same number of fields. • `Corresponding’ fields have same type. query processing
Cross-Product • Each row of S1 is paired with each row of R1. • Conflict: Both S1 and R1 have a field called sid. • Renaming operator: query processing
Joins • Condition Join : • Result schema same as that of cross-product. query processing
Joins • Equi-Join: condition contains only equalities. • Result schema only one copy of fields for which equality is specified. • Natural Join: Equijoin on all common fields. query processing
Division • Not primitive operator, but useful: Find sailors who have reserved allboats. • A has 2 fields x and y; B has only field y: • A/B = • i.e., A/B contains all x tuples (sailors) such that for everyy tuple (boat) in B, there is an xy tuple in A. query processing
B1 B2 B3 A Examples of Division A/B A/B1 A/B2 A/B3 query processing
Disqualified x values: A/B: all disqualified tuples Expressing A/B Using Basic Operators • Division is useful shorthand. • Idea: For A/B, compute all x values that are not `disqualified’ by some y value in B. query processing
Solution 1: • Solution 2: • Solution 3: Find names of sailors who’ve reserved boat #103 query processing
A more efficient solution: Find names of sailors who’ve reserved a red boat • Information about boat color only available in Boats; so need an extra join: A query optimizer can find this, given the first solution! query processing
What happens if is replaced by in this query? Find sailors who’ve reserved a red or a green boat • Can identify all red or green boats, then find sailors who’ve reserved one of these boats: • Can also define Tempboats using union! (How?) query processing
Find sailors who’ve reserved a red and a green boat • Must identify sailors who’ve reserved red boats, sailors who’ve reserved green boats, then find the intersection (sid is a key for Sailors): query processing
Find names of sailors who’ve reserved all boats • Uses division; schemas of input relations to / must be carefully chosen: • To find sailors who’ve reserved all ‘Interlake’ boats: ..... query processing
Relational Algebra representation used to describe plans... query processing
Example Data: relation R (A, B, C) relation S (C, D, E) Query: SELECT B, D FROM R, S WHERE R.A = “c” and S.E = 2 and R.C=S.C query processing
Relational Algebra - to describe plan Ex: Plan I B,D sR.A=“c” S.E=2 R.C=S.C X R S OR: B,D [sR.A=“c” S.E=2 R.C = S.C (RXS)] query processing
Another idea: Plan II B,D sR.A = “c”sS.E = 2 R S natural join query processing
R S A B C s (R) s(S) C D E a 1 10 A B C C D E 10 x 2 b 1 20 c 2 10 10 x 2 20 y 2 c 2 10 20 y 2 30 z 2 d 2 35 30 z 2 40 x 1 e 3 45 50 y 3 SELECT B,D FROM R,S WHERE R.A = “c” and S.E = 2 and R.C=S.C query processing
Yet another idea: Plan III B,D sS.E = 2 sR.A = “c” R S natural join query processing
Plan III Use R.A and S.C Indexes (1) Use R.A index to select R tuples with R.A = “c” (2) For each R.C value found, use S.C index to find matching tuples (3) Eliminate S tuples S.E 2 (4) Join matching R,S tuples, project B,D attributes and place in result query processing
=“c” <c,2,10> <10,x,2> check=2? output: <2,x> next tuple: <c,7,15> R S A B C C D E a 1 10 10 x 2 b 1 20 20 y 2 c 2 10 30 z 2 d 2 35 40 x 1 e 3 45 50 y 3 A C I1 I2 query processing
Overview of Query Optimization query processing
SQL query parse parse tree convert answer logical query plan execute apply laws statistics Pi “improved” l.q.p pick best estimate result sizes {(P1,C1),(P2,C2)...} l.q.p. +sizes estimate costs consider physical plans {P1,P2,…..} query processing
Example: SQL query Query : Find the movies with stars born in 1960 SELECT title FROM StarsIn WHERE starName IN ( SELECT name FROM MovieStar WHERE birthdate LIKE ‘%1960’ ); query processing
<Query> <SFW> SELECT <SelList> FROM <FromList> WHERE <Condition> <Attribute> <RelName> <Tuple> IN <Query> title StarsIn <Attribute> ( <Query> ) starName <SFW> SELECT <SelList> FROM <FromList> WHERE <Condition> <Attribute> <RelName> <Attribute> LIKE <Pattern> name MovieStar birthDate ‘%1960’ Example: Parse Tree query processing
Example: Generating Relational Algebra title StarsIn <condition> <tuple> IN name <attribute> birthdate LIKE ‘%1960’ starName MovieStar Fig. 16.14: An expression using a two-argument , midway between a parse tree and relational algebra query processing
Example: Logical Query Plan title starName=name StarsIn name birthdate LIKE ‘%1960’ MovieStar Fig. 16.16: Applying the rule for IN conditions query processing
Example: Improved Logical Query Plan title Question: Push project to StarsIn? starName=name StarsIn name birthdate LIKE ‘%1960’ MovieStar Fig. 16.16: An improvement on prev fig query processing
Example: Estimate Result Sizes Need expected size StarsIn MovieStar P s query processing
Example: One Physical Plan Parameters: join order, memory size, project attributes,... Hash join SEQ scan index scan Parameters: Select Condition,... StarsIn MovieStar query processing
Example: Estimate costs L.Q.P P1 P2 …. Pn C1 C2 …. Cn Pick best! query processing
Query Optimization • Relational algebra level … • Detailed query plan level … • Estimate costs • Generate and compare plans query processing