Query Evaluation

Donghui Zhang Northeastern University Query Evaluation Partially using Prof. Hector Garcia-Molina’s slides (Notes06, Notes07) http://www-db.stanford.edu/~ullman/dscb.html

Server SELECT E.Name FROM Emp E WHERE E.SSN<5000 AND E.Age>50 Michael Jordan Donghui Zhang Query Evaluation SQL Query  Query Result ??? • Check the data and • meta data; • Produce query result Michael Jordan Donghui Zhang

Query Evaluation Steps • Query Compiling: get logical Q.P. • Query Optimization: choose a physical Q.P. • Query Execution: execute

query compiling query execution query optimization answer SQL query parse parse tree convert logical query plan execute apply laws statistics Pi “improved” l.q.p pick best estimate result sizes {(P1,C1),(P2,C2)...} l.q.p. +sizes estimate costs consider physical plans {P1,P2,…..}

SELECT E.Name FROM Emp E WHERE E.SSN<5000 AND E.Age>50 Query Compiling  Parse • Background knowledge: Grammar. • Input: SQL query. • Output: a parse tree. • Start with a simple grammar: • Only SFW (no group by, having, nested query) • Simple AND condition (no OR, UNION, EXISTS, IN, …) • One table (no conditions like E.did=D.did)

SELECT E.Name FROM Emp E WHERE E.SSN<5000 AND E.Age>50 Query Compiling  Parse  Grammar • <SFW> := SELECT <SelList> FROM <Table> WHERE <CondList> • <SelList> := <Attribute> | <Attribute>, <SelList> • <CondList> := <Condition> | <Condition> AND <CondList> • <Condition> := <Attribute> <op> <value> • <op>:= > | < | = | >= | <=

SELECT E.Name FROM Emp E WHERE E.SSN<5000 AND E.Age>50 <Attribute> Emp E <Condition> AND <CondList> <Condition> E.Name <Attribute> <op> <value> E.SSN < 5000 <Attribute> <op> <value> E.Age > 50 Query Compiling  Parse  Parse Tree <SFW> SELECT <SelList> FROM <Table> WHERE <CondList>

SELECT E.Name FROM Emp E WHERE E.SSN<5000 AND E.Age>50 E.Name E.SSN<5000 AND E.Age>50 Emp E Query Compiling  Convert • Input: a parse tree. • Output: a logical query plan. • Algorithm:  followed by . • E.Name(E.SSN<5000 AND E.Age>50(E) ) • Alternatively, a l.q.p tree.

SELECT E.Name FROM Emp E WHERE E.SSN<5000 AND E.Age>50 E.Name E.SSN<5000 AND E.Age>50 Emp E Query Compiling  Apply Laws • Replace  with , push  [and ] down. • Only used for multiple tables. So skip.

SELECT E.Name FROM Emp E WHERE E.SSN<5000 AND E.Age>50 Query Optimization  Estimate Result Sizes • The size of each input table is stored as meta data. • Intermediate result: size not known, but needed to estimate I/O cost of physical plan. • But for the simple case,  can be evaluated on the fly. So no need to estimate the size of . So skip.

SELECT E.Name FROM Emp E WHERE E.SSN<5000 AND E.Age>50 E.Name E.SSN<5000 AND E.Age>50 Emp E Query Optimization  Consider Physical Plans • Associate each RA operator with an implementation scheme. • Multiple implementation schemes? Enumerate all. Plan 1 (always work!) on-the-fly scan

SELECT E.Name FROM Emp E WHERE E.SSN<5000 AND E.Age>50 SSN index 1 2 3 4 5 6 Query Optimization  Consider Physical Plans • For the other physical plans, need to know what indices exist. • Primary index: controls the actual storage of a table. • Suppose a primary B+-tree index exists on SSN. • Secondary index: built on some other attribute. Does not store the actual record. Each leaf entry stores a set of page IDs in the primary index. • Suppose a secondary B+-tree index exists on Age. e.g. entry in Age index: Age=50, pageIDs= {1, 4, 6}

SELECT E.Name FROM Emp E WHERE E.SSN<5000 AND E.Age>50 E.Name E.SSN<5000 AND E.Age>50 Emp E Query Optimization  Consider Physical Plans Plan 2 on-the-fly range search in SSN index

SELECT E.Name FROM Emp E WHERE E.SSN<5000 AND E.Age>50 E.Name E.SSN<5000 AND E.Age>50 Emp E Query Optimization  Consider Physical Plans Plan 3 on-the-fly range search in Age index, follow pointers to SSN index

SELECT E.Name FROM Emp E WHERE E.SSN<5000 AND E.Age>50 Query Optimization  Estimate Costs • Estimate #I/Os for each physical plan. • Pick the cheapest one. • Input: physical plan. • Additional input: • meta data (e.g. how many levels a B+-tree has) • assumptions (e.g. the root node of every B+-tree is pinned) • memory buffer size.

SELECT E.Name FROM Emp E WHERE E.SSN<5000 AND E.Age>50 Query Optimization  Estimate Costs  Meta Data • All the database tables. • For each table R: • Schema • T(R): #records in R • For every attribute A: • V(R, A): #distinct values of A • min(R, A): minimum value of A • max(R, A): maximum value of A • Primary index: #levels, #leaf nodes. • Secondary index: #levels, #leaf nodes, average #pageIDs per leaf entry.

SELECT E.Name FROM Emp E WHERE E.SSN<5000 AND E.Age>50 Query Optimization  Estimate Costs  sample input • Assume for table E: • Schema = (SSN: int, Name: string, Age: int, Salary: int) • T(E) = 100 tuples. • For attribute SSN: • V(E, SSN)=100, min(E, SSN)=0000, max(E, SSN)=9999 • For attribute Age: • V(E, Age)=20, min(E, Age)=21, max(E, Age)=60 • Primary index on SSN: 3 level B+-tree, 50 leaf nodes. • Secondary index on Age: 2 level B+-tree, 10 leaf nodes, every leaf entry points to 3.5 pageIDs (on average). • Assumptions: all B+-tree roots are pinned. Can reach the first leaf page of a B+-tree directly. • Memory buffer size: 2 pages.

SELECT E.Name FROM Emp E WHERE E.SSN<5000 AND E.Age>50 E.Name E.SSN<5000 AND E.Age>50 Emp E Query Optimization  Estimate Costs • Cost = 50. (The primary index has 50 leaf nodes. Assume we can reach the first leaf page of a B+-tree directly.) Plan 1 (always work!) on-the-fly scan

SELECT E.Name FROM Emp E WHERE E.SSN<5000 AND E.Age>50 E.Name E.SSN<5000 AND E.Age>50 Emp E Query Optimization  Estimate Costs • Cost = 25. SSN<5000 selects half of the employees, so 50/2=25 leaf nodes. • Note: if condition is E.SSN>5000, needs 1 more I/O. Plan 2 on-the-fly range search in SSN index

SELECT E.Name FROM Emp E WHERE E.SSN<5000 AND E.Age>50 #I/Os in the SSN index #I/Os in the Age index E.Name E.SSN<5000 AND E.Age>50 Emp E Query Optimization  Estimate Costs • Cost = 10/4 +  20/4 * 3.5 = 21. Plan 3 on-the-fly range search in Age index, follow pointers to SSN index

SELECT E.Name FROM Emp E WHERE E.SSN<5000 AND E.Age>50 Check 1/4 of them, since [51,60] is 1/4 of [21,60]. Age index has 10 leaf nodes. E.Name E.SSN<5000 AND E.Age>50 Emp E Query Optimization  Estimate Costs • Cost = 10/4 +  20/4 * 3.5 = 21. Plan 3 on-the-fly range search in Age index, follow pointers to SSN index

SELECT E.Name FROM Emp E WHERE E.SSN<5000 AND E.Age>50 times 3.5 (#pageIDs per page) to get #I/Os in the SSN index. 20 distinct ages divided by 4 to get #ages in [51,60]. E.Name E.SSN<5000 AND E.Age>50 Emp E Query Optimization  Estimate Costs • Cost = 10/4 +  20/4 * 3.5 = 21. Plan 3 on-the-fly range search in Age index, follow pointers to SSN index

SELECT E.Name FROM Emp E WHERE E.SSN<5000 AND E.Age>50 Pick! Query Optimization  Pick Best

Another case study: two tables. • Extended grammar: • Only SFW (no group by, having, nested query) • Simple AND condition (no OR, UNION, EXISTS, IN, …) • Allow two tables (allow conditions like E.did=D.did) • Example query: SELECT E.Name, D.Dname FROM Emp E, Dept D WHERE E.Did=D.Did AND E.SSN<5000 AND D.budget=1000

SELECT E.Name, D.Dname FROM Emp E, Dept D WHERE E.Did=D.Did AND E.SSN<5000 AND D.budget=1000 Query Compiling  Parse  Grammar • <SFW> := SELECT <SelList> FROM <TableList> WHERE <CondList> • <SelList> := <Attribute> | <Attribute>, <SelList> • <TableList> := <Table> | <Table>, <Table> • <CondList> := <Condition> | <Condition> AND <CondList> • <Condition> := <Attribute> <op> <value> | <Attribute> = <Attribute> • <op>:= > | < | = | >= | <=

SELECT E.Name, D.Dname FROM Emp E, Dept D WHERE E.Did=D.Did AND E.SSN<5000 AND D.budget=1000 <Attribute> , <SelList> E.Name <Attribute> D.Dname Query Compiling  Parse  Parse Tree <SFW> SELECT <SelList> FROM <TableList> WHERE <CondList>

SELECT E.Name, D.Dname FROM Emp E, Dept D WHERE E.Did=D.Did AND E.SSN<5000 AND D.budget=1000 <Table> <Table> , Emp E Dept D Query Compiling  Parse  Parse Tree <SFW> SELECT <SelList> FROM <TableList> WHERE <CondList>

SELECT E.Name, D.Dname FROM Emp E, Dept D WHERE E.Did=D.Did AND E.SSN<5000 AND D.budget=1000 <Condition> AND <CondList> <Attribute> = <Attribute> <Condition> AND <CondList> E.Did D.Did <Condition> Query Compiling  Parse  Parse Tree <SFW> SELECT <SelList> FROM <TableList> WHERE <CondList>

SELECT E.Name, D.Dname FROM Emp E, Dept D WHERE E.Did=D.Did AND E.SSN<5000 AND D.budget=1000 E.Name, D.Dname E.Did=D.Did AND E.SSN<5000 AND D.budget=1000  Emp E Dept D Query Compiling  Convert • Algorithm:  then  then . • E.Name. D.Dname(E.Did=D.Did AND E.SSN<5000 AND D.budget=1000(ED) ) • The l.q.p tree:

SELECT E.Name, D.Dname FROM Emp E, Dept D WHERE E.Did=D.Did AND E.SSN<5000 AND D.budget=1000 E.Name, D.Dname E.Did=D.Did AND E.SSN<5000 AND D.budget=1000  Emp E Dept D Query Compiling  Apply Laws • Always always: (try to) replace  with !

SELECT E.Name, D.Dname FROM Emp E, Dept D WHERE E.Did=D.Did AND E.SSN<5000 AND D.budget=1000 Query Compiling  Apply Laws • Always always: (try to) replace  with ! • Also, push  down. E.Name, D.Dname E.SSN<5000 AND D.budget=1000 Emp E Dept D

SELECT E.Name, D.Dname FROM Emp E, Dept D WHERE E.Did=D.Did AND E.SSN<5000 AND D.budget=1000 Query Compiling  Apply Laws • Always always: (try to) replace  with ! • Also, push  down. E.Name, D.Dname E.SSN<5000 D.budget=1000 Emp E Dept D

SELECT E.Name, D.Dname FROM Emp E, Dept D WHERE E.Did=D.Did AND E.SSN<5000 AND D.budget=1000 Query Compiling  Apply Laws Theory Behind • Let p = predicate with only E attributes q = predicate with only D attributes m = E & D’s common attributes are equal • We have: pqm (E  D) = p(E) q(D)

SELECT E.Name, D.Dname FROM Emp E, Dept D WHERE E.Did=D.Did AND E.SSN<5000 AND D.budget=1000 Query Optimization  Consider Physical Plans • Because join is so important, let’s skip result size estimation for now, and let’s assume selections are not pushed down. E.Name, D.Dname E.SSN<5000 AND D.budget=1000 Emp E Dept D

Four Join Algorithms • Iteration join (nested loop join) • Merge join • Hash join • Join with index

Example E D over common attribute Did • E: • T(E)=10,000 • primary index on SSN, 3 levels. • |E|= 1,000 leaf nodes. • D: • T(D)=5,000 • primary index on Did. 3 levels. • |D| = 500 leaf nodes. • Memory available = 101 blocks

Iteration Join • for every block in E • scan through D; • join records in the E block with records in the D block. • I/O cost = |E| + |E| * |D| = • 1000 + 1000*500 = 501,000. • Works good for small buffer (e.g. two blocks).

Can we do better? Use our memory (1) Read 100 blocks of E (2) Read all of D (using 1 block) + join (3) Repeat until done • I/O cost = |E| + |E|/100 * |D| = • 1000 + 10*500 = 6,000.

Can we do better? Reverse join order: D E. i.e. For every 100 D blocks, go through E. • I/O cost = |D| + |D|/100 * |E| = • 500 + 5*1000 = 5,500.

Merge join (conceptually) (1) if R1 and R2 not sorted, sort them (2) i  1; j  1; While (i  T(R1))  (j  T(R2)) do if R1{ i }.C = R2{ j }.C then outputTuples else if R1{ i }.C > R2{ j }.C then j  j+1 else if R1{ i }.C < R2{ j }.C then i  i+1

Procedure Output-Tuples While (R1{ i }.C = R2{ j }.C) (i  T(R1)) do [jj  j; while (R1{ i }.C = R2{ jj }.C) (jj  T(R2)) do [output pair R1{ i }, R2{ jj }; jj  jj+1 ] i  i+1 ]

Example i R1{i}.C R2{j}.C j 1 10 5 1 2 20 20 2 3 20 20 3 4 30 30 4 5 40 30 5 50 6 52 7

Merge Join Cost • Recall that |E|=1000, |D|=500. And |D| is already sorted on Did. • External sort E: pass 0, by reading and writing E, produces a file with 10 sorted runs. Another read is enough. • No need to write! Can pipeline to join operator. • Cost = 3*1000 + 500 = 3,500.

Hash join (conceptual) • Hash function h, range 0  k • Buckets for R1: G0, G1, ... Gk • Buckets for R2: H0, H1, ... Hk Algorithm (1) Hash R1 tuples into G buckets (2) Hash R2 tuples into H buckets (3) For i = 0 to k do match tuples in Gi, Hi buckets

Simple example hash: even/odd R1 R2 Buckets 2 5 Even 4 4 R1 R2 3 12 Odd: 5 3 8 13 9 8 11 14 2 4 8 4 12 8 14 3 5 9 5 3 13 11

Hash Join Cost • Read + write both E and D for partitioning, then read to join. • Cost = 3 * (1000 + 500) = 4,500.

Query Evaluation