Query Processing

Query Processing

General Overview Relational model - SQL Formal & commercial query languages Functional Dependencies Normalization Physical Design Indexing Query Processing and Optimization

Review Data Retrieval at the physical level: Indices: data structures to help with some query evaluation: SELECTION queries (ssn = 123) RANGE queries (100 <= ssn <=200) Index choices: Primary vs secondary, dense vs sparse, ISAM vs B+-tree vs Extendible Hashing vs Linear Hashing But what about join queries? Or other queries not directly supported by the indices? How do we evaluate these queries? Sometimes, indexes not useful, even for SELECTION queries. When? What decides when to use them? • A: Query Processing (one of the most complex components of a database system)

QP & O SQL Query Query Processor Data: result of the query

QP & O SQL Query Query Processor Parser Query Optimizer Algebraic Expression Execution plan Evaluator Data: result of the query

QP & O Query Optimizer Algebraic Representation Query Rewriter Algebraic Representation Data Stats Plan Generator Query Execution Plan

Query Processing and Optimization Parser / translator (1st step) Input: SQL Query Output: Algebraic representation of query (relational algebra expression) Eg SELECT balance FROM account WHERE balance < 2500 balance(balance2500(account)) or balance balance2500 account Relational Algebra Tree

QP & O Plan Evaluator (last step) Input: Query Execution Plan Output: Data (Query results) Query execution plan Algorithms of operators that read from disk: Sequential scan Index scan Merge-sort join Nested loop join …..

(On-the-fly) sname (On-the-fly) rating > 5 bid=100 (Simple Nested Loops) sid=sid Sailors Reserves SELECT S.sname FROM Reserves R, Sailors S WHERE R.sid=S.sid AND R.bid=100 AND S.rating>5 psname (sbid=100 and rating >5 (Reserves Sailors)) RA Tree: Plan: sname rating > 5 bid=100 sid=sid Sailors Reserves

QP & O Query Rewriting Input: Algebraic representation of query Output: Algebraic representation of query Idea: Apply heuristics to generate equivalent expression that is likely to lead to a better plan e.g.: amount > 2500 (borrower loan) borrower (amount > 2500(loan)) Why is 2nd better than 1st?

QP & O Plan Generator Input: Algebraic representation of query Output: Query execution plan Idea: generate alternative plans on evaluating a query amount > 2500 Estimate cost for each plan Choose the plan with the lowest cost Sequential scan Index scan

(On-the-fly) sname (Sort-Merge Join) sid=sid (Scan; (Scan; write to write to rating > 5 bid=100 temp T2) temp T1) Reserves Sailors Alternative Plans (On-the-fly) sname (On-the-fly) rating > 5 (Index Nested Loops, with pipelining ) sid=sid (Use hash Sailors bid=100 index; do not write result to temp) Reserves Plan 1 Plan 2 Query Rewriting

QP & O Goal: generate plan with minimum cost (i.e., as fast as possible) Cost factors: CPU time (trivial compared to disk time) Disk access time main cost in most DBs Network latency Main concern in distributed DBs Our metric: count disk accesses

Cost Model How do we predict the cost of a plan? Ans: Cost model For each plan operator and each algorithm we have a cost formula Inputs to formulas depend on relations, attributes Database maintains statistics about relations for this (Metadata)

Statistics and Catalogs • Need information about the relations and indexes involved. Catalogstypically contain at least: • # tuples (NTuples) and # pages (NPages) for each relation. • # distinct key values (NKeys) and NPages for each index. • Index height, low/high key values (Low/High) for each tree index. • Catalogs updated periodically. • Updating whenever data changes is too expensive; lots of approximation anyway, so slight inconsistency ok. • More detailed information (e.g., histograms of the values in some field) are sometimes stored.

Metadata Given a relation r, DBMS likely maintains the following metadata: Size (# of tuples) nr Size (# of blocks) br Block size (#tuples) fr (typicallybr =nr / fr ) Tuple size (in bytes) sr Attribute Variance (for each attribute r, # of different values) V(att, r) Selection Cardinality (for each attribute in r, expected size of a selection: att = K (r ) ) SC(att, r)

Example V(balance, account) = 3 V(acct_no, account) = 6 S(balance, account) = 2 ( nr / V(att, r)) naccount = 6 saccount = 33 bytes faccount =4K/33

Some typical plans and their costs A1 (linear search). Scan each file block and test all records to see whether they satisfy the selection condition. Cost estimate (number of disk blocks scanned) = br br denotes number of blocks containing records from relation r If selection is on a key attribute, cost = (br /2) stop on finding record (on the average in the middle of the file) Linear search can be applied regardless of selection condition or ordering of records in the file, or availability of indices Query: att = K (r )

Selection Operation (Cont.) A2 (binary search). Applicable if selection is an equality comparison on the attribute on which file is ordered. Requires that the blocks of a relation are stored contiguously Cost estimate: log2(br) — cost of locating the first tuple by a binary search on the blocks Plus number of blocks containing records that satisfy selection condition EA2 = log2(br) + sc(att, r) / fr -1 Query: att = K (r ) What is the cost if att is a key? EA2 = log2(br)

Example Query: bname =“Kenmore” ( account ) V(bname, account) = 50 naccount = 10K faccount = 20 tuples/block Primary index on bname Key: acct_no Cost Estimates: A1: EA1 = naccount / faccount = 500 I/O’s A2: EA2 = log2(br) + sc(att, r) / fr -1 = 9 + 9 = 18 I/O’s

More Plans for selection What if there is an index on att? We need metadata on size of index (i). DBMS keeps that of: Index height: HTi Index “Fan Out”: fi Average #of children per node (not same as order..) Index leaf nodes: LBi Note: HTi ~  logfi(LBi)  + 1 example: LBi =64, fi=4

More Plans for selection A3: Index scan, Primary Index What: Follow primary index, searching for key K Prereq: Primary index on att, i Cost: EA3 = HTi+ 1, if att is a candidate key EA3 = HTi+ SC(att, r) / fr, if not Query: att = K (r )

A4: Index scan, Secondary Index What: Follow according index, searching for key K Prereq: Secondary index on att, i Cost: if att not a key: EA4 = HTi+ 1 + SC(att, r) Else, if att is a key: EA4 = HTi+ 1 bucket read Index block reads File block reads (in worst case, each tuple on different block)

HTi ... k, ... k ... k k k ...

Cardinalities Cardinality: the size (number of tuples) in the query result Why do we care? Ans: Cost of every plan depends on nr e.g. Linear scan: br =  nr / fr Primary Index: HTi +1 ~ logfi(LBi) +2 ≤ logfi(nr / fr)+2 But, what if r is the result of another query? Must know the size of query results as well as cost Size of att = K (r ) ? SC(att, r)

Query: att = K (r ) • A4: Index scan, Secondary Index What: Follow according index, searching for key K Prereq: Secondary index on att, i Cost: if att not a key: EA4 = HTi+ 1 + SC(att, r) Else, if att is a key: EA4 = HTi+ 1 bucket read Index block reads File block reads (in worst case, each tuple on different block)

HTi ... k, ... k ... k k k ...

Selections Involving Comparisons Query: Att  K (r ) • A5 (primary index, comparison). (Relation is sorted on Att) • For Att  K(r) use index to find first tuple  v and scan relation sequentially from there • For AttK(r) just scan relation sequentially till first tuple > v; do not use index Cost: EA5 =HTi + c / fr (where c is the cardinality of result) HTi k ... k

Query: Att  K (r ) Cardinality: More metadata on r are needed: min (att, r) : minimum value of att in r max(att, r): maximum value of att in r Then the selectivity of Att = K (r ) is estimated as: (or nr /2 if min, max unknown) Intuition: assume uniform distribution of values between min and max min(attr, r) K max(attr, r)

Plan generation: Range Queries Att K (r ) A6: (secondary index, comparison). Cost: EA6 = HTi -1+ #of leaf nodes to read + # of file blocks to read = HTi -1+ LBi * (c / nr) + c, if att is a candidate key HTi ... k+m k, k+1 k+1 ... k+m k

Plan generation: Range Queries A6: (secondary index, range query). If att is NOT a candidate key HTi ... k+m k, k+1 ... k ... k k+1 k+m ...

Cost: EA6 = HTi -1+ #of leaf nodes to read + #of file blocks to read +#buckets to read = HTi -1+ LBi * (c / nr) + c + x

Query Processing