Query Processing – Query Trees

# Query Processing – Query Trees

Télécharger la présentation

## Query Processing – Query Trees

- - - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - - -
##### Presentation Transcript

1. Query Processing – Query Trees

2. Evaluation of SQL • Conceptual order of evaluation • Cartesian product of all tables in from clause • Rows not satisfying where clauses eliminated • Rows grouped by • Groups not satisfying having eliminated • Select clause target list evaluated • If distinct eliminate duplicate rows • Union after each subselect evaluated • Rows sorted in order by

3. Actual Order of Evaluation • Order chosen by query optimizer • Determines most efficient way • Syntax checking phase • scan - identify tokens of text • parse- check syntax • validate - check attributes and relation names • Query optimization phase • create internal representation - query tree • identify execution strategies - plan • Maintains statistics for tables and columns, indexes • choose suitable plan in query to optimize query e.g.  order of execution of ops, use indexes, etc.

4. Evaluation cont’d • Execution phase • query optimizer produces execution plan • code generator generates code • runtime db processor runs query code • To minimize run time • chosen strategy NOT optimal, but reasonably efficient For procedural languages limited need for query optimization

5. Heuristics in Query Optimization • Apply select and project before join or other binary operations. Why? • select and project reduce size • strategy is obvious, but challenge was to show could be done with rules

6. Query Optimization • Canonical form (initial query tree - conceptual order of evaluation) • Leaf nodes are tables • Internal nodes are operations • Begin by separating select conditions from joins (end up with X) • Combine all selects then all projects • transform to final query tree using general transformation rules for relational algebra

7. Query tree for SQL query Select lname From employee, works_on, project Where pname=‘Aquarius’ and pnumber=pno and essn=ssn and bdate > ‘1957-12-31’

8. General Transformation Rules for Relational Algebra • Cascade of s • Commutativity of s • Cascade of p • Commuting s with p • Commutativity of |X| (and X) • Commuting s with |X| (or X) • Commuting p with |X| (or X)

9. General Transformation Rules for Relational Algebra 8. Commutativity of set operations U and ∩ 9. Associativity of |X|, X, U and ∩ 10. Commuting s with set operations 11. The p operation commuted with U 12. Converting a (s, X) sequence into |X|

10. Outline of a Heuristic Algebraic Optimization Algorithm • Use Rule 1 break up conjunctive s’s into cascades of s’s • Use Rules 2,4,6, 10 for commutativity of s to: • move s’s as far down tree as possible • Use Rules 5 and 9 for commutativity and associativity of binary operations to: • Place most restrictive s (and |X|) so executed first • fewest tuples, smallest absolute size or smallest selectivity • But make sure no cartesian products result

11. Outline of a Heuristic Algebraic Optimization Algorithm • Use Rule 12, combine Cartesian product with s to: • create |X| • Use Rules 3, 4, 7, 11 concerning cascade of p’s and commuting p with other ops to: • move down tree as far as possible • Identify subtrees that represent groups of operations than can be executed by a single algorithm

12. Summary of Heuristics • Apply first operations that reduce size of intermediate results • Perform s’s and p’s as early as possible (move down tree as far as possible) • Execute most restrictive s and |X| first (reorder leaf nodes but avoid cartesian products)

13. Multiple table joins • Multiple table joins • Query plan identifies most efficient ordering of joins • may not have to materialize actual table resulting from join • instead use pipelining - successive rows output from one step fed into next plan

14. Converting trees into Query plans • pipelined evaluation • Forward result from an operation directly to next operation • Result from s, placed in buffer • |X| consumes tuples from buffer • Result from |X| pipelined to p

15. Query Tree Question • Should we do a ppname, pnumber then spname = ‘Aquarius’ then ppnumber ? • No, since the operations are done together • the processor would read a row of project, see if pname = ‘Aquarius’ then use pnumber to perform the join.

16. Algorithms • DBMS has general access algorithms to implement select, join, or combinations of ops • Don't have separate access routines for each op • Creating temporary files is inefficient • Generate algorithms for combinations of operations • join, select, project - code is created dynamically to implement multiple operations

17. Materialized table • Think about what operations require utilizing a materialized table • Input to select? • Input to project? • Input to join? • How is this implemented? Next topic