This comprehensive overview explores the intricacies of query execution in database management systems. It delves into the distinctions between logical and physical query plans, detailing the role of relational algebra, logical operators, and physical operations such as table scans and joins. We analyze the efficiency of various physical operators and the cost associated with I/O access. Additionally, the text discusses sorting algorithms and the impact of memory management on processing queries. Suitable for students and professionals alike, this guide demystifies the algorithms that underpin effective data retrieval.
CS 440 Database Management Systems Query Execution
Queries • SELECT … FROM … WHERE … • Describes what data items we want • Logical query plan • No implementation! • The RDBMS converts this to a physical query plan, which describes how to execute the logical query plan.
Logical Query Plan • Logical Operators • Relational Algebra • Selection, Projection, Join, Union, … • Limited number of operators
Physical Query Plan • Physical operators • Algorithms that implement logical operators. • Logical operator: Join • Hash join, index join, nested loop join, … • Various physical operators for each logical operator
Physical Query Plan • Some physical operators implement operations that are only implicit in the logical plan • Table scan • Let’s study them to find the most efficient one for a given setting.
Basic Physical Operator: Scan • Read all tuples of relation R, or only those that satisfy some predicate. • Table-scan: R resides in secondary storage; read its blocks one by one. • Index-scan: if there is an index on R, use the index to find the blocks. • More efficient for finding tuples that satisfy some predicate
Physical Operators: Making Choices • Logical query plan: what we want. • Physical query plan: how to get it done. • More than one physical plan for each logical plan • How to pick the right one? • We study the costs of physical operators.
Cost Statistics • I/O access is the dominant cost of a physical operator. • B(R): number of blocks of R. • T(R): number of tuples in R. • M: number of blocks that fit in main memory • V(R, A): number of distinct values in attribute A. • Table-scan: B(R)
Sort-Scan • SELECT … FROM … WHERE … ORDER BY … • SELECT * FROM Beers ORDER BY name • Used in other physical operators • R has an index on the sorting attribute? • R fits in main memory? • R is too large and does not have an index? • Two-pass, multi-way merge sort algorithm
Divide and Conquer • Physically divide the data into smaller chunks that fit in memory • Sort each chunk in memory • Merge
Two Pass, Multi-way Merge Sort • Phase 1: Read R in groups of M blocks, sort each group in memory, and write it to disk as a sorted run of size M. [Diagram: M blocks of main memory between the input blocks and the output runs on disk]
Two Pass, Multi-way Merge Sort • Phase 2: Merge M – 1 runs at a time and write the results to disk. • Read one block from each run. • Keep one block for the output. [Diagram: M – 1 input buffers, one per run, plus one output buffer in main memory]
Two Pass, Multi-way Merge Sort • Requirement: B(R) <= M(M – 1), or simply B(R) <= M²
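The two phases above can be sketched in Python. This is a simplified in-memory simulation, not a real disk-based implementation: "blocks" are modeled as Python lists of records, and `heapq.merge` plays the role of the (M – 1)-way merge of phase 2.

```python
import heapq

def two_pass_sort(blocks, M):
    """Two-pass, multi-way merge sort over a list of 'blocks'.

    Each block is a list of records; M is the number of blocks that
    fit in main memory. Requires B(R) = len(blocks) <= M * (M - 1).
    """
    assert len(blocks) <= M * (M - 1), "relation too large for two passes"

    # Phase 1: read M blocks at a time, sort in memory, write a run.
    runs = []
    for i in range(0, len(blocks), M):
        run = sorted(t for block in blocks[i:i + M] for t in block)
        runs.append(run)          # in a real DBMS the run goes back to disk

    # Phase 2: merge up to M - 1 runs at once (one buffer is the output).
    assert len(runs) <= M - 1
    return list(heapq.merge(*runs))

blocks = [[9, 3], [7, 1], [8, 2], [4, 6], [5, 0]]
print(two_pass_sort(blocks, M=3))  # -> [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```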
Cost of Sort • 2B(R) in the first pass + B(R) in the second pass: 3B(R) in total. • We do not consider the cost of writing the output in physical operators. • The results may be pipelined and never written to disk. • If the output is written to disk: 4B(R).
Cost of Sort • Review: • Table-scan: B(R) • Sort-scan: 3B(R) (if R does not have an index). • Requirement: the relation is clustered. • R is stored in B(R) blocks. • If R is unclustered, its tuples are mixed with tuples from other relations. • Table-scan: T(R). • Sort-scan: T(R) + 2B(R).
Iterator Model • Each physical operator implements three functions: • Open: starts the operation and initializes the data structures. • GetNext: returns the next tuple in the result. • Close: ends the operation and frees the resources. • It enables pipelining • Other option: compute the result of the operator in full and store it on disk or in memory: inefficient. • Pipelining is not always possible: e.g., sort-scan
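The Open/GetNext/Close interface can be sketched as follows. The class and method names are illustrative, not from any particular system; the point is that `Select` pulls one tuple at a time from its child, so tuples flow through the plan without being materialized.

```python
class TableScan:
    """Iterator-model leaf operator: open / get_next / close."""
    def __init__(self, tuples):
        self.tuples = tuples
    def open(self):
        self.pos = 0
    def get_next(self):
        if self.pos >= len(self.tuples):
            return None                # end of stream
        t = self.tuples[self.pos]
        self.pos += 1
        return t
    def close(self):
        self.pos = None

class Select:
    """Pipelined selection: pulls tuples from its child one at a time."""
    def __init__(self, child, predicate):
        self.child, self.predicate = child, predicate
    def open(self):
        self.child.open()
    def get_next(self):
        while (t := self.child.get_next()) is not None:
            if self.predicate(t):
                return t
        return None
    def close(self):
        self.child.close()

plan = Select(TableScan([1, 5, 2, 8, 3]), lambda t: t > 2)
plan.open()
out = []
while (t := plan.get_next()) is not None:
    out.append(t)
plan.close()
print(out)  # -> [5, 8, 3]
```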
Operator Categorization • Based on the access method used: • Sorting-based. • Index-based. • Hash-based. • Based on the number of I/O stages: • One pass: reading the blocks only once. Used when the entire relation fits in main memory. • Two pass: relations may not fit in main memory, but they are not extremely large: e.g., two-pass merge sort. • Multi-pass: extremely large relations.
One-pass Algorithms • Selection σ(R) and projection π(R). • Both can be pipelined. • We need only one block in main memory: M >= 1 • Cost for both is B(R) • No index on the selection attribute • R is clustered on disk. [Diagram: input buffer → selection/projection → output buffer]
Duplicate Elimination: δ(R) • COUNT(DISTINCT …) • Keep the set of distinct tuples seen so far in main memory • Main-memory hash table, search tree, … • Requirement: M >= B(δ(R)) • We have to estimate the number of distinct tuples in R effectively to plan ahead. • Efficient (probabilistic) estimation algorithms are available. • Cost: B(R) for a clustered relation with no index.
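As a minimal sketch of this one-pass algorithm, again modeling blocks as Python lists: a hash set holds every distinct tuple seen so far, and each tuple is emitted the first time it appears, so the operator can be pipelined.

```python
def distinct(blocks):
    """One-pass duplicate elimination: keep all tuples seen so far in an
    in-memory hash set; requires the distinct tuples to fit in memory."""
    seen = set()
    for block in blocks:          # one pass: each block is read once
        for t in block:
            if t not in seen:
                seen.add(t)
                yield t           # pipelined: emit as soon as it is seen

print(list(distinct([[1, 2, 1], [3, 2], [1, 4]])))  # -> [1, 2, 3, 4]
```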
Grouping and Aggregation: γA, SUM(B)(R) • SELECT sum(price) FROM Sell GROUP BY beer • Keep the list of distinct beers and their sum(price) values in main memory. • The number of distinct beers must not exceed the size of main memory. • Cost: B(R) • Can we use pipelining?
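A sketch of one-pass grouping, with hypothetical sample data in the spirit of the Sell example: a dictionary keeps one running SUM per group. Note that no group's result is final until the whole input has been read, which is why, unlike selection, this operator cannot emit output early.

```python
def group_sum(rows):
    """One-pass grouping gamma_{beer, SUM(price)}: a running SUM per group
    in an in-memory dict. The distinct groups must fit in main memory.
    Output is only available after all input is consumed (no pipelining)."""
    sums = {}
    for beer, price in rows:
        sums[beer] = sums.get(beer, 0) + price
    return sums

rows = [("Bud", 3), ("Miller", 4), ("Bud", 5)]   # hypothetical Sell tuples
print(group_sum(rows))  # -> {'Bud': 8, 'Miller': 4}
```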
One-pass Algorithms for Set/Bag Operators R ∩ S, R ∪ S, R – S • Bag union? • Just read and combine the tuples. • We need at least one buffer block in main memory. • Cost: B(R) + B(S)
General One-pass Algorithm for Set Operators • Read the smaller relation (R) into main memory. • Build a search data structure over the tuples of R in main memory: hash table, … • Read the blocks of S and probe its tuples against R’s. • Requirement: M >= B(R). • Cost: B(R) + B(S).
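The build-then-probe pattern can be sketched for one representative operator, set intersection (the other set operators differ only in what is emitted during the probe). Blocks are again modeled as Python lists, a simplification.

```python
def one_pass_intersect(r_blocks, s_blocks):
    """General one-pass set operator, shown for intersection: build a
    hash table over the smaller relation R (requires M >= B(R)), then
    stream the blocks of S and probe. Cost: B(R) + B(S)."""
    build = set()
    for block in r_blocks:          # build phase: R must fit in memory
        build.update(block)
    for block in s_blocks:          # probe phase: S is streamed
        for t in block:
            if t in build:
                build.discard(t)    # emit each match once (set semantics)
                yield t

print(list(one_pass_intersect([[1, 2], [3]], [[2, 4], [3, 2]])))  # -> [2, 3]
```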
One-pass Algorithms for Set Operators • Set union? • Set intersection? • Set difference? • Join, product, bag difference, bag intersection • See the textbook
Nested Loop Join: One-and-a-half-pass Algorithm • Tuple-based nested loop join R ⋈ S
for each tuple r in R:
    for each tuple s in S:
        if r and s match, output (r, s)
• R: outer relation, S: inner relation. • We need to fit at least two blocks in main memory • Cost: T(R)T(S) • Can be pipelined using the iterator model.
Block-based Nested Loop Join
for each group b of (M – 1) blocks of S:
    for each block a of R:
        for each tuple s in b:
            for each tuple r in a:
                if r and s match, output (r, s)
• Cost: • Read B(S) blocks from S. • Read B(S)B(R) / (M – 1) blocks from R. • It is better for S to be the smaller relation.
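The loop structure above can be sketched directly, joining tuples of the hypothetical form (key, value) on their key and modeling blocks as lists:

```python
def block_nested_loop_join(s_blocks, r_blocks, M):
    """Block-based nested loop join over (key, value) tuples.
    S is the outer relation: for each group of M - 1 blocks of S,
    scan all of R. Disk cost: B(S) + B(S) * B(R) / (M - 1)."""
    out = []
    for i in range(0, len(s_blocks), M - 1):
        # keep M - 1 blocks of S in memory at once
        chunk = [s for block in s_blocks[i:i + M - 1] for s in block]
        for block in r_blocks:
            for r in block:
                for s in chunk:
                    if r[0] == s[0]:          # match on the join attribute
                        out.append((r[0], r[1], s[1]))
    return out

S = [[(1, "a")], [(2, "b")]]
R = [[(2, "x"), (3, "y")], [(1, "z")]]
print(block_nested_loop_join(S, R, M=3))
```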
Two-pass Algorithms • We study them according to their access method • Sorting-based Algorithm. • Index-based algorithms. • Hash-based algorithms.
Sorting-based Algorithms: Duplicate Elimination δ(R) • Use the two-pass, multi-way merge sort method • Stage 1: Read R in groups of M blocks, sort, and write them to disk. • Stage 2: Merge M – 1 runs at a time and output only distinct tuples. • Cost: 3B(R) • Requirement: B(R) <= M² • For the one-pass algorithm: cost = B(R), requirement B(R) <= M.
Sorting-based Algorithms: Grouping γA, SUM(B)(R) • Similar to duplicate elimination. • Sort groups of M blocks and compute SUM for tuples in the same group. • Merge M – 1 runs and update the value of SUM. • Cost: 3B(R) • Requirement: B(R) <= M²
Sorting-based Algorithms: Set Operators • R ∩ S, R ∪ S, R – S • Stage 1: Divide R and S into sorted runs of size M. • Stage 2: Merge the n runs of R and m runs of S and output the results, where n + m <= M • Cost: 3B(R) + 3B(S) • Requirement: B(R) + B(S) <= M²
Two-pass Join Algorithm based on Sorting • One-pass join algorithm: min(B(R), B(S)) < M • Read and keep the smaller relation in main memory; go over the larger relation block by block. • Cost: B(R) + B(S) • Nested loop join algorithm: min(B(R), B(S)) > M • Read M – 1 blocks of the smaller relation into main memory; read the other relation block by block • Cost: B(R)B(S)/M • Too slow.
Two-pass Join Algorithm based on Sorting • Stage 1: Sort R and S on the join attribute. • Cost: 4B(R) + 4B(S), why? • Stage 2: Load R and S in sorted order; join the tuples. • Cost: B(R) + B(S) • What if more than M tuples match on the join attribute? • We have to use the nested loop join algorithm • Total cost: 5B(R) + 5B(S) • Requirements: B(R) <= M², B(S) <= M²
Sort-merge Join Algorithm • We would like to avoid nested loop join when the number of matching tuples exceeds M. • Sort R and S into M runs (overall) of size M on disk. • Merge and join the tuples. • Cost: 3B(R) + 3B(S) • Requirement: B(R) + B(S) <= M² • More efficient, but a stricter requirement.
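The merge-and-join step can be sketched as follows. For simplicity this in-memory version sorts the inputs itself (standing in for the disk-based run-generation phase) and joins whole groups of tuples that share a key, over hypothetical (key, value) tuples:

```python
def sort_merge_join(R, S):
    """Merge phase of sort-merge join over (key, value) tuples.
    Both inputs are sorted on the join key (in a real DBMS this is the
    run-generation phase); matching groups are joined pairwise."""
    R, S = sorted(R), sorted(S)
    out, i, j = [], 0, 0
    while i < len(R) and j < len(S):
        if R[i][0] < S[j][0]:
            i += 1
        elif R[i][0] > S[j][0]:
            j += 1
        else:
            key = R[i][0]
            # find the end of the matching group on each side
            i2 = i
            while i2 < len(R) and R[i2][0] == key:
                i2 += 1
            j2 = j
            while j2 < len(S) and S[j2][0] == key:
                j2 += 1
            for r in R[i:i2]:
                for s in S[j:j2]:
                    out.append((key, r[1], s[1]))
            i, j = i2, j2
    return out

R = [(1, "a"), (2, "b"), (2, "c")]
S = [(2, "x"), (3, "y")]
print(sort_merge_join(R, S))  # -> [(2, 'b', 'x'), (2, 'c', 'x')]
```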
Selection on Equality: σA=a(R) • Clustered index on the selection attribute • Cost: B(R) / V(R, A), where V(R, A) is the number of distinct values of A. • Unclustered index on the selection attribute • Cost: T(R) / V(R, A).
Selection on Equality: σA=a(R) • B(R) = 5000, T(R) = 200,000, and V(R, A) = 10: • Index-based selection: • Clustered index on A: B(R) / V(R, A) = 500. • Unclustered index on A: T(R) / V(R, A) = 20,000. • Un-indexed selection: • If R is clustered: B(R) = 5000 • If R is unclustered: T(R) = 200,000. • Observation: for small values of V(R, A), an unclustered index is not helpful.
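The four cost figures above follow directly from the formulas; a few lines reproduce the arithmetic:

```python
# Reproducing the selection-cost arithmetic (all costs in disk I/Os).
B, T, V = 5000, 200_000, 10   # B(R), T(R), V(R, A)

print(B // V)   # clustered index on A:    500
print(T // V)   # unclustered index on A:  20000
print(B)        # clustered, no index:     5000
print(T)        # unclustered, no index:   200000
```

Since 20,000 > 5,000, the unclustered index on A is worse than a plain table-scan here, which is the observation the slide makes.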
Index-based Join: R ⋈ S • R has an index on the join attribute. • Read S; for each tuple of S, look up the matching tuple(s) in R. • If S is clustered: • Clustered index on R: B(S) + T(S)B(R) / V(R, A). • Unclustered index on R: B(S) + T(S)T(R) / V(R, A). • Not efficient? • If S is small, we do not need to examine all tuples in R: more efficient than the previous algorithms. • If V(R, A) is very large: efficient
Index-based Join: R ⋈ S • Both R and S have clustered B+-tree indexes on the join attribute. • Best case! • Read blocks from R and S and merge them. • Merge join, zig-zag join • Cost: B(R) + B(S)
Hash-based Algorithms: Unary Operators • Partition R into M - 1 buckets of almost equal size using some hashing function. • Store the buckets on disk • All tuples that need to be processed together are in the same bucket. • Duplicate elimination, …. • Load R bucket by bucket in main memory and generate the output.
Hash-based Algorithms: Unary Operators • The size of each bucket is roughly B(R) / (M – 1) • Does each bucket fit in main memory? • B(R) / (M – 1) <= M, roughly B(R) <= M² [Diagram: relation R streamed through an input buffer; a hash function routes tuples to M – 1 bucket buffers, written back to disk]
Hash-based Algorithm for Grouping γ • Stage 1: Partition R into M – 1 buckets • Use the grouping attribute as the hash key. • Stage 2: Apply grouping to each bucket • Load each bucket into main memory. • All tuples in the same group belong to the same bucket. • Cost: 3B(R). • Requirement: B(R) <= M² • Similar to the sort-based algorithm.
Hash-based Algorithms: Binary Operators • Partition R and S into M – 1 buckets each, of almost equal size, using the same hash function. • All tuples that need to be processed together end up in buckets with the same hash value. • Difference, intersection, … • Examine pairs of buckets from R and S • Load the smaller bucket into main memory and scan the larger bucket.
Hash-based Algorithm for Join • Stage 1: Partition R and S into M – 1 (separate) buckets. • The hash key is the join attribute. • Store the results on disk. • Stage 2: Join the tuples in the pairs of buckets with the same hash value. • Apply a one-pass join algorithm. • Cost: 3B(R) + 3B(S). • The smaller bucket must fit in main memory. • Requirement: min(B(R), B(S)) <= M²
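Both stages can be sketched together. This in-memory simplification keeps the buckets in Python lists (a real DBMS writes them to disk between the stages) and joins hypothetical (key, value) tuples on their key:

```python
def hash_join(R, S, M):
    """Two-pass hash join over (key, value) tuples. Stage 1 partitions
    both relations into M - 1 buckets by hashing the join key; stage 2
    runs a one-pass build-and-probe join on each pair of buckets."""
    def partition(rel):
        buckets = [[] for _ in range(M - 1)]
        for t in rel:
            buckets[hash(t[0]) % (M - 1)].append(t)  # to disk in a real DBMS
        return buckets

    out = []
    for rb, sb in zip(partition(R), partition(S)):
        table = {}                      # build on the R bucket
        for k, v in rb:
            table.setdefault(k, []).append(v)
        for k, v in sb:                 # probe with the S bucket
            for rv in table.get(k, []):
                out.append((k, rv, v))
    return out

R = [(1, "a"), (2, "b"), (2, "c")]
S = [(2, "x"), (1, "y")]
print(sorted(hash_join(R, S, M=4)))  # -> [(1, 'a', 'y'), (2, 'b', 'x'), (2, 'c', 'x')]
```

Because matching keys always hash to the same bucket index, only buckets with the same index ever need to be joined, which is what keeps the second stage one-pass.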
Comparing Hash-based and Sort-based Algorithms • Hash-based algorithms need a smaller amount of main memory for binary operators • Hash-based: min(B(R), B(S)) <= M² • Sort-based: B(R) + B(S) <= M² • Sort-based algorithms generate sorted results • Useful when there is an ORDER BY in the query or the following operators need sorted input • It is not easy to generate balanced buckets in hash-based algorithms • More detailed discussion in Section 15.5.7.