One-Pass Algorithm for Efficient Database Operations
This overview explores the one-pass algorithm approach for database operations, highlighting its efficiency in managing various operations with minimal memory usage. It categorizes algorithms based on their resource requirements—single (tuple-at-a-time), full-relation unary operations, and full-relation binary operations. We detail methods for operations like duplicate elimination, grouping, and set operations, providing insights into their execution. Emphasis is placed on the ability to read data only once and the necessity of main memory for optimal performance, making these algorithms suitable for large datasets.
One-Pass Algorithm for Efficient Database Operations
E N D
Presentation Transcript
Query Execution One-pass algorithm for database operations Chetan Sharma 008565661
Overview • One-Pass Algorithm • One-Pass Algorithm Methods: • Tuple-at-a-time, unary operations. • Full-relation, unary operations. • Full-relation, binary operations.
we can divide algorithms for operators into three “degrees” of difficulty and cost: 1) Require at least one of the arguments to fit in main memory.-one pass 2) Some methods work for data that is too large to fit in available main memory but not for the largest imaginable data sets.-two pass 3) Some methods work without a limit on the size of the data.-multipass:recursivegeneralizations of the two-pass algorithms.
One-Pass Algorithm Reading the data only once from disk. Usually, they require at least one of the arguments to fit in main memory
Tuple-at-a-Time These operations do not require an entire relation, or even a large part of it, in memory at once. Thus, we can read a block at a time, use one main memorybuffer, and produce our output. Ex- selection and projection
Tuple-at-a-Time A selection or projection being performed on a relation R
Full-relation, unary operations Now, let us consider the unary operations that apply to relations as a whole , rather than to one tuple at a time: a)Duplicate elimination. b)Grouping .
b) Grouping MIN (a),MAX (a) aggregate, record the minimum or maximum value, respectively, of attribute a seen for any tuple in the group so far. COUNT aggregation, add one for each tuple of the group that is seen. SUM (a), add the value of attribute a to the accumulated sum for its group. AVG (a) is the hard case. We must maintain two accumulations: the count of the number of tuples in the group and the sum of the a-values of these tuples.
b) Grouping When all tuples of R have been read into the input buffer and contributed to the aggregation(s) for their group, we can produce the output by writing the tuple for each group. Note-: that until the last tuple is seen, we cannot begin to create output for aoperation. Thus, this algorithm does not fit the iterator framework very well; The entire grouping has to be done by the Open method before the first tuple can be retrieved
One-Pass Algorithms for Binary Operations All other operations are in this class: set and bag versions of union, intersection, difference, joins, and products. binary operations require reading the smaller of the operands R and S into main memory and building a suitable data structure so tuples can be both inserted quickly and found quickly. to be performed in one pass is: min(B(R),B(S)) <= M
Some examples In each case, we assume R is the larger of the relations, and we house S in main memory. • Set Union: -We read S into M - 1 buffers of main memory and build a search structure where the search key is the entire tuple. -All these tuples are also copied to the output. -Read each block of R into the Mth buffer, one at a time. -For each tuple t of R, see if t is in S, and if not, we copy t to the output. If t is also in S, we skip t. • Set Intersection : -Read S into M - 1 buffers and build a search structure with full tuples as the search key. -Read each block of R, and for each tuple t of R, see if t is also in S. If so, copy t to the output, and if not, ignore t.
examples continued.. • Product • Read S into M — 1 buffers of main memory; no special data structure is needed. Then read each block of R, and for each tuple t of R concatenate t with each tuple of S in main memory. Output each concatenated tuple as it is formed.
Summery • One-Pass Algorithm • One-Pass Algorithm Methods: • Tuple-at-a-time, unary operations. • Full-relation, unary operations. • Full-relation, binary operations.