
Timos K. Sellis et al. VLDB 1987 Jae-hoon Kim

The R+-Tree: A Dynamic Index for Multi-Dimensional Objects. Timos K. Sellis et al. VLDB 1987. Jae-hoon Kim. Introduction: DBMSs store one-dimensional data (integers, real numbers, strings). DBMSs do not sufficiently handle multi-dimensional data (boxes, polygons).


Presentation Transcript


  1. The R+-Tree: A Dynamic Index for Multi-Dimensional Objects Timos K. Sellis et al. VLDB 1987 Jae-hoon Kim

  2. Introduction DBMSs store one-dimensional data: • Integers • Real numbers • Strings DBMSs do not sufficiently handle multi-dimensional data: • Boxes • Polygons • Points in multi-dimensional space

  3. Method for Multi-dimensional Data The common case of multi-dimensional data is points. The main idea is to divide the whole space into disjoint sub-regions. Each sub-region contains no more than C points • C is the capacity of a disk page Insertion of new points → partitioning of a region (split)
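
The split-on-overflow idea for points can be sketched as follows. This is only an illustration under stated assumptions: a 2-d space, a median cut that alternates between the x and y axes (in the spirit of a k-d tree), and hypothetical names `Region`, `insert`, `leaves`, and `CAPACITY`. The slides do not prescribe this particular scheme.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Tuple

Point = Tuple[float, float]
CAPACITY = 4  # C: assumed number of points that fit in one disk page


@dataclass
class Region:
    depth: int = 0
    points: List[Point] = field(default_factory=list)
    axis: Optional[int] = None          # None while the region is unsplit
    cut: float = 0.0
    low: Optional["Region"] = None
    high: Optional["Region"] = None

    def insert(self, p: Point) -> None:
        if self.axis is not None:
            # Already split: route the point to one of the two disjoint halves.
            child = self.low if p[self.axis] < self.cut else self.high
            child.insert(p)
            return
        self.points.append(p)
        if len(self.points) > CAPACITY:
            self._split()

    def _split(self) -> None:
        axis = self.depth % 2           # alternate x (0) and y (1) cuts
        values = sorted(q[axis] for q in self.points)
        cut = values[len(values) // 2]  # median coordinate
        low = [q for q in self.points if q[axis] < cut]
        high = [q for q in self.points if q[axis] >= cut]
        if not low:
            return  # too many points share the smallest coordinate: stay overfull
        self.axis, self.cut = axis, cut
        self.low = Region(self.depth + 1, low)
        self.high = Region(self.depth + 1, high)
        self.points = []

    def leaves(self) -> Iterator["Region"]:
        if self.axis is None:
            yield self
        else:
            yield from self.low.leaves()
            yield from self.high.leaves()


root = Region()
for i in range(10):
    root.insert((float(i), float((i * 7) % 10)))
leaf_sizes = [len(leaf.points) for leaf in root.leaves()]
```

Here the position of each cut is adaptable (determined by the data) and local to the overflowing region, which corresponds to the "adaptable" and "brickwall" choices in the classification on the next slide.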

  4. Classification of known methods Position • Fixed : position of the splitting hyperplane is predetermined (grid file) • Adaptable : data points determine the position of the hyperplane (k-d tree) Dimensionality • 1-d cut : k-d tree • k-d cut : quad-tree, oct-tree Locality • Grid : the hyperplane splits not only the affected region, but also all the other regions in that direction • Brickwall : the splitting hyperplane is restricted to extend solely inside the region being split

  5. Methods for Rectangles Transform into points in a higher-dimensional space • 2-d rectangle → a point in 4-d space • k-d trees, or grid file after a rotation of the axes Use a space-filling curve • Map a k-d space to a 1-d space • Transform a k-dimensional object into line segments (z-transform) Divide the original space into sub-regions • Disjoint : can use the methods for points mentioned before • Overlapping : cut in two pieces and tag • R-tree : first proposed use of overlapping sub-regions
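
The first two approaches can be illustrated with a short sketch. It assumes integer grid coordinates and uses bit interleaving (a Morton code), which is one common way to realize a z-ordering and not necessarily the exact transform the slide has in mind. The names `rect_to_point` and `z_value` are hypothetical.

```python
from typing import Tuple


def rect_to_point(xlow: float, xhigh: float,
                  ylow: float, yhigh: float) -> Tuple[float, float, float, float]:
    """Treat a 2-d rectangle as a single point in 4-d space."""
    return (xlow, xhigh, ylow, yhigh)


def z_value(x: int, y: int, bits: int = 16) -> int:
    """Map a 2-d grid cell to a 1-d key by interleaving the bits of x and y."""
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i)       # bits of x go to even positions
        z |= ((y >> i) & 1) << (2 * i + 1)   # bits of y go to odd positions
    return z


# Cells that are close in 2-d tend to get nearby 1-d keys.
keys = [z_value(x, y) for y in range(2) for x in range(2)]
```

Once objects are reduced to 1-d keys, an ordinary one-dimensional index such as a B-tree can store them.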

  6. R-Tree • Extension of the B-tree • Height-balanced tree • Nodes consist of MBRs (minimum bounding rectangles) • Guarantees that space utilization is at least 50%

  7. R-Tree Split Requirements of a “good” split • Minimize the total area • Minimize the overlap
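
The two criteria can be made concrete with a small sketch that scores a candidate split by the total area of the two bounding rectangles and by their overlap. It assumes 2-d rectangles stored as (xlow, xhigh, ylow, yhigh), the layout given on the R+-tree slide; `area`, `mbr`, `overlap_area`, and `split_quality` are hypothetical helper names.

```python
from typing import List, Tuple

Rect = Tuple[float, float, float, float]  # (xlow, xhigh, ylow, yhigh)


def area(r: Rect) -> float:
    return max(0.0, r[1] - r[0]) * max(0.0, r[3] - r[2])


def mbr(rects: List[Rect]) -> Rect:
    """Minimum bounding rectangle of a non-empty group."""
    return (min(r[0] for r in rects), max(r[1] for r in rects),
            min(r[2] for r in rects), max(r[3] for r in rects))


def overlap_area(a: Rect, b: Rect) -> float:
    w = min(a[1], b[1]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[2], b[2])
    return w * h if w > 0 and h > 0 else 0.0


def split_quality(group1: List[Rect], group2: List[Rect]) -> Tuple[float, float]:
    """Return (total area, overlap) of the two groups; lower is better for both."""
    m1, m2 = mbr(group1), mbr(group2)
    return area(m1) + area(m2), overlap_area(m1, m2)


quality = split_quality([(0, 1, 0, 1), (1, 2, 0, 1)], [(1.5, 3, 0, 2)])
```

A split routine would compare such scores across candidate groupings; how the two criteria are weighed against each other is a design choice this sketch leaves open.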

  8. R-Tree Insert & Split [Figure: rectangles 1 to 8 grouped under nodes A and B]

  9. Bad Search in R-Tree [Figure: rectangles 1 to 8 grouped under nodes A and B]

  10. R+-Tree Variant of the R-tree Avoids overlap among internal nodes by inserting an object into multiple leaves Leaf node : (oid, RECT) • RECT : (xlow, xhigh, ylow, yhigh) Intermediate node : (p, RECT) • p → pointer to a lower-level node

  11. Properties of R+-Tree • For an entry (p, RECT) of an intermediate node, the subtree rooted at the node pointed to by p contains a rectangle R if and only if R is covered by RECT → the only exception is when R is a rectangle at a leaf node, in which case R need only overlap RECT • For any two entries (p1, RECT1) and (p2, RECT2) of an intermediate node → the overlap between RECT1 and RECT2 is zero • The root has at least two children unless it is a leaf • All leaves are at the same level
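
These properties can be checked mechanically. The sketch below is an assumption-laden illustration, not the paper's code: it uses hypothetical `Leaf` and `Inner` classes holding (oid, RECT) and (p, RECT) entries, assumes rectangles with positive area, and verifies only the "covered by RECT" direction of the first property.

```python
from dataclasses import dataclass, field
from itertools import combinations
from typing import List, Set, Tuple, Union

Rect = Tuple[float, float, float, float]  # (xlow, xhigh, ylow, yhigh)


@dataclass
class Leaf:
    entries: List[Tuple[int, Rect]] = field(default_factory=list)     # (oid, RECT)


@dataclass
class Inner:
    entries: List[Tuple["Node", Rect]] = field(default_factory=list)  # (p, RECT)


Node = Union[Leaf, Inner]


def overlap_area(a: Rect, b: Rect) -> float:
    w = min(a[1], b[1]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[2], b[2])
    return w * h if w > 0 and h > 0 else 0.0


def covers(outer: Rect, inner: Rect) -> bool:
    return (outer[0] <= inner[0] and inner[1] <= outer[1]
            and outer[2] <= inner[2] and inner[3] <= outer[3])


def leaf_depths(node: Node, depth: int = 0) -> Set[int]:
    if isinstance(node, Leaf):
        return {depth}
    return set().union(*(leaf_depths(child, depth + 1) for child, _ in node.entries))


def check(node: Node) -> bool:
    if isinstance(node, Leaf):
        return True
    rects = [r for _, r in node.entries]
    if any(overlap_area(a, b) > 0 for a, b in combinations(rects, 2)):
        return False                      # sibling RECTs must not overlap
    for child, rect in node.entries:
        if isinstance(child, Leaf):
            # Leaf exception: a data rectangle need only overlap RECT.
            ok = all(overlap_area(r, rect) > 0 for _, r in child.entries)
        else:
            ok = all(covers(rect, r) for _, r in child.entries)
        if not ok or not check(child):
            return False
    return True


def is_valid(root: Node) -> bool:
    if isinstance(root, Inner) and len(root.entries) < 2:
        return False                      # non-leaf root needs two children
    return check(root) and len(leaf_depths(root)) == 1


# Object 4 straddles the boundary x = 5, so it is stored in both leaves.
left = Leaf([(1, (0, 2, 0, 2)), (4, (3, 6, 0, 2))])
right = Leaf([(4, (3, 6, 0, 2)), (5, (7, 9, 1, 3))])
good = Inner([(left, (0, 5, 0, 4)), (right, (5, 10, 0, 4))])
bad = Inner([(left, (0, 5, 0, 4)), (right, (4, 10, 0, 4))])  # RECTs overlap
```

In this example `is_valid(good)` holds while `is_valid(bad)` does not, because the two sibling RECTs in `bad` share the strip between x = 4 and x = 5.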

  12. R+-Tree [Figure: rectangles 1 to 8 grouped under nodes A, B, and C; rectangle 4 is listed twice among the leaf entries]

  13. Operations to maintain the R+-tree Searching operation • First decompose the search space into disjoint sub-regions • Descend the tree until the actual data objects are found in the leaves Insertion operation • Search the tree and add the rectangle to leaf nodes • Difference from the R-tree → the rectangle may be added to more than one leaf node Deletion operation • Locate the rectangle that must be deleted and remove it from the leaf nodes that contain it Node splitting operation • The two sub-nodes cover disjoint areas • Contrary to the R-tree → splits may also propagate downward
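
The searching operation can be sketched as a recursive descent. This is a minimal illustration under assumptions: a hypothetical `Node` class with an `is_leaf` flag, closed-interval intersection tests (so touching boundaries count as a hit), and an implicit decomposition of the search window, since sibling RECTs are disjoint and each child therefore handles a disjoint part of the window.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple

Rect = Tuple[float, float, float, float]  # (xlow, xhigh, ylow, yhigh)


@dataclass
class Node:
    is_leaf: bool
    # Leaf entries are (oid, RECT); intermediate entries are (child Node, RECT).
    entries: List[tuple] = field(default_factory=list)


def intersects(a: Rect, b: Rect) -> bool:
    return a[0] <= b[1] and b[0] <= a[1] and a[2] <= b[3] and b[2] <= a[3]


def search(node: Node, window: Rect, found: Optional[Set[int]] = None) -> Set[int]:
    """Collect the oids of all stored rectangles that intersect the window."""
    if found is None:
        found = set()
    for item, rect in node.entries:
        if not intersects(rect, window):
            continue
        if node.is_leaf:
            found.add(item)   # a set hides duplicates from multi-leaf insertion
        else:
            search(item, window, found)
    return found


# Object 4 is stored in both leaves because it crosses the boundary x = 5.
leaf_a = Node(True, [(1, (0, 2, 0, 2)), (4, (3, 6, 0, 2))])
leaf_b = Node(True, [(4, (3, 6, 0, 2)), (5, (7, 9, 1, 3))])
root = Node(False, [(leaf_a, (0, 5, 0, 4)), (leaf_b, (5, 10, 0, 4))])

window_hits = search(root, (4, 8, 0, 3))
point_hits = search(root, (1, 1, 1, 1))   # a point query is a zero-size window
```

The point query visits only one leaf, because the point falls inside exactly one of the disjoint sibling RECTs. That single-path behavior is the search advantage the summary slide attributes to the R+-tree.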

  14. Packing Algorithm Reduce the coverage of “dead space” Reduce the height expansion of the R+-tree Packing algorithm • Pack attempts to set up an R+-tree with good search performance • Partition, Sweep, Pack Selection of the x- or y-cut for Partition • Nearest neighbor • Minimal total x- and y-displacement • Minimal total space coverage accrued by the two sub-regions • Minimal number of rectangle splits

  15. Operations to build the R+-tree Partition operation • Decomposes the total space in a locally optimal way (in terms of search performance) • Uses the Sweep routine along a line parallel to the x- or y-axis Sweep operation • Scans the rectangles and identifies points where space partitioning is possible Pack operation • Organizes an R+-tree from a set S of rectangles and the fill-factor ff of the tree • Recursively packs the entries of each level of the tree from the bottom up • At each level, when partitioning non-leaf nodes splits some of the rectangles, the split is propagated recursively downward and, if necessary, the changes are also propagated upward
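
A simplified sketch of how Sweep and Partition might cooperate is shown below. It is not the routine from the paper: it assumes the cost of a cut is the number of rectangles it splits (one of the selection criteria listed on the packing slide), places the cut at the largest high coordinate among the first ff rectangles, and rejects cuts that leave nothing on the far side. The function names and return shapes are hypothetical.

```python
import math
from typing import List, Tuple

Rect = Tuple[float, float, float, float]  # (xlow, xhigh, ylow, yhigh)


def sweep(rects: List[Rect], axis: str, ff: int) -> Tuple[float, float]:
    """Scan along one axis and return (cut position, number of split rectangles)."""
    lo, hi = (0, 1) if axis == "x" else (2, 3)
    ordered = sorted(rects, key=lambda r: r[lo])
    cut = max(r[hi] for r in ordered[:ff])      # first ff rectangles end here
    if all(r[hi] <= cut for r in rects):
        return cut, math.inf                    # nothing beyond the cut: useless
    splits = sum(1 for r in rects if r[lo] < cut < r[hi])
    return cut, splits


def partition(rects: List[Rect], ff: int):
    """Choose the cheaper of the x- and y-cut and divide the rectangles."""
    (cx, sx), (cy, sy) = sweep(rects, "x", ff), sweep(rects, "y", ff)
    axis, cut = ("x", cx) if sx <= sy else ("y", cy)
    lo, hi = (0, 1) if axis == "x" else (2, 3)
    low_side, high_side = [], []
    for r in rects:
        if r[hi] <= cut:
            low_side.append(r)
        elif r[lo] >= cut:
            high_side.append(r)
        else:                                   # crosses the cut: split in two
            a, b = list(r), list(r)
            a[hi], b[lo] = cut, cut
            low_side.append(tuple(a))
            high_side.append(tuple(b))
    return axis, cut, low_side, high_side


rects = [(0, 2, 0, 4), (1, 3, 0, 4), (2, 5, 0, 4), (4, 6, 0, 4)]
result = partition(rects, ff=2)
```

In this example every rectangle spans the same y range, so only an x-cut separates anything, and the rectangle (2, 5, 0, 4) is split at x = 3. The low side can end up holding more than ff entries once split pieces are counted, which a full packing routine would have to handle.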

  16. Analysis [Figure: disk accesses for two-size segments, point query] [Figure: disk accesses for two-size segments, segment query]

  17. Summary Advantages of the R+-tree • Improved search performance, especially for point queries • More than 50% savings in disk accesses Disadvantages of the R+-tree • Tree height is greater than that of the R-tree • Uses more space (duplicate entries)
