
Leonardo Guerreiro Azevedo, Geraldo Zimbrão, Jano Moreira de Souza

Approximate Query Processing in Spatial Databases Using Raster Signatures. Leonardo Guerreiro Azevedo, Geraldo Zimbrão, Jano Moreira de Souza. {azevedo, zimbrao, jano}@cos.ufrj.br. Federal University of Rio de Janeiro.


Presentation Transcript


  1. Approximate Query Processing in Spatial Databases Using Raster Signatures • Leonardo Guerreiro Azevedo, Geraldo Zimbrão, Jano Moreira de Souza • {azevedo, zimbrao, jano}@cos.ufrj.br • Federal University of Rio de Janeiro

  2. Presentation plan • First considerations • Goals and contributions • Four-Color Raster Signature (4CRS) • Proposals of algorithms • Final considerations

  3. Motivation: an exact answer can demand a long time • There are many cases where a query can take a long time to be processed, for example: • When processing huge volumes of data, which requires a large number of I/O operations (disk access time is still higher than memory access time) • When processing highly complex queries • When accessing remote data over a slow network link, or when the data is temporarily unavailable

  4. Motivation • When a query takes a long time to be processed, a fast answer can be more important than an exact response

  5. Motivation • The challenge becomes bigger in spatial data environments. [Figure labels: 399,0000 segments; 475,434 segments]

  6. Motivation • The precision of the query can be relaxed, and an approximate answer returned to the user • Approximate answers can be computed quickly • Acceptable precision

  7. Motivation • There are many approaches in the field of approximate query processing; however, most of them are not suitable for spatial data. • “Research new techniques for approximate query processing that support the uniqueness of spatial data is a major issue in the database field” (Roddick et al., 2004).

  8. Scenarios and Applications • Decision support systems • Increasing business competitiveness • More use of accumulated data • Data mining • During a drill-down query sequence in ad hoc data mining, earlier queries in the sequence can be used to find out the interesting queries. • Data warehouses • Performance and scalability when accessing very large volumes of data during the analysis process.

  9. Scenarios and Applications • Mobile computing • An approximate answer may be an alternative: • When the data is not available • To save storage space

  10. Traditional SDBMS query processing environment [Diagram labels: Queries; New data (inserts or updates); Deleted data; Spatial DBMS (slow); Exact answers]

  11. SDBMS set-up for providing approximate query answers [Diagram labels: Queries; Approximate Query Processing Engine; Approximate answer + confidence interval (fast answer); Spatial DBMS; Exact answer; New data (inserts or updates); Deleted data]

  12. Goals • Execute approximate query processing in spatial databases using a raster signature: the Four-Color Raster Signature (4CRS) (Zimbrao and Souza, 1998). • Provide fast approximate answers for queries over spatial data.

  13. Contributions • Proposals of algorithms for many spatial operations that can be approximately processed using 4CRS • Spatial operators returning numbers: area, distance, diameter, perimeter… • Spatial predicates: equal, different, disjoint, area disjoint, inside, meet, adjacent… • Operators returning spatial data type values: intersection, plus (union), minus, common border… • Spatial operators on sets of objects: sum, closest, decompose, overlay, fusion.

  14. Contributions: proposals of algorithms • Approximate Area of Polygon • Distance • Diameter • Perimeter and Contour • Equal and Different • Disjoint, Area Disjoint, Edge Disjoint • Inside (Encloses), Edge Inside, Vertex Inside • Intersects and Intersection • Overlay • Adjacent, Border in Common, Common border • Plus and Sum • Minus • Fusion • Closest • Decompose

  15. Four-Color Raster Signature (4CRS) • 4CRS is a raster approximation • It is an object representation upon a grid of cells • Grid resolution can be changed • This trades precision against storage requirements

  16. Four-Color Raster Signature (4CRS) • Each cell stores relevant information using few bits • 4CRS → 4 types of cells: Empty (E), Weak (W), Strong (S) and Full (F)

  17. Four-Color Raster Signature (4CRS) - Generation [Figure: a polygon and its 4CRS]

  18. Approximate Area of Polygon • Approximate area of polygon: based on the expected area of polygon within cell • Approximate area of polygon within window: based on the expected area of polygon within cell • Approximate overlapping area of polygon join: based on the expected intersection area of two types of cells

  19. Approximate area of polygon • Approximate area of polygon within cell: expected area (µ) of each cell type • Empty (E): expected area = 0% → µ = 0 • Weak (W): expected area in (0, 0.50] → µ = 0.25 • Strong (S): expected area in (0.50, 1) → µ = 0.75 • Full (F): expected area = 100% → µ = 1 • Grid and polygon are independent from each other
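The expected-area values on this slide can be turned into an area estimate by summing µ over all cells and scaling by the area of one cell. The sketch below is only an illustration: the representation of a signature as a list of strings over the letters E, W, S, F, and the names `EXPECTED_AREA` and `approximate_area`, are assumptions made here and do not come from the presentation.

```python
# Expected fraction of a cell covered by the polygon, per cell type (slide 19).
EXPECTED_AREA = {"E": 0.0, "W": 0.25, "S": 0.75, "F": 1.0}


def approximate_area(signature, cell_area):
    """Estimate a polygon's area from its 4CRS signature.

    `signature` is a hypothetical encoding: one string per grid row,
    one letter (E, W, S or F) per cell. `cell_area` is the area of one cell.
    """
    covered = sum(EXPECTED_AREA[cell] for row in signature for cell in row)
    return covered * cell_area


if __name__ == "__main__":
    signature = ["EWWE", "WSFS", "EWSW"]
    print(approximate_area(signature, cell_area=4.0))
```

A finer grid gives more cells of type E and F and fewer W and S cells per unit of area, which is where the precision versus storage trade-off of the signature shows up.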

  20. Approximate overlapping area of polygon join • Expected area of cells overlapping, one value per pair of cell types, for example: W × E → µW×E • S × E → µS×E • S × W → µS×W • S × S → µS×S

  21. Approximate overlapping area of polygon join • Table of expected area of cells overlapping [table not preserved in the transcript] • $\text{Approximate answer} = \left( \sum_{i \times j} \mu_{i \times j} \right) \times \text{cellarea}$, where the sum runs over the overlapping cell pairs $i \times j$
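The overlap estimate can be sketched as a cell-by-cell sum of µ for each pair of overlapping cell types, multiplied by the cell area. Because the table itself is missing from the transcript, the sketch assumes that the expected overlap of two cell types is the product of their individual expected areas. This assumption agrees with the values µW×W = 0.0625 and µS×S = 0.5625 quoted on the Equal slide, but it is not confirmed by the source. The sketch also assumes both signatures are stored on the same grid, as lists of strings over E, W, S, F.

```python
# Expected covered fraction per cell type (slide 19).
EXPECTED_AREA = {"E": 0.0, "W": 0.25, "S": 0.75, "F": 1.0}


def expected_overlap(type_a, type_b):
    """Expected overlapping fraction of two cells.

    Assumption: the two polygons are independent inside the cell, so the
    expected overlap is the product of the two expected areas.
    """
    return EXPECTED_AREA[type_a] * EXPECTED_AREA[type_b]


def approximate_overlap_area(sig_a, sig_b, cell_area):
    """Sum the expected overlap over aligned cell pairs, times the cell area."""
    total = 0.0
    for row_a, row_b in zip(sig_a, sig_b):
        for cell_a, cell_b in zip(row_a, row_b):
            total += expected_overlap(cell_a, cell_b)
    return total * cell_area


if __name__ == "__main__":
    print(approximate_overlap_area(["WS", "FE"], ["WS", "SW"], cell_area=1.0))
```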

  22. Affinity degree • For some proposed algorithms, it is possible to return an approximate answer by evaluating only cell types. • For other algorithms, evaluating cell types also requires computing an approximate value in the interval [0,1] that indicates the true percentage of the response → Affinity degree: it is based on the expected area of cells overlapping (Azevedo et al., 2005). • Table of affinity degree [table not preserved in the transcript]

  23. Equal • Equal algorithm using 4CRS → the approximate answer is equal to the sum of affinity degrees divided by the number of comparisons of pairs of objects, if no trivial case occurs. • Trivial case (not equal): an overlap of different cell types (for example S × E, S × W, F × S) → result false • Sum of affinity degrees for overlaps of equal cell types: E × E: µE×E = 1 • W × W: µW×W = 0.0625 • S × S: µS×S = 0.5625 • F × F: µF×F = 1
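A minimal sketch of the Equal estimate described on this slide, using the four affinity degrees it lists. Several details are assumptions made for illustration: signatures are lists of strings over E, W, S, F on the same grid, the "number of comparisons" is taken to be the number of compared cell pairs, and the trivial "false" result is returned as 0.0.

```python
# Affinity degrees for overlaps of equal cell types (slide 23).
EQUAL_AFFINITY = {"E": 1.0, "W": 0.0625, "S": 0.5625, "F": 1.0}


def approximate_equal(sig_a, sig_b):
    """Return a value in [0, 1] estimating how likely two polygons are equal."""
    total = 0.0
    comparisons = 0
    for row_a, row_b in zip(sig_a, sig_b):
        for cell_a, cell_b in zip(row_a, row_b):
            if cell_a != cell_b:
                # Trivial case: different cell types overlap, so not equal.
                return 0.0
            total += EQUAL_AFFINITY[cell_a]
            comparisons += 1
    return total / comparisons if comparisons else 0.0


if __name__ == "__main__":
    print(approximate_equal(["EW", "SF"], ["EW", "SF"]))
    print(approximate_equal(["EW", "SF"], ["EW", "SS"]))
```

Identical signatures do not yield 1.0 unless every cell is Empty or Full, since two polygons can share a Weak or Strong cell type and still cover different parts of that cell.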

  24. Different • The Different algorithm is the opposite of the Equal algorithm • Its affinity degree is equal to 1 − the Equal affinity degree • Trivial case (different): an overlap of different cell types (for example S × E, S × W, F × S) → result true • Sum of affinity degrees for overlaps of equal cell types: E × E: µE×E = 0 • W × W: µW×W = 1 − 0.0625 • S × S: µS×S = 1 − 0.5625 • F × F: µF×F = 0
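The Different estimate can be sketched by mirroring the Equal computation with the complementary affinity degrees listed on this slide. As before, the signature encoding (lists of strings over E, W, S, F on a shared grid), the averaging over compared cell pairs, and returning the trivial "true" result as 1.0 are assumptions of this illustration, not details given in the presentation.

```python
# Affinity degrees for Different: 1 minus the Equal affinity (slide 24).
DIFFERENT_AFFINITY = {"E": 0.0, "W": 1 - 0.0625, "S": 1 - 0.5625, "F": 0.0}


def approximate_different(sig_a, sig_b):
    """Return a value in [0, 1] estimating how likely two polygons differ."""
    total = 0.0
    comparisons = 0
    for row_a, row_b in zip(sig_a, sig_b):
        for cell_a, cell_b in zip(row_a, row_b):
            if cell_a != cell_b:
                # Trivial case: different cell types overlap, so different.
                return 1.0
            total += DIFFERENT_AFFINITY[cell_a]
            comparisons += 1
    return total / comparisons if comparisons else 0.0


if __name__ == "__main__":
    print(approximate_different(["EW", "SF"], ["EW", "SF"]))
    print(approximate_different(["EW", "SF"], ["EW", "SS"]))
```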

  25. Disjoint • Disjoint: two objects are disjoint if they have no portion in common • Case I: at least one overlap of F × W, F × S, F × F or S × S → trivial case: not disjoint (exact answer) • Case II: only overlap of E × E, E × W, E × S or E × F → disjoint (partial answer), affinity degree += 1 • Case III: weak × weak (W × W) or weak × strong (W × S) → disjoint (partial answer), affinity degree += 1 − expected area(type1, type2)
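The three cases above can be sketched as a single pass over aligned cell pairs. Two points are assumptions rather than statements from the slide: the expected area of a pair of cell types is taken as the product of the per-type expected areas (0, 0.25, 0.75, 1), and the accumulated affinity degree is averaged over the number of compared cell pairs, as the Equal slide does. Signatures are again hypothetical lists of strings over E, W, S, F on a shared grid.

```python
# Expected covered fraction per cell type (slide 19).
EXPECTED_AREA = {"E": 0.0, "W": 0.25, "S": 0.75, "F": 1.0}


def approximate_disjoint(sig_a, sig_b):
    """Return a value in [0, 1] estimating how likely two polygons are disjoint."""
    total = 0.0
    comparisons = 0
    for row_a, row_b in zip(sig_a, sig_b):
        for cell_a, cell_b in zip(row_a, row_b):
            if cell_a == "E" or cell_b == "E":
                # Case II: an Empty cell cannot share area with anything.
                total += 1.0
            elif "F" in (cell_a, cell_b) or (cell_a, cell_b) == ("S", "S"):
                # Case I: the polygons certainly intersect in this cell.
                return 0.0
            else:
                # Case III: W x W or W x S (assumed product of expected areas).
                total += 1.0 - EXPECTED_AREA[cell_a] * EXPECTED_AREA[cell_b]
            comparisons += 1
    return total / comparisons if comparisons else 1.0


if __name__ == "__main__":
    print(approximate_disjoint(["EW", "WS"], ["FW", "SE"]))
    print(approximate_disjoint(["FW"], ["WW"]))
```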

  26. Distance • Distance can be estimated from 4CRS signatures by computing the distance among the cells corresponding to the polygons’ borders (Weak and Strong cells). • Distance = average of the minimum and maximum distances [Figure panels (a), (b), (c): minimum distance and maximum distance]
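One possible reading of the distance estimate is sketched below. The slide does not define how the minimum and maximum distances are obtained, so the sketch makes its own assumptions: both signatures share one grid of square cells addressed by (column, row), the minimum distance is the smallest gap between any pair of border cells, and the maximum distance is the smallest farthest-corner distance over those pairs (a bound that holds because every border cell contains some point of its polygon's border). The helper names are hypothetical.

```python
from itertools import product
from math import hypot


def border_cells(signature):
    """Return (column, row) indices of Weak and Strong cells."""
    return [(col, row)
            for row, line in enumerate(signature)
            for col, cell in enumerate(line)
            if cell in "WS"]


def cell_distance_bounds(a, b, cell_size):
    """Nearest and farthest distance between points of two square cells."""
    dx, dy = abs(a[0] - b[0]), abs(a[1] - b[1])
    nearest = hypot(max(dx - 1, 0), max(dy - 1, 0)) * cell_size
    farthest = hypot(dx + 1, dy + 1) * cell_size
    return nearest, farthest


def approximate_distance(sig_a, sig_b, cell_size):
    """Average of the assumed minimum and maximum border distances."""
    pairs = list(product(border_cells(sig_a), border_cells(sig_b)))
    if not pairs:
        raise ValueError("both signatures need at least one border cell")
    bounds = [cell_distance_bounds(a, b, cell_size) for a, b in pairs]
    minimum = min(nearest for nearest, _ in bounds)
    maximum = min(farthest for _, farthest in bounds)
    return (minimum + maximum) / 2


if __name__ == "__main__":
    print(approximate_distance(["WEEE"], ["EEEW"], cell_size=1.0))
```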

  27. Conclusions • Goal • Provide an estimated result in orders of magnitude less time than the time to compute an exact answer, along with a confidence interval for the answer. • Proposals • Use raster approximations for approximate query processing in spatial databases • Use the 4CRS signature to process queries over polygons, avoiding access to the real data. • Propose many algorithms for approximate processing • Use the expected area of polygons (Azevedo et al., 2005) to estimate responses

  28. Future work • Implement and evaluate algorithms involving other kinds of datasets, for example points and polylines, and combinations of them: point × polyline, polyline × polygon and polygon × polyline. • The experimental evaluation is not addressed in this work; it is ongoing work developed on Secondo (Güting et al., 2005), which is an extensible DBMS platform for research prototyping and teaching.

  29. Approximate Query Processing in Spatial Databases Using Raster Signatures • Leonardo Guerreiro Azevedo, Geraldo Zimbrão, Jano Moreira de Souza • {azevedo, zimbrao, jano}@cos.ufrj.br • Federal University of Rio de Janeiro
