Taking Constraints out of Constraint Databases

Dina Goldin, University of Connecticut. Applications of Constraint Databases, Paris, France, June 2004.


Presentation Transcript


  1. Taking Constraints out of Constraint Databases • Dina Goldin, University of Connecticut • Applications of Constraint Databases, Paris, France, June 2004

  2. Relational Databases • Codd[70] provided an additional level of abstraction between physical data and queries • [Diagram: on one side, queries run directly over a customized data layout for each application; on the other, queries run over a table-based logical layer that sits above the physical layer]

  3. Advantages of Relational Model • Data model: Uniform table-based representation for all data at logical level • Data independence: Can modify physical layer without affecting queries • Simple set-of-points semantics; relational algebra and relational calculus are equally expressive (RA = RC) • Efficient indexing methods • A commercial success in the 1980s!

  4. Object-Relational Databases • Disadvantages of RDBs: • only good for traditional, “administrative” data • OO technology corrects this: • encapsulate non-administrative data • provide methods to access it • Object-relational databases provide this technology within a relational framework. They are the latest commercial success.

  5. Outline • Introduction • relational, OR data models • GIS systems: • CDB technology to the rescue • Constraint Databases: • it’s not just about constraints • one more level of abstraction • Constraint-backed databases: • practical considerations • getting constraint-backed technology right

  6. Geographic Information Systems • Until recently, the leading commercial systems for spatial data • Not database systems per se • cannot manage non-geographic data • no ad-hoc querying (users perform built-in operations or execute predefined queries) • single-layered architecture (no data independence when writing queries) • in-memory (no index structures)

  7. Newer Approaches to Managing Spatial Data • Marrying GIS and object-relational databases • Example: Oracle Spatial Data Option • Full power of a relational DB plus… • Spatial data • encapsulated as new data types within the OR framework • same data types as in ARC/Info (leading GIS system) • Spatial operations • as methods over the new data types • based on GIS operations • Spatial data access structures • based on bounding boxes

  8. Data Separation in OR/GIS Databases • Spatial data stored in spatial relations • predefined set of spatial data types (point, region, etc…) • each relation is a set of spatial objects of one type, with a key • predefined set of operations over spatial objects • “Traditional” data stored in regular relations • Including thematic/descriptive data pertaining to spatial objects • Spatial & administrative data are logically separate • only keys of spatial objects to correlate between them • spatial data processing limited to predefined types and operators • Separation applies to query output as well • limited query expressiveness Can constraint databases offer a better solution?

  9. Constraint Databases • Contribution of KKR[90,95] • Key idea: Allow relations that include infinitely many points • “Finite relations are generalized to finitely representable relations” [GK96] • Generalized: original term for tuples and relations with infinite semantics • We now prefer the term constraint for such tuples and relations Goal: next commercial success (for GIS applications)
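To make "finitely representable" concrete, here is a minimal sketch, assuming linear constraints over two variables. The class and function names are hypothetical and are not taken from any constraint database system. A constraint tuple is a conjunction of inequalities, a constraint relation is a finite set of such tuples, and membership in the infinite point set is decided by evaluating the constraints.

```python
from dataclasses import dataclass
from typing import List, Tuple


# Hypothetical names for illustration only.
@dataclass(frozen=True)
class LinearConstraint:
    """Represents the inequality a*x + b*y <= c."""
    a: float
    b: float
    c: float

    def holds(self, x: float, y: float) -> bool:
        return self.a * x + self.b * y <= self.c


# A constraint tuple is a conjunction of constraints (a convex cell);
# a constraint relation is a finite collection (disjunction) of tuples.
ConstraintTuple = Tuple[LinearConstraint, ...]
ConstraintRelation = List[ConstraintTuple]


def contains(relation: ConstraintRelation, x: float, y: float) -> bool:
    """True if (x, y) is in the infinite point set the relation denotes."""
    return any(all(c.holds(x, y) for c in tup) for tup in relation)


# The unit square 0 <= x <= 1, 0 <= y <= 1: one tuple with four
# constraints, yet it denotes infinitely many points.
unit_square: ConstraintRelation = [(
    LinearConstraint(-1, 0, 0),  # -x <= 0
    LinearConstraint(1, 0, 1),   #  x <= 1
    LinearConstraint(0, -1, 0),  # -y <= 0
    LinearConstraint(0, 1, 1),   #  y <= 1
)]
```

The stored data is four triples of numbers; the semantics is every real point inside the square. That gap between representation and meaning is what the later slides turn into a separate layer.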

  10. Revisiting the Logical Layer • [Diagram: queries, table-based logical layer, physical layer] • Components of the logical database layer: • set-of-tuples data semantics • implementation-independent (logical) data representation • Relational databases • finite semantics • trivial one-to-one correspondence between the two components • Constraint databases: • infinite semantics • correspondence between data semantics and data representation no longer trivial • Infinite semantics of finitely representable data imply an additional level of abstraction; we need to separate the logical layer into two

  11. Additional Level of Abstraction • RDB to CDB: from two layers to three • RDB, Logical Layer (queries defined over this layer): finite set-of-point semantics; table-based representation; implementation-independent • CDB, Abstract Logical Layer (queries defined over this layer): infinite set-of-point semantics • CDB, Concrete Logical Layer: finite data representation; implementation-independent • Physical Layer (both): file-based data storage; indexing structures, data access methods; implementation-dependent

  12. Outline • Introduction • relational, OR data models • GIS systems: • CDB technology to the rescue • Constraint Databases: • it’s not just about constraints • one more level of abstraction • Constraint-backed databases: • practical considerations • getting constraint-backed technology right

  13. Concrete Data Model in CDBs • Requirements for the concrete layer • clean set-of-point semantics • efficient (index-based) data access methods • not required to use constraints (queries are over the abstract layer, so actual choice of representation is transparent to user) • Pure Constraint Databases • concrete layer is constraint-based • examples: CDB/CQA (query algebra), MLPQ (logic programming) • Constraint-backed databases • concrete layer is not purely constraints • data may be represented geometrically

  14. Practical Considerations of GIS Applications • Data input/output is not based on constraints • data often obtained by digitization (generates points and segments) • geometrical, visual, some standard spatial format… • in pure CDBs, converted to constraints • Spatial features are never straight lines or convex polytopes • many short segments • frequent local change of direction • broken up into many constraint tuples (convex cells) per spatial object • Continuous (real time) data visualization • most users do NOT want to see constraints, but a GUI • visualization requires spatial outline (boundary points) • constraints need to be converted back to geometrical representation • conversions carry heavy performance penalty (not real-time) • Experience shows that practical systems are not pure • E.g. Dedale uses geometrical representations, explicitly translating to the constraint representation for the constraint engine [GSSG03]
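The conversion from digitized geometry to constraints can be sketched for the simplest case, a single convex cell. This is an illustration under stated assumptions, not the procedure of any particular system: the vertices are assumed to form a convex polygon listed counter-clockwise, and the function names are hypothetical. Each edge yields one half-plane a*x + b*y <= c.

```python
from typing import List, Tuple

Point = Tuple[float, float]
HalfPlane = Tuple[float, float, float]  # (a, b, c) meaning a*x + b*y <= c


def convex_polygon_to_constraints(vertices: List[Point]) -> List[HalfPlane]:
    """Return one half-plane per edge.

    Assumes the vertices describe a convex polygon in counter-clockwise
    order, so the interior lies to the left of every directed edge.
    """
    constraints = []
    n = len(vertices)
    for i in range(n):
        (px, py), (qx, qy) = vertices[i], vertices[(i + 1) % n]
        # "Left of edge p->q" rewritten in the form a*x + b*y <= c.
        a, b = qy - py, -(qx - px)
        constraints.append((a, b, a * px + b * py))
    return constraints


def satisfies(constraints: List[HalfPlane], x: float, y: float) -> bool:
    """True if (x, y) satisfies every half-plane (the conjunction)."""
    return all(a * x + b * y <= c for a, b, c in constraints)


# Hypothetical digitized outline: the unit square, counter-clockwise.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
cell = convex_polygon_to_constraints(square)
```

A real spatial feature is rarely convex, so it would first have to be decomposed into many such cells, which is the blow-up in constraint tuples the slide describes.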

  15. Geometric Data Representation • In the physical layer, need for geometry-based representations recognized early on • KKR[90] suggested computational geometry algorithms as evaluation primitives • Examples of geometric representations: • Points • Polylines: for trajectories, regions • Triangulated Irregular Networks (TINs): for terrains (2.5 dimensional) • Efficient visualization • Efficient query evaluation • If region R(x,y) is stored as a sequence of points that outline it, the projection π_X(R) can be obtained by finding extrema of X-coordinates for these points. • Bounding boxes equally easy to compute.
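The projection and bounding-box remarks can be sketched in a few lines, assuming the region is a single connected area stored as the list of its outline points. The function names are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def project_x(outline: List[Point]) -> Tuple[float, float]:
    """Projection of the region onto the X axis, as an interval.

    Assumes one connected region, so the projection is the single
    interval between the smallest and largest X-coordinate.
    """
    xs = [x for x, _ in outline]
    return min(xs), max(xs)


def bounding_box(outline: List[Point]) -> Tuple[float, float, float, float]:
    """Axis-aligned bounding box as (min_x, min_y, max_x, max_y)."""
    xs = [x for x, _ in outline]
    ys = [y for _, y in outline]
    return min(xs), min(ys), max(xs), max(ys)


# Hypothetical outline of a four-sided region.
region = [(1.0, 2.0), (4.0, 1.0), (5.0, 3.0), (2.0, 5.0)]
x_interval = project_x(region)
box = bounding_box(region)
```

Both operations are a single pass over the outline points. With a constraint representation, the same projection would require variable elimination over every convex cell of the region.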

  16. Role of Constraints in Constraint-Backed Databases • Define query semantics (abstract level) • for proving query correctness • to spare users from ad-hoc operators with arbitrary restrictions • Provide default data model (concrete level) • one of the available data representations • e.g. when data is truly multidimensional • For data integration • as intermediate representation between non-compatible systems

  17. DEDALE • Not a pure constraint database • Nesting takes place at abstract level: LandUse(lname,geom[x,y]); Flight(fname,traj[t,x,y,a]); Country(cname,geom[x,y,h]) • Queries use nest and unnest operations explicitly • Geometric representation in the concrete layer • geom in Country is represented as a TIN • traj in Flight is represented as a set of sample points along the flight path • Data model does not separate spatial and administrative data

  18. DEDALE vs. CQA/CDB • DEDALE schema (nested): LandUse(lname,geom[x,y]); Flight(fname,traj[t,x,y,a]); Country(cname,geom[x,y,h]) • CQA/CDB schema (flat): LandUse(lname,x,y); Flight(fname,t,x,y,a); Country(cname,x,y,h) • Query 1: Over which location were the airplanes flying at time t1? • DEDALE: MAP λX [X.fname, π_{x,y}(σ_{t=t1}(X.traj))] (Flight) • CQA/CDB: R0 := SELECT t=t1 from Flight; R1 := PROJECT R0 on fname,x,y • Query 2: Return the part of the parcels contained in rectangle Rect(x,y) • DEDALE: MAP λX [X.lname, X.geom ∩ Rect] (LandUse) • CQA/CDB: R0 := JOIN LandUse and Rect • Query 3: Return all land parcels that have a point in Rect(x,y) • DEDALE: π_{lname,geom}(MAP λX [X.lname, X.geom, σ_{(x,y) in Rect}(X.geom)] (LandUse)) • CQA/CDB: R0 := JOIN LandUse and Rect; R1 := PROJECT R0 on lname; R2 := JOIN R1 and LandUse • Slide annotations: Output limited to 2 spatiotemporal dimensions (3 in case of interpolated attributes) • Pure constraint DB not practical
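The first query (airplane locations at time t1) can be sketched over a nested Flight relation whose trajectories are stored as sample points. This is a rough illustration, not DEDALE's evaluation strategy. Linear interpolation between consecutive samples is an assumption made here, each trajectory is assumed to be sorted by time with at least two samples, and all names and data values are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Sample = Tuple[float, float, float, float]  # (t, x, y, a)


def position_at(traj: List[Sample], t1: float) -> Optional[Tuple[float, float]]:
    """Select t = t1 on one trajectory, then project onto (x, y).

    Assumption: samples are sorted by t, and the position between two
    consecutive samples is obtained by linear interpolation.
    """
    for (t0, x0, y0, _), (t2, x2, y2, _) in zip(traj, traj[1:]):
        if t0 <= t1 <= t2:
            if t2 == t0:
                return (x0, y0)
            w = (t1 - t0) / (t2 - t0)
            return (x0 + w * (x2 - x0), y0 + w * (y2 - y0))
    return None  # t1 lies outside the sampled time span


def locations_at(flight: Dict[str, List[Sample]],
                 t1: float) -> Dict[str, Tuple[float, float]]:
    """Apply the per-flight selection and projection to every flight,
    in the spirit of the MAP expression, keeping only flights that
    are airborne at t1."""
    result = {}
    for fname, traj in flight.items():
        pos = position_at(traj, t1)
        if pos is not None:
            result[fname] = pos
    return result


# Hypothetical data: two flights with two samples each.
flight = {
    "F1": [(0.0, 0.0, 0.0, 100.0), (10.0, 10.0, 20.0, 200.0)],
    "F2": [(20.0, 5.0, 5.0, 300.0), (30.0, 6.0, 6.0, 300.0)],
}
answer = locations_at(flight, 5.0)
```

The query text never mentions sample points or interpolation, which illustrates the point of the abstract layer: the user writes selection and projection, and the system maps them onto whatever concrete representation it holds.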

  19. Getting Constraint-Backed Systems Right • Clean semantics and full expressiveness of constraint databases • Geometrical representation issues not a user concern • though expert users may want to take more control • System support for three-tier architecture • More sophisticated than for pure constraint databases, or for current spatial databases • Query processing engine must • choose the best concrete representation for output queries, among those supported by system • select query evaluation strategies in the presence of a wider mix of possible representations and techniques • take into account storage and visualization • perhaps maintain multiple representations for the same data?

  20. Questions?
