1 / 57

Object-relational database systems

Object-relational database systems. Yong Yao CS632 April 17, 2001. Content. Introduction Michael Stonebraker: Inclusion of New Types in Relational Data Base Systems. Michael Stonebraker, Greg Kemnitz: The Postgres Next Generation Database Management System.

rosie
Télécharger la présentation

Object-relational database systems

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Object-relational database systems Yong Yao CS632 April 17, 2001

  2. Content • Introduction • Michael Stonebraker: Inclusion of New Types in Relational Data Base Systems. • Michael Stonebraker, Greg Kemnitz: The Postgres Next Generation Database Management System. • PREDATOR System Design Document: A detailed description of internal design decisions and implementation details.

  3. Introduction • The relational model • Dominant mainstream approach • Work well on business data processing • But things have changed drastically, nontraditional problems.

  4. Taxonomy of DBMS Application Yes No Use SQL data Simple Complex complexity

  5. Simple data Yes No Use SQL data Simple Complex complexity

  6. OODBMS Yes No Use SQL data Simple Complex complexity • A programming language with a type system • Persistent object

  7. ORDBMS Yes No Use SQL data Simple Complex complexity

  8. Features • Base type extension • Inheritance • Complex object

  9. Inclusion of New Types in Relational Database Systems Michael Stonebraker EECS Dept. University of California, Berkeley

  10. Example • Represent a set of boxes in the DBMS and a simple query- find all the boxes that overlap the unit square(0,1,0,1). • RDBMS: Create box(id=i4, x1=f8, x2=f8,y1=f8,y2=f8) select * from box where ! (box.x2<=0 or box.x1>=1 or box,y2<=0 or box.y1>=1) • Too hard to understand • Too complex to optimize • Too many clauses to check

  11. Example (cont’d) Create box(id=i4,desc=box) select * from box where box.desc!!”0,1,0,1” • Box: new data type • “!!”: overlap operator with two operands of data type box, return a boolean

  12. Operators for Boxes

  13. Motivation • Needs of business data processing applications Geographic Information System • New Types: Point, Line, Polygon; • New operators: distance, intersection ; • New access methods: R-trees, KDB trees;

  14. Conditions • The definition of user-defined data types • The definition of new operators for these data types • The implementation of new access methods for data types • Optimized query processing for commands containing new data types and operators

  15. Definition of New Types • Follow a registration process define type-name length = value, input = file-name, output = file-name • Occupy a fixed amount of space • Conversion routines

  16. Definition of New Operators

  17. Safety loophole • Problem: • an ADT routine which has an error can overwrite DBMS data structures • Unclear whether such errors are due to bugs in the user routines or in the DBMS • Solutions: • Run in separate address space • Build a language processor • Provide two environments

  18. New access methods • Users can add new access methods to support user-defined data types • Goal: extensibility • Interface: • Conditions for the operators • Information on the data types of operators • Define set of operators

  19. Example: B-tree • Conditions for operators:

  20. Information on the data types of operators (B-tree) • “<=“ is required • Type: specific type, fixed, variable, fix-var, type1, type2

  21. New set of operators (B-tree) F1=(value - low-key) / (high-key – low-key) F2=(high-key – value) / (high-key – low-key)

  22. Implementing New Access Methods • A collection of procedure calls • open (relation-name) • close (descriptor) • get-next(descriptor, OPR, value, tuple-id) • insert (descriptor,tuple) • delete(descriptor, tuple-id) • replace(descriptor, tuple-id, new-tuple) • build(descriptor,keyname,OPR)

  23. Hard problem- work with transaction management • log pages – simple, but may suffer from performance penalty • log events • Event-oriented interface • REDO(T) • UNDO(T) • LOG(event-type, event-data)

  24. Access path selection • Four pieces of information for optimization • Stups: estimate the expected number of records …where rel-name.field-name OPR value • A second selectivity factor S: …where relname-1.field-1 OPR relname-2.field-2 • Feasibility of merge-sort • Feasibility of hash-join Define operator token= AE, …… Stups=1 S= min(N1.N2). merge-sort with AL, hash-join

  25. Generate query processing plan • relname-1.field-1 OPR relname-2.field-2 • Merge sort • Iterative substitution • Hash Join • Relname.field-name OPR value • any access method with field-name as a key • Secondary index • Sequential search

  26. Summary • How to extend an abstract data type • How to define new access method • Integrate new access method with transaction management • Generation of optimized query processing plan

  27. The POSTGRES Next-GenerationDatabase Management System Michael Stonebraker Greg Kemnitz

  28. Three kinds of services forDBMS • Traditional data management • Simple data type • Object management • Complex data types, e.g. bitmaps, icons, text… • Knowledge management • Store and enforce a collection of rules

  29. Example- newspaper layout • store and manipulate text and graphics, Bill customers for advertisement. • Data: customer information • Object: text, pictures and icons • Rules: control newspaper layout

  30. POSTGRES Data Model-Design criteria • Traditional relational DBMSs data model: a collection of named relations • Orientation toward database access from a query language • Interact with database by query language-POSTQUEL • User defined functions

  31. Design criteria • Orientation toward multilingual access • Two selections • One language tightly coupled to the system • Multilingual • POSTGRES is programming language neutral

  32. Design criteria • Small number of concepts • As few as possible • Four constructs: • Class • Inheritance • Type • Function

  33. Data Model - Class • Class(constructed type or relation) : a named collection of instances of objects. • Instances(record or tuple) : has the same collection of attributes create EMP(name=string, salary= float, age=int)

  34. Data model - Inheritance • Inherit data elements from other classes create EMP(name=string, salary= float, age=int) create SALESMAN(quota=float) inherits EMP • Multiple inheritance – only create objects without ambiguity EMP SALESMAN

  35. Three kinds of classes • Real class: instances are stored in the database • Derived class :view • Version: store differential relative to its base class • Initially has all instances of the base class • Freely updated to diverge from the base class • Updates to the version do not affect the base class • Updates to the base class are reflected in the version • Supported by the rule system

  36. Version Example create version my-EMP from EMP Two classes generated: EMP-MINUS(deleted-OID) EMP-PLUS(all-fields-in EMP, replaced-OID) • Retrieve: retrieve EMP-PLUS instead • Insert: All new instances for EMP or my-EMP will be added into EMP-PLUS; • Delete: move records from EMP-PLUS to EMP-MINUS

  37. Data model - Types • Base types: • Int, float, string • construct new base types. • Arrays of base types: • Composite types: • Construct complex objects: attributes contain other instances as part or all of their value

  38. Composite types • Contains zero or more instances of the same class create EMP (… , manager=EMP, coworkers=EMP) • Set: value is a collection of instances from all classes create EMP(… , hobbies=set) {softball, skiing, skating…}

  39. Data Model - Functions • C functions • Arbitrary C procedures • Can not be optimized by the POSTGRES • Argument: base types or composite types • Inherited down the class hierarchy • overpaid(EMP) • overpaid(SALESMAN)

  40. Functions • Operators • Utilize index • Functions with one or two operands • {ALT, ALE, AE, AGT, AGE} • Allow new access methods • POSTQUEL functions • Any collection of commands in the POSTQUEL Define function high-pay returns EMP as Retrieve (EMP.all) Where EMP.salary>50000

  41. POSTQUEL • Set-oriented query language • Support nested queries • Transitive closure • Support for inheritance • Support for time travel • Allow a user to run historical query retrieve (EMP.salary) from EMP [T] where EMP.name=“Sam” • Maintain two different physical collections of records • Vacuum cleaner: a daemon moves records

  42. The Rules System • Requirements: • referential integrity • View management • Triggers • Integrity constraints • Protection • Version control • Design a general purpose rules system

  43. Syntax of rules ON event (TO) object WHERE on new EMP.salary where POSTQUEL-qualification EMP.name=“Fred” THEN DO [instead] then do replace POSTQUEL-command(s) E (salary=new.salary) from E in EMP where E.name=“Joe” • Event: retrieve, replace, delete, new … • Object: name of a class or class.column • POSTQUEL-qualification: normal qualification • POSTQUEL-commands: a set of POSTQUEL commands

  44. Forward and Backward chaining • Rules specify additional actions, and these actions may activate other rules on new EMP.salary where on retrieve to EMP.salary where EMP.name=“Fred” EMP.name=“Joe” then do replace then do instead retrieve E(salary=new.salary) (EMP.salary) from E in EMP where EMP.name=“Fred” where E.name=“Joe”

  45. Implementation of Rules • Record level processing • Rules system is called when individual records are accessed • Place a marker on the record • Query rewrite module • Perform poorly for a large number of small-scope rules • Desirable when there are a small number of larger-scope rules

  46. When rules are activated • Immediate-same transaction • Immediate-different transaction • Deferred-same transaction • Deferred-different transaction • Currently only implements the first option

  47. Storage System • ‘no-overwrite’ storage manager • Old record remains in the database whenever an update occurs • Instantaneous crash recovery • Support time travel • stable main memory required

  48. Performance • Better than UCB-INGRES

  49. But • Compare to the Cattell system, loses by about a factor of two.

  50. Comments • POSTGRES allows an application designer to trade off performance for data independence • Imports only specific user functions into its address space

More Related