IELM 511: Information System design
Introduction
Part 1. ISD for well structured data – relational and other DBMS
- Info storage (modeling, normalization)
- Info retrieval (Relational algebra, Calculus, SQL)
- DB integrated APIs
Part 2. ISD for systems with non-uniformly structured data
- Basics of web-based IS (www, web2.0, …)
- Markup languages: HTML, XML
- Design tools for Info Sys: UML
Part 3. (one out of)
- APIs for mobile apps
- Security, Cryptography
- IS product lifecycles
- Algorithm analysis, P, NP, NPC

Agenda
- Brief introduction to DBMS architecture
- Some other DB types

DB architecture components
Any DB has three (or four) essential components:
1. The process (DBMS application)
2. System memory (RAM on the computer where the process is running)
3. Permanent data store (usually on a hard disk)
and, for most DBs,
4. A computer network
[Figure: typical client-server DB model (e.g. the one we use in our labs); labels: client (x3), cpu, RAM, disk]
Why do we need any other model?

DB architectures: need for other models
Very large number of queries per minute → parallel systems
[Figure: three parallel architectures built from cpu, RAM, and disk units: shared nothing, shared disk, shared memory]
Note: a parallel DB may or may not store data in different locations, or across a network.

DB architectures: need for other models (continued)
Network is not reliable, but services are critical → Distributed DB (e.g. Park-n-Shop)
Distributed DB: a single logical database that is split into fragments, where each fragment is controlled by a separate DBMS
[Figure: sites 1 to n, each with its own cpu, RAM, and fragment (frag 1, frag 2, …, frag n)]
Each site can process queries that access local data as well as data on other computers in the network.
Data is distributed, but this is transparent to the user!

Distributed DBs, examples
A distributed system appears to the user as a “centralized system”:
- Users do not need to worry about network details
- Users are not concerned with where the data is stored, or with redundantly stored data
- Tables may be stored in multiple fragments, but this is transparent to the user
→ The DBMS must have special functions to provide this transparency!

Distributed DBs, advantages
- Increased reliability and availability
- Improved performance for queries on local data
- Easier expansion

Distributed DBs, examples
Extra functionality embedded in a distributed DBMS:
- Keeping track of distributed data
- Distributed query processing
- Distributed transaction management
- Maintaining consistent copies of replicated data
- Distributed database recovery
- Security: user access authorization
- Distributed catalog management
Examples of fragmented data:
- Some tables are only stored at some sites
- Vertical fragmenting of tables, e.g. some columns of a table at one site, others at another site
- Horizontal fragmenting of tables, e.g. different rows stored at different sites
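
Horizontal and vertical fragmenting can be illustrated with a minimal Python sketch. It is not how any particular distributed DBMS implements fragments; the 'accounts' table, its column names, and the branch names are hypothetical. The point it shows is that the original table can be recovered as the union of its horizontal fragments, or as the join of its vertical fragments on the primary key.

```python
# Hypothetical 'accounts' table; all column names and values are illustrative assumptions.
accounts = [
    {"acct_no": 101, "branch": "Kowloon", "owner": "Chan", "balance": 500},
    {"acct_no": 102, "branch": "Central", "owner": "Lee", "balance": 1200},
    {"acct_no": 103, "branch": "Kowloon", "owner": "Wong", "balance": 75},
]


def horizontal_fragments(rows, key):
    """Split rows into fragments by the value of one column (one fragment per site)."""
    fragments = {}
    for row in rows:
        fragments.setdefault(row[key], []).append(row)
    return fragments


def vertical_fragments(rows, pk, column_groups):
    """Split columns into fragments; every fragment keeps the primary key for rejoining."""
    return [
        [{pk: row[pk], **{c: row[c] for c in cols}} for row in rows]
        for cols in column_groups
    ]


def rejoin_vertical(fragments, pk):
    """Reconstruct full rows by joining vertical fragments on the primary key."""
    merged = {}
    for fragment in fragments:
        for row in fragment:
            merged.setdefault(row[pk], {}).update(row)
    return [merged[k] for k in sorted(merged)]


# Different rows at different sites (here: one fragment per branch)
by_branch = horizontal_fragments(accounts, "branch")

# Different columns at different sites
v_frags = vertical_fragments(accounts, "acct_no", [["branch", "owner"], ["balance"]])

# Union of horizontal fragments, and join of vertical fragments, restore the table
restored_h = sorted(
    (row for rows in by_branch.values() for row in rows),
    key=lambda r: r["acct_no"],
)
restored_v = rejoin_vertical(v_frags, "acct_no")
```

Keeping the primary key in every vertical fragment is what makes the rejoin possible; a real distributed DBMS would hide this reconstruction from the user.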

Distributed DB examples
- Multiple stores belonging to the same retail chain (e.g. Park-n-Shop)
- Multiple branches of the same bank
- Domain name service (DNS) for the internet

Need for even more (other) models: Object Oriented DBs
Why have Object Oriented Databases?
- Need for more complex applications
- Need for additional data modeling features
- Increased use of object-oriented programming languages since 1990
Commercial OO Database products: Ontos, Gemstone, O2 (Ardent), Objectivity, Objectstore (Excelon), Versant, Poet, Jasmine (Fujitsu-GM)

Object Oriented DBs
Main idea in OODB: DB objects should have a direct correspondence to real-world objects
Advantage: objects maintain their integrity and identity → ease of modeling and maintenance
An object is composed of: Data (values of attributes) and Behavior (methods or operations)
Relational DB: simple program objects (tables); data about a single object may be spread over multiple tables (e.g. account data in our Bank DB)
OODB: program objects can be arbitrarily complex; however, all data and functions related to one object are stored together.

Some Key concepts of OODBs
Encapsulation
At the time when an object is defined, the user must define:
- All data and its type
- All operations a user can apply to the object
Contrast this with Relational DBs: data is defined, but operators are system functions, not specific to objects.
Operator Polymorphism
- Each object encapsulates its own methods
- Different objects may have some similar actions (e.g. subtract some amount from a ‘loan’ object, or from an ‘account’ object)
- Polymorphism allows the same operator name to be used by different objects (Note: the actual functions are different, although they do similar things)
Constructors, Destructors
Object instances are created by constructors (the equivalent of inserting row(s) in an RDB) and deleted by destructors (the equivalent of deleting row(s) in an RDB).
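
Encapsulation and operator polymorphism can be sketched in Python. The 'account' and 'loan' objects come from the slide, but the attribute names and the rules inside each subtract method are assumptions made for illustration. Each class bundles its data with its own operations, and the same operator name runs a different function depending on the object.

```python
class Account:
    """Encapsulates account data together with the operations allowed on it."""

    def __init__(self, acct_no, balance):
        # Constructor: creating an instance is the analogue of inserting a row
        self.acct_no = acct_no
        self.balance = balance

    def subtract(self, amount):
        """Withdrawal (assumed rule): the balance may not go negative."""
        if amount > self.balance:
            raise ValueError("insufficient funds")
        self.balance -= amount
        return self.balance


class Loan:
    """Encapsulates loan data; 'subtract' here means a repayment."""

    def __init__(self, loan_no, outstanding):
        self.loan_no = loan_no
        self.outstanding = outstanding

    def subtract(self, amount):
        """Repayment (assumed rule): the outstanding amount never drops below zero."""
        self.outstanding = max(0, self.outstanding - amount)
        return self.outstanding


# Same operator name, different function chosen by the type of each object
results = [obj.subtract(100) for obj in (Account(101, 500), Loan("L-7", 300))]
```

In a relational DB the equivalent logic would sit in application code or system functions outside the tables, rather than travelling with the object.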

Some Key concepts of OODBs (continued)
Object hierarchy and inheritance
Objects can be organized in a hierarchical structure (e.g. the ‘account’ class is a super-class of the ‘savings_account’ and ‘checking_account’ classes).
Objects of a sub-class inherit attributes and methods from parent classes in the hierarchy.
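
A minimal Python sketch of the account hierarchy follows. The class names mirror the slide's example; the attributes (rate, overdraft_limit) and methods are hypothetical. Both sub-classes inherit the attributes and the deposit method of the super-class, and each adds behavior of its own.

```python
class Account:
    """Super-class: attributes and methods shared by every kind of account."""

    def __init__(self, acct_no, balance=0.0):
        self.acct_no = acct_no
        self.balance = balance

    def deposit(self, amount):
        self.balance += amount
        return self.balance


class SavingsAccount(Account):
    """Sub-class: inherits acct_no, balance, deposit(); adds an interest rate."""

    def __init__(self, acct_no, balance=0.0, rate=0.02):
        super().__init__(acct_no, balance)
        self.rate = rate  # hypothetical attribute

    def add_interest(self):
        self.balance += self.balance * self.rate
        return self.balance


class CheckingAccount(Account):
    """Sub-class: inherits from Account; adds an overdraft limit."""

    def __init__(self, acct_no, balance=0.0, overdraft_limit=100.0):
        super().__init__(acct_no, balance)
        self.overdraft_limit = overdraft_limit  # hypothetical attribute

    def withdraw(self, amount):
        if self.balance - amount < -self.overdraft_limit:
            raise ValueError("overdraft limit exceeded")
        self.balance -= amount
        return self.balance


savings = SavingsAccount(1, 100.0, rate=0.5)
savings.deposit(100.0)   # inherited from Account
savings.add_interest()   # defined only on SavingsAccount
```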

Practical situation of OODBs
Commercial OO Database products: Ontos, Gemstone, O2 (Ardent), Objectivity, Objectstore (Excelon), Versant, Poet, Jasmine (Fujitsu-GM)
Commercial success and penetration: < 1% of total market.
Possible reasons:
- OODBs were introduced in the 1990s, by which time RDBs dominated most markets. Switching costs were too high.
- Operator efficiency cannot match RDB.
- OODBs lack the simplicity and universality of SQL.
Oracle provides support for Object-Relational DBs for special applications.
- These try to capture the best of RDB and OODB

Object-Relational DBs
Main features:
- User-Defined Types, Object IDs, Nested Tables
No standard implementation among different DB vendors.
Most common interface standard: SQL-99
User Defined Types (UDT):
    CREATE TYPE <typename> AS ( attribute_1 data-type_1, … );
Subsequently, a table may be defined in terms of UDTs:
    CREATE TABLE <table name> OF <typename>;
UDTs and nested tables allow the design of the DB to appear more like real-world objects (internally, the DB, e.g. Oracle, may convert these into regular tables).

Spatial and Temporal Databases
Recent advances in the data storage field include:
- Spatial Databases
- Temporal Databases

Spatial Databases
Motivating example 1: Google maps or GPS programs
- Storage: a ‘map’, possibly with different models, e.g. terrain, road, …
- Queries: Find object of type x ‘near’ point p; Find shortest route from point p to point q; Is point p in zone z (e.g. a district or country)?
Motivating example 2: DB models for medical applications: CT scans
- Storage: CT scan of a human brain
- Queries: Find a path from point p to point q along artery A; Find cell-cluster of type tumor_x

Spatial databases (continued)
Provide Spatial Data Types (SDT) in the model and in the query language:
- Point, Line, Region
- Relationship(s) between them, e.g. point p is on line L
The DBMS provides support for SDTs:
- Spatial indexing (for quickly locating, e.g., a point in a region)
- Spatial joins

Spatial databases (continued)
Example 1 (fast response with spatial indexing): find all electronics factories in the PRD area
    SELECT fname
    FROM factories f
    WHERE f.location inside PRD.area
Example 2. Spatial join: a join that compares any two joined objects based on a predicate on their spatial attribute values
For each highway passing through PRD, find all factories within 2 km.
    SELECT h.highway, f.fname
    FROM highways h, factories f
    WHERE h.route intersects PRD.area
      AND distance(h.route, f.location) < 2 Km
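
The two spatial predicates used in these queries can be sketched in plain Python. This is a naive illustration without any spatial index, not what a spatial DBMS does internally. The coordinates (in km), the factory and highway names, and the square standing in for the PRD area are all made-up assumptions. "Intersects" is simplified to "at least one route vertex lies inside the area", which would miss a route that crosses the area without having a vertex in it.

```python
import math


def point_in_polygon(p, polygon):
    """Ray-casting test: True if point p lies strictly inside the polygon."""
    x, y = p
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Count edges that cross the horizontal ray going right from p
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def dist_point_segment(p, a, b):
    """Shortest distance from point p to the line segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its end points
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def dist_point_route(p, route):
    """Distance from p to a route given as a list of connected points."""
    return min(dist_point_segment(p, a, b) for a, b in zip(route, route[1:]))


# Hypothetical data, coordinates in km
prd_area = [(0, 0), (10, 0), (10, 10), (0, 10)]
factories = {"F1": (2, 3), "F2": (8, 8), "F3": (12, 5)}
highways = {"H1": [(-1, 2), (5, 2), (11, 2)], "H2": [(20, 0), (20, 10)]}

# Example 1: factories located inside the area
in_prd = sorted(n for n, loc in factories.items() if point_in_polygon(loc, prd_area))

# Example 2: spatial join of highways and factories on a distance predicate
near = sorted(
    (h, f)
    for h, route in highways.items()
    if any(point_in_polygon(v, prd_area) for v in route)  # simplified 'intersects'
    for f, loc in factories.items()
    if dist_point_route(loc, route) < 2
)
```

The join loops over every highway-factory pair; avoiding that exhaustive comparison is the job of the spatial indexing a spatial DBMS provides.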

Temporal databases
Most DBs record data; if the data is available in the DB, then it is ‘true’; if not, then it is ‘not true’.
Temporal DBs record not only data, but also store a validity time window for all data.
Thus, each data record has two time stamps:
- Transaction time
- Valid time
Motivating example 1: Internet games, e.g. Second Life
- Storage: similar to RDB, but with additional valid time for each cell
- Queries: Was Anton in coffee_shop at the same time as Dave?
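
The coffee_shop query reduces to an interval-overlap test on valid times, sketched below in Python. The record layout, the field names, and the use of integer minutes for time are assumptions made for illustration. Each record carries a valid-time window (when the fact held in the modeled world) and a transaction time (when it was recorded in the DB).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Presence:
    avatar: str
    place: str
    valid_from: int   # valid time start (inclusive), assumed minutes since midnight
    valid_to: int     # valid time end (exclusive)
    recorded_at: int  # transaction time: when the record was stored


def overlaps(a, b):
    """True if the valid-time windows of two records share at least one instant."""
    return a.valid_from < b.valid_to and b.valid_from < a.valid_to


def together(records, who1, who2, place):
    """Were who1 and who2 at the same place at the same time?"""
    r1 = [r for r in records if r.avatar == who1 and r.place == place]
    r2 = [r for r in records if r.avatar == who2 and r.place == place]
    return any(overlaps(a, b) for a in r1 for b in r2)


def snapshot(records, place, t):
    """Who was at 'place' at valid time t?"""
    return sorted(
        r.avatar for r in records
        if r.place == place and r.valid_from <= t < r.valid_to
    )


# Hypothetical data
records = [
    Presence("Anton", "coffee_shop", 600, 660, 661),
    Presence("Dave", "coffee_shop", 630, 700, 701),
    Presence("Dave", "library", 700, 760, 761),
    Presence("Eve", "coffee_shop", 660, 690, 691),
]

anton_met_dave = together(records, "Anton", "Dave", "coffee_shop")
anton_met_eve = together(records, "Anton", "Eve", "coffee_shop")
```

Because the end of each window is treated as exclusive, Anton leaving at 660 and Eve arriving at 660 do not count as being there at the same time.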

Concluding remarks
Other than RDBMS, several other DB types have been used successfully.
Advantages of these types depend on the usage:
- When data is handled by many geographically separated, localized operations, it may be better to use Distributed DBs.
- When the application is space/geography related, Spatial DBs may be used instead of building special APIs.
- When data validity and the time of events are important, Temporal DBs may be useful (e.g. internet games, cyber-crime detection, …).

References and Further Reading
- Chaps. 16, 18, 21: Silberschatz, Korth, Sudarshan, Database System Concepts, McGraw Hill
- Chaps. 20, 24, 25, 27: Elmasri and Navathe, Fundamentals of Database Systems, Addison-Wesley
Next: IS for non-structured data