330 likes | 549 Vues
CIS 507 Database Programming. Database Concepts DBMS Models. Database. A DATABASE is an organized collection of related data McFadden-Hoffer a shared collection of logically related data designed to meet the information needs of multiple users in an organization Kroenke
E N D
CIS 507Database Programming Database Concepts DBMS Models
Database A DATABASE is an organized collection of related data • McFadden-Hoffer • a shared collection of logically related data • designed to meet the information needs of multiple users in an organization • Kroenke • a self-describing collection of integrated records • self-describing: in addition to the user’s source data, contains a description of its own structure • collection of integrated records: user data, metadata (data about data), indexes to represent relationships among data and improve performance, data about applications that use the database • Rob-Coronel • a shared integrated computer structure that houses a collection of • end-user data--i.e. raw facts of interest to the end-user • metadata through which the data are integrated
File Management Systems • The predecessor to the modern database management system • Problems • Maintenance: Data created, managed, and accessed primarily through 3GL (COBOL, PL/I) • Data dependent: all components sensitive to changes made to data structure or storage and retrieval methods • Data Redundancy: uncontrolled duplication of data • Data Inconsistency • Data Anomalies • Sharing: inconsistent standards made it impossible to share data
DBMS A Database Management System (DBMS) is general purpose software and hardware facility to: • Create, delete, reorganize, and manipulate data in a database • Store, retrieve, share, and maintain data in a database • Maintain relationships between the database components • Provide security and procedures relating to privilege and access. • The integrity of all the updates and transactions that are carried out. • interface for the access, deletion and addition of data and for redefining the relationships within the database. A DBMS is a collection of programs that manages the database structure and controls access to the data stored in the database.
DBMS Disadvantages • DBMSs are complex; • Need for explicit backup and control; • Costs associated with development and operation can be substantial; • Consolidation of an entire business’ information resources can create a high level of vulnerability.
The Database System Environment • Hardware: computer, storage, networks, devices • Software: OS, DBMS, Applications, Untilities • People: • System Administrator • Database Administrator • Database Designers • Systems Analysts and Programmers • End-Users • Procedures • Data
Database Systems Types • Number of Users: • Single-user: usually desktop • Multi-user: workgroup (small); enterprise (large) • Location: • Centralized: all data stored in a database at a single site • Distributed: database is distributed across several sites • Type and Use: • Production (transactional): designed to support day-to-day use • Decision Support: designed to make tactical and strategic decisions at middle- and high-management levels • Decision Support Systems (data warehouse): use of historical data from many sources to make decisions such as pricing, sales forecasts, marketing positioning (e.g. structural estimates for insurance by underwriters)
DBMS Functions • Data Dictionary Management • Data Storage Management--Data Storage Definition Language (DSDL) • Data Transformation and Presentation • Database Control Language (DBCL) • Security Management • Multi-User Access Control • Backup and Recovery Management • Data Integrity Management • Data Access Languages • Data Definition Language (DDL) • Data Manipulation Language (DML) • Application Program Interfaces • COBOL, C, PASCAL, Visual Basic • Administrative Utilities • Data Communication Interfaces • queries, reports, email through web browsers
Features of a good DBMS • Open ended--can be extended • Flexible--can be changed • Efficient • Easy to use • Security should be built-in. • Data independence
Models • A database is a model of a user’s model of reality (Kroenke) • Many different types of models involved in databases • Reality • Objects, Properties, Relationships • Unique Identifier • ANSI/SPARC • Conceptual Model • External Model • Internal Model • Physical Model
ANSI/SPARC ArchitectureAmerican National Standards Institute/Standards Planning and Requirements Committee Conceptual Model (Database Administrator View) External Model (end-user views) Internal Model (end-user views) Physical Model (storage view)
Conceptual Model • Global view of data • Enterprise-wide view as seen by DBA • Conceptual schema • basic blueprint for the database design • frequently represented with E-R diagrams • Hardware and software independent
External Model • Accessed by • Application programmer • End-user • External Schema • User’s authorized view of the data • A subset of the Conceptual Schema or a logical view of the Conceptual Schema • Hardware Independent; software dependent
Internal Model • Implementation of Conceptual Schema • Hierarchical Model DBMS • Network (CODASYL) Model DBMS • Relational Model DBMS • Object-Oriented Model DBMS • Semantic Model DBMS • Hardware independent; software dependent
Physical Model • Description of how data is to be stored • Definition of physical storage devices • Definition of physical access methods • Hardware and software dependent
Modeling Reality • Common Conceptual Modeling Terms: • Entity: a person, place, event, or thing for which data is to be collected • Attributes: properties or characteristics of an entity which describe the entity in the context of interest • Identifier: a means of distinguishing one entity from another • Entity Class: a collection of all entities of the same type, i.e. entities that have exactly the same properties • Relationship: an association among entities in the same or different classes • Internal Modeling Terms: unique to Internal (implementation) model • Hierarchical • Network (CODASYL) • Relational • OODBMSSemantic • Physical Model: strategy for storage and access is unique to the internal model
ADVISOR ADVISEE INSTRUCTOR STUDENT FACULTY OFFICE Relationships • One-to-Many • Advisor may have many Advisees • Advisee has but one Advisor (our choice) • Many-to-Many • Instructor may have many Students • Student may have many Instructors • One-to-One • Faculty is assigned to one office • An office is assigned to one faculty
Data Integrity Constraints Measures taken to ensure data is accurate • Business Constraints: rules that must be satisfied for the business • example: managers vacation days shall not exceed 20 • Entity Integrity Constraint: there is an attribute of the entity that is used to uniquely identify that entity • example: student id or ss# • Static Domain Constraint: a value for a property can only be one of the items in a predefined list • example: faculty may only be instructor, assistant professor, associate professor, professor • Referential Integrity Constraint: in a one-to-many association, an entity on the many side must be associated (reference) an entity on the one side • example: in an advising relationship, a student’s advisor must be a faculty member
Data Independence • Physical Data Independence: • Application programs and terminal activities remain logically unimpaired whenever any changes are made in either storage representation or access methods. • Logical Data Independence: • Application programs and terminal activities remain logically unimpaired when information preserving changes of any kind are made to the conceptual design
Hierarchical Model • Entity Segment • Entity Class Segment type • Attributes: Fields • Identifier Value-bearing field or disk address reference • Relationships • Conceptual Model--1-1 and 1-many in parent- child relationship; some support for two parents for same child • Physical Model--uses child/twin pointer strategy for 1-many • Data Access 3GL products--COBOL, PL/I, C, Pascal • Commercial Products IMS (DL/I) and Focus
DBMS Models - Hierarchical A cruise ship reservations system Ports of Departure Miami Los Angeles New York Names of Ships QE 2 The Love Boat The Oriana Sailing Dates April 15 May 30 July 15 Cabin Numbers A-1 A-2 A-3 The only way into the system is via the port, searching for ships or dates is inefficient - very rigid and inflexible.
Hierarchical Model • Advantages • Common database makes sharing practical • Security is provided and enforced • Supports some data independence • Referential integrity maintained through parent-child relationship • Very efficient for data models that are hierarchical (one-to-many) • Many hierarchical type applications are on mainframes
Hierarchical Model • Disadvantages • Knowledge of physical level required • Does not support logical data independence and does not support all physical data independence operations • Not all problems are one-to-many types • Problems with multiple parent implementation • Problems with anomalies for parent deletion • Application development in 3GL time-consuming • Support programs are not part of the DBMS • “System created by programmers for programmers!”
Network (CODASYL) Model • Entity Record • Entity Class Record type • Attributes: Data items • Identifier Value-bearing field or disk address reference • Relationships • Conceptual Model--1-1 and 1-many in owner-member set relationship; some provide elementary many-to-many relationships • Physical Model--same type records: doubly-linked, ringed structure • owners: additional references to first & last associated member in each set • members: additional references to associated owner in each set • Data Access 3GL products--COBOL, PL/I, C, Pascal • Commercial Products DBMS-10, IDMS (Cullinet), IDS (Honeywell), TOTAL, IMAGE, MDBS-III
Journalism 101 Film 200 TV 210 Courses Instructors D. Barry R. DeNiro D. Rather Students Student A Student B Student C Student D Student E DBMS Models - Network A college class scheduling system Here the relationships are much more flexible but the complexity of link management makes this approach unwieldy and slow.
Network Model • Advantages • Can be used to directly implement one-to-one, one-to-many, and some many-to-many relationships • Access, navigation is superior to hierarchical model • Enforces referential integrity through owner-member relationship • Achieves some physical data independence
Network Model • Disadvantages • Difficult to design and use • Does not support logical data independence • Very complex--not for the novice • Navigation is achieved at the record level
Relational Model • Entity Row (Tuple) • Entity Class Table (Relation) • Attributes: Column (?dimension?) • Identifier Value-bearing field or generated value • Relationships • Conceptual Model--1-1 and 1-many relationships • Physical Model--uses foreign key to link parent to child • Data Access • 4GL-- SQL • 3GL products--COBOL, PL/I, C, Pascal • Commercial Products ACCESS, ORACLE, DB2, SQL/DS, RBASE 500, INGRES, SYBASE
... ... Information Systems Joe Bloe MIS1100 0970000 ... ... Business Statistics Julie King MIS1150 0970010 ... John Smith 0970012 ... Anne Oether 0970015 ... John Smith 0970035 ... MIS1100 0970000 ... MIS1100 0970010 ... MIS1100 0970015 ... MIS1150 0970000 ... MIS1150 0970012 ... MIS1150 0970035 Relational Database Structure Unit Table Student Table Unit_Student Relationship Table Data relating to the relationship is stored in the relational table. Recording the semester of enrolment, marks, and the grade for each student along with the relationship places logically related data in one location.
Relational Model • Advantages • User can focus on only the logical view • Powerful query capabilities from 4GL—SQL • Ad hoc query capability • Aggregate processing as opposed to record at a time • Standardization of language • Creation, management, and data manipulation language • Easier to make changes to the logical design without affecting applications (Logical Data Independence)
Relational Model • Disadvantages • More powerful computers are needed because so much is done for the user • Ease of use creates a false sense of security in the area of design