160 likes | 363 Vues
More on the Relational Database Model. We have presented details of four basic steps involved in relational database design: Identification of entities Identification of candidate key fields and actual key fields Creation of entity-relationship diagram Design of relational database
E N D
More on the Relational Database Model • We have presented details of four basic steps involved in relational database design: • Identification of entities • Identification of candidate key fields and actual key fields • Creation of entity-relationship diagram • Design of relational database • We will examine the four basic steps above in more detail, focusing on: • Rules for transactional database design • Ensuring integrity of data in tables • Basic operations for combining/analyzing data in tables • Defining indexes • Cleaning dirty data MIS 90-728 Lecture Notes
Database Design for Transactions • The central role of most databases is to record transactions: • Cars are towed; • Students are assigned to classes; • Clients in a social services office have their needs assessed. • These entities are useful only if there are other entities that explain: • Who provides the service; • Who receives the service; • How the service is characterized. • In addition, a linking entity set can be created to handle a many-to-many relationship: • Multiple students can attend multiple classes; • Multiple cars can undergo multiple inspection processes. MIS 90-728 Lecture Notes
1 M M 1 (undergoes) (performs) INSPECTION VEHICLE GARAGE Products-in-Columns Form • A particular object is used in a fixed (small) number of transactions • Example: car inspection Inspection data E-R Diagram Relational Database Design GARAGE(Garage ID, Garage Name, ...) VEHICLE (VIN, Make, Year, ...) INSPECTION (Inspection Number, VIN@, Garage ID@, Inspection Date, Brakes, Lights, Steering) Recall that one-to-many relationships are implemented in relational databases as foreign keys MIS 90-728 Lecture Notes
1 M M M (requests) (used in) M 1 SALESPERSON CUSTOMER PRODUCT SALE Classical Linking Table Form • A particular object is used in a variable (large) number of transactions • Example: a storesells 5,000 different products (SKUs). Can we use products-in-columns form? (rings up) • Problems with a many-to-many relationship: • Data redundancy: • System efficiency and output errors: MIS 90-728 Lecture Notes
(appears in) 1 (requests) M (appears in) (conducts) M 1 SALESPERSON SPLINK SALE CUSTOMER PRODUCT Products-in-Rows Form • Devise a “bridge” (linking) entity that replaces one many-to-many relationship with two one-to-many relationships. • Example: create an SPLINK table with attributes based only on ProductID and SalesID E-R Diagram Relational Database Design SALESPERSON (Employee ID, Employee First Name, Employee Last Name, ...) CUSTOMER (Customer ID, Customer First Name, Customer Last Name, ...) PRODUCT CODE (Product Code, Product Description) SALE (Sale ID, Transaction Date, Customer ID@, Employee ID@) SPLINK (Sale ID@, Product Code@ Quantity, Price) tblSPLINK MIS 90-728 Lecture Notes
One-to-One Relationships • A one-to-one relationship occurs when an instance of one attribute is associated with one (or possibly zero) instances of another attribute. • For example, a TEACHER entity can be associated with one instance of the OFFICE entity and vice-versa 1 1 TEACHER OFFICE Why might one define distinct entities in a one-to-one relationship? MIS 90-728 Lecture Notes
Generalization Hierarchies • A generalization hierarchy occurs if one entity is uniquely associated with two or more attributes, and that relationship can be characterized as two or more one-to-one entity relationships • For example, a TEACHER entity could be linked with the entities ASSIGNED_PC, ASSIGNED_MAC and ASSIGNED_WORKST TEACHER 1 1 1 G G 1 1 1 ASSIGNED_PC ASSIGNED_MAC ASSIGNED_WORKST MIS 90-728 Lecture Notes
Properties of Keys and Data Integrity Recall the definition of a primary key: In addition, a primary key must: • Take non-null values at all times • Determine all non-key attributes These properties represent entity integrity for the RDBM. There are a variety of other keys that can appear in RDBMs: • Superkey: • Candidate key: • Secondary key: A foreign key field in a given table must either: • be null (no value has yet been assigned), or • take a value equal to one of the key values in the related table This property represents referential integrity for the RDBM. MIS 90-728 Lecture Notes
Code Attributes • Codes are a way of classifying attribute values that ensure uniformity and consistency • Example: consider an attribute of the STUDENT entity called ST_Dept Without codes, the department “Electrical and Computer Engineering” could be recorded as: “Elec./Comp. Eng.”, “ECE”, “E&CE”, etc. • Codes reduce key entry errors, inconsistency and obsolescence. • Three types of code implementation schemes: • Self-documenting: • Cryptic codes and descriptions: • All codes in a single table: MIS 90-728 Lecture Notes
Relational Database Table Operations A number of operations that can be performed on one or more tables form the foundation for queries: • Union • Intersect • Difference • Product • Select • Project MIS 90-728 Lecture Notes
Relational Database Table Operations • Join: combine information from two or more tables linked by common attributes • Example: MIS 90-728 Lecture Notes
1 1 1 When is Data Redundancy a Good Thing? • Generally, aside from foreign keys, there should be as little repetition of entity attributes between tables as possible. • Sometimes, however, it is advantageous to have certain fields appear in more than one table as non-FK fields: • Line_Price represents the price of the item at the time it was purchased; • Line_Number allows reporting of purchases in the order they were actually made MIS 90-728 Lecture Notes
Multiple indexes for a table are defined easily in Access Indexes and Data Retrieval Primary key fields distinguish every table row (entity) from each other, enabling easy lookups of particular rows by that primary key value. But what if we wish to identify particular rows of a table quickly through non-primary key values? For example, we wish to identify all invoices that have been created on a certain date, but do not wish to examine each row of the table to find a date that matches the one in mind. Define a table index, or “pointer” to rows of the table based on values of the designated attribute. MIS 90-728 Lecture Notes
Data Cleaning • Database development often entails input of data from multiple, unreliable sources: • mainframe databases • spreadsheets • paper files • verbal descriptions • Foreign data must be preprocessed (“cleaned”) before the database may be updated: • maintain consistent lists of codes • identify duplicate records • identify records common to two or more tables • update primary keys via missing foreign key values Data cleaning may be the most important task in database implementation MIS 90-728 Lecture Notes
A Note on Names • Entities: • Names should be simple and descriptive and capitalized for clarity, e.g. EMPLOYEE, JOB CLASSIFICATION, etc. • Attribute names should be title case and correspond to natural language, e.g. First Name, Rank, Salary • Tables • Names can match the names of corresponding entities in title case, e.g. Employee • OR follow so-called “Hungarian convention”: “tbl” preceding name, e.g. tblEmployee, tblJobClassification • Fields • Field names may be identical in format to Attribute names • OR be linked explicitly to table name, e.g. Emp_Fname, Emp_LName Most RDBMs in practice use Hungarian convention and field names linked to table names. Whatever you do, be consistent! MIS 90-728 Lecture Notes
Packages such as Microsoft Access, FoxPro and Oracle support this functionality Bringing It All Together in a Commercial RDBM Package • Good relational database software packages allows the user to: • Record and display the design of every table, including field names, descriptions, types, ranges, key fields, foreign keys and indexes in a data dictionary; • Record and display relationships between tables; • Support at least the fundamental relational functions SELECT, PROJECT and JOIN; MIS 90-728 Lecture Notes