1 / 77

Lecture 14: Database Design

Prof. Ray Larson & Prof. Marc Davis UC Berkeley SIMS Tuesday and Thursday 10:30 am - 12:00 pm Fall 2002 http://www.sims.berkeley.edu/academics/courses/is202/f02/. Lecture 14: Database Design. SIMS 202: Information Organization and Retrieval. Lecture Overview. Review

boone
Télécharger la présentation

Lecture 14: Database Design

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Prof. Ray Larson & Prof. Marc Davis UC Berkeley SIMS Tuesday and Thursday 10:30 am - 12:00 pm Fall 2002 http://www.sims.berkeley.edu/academics/courses/is202/f02/ Lecture 14: Database Design SIMS 202: Information Organization and Retrieval

  2. Lecture Overview • Review • Databases and Database Design • Database Life Cycle • ER Diagrams • Database Design • Normalization • Web-Enabled Databases

  3. Lecture Overview • Review • Databases and Database Design • Database Life Cycle • ER Diagrams • Database Design • Normalization • Web-Enabled Databases

  4. Models (1) Application 1 Application 2 Application 3 Application 4 External Model External Model External Model External Model Application 1 Conceptual requirements Application 2 Conceptual requirements Conceptual Model Logical Model Internal Model Application 3 Conceptual requirements Application 4 Conceptual requirements

  5. Database System Life Cycle Physical Creation 2 Conversion 3 Design 1 Growth, Change, & Maintenance 6 Integration 4 Operations 5

  6. Another View of the Life Cycle Integration 4 Operations 5 Design 1 Physical Creation 2 Conversion 3 Growth, Change 6

  7. Database Design Process Application 1 Application 2 Application 3 Application 4 External Model External Model External Model External Model Application 1 Conceptual requirements Application 2 Conceptual requirements Conceptual Model Logical Model Internal Model Application 3 Conceptual requirements Application 4 Conceptual requirements

  8. Entity • An Entity is an object in the real world (or even imaginary worlds) about which we want or need to maintain information • Persons (e.g.: customers in a business, employees, authors) • Things (e.g.: purchase orders, meetings, parts, companies) Employee

  9. Attributes Birthdate First Age Employee Middle Name SSN Last Projects • Attributes are the significant properties or characteristics of an entity that help identify it and provide the information needed to interact with it or use it (This is the Metadata for the entities)

  10. Relationships • Relationships are the associations between entities • They can involve one or more entities and belong to particular relationship types • One to One • One to Many • Many to Many

  11. Relationships Project Attends Student Class Supplies project parts Supplier Part

  12. Types of Relationships Assigned Assigned Assigned Employee Employee Employee Project Truck Project • Concerned only with cardinality of relationship 1 1 n 1 m n Chen ER notation

  13. More Complex Relationships Manager Assigned Employee Project Manages Employee Evaluation Employee Project 1/1/1 1/n/n n/n/1 SSN Date Project 1 4(2-10) Manages 1 Is Managed By n

  14. Weak Entities Part# Invoice # Invoice# Quantity Contains Order Order-line Rep# • Owe existence entirely to another entity

  15. Supertype and Subtype Entities Manages Employee Is one of Sales-rep Clerk Sold Other Invoice

  16. Many to Many Relationships SSN Proj# Proj# Hours Is Assigned Project Assignment Project Assigned SSN Employee

  17. Lecture Overview • Review • Databases and Database Design • Database Life Cycle • ER Diagrams • Database Design • Normalization • Web-Enabled Databases

  18. Database Design Process Application 1 Application 2 Application 3 Application 4 External Model External Model External Model External Model Application 1 Conceptual requirements Application 2 Conceptual requirements Conceptual Model Logical Model Internal Model Application 3 Conceptual requirements Application 4 Conceptual requirements

  19. Database Design Process Application 1 Application 2 Application 3 Application 4 External Model External Model External Model External Model Application 1 Conceptual requirements Application 2 Conceptual requirements Conceptual Model Logical Model Internal Model Application 3 Conceptual requirements Application 4 Conceptual requirements

  20. Requirements Analysis • Conceptual Requirements • Systems Analysis Process • Examine all of the information sources used in existing applications • Identify the characteristics of each data element • Numeric • Text • Date/time • Etc. • Examine the tasks carried out using the information • Examine results or reports created using the information

  21. Database Design Process Application 1 Application 2 Application 3 Application 4 External Model External Model External Model External Model Application 1 Conceptual requirements Application 2 Conceptual requirements Conceptual Model Logical Model Internal Model Application 3 Conceptual requirements Application 4 Conceptual requirements

  22. Conceptual Design • Conceptual Model • Merge the collective needs of all applications • Determine what Entities are being used • Some object about which information is to maintained • What are the Attributes of those entities? • Properties or characteristics of the entity • What attributes uniquely identify the entity • What are the Relationshipsbetween entities • How the entities interact with each other?

  23. Developing a Conceptual Model • Overall view of the database that integrates all the needed information discovered during the requirements analysis • Elements of the Conceptual Model are represented by diagrams, Entity-Relationship or ER Diagrams, that show the meanings and relationships of those elements independent of any particular database systems or implementation details • Can also be represented using other modeling tools (such as UML)

  24. Database Design Process Application 1 Application 2 Application 3 Application 4 External Model External Model External Model External Model Application 1 Conceptual requirements Application 2 Conceptual requirements Conceptual Model Logical Model Internal Model Application 3 Conceptual requirements Application 4 Conceptual requirements

  25. Logical Design • Logical Model • How is each entity and relationship represented in the Data Model of the DBMS • Hierarchic? • Network? • Relational? • Object-Oriented?

  26. Database Design Process Application 1 Application 2 Application 3 Application 4 External Model External Model External Model External Model Application 1 Conceptual requirements Application 2 Conceptual requirements Conceptual Model Logical Model Internal Model Application 3 Conceptual requirements Application 4 Conceptual requirements

  27. Physical Design • Internal Model • Choices of index file structure • Choices of data storage formats • Choices of disk layout

  28. Database Design Process Application 1 Application 2 Application 3 Application 4 External Model External Model External Model External Model Application 1 Conceptual requirements Application 2 Conceptual requirements Conceptual Model Logical Model Internal Model Application 3 Conceptual requirements Application 4 Conceptual requirements

  29. Database Application Design • External Model • User views of the integrated database • Making the old (or updated) applications work with the new database design

  30. Lecture Overview • Review • Databases and Database Design • Database Life Cycle • ER Diagrams • Database Design • Normalization • Web-Enabled Databases

  31. Normalization • Normalization theory is based on the observation that relations with certain properties are more effective in inserting, updating and deleting data than other sets of relations containing the same data • Normalization is a multi-step process beginning with an “unnormalized” relation • Hospital example from Atre, S. Data Base: Structured Techniques for Design, Performance, and Management

  32. Normal Forms • First Normal Form (1NF) • Second Normal Form (2NF) • Third Normal Form (3NF) • Boyce-Codd Normal Form (BCNF) • Fourth Normal Form (4NF) • Fifth Normal Form (5NF)

  33. Normalization Unnormalized Relations First normal form Functional dependencyof nonkey attributes on the primary key - Atomic values only Second normal form No transitive dependency between nonkey attributes Third normal form Boyce- Codd and Higher Full Functional dependencyof nonkey attributes on the primary key All determinants are candidate keys - Single multivalued dependency

  34. Unnormalized Relations • First step in normalization is to convert the data into a two-dimensional table • In unnormalized relations data can repeat within a column

  35. Unnormalized Relations

  36. First Normal Form • To move to First Normal Form a relation must contain only atomic values at each row and column • No repeating groups • A column or set of columns is called a Candidate Key when its values can uniquely identify the row in the relation

  37. First Normal Form

  38. 1NF Storage Anomalies • Insertion: A new patient has not yet undergone surgery -- hence no surgeon # -- Since surgeon # is part of the key we can’t insert • Insertion: If a surgeon is newly hired and hasn’t operated yet -- there will be no way to include that person in the database • Update: If a patient comes in for a new procedure, and has moved, we need to change multiple address entries • Deletion (type 1): Deleting a patient record may also delete all info about a surgeon • Deletion (type 2): When there are functional dependencies (like side effects and drug) changing one item eliminates other information

  39. Second Normal Form • A relation is said to be in Second Normal Form when every nonkey attribute is fully functionally dependent on the primary key • That is, every nonkey attribute needs the full primary key for unique identification

  40. Second Normal Form

  41. Second Normal Form

  42. Second Normal Form

  43. 1NF Storage Anomalies Removed • Insertion: Can now enter new patients without surgery • Insertion: Can now enter Surgeons who haven’t operated • Deletion (type 1): If Charles Brown dies the corresponding tuples from Patient and Surgery tables can be deleted without losing information on David Rosen • Update: If John White comes in for third time, and has moved, we only need to change the Patient table

  44. 2NF Storage Anomalies • Insertion: Cannot enter the fact that a particular drug has a particular side effect unless it is given to a patient • Deletion: If John White receives some other drug because of the penicillin rash, and a new drug and side effect are entered, we lose the information that penicillin can cause a rash • Update: If drug side effects change (a new formula) we have to update multiple occurrences of side effects

  45. Third Normal Form • A relation is said to be in Third Normal Form if there is no transitive functional dependency between nonkey attributes • When one nonkey attribute can be determined with one or more nonkey attributes there is said to be a transitive functional dependency • The side effect column in the Surgery table is determined by the drug administered • Side effect is transitively functionally dependent on drug so Surgery is not 3NF

  46. Third Normal Form

  47. Third Normal Form

  48. 2NF Storage Anomalies Removed • Insertion: We can now enter the fact that a particular drug has a particular side effect in the Drug relation • Deletion: If John White recieves some other drug as a result of the rash from penicillin, but the information on penicillin and rash is maintained • Update: The side effects for each drug appear only once

  49. Boyce-Codd Normal Form • Most 3NF relations are also BCNF relations • A 3NF relation is NOT in BCNF if: • Candidate keys in the relation are composite keys (they are not single attributes) • There is more than one candidate key in the relation, and • The keys are not disjoint, that is, some attributes in the keys are common

  50. Most 3NF Relations Are Also BCNF – Is This One?

More Related