1 / 65

Normalization & The Relational Model

Prof. Ray Larson UC Berkeley SIMS. Normalization & The Relational Model. SIMS 202: Information Organization and Retrieval. Lecture Overview. Review Databases and Database Design Database Life Cycle ER Diagrams Database Design Relational Operations Normalization Discussion Questions.

cree
Télécharger la présentation

Normalization & The Relational Model

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Prof. Ray Larson UC Berkeley SIMS Normalization & The Relational Model SIMS 202: Information Organization and Retrieval

  2. Lecture Overview • Review • Databases and Database Design • Database Life Cycle • ER Diagrams • Database Design • Relational Operations • Normalization • Discussion Questions

  3. Lecture Overview • Review • Databases and Database Design • Database Life Cycle • ER Diagrams • Database Design • Relational Operations • Normalization • Discussion Questions

  4. Models (1) Application 1 Application 2 Application 3 Application 4 External Model External Model External Model External Model Application 1 Conceptual requirements Application 2 Conceptual requirements Conceptual Model Logical Model Internal Model Application 3 Conceptual requirements Application 4 Conceptual requirements

  5. Database System Life Cycle Physical Creation 2 Conversion 3 Design 1 Growth, Change, & Maintenance 6 Integration 4 Operations 5

  6. Another View of the Life Cycle Integration 4 Operations 5 Design 1 Physical Creation 2 Conversion 3 Growth, Change 6

  7. Database Design Process Application 1 Application 2 Application 3 Application 4 External Model External Model External Model External Model Application 1 Conceptual requirements Application 2 Conceptual requirements Conceptual Model Logical Model Internal Model Application 3 Conceptual requirements Application 4 Conceptual requirements

  8. Entity • An Entity is an object in the real world (or even imaginary worlds) about which we want or need to maintain information • Persons (e.g.: customers in a business, employees, authors) • Things (e.g.: purchase orders, meetings, parts, companies) Employee

  9. Attributes Birthdate First Age Employee Middle Name SSN Last Projects • Attributes are the significant properties or characteristics of an entity that help identify it and provide the information needed to interact with it or use it (This is the Metadata for the entities)

  10. Relationships • Relationships are the associations between entities • They can involve one or more entities and belong to particular relationship types • One to One • One to Many • Many to Many

  11. Relationships Project Attends Student Class Supplies project parts Supplier Part

  12. Types of Relationships Assigned Assigned Assigned Employee Employee Employee Project Truck Project • Concerned only with cardinality of relationship 1 1 n 1 m n Chen ER notation

  13. Database Design Process Application 1 Application 2 Application 3 Application 4 External Model External Model External Model External Model Application 1 Conceptual requirements Application 2 Conceptual requirements Conceptual Model Logical Model Internal Model Application 3 Conceptual requirements Application 4 Conceptual requirements

  14. Database Design Process Application 1 Application 2 Application 3 Application 4 External Model External Model External Model External Model Application 1 Conceptual requirements Application 2 Conceptual requirements Conceptual Model Logical Model Internal Model Application 3 Conceptual requirements Application 4 Conceptual requirements

  15. Requirements Analysis • Conceptual Requirements • Systems Analysis Process • Examine all of the information sources used in existing applications • Identify the characteristics of each data element • Numeric • Text • Date/time • Etc. • Examine the tasks carried out using the information • Examine results or reports created using the information

  16. Database Design Process Application 1 Application 2 Application 3 Application 4 External Model External Model External Model External Model Application 1 Conceptual requirements Application 2 Conceptual requirements Conceptual Model Logical Model Internal Model Application 3 Conceptual requirements Application 4 Conceptual requirements

  17. Conceptual Design • Conceptual Model • Merge the collective needs of all applications • Determine what Entities are being used • Some object about which information is to maintained • What are the Attributes of those entities? • Properties or characteristics of the entity • What attributes uniquely identify the entity • What are the Relationshipsbetween entities • How the entities interact with each other?

  18. Developing a Conceptual Model • Overall view of the database that integrates all the needed information discovered during the requirements analysis • Elements of the Conceptual Model are represented by diagrams, Entity-Relationship or ER Diagrams, that show the meanings and relationships of those elements independent of any particular database systems or implementation details • Can also be represented using other modeling tools (such as UML)

  19. Database Design Process Application 1 Application 2 Application 3 Application 4 External Model External Model External Model External Model Application 1 Conceptual requirements Application 2 Conceptual requirements Conceptual Model Logical Model Internal Model Application 3 Conceptual requirements Application 4 Conceptual requirements

  20. Logical Design • Logical Model • How is each entity and relationship represented in the Data Model of the DBMS • Hierarchic? • Network? • Relational? • Object-Oriented?

  21. Database Design Process Application 1 Application 2 Application 3 Application 4 External Model External Model External Model External Model Application 1 Conceptual requirements Application 2 Conceptual requirements Conceptual Model Logical Model Internal Model Application 3 Conceptual requirements Application 4 Conceptual requirements

  22. Physical Design • Internal Model • Choices of index file structure • Choices of data storage formats • Choices of disk layout

  23. Database Design Process Application 1 Application 2 Application 3 Application 4 External Model External Model External Model External Model Application 1 Conceptual requirements Application 2 Conceptual requirements Conceptual Model Logical Model Internal Model Application 3 Conceptual requirements Application 4 Conceptual requirements

  24. Database Application Design • External Model • User views of the integrated database • Making the old (or updated) applications work with the new database design

  25. Lecture Overview • Review • Databases and Database Design • Database Life Cycle • ER Diagrams • Database Design • Relational Operations • Normalization • Discussion Questions

  26. Relational Algebra Operations • Restrict • Project • Product • Union • Intersect • Difference • Join • Divide

  27. Restrict • Extracts specified tuples (rows) from a specified relation (table) • Restrict is AKA “Select”

  28. Project • Extracts specified attributes(columns) from a specified relation.

  29. Join (Natural or Inner) Join A1 B1 A2 B1 A3 B2 B1 C1 B2 C2 B3 C3 A1 B1 C1 A2 B1 C1 A3 B2 C2 • Builds a relation from two specified relations consisting of all possible concatenated pairs, one from each of the two relations, such that in each pair the two tuples satisfy some condition. (E.g., equal values in a given col.)

  30. ER Diagram: Acme Widget Co. Emp# Employee Wage ISA Hourly Sales-Rep Sales Part# Writes Cust# Invoice# Quantity Contains Orders Contains Customer Invoice Line-Item Part Invoice# Rep# Part# Count Cust# Price

  31. Join Items for Relational DB Line_item Parts Customer Invoice

  32. Relational Operations • What is the name of the customer who ordered Large Red Widgets? • Restrict “large red widget” row from Part as temp1 • Join temp1 with Line-item on Part # as temp2 • Join temp2 with Invoice on Invoice # as temp3 • Join temp3 with Customer on cust # as temp4 • Project Company from temp4 as answer

  33. SQL • Database Definition and Querying • Can be used as an interactive query language • Can be imbedded in programs • Relational Calculus combines Restrict, Project and Join operations in a single command: SELECT

  34. SELECT • Syntax: SELECT [DISTINCT] attr1, attr2,…, attr3 FROM rel1 r1, rel2 r2,… rel3 r3 WHERE condition1 {AND | OR} condition2 ORDER BY attr1 [DESC], attr3 [DESC]

  35. SQL SELECT SELECT c.COMPANY FROM Customer c, Parts p, Invoice i, Line_Items z WHERE c.Cust# = i.Cust# AND i.Invoice# = z.Invoice# AND z.Part# = p.Part# AND p.Name = “large red widget”;

  36. Lecture Overview • Review • Databases and Database Design • Database Life Cycle • ER Diagrams • Database Design • Relational Operations • Normalization • Discussion Questions

  37. Normalization • Normalization theory is based on the observation that relations with certain properties are more effective in inserting, updating and deleting data than other sets of relations containing the same data • Normalization is a multi-step process beginning with an “unnormalized” relation

  38. Normal Forms • First Normal Form (1NF) • Second Normal Form (2NF) • Third Normal Form (3NF) • Boyce-Codd Normal Form (BCNF) • Fourth Normal Form (4NF) • Fifth Normal Form (5NF)

  39. Normalization Unnormalized Relations First normal form Functional dependencyof nonkey attributes on the primary key - Atomic values only Second normal form No transitive dependency between nonkey attributes Third normal form Boyce- Codd and Higher Full Functional dependencyof nonkey attributes on the primary key All determinants are candidate keys - Single multivalued dependency

  40. Unnormalized Relations • First step in normalization is to convert the data into a two-dimensional table • In unnormalized relations data can repeat within a column • (The following is a highly contrived example that has only a very vague resemblance to the implementation of the Phone/Photo project database from IS202 in 2004 …)

  41. Unnormalized Relations

  42. First Normal Form • To move to First Normal Form a relation must contain only atomic values at each row and column • No repeating groups • A column or set of columns is called a Candidate Key when its values can uniquely identify the row in the relation

  43. First Normal Form

  44. 1NF Storage Anomalies • Insertion: A new person has not yet taken a picture -- hence no Picture # -- Since Picture # is part of the key we can’t insert • Insertion: If a Person is are known and likely to be photographed, but haven’t been yet -- there is be no way to include that person in the database • Update: If a Person changes status (e.g. Mary Jones becomes a Student) we have to change multiple rows in the database • Deletion (type 1): Deleting a Person record may also delete all info about People in the pictures • Deletion (type 2): When there are functional dependencies (like Object and Object_features) changing one item eliminates other information

  45. Second Normal Form • A relation is said to be in Second Normal Form when every nonkey attribute is fully functionally dependent on the primary key • That is, every nonkey attribute needs the full primary key for unique identification

  46. Second Normal Form Person Table

  47. Second Normal Form People Table

  48. Second Normal Form Picture Table

  49. 1NF Storage Anomalies Removed • Insertion: Can now enter new Persons who haven’t yet taken pictures • Insertion: Can now enter People who haven’t been photographed • Deletion (type 1): If Charles Brown withdraws his photos the corresponding tuples from Person and Picture tables can be deleted without losing information on David Rosen • Update: If John White takes a third picture, and has changed status (e.g., graduate), we only need to change the Person table in one place

  50. 2NF Storage Anomalies • Insertion: Cannot enter the fact that a particular object has a particular feature unless it is associated with a particular picture • Deletion: If John White describes some other object that Beth Little has while shopping, we lose the fact that the bookbag is blue • Update: If the features of an object change change we have to update multiple occurrences of object features

More Related