1 / 26

Data Normalization

Data Normalization. N ormal is not something to aspire to, it's something to get away from. ~ Jodie Foster ~. Activity: Sample User Report. Can you build the underlying data model from this ? How many tables? What are the relationships?. Understanding Functional Dependence.

clodia
Télécharger la présentation

Data Normalization

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Data Normalization Normal is not something to aspire to, it's something to get away from. ~ Jodie Foster ~

  2. Activity: Sample User Report Can you build the underlying data model from this? How many tables? What are the relationships?

  3. Understanding Functional Dependence • For attributes A and B, B is functionally dependent on A means each value in column A determines one and only one value in column B. • Written: A B • A determines B • B is the determinant • Ex:SSN Name(Name is functionallydependent on SSN)

  4. Normalization Lingo • Prime attribute = Any attribute which is a primary key, or in the case of a composite key is part of a PK • Non-Prime Attribute = Any attribute which is not part of the PK. • Key Attribute = Prime Attribute • Non-Key Attribute = Non-Prime Attribute

  5. Normalization and FD • Technically, normalization is just the analysis of Functional Dependencies of all columns with respect to the primary key. • There are three “levels” of analysis: • Functional Dependence – any non-prime attributes which as FD on the PK. • Partial Functional Dependence – any non-key attributes which are FD on part of the PK. • Transitive Functional Dependence – any non-key attributes which are FD on some other non-key attribute(s).

  6. Activity: IYCDTYCN! Identify the: • Primary Key? • Prime Attributes? • Non-Prime Attributes Identify the: • Functional Dependencies (WRT the PK) • Partial Functional Dependencies (WRT part of the PK) • Transitive Functional Dependencies (WRT some non-prime attribute)

  7. The Dependency Diagram • The Dependency Diagram is a Very Useful Tool. It depicts the dependencies which exist among the attributes.

  8. Normal Forms • A Normal Form represents the current “state” of the data model. • There are 4 basic normal forms: • Zero Normal Form (0NF) • Non-key attributes exist which are not FD on PK. • First Normal Form (1NF) • All non-key attributes FD on entire PK. • Second Normal Form (2NF) • In 1NF and • No partial functional dependencies exist. • Third Normal Form (3NF) • In 2NF and • No transitive functional dependencies exist.

  9. First Normal Form (1NF) • Definition: • All non-key attributes must be FD on the entire PK. (There must be PKFD for all attributes.) • Rule: • Move each non-key FD column into its own new table. • How to Apply the Rule: For each non-key FD column: • Place non-FD column into a new table • Copy the PK (or part of it) from the original table into the new table. This will be a FK in the new table. • Assign a PK to the new table (typically a composite key of the original Non-FD column and the FK.)

  10. 1NF: Example 1/2 • What’s wrong with this data model? • What should be PK be? Why? • Is there an attribute not FD on the PK? • Is it in 1NF already? • What if Erin takes up bass fishing? • I’m planning a ski trip, whom should I contact? • (How do I know Hobby3, skiing and not Hobby1)?

  11. 1NF: Example 2/2 • What was done: • Hobbies table created. Contains the originally non FD column, “hobby” • The PK (FID) was copied into the hobbies table. • The PK of the Hobbies table is the combination of FID and Hobby. • Questions: • Is this in 1NF? • Can you reproduce the previous data model from this one? • Who likes skiing? Basketball?

  12. Second Normal Form (2NF) • Definition: • The data model must be in 1NF AND • No partial functional dependencies can exist. • Rule: • Move each partially FD non-key column into its own new table. • How to Apply the Rule: For each partial dependency: • Move all partially FD columns into a new table • Copy the determinant into the new table. • Make the determinant of the partial dependency: • The PK for the new table, FK to the existing table.

  13. 2NF: Example 1/2 • What’s wrong with this data model? • What should be PK be? Why? • Do any partial dependencies exist? • Where? • What is the determinant for each, if any? • Is it in 1NF already? 2NF? • I made a mistake, 81HLV3 is a Power edge 5500, not a 4400?

  14. 2NF: Example 2/2 • What was done: • Serial Num + SWID is the primary key. • Servers, Software tables created from partial dependencies, where Serial Num,SWID are the determinants. • Serial Num, is the PK for Servers, SWID is the PK for Software, each are also FK’s for the SWInstallation table • Questions: • Is this in 2NF? • Can you reproduce the previous data model from this one?

  15. Third Normal Form (3NF) • Definition: • The data model must be in 2NF AND • No transitive functional dependencies can exist. • Rule: • Move each transitive FD non-key column into its own new table. • How to Apply the Rule: For each transitive dependency: • Move all transitive FD columns into a new table. • Copy the determinant column into the new table. • Make the determinant of the transitive dependency: • The the PK for the new table. • The FK for the original table.

  16. 3NF: Example 1/2 • What’s wrong with this data model? • What should be PK be? Why? • Do any transitive dependencies exist? • Where? • What is the determinant for each, if any? • Is it in 1NF already? 2NF? 3NF? • I made a mistake, Koors phone number is 4905? • What’s wrong?

  17. 3NF: Example 2/2 • What was done: • Beer ID is the PK. • All transitive dependencies moved into a new table, Distributors. • Distrib ID is the determinant. PK of Distributors table, FK in original Beer table. • Questions: • Is this in 3NF? • Can you reproduce the previous data model from this one?

  18. Higher Normal Forms Yes, there IS more… … and it will blow your mind. 

  19. Boyce-Codd Normal Form (BCNF) • Rule: Eliminate key-transitive dependencies • A table in BCNF Means: • The table is in 3NF • It includes no Non-Key attribute which determines a key attribute, or part of a key attribute.

  20. BCNF: An Example

  21. Fourth Normal Form (4NF) • RULE: Eliminate multiple sets of multi-valued dependencies. • A table in 4NF Means: • The table is in 3NF • It includes no sets of attributes which contain multi-valued dependencies.

  22. 4NF: An Example Figure 4.15 Set of Tables in 4NF Figure 4.14 Multivalued Dependencies

  23. How “far” should one Normalize? • For relational databases: • 1NF is required, at minimum for practical RDBMS implementations. • The majority of the time data models are normalized to 3NF. • Sometimes certain tables are left in 1NF or 2NF, for performance or practical reasons. • Higher normal forms BCNF, 4NF are rare. • In General, the Higher the NF of your DM: • The more complicated the internal DM • The more “programming” required to reproduce the external DM. • But, the lesser the chance for data anomalies!! • It’s a total trade-off: • Database complexity vs. data anomalies.

  24. Mike’s “Road To 3NF” To normalize correctly, follow this process for each table in the data model: Designate acandidate key Any partialdependencies? Party Hard ! n 2NF PKFD for allattributes? Any transitivedependencies? y n 1NF 3NF y Apply2NF Rule n y Apply1NF Rule Apply3NF Rule

  25. Normalization Summary Cheat Sheet • 0NF 1NF (Resolve non FD) • 1NF 2NF (Resolve Partial FD) • 2NF 3NF (Resolve Transitive FD) O O N O N1 O N2 O O N

  26. Data Normalization Questions?

More Related