1 / 27

Normalization

Normalization. Also called “loss-less decomposition” Process of optimizing table structures to eliminate redundancy and avoid anomalies and problems with extensibility. Supports the golden rule: Each fact should be stored in the database only once.

dorjan
Télécharger la présentation

Normalization

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Normalization • Also called “loss-less decomposition” • Process of optimizing table structures to eliminate redundancy and avoid anomalies and problems with extensibility. • Supports the golden rule: Each fact should be stored in the database only once. • Does not provide the solution to all design problems but provides a solid foundation.

  2. Normal Forms • 1st Normal Form • 2nd Normal Form • 3rd Normal Form • BCNF • 4th Normal Form • 5th Normal Form • Domain-Key Normal Form

  3. 1st Normal Form First Normal Form is violated if: • The relation has no identifiable primary key. • Any attempt has been made to store a multi-valued fact in a tuple.

  4. 1st NF - Example Evaluate the design solutions on the next four slides for: • Query-ability • Join-ability • Constrain-ability • Extensibility (of Language Domain) • Extensibility (of Schema)

  5. 1NF Example – Schema 1 (correct) Programs Table Employees Table EMPID LANGUAGE EMPID LNAME FNAME SEX DEPT PHONE SALARY 23 COBOL 23 Jones Mark M ITR 555-1087 45000 23 JAVA 25 Smith Sara F FINC 555-2222 55000 23 SQL 26 Billings David M ACTG 555-4356 42000 31 SQL 31 Dance Ivanna F ACTG 444-4887 60000 32 JAVA 32 Jones Mary F ITR 555-8745 70000 32 SQL 35 Barker Bob M ACTG 555-6565 44000 32 VB 36 Woods Robin M ITR 555-9812 90000 32 COBOL 37 Jones Mary F FINC 555-1234 56000 36 VB 36 SQL 36 JAVA Languages Table 37 COBOL 37 SQL NAME FULLNAME COBOL COmmon Business Oriented Language JAVA JAVA SQL Structured Query Language VB Visual Basic

  6. 1NF Example – Schema 2 (incorrect) Employees Table EMPID LNAME FNAME SEX DEPT PHONE SALARY LANGUAGES 23 Jones Mark M ITR 555-1087 45000 COBOL, JAVA, SQL 25 Smith Sara F FINC 555-2222 55000 26 Billings David M ACTG 555-4356 42000 31 Dance Ivanna F ACTG 444-4887 60000 SQL 32 Jones Mary F ITR 555-8745 70000 JAVA, SQL, VB, COBOL 35 Barker Bob M ACTG 555-6565 44000 36 Woods Robin M ITR 555-9812 90000 VB, SQL, JAVA 37 Jones Mary F FINC 555-1234 56000 COBOL, SQL Languages Table NAME FULLNAME COBOL COmmon Business Oriented Language JAVA JAVA SQL Structured Query Language VB Visual Basic

  7. 1NF Example – Schema 3 (incorrect) Employees Table EMPID LNAME FNAME SEX DEPT PHONE SALARY LANG1 LANG2 LANG3 LANG4 23 Jones Mark M ITR 555-1087 45000 COBOL JAVA SQL 25 Smith Sara F FINC 555-2222 55000 26 Billings David M ACTG 555-4356 42000 31 Dance Ivanna F ACTG 444-4887 60000 SQL 32 Jones Mary F ITR 555-8745 70000 JAVA SQL VB COBOL 35 Barker Bob M ACTG 555-6565 44000 36 Woods Robin M ITR 555-9812 90000 VB SQL JAVA 37 Jones Mary F FINC 555-1234 56000 COBOL SQL Languages Table NAME FULLNAME COBOL COmmon Business Oriented Language JAVA JAVA SQL Structured Query Language VB Visual Basic

  8. 1NF Example – Schema 4 (incorrect) Employees Table EMPID LNAME FNAME SEX DEPT PHONE SALARY COBOL JAVA SQL VB 23 Jones Mark M ITR 555-1087 45000 T T T F 25 Smith Sara F FINC 555-2222 55000 F F F F 26 Billings David M ACTG 555-4356 42000 F F F F 31 Dance Ivanna F ACTG 444-4887 60000 F F T F 32 Jones Mary F ITR 555-8745 70000 T T T T 35 Barker Bob M ACTG 555-6565 44000 F F F F 36 Woods Robin M ITR 555-9812 90000 F T T T 37 Jones Mary F FINC 555-1234 56000 T F T F Languages Table NAME FULLNAME COBOL COmmon Business Oriented Language JAVA JAVA SQL Structured Query Language VB Visual Basic

  9. 2nd Normal Form Second Normal Form is violated if: • First Normal Form is violated • If there exists a non-key field(s) which is functionally dependent on a partial key. partial key non-key

  10. 2NF Example – Raw Data JE #1 02-JAN-2003 100 Cash 310 Smith-Capital (owner investment) 20,000 20,000 JE #2 03-JAN-2003 100 Cash 220 Notes Payable (borrowed money) 30,000 30,000 JE #3 03-JAN-2003 120 Supplies 100 Cash 220 Notes Payable (purchased supplies) 5,000 1,000 4,000

  11. 2NF Example – Violation Transactions Table JENO LINENO DATE DESCRIPTION ACCTNO ACCTNAME AMOUNT 1 1 02-JAN-2003 Owner investment 100 Cash 20,000 1 2 02-JAN-2003 Owner investment 310 Smith-Capital (20,000) 2 1 03-JAN-2003 Borrowed money 100 Cash 30,000 2 2 03-JAN-2003 Borrowed money 220 Notes Payable (30,000) 3 1 03-JAN-2003 Purchased Supplies 120 Supplies 5,000 3 2 03-JAN-2003 Purchased Supplies 100 Cash (1,000) 3 3 03-JAN-2003 Purchased Supplies 220 Notes Payable (4,000) Is there a non-key field which is functional dependent on a partial key?

  12. 2NF Example – ViolationFDs that indicate violation of 2NF JENO LINENO DATE DESCRIPTION ACCTNO ACCTNAME AMOUNT 1 1 02-JAN-2003 Owner investment 100 Cash 20,000 1 2 02-JAN-2003 Owner investment 310 Smith-Capital (20,000) 2 1 03-JAN-2003 Borrowed money 100 Cash 30,000 2 2 03-JAN-2003 Borrowed money 220 Notes Payable (30,000) 3 1 03-JAN-2003 Purchased Supplies 120 Supplies 5,000 3 2 03-JAN-2003 Purchased Supplies 100 Cash (1,000) 3 3 03-JAN-2003 Purchased Supplies 220 Notes Payable (4,000)

  13. 2NF Example – Corrected Journal_Entry Table JENO DATE DESCRIPTION 1 02-JAN-2003 Owner investment 2 03-JAN-2003 Borrowed money 3 03-JAN-2003 Purchased Supplies Transactions Table JENO LINENO ACCTNO ACCTNAME AMOUNT 1 1 100 Cash 20,000 1 2 310 Smith-Capital (20,000) 2 1 100 Cash 30,000 2 2 220 Notes Payable (30,000) 3 1 120 Supplies 5,000 3 2 100 Cash (1,000) 3 3 220 Notes Payable (4,000)

  14. 3rd Normal Form Third Normal Form is violated if: • Second Normal Form is violated • If there exists a non-key field(s) which is functionally dependent on another non-key field(s). non-key non-key Note: A candidate key is not a non-key field.

  15. 3NF Example – Violation Journal_Entry Table Are there any non-key fields which functional determine another non-key field? JENO DATE DESCRIPTION 1 02-JAN-2003 Owner investment 2 03-JAN-2003 Borrowed money 3 03-JAN-2003 Purchased Supplies Transactions Table JENO LINENO ACCTNO ACCTNAME AMOUNT 1 1 100 Cash 20,000 Are there any redundant facts? 1 2 310 Smith-Capital (20,000) 2 1 100 Cash 30,000 2 2 220 Notes Payable (30,000) 3 1 120 Supplies 5,000 3 2 100 Cash (1,000) 3 3 220 Notes Payable (4,000)

  16. 3NF Example – ViolationFD that indicates violation of 3NF Journal_Entry Table • Anomalies if not corrected: • update (if name of account 100 changes it must be changed in multiple places risking inconsistancy) • deletion (can't delete JE#3 and its transactions without losing information about account 120) • insertion (can't set up a new account, Jones-capital, for a new partner unless we first have a transaction involving that account. JENO DATE DESCRIPTION 1 02-JAN-2003 Owner investment 2 03-JAN-2003 Borrowed money 3 03-JAN-2003 Purchased Supplies JENO LINENO ACCTNO ACCTNAME AMOUNT 1 1 100 Cash 20,000 1 2 310 Smith-Capital (20,000) 2 1 100 Cash 30,000 2 2 220 Notes Payable (30,000) 3 1 120 Supplies 5,000 3 2 100 Cash (1,000) 3 3 220 Notes Payable (4,000)

  17. 3NF Example – Corrected Journal_Entry Table Accounts Table JENO DATE DESCRIPTION ACCTNO ACCTNAME 1 02-JAN-2003 Owner investment 100 Cash 2 03-JAN-2003 Borrowed money 120 Supplies 3 03-JAN-2003 Purchased Supplies 220 Notes Payable 310 Smith-Capital Transactions Table JENO LINENO ACCTNO AMOUNT 1 1 100 20,000 1 2 310 (20,000) 2 1 100 30,000 2 2 220 (30,000) 3 1 120 5,000 3 2 100 (1,000) 3 3 220 (4,000)

  18. 3NF Example – CorrectedFinal Dependencies JENO DATE DESCRIPTION ACCTNO ACCTNAME 1 02-JAN-2003 Owner investment 100 Cash 2 03-JAN-2003 Borrowed money 120 Supplies 3 03-JAN-2003 Purchased Supplies 220 Notes Payable 310 Smith-Capital JENO LINENO ACCTNO AMOUNT 1 1 100 20,000 All non-key fields are FD on the PK and only the PK. 1 2 310 (20,000) 2 1 100 30,000 2 2 220 (30,000) 3 1 120 5,000 3 2 100 (1,000) 3 3 220 (4,000)

  19. BCNF Normal Form Boyce-Codd Normal Form is violated if: • Third Normal Form is violated • If there exists a partial key which is functionally dependent on a non-key field(s). non-key partial-key

  20. BCNF ExampleSemantics • A student can have more than one major • A student has a different advisor for each major. • Each advisor advises for only one major.

  21. BCNF Example – Violation Student_Majors Table SID MAJOR ADVISOR 1 PHYSICS EINSTEIN 1 BIOLOGY LIVINGSTON 2 PHYSICS BOHR 2 COMPUTER SCIENCE CODD 3 PHYSICS EINSTEIN 4 BIOLOGY LIVINGSTON 4 ACCOUNTING PACIOLI 5 PHYSICS EINSTEIN 6 PHYSICS BOHR 6 BIOLOGY DARWIN 7 COMPUTER SCIENCE CODD 7 BIOLOGY DARWIN Does this relation violate third normal form? Are there any redundant facts?

  22. BCNF Example – ViolationFD that violates BCNF SID MAJOR ADVISOR It is important that you convince yourself that major does not FD advisor. 1 PHYSICS EINSTEIN 1 BIOLOGY LIVINGSTON 2 PHYSICS BOHR 2 COMPUTER SCIENCE CODD 3 PHYSICS EINSTEIN 4 BIOLOGY LIVINGSTON 4 ACCOUNTING PACIOLI 5 PHYSICS EINSTEIN 6 PHYSICS BOHR 6 BIOLOGY DARWIN 7 COMPUTER SCIENCE CODD 7 BIOLOGY DARWIN

  23. BCNF Example – Corrected Advisors Table ADVISOR MAJOR BOHR PHYSICS Student_Advisors Table CODD COMPUTER SCIENCE DARWIN BIOLOGY SID ADVISOR EINSTEIN PHYSICS 1 EINSTEIN LIVINGSTON BIOLOGY 1 LIVINGSTON PACIOLI ACCOUNTING 2 BOHR 2 CODD 3 EINSTEIN Note that the if the original key, counter-intuitively, in schema 1 had been defined as SID & ADVISOR this would have been a 2NF violation. 4 LIVINGSTON 4 PACIOLI 5 EINSTEIN 6 BOHR 6 DARWIN 7 CODD 7 DARWIN

  24. 4th Normal Form 4th Normal Form is violated if: • Boyce Codd Normal Form is violated • If there exists a partial key which has multiple independent multi-valued functional dependencies to other partial keys. partial-key1 partial-key2 partial-key3

  25. 4NF Example – Violation Instruments_Languages Name Instrument Language Fred Piano French Fred Flute Italian Fred Flute Spanish Jane Piano French Jane Oboe French Sam Piano French Sam Oboe Spanish Sam Flute Spanish

  26. 4NF Example – Violation Name Instrument Language Fred Piano French Fred Flute Italian Fred Flute Spanish Jane Piano French Jane Oboe French Sam Piano French Sam Oboe Spanish Sam Flute Spanish Does this relation violate 1st, 2nd, 3rd, or BCNF? Are there any redundant facts?

  27. 4NF Example – Correction LanguagesSpoken InstrumentsPlayed Name Language Name Instrument Fred French Fred Piano Fred Italian Fred Flute Fred Spanish Jane Piano Jane French Jane Oboe Sam French Sam Piano Sam Spanish Sam Oboe Sam Flute

More Related