1 / 139

Lecture 03: Normal Forms, Constraints, Views

Lecture 03: Normal Forms, Constraints, Views. Wednesday, October 12, 2011. Announcements. HW1: was due Monday HW2: due next Monday. Outline and Reading Material. Schema Normalization: 19.1 – 19.7 Read only about BCNF, skip 3NF Note: you need BCNF for HW2

oihane
Télécharger la présentation

Lecture 03: Normal Forms, Constraints, Views

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Lecture 03:Normal Forms, Constraints, Views Wednesday, October 12, 2011 Dan Suciu -- CSEP544 Fall 2011

  2. Announcements • HW1: was due Monday • HW2: due next Monday Dan Suciu -- CSEP544 Fall 2011

  3. Outline and Reading Material • Schema Normalization: 19.1 – 19.7 • Read only about BCNF, skip 3NF • Note: you need BCNF for HW2 • Constraints and triggers: 3.2, 3.3, 5.8 • Views: 3.6 • Answering queries using views: Sec. 1,2,3 Some material is NOT covered in the book.Read the slides carefully: we will not cover all in class! Dan Suciu -- CSEP544 Fall 2011

  4. Schema Normalization Dan Suciu -- CSEP544 Fall 2011

  5. Normal Forms • 1st Normal Form = all tables are flat • 2nd Normal Form = obsolete • Boyce Codd Normal Form = will study • 3rd Normal Form = see book • 4th Normal Form = golden standard Dan Suciu -- CSEP544 Fall 2011

  6. First Normal Form (1NF) • A database schema is in First Normal Form if all tables are flat Student Student Takes Course May needto add keys

  7. name buys Person Product price name ssn Relational Schema Design Conceptual Model: Relational Model: plus FD’s Normalization: Eliminates anomalies Dan Suciu -- CSEP544 Fall 2011

  8. Data Anomalies When a database is poorly designed we get anomalies: Redundancy: data is repeated Updated anomalies: need to change in several places Delete anomalies: may lose data when we don’t want Dan Suciu -- CSEP544 Fall 2011

  9. Relational Schema Design Recall set attributes (persons with several phones): One person may have multiple phones, but lives in only one city Anomalies: • Redundancy = repeat data • Update anomalies = Fred moves to “Bellevue” • Deletion anomalies = Joe deletes his phone number: what is his city ?

  10. Relation Decomposition Break the relation into two: Anomalies have gone: • No more repeated data • Easy to move Fred to “Bellevue” (how ?) • Easy to delete all Joe’s phone number (how ?)

  11. Relational Schema Design(or Logical Design) Main idea: • Start with some relational schema • Find out its functional dependencies • Use them to design a better relational schema Dan Suciu -- CSEP544 Fall 2011

  12. Functional Dependencies What is a functional dependency? • It is a kind of constraint • In theory one should find the FDs during requirement analysis, then apply rigorously the normalization steps that we discuss next • In practice one rarely follows this strict prescription Dan Suciu -- CSEP544 Fall 2011

  13. Functional Dependencies Notation: A1, A2, …, An B1, B2, …, Bm Meaning: If two tuples agree on the attributes A1, A2, …, An then they must also agree on the attributes B1, B2, …, Bm Dan Suciu -- CSEP544 Fall 2011

  14. When Does an FD Hold Definition: A1, ..., Am  B1, ..., Bn holds in R if: t, t’  R, (t.A1=t’.A1  ...  t.Am=t’.Am  t.B1=t’.B1  ...  t.Bn=t’.Bn ) R t if t, t’ agree here then t, t’ agree here t’

  15. Examples An FD holds, or does not hold on an instance: EmpID  Name, Phone, Position Position  Phone but not Phone  Position Dan Suciu -- CSEP544 Fall 2011

  16. Example Position  Phone Dan Suciu -- CSEP544 Fall 2011

  17. Example but not Phone  Position Dan Suciu -- CSEP544 Fall 2011

  18. Example FD’s are constraints: On some instances they hold On others they don’t namecolor categorydepartment color, categoryprice Does this instance satisfy all the FDs ? Dan Suciu -- CSEP544 Fall 2011

  19. Example namecolor categorydepartment color, categoryprice What about this one ?

  20. An Interesting Observation namecolor categorydepartment color, categoryprice If all these FDs are true: name, categoryprice Then this FD also holds: Why ??

  21. Goal: Find ALL Functional Dependencies • Anomalies occur when certain “bad” FDs hold • We know some of the FDs • Need to find all FDs, then look for the bad ones Dan Suciu -- CSEP544 Fall 2011

  22. Armstrong’s Rules (1/3) A1, A2, …, An B1, B2, …, Bm Splitting rule and Combing rule Is equivalent to A1, A2, …, An B1 A1, A2, …, An B2 . . . . . A1, A2, …, An Bm Dan Suciu -- CSEP544 Fall 2011

  23. Armstrong’s Rules (2/3) Trivial Rule A1, A2, …, An Ai where i = 1, 2, ..., n Why ? Dan Suciu -- CSEP544 Fall 2011

  24. Armstrong’s Rules (3/3) Transitive Closure Rule A1, A2, …, An B1, B2, …, Bm If and B1, B2, …, Bm  C1, C2, …, Cp A1, A2, …, An C1, C2, …, Cp then Why ? Dan Suciu -- CSEP544 Fall 2011

  25. Dan Suciu -- CSEP544 Fall 2011

  26. Example (continued) 1. namecolor 2. categorydepartment 3. color, categoryprice Start with these: Infer these:

  27. Example (continued) 1. namecolor 2. categorydepartment 3. color, categoryprice Answers: THIS IS TOO HARD ! There is an easier way.

  28. Closure of a set of Attributes Given a set of attributes A1, …, An The closure, {A1, …, An}+ = the set of attributes Bs.t. A1, …, An B namecolor categorydepartment color, categoryprice Example: name+ = {name, color}{name, category}+ = {name, category, color, department, price}color+ = {color}

  29. Closure Algorithm X={A1, …, An}. repeat untilX doesn’t change do: if B1, …, Bn C is a FD and B1, …, Bn are all in X then add C to X. Example: namecolor categorydepartment color, categoryprice {name, category}+ = ? [ …in class…]

  30. Practice at Home… A, B  C A, D  E B  D A, F  B R(A,B,C,D,E,F) Compute {A,B}+ X = {A, B, } Compute {A, F}+ X = {A, F, } Dan Suciu -- CSEP544 Fall 2011

  31. Why Do We Need Closure • With closure we can find all FD’s easily • To check if X ® A • Compute X+ • Check if A Î X+ Dan Suciu -- CSEP544 Fall 2011

  32. Practice at Home Find all FD’s implied by: A, B  CA, D  B B  D

  33. Practice at Home Find all FD’s implied by: A, B  CA, D  B B  D Step 1: Compute X+, for every X: A+ = A, B+ = BD, C+ = C, D+ = D AB+ =ABCD, AC+=AC, AD+=ABCD, BC+=BCD, BD+=BD, CD+=CD ABC+ = ABD+ = ACD+ = ABCD (no need to compute– why ?) BCD+ = BCD, ABCD+ = ABCD

  34. Practice at Home Find all FD’s implied by: A, B  CA, D  B B  D Step 1: Compute X+, for every X: A+ = A, B+ = BD, C+ = C, D+ = D AB+ =ABCD, AC+=AC, AD+=ABCD, BC+=BCD, BD+=BD, CD+=CD ABC+ = ABD+ = ACD+ = ABCD (no need to compute– why ?) BCD+ = BCD, ABCD+ = ABCD Step 2: Enumerate all FD’s X  Y, s.t. Y X+ and XY = : AB  CD, ADBC, ABC  D, ABD  C, ACD  B

  35. Keys A superkey is a set of attributes X = A1, ..., Ans.t. for any other attribute B, we have X  B Note: X is a superkey if X+ = [all attributes] A key is a minimal superkey X How do we compute all keys X ?

  36. Example Product(name, price, category, color) name, category  price category  color What are the keys ? Dan Suciu -- CSEP544 Fall 2011

  37. Example Product(name, price, category, color) name, category  price category  color What are the keys ? (name, category) + = name, category, price, color Hence X = (name, category) is a key

  38. Practice at Home Enrollment(student, address, course, room, time) student  address room, time  course student, course  room, time Find all keys Dan Suciu -- CSEP544 Fall 2011

  39. Read at Home We can have more than one key. Example: relation R(A,B,C) satisfies ABCBCA ABCBAC or Find all keys

  40. Boyce-Codd Normal Form There are no“bad” FDs: A relation R is in BCNF if: Whenever X B is a non-trivial dependency,then X is a superkey. Equivalently: A relation R is in BCNF if: " X, either X+ = X or X+ = [all attributes] Dan Suciu -- CSEP544 Fall 2011

  41. Example SSN  Name, City The only key is:{SSN, PhoneNumber}Hence SSN  Name, City is a “bad” dependency Alternatively: SSN+ = Name, City and is neither SSN nor all attributes

  42. BCNF Decomposition Algorithm Normalize(R)find X s.t.: X ≠ X+ ≠ [all attributes]if (not found) then“R is in BCNF”let Y = X+ - X; Z = [all attributes] - X+ decompose R into R1(X  Y) and R2(X  Z)Normalize(R1); Normalize(R2); Y X Z X+

  43. Find X s.t.: X ≠X+ ≠ [all attributes] Example BCNF Decomposition Person(name, SSN, age, hairColor, phoneNumber) SSN  name, age age  hairColor Dan Suciu -- CSEP544 Fall 2011

  44. Find X s.t.: X ≠X+ ≠ [all attributes] Example BCNF Decomposition Person(name, SSN, age, hairColor, phoneNumber) SSN  name, age age  hairColor Iteration 1: Person: SSN+ = SSN, name, age, hairColor Decompose into: P(SSN, name, age, hairColor) Phone(SSN, phoneNumber) name,age,hairColor phoneNumber SSN Dan Suciu -- CSEP544 Fall 2011

  45. Find X s.t.: X ≠X+ ≠ [all attributes] Example BCNF Decomposition Person(name, SSN, age, hairColor, phoneNumber) SSN  name, age age  hairColor What arethe keys ? Iteration 1: Person: SSN+ = SSN, name, age, hairColor Decompose into: P(SSN, name, age, hairColor)Phone(SSN, phoneNumber) Iteration 2: P: age+ = age, hairColor Decompose: People(SSN, name, age)Hair(age, hairColor)Phone(SSN, phoneNumber) Dan Suciu -- CSEP544 Fall 2011

  46. R(A,B,C,D) A  BB  C Practice at Home R(A,B,C,D) A+ = ABC ≠ ABCD Dan Suciu -- CSEP544 Fall 2011

  47. R1(A,B,C) B+ = BC ≠ ABC R2(A,D) R12(A,B) R11(B,C) R(A,B,C,D) A  BB  C Practice at Home R(A,B,C,D) A+ = ABC ≠ ABCD What arethe keys ? What happens if in R we first pick B+ ? Or AB+ ? Dan Suciu -- CSEP544 Fall 2011

  48. Decompositions in General R(A1, ..., An, B1, ..., Bm, C1, ..., Cp) R1(A1, ..., An, B1, ..., Bm) R2(A1, ..., An, C1, ..., Cp) R1 = projection of R on A1, ..., An, B1, ..., Bm R2 = projection of R on A1, ..., An, C1, ..., Cp Dan Suciu -- CSEP544 Fall 2011

  49. Theory of Decomposition Sometimes it is correct: Lossless decomposition

  50. Incorrect Decomposition Sometimes it is not: What’sincorrect ?? Lossy decomposition

More Related