
Keys and Functional Dependency

CS157A. Lecture 14. Keys and Functional Dependency. Prof. Sin-Min Lee Department of Computer Science San Jose State University. Data Normalization. Primarily a tool to validate and improve a logical design so that it satisfies certain constraints that avoid unnecessary duplication of data.





Presentation Transcript


  1. CS157A Lecture 14 Keys and Functional Dependency Prof. Sin-Min Lee Department of Computer Science San Jose State University

  2. Data Normalization • Primarily a tool to validate and improve a logical design so that it satisfies certain constraints that avoid unnecessary duplication of data. • The process of decomposing relations with anomalies to produce smaller, well-structured relations. • Primary objective: reduce redundancy, reduce nulls • Improve “modify” activities: • insert, • update, • delete, • but not read • Price: degraded query, display, and reporting performance

  3. Functional Dependency and Keys • Functional Dependency: The value of one attribute (the determinant) determines the value of another attribute. • Candidate Key: Each non-key field is functionally dependent on every candidate key.

  4. Functional dependency • a constraint between two attributes (columns) or two sets of columns • A → B if “for every valid instance of A, that value of A uniquely determines the value of B” • or … A → B if “there exists at most one value of B for every value of A”

  5. Functional Dependencies • FDs are defined over two sets of attributes: X, Y ⊆ R • Notation: X → Y reads as “X determines Y” • If X → Y, then all tuples that agree on X must also agree on Y • Example relation R(X, Y, Z) with tuples (1, 2, 3), (2, 4, 5), (1, 2, 4), (1, 2, 7), (2, 4, 8), (3, 7, 9)

  6. Functional Dependencies (example) • R(X, Y, Z) with tuples (1, 2, 3), (2, 4, 5), (1, 2, 4), (1, 2, 7), (2, 4, 8), (3, 7, 9)
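
The rule "all tuples that agree on X must also agree on Y" can be tested mechanically against the example tuples. The sketch below is only an illustration: `fd_holds` is a hypothetical helper name, and a check on one relation instance can refute a functional dependency but cannot prove that it holds for every valid state of the schema.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if all rows that agree on lhs also agree on rhs."""
    seen = {}
    for row in rows:
        left = tuple(row[a] for a in lhs)
        right = tuple(row[a] for a in rhs)
        # setdefault keeps the first right-hand value seen for this left-hand value
        if seen.setdefault(left, right) != right:
            return False
    return True


# The six tuples of R(X, Y, Z) from the slide
R = [dict(zip("XYZ", t)) for t in
     [(1, 2, 3), (2, 4, 5), (1, 2, 4), (1, 2, 7), (2, 4, 8), (3, 7, 9)]]

print(fd_holds(R, ["X"], ["Y"]))  # True: X=1 always has Y=2, X=2 has Y=4, X=3 has Y=7
print(fd_holds(R, ["X"], ["Z"]))  # False: X=1 appears with Z=3, Z=4 and Z=7
```

In this instance X → Y is consistent with the data, while X → Z is refuted by the tuples (1, 2, 3) and (1, 2, 4).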

  7. … functional dependency • some examples • SSN → Name, Address, Birthdate • VIN → Make, Model, Color • note: the LHS is the determinant • so functional dependency is the technical term for “determines”

  8. Candidate Keys • an attribute (or set of attributes) that uniquely identifies a row • primary key is a special candidate key • values cannot be null • e.g. • ENROLL (Student_ID, Name, Address, …) • PK = Student_ID • candidate key = Name, Address

  9. … candidate key • a candidate key must satisfy: • unique identification • implies that each nonkey attribute is functionally dependent on the key (for not(A → B) to be true, A must occur more than once (with a different B), or A must map to more than one B in a given row) • nonredundancy • no attribute in the key can be deleted and still be unique • minimal set of columns (Simsion)

  10. keys and dependencies • EMPLOYEE1 (Emp_ID, Name, Dept_Name, Salary) • [diagram labels: determinant, functional dependency]

  11. EMPLOYEE2 (Emp_ID, Course_Title, Name, Dept_Name, Salary, Date_Completed) • not fully functionally dependent on the primary key

  12. determinants & candidate keys • candidate key is always a determinant (one way to find a determinant) • determinant may or may not be a candidate key • ∴ a candidate key is a determinant that uniquely identifies the remaining (nonkey) attributes • determinant may be • a candidate key • part of a composite candidate key • a nonkey attribute

  13. Introduction • Data integrity is maintained by various constraints on data • Functional dependencies are application constraints that help the database model real-world entities • Join dependencies are a further constraint that helps resolve some limitations of FD constraints

  14. Normal Forms provide database designers with: • A formal framework for analyzing relation schemas based on their keys and on the functional dependencies among their attributes. • A series of tests that can be carried out on individual relation schemas so that the relational database can be normalized to any degree.

  15. Keys • superkey: a superkey is a set of attributes S ⊆ R = {A1, A2, …, An} with the property that no two tuples t1 and t2 in any relation state r of R will have t1[S] = t2[S]. • A key K is a superkey with the additional property that removal of any attribute from K will cause K not to be a superkey anymore.

  16. Keys • The difference between a key and a superkey is that a key has to be “minimal”. • Example: • {SSN} is a key for EMPLOYEE, whereas {SSN}, {SSN,ENAME}, {SSN, ENAME, BDATE} are all superkeys.

  17. Keys • If a relation schema has more than one “minimal” key, each is called a candidate key.

  18. Keys • one of the candidate keys is designated to be the primary key. • Each relation schema must have a primary key. • For example, {SSN} is the only candidate key for EMPLOYEE, so it is also the primary key.

  19. R(A B C D E) • FD1. A -> C • FD2. BC -> D • FD3. E -> AB • Computing {A}+: result = {A} • By FD1 (A -> C): A ⊆ result, so result = {A, C} • By FD2 (BC -> D): BC ⊄ result, so result = {A, C} • By FD3 (E -> AB): E ⊄ result, so result = {A, C} • ∴ {A}+ = {A, C}
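
The step-by-step computation of {A}+ can be written as a small fixed-point loop. This is a minimal sketch, not code from the lecture: the function name `closure` is an assumption, and FDs are represented as pairs of strings, which only works because every attribute name here is a single character.

```python
def closure(attrs, fds):
    """Attribute closure: everything functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply an FD when its whole left side is already in the result
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


# FD1: A -> C, FD2: BC -> D, FD3: E -> AB
FDS = [("A", "C"), ("BC", "D"), ("E", "AB")]

print(sorted(closure("A", FDS)))  # ['A', 'C']
print(sorted(closure("E", FDS)))  # ['A', 'B', 'C', 'D', 'E']
```

The loop repeats until a full pass over the FDs adds nothing new, which mirrors the slide's pass through FD1, FD2 and FD3.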

  20. Similarly • {B}+ = {B} • {C}+ = {C} • {D}+ = {D} • {E}+ = {E, A, B, C, D} • so E is a candidate key • Now, we see: {AB}+ = {ABCD}, {AC}+ = {AC}, {AD}+ = {ACD}, {BC}+ = {BCD}, {BD}+ = {BD}, {CD}+ = {CD}, {ABC}+ = {ABCD}, {ABD}+ = {ABCD}, {BCD}+ = {BCD}, {ACD}+ = {ACD} • none of these closures contains E, so no set of attributes without E is a key
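
The enumeration of closures above amounts to a brute-force search for candidate keys. The sketch below assumes single-character attribute names and uses hypothetical helper names (`closure`, `candidate_keys`); it tries attribute sets from smallest to largest, so it is exponential in the number of attributes and only suitable for small schemas like this one.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under the functional dependencies fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(attributes, fds):
    """Return all minimal attribute sets whose closure is the whole relation."""
    all_attrs = set(attributes)
    keys = []
    for size in range(1, len(all_attrs) + 1):
        for combo in combinations(sorted(all_attrs), size):
            candidate = set(combo)
            # A set containing an already-found key is a superkey but not minimal
            if any(key <= candidate for key in keys):
                continue
            if closure(candidate, fds) == all_attrs:
                keys.append(candidate)
    return keys


FDS = [("A", "C"), ("BC", "D"), ("E", "AB")]
print(candidate_keys("ABCDE", FDS))  # [{'E'}]
```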

  21. What is the highest normal form of this table? R(A B C D E) • FD1. A -> C • FD2. BC -> D • FD3. E -> AB • Answer: {E} is the only candidate key of R • The non-prime attributes are: A, B, C, D • Because E -> A (from FD3) and A -> C (FD1), C depends transitively on the key E • Thus R(ABCDE) is in 2NF but not 3NF
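
The answer can be cross-checked with the common textbook test for 3NF: for every non-trivial FD X -> A, either X is a superkey or A is a prime attribute. This is a sketch under that formulation, which differs in wording from the "no transitive dependencies" definition used in these slides; the helper names are hypothetical and attribute names are assumed to be single characters.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under the functional dependencies fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def third_nf_violations(attributes, fds, keys):
    """List (lhs, attribute) pairs that break the 3NF condition."""
    all_attrs = set(attributes)
    prime = set().union(*keys)  # attributes that belong to some candidate key
    violations = []
    for lhs, rhs in fds:
        is_superkey = closure(lhs, fds) == all_attrs
        for attr in sorted(set(rhs) - set(lhs)):  # skip the trivial part
            if not is_superkey and attr not in prime:
                violations.append((lhs, attr))
    return violations


FDS = [("A", "C"), ("BC", "D"), ("E", "AB")]
print(third_nf_violations("ABCDE", FDS, [{"E"}]))
# [('A', 'C'), ('BC', 'D')]
```

Under this test both A -> C and BC -> D are reported, since neither left side is a superkey and neither C nor D is prime. The key {E} has a single attribute, so no partial dependency is possible and the relation remains in 2NF.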

  22. What is Normalization? • The purpose of normalization is to produce a stable set of relations that is a faithful model of the operations of the enterprise. By following the principles of normalization, we can achieve a design that is highly flexible, allowing the model to be extended when needed to account for new attributes, entity sets, and relationships.

  23. Normal Forms • A relation is in a specific normal form if it satisfies the set of requirements or constraints for that form. All of the normal forms are nested, in that each satisfies the constraints of the previous one but is a "better" form because it eliminates flaws found in the previous one.

  24. Steps in Normalization

  25. 1NF • relation is in first normal form if it contains no multivalued attributes • remove repeating groups to a new table as already demonstrated, “carrying” the PK as a FK

  26. First Normal Form ( 1NF ) • the domains of attributes must include only atomic (simple, indivisible) values, and the value of any attribute in a tuple must be a single value from the domain of the attribute.

  27. First Normal Form ( 1NF ) • example: DEPARTMENT (DNAME, DNUMBER, DMGRSSN, DLOCATIONS) with tuples (Research, 5, 333445555, {Bellaire, Sugarland, Houston}), (Administration, 4, 987654321, {Stafford}), (Headquarters, 1, 888665555, {Houston}) • there are two ways to view DLOCATIONS: • the domain of DLOCATIONS contains atomic values, but some tuples can have a set of these values. In this case, DNUMBER does not functionally determine DLOCATIONS (DNUMBER ↛ DLOCATIONS). • the domain of DLOCATIONS contains sets of values and hence is non-atomic.
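
One way to bring the DEPARTMENT example into 1NF is to move the set-valued attribute into its own table and carry the primary key along as a foreign key, as described on the 1NF slide. The sketch below is illustrative only: the function `split_multivalued`, the column name `DLOCATION`, and the choice of DNUMBER as the carried key are assumptions, not part of the lecture.

```python
department = [
    {"DNAME": "Research", "DNUMBER": 5, "DMGRSSN": "333445555",
     "DLOCATIONS": {"Bellaire", "Sugarland", "Houston"}},
    {"DNAME": "Administration", "DNUMBER": 4, "DMGRSSN": "987654321",
     "DLOCATIONS": {"Stafford"}},
    {"DNAME": "Headquarters", "DNUMBER": 1, "DMGRSSN": "888665555",
     "DLOCATIONS": {"Houston"}},
]


def split_multivalued(rows, key, multivalued, single_name):
    """Split a set-valued column into a separate table keyed by `key`."""
    base, detail = [], []
    for row in rows:
        # Base table: every column except the multivalued one
        base.append({k: v for k, v in row.items() if k != multivalued})
        # Detail table: one row per value, carrying the key as a foreign key
        for value in sorted(row[multivalued]):
            detail.append({key: row[key], single_name: value})
    return base, detail


dept, dept_locations = split_multivalued(
    department, "DNUMBER", "DLOCATIONS", "DLOCATION")

for row in dept_locations:
    print(row)
```

After the split, every attribute value is atomic, and the key of the new table is the pair (DNUMBER, DLOCATION).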

  28. Our Example in 1NF PROJ_NUM PROJ_NAME EMP_NUM EMP_NAME JOB_CLASS CHG_HOUR HOURS • Key (PROJ_NUM, EMP_NUM) • Given PROJ_NUM • PROJ_NAME is determined • Given EMP_NUM • EMP_NAME, JOB_CLASS, and CHG_HOUR are determined

  29. 2NF • a relation is in second normal form if it is in first normal form AND every nonkey attribute is fully functionally dependent on the primary key • i.e. remove partial functional dependencies, so no nonkey attribute depends on just part of the key

  30. EMPLOYEE2 (Emp_ID, Course_Title, Name, Dept_Name, Salary, Date_Completed) • not fully functionally dependent on the primary key

  31. Second Normal Form ( 2NF ) • it is based on the concept of full functional dependency. • A functional dependency X → Y is a full functional dependency if, for any attribute A ∈ X, (X − {A}) does not functionally determine Y, i.e. (X − {A}) ↛ Y.

  32. Second Normal Form • A relation is in second normal form (2NF) if and only if it is in first normal form and all the nonkey attributes are fully functionally dependent on the key.

  33. Second Normal Form • A table is in second normal form (2NF) if: • It is in 1NF • It includes no partial dependencies. No attribute is dependent on only a portion of the primary key.

  34. 2NF • a relation is in 2NF if it is in 1NF and any one of these is true: • the PK consists of only 1 attribute • all attributes are part of the PK (no nonkey attributes) • every nonkey attribute is functionally dependent on the whole PK

  35. 2NF (Example) • R(A, B, C, D) with 2 candidate keys • R with key {AB} is NOT 2NF • R with key {AC} is NOT 2NF

  36. Second Normal Form ( 2NF ) • {SSN, PNUMBER} → HOURS is a full dependency (neither SSN → HOURS nor PNUMBER → HOURS holds). • [diagram labels: fd1, fd2, fd3]

  37. Second Normal Form ( 2NF ) • The functional dependencies fd1, fd2, fd3 lead to the decomposition of EMP_PROJ into the three relation schemas EP1, EP2, EP3, each of which is in 2NF. • [diagram: 2NF normalization of EMP_PROJ into EP1 (fd1), EP2 (fd2), EP3 (fd3)]

  38. 1NF 2NF • EMPLOYEE2 (Emp_ID, Course_Title, Name, Dept_Name, Salary, Date_Completed) •  • EMPLOYEE1 (Emp_ID, Name, Dept_Name, Salary) • and • EMP_COURSE (Emp_ID, Course_Title, Date_Completed) • EMPLOYEE1 satisfies condition1 • EMP_COURSE satisfies condition3

  39. 3NF • a relation is in third normal form if it is in 2NF, AND no transitive dependencies exist • transitive dependency is a functional dependency between nonkey attributes

  40. transitive dependency • [diagram: transitive dependency]

  41. … transitive dependency • same problems • insertion anomaly (no salesman without a customer) • deletion anomaly (if a salesman is assigned to only 1 customer, and the customer is deleted, we lose the salesman!) • modification (update) anomaly (reassign salesperson to region)
