Mastering Database Normalization: Key Concepts and Practical Examples
Explore the intricacies of database normalization with a focus on key concepts such as canonical covers, functional dependencies (FDs), and various normal forms (3NF, BCNF). This guide offers insights into lossless-join decomposition, dependency preservation, and redundancy reduction. Learn to identify extraneous attributes, apply closure of attributes, and achieve optimal schema design while ensuring that essential properties of decomposition are maintained. This comprehensive overview is essential for students and professionals aiming to deepen their understanding of relational database systems.
Mastering Database Normalization: Key Concepts and Practical Examples
E N D
Presentation Transcript
Reading Assignments • Database Systems The Complete Book: Chapters 3.6, 3.7, 3.8 • Following lecture slides are modified from Jeff Ullman’s slides for Fall 2002 -- Stanford CSCE 520
Closures • Closure of attributes (A+): find keys • Closure of FDs (S+): projection of FDs to decompositions of schema • Canonical cover: minimize the number of functional dependencies • Important for updates CSCE 520
Canonical Cover • Combine FDs if possible and eliminate extraneous attributes: Given a set of FDs S, and an FD XY • Attribute A is extraneous in X if S logically implies (S-{XY}) {(X-A)Y}. • Attribute A is extraneous in Y if the set of FDs (S-{XY}) {X(Y-A)} logically implies S. CSCE 520
Canonical Cover • No functional dependency in Sc contains an extraneous attribute • Each left side of a functional dependency in Sc is unique. CSCE 520
Example Canonical Cover • Given: • A BC • B C • A B • AB C • Combine 1 and 3 into: A BC • From 2 and 4: A is extraneous in 4: BC • C is extraneous in 1: A B • Result: AB, BC CSCE 520
Problems of Relational Database Design • Loss of information (lossless-join) • Inability to represent certain information (dependency preservation) • Repetition of information (normal forms) CSCE 520
Desirable Properties of Decomposition – Lossless-join • Lossless-Join: Let R be a relation schema, S a set of FDs on R, R1 and R2 a decomposition of R. R1 and R2 form a lossless-join decomposition if at least one of the following functional dependencies are in S+ • R1 R2 R1 • R1 R2 R2 CSCE 520
Create Lossless-Join Decomposition • Let R be the DB Schema and f: X Y an FD in S+ • Decompose R into two relations • R1(X,Y) • R2(R-Y) • Continue for each decomposed relation until cannot be decomposed any more CSCE 520
Desirable Properties of Decomposition – Dependency Preserving • Dependency Preservation: all dependencies that hold on the original schema should be able to test on individual schemas after decomposition. • Testing dependency preservations on projections is straight forward. CSCE 520
Testing dependency preservations on decomposition • Let S be the original set of FDs on R, and S’ the union of FDs on projections R1, R2,…,Rn. • A decomposition is dependency preserving if S’+=S+ CSCE 520
Desirable Properties of Decomposition – Avoid Redundancy FD: O.Name O.Address, O.Phone D.Name, D.Breed D.Color, D.Age CSCE 520
Decomposition and FDs • Functional dependencies: can be used in designing a relational database to remove the undesired properties • Normalization using FDs: • Boyce-Codd Normal Form (BCNF) • 3rd Normal Form (3NF) CSCE 520
Boyce-Codd Normal Form • A relation is in Boyce-Codd Normal Form if for all FDs X A in S+ over R at least one of the followings hold: • X A is a trivial FD • if X A is a nontrivial FD then X is a superkey for schema R • Example: • R(Name,Breed,Date, Kennel) • FD: Name,Breed,Date Kennel • R is in BCNF CSCE 520
Decomposition into BCNF • Compute S+ (for FDs in S); • if there is a R that is not BCNF then - let XA a non-trivial FD on R s.t. XR is not in S+ and X A = ; - Decompose R into: • R1(R-A) • R2(X,A) CSCE 520
BCNF • Not every BCNF decomposition is dependency preserving • Example: • FDs: AB C and C B • Keys: {A,B} and {A,C} • C B is BCNF violation, therefore need to decompose to R1=(AC) and R2=(BC) • Decomposition cannot enforce AB C! CSCE 520
Third Normal Form • Modifies BCNF conditions so no need to decompose in this problem situation • An attribute is prime if it is member of any key • X A violates 3NF if and only if X is not a superkey and also A is not a prime. CSCE 520
3NF Conditions • A relation is in 3NF if for all FDs X A in S+ over R at least one of the followings hold: • X A is a trivial FD • if X A is a nontrivial FD then X is a superkey for schema R • Each attribute B in A-X is contained in a candidate key for R CSCE 520
3NF and BCNF • BCNF gives • Lossless-join • No-redundancy • Dependency preservation not always possible • 3NF gives • Lossless-join • Dependency preservation • May have null values (transitive dependencies) CSCE 520