Normal Forms in Database Design

Schema Refinement: Normal Forms

Normal Forms • Given <R, F>, a relation schema R together with a set of FD’s, we want to determine if R is ina “good” shape! • If not, we need to decompose R into smaller “good” relations;  How to measure this goodness and how to achieve it? • To address these issues, we need to study normal forms • If a relation schema is in some normal form, we know that it is in some “good” shape, in the sense that it won’t suffer from certain kinds of (redundancy) problems.

BCNF 3NF 2NF 1NF Normal Forms • The normal forms based on FD’s are • First normal form (1NF) • Second normal form (2NF) • Third normal form (3NF) • Boyce-Codd normal form (BCNF) • These normal forms have increasingly restrictive requirements

First & Second Normal Forms A relation scheme is said to be in first normal from (1NF) if the values in the domain of each attribute of the relation are atomic. In other words, only one value is associated with each attribute and the value is not a set of values or a list of values. A database scheme is in first normal form if every relation scheme included in the database scheme is in 1NF. A relation scheme R<S,F> is in second normal from (2NF) if it is in the 1NF and if all nonprime attributes are fully functionally dependent on the relation key(s). A database scheme is in second normal form if every relation scheme included in the database scheme is in second normal form.

Third Normal Form Let R be a relation schema, F a set of FD’s on R, X⊆ R, and A∈ R. • We say R w.r.t. F is in third normal form (3NF), if for each FD X  A in F, at least one of the following conditions holds: • A  X (that is, X  A is a trivial FD), or • Xis a superkey, or • IfXis not a key, thenAis part of some key of R • To determine whether <R, F> is in 3NF: • For every non-trivial FD X  A in F, we check whether X is a superkey. If not, we then check whether its RHS, A, is part of any key of R. If both conditions fail, we conclude that R is not in 3NF w.r.t. F.

Boyce-Codd Normal Form Let R be a relation schema, F a set of FD’s on R, X⊆ R, and A∈ R. • We say R w.r.t. F is in Boyce-Codd normal form (BCNF), if for each FD X  A in F, at least one of the following holds: • A  X (that is the FD is trivial) or • Xis a superkey • To determine whether <R, F> is in BCNF or not, we check every non-trivial FD inF. • If there exists a FD X  A in F such that X+ ≠ R, then R is not in BCNF. Otherwise, we say R is BCNF w.r.t. F

Decomposition into BCNF • Consider <R, F>, where R is in 1NF. • If R is not in BCNF, we can always obtain a lossless-joindecomposition of R into a collection of BCNF relations • However, it may not always be dependency preserving • The basic step of a BCNF algorithm: Suppose X  A  Fis a FD violating the BCNF requirement, where X  R and AR • Decompose R into XA and R – A • If either R – A or XA is not in BCNF, decompose it further

Example R = ABCDE F = { A B, C D } A B R1 = AB F1 = { A B } R2 = ACDE F2= { C D} C D R21 = CD F21 = { C D } R22 = ACE F22 = { }

Decomposition into 3NF • We can always obtain a lossless-join, dependency-preserving decomposition of a relation into 3NF relations. How? • We discuss 2 approaches to decompose <R, F>. First: • Approach 1: Follow the binary decomposition method for BCNF Let R = { R1, R2, . . . Rn} be the result. Recall that this is always lossless-join, but may not preserve the FD’s; so need to fix it? • Identify the set N of FD’s in F that are lost (i.e., not preserved) • For each FD X  A in N, create a relation schema XA and add it to R • A refinement step: if there are several FD’s with the same LHS, e.g., X  A1,X  A2, . . . , X  Ak, we create just one relation with schema XA1…Ak That is, we replace these k FD’s (having the same LHS) with a single equivalent FD X  A1…Ak and create just one relation instead of k relation schemas XA1, … ,XAk

Example (3NF Decomposition) R = ABCDE F = { BD E, C B , CE A } BD E R1 = BDE F1 = { BD E } R2 = ABCD F2 = {C B, CD A } C B R21 = CB F21 = { C B } R22 = ACD F22= { CD A} • CE A is not preserved, since A∉ {CE}+ w.r.t. F1⋃F21⋃ F22  We add to R, a new relation R3 = CEA with F3= {CE A}

Example (using a different order) R = ABCDE F = { BD E, C B , CE A} • This decomposition is dependency preserving, and of course lossless-join CE A R1 = CEA F1 = { CE A } R2 = BCED F2 = { C B , BD E} BD E R22 = BCD F22 = { C B} R21 = BDE F21 = { BD E } C B R221 = BC F221 = { C  B } R222 = CD F222 = 

Decomposition into 3NF • Previous (binary decomposition approach): • Lossless-join √ • May not be dependency preserving. If so, then add extra relations XA, one for each FD X→ A we lost • Now, the synthesis approach • Dependency preservation √ • However, may not be lossless-join. If so, we need to add to R, only one extra relation schema that includes the attributes that form any key of R What would be the FDs on this newly added relation?

Decomposition into 3NF (synthesis) Consider relation schema <R, F> • The synthesis approach: • Get a canonical cover Fcof F • For each FD X  A in Fc, add schema XA to R • If the decomposition R is not lossless, need to fix it. Add to R an extra relation schema containing just those attributes that form any key of R

Example • R=( A,B,C ) • F = { A B, CB } • Decompose R into R1 = ( A,B ) and R2 = ( B,C ) • This decomposition is not lossless Add R3 = ( A,C ) • The decomposition R= {R1, R2, R3} is both lossless and dependency-preserving

Ann Algorithm to Check Lossless join Suppose relation R{A1 , . . . , Ak} is decomposed into R1,. . . , Rn To determine if this decomposition is lossless, we use a table, L[ 1 … n] [ 1 . . . k] Initializing the table: for each relation Rido for each attribute Aj do if Ajis an attributein Ri then L [ i ][ j ]  aAj elseL [ i ][ j ]  biAj

Algorithm to Check Lossless (cont’d) repeat for each FD X  Y in Fdo: if∃ rows i and j such that L [ i ] == L [ j ], for each attribute in X, then for ∀ column t corresponding to an attribute At in Ydo: if L [ i ][ t ] == aAt thenL [ j ][ t ]  aAt else if L [ j ][ t ] == aAt thenL [ i ][ t ]  aAt else L [ j ][ t ] L [ i ][ t ] untilno change The decomposition is lossless if, after performing this algorithm, L contains a row of all a’s. That is, if there exists a row i in L such that: L [ i ][ t ] == aAt for every column t corresponding to each attribute At in R

Examples • Given ≺R,F≻, where R=( A,B,C,D ), and F = { A B, AC,CD } is a set of FD’s on R • Is the decomposition R = {R1, R2} lossless, where R1 = ( A,B, C ) and R2 = ( C ,D)? • To be discussed in class • Now consider S=( A,B,C, D,E ) and theset G of FD’s on S, where G = { ABCD, AE, CD } • Is decomposition of S= {S1, S2,S3} lossless, where S1 = ( A,B, C ), S2 = ( B,C, D ), and S3 = ( C,D, E )? • To be discussed in class

Dependency-Preserving Checking • Let ≺R,F≻, where F = {X1 Y1,…, Xn Yn}. • Let R ={ R1,…,Rk } be a decomposition of R and Fi be the projection of F on Ri Below is an algorithm that decides dependency preservation. preservedTRUE for each FD X  Y in F and while preserved == TRUE do begin compute X+ under F1 . . .  Fn ; ifY⊈ X+ then preservedFALSE; end

Example • Consider R=( A,B,C,D ), F = { A B, BC,CD } • Is the decomposition R = {R1, R2} dependency-preserving, where R1= ( A,B ), F1 = { A B }, R2= ( A,C ,D), and F2 = { C D, AD, AC }? • Check if A  B is preserved • Compute A+ under { A B } { C D, AD, A C} • A+ = { A, B, D } • Check if B  A+ • Yes • A B is preserved • Check if B  C is preserved • Compute B+ under { A B } { C D, AD, A C} • B+ = { B } • Check if C  B+ • No • B  C is not preserved The decomposition is not dependency-preserving

Normal Forms in Database Design

Normal Forms in Database Design

Presentation Transcript

Schema Refinement and Normal Forms

Schema Refinement and Normal Forms

Schema Refinement and Normal Forms

Schema Refinement and Normal Forms

Schema Refinement and Normal Forms

Schema Refinement and Normal Forms

Schema Refinement and Normal Forms Chapter 19

Schema Refinement: Normal Forms

Schema Refinement and Normal Forms

Schema Refinement and Normal Forms Chapter 19

Schema Refinement and Normal Forms

LECTURE 5: Schema Refinement and Normal Forms

CHAPTER 19 SCHEMA, REFINEMENT AND NORMAL FORMS

Schema Refinement and Normal Forms

Schema Refinement and Normal Forms

Schema Refinement and Normal Forms

Schema Refinement and Normal Forms

Schema Refinement and Normal Forms

Schema Refinement and Normal Forms

Schema Refinement and Normal Forms Chapter 19

Schema Refinement and Normal Forms