440 likes | 573 Vues
This chapter elaborates on fundamental concepts in relational database design, focusing on normalization principles, including First Normal Form (1NF), Boyce-Codd Normal Form (BCNF), and Third Normal Form (3NF). It emphasizes the significance of functional dependencies, the process of decomposition of schemas, and explores common pitfalls such as redundancy and loss of information. The importance of achieving lossless-join decomposition is highlighted, along with practical examples, particularly within a banking enterprise context, illustrating the management of complex relationships and attributes in database schemas.
E N D
Chapter 7: Relational Database Design • First Normal Form • Functional Dependencies • Decomposition • Boyce-Codd Normal Form • Third Normal Form
First Normal Form • R is in first normal form if the domains of all attributes of R are atomic • In object relational/object oriented databases, attributes can be composite or multi-valued • But in relational databases, composite attributes will need to be flatten out and multi-valued attributes need to be represented by another relation
Pitfalls in Relational Database Design • We can create tables to represent an ER design in many different ways • Combine attributes differently to create tables • Why do we choose some ways over the others? • Redundancy • Inability to represent certain information • E.g. relationships among attributes
Example • Consider the relation schema:Lending-schema = (branch-name, branch-city, assets, customer-name,loan-number, amount) • Redundancy, why? • Inability to represent certain information, why? • Cannot store information about a branch if no loans exist • Can use null values, but they are difficult to handle.
Why Redundancy Is Bad? • Wastes space • Complicates updating, introducing possibility of inconsistency of assets value • We know why inability to represent certain information is bad.
Decomposition • Decompose the relation schema Lending-schema into: Branch-schema = (branch-name, branch-city,assets) Loan-info-schema = (customer-name,loan-number, branch-name, amount)
Lossless-join decomposition • All attributes of an original schema (R) must appear in the decomposition (R1, R2): R = R1 R2 • For all possible relations r on schema R r = R1 (r) R2 (r)
Non Lossless-Join Decomposition • Decomposition of R = (A, B) • R1 = (A) R2 = (B) A B A B 1 2 1 1 2 B(r) A(r) r A B A (r) B (r) We do not loss any tuple but we lose the relationship between A and B 1 2 1 2
Relational DB Design Process • Decide whether a particular relation R is in “good” form. • In the case that a relation R is not in “good” form, decompose it into a set of relations {R1, R2, ..., Rn} such that • each relation is in good form • the decomposition is a lossless-join decomposition • Our theory is based on: • functional dependencies • multivalued dependencies
Functional Dependencies • Constraints on the set of legal relations. • Require that the value for a certain set of attributes determines uniquely the value for another set of attributes. • A functional dependency is a generalization of the notion of a key.
Functional Dependencies • Let R be a relation schema R and R • The functional dependency holds onR if and only if for any legal relations r(R), whenever any two tuples t1and t2 of r agree on the attributes , they also agree on the attributes . That is, t1[] = t2 [] t1[ ] = t2 [ ]
Example • Example: Consider r(A,B) with the following instance of r. • On this instance, AB does NOT hold, but BA does hold. • Note: function dependency needs to hold for any possible instance of a relation. • 4 • 1 5 • 3 7
More Example • Consider the schema: Loan-info-schema = (customer-name,loan-number, branch-name, amount). We expect this set of functional dependencies to hold: loan-numberamount loan-number branch-name but would not expect the following to hold: loan-number customer-name (why?)
Keys Defined by Functional Dependencies • K is a superkey for relation schema R if and only if K R • K is a candidate key for R if and only if • K R, and • for no K, R
Example Keys • Lending-schema = (branch-name, branch-city, assets, customer-name,loan-number, amount) • Superkeys? • Candidate keys?
Functional Dependencies • A functional dependency is trivial if it is satisfied by all instances of a relation • E.g. • customer-name, loan-number customer-name • customer-name customer-name • In general, is trivial if
Closure of Attribute Sets • Given a set of attributes a, define the closureof aunderF (denoted by a+) as the set of attributes that are functionally determined by a under F:ais in F+ a+
Algorithm to compute a+ • The closure of a under F result := a;while (changes to result) do for each in F do begin if result • then result := result end
Example of Attribute Set Closure • R = (A, B, C, G, H, I) • F = {A BA C CG HCG IB H} • (AG)+ 1. result = AG 2. result = ABCG (A C and A B) 3. result = ABCGH (CG H and CG AGBC) 4. result = ABCGHI (CG I and CG AGBCH)
Example Closures • Lending-schema = (branch-name, branch-city, assets, customer-name,loan-number, amount) • Closure of attribute set (loan-number)? • Closure of attribute set (customer-name, loan-number)? • Closure of attribute set (branch-name)? • Given: • loan-number branch-name, amount • branch-name branch-city, assets • customer-name,loan-number Lending-schema
Testing for superkey and candidate key Testing functional dependencies Computing closure of F Uses of Attribute Closure
Testing for Keys • To test if is a superkey, we compute +, and check if +contains all attributes of R. • To test if is a candidate key, first make sure is a superkey. Then make sure no subset of is a superkey • In the previous example, is AG a candidate key? • Is AG a super key? • Does AG R? == Is (AG)+ R • Is any subset of AG a superkey? • Does AR? == Is (A)+ R • Does GR? == Is (G)+ R
Example: Testing for keys • Lending-schema = (branch-name, branch-city, assets, customer-name,loan-number, amount) • Is (loan-number) a superkey/candidate key? • Is (customer-name, loan-number) a superkey/candidate key? • Is (branch-name) a superkey/candidate key? • Is (customer-name, loan-number,branch-name) a superkey/candidate key? • Given: • loan-number branch-name,amount • branch-name branch-city, assets • customer-name,loan-number Lending-schema
Testing Functional Dependencies • To check if a functional dependency holds (or, in other words, is in F+), just check if +.
Example: Testing for functional dependencies • Lending-schema = (branch-name, branch-city, assets, customer-name,loan-number, amount) • Does loan-number branch-city, assets hold? • Does customer-name, loan-number, amount amount hold? • Given: • loan-number branch-name,amount • branch-name branch-city, assets • customer-name,loan-number Lending-schema
Computing Closure of F • But what is a closure of F? • Given a set F set of functional dependencies, there are certain other functional dependencies that are logically implied by F. • E.g. If A B and B C, then we can infer that A C • The set of all functional dependencies logically implied by F is the closure of F. • We denote the closure of F by F+.
Armstrong’s Axioms • Sound and Complete rules: • if , then (reflexivity) • if , then (augmentation) • if , and , then (transitivity) • These rules are • sound (generate only functional dependencies that actually hold) and • complete (generate all functional dependencies that hold).
Example • R = (A, B, C, G, H, I)F = { A B, A C, CG H, CG I, B H } • some members of F+ • A H • by transitivity from A B and B H • AG I • by augmenting A C with G, to get AG CG and then transitivity with CG I • CG HI • from CG H and CG I : “union rule” can be inferred from • definition of functional dependencies, or • Augmentation of CG I to infer CG CGI, augmentation ofCG H to inferCGI HI, and then transitivity
Closure of Functional Dependencies • Derived rules from Armstrong’s axioms • If holds and holds, then holds (union) • If holds, then holds and holds (decomposition) • If holds and holds, then holds (pseudotransitivity)
When we compute the closure of attribute set , we already implicitly used Armstrong’s axioms • Let us just use the closure of attribute set to calculate the closure of functional dependency set
Now Compute Closure of F • For each R, we find the closure +, and for each S +, we output a functional dependency S.
Example: Closure of F • Lending-schema = (branch-name, branch-city, assets, customer-name,loan-number, amount) • How to compute the closure of F (F is given below)? • Given F: • loan-number branch-name,amount • branch-name branch-city, assets • customer-name,loan-number Lending-schema
We just talked about the maximal set of functional dependencies that can be derived from a functional dependent set F • Closure of F • What about the minimal set of functional dependencies that is equivalent functional dependent set F • Canonical cover of F
Canonical Cover • Intuitively, a canonical cover of F is a “minimal” set of functional dependencies equivalent to F, having no redundant dependencies or redundant parts of dependencies • What does “equivalent” mean? • What does “minimal” mean?
Equivalent Means: • Give two functional dependency sets F and F’, they are equivalent if and only if: • F logically implies all functional dependencies in F’ • F’ logically implies all functional dependencies in F • We use to represent logically imply and to represent equivalent • How to test if F F’? • For each F’, test if it holds in F • You may need to calculate a+ in F
Minimal Means: • No redundant functional dependencies! • Sets of functional dependencies may have redundant dependencies that can be inferred from the others • Eg: A C is redundant in: {A B, B C, A C} • Parts of a functional dependency may be redundant • E.g. on RHS: {A B, B C, A CD} can be simplified to {A B, B C, A D} • E.g. on LHS: {A B, B C, AC D} can be simplified to {A B, B C, A D}
Extraneous Attributes • For any in F • Attribute A is extraneous ifF (F – {}) {( – A) }. • Attribute A is extraneous if F (F – {}) { (– A)} • Use attribute closures to check equivalence
Examples • Given F = {AC, ABC } • B is extraneous in AB C because {AC, AB C} is equivalent to {AC, AC } = {AC} • Given F = {AC, ABCD} • C is extraneous in ABCD because {AC, ABCD} is equivalent to {AC, ABD}
Canonical Cover • A canonical coverfor F is a set of dependencies Fc such that • F logically implies all dependencies in Fc, and • Fclogically implies all dependencies in F, and • No functional dependency in Fccontains an extraneous attribute, and • Each left side of functional dependency in Fcis unique.
Compute a canonical cover for F repeatUse the union rule to replace any dependencies in F11 and 12 with 112 Find a functional dependency with an extraneous attribute either in or in If an extraneous attribute is found, delete it from until F does not change • Note: Union rule may become applicable after some extraneous attributes have been deleted, so it has to be re-applied
Example • R = (A, B, C)F = {A BC, B C, A B,ABC} • Combine A BC and A B into A BC • Set is now {A BC, B C, ABC} • Is B extraneous in ABC • {A BC, B C, ABC} ? {A C, B C, ABC} NO • Is C extraneous in ABC • {A BC, B C, ABC} ? {A B, B C, ABC} Yes • Set is now {A B, B C, ABC} • Is A is extraneous in ABC • {A B, B C, ABC} ? {A B, B C, BC} Yes • Set is now {A B, B C} • The canonical cover is: {A B, B C}
Example: Canonical Cover of F • Lending-schema = (branch-name, branch-city, assets, customer-name,loan-number, amount) • How to compute the Canonical Cover of F (F is given below)? • Given F: • loan-number branch-name,branch-city,assets,amount • branch-name branch-city, assets • customer-name,loan-number Lending-schema