Relational Database Design Guidelines for Efficient Schema Organization
Learn about Informal Design Guidelines for Relational Database Schema, First to Third Normal Forms, Functional Dependencies, Keys, and Avoiding Anomalies. Ensure optimized data storage and integrity.
Relational Database Design Guidelines for Efficient Schema Organization
E N D
Presentation Transcript
Relational Database Design • Informal design guideline • First Normal Form • Functional Dependencies • Key • Normal form • Second normal form • Third Normal Form • Boyce-Codd Normal Form
Informal design guideline Design a relation schema so that • It is easy to explain its meaning • avoid redundant data • No update anomalies • Avoid null values • Relations can be joined without spurious tuples.
Design guideline : Clear meaning • Design a relation schema so that it is easy to explain its meaning. Do not combine attributes from multiple entity types and relationship types into a single relation. • Figure 14.4
Design guideline: Avoid redundant information • One of goal of schema design is to minimize the storage space that the base relations occupy. • Figure 14.4
Design guideline: no update anomalies • Insertion anomalies • Delete anomalies • Modification anomalies
Design guideline: avoid null values • Null value can: • Waste space • Will give a wrong results of aggregation operations • Null value can have multiple interpretations. • As far as possible, avoid placing attributes in a base relation whose values may frequently be null.
Generation of spurious tuples • Figure 14.5 • Figure 14.6
First Normal Form • Domain is atomic if its elements are considered to be indivisible units • Examples of non-atomic domains: • Set of names, composite attributes • A relational schema R is in first normal form if the domains of all attributes of R are atomic • We assume all relations are in first normal form. • Figure 14.8
Functional Dependencies • Constraints on the set of legal relations. • Require that the value for a certain set of attributes determines uniquely the value for another set of attributes. • A functional dependency is a generalization of the notion of a key.
FD definition • Let R be a relation schema X R and Y R • The functional dependency X Yholds onR if and only if for any legal relations r(R), whenever any two tuples t1and t2 of r agree on the attributes , they also agree on the attributes . That is, t1[] = t2 [] t1[ ] = t2 [ ] • Example: Consider r(A,B) with the following instance of r. • On this instance, AB does NOT hold, but BA does hold. • 4 • 1 5 • 3 7
Use of Functional Dependencies (I) • We use functional dependencies to: • test relations to see if they are legal under a given set of functional dependencies. • If a relation r is legal under a set F of functional dependencies, we say that rsatisfies F.
Use of Functional Dependencies (I) • Functional dependencies allow us to express constraints that cannot be expressed using superkeys. Consider the schema: Loan-info-schema = (customer-name, loan-number, branch-name, amount). We expect this set of functional dependencies to hold: loan-numberamount loan-number branch-name but would not expect the following to hold: loan-number customer-name
Use of Functional Dependencies (II) • Note: A specific instance of a relation schema may satisfy a functional dependency even if the functional dependency does not hold on all legal instances. • For example, a specific instance of Loan-schema may, by chance, satisfy loan-number customer-name.
Inference rules for Functional Dependencies • if X Y, then Y X (reflexivity) • if X Y, then ZX ZY(augmentation) • if X Y, and Y Z, then X Z(transitivity) • if X YZ , then X Y (decomposition) • if X Y, and X Y, then X YZ (union)
Inference rules for Functional Dependencies • A functional dependency is trivial if it is satisfied by all instances of a relation • E.g. • customer-name, loan-number customer-name • customer-name customer-name • In general, is trivial if
Closure of Attribute Sets • Given a set of attributes a, define the closureof aunderF (denoted by a+) as the set of attributes that are functionally determined by a under F: • Algorithm to compute a+, the closure of a under F result := a;while (changes to result) do for each in F do begin if result then result := result end
Example of Attribute Set Closure • R = (A, B, C, G, H, I) • F = {A BA C CG HCG IB H} • (AG)+ 1. result = AG 2. result = ABCG (A C and A B) 3. result = ABCGH (CG H and CG AGBC) 4. result = ABCGHI (CG I and CG AGBCH)
Key • K is a superkey for relation schema R if and only if K R • K is a candidate key for R if and only if • K R, and • for no K, R
Key: find a key • R = (A, B, C, G, H, I) • F = {A BA C CG HCG IB H} • Is AG a candidate key? • Is AG a super key? • Does AG R? == Is (AG)+ R • Is any subset of AG a superkey? • Does AR? == Is (A)+ R • Does GR? == Is (G)+ R
Normal form: second normal form • Problem of first normal form • Figure 10.10
Second normal form • A relation schema R is in second normal form if evey non prime attribute A in R is not partially dependent on any key of R. • Formally: A relation schema R is in second normal form (2NF) if for all non trivial function dependent X A in F+ If A is not contained in any candidate key for R, and X is not a candidate key, then not exist any cnadidate key that contains X.
Third Normal Form • Problem of second normal form • A relation schema R is in third normal form (3NF) if for all: X A in F+at least one of the following holds: • X Ais trivial (i.e., A X) • X is a superkey for R • A is contained in a candidate key for R.
Third Normal form (cont) • Example • R = (J, K, L)F = {JK L, L K} • Two candidate keys: JK and JL • R is in 3NF JK L JK is a superkeyL K K is contained in a candidate key • Banker-schema = (branch-name, customer-name, banker-name) banker-name branch name branch name customer-name banker-name
BC normal form • Some redundancy in thrid normal form. • Fig 10.12
BC normal form definition A relation schema R is in BCNF with respect to a set F of functional dependencies if for all functional dependencies in F+ of the form XY, where X R and Y R,at least one of the following holds: • X Yis trivial (i.e., Y X) • X is a superkey for R