This article provides an in-depth look at functional dependencies (FDs) within relational databases. It explains the concept of FDs, including the notation and the rules that govern their implications and closure. You will learn how to determine whether a relation satisfies a set of FDs, and the process to compute the closure of a set of attributes. The discussion also covers practical rules such as trivial implication, augmentation, transitivity, and union, along with examples to solidify understanding of how to derive new dependencies from existing ones.
Functional Dependencies (FDs)

Let r(R) be a relation and let t ∈ r. The restriction of t to X ⊆ R, written t[X], is the projection of t onto X.

Example: for R(A B C),

r =  A  B  C
     1  2  3
     2  2  5
     2  3  5
     4  4  5

With t = <2, 2, 5> and BC ⊆ ABC, t[BC] = <2, 5> = {(B:2), (C:5)}.

Let R be a relational schema and let X ⊆ R and Y ⊆ R. X → Y is a functional dependency, or FD. A relation r(R) satisfies the FD X → Y (or X → Y holds) if for any two tuples t1 and t2 in r, t1[X] = t2[X] ⇒ t1[Y] = t2[Y]; equivalently, if for every (sub)tuple s in π_X(r), |σ_{X=s}(π_{XY}(r))| = 1.

In r above, A → C and AB → C hold, while B → C and AC → B do not.

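
The definition above can be checked mechanically. The following is a minimal sketch, assuming a relation is stored as a list of tuples and attributes are single letters; the function name `satisfies` is a hypothetical helper, not something from the slides.

```python
def satisfies(rows, attrs, lhs, rhs):
    """Return True if the relation (rows over attrs) satisfies lhs -> rhs."""
    idx = {a: i for i, a in enumerate(attrs)}
    seen = {}  # maps each X-value to the first Y-value seen with it
    for row in rows:
        x = tuple(row[idx[a]] for a in lhs)
        y = tuple(row[idx[a]] for a in rhs)
        # Two tuples that agree on X must also agree on Y.
        if seen.setdefault(x, y) != y:
            return False
    return True


# The relation r over R(A B C) from the slide.
r = [(1, 2, 3), (2, 2, 5), (2, 3, 5), (4, 4, 5)]

for lhs, rhs in [("A", "C"), ("B", "C"), ("AB", "C"), ("AC", "B")]:
    print(f"{lhs} -> {rhs}:", satisfies(r, "ABC", lhs, rhs))
```

The dictionary plays the role of the selection σ_{X=s}: each X-value may be associated with only one Y-value.
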
FD Implication

Let r(R) be a relation and let F be a set of FDs over R. Then r satisfies F if each FD in F holds for r. F may imply that other FDs also hold: F implies X → Y if X → Y holds for every relation that satisfies F.

Example: F = {A → B, B → C}. The relation r below satisfies F, and F implies A → C.

r =  A  B  C  D
     1  2  3  4
     5  6  7  8
     5  6  7  9
     0  1  2  3

Proof (for any tuples s and t of a relation satisfying F):
1. Let s[A] = t[A]
2. s[A] = t[A] ⇒ s[B] = t[B]    given, A → B
3. s[B] = t[B]                  1 & 2, modus ponens
4. s[B] = t[B] ⇒ s[C] = t[C]    given, B → C
5. s[C] = t[C]                  3 & 4, modus ponens

F+

Let S be a set of attributes. If F is a set of FDs over S, the set of all FDs implied by F is called the closure of F, denoted F+. When we assert an FD X → Y, we mean X → Y ∈ F+.

Rules for computing F+:

(trivial implication) Y ⊆ X ⇒ X → Y
    e.g., Name City → Name
(accumulation) X → Y, W → Z, W ⊆ Y ⇒ X → YZ
    e.g., GuestNr → Name City, Name → RoomNr ⇒ GuestNr → Name City RoomNr
(projection) X → Y, Z ⊆ Y ⇒ X → Z
    e.g., GuestNr → Name City RoomNr ⇒ GuestNr → RoomNr

We compute F+ by a least-fixed-point process.

Sound and Complete Rules

The implication rules for F+ are:
sound: the derived FDs hold for any relation satisfying F;
complete: repeated application of the rules derives all implied FDs.

Proof of Soundness. In each case, s and t are any two tuples of a relation satisfying the premises.
(Y ⊆ X ⇒ X → Y): Let s[X] = t[X]. Then since Y ⊆ X, s[Y] = t[Y].
(X → Y, W → Z, W ⊆ Y ⇒ X → YZ): Let s[X] = t[X]. Then since X → Y, s[Y] = t[Y]. Then since W ⊆ Y, s[W] = t[W], and since W → Z, s[Z] = t[Z]. Now, since s[Y] = t[Y] and s[Z] = t[Z], s[YZ] = t[YZ].
(X → Y, Z ⊆ Y ⇒ X → Z): Let s[X] = t[X]. Then since X → Y, s[Y] = t[Y], and since Z ⊆ Y, s[Z] = t[Z].

Proof of Completeness (basic idea): Assume an FD X → Y is implied by F but cannot be derived by the rules. We can show, however, that X → Y can be derived by trivial implication, accumulation, and projection, which contradicts the assumption.

Example of F+

If the set of attributes is ABC and the set of FDs is F = {A → B, B → C}, then F+ consists of:

Given:
A → B, B → C

Trivial:
A → A, B → B, C → C, AB → AB, AC → AC, BC → BC, ABC → ABC,
AB → A, AB → B, AC → A, AC → C, BC → B, BC → C,
ABC → A, ABC → B, ABC → C, ABC → AB, ABC → AC, ABC → BC

Derived:
A → BC, A → C, A → AB, A → AC, A → ABC, B → BC,
AB → C, AB → AC, AB → BC, AB → ABC,
AC → B, AC → AB, AC → BC, AC → ABC

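
The listing above can be reproduced by brute force. This is a sketch under two assumptions: attributes are single letters, and FDs with an empty side are left out, as on the slide. Rather than applying the three rules one at a time, it uses the attribute-closure shortcut (X → Y is in F+ exactly when Y lies inside the closure of X); the helper names are hypothetical.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure: grow the set until no FD adds anything new."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def nonempty_subsets(attrs):
    """Yield every nonempty subset of attrs as a sorted string."""
    items = sorted(attrs)
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            yield "".join(combo)


def fd_closure(schema, fds):
    """Enumerate F+ as (lhs, rhs) pairs; exponential in the schema size."""
    return {(x, y)
            for x in nonempty_subsets(schema)
            for y in nonempty_subsets(closure(x, fds))}


F = [("A", "B"), ("B", "C")]
F_plus = fd_closure("ABC", F)
print(len(F_plus))  # 35, the number of FDs listed on the slide
```

The exponential blow-up even for three attributes is the reason F+ is rarely materialized in practice.
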
Additional FD-Implication Rules

(augmentation) X → Y ⇒ XZ → YZ
    e.g., RoomNr → Cost ⇒ RoomNr NrDays → Cost NrDays
(transitivity) X → Y, Y → Z ⇒ X → Z
    e.g., GuestNr → Name, Name → RoomNr ⇒ GuestNr → RoomNr
(union) X → Y, X → Z ⇒ X → YZ
    e.g., RoomNr → NrBeds, RoomNr → Cost ⇒ RoomNr → NrBeds Cost

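
As a quick sanity check, each example conclusion above can be confirmed as implied by its premises with an attribute-closure test. This is only a sketch: it assumes FDs are stored as pairs of Python sets of attribute names, and `closure` and `implies` are hypothetical helpers.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds (pairs of sets)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def implies(fds, lhs, rhs):
    """True if fds implies lhs -> rhs."""
    return rhs <= closure(lhs, fds)


# Augmentation: RoomNr -> Cost gives RoomNr NrDays -> Cost NrDays.
augmentation = implies([({"RoomNr"}, {"Cost"})],
                       {"RoomNr", "NrDays"}, {"Cost", "NrDays"})

# Transitivity: GuestNr -> Name, Name -> RoomNr gives GuestNr -> RoomNr.
transitivity = implies([({"GuestNr"}, {"Name"}), ({"Name"}, {"RoomNr"})],
                       {"GuestNr"}, {"RoomNr"})

# Union: RoomNr -> NrBeds, RoomNr -> Cost gives RoomNr -> NrBeds Cost.
union = implies([({"RoomNr"}, {"NrBeds"}), ({"RoomNr"}, {"Cost"})],
                {"RoomNr"}, {"NrBeds", "Cost"})

print(augmentation, transitivity, union)
```

All three checks succeed, which is consistent with these rules being conveniences rather than additions to what can be derived.
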
Checking for X → Y ∈ F+

• Generate F+ and see if X → Y is present (expensive)
• Derive X → Y from F, or determine that it's not derivable
• Derivation sequence for X → Y: a sequence of FDs
  • Each FD is given in F or follows by a sound derivation rule
  • X → Y is the last FD in the sequence
• Examples: R = ABCD, F = {A → B, B → C}, AD → C ∈ F+?, BC → A ∈ F+?

AD → C:
1. A → B      given
2. B → C      given
3. A → C      transitivity, 1 & 2
4. AD → CD    augmentation, 3
5. AD → C     projection, 4

BC → A:
1. BC → B     trivial implication
2. B → C      given
3. BC → C     transitivity, 1 & 2
. . .
How do we know we cannot derive BC → A?

TAP Derivation Sequence

• A particular derivation sequence always works!
• List the given FDs
• T: Trivial Implication
• A: Accumulation (repeated zero or more times with FDs in F)
• P: Projection (if needed)
• Examples: R = ABCD, F = {A → B, B → C}, AD → C ∈ F+?, BC → A ∈ F+?

BC → A:
1. A → B      given
2. B → C      given
3. BC → BC    T
How do we know we cannot derive BC → A? Accumulation yields nothing more, and projection cannot yield A on the rhs, and BC → A ∈ F+ iff there is a TAP derivation sequence for BC → A.

AD → C:
1. A → B      given
2. B → C      given
3. AD → AD    T
4. AD → ADB   A
5. AD → ADBC  A
6. AD → C     P

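
The TAP recipe is mechanical enough to script. The sketch below assumes single-letter attributes and FDs given as (lhs, rhs) string pairs; `tap_derivation` is a hypothetical name, and the step labels mirror the slide's T, A, and P.

```python
def tap_derivation(x, y, fds):
    """Return the TAP steps deriving x -> y, or None if x -> y is not in F+."""
    steps = [f"{lhs} -> {rhs}  given" for lhs, rhs in fds]
    acc = list(dict.fromkeys(x))  # accumulated rhs, in order of arrival
    steps.append(f"{x} -> {''.join(acc)}  T")
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            new = [a for a in dict.fromkeys(rhs) if a not in acc]
            # Accumulation applies when lhs is inside the rhs built so far.
            if set(lhs) <= set(acc) and new:
                acc.extend(new)
                steps.append(f"{x} -> {''.join(acc)}  A")
                changed = True
    if not set(y) <= set(acc):
        return None  # projection cannot produce attributes never accumulated
    if set(y) != set(acc):
        steps.append(f"{x} -> {y}  P")
    return steps


F = [("A", "B"), ("B", "C")]

for i, step in enumerate(tap_derivation("AD", "C", F), start=1):
    print(f"{i}. {step}")

print(tap_derivation("BC", "A", F))  # None: no TAP sequence exists
```

For AD → C this prints the same six steps as the slide; for BC → A accumulation stalls at BC, so the function reports that no derivation exists.
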
X+: Closure of a Set of Attributes

• X+ = maximal accumulation in a TAP derivation sequence starting with X.
• Algorithm for X+ given a set of FDs F:
  1. Start with X+ = X.
  2. If Y → Z ∈ F and Y ⊆ X+, X+ becomes X+ ∪ Z.
  3. Repeat 2 until no more changes to X+ (least fixed point).
• Examples: R = ABCD, F = {A → B, B → C}
  AD+ = ABCD
  BC+ = BC
  BD+ = BCD
  D+ = D
  A+ = ABC

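
The three steps above translate almost line for line into code. A minimal sketch, assuming single-letter attributes and FDs as (lhs, rhs) string pairs; the function name `closure` is a hypothetical choice.

```python
def closure(attrs, fds):
    """Compute X+ for the attribute set attrs under the FDs in fds."""
    result = set(attrs)                      # step 1: X+ = X
    changed = True
    while changed:                           # step 3: repeat until stable
        changed = False
        for lhs, rhs in fds:
            # step 2: if Y -> Z is in F and Y is inside X+, add Z
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


F = [("A", "B"), ("B", "C")]

for x in ["AD", "BC", "BD", "D", "A"]:
    print(f"{x}+ =", "".join(sorted(closure(x, F))))
```

This straightforward loop rescans F after every change, so it is quadratic in the worst case; it favors clarity over speed.
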
X → Y ∈ F+ iff Y ⊆ X+

• Significant observation!
• X → Y ∈ F+ looks like a problem requiring exponential time
• BUT has a polynomial-time solution (linear with well-chosen data structures)
• This is an example of the essence of good computer science.

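
One way to get the linear behavior mentioned above is to keep, for each FD, a counter of left-hand-side attributes not yet in X+, so that each FD fires exactly when its counter reaches zero. The sketch below is one common counter-based formulation, not necessarily the data structure the slides have in mind; `closure_linear` and `implies` are hypothetical names.

```python
from collections import defaultdict


def closure_linear(attrs, fds):
    """Attribute closure in time linear in the total size of the FDs."""
    result = set(attrs)
    worklist = list(result)      # attributes added but not yet processed
    remaining = []               # per FD: lhs attributes not yet in the closure
    uses = defaultdict(list)     # attribute -> indices of FDs with it on the lhs

    def add_all(rhs):
        for b in rhs:
            if b not in result:
                result.add(b)
                worklist.append(b)

    for i, (lhs, rhs) in enumerate(fds):
        lhs_set = set(lhs)
        remaining.append(len(lhs_set))
        for a in lhs_set:
            uses[a].append(i)
        if not lhs_set:          # an FD with an empty lhs fires unconditionally
            add_all(rhs)

    while worklist:
        a = worklist.pop()       # each attribute is processed at most once
        for i in uses[a]:
            remaining[i] -= 1
            if remaining[i] == 0:
                add_all(fds[i][1])
    return result


def implies(fds, x, y):
    """X -> Y is in F+ iff Y is a subset of X+."""
    return set(y) <= closure_linear(x, fds)


F = [("A", "B"), ("B", "C")]
print(implies(F, "AD", "C"))   # True
print(implies(F, "BC", "A"))   # False
```

Because every attribute enters the worklist at most once and every lhs occurrence is visited at most once, the total work is proportional to the size of F plus the size of X.
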
X+ and Hypergraph Reachability

To test X → Y ∈ F+, mark the vertices in X and see if the vertices in Y are reachable following directed edges.

[Figure: hypergraph of the FDs in F]

A → D ∈ F+? Yes
A → G ∈ F+? No
E → CD ∈ F+? Yes
A → BH ∈ F+? No
AE → HG ∈ F+? Yes
. . .

FD Equivalence

• Two sets of FDs F & G are equivalent, written F ≡ G, if F implies each FD in G and conversely.
• If F ≡ G, then F+ = G+.

Example: F = {A → B, AB → C, C → D}, G = {A → BC, C → D, A → D}
In F, A+ = ABCD ⇒ A → BC & A → D, and C+ = CD ⇒ C → D. So F implies G.
In G, A+ = ABCD ⇒ A → B, AB+ = ABCD ⇒ AB → C, and C+ = CD ⇒ C → D. So G implies F.

Why F ≡ G gives F+ = G+:
F ≡ G ⇒ F implies G and G implies F ⇒ (G+ ⊆ F+ and F+ ⊆ G+) ⇒ F+ = G+

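
The equivalence test reduces to a handful of closure computations, one per FD on each side. A minimal sketch, assuming single-letter attributes and (lhs, rhs) string pairs; `covers` and `equivalent` are hypothetical helper names.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def covers(f, g):
    """True if f implies every FD in g."""
    return all(set(rhs) <= closure(lhs, f) for lhs, rhs in g)


def equivalent(f, g):
    """True if f and g imply each other."""
    return covers(f, g) and covers(g, f)


F = [("A", "B"), ("AB", "C"), ("C", "D")]
G = [("A", "BC"), ("C", "D"), ("A", "D")]

print(equivalent(F, G))             # True, as worked out on the slide
print(equivalent(F, [("A", "B")]))  # False: A -> B alone does not give C -> D
```

Note that neither F+ nor G+ is ever built; comparing the two closures directly would be exponential.
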
Keys and FDs

Let U be a set of attributes, and let F be a set of FDs over U. Let R ⊆ U be a relational schema. A subset K of R (K need not be a proper subset of R) is a superkey of R if K → R ∈ F+, and is additionally a candidate key (minimal key) of R if there does not exist a proper subset K′ of K such that K′ → R ∈ F+.

Example: U = ABCDE and F = {A → B, B → A, AB → C, D → BC}.

Schema    Candidate Keys
AB        A, B
CE        CE
ABCD      D
ABCDE     DE

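
The table above can be reproduced by a brute-force search over subsets of each schema, smallest first. This is only a sketch: it is exponential in the schema size, assumes single-letter attributes, ignores the empty set as a key, and uses the hypothetical helper names `closure` and `candidate_keys`.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(schema, fds):
    """Return the candidate keys of schema as sorted strings."""
    target = set(schema)
    keys = []
    # Trying smaller subsets first guarantees every key found is minimal.
    for size in range(1, len(target) + 1):
        for combo in combinations(sorted(target), size):
            cand = set(combo)
            if any(key <= cand for key in keys):
                continue  # contains a smaller key, so not minimal
            if target <= closure(cand, fds):  # K -> R is in F+
                keys.append(cand)
    return ["".join(sorted(k)) for k in keys]


F = [("A", "B"), ("B", "A"), ("AB", "C"), ("D", "BC")]

for schema in ["AB", "CE", "ABCD", "ABCDE"]:
    print(schema, candidate_keys(schema, F))
```

The closure is computed with all of F over U, even when the schema is a proper subset of U, which matches the definition's requirement that K → R be in F+.
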