1 / 136

PART 2

PART 2. RELATIONAL DATABASES. Chapter 7 Relational-Database Design normalization of relational schemas. Main Contents in This Chapter. VII-1 Why normalization needed? pitfalls in relational-database design (§7.1 ) principles of relation normalization

cuyler
Télécharger la présentation

PART 2

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. PART 2 RELATIONAL DATABASES

  2. Chapter 7 Relational-Database Designnormalization of relational schemas

  3. Main Contents in This Chapter • VII-1 Why normalization needed? • pitfalls in relational-database design (§7.1 ) • principles of relation normalization • VII- 2 Functional dependency (§7.4) • VII- 3 Definitions of normal forms • the first normal form(§7.2 ), second normal form, third normal form(§7.3.4 ), BCNF(§7.3.2 ) • VII- 4 Decomposition properties (§7.4.4, §7.4.5) • VII- 5 3NF decomposition (§7.5.2) • VII- 6 BCNF decomposition (§7.5.1) Database System Concepts - Chapter 7 Relational-Database Design -

  4. VII-1 Why Normalization Needed ? VII-1-1 Pitfalls in Relational-Database design (§7.1) • As shown in Fig. 7.0.1, logical DBS design consists of • initial relational schema generating (§2.9) • relational schema normalizing • A bad DB design, i.e. schema not being normalized well, may result in • repetition of information • inability to represent certain information Database System Concepts - Chapter 7 Relational-Database Design -

  5. Application area/problem in real world requirements analysis Specification of functional requirements DBMS independent conceptual DBS design Conceptual DBS schema , i.e. E-R schema Initial relational schema generating(§2.9) DBMS dependent logical DBS design Relational schema normalizing (chapter 7) logical DBS schema , i.e. relational data schema physical DBS design physical DBS schema about physical storage structure and access method Fig.7.0.1 DBS design

  6. VII-1-1 Pitfalls in Relational-Database design (cont.) Fig.7.0.2 Case study used in chapter 7 Database System Concepts - Chapter 7 Relational-Database Design - return

  7. VII-1-1 Pitfalls in Relational-Database design (cont.) • For all schema concerning loans in Fig.7.0.2 • Branch-schema, Borrower-schema, Loan-schema • combine these three relations into one single relation • Lending-schema = (branch-name, branch-city, assets, customer-name, loan-number, amount) • relation lending is shown in Fig.7.1 Database System Concepts - Chapter 7 Relational-Database Design -

  8. t1 Perryridge, Horseneck, 1700000, Adams, L-31, 1500 New-York, Long-island, 200000, null, null, null t2 Fig.7.1 Sample lending(Lending-schema) relation

  9. VII-1-1 Pitfalls in Relational-Database design (cont.) • Inserting problem and information redundancy • adding a new loan to the DB • the loan is made by the Perryridge branch to Adams in the amount of $1500, and loan-number is L-31 • tuple t1 = (Perryridge, Horseneck, 1700000, Adams, L-31, 1500) is inserted into DB • data for branch-name, branch-city, assets are repeated for each loan in the 3rd row, the10th row, and the last row that a branch“Perryridge”makes, space is wasted Database System Concepts - Chapter 7 Relational-Database Design -

  10. VII-1-1 Pitfalls in Relational-Database design (cont.) • Deleting problem • e.g. the branch Perryridge is canceled, all loans in this branch should then be removed • in Fig 7.1, every tuples containing Perryridge branch should be deleted • the 3rd row and the10th row Database System Concepts - Chapter 7 Relational-Database Design -

  11. VII-1-1 Pitfalls in Relational-Database design (cont.) • Updating problem and information redundancy • information redundancy in DB complicates updating, introducing possibility of inconsistency of assets value • e.g. changeassets of Perryridge branch from 1700000 to 1900000, • in Fig7.1, every tuples belonging to Perryridgebranch should be updated Database System Concepts - Chapter 7 Relational-Database Design -

  12. VII-1-1 Pitfalls in Relational-Database design (cont.) • Information Representation Problem • to represent directly information about a new-openedbranch (branch-name, branch-city, assets) in which there no exists loan, a tuple containing null vaules, such as t2 = (New-York, Long-island, 200000, null, null, null) , is inserted into the relation lending • null values in DB complicate data handling in DBS • Why ? • lending(branch-name, branch-city, assets, customer-name, loan-number, amount) includes two types of information about only bank and information about loan Database System Concepts - Chapter 7 Relational-Database Design -

  13. VII-1-1Pitfalls in Relational-Database design (cont.) • in E-R diagram of the banking enterprise, the relationship from bank to loan is one-to-many, i.e. a bank branch may make several loans • An improved design: • decompose the relation schema Lending-schema into: Branch-schema = (branch-name, branch-city,assets) Loan-info-schema = (customer-name, loan-number, branch-name, amount) Database System Concepts - Chapter 7 Relational-Database Design -

  14. VII-1-2 Principles of Relation Normalization • 在逻辑DBS设计过程中,将概念 DB设计结果E—R图进行转换,得到面向特定应用领域的初始关系模式集 • 这些初始关系模式集中可能存在多种(作为完整性约束的)关系模式属性间的数据依赖 (Data Dependencies) 关系 • 函数依赖 (functional dependencies, FD, §7.4) • 多值依赖 (Multivalued Dependencies, MVD, §7.6, Appendix C/C.1 ) • 连接依赖 (Join Dependencies, JD, Appendix C/C.2) Database System Concepts - Chapter 7 Relational-Database Design -

  15. VII-1-2 Principles of Relation Normalization (cont.) • 如果直接根据初始关系模式构造DBS,由于初始关系模式中数据依赖关系的存在,可能会违反DB的完整性约束,导致DBS使用过程中出现如下问题,影响DBS的正确性、性能、效率 • 数据冗余问题、插入问题、更新问题、删除问题(pitfalls,§7.1) Database System Concepts - Chapter 7 Relational-Database Design -

  16. VII-1-2 Principles of Relation Normalization (cont.) • 因此,对初始关系模式集,需要根据关系规范化理论,在保证关系模式的 • 函数无损连接性(lossless join),和/或 • 函数依赖保持性 (dependency preservation) 约束前提下,对关系模式集进行规范化处理——等价变换/模式分解 • 关系模式规范化主要步骤为 • 根据函数依赖的Armstrong’s 公理系统 ( §7.4.1 )和多值依赖 的公理系统 (Appendix C/C.1),从初始关系模式集中已知的函数依赖和多值依赖出发,推导出初始关系模式集中所有的函数依赖( §7.4.1/7.4.2/7.4.3)和多值依赖 Database System Concepts - Chapter 7 Relational-Database Design -

  17. VII-1-2 Principles of Relation Normalization (cont.) • 对具有函数依赖和多值依赖的初始关系模式集,采用 • 模式分解算法 , , 对其进行(等价)分解和变换,将其转换为各种范式形式,包括: • 1NF(§7.2) 、 2NF (Exercise 7.26) 、 BCNF(§7.3.2 ) 、 3NF (§7.3.4) 、 4NF(§7.6.2) , 以消除模式集中的函数依赖和多值依赖带来的负面影响, 保证数据库系统的完整性 • 关系模式规范化处理的基本要求为: • 静态关系具有第一范式形式 • (理论上)动态关系最好具有第三范式形式 Database System Concepts - Chapter 7 Relational-Database Design -

  18. VII-1-2 Principles of Relation Normalization (cont.) • 3种数据依赖间的关系 • 函数依赖是特殊的多值依赖 • 多值依赖又是连接依赖的特例 • 范式1NF、2NF、3NF、BCNF可以看作由符合范式要求的各种关系模式组成的关系模式的集合 e.g. 1NF = { R | R满足第一范式的定义} • 各种范式间的关系,参见Fig.7.A.1 1NF 2NF 3NF BCNF 4NF 5NF Database System Concepts - Chapter 7 Relational-Database Design -

  19. 1NF 消除非主属性对键的部分函数依赖 2NF 函数依赖保持性 消除非主属性对键的传递函数依赖 3NF 消除主属性对键的部分和传递函数依赖 BCNF 消除非平凡且非函数依赖的多值依赖 4NF 无损连接性 消除非平凡连接依赖 5NF Fig.7.0.3 关系范式间相互关系

  20. VII-1-2 Principles of Relation Normalization (cont.) • 给定一个关系模式,可以采用规范化算法将其转换为1NF、2NF、3NF、BCNF • 对连接依赖和第五范式,无相应的模式规范化算法 Database System Concepts - Chapter 7 Relational-Database Design -

  21. VII-2 Functional Dependencies (§7.3/7.4 ) • Function: f: X →Y, x∈ X, y∈ Y, y = f (x) e.g. y = 2x • for x1, x2∈X, if x1= x2, then f (x1) = f (x2) • Contents in section VII-2 • concepts about FD, §7.3.1 • Armstrong Axioms to derive all implied FD, i.e. closure of FD, §7.4.1 • an efficient algorithm to compute the closure§7.4.1 • “minimal” closure of FD, §7.4.3 Database System Concepts - Chapter 7 Relational-Database Design -

  22. VII-2-1 Basic Concepts • Definition1. Functional dependency(FD) holds on R /*函数依赖FD在关系模式R上成立/保持 */ Let R be a relation schema, and   R ,   R , thefunctional dependency  holds onschemaR if and only if for anylegalrelationsr(R), whenever any two tuplestiand tj in ragree on the attributes , they also agree on the attributes , that is, ti[] = tj []  ti[ ] = tj [ ] Database System Concepts - Chapter 7 Relational-Database Design -

  23. VII-2-1 Basic Concepts (cont.) • Keys in relational schema can be defined in terms of FD • K is a superkey for relation schema R, if and only if K  R • K is a candidate key for R, if and only if • K  R, and • for no   K,   R Database System Concepts - Chapter 7 Relational-Database Design -

  24. VII-2-1 Basic Concepts (cont.) • Definition2. (A particular) r(R) satisfy FD, or FD is satisfied by r(R) /*关系r(R)满足函数依赖集FD, FD 被r满足*/ Given FD ={ } holding on R, and a relation r(R) on R, • if r is legal under functional dependency set FD, r is said to satisfy FD • note: r is legal under FD={  } means • for ti, tj ∈r(R), if ti[]= tj [], then ti[]= tj [] • FD requires that the values for a certain set of attributes determines uniquely the value for another set of attributes Database System Concepts - Chapter 7 Relational-Database Design -

  25. A B C 1 4 2 3 5 6 3 4 6 7 3 8 9 1 0 Example One • Given relation r(R) shown below, which FD is satisfied by r • A. A →B B. AC → B C. BC →A D. B → C Fig. 7.0.4 t1 t2 t3 t4 t5 Database System Concepts - Chapter 7 Relational-Database Design -

  26. Example One (cont.) • t2[A]=t3[A]=3, t2[B]=5 ≠ t3[B]=4, A →B is not satisfied • t2[AC]=t3[AC]=36, t2[B]=5 ≠ t3[B]=4, AC →B is not satisfied • BC →A is satisfied • t1[B]=t3[B]=4, t1[C]=2 ≠ t3[C]=6, B →C is not satisfied Database System Concepts - Chapter 7 Relational-Database Design -

  27. Example Two • Consider the schema R=(employee_ID, date, turnover per-day, department_name, manager) that describes the information about the turnover per-day (日营业额) for each employee everyday, the department that each employee works at, and the manager of the department the employee works at. it is assumed that • at every day, each employee has only one turnover per-day • each employee works at only one department • each department is managed by only one manager • According to the descriptions mentioned above, list all the functional dependencies that hold on R Database System Concepts - Chapter 7 Relational-Database Design -

  28. Example Two (cont.) • Answer • F = { employee_ID, date → turnover per-day, employee_ID →department_name, department_name →manager } Database System Concepts - Chapter 7 Relational-Database Design -

  29. FD holds on R vs FD is satisfied by r(R) • For a schema R, there may be more than one relation instance r(R), i.e. r1(R) , r2(R) , r3(R) ,…, rm(R) , defined on R • e.g. consider R= (A, B, C, D) , and with respect to the instances r1(R) and r2(R) in Fig.7.5 • Relation r(R) satisfies  vs.  holds on schema R • if  holds on R, then every legalr(R) satisfies this R • but for schema R and, if only some ri(R) satisfies R,  may not hold on R. • e.g. in Fig.7.5, AC and ABD are satisfied by r1(R) , but AC is not satisfied by r2(R) , so AC does not holds on R Database System Concepts - Chapter 7 Relational-Database Design -

  30. FD1= {AC, ABD} , satisfied by r1 FD2= {ABD} , satisfied by r2 r1 r2 t2 t2 b3 b3 t6 AC does not hold on R Fig.7.5 instance r1 and r2 defined on schema R

  31. VII-2-1 Basic Concepts (cont.) • E.g. True or false ? • for a relation r(R) defined on schema R, if r satisfies functional dependency FD ={  }, then FD holds on schema R • answer: false • With respect to Fig. 2.4 • in general, customer-street  customer-city does not hold on customer schema, because two different cities may have the same street • but customer-street  customer-city is satisfied by relation instancecustomer as shown in Fig.2.4 Database System Concepts - Chapter 7 Relational-Database Design -

  32. Fig. 2.4 The customerRelation

  33. VII-2-1 Basic Concepts (cont.) • Functional dependencies allow us to express constraints that cannot be expressed using superkeys • E.g. consider the schema in Fig.7.2: • bor_loan= (customer-id, loan-number, amount). • with respect to the primary key, we have • customer-id, loan-number  amount • but the following FD also holds on bor_loan loan-numbercustomer-name Database System Concepts - Chapter 7 Relational-Database Design -

  34. VII-2-1 Basic Concepts (cont.) • Def. Trivial FD (平凡依赖) A functional dependency   is trivial , if for any schema R, where R,  R,   is satisfied by all relations/instances r on R or:   on R is trivial if it holds on any schema R, where R,  R • E.g. A  A is trivial, customer-name customer-name • In general,   is trivial if    • e.g. customer-name, loan-number customer-name ift1[customer-name, loan-number ]= t2[customer-name, loan-number ] then t1[customer-name]= t2[customer-name] Database System Concepts - Chapter 7 Relational-Database Design -

  35. a functional dependency  is transitive if:  ,(  ),   ,  ; and  is called transitive dependent on  VII-2-1 Basic Concepts (cont.) • Def. Transitive dependency (传递函数依赖) • E.g. Student(sno, sname, address, depart) • for transitive dependency sno→address , there are snosname, sname address Database System Concepts - Chapter 7 Relational-Database Design -

  36. VII-2-1 Basic Concepts (cont.) • Def.Partial dependency (部分函数依赖) • a functional dependency   is partial if there is a proper subset  of , i.e.  , such that   ; and  is partially dependent on  • /*非最小化/冗余 • E.g. Student(sno, sname, address, depart) • for partial dependency (sno, sname)→address , there are sname→address, sno→address Database System Concepts - Chapter 7 Relational-Database Design -

  37. VII-2-2 Closure of a Set of Functional Dependencies (cont.) (§7.4.1) • Given a set F set of functional dependencies, there may be some other functional dependencies that are logically implied by F • for example, R(A, B, C) if F ={A  B, B  C}, then it is inferred that A  C • e.g. Fig. 7.0.5 in next slide • For a given FD set F on R, how to derive all the other functional dependencies on R ? • closures of functional dependencies Database System Concepts - Chapter 7 Relational-Database Design -

  38. A B C 1 4 2 3 5 6 4 4 2 7 3 8 9 1 0 F ={A  B , B  C }; A  C is logically implied by F t1 t2 t3 t4 t5 Fig. 7.0.5

  39. VII-2-2 Closure of a Set of Functional Dependencies (cont.) • Def. Given a schema R, a functional dependency f on R is logically implied by a set of FD F on R , if every instance r(R) that satisfies F also satisfies f • e.g. Fig. 7.0.5 • Def. Given a set F of functional dependencies, the closure of F, denoted asF+ • F+ = { f | f is logically implied by F } • e.g. in Fig. 7.0.5, {A  B , B  C }+ = {A  B , B  C, A  C } Database System Concepts - Chapter 7 Relational-Database Design -

  40. VII-2-2 Closure of a Set of Functional Dependencies (cont.) • F+ can be derived on the basis of Armstrong’s axioms • Three basic Armstrong’s Axioms e.g. • if   , then   • reflexivity, 自反律 • if  , then    • augmentation, 增广律 • if  , and   , then    • transitivity,传递律 • For the proof of reflexivity and augmentation, refer to Appendix A Database System Concepts - Chapter 7 Relational-Database Design -

  41. VII-2-2 Closure of a Set of Functional Dependencies (cont.) • Other three additive Armstrong’s axioms • union: if   holds and   holds, then   holds • decomposition: if   holds, then  holds and   holds • pseudotransitivity: if   holds and   δ holds, then  δ holds • These three additive rules can be derived from the three basic Armstrong’s Axioms mentioned above Database System Concepts - Chapter 7 Relational-Database Design -

  42. An Example of Functional Dependency • Question Which rule about functional dependencies shown below is right • A. if  then  • B if AC , BCD then ABD • C. if ABC then BC • D if , then  • Answer: B Database System Concepts - Chapter 7 Relational-Database Design -

  43. An Example - Closure of Functional Dependency • R = (A, B, C, G, H, I) F = { A  BA  CCG  HCG  IB  H } • Some members of F+ • A H • by transitivity from A B andB H Database System Concepts - Chapter 7 Relational-Database Design -

  44. An Example - Closure of Functional Dependency (cont.) • AG I • by augmentingA C with G, to get AG CG • then by transitivity with AG CG andCG I • CG HI • by union with CG H and CG I Database System Concepts - Chapter 7 Relational-Database Design -

  45. F+ = Frepeatfor each functional dependency f in F+ apply reflexivity and augmentation rules on fadd the resulting functional dependencies to F+for each pair of functional dependencies f1and f2 in F+iff1 and f2 can be combined using transitivitythen add the resulting functional dependency to F+until F+ does not change any further VII-2-2 Closure of a Set of Functional Dependencies (cont.) Fig.7.8 Algorithm to compute F+ Database System Concepts - Chapter 7 Relational-Database Design -

  46. VII-2-3 Closure of Attributes (§7.4.2) • Def. An attribute B is functionally determined by  if B • Def. Given a set of attributes , the closure of  under F , denoted by +, is { |  is functionally determined by under F} •  is functionally determined byunder F , if and only if ∈F+ •   +is in F+ • E.g. R(A, B, C, D) and F={A  B, B  C} • F+= {A  B, B  C, A  C, …} • (A) += (ABC), A  ABC ∈F+ Database System Concepts - Chapter 7 Relational-Database Design -

  47. VII-2-3 Closure of Attributes (cont.) Input:  , F Output:  + result :=  ;while (changes to result) do for each inF do begin if  result /* result =(, …) then result := result end Fig.7.9 An efficient algorithm to compute + underF Database System Concepts - Chapter 7 Relational-Database Design -

  48. An Example • R = (A, B, C, G, H, I) • F = {A  B, A  C, CG  H, CG  IB  H } • Computing (AG)+ 1. result = AG /* or denoted as {A, G } 2. result = ABCG /* A  C , A  B 3. result = ABCGH /* CG  H 4. result = ABCGHI /* CG  I /* or { A, B, C, G, H, I } Database System Concepts - Chapter 7 Relational-Database Design -

  49. An Example(cont.) • (AG)+ = R, AG is a superkey of R • Is AG a candidate key? • step1. is AG a super key? • does AG R? == Is (AG)+ = R • yes • step2. is any subset of AG a superkey? • does AR? == is (A)+ = R ?, no • does GR? == is (G)+ = R ?, no • so, AG is a candidate key Database System Concepts - Chapter 7 Relational-Database Design -

  50. VII-2-3 Closure of Attributes (cont.) • Usage-I. Testing for superkey To test whether  is a superkey of Runder F, i.e. whether R, we check if R = + • Usage-II. Testing functional dependencies To determine whether or not    holds on R under F, we check if   + • Usage-III. Computing closure F+ for each γ R, computeγ+={S} under F; for each S  γ+, output γS as a functional dependency in F+ Database System Concepts - Chapter 7 Relational-Database Design -

More Related