180 likes | 324 Vues
This document explores the essential concept of data normalization, a process crucial for organizing database systems. Normalization helps validate and refine logical designs to eliminate redundancy and avoid data anomalies. It covers various normal forms—First (1NF), Second (2NF), Third (3NF), and Boyce-Codd Normal Form (BCNF)—focusing on functional dependencies and keys. Key stages of normalization are discussed to facilitate better management of insert, update, and delete operations, thereby enhancing database integrity and performance.
E N D
Normalization Dr. Mario Guimaraes
Data Normalization • Primarily a tool to validate and improve a logical design so that it satisfies certain constraints that avoid unnecessary duplication of data. • The process of decomposing relations with anomalies to produce smaller, well-structured relations. • Primary Objective: Reduce Redundancy,Reduce nulls, • Improve “modify” activities: • insert, • update, • delete, • but not read • Price: degraded query, display, reporting
Functional Dependency and Keys • Functional Dependency: The value of one attribute (the determinant) determines the value of another attribute. • Candidate Key: Each non-key field is functionally dependent on every candidate key.
Normalization – most used • Four most commonly used normal forms are first (1NF), second (2NF) and third (3NF) normal forms, and Boyce–Codd normal form (BCNF). • Based on functional dependencies among the attributes of a relation. • A relation can be normalized to a specific form to prevent possible occurrence of update anomalies.
First Normal Form • No multi-valued attributes. • Every attribute value is atomic. • Why are the following tables not in 1NF Employee (ssn, Name, Salary, Address, ListOfSkills) Department (Did, Dname, ssn)
Second Normal Form • 1NF and every non-key attribute is fully functionally dependent on the primary key. • Every non-key attribute must be defined by the entire key, not by only part of the key. • No partial functional dependencies. Assuming that we have a composite PK (LicensePlate, OwnerSSN) for the Vechicle Table below, why is the table not in 2NF ? Vehicle (LicensePlate, Brand, Model, PurchasePrice, Year, OwnerSSN, OwnerName)
Third Normal Form & BCNF • 2NF and no transitive dependencies (or no functional dependency between non-key attributes = BCNF) Why are the following tables not in 3NF or BCNF ? • Why is Employee [ssn, name, salary, did, dname] • Customer
3NF & BCNF • It is very rare for a Table to be in 3NF and not be in BCNF (violation of BCNF). • Given a Relation R with attributes A, B and C where A and B are together the composite PK, IF A, B -> C and C -> B THEN R is in 3NF and is not in BCNF Example: Student, course -> Instructor Instructor -> Course
Steps in Normalization • 1NF: a table, without multivalued attributes • if not, then decompose • 2NF: 1NF and every non-key attribute is fully functionally dependent on the primary key • if not, then decompose • 3NF: 2NF and no transitive dependencies • if not, then decompose • GENERAL: • Each table should describe a single theme • Modification anomalies are minimized Hint: THE KEY, THE WHOLE KEY AND NOTHING BUT THE KEY
Normalized Tables Must: • Two important properties of decomposition: - Lossless-join property enables us to find any instance of original relation from corresponding instances in the smaller relations. - Dependency preservation property enables us to enforce a constraint on original relation by enforcing some constraint on each of the smaller relations.
4NF • Dependency between attributes (for example, A, B, and C) in a relation, such that for each value of A there is a set of values for B and a set of values for C. However, set of values for B and C are independent of each other.
Matching • 1) A Weak Entity _____ a) skills • 2) A Derived Attribute _____ b) ssn • 3) A Composite Attribute _____ c) Monthly Sales • 4) A Multi-Valued Attribute_____ d) full name • 5) Primary Key _______ e) Section of a Course
Which tables are in 3NF ? Employee (ssn, Name, Salary, pid) (obs.: employee can only work in one project) Employee (ssn, Name, Salary, pid) (obs.: employee can work in many projects) Vehicle (OwnerSSN, OwnerName, OwnerAddress, LicensePlate, Brand, Model, PurchasePrice, Year) (obs.: an owner can have several cars) Project (pid, pname, did) (pid determines did) Project (pid, pname, ssn) (ssn determines pid)
Normalize to 3NF • ClientName -> Address, EmpNo -> Name, Service -> AmountDue • ClientName and Date -> Service, ClientName and Date -> EmpNo
End of Lecture End Of Today’s Lecture.