Database NormalizationSept 15th 2009 Presents Arie d. jones (aj) Principal Technology manager
Why you should listen to AJ • Graduate degree: Computational Physics • 20+ years of programming experience • 10+ years experience in database platforms • SQL Server • Sybase • Oracle • MySQL • Postgres • DB2 • 4 time author
What Is Normalization? • Eliminating redundant data within the database to make it as efficient as possible • Efficiency is usually considered lowered complexity • ** Mulligan: Efficiency does not always equal better performance, nor does it necessarily equate to efficient query processing.
Normal Forms • Basis of normalization strategy • Created by E.F. Coddwaaaaay back when…. • 1st Normal Form (1NF) • 2nd Normal Form (2NF) • 3RD Normal Form (3NF) • Nth Normal Form
First Normal Form (1NF) • Every entity has an attribute or set of attributes that constitute a primary key • Natural versus Surrogate Keys • Each attribute can have only one value • NO single instance may have multiple values for a given attribute
Example Not 1NF Repeated Data Columns
1NF We have non-unique values so we apply a primary key
1NF So now each row is uniquely distinct No repeating attributes
Second Normal Form (2NF) Builds on 1NF Plus, a primary key uniquely identifies the non-key attributes in the row. So basically we are trying to put attributes that are not directly reliant on the primary key are moved out of the table. These types of relationships are referred to as Partial Dependencies
1NF – 2NF Non dependant Columns
Third Normal Form (3NF) • 3NF is very much like 2NF • You want to now remove Transitive Dependencies • The are similar to partial dependencies but rely on another non-key attribute that is dependant on the primary key • So removing the attribute and putting it in a separate entity solves the issue…just like 2NF
NORMALIZATION VS. DENORMALIZATIONWhat’s the big deal? Normalization means reducing duplicate data by using Denormalizationmeans the opposite, which is deliberately duplicating data in one or more structures. Each has specific effects on your databases performance
NORMALIZATION VS. DENORMALIZATIONWhat’s the big deal? Normalization improves entering data into the tables because it reduces the amount of data that needs to be updated. Denormalizationimproves the selecting of tables because it reduces the number of tables needed to be accessed by any particular query
Normalization versus Denormalization • When do you use them? • Dependent upon the system • Normalization • Highly transactional systems • Space is a consideration • Width of rows a consideration • Denormalization • Reporting systems or rarely updated • Data warehousing -> throw these rules out the window
Does this matter? • Does this equate with current technologies • Cloud computing • Entity Framework • SQL Server 2011 • Answer: YES!
Conclusion • Slides can be found on my blog • http://www.programmersedge.com • Email: email@example.com • Questions & Open Discussion