1 / 15

Normalisation

Normalisation. Normalisation is a process whereby the tables in a database are optimised to remove the potential for redundancy. Two main problems may arise if this is not done: Repeated data makes a database bigger.

nyoko
Télécharger la présentation

Normalisation

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Normalisation • Normalisation is a process whereby the tables in a database are optimised to remove the potential for redundancy. Two main problems may arise if this is not done: • Repeated data makes a database bigger. • Multiple instances of the same values make maintaining the data more difficult and can create anomalies. Ms. D. Anderson

  2. Objectives • Normalisation aims to remove this redundancy by applying rules in a series of stages, splitting tables and creating relationships between unique identifiers (keys), to ensure that the database table structure is efficient, but data can still be accurately manipulated. Ms. D. Anderson

  3. Database Terminology • When we were looking at the physical database we use certain terms such as TABLE, COLUMN, ROW and FIELD. Discussion of database theory uses different terms: • Relation/ Entity - These are the same as a table. • Attributes - These are similar to columns in that they describe the type of data stored. • Domain - This is values within the same attribute (a collection of fields that exist in the same column). • Tuple - This is a record similar to a row. Ms. D. Anderson

  4. Normalisation Stages • There are several stages of normalisation that a database structure can be subject to, each with rules that constrain the database further and each creating what is called a Normal Form. These are, in order: • First Normal Form (1NF) • Second Normal Form (2NF) • Third Normal Form (3NF) • Boyce Codd Normal Form (BCNF) • Fourth Normal Form (4NF) • Fifth Normal Form (5NF) Ms. D. Anderson

  5. To demonstrate the stages of normalisation we have a example scenario • The setting is a university or college. • Essentially, a student is enrolled onto a programme and may take several modules as part of this programme. We can assume here that a module is only delivered by one lecturer. • We start with a copy of the programme Ms. D. Anderson

  6. Typical college or university data with regard to course details Ms. D. Anderson

  7. Normalisation rules a) A relation is said to be in first normal form (1NF) if it has no repeating groups and no multi-valued attributes. b) A relation is said to be in second normal form (2NF) if every non-key attribute depends fully on the primary key. c) A relation is said to be in third normal form (3NF) if there are no dependencies on non-key fields Ms. D. Anderson

  8. Initial Observations • The primary key for this data would be ‘Student No’. Student ‘Name’ could also have been a primary key except for the fact that names are not always unique. All too often we find that two students can have identical names (often in the same class). ‘Student No’ is always a safer approach as the registration department has access to a much greater set of data and can allocate a unique ‘Student No’ accordingly. • The table clearly shows that the fields Module, Module name and Lecturer form a repeating group for a single value of ‘Student No’. Ms. D. Anderson

  9. 1NF – first normal form To go to 1NF, we will therefore split the table into two parts: – (Student No, Student Name, Programme, Programme Duration); – (Student No, Module No, Module Name, Lecturer). Note that ‘Student No’ is an acceptable key for the non-repeating portion of the table. The second table requires the use of a composite key this time using both ‘Student No’ and ‘Module No’ to uniquely identify a row of data. Ms. D. Anderson

  10. 1NF Tables Ms. D. Anderson

  11. 2NF – second normal form To move to second normal form, we are interested in the removal of partial dependencies. As the first table has no composite key, then it cannot contain partial dependencies and hence we can say that this is currently in 2NF: – (Student No, Student Name, Programme, Programme Duration). The second table does contain partial dependencies that need to be eliminated. In this case both ‘Module Name’ and ‘Lecturer’ are dependent on the ‘Module No’ and not the ‘Student No’. To eliminate this dependency, we can again split the second table to form: – (Student No, Module No); – (Module No, Module Name, Lecturer). Ms. D. Anderson

  12. 2NF Tables Ms. D. Anderson

  13. 3NF – third normal form To move to third normal form we must remove any transitive dependencies, i.e. fields that are dependent on non-key fields. There is one such dependency in the first relationship where Programme ‘Duration’ depends on the Programme and not on the primary key ‘Student No’. • We can therefore express 3NF as follows: – Student (Student No, Student Name, Programme); – Programme (Programme, Programme Duration); – Student-Module (Student No, Module No); – Module (Module No, Module Name, Lecturer). Ms. D. Anderson

  14. 3NF Tables Module Table Programme Table Student Table Ms. D. Anderson

  15. References • http://www.learningmatters.co.uk/sampleChapters/pdfs/Chapter5.pdf Ms. D. Anderson

More Related