1 / 28

One thing in one place

One thing in one place. Data, Schemas and Applications Week 8. Data Normalisation. Formal technique to reduce complex relations to a set of simpler relations which are better structured and reduce redundancy

troy-baird
Télécharger la présentation

One thing in one place

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. One thing in one place Data, Schemas and Applications Week 8

  2. Data Normalisation • Formal technique to reduce complex relations to a set of simpler relations which are better structured and reduce redundancy • We shall look at it informally on this module. On a full database module this would be covered in more theoretical depth.

  3. Facilities Table

  4. Potential anomalies • UPDATE the price per session of facility 1 • now different rates for squash courts - suspect business rule is same rate for all facilities for a ‘sport’ • ADD a new tennis court • must also ensure the same rate is used as for other tennis courts • DELETE facility 11 • lose the rate per session for skittles

  5. Two better tables FacilityType Facility FtypeNo Name Price FacilityNo FaciltyUnit

  6. Normalisation • Must ensure no data has been lost in this process. • Can recover the Original data using join: select * from facility natural join facilityType • Informally we feel that the second solution is ‘better’ than the first. • Less redundancy • Update anomalies avoided • Simpler tables reusable • Queries are slower because a join needed • original table: select price from facility where facilityNo=6 • normalised tables: select price from facility natural join facilityType where faciltyNo = 6

  7. Exercise • Normalise this data on outstanding video rentals • Describe your result as an ER model (relationships not foreign keys) Source: www.gc.maricopa.edu/business/sylvester/cis164/norm-rev.htm

  8. 1st Normal form - 1NF • the data must be in the form of valid relations - i.e. with uniquely named columns, each column containing only atomic (indivisible) values • each row unique - identify the primary key

  9. 2nd Normal Form • Non-key fields are fully dependent on the primary key • ‘Dependent’ • Column A is dependent on Column B if, in the data, knowing the value of B will determine the value of A • E.g. E#, Age • Age is dependent on E# • E# is not dependent on Age ( several people with the same age , different employee numbers)

  10. 3rd Normal Form 3NF • A relation R is in 3NF if and only if • the non-key fields are fully dependent on the primary key i.e is 2NF and • the non-key fields are mutually independent • e.g R(A,B, C) • then • B dependent on A • C dependent on A • B and C independent

  11. Mnemonic • “the key, the whole key and nothing but the key” • 2NF • “the key” : non-key fields depend on the key • “the whole key”: non-key fields depend fully on the key • 3NF • “nothing but the key”: non-key fields are not dependent on each other

  12. Summary : Data Normalisation • Place common relations in separate tables and join them when required • Normalisation is bottom-up – given a mess of data in a table (a spreadsheet for example), how to restructure it to reduce redundancy • More names required – name for each relation at least – more concepts but more understandable • Normalised tables may be reusable • Reduced performance due to extra joins – may have to de-normalise to get better performance • Improved performance due to smaller database • Normalisation is a bottom-up process – starting from some mess – like a spreadsheet • Top-down models are usually normalised anyway

  13. Code ‘Normalisation’ To factor (maths) – to divide into the basic units 68 = 2*2*17 • For very similar reasons, we want to write our code (PHP, HTML, Javascript..) so it is ‘normalised’ too – common, recurrent code in factored into one place. • To avoid update anomalies – update in one place • To increase readability – less code to read • To create reusable components • But may reduce performance – but who cares – maybe later if you need to optimize-

  14. Recurrent Values • If the same value is used in several places, assign it to a variable (constant) and use the variable name where it is needed. • If this common value has to be changed, it can be changed in one place only. • The variable name provides a understandable explanation of what this value is for.

  15. Silly example: colorheads.php <?php print "<h1 style='color:4AFF2F;'>This one is OK</h1>"; print "<h1 style='color:FF1052;'>This one is BAD</h1>"; ?> <p> here is some text </p> <?php print "<h1 style='color:FF1052;'>This one is BAD too</h1>"; print "<h1 style='color:4AFF2F;'>This one is OK too</h1>"; ?>

  16. Revision 1 : colorheads2.php <?php $goodColour = "4AFF2F"; $badColour = "FF1052"; print "<h1 style='color:$goodColour;'>This one is OK</h1>"; print "<h1 style='color:$badColour;'>This one is BAD</h1>"; ?> <p> here is some text </p> <?php print "<h1 style='color:$badColour;'>This one is BAD too</h1>"; print "<h1 style='color:$goodColour;'>This one is OK too</h1>"; ?>

  17. Revision 2 : colorheads3.php <html> <style> .good {color:4AFF2F;} .bad {color:FF1052;} </style> </head> <body> <h1 class="good">This one is OK</h1> <h1 class="bad">This one is BAD</h1> <p> here is some text </p> <h1 class="bad">This one is BAD</h1> <h1 class="good">This one is OK</h1> </body> </html>

  18. Common code • A common piece of code can be put in on place and linked into the programme • Suppose you have an HTML footer to go on every page (it might include to call Google Analytics for example) • We want to put this in one file and then include it in the scripts for each of the pages

  19. Footer code (footer.php) <hr/> <p style="font-size: 10pt; color: blue"> This page is generated on <?php print date("l, d/m/Y") ?> at <?php print date("H:i") ?> from a script written by Chris Wallace. </p> php function to include php script from another file Sample page (page1.php) <html> <head> <title>First Page</title> </head> <body> <p style="color:silver; font-size:15pt;"> This is the main body of page 1 which is linked to <a href="page2.php">page 2</a> </p> <?php include("footer.php") ?> </body> </html>

  20. Modularisation • Modularisation as a common pattern • In PHP there are several slightly different ways to include code: • include(), include-once() • require(), require-once() • Since PHP is a scripting language, code can’t be pre-compiled as in Java.

  21. Functions • Common code but for a few variations • Variations can be expressed as parameters • Create a function • Call the function where required

  22. Two parameters Function declaration and use makefooter.php <?php // define the function function footer($author, $size) { return "<hr/><p style='font-size: {$size}pt; color: blue'> This page is generated on ". date("l, d/m/Y")." at ".date("H:i"). " from a script written by $author.</p>"; } //call the function print footer("Chris Wallace", 20); //call with different parameters print footer("Mark Butler", 30); ?> Can’t write $sizept { } means evaluate expression . Dot for concatenate strings

  23. Functions and Modules • Common to combine these two ideas • Place the function definitions in a separate file • Include file in every program which needs these functions • Call the function wherever required. • E.g. page3.php

  24. Function library functs.php <?php // generate a footer function footer($author, $size) { return "<hr/><p style='font-size: {$size}pt; color: blue'> This page is generated on ". date("l, d/m/Y")." at ".date("H:i"). " from a script written by $author.</p>"; } ?> Use of function <html> <head> <title>Third Page</title> </head> <?php include("functs.php") ?> <body> <p style="color:silver; font-size:15pt;"> This is the main body of page 3.</a> </p> <?php print footer("Chris Wallace", 25) ?> </body> </html>

  25. Code normalisation : Summary • Breaking a programming task cleanly into into simpler, re-useable units • The sign of a good programmer • Good structure is arrived at iteratively • You notice common parts as the code is written • You re-factor the code to put the common parts in one place • Avoid merely copying code because this creates update anomolies – an error has to be corrected in all copies

  26. One thing in one place • An idea which is common to both data and code normalisation • In database it means splitting big data sets into smaller, independent and reusable ones with no redundancy • In programming it means splitting a large amount of code into smaller reusable items – variables, modules, functions and classes • Where else do we see this principle? • A class or type separates the fixed parts of a set of things (every employee has a name) from the variable (Empno ‘1111’ has name ‘Fred’) • In learning, it is about identifying common themes or patterns, separating the general from the particular : • String concept implemented differently in each language • Code and Annotations in one script • Several Languages in one script

  27. Tradeoffs and judgement • Normalisation requires judgement • Are these two things really the same? • Will they always be the same? • Normalisation incurs a performance penalty – because the separate bits have to be put back together in execution – joins in a database, calls in code • Data and code design involves trade-offs

  28. Hints for debugging PHP • Use the isa.cems.uwe.ac.uk host to show errors • Add additional print statements • Comment out in case needed in the future • Use print_r($array) to print a complete array (such as a row returned from the database)

More Related