
Data Handling & Analysis BD7054 Normality

Data Handling & Analysis BD7054 Normality. Andrew Jackson a.jackson@tcd.ie. Making assumptions. Each group is normally distributed. The residuals off the line are normally distributed. Distributions are where numbers come from.


Presentation Transcript


  1. Data Handling & Analysis BD7054 Normality • Andrew Jackson • a.jackson@tcd.ie

  2. Making assumptions • Each group is normally distributed • The residuals off the line are normally distributed

  3. Distributions are where numbers come from • The binomial distribution tells us how systems like a coin toss behave • It tells us how many events are likely to occur given repeated attempts • The event has a fixed probability of occurring each time
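The coin-toss behaviour described on this slide can be illustrated with a minimal sketch of the binomial probability mass function. The function name `binomial_pmf` and the choice of ten tosses of a fair coin are assumptions made for illustration; they do not come from the slides.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k events in n independent attempts,
    where each attempt succeeds with the same fixed probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Illustrative case: ten tosses of a fair coin.
n_tosses, p_heads = 10, 0.5
pmf = [binomial_pmf(k, n_tosses, p_heads) for k in range(n_tosses + 1)]

for k, prob in enumerate(pmf):
    print(f"{k:2d} heads: {prob:.4f}")
```

The probabilities over all possible counts sum to one, and the most likely outcome sits at the centre of the range when p is 0.5.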

  4. The normal distribution • Normal or Gaussian distribution • “the bell shaped curve” • Defined by mean and a variance (or standard deviation) • The PDF or Probability Density Function of the normal distribution is shown right
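Since the plot of the PDF is not reproduced in this transcript, the following sketch evaluates the standard formula for the normal density directly. The function name `normal_pdf` and the evaluation points are illustrative assumptions; the cross-check uses `statistics.NormalDist` from the Python standard library.

```python
from math import exp, pi, sqrt
from statistics import NormalDist

def normal_pdf(x, mean, sd):
    """Density of the normal distribution with the given mean and standard deviation."""
    z = (x - mean) / sd
    return exp(-0.5 * z * z) / (sd * sqrt(2 * pi))

# Evaluate the bell curve at a few points for mean 0 and standard deviation 1.
for x in (-2.0, -1.0, 0.0, 1.0, 2.0):
    by_hand = normal_pdf(x, mean=0.0, sd=1.0)
    library = NormalDist(mu=0.0, sigma=1.0).pdf(x)
    print(f"x = {x:+.1f}  formula = {by_hand:.6f}  library = {library:.6f}")
```

The density is highest at the mean and symmetric around it, which gives the bell shape.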

  5. Origins of the Normal Distribution • Assume that an individual’s weight or height (or whatever we are measuring) is affected by thousands of small +/- effects such as genes or environment • Add those effects up for each individual, and lo and behold… • The character will display a normal distribution
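The idea of adding up many small effects can be tried out with a short simulation. This is only a sketch under stated assumptions: each effect is drawn uniformly between -1 and +1, there are 200 effects per individual (fewer than the thousands mentioned on the slide, to keep the run short), and the seed and sample size are arbitrary.

```python
import random
from statistics import fmean, pstdev

random.seed(1)          # arbitrary seed so the run is repeatable
n_individuals = 2000    # assumed sample size
n_effects = 200         # assumed number of small effects per individual

def simulate_trait(n_effects):
    """Sum many small positive and negative effects for one individual."""
    return sum(random.uniform(-1.0, 1.0) for _ in range(n_effects))

traits = [simulate_trait(n_effects) for _ in range(n_individuals)]

mean = fmean(traits)
sd = pstdev(traits)
# For a normal distribution, roughly 68% of values lie within one
# standard deviation of the mean.
within_one_sd = sum(abs(t - mean) <= sd for t in traits) / n_individuals
print(f"mean = {mean:.2f}, sd = {sd:.2f}, within one sd = {within_one_sd:.3f}")
```

If the summed trait is close to normal, the printed fraction should be near 0.68.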

  6. Return to our brain/body data • We need to test whether each group is normally distributed • Equivalent to asking if the residuals are normally distributed • Residuals are the difference between an observed value and its predicted value • Which is the mean value in each group in this case
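The definition of a residual for grouped data can be sketched as follows. The group names and measurements are hypothetical placeholder values, not the brain/body data used in the course.

```python
from statistics import fmean

# Hypothetical measurements for two groups (illustrative values only).
groups = {
    "A": [4.1, 5.0, 4.7, 5.4],
    "B": [6.2, 7.1, 6.8, 6.3],
}

def group_residuals(values):
    """Residual = observed value minus the predicted value,
    where the prediction for every member is the group mean."""
    group_mean = fmean(values)
    return [v - group_mean for v in values]

residuals = {name: group_residuals(values) for name, values in groups.items()}

for name, res in residuals.items():
    print(name, [round(r, 3) for r in res])
```

Because each residual is measured from its own group mean, the residuals within a group sum to zero, and the residuals from all groups can be pooled to assess normality.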

  7. Exploring Residuals from boxplots • A simple histogram • A Q-Q plot (quantile-quantile)
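The plots from this slide are not reproduced in the transcript. As a rough stand-in, the sketch below bins a set of residuals and prints a text histogram. The residual values, the number of bins, and the function name `histogram_counts` are assumptions for illustration, and the function assumes the values are not all identical.

```python
def histogram_counts(values, n_bins):
    """Count how many values fall into each of n_bins equal-width bins."""
    low, high = min(values), max(values)
    width = (high - low) / n_bins
    counts = [0] * n_bins
    for v in values:
        # The maximum value is placed in the last bin rather than overflowing.
        index = min(int((v - low) / width), n_bins - 1)
        counts[index] += 1
    return low, width, counts

# Hypothetical residuals (illustrative values only).
residuals = [-1.9, -1.1, -0.7, -0.4, -0.2, 0.0, 0.1, 0.3, 0.6, 0.9, 1.2, 2.0]
low, width, counts = histogram_counts(residuals, n_bins=5)

for i, count in enumerate(counts):
    left = low + i * width
    print(f"{left:6.2f} to {left + width:6.2f} | {'#' * count}")
```

A roughly symmetric pile with a single central peak is what normally distributed residuals would be expected to produce.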

  8. Return to our scatter plot • We need to test whether our residuals off the line are normally distributed • Also need to check that there is no trend in the deviation of the residuals along the line
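For scatter plot data, the residuals are the vertical distances from each point to the fitted line. The sketch below assumes the line is an ordinary least-squares fit, which the slides do not state explicitly, and uses hypothetical x and y values.

```python
from statistics import fmean

def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y = intercept + slope * x."""
    x_bar, y_bar = fmean(x), fmean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept

# Hypothetical data (illustrative values only).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]

slope, intercept = fit_line(x, y)
# Residual = observed y minus the value predicted by the line.
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
print([round(r, 3) for r in residuals])
```

These residuals are the values whose distribution is then checked for normality.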

  9. Exploring residuals from scatter plot • Histogram of residuals • Q-Q plot of residuals
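A Q-Q plot pairs each sorted residual with the quantile a normal distribution would be expected to give at the same rank. The sketch below computes those pairs without drawing them. The residual values are hypothetical, and the plotting positions (i + 0.5) / n are one common convention among several, chosen here as an assumption.

```python
from statistics import NormalDist, fmean, stdev

def qq_points(residuals):
    """Pair each sorted residual with the normal quantile expected at its rank."""
    n = len(residuals)
    reference = NormalDist(mu=fmean(residuals), sigma=stdev(residuals))
    ordered = sorted(residuals)
    # Plotting positions (i + 0.5) / n keep the probabilities strictly inside (0, 1).
    theoretical = [reference.inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, ordered))

# Hypothetical residuals (illustrative values only).
residuals = [0.3, -1.1, 0.6, -0.2, 1.4, -0.7, 0.1, -0.4]
points = qq_points(residuals)

for expected, observed in points:
    print(f"expected {expected:+.3f}   observed {observed:+.3f}")
```

If the residuals are close to normal, the observed values track the expected values and the points would fall near a straight line when plotted.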

  10. Testing for a trend in the data
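The slide's figure is not available in this transcript, so the method used there is unknown. One possible check, offered purely as an assumption, is to correlate the absolute size of each residual with its x value: a strong positive or negative correlation would suggest the spread of the residuals changes along the line. The data below are hypothetical and deliberately fan out as x increases.

```python
from math import sqrt
from statistics import fmean

def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    a_bar, b_bar = fmean(a), fmean(b)
    cov = sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))
    var_a = sum((ai - a_bar) ** 2 for ai in a)
    var_b = sum((bi - b_bar) ** 2 for bi in b)
    return cov / sqrt(var_a * var_b)

# Hypothetical residuals whose magnitude grows with x (illustrative only).
x = [1, 2, 3, 4, 5, 6, 7, 8]
residuals = [0.1, -0.2, 0.3, -0.4, 0.5, -0.6, 0.7, -0.8]

# Correlate the size of each residual, ignoring its sign, with x.
spread_trend = pearson(x, [abs(r) for r in residuals])
print(f"correlation between |residual| and x: {spread_trend:.3f}")
```

A value near zero would be consistent with an even spread; here the contrived data give a value close to one.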

  11. What to do if residuals are not normal? • Transforming the data is often the solution • Taking the log of the response variable (y) is the first port of call • For scatter plot type data, we can also take the log of the explanatory (x) variable • We will do this next time we meet
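As a small preview of the log transformation, the sketch below uses hypothetical data in which y grows exponentially with x. The constants 2.0 and 0.5 are assumptions chosen for illustration, and the natural log is used; the transformation requires every y value to be strictly positive.

```python
from math import exp, log

# Hypothetical response that grows exponentially with x (illustrative only).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0 * exp(0.5 * xi) for xi in x]

# Log-transform the response variable; all values must be greater than zero.
log_y = [log(yi) for yi in y]

# On the log scale, each unit step in x adds the same amount to log(y),
# so the curved relationship becomes a straight line.
steps = [b - a for a, b in zip(log_y, log_y[1:])]
print([round(s, 3) for s in steps])
```

The same `log` call can be applied to the x values when the explanatory variable is also to be transformed.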
