
Probability

Statistics 111 - Lecture 7. Probability: Normal Distribution and Standardization. Administrative Notes: Homework 2 due on Monday. Outline: Law of Large Numbers; Normal Distribution; Standardization and Normal Table. Data versus Random Variables.

anaya

Presentation Transcript


  1. Statistics 111 - Lecture 7 Probability Normal Distribution and Standardization Stat 111 - Lecture 7 - Normal Distribution

  2. Administrative Notes • Homework 2 due on Monday

  3. Outline • Law of Large Numbers • Normal Distribution • Standardization and Normal Table

  4. Data versus Random Variables • Data variables are variables for which we actually observe values • E.g., heights of students in the Stat 111 class • For these data variables, we can directly calculate the statistics x̄ (sample mean) and s² (sample variance) • Random variables are quantities that we don't directly observe, but for which we still have a probability distribution over all possible values • E.g., heights of the entire Penn student population

  5. Law of Large Numbers • The rest of the course will be about using data statistics (x̄ and s²) to estimate parameters of random variables (μ and σ²) • Law of Large Numbers: as the size of our data sample increases, the mean x̄ of the observed data variable approaches the mean μ of the population • If our sample is large enough, we can be confident that our sample mean is a good estimate of the population mean!
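
The Law of Large Numbers on this slide can be illustrated with a small simulation. The sketch below is not from the lecture: the population (heights with mean 67 and SD 4 inches) and the helper name `running_sample_means` are made up for illustration, and the draws are assumed to be Normal.

```python
import random

def running_sample_means(population_mean, population_sd, sizes, seed=0):
    """Return {n: mean of n simulated Normal draws} for each n in sizes."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    results = {}
    for n in sizes:
        draws = [rng.gauss(population_mean, population_sd) for _ in range(n)]
        results[n] = sum(draws) / n
    return results

# Hypothetical population: heights in inches with mean 67 and SD 4.
means = running_sample_means(67.0, 4.0, sizes=[10, 1000, 100000])
for n, xbar in means.items():
    print(f"n = {n:>6}: sample mean = {xbar:.3f}")
```

As n grows, the printed sample means should settle closer and closer to the population mean of 67.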

  6. The Normal Distribution • The Normal distribution has the shape of a “bell curve” with parameters μ and σ², which determine the center and spread, respectively

  7. Different Normal Distributions • Each different pair of values of μ and σ² gives a different Normal distribution, denoted N(μ,σ²) • We can adjust the values of μ and σ² to provide the best approximation to observed data • If μ = 0 and σ² = 1, we have the Standard Normal distribution N(0,1) • [Figure: density curves for N(0,1), N(2,1), N(-1,2), and N(0,2)]
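
A short sketch of how the four distributions named on this slide differ, using `statistics.NormalDist` from the Python standard library (Python 3.8+). One detail worth flagging: the slide's notation gives the variance as the second argument, while `NormalDist` expects the standard deviation, so the code takes a square root. The dictionary name `curves` is only illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Slide notation is N(mean, variance); NormalDist takes (mean, standard deviation).
curves = {
    "N(0,1)": NormalDist(mu=0, sigma=sqrt(1)),
    "N(2,1)": NormalDist(mu=2, sigma=sqrt(1)),
    "N(-1,2)": NormalDist(mu=-1, sigma=sqrt(2)),
    "N(0,2)": NormalDist(mu=0, sigma=sqrt(2)),
}

for name, dist in curves.items():
    # The density peaks at the mean; a larger variance gives a lower, wider curve.
    print(f"{name}: center {dist.mean}, peak height {dist.pdf(dist.mean):.4f}")
```

Changing the mean only shifts the curve, while increasing the variance flattens and widens it.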

  8. Property of Normal Distributions • The Normal distribution follows the 68-95-99.7 rule: • 68% of observations are between μ - σ and μ + σ • 95% of observations are between μ - 2σ and μ + 2σ • 99.7% of observations are between μ - 3σ and μ + 3σ
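
The 68-95-99.7 rule can be checked numerically. This is a minimal sketch using the standard library's `statistics.NormalDist`; the helper name `prob_within` is an assumption introduced here, not part of the lecture.

```python
from statistics import NormalDist

def prob_within(k, mu=0.0, sigma=1.0):
    """P(mu - k*sigma < X < mu + k*sigma) for X following N(mu, sigma^2)."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)

for k in (1, 2, 3):
    print(f"within {k} SD of the mean: {prob_within(k):.4f}")
```

The result does not depend on the particular μ and σ passed in, which is why the rule applies to every Normal distribution.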

  9. Calculating Probabilities • For more general probability calculations, we have to do integration • For the standard normal distribution, we have tables of probabilities already made for us! • If Z follows N(0,1): P(Z < -1.00) = 0.1587

  10. Standard Normal Table • If Z has N(0,1): P(Z > 1.46) = 1 - P(Z < 1.46) = 1 - 0.9279 = 0.0721 • What if we need to do a probability calculation for a non-standard Normal distribution?
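
The two table lookups on slides 9 and 10 can be reproduced in software. The sketch below assumes Python 3.8+ and uses the cumulative distribution function of `statistics.NormalDist` in place of the printed table.

```python
from statistics import NormalDist

Z = NormalDist()  # standard Normal, N(0,1)

p_below = Z.cdf(-1.00)     # P(Z < -1.00), a direct table lookup
p_above = 1 - Z.cdf(1.46)  # P(Z > 1.46) = 1 - P(Z < 1.46), the complement rule

print(f"P(Z < -1.00) = {p_below:.4f}")
print(f"P(Z > 1.46)  = {p_above:.4f}")
```

Like the table, `cdf` returns left-tail probabilities only, so right-tail probabilities still come from subtracting from 1.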

  11. Standardization • If we only have a standard normal table, then we need to transform our non-standard normal distribution into a standard one • This process is called standardization • [Figure: standardization maps the mean μ to 0 and the standard deviation σ to 1]

  12. Standardization Formula • We convert a non-standard normal distribution into a standard normal distribution using a linear transformation • If X has a N(μ,σ²) distribution, then we can convert it to Z, which follows a N(0,1) distribution: Z = (X - μ)/σ • First, subtract the mean μ from X • Then, divide by the standard deviation σ of X

  13. Linear Transformations of Variables • Sometimes we need to do simple mathematical operations on our variables, such as adding and/or multiplying by constants: Y = a·X + b • Example: changing temperature scales, Fahrenheit = 9/5 · Celsius + 32 • How are means and variances affected?

  14. Means/Variances of Linear Transforms • For the transformed variable Y = a·X + b: mean(Y) = a·mean(X) + b, Var(Y) = a²·Var(X), SD(Y) = |a|·SD(X) • Note that adding a constant b does not affect measures of spread (variance and SD)
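
The three formulas on this slide can be verified on data with the Celsius-to-Fahrenheit conversion from the previous slide. The temperature readings below are made up for illustration; the sample versions of variance and SD are used to match the s² notation from earlier in the lecture.

```python
from statistics import mean, variance, stdev

celsius = [-5.0, 0.0, 12.5, 20.0, 37.0]  # made-up readings
a, b = 9 / 5, 32                         # Fahrenheit = a * Celsius + b
fahrenheit = [a * c + b for c in celsius]

# Each line prints the statistic computed directly, then the value the formula predicts.
print(mean(fahrenheit), a * mean(celsius) + b)
print(variance(fahrenheit), a**2 * variance(celsius))
print(stdev(fahrenheit), abs(a) * stdev(celsius))
```

Each pair of printed values should agree up to floating-point rounding, and the shift b = 32 appears only in the mean.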

  15. More complicated linear functions • We can also do linear transformations involving more than one variable: Z = a·X + b·Y + c • The mean formula is similar: mean(Z) = a·mean(X) + b·mean(Y) + c • If X and Y are also independent, then Var(Z) = a²·Var(X) + b²·Var(Y) • We need a more complicated variance formula (in the book) if the variables are not independent
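
A simulation sketch of the two-variable formulas under independence. All numbers here (the distributions of X and Y, the constants a, b, c, and the sample size) are arbitrary choices for illustration and do not come from the lecture.

```python
import random
from statistics import mean, pvariance

rng = random.Random(1)  # fixed seed for repeatability
n = 50_000
xs = [rng.gauss(10, 3) for _ in range(n)]  # X with mean 10, Var(X) = 9
ys = [rng.gauss(-4, 2) for _ in range(n)]  # Y with mean -4, Var(Y) = 4, drawn independently
a, b, c = 2, -3, 5
zs = [a * x + b * y + c for x, y in zip(xs, ys)]

expected_mean = a * 10 + b * (-4) + c  # a*mean(X) + b*mean(Y) + c
expected_var = a**2 * 9 + b**2 * 4     # a^2*Var(X) + b^2*Var(Y)

print(f"mean(Z): simulated {mean(zs):.2f}, formula {expected_mean}")
print(f"Var(Z):  simulated {pvariance(zs):.2f}, formula {expected_var}")
```

Note that the negative coefficient b still adds to the variance because it enters squared. When X and Y are not independent, the standard general formula adds a covariance term, 2·a·b·Cov(X,Y), which this sketch does not exercise.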

  16. Standardization Example Dear Abby, You wrote in your column that a woman is pregnant for 266 days. Who said so? I carried my baby for 10 months and 5 days. My husband is in the Navy and it could not have been conceived any other time because I only saw him once for an hour, and I didn’t see him again until the day after the baby was born. I don’t drink or run around, and there is no way the baby isn’t his, so please print a retraction about the 266-day carrying time because I am in a lot of trouble! -San Diego Reader

  17. Standardization Example • According to well-documented data, gestation time follows a normal distribution with mean μ of 266 days and SD σ of 16 days • Let X = gestation time. What percent of babies have a gestation time greater than 310 days (10 months and 5 days)? • We need to convert X = 310 into a standard Z: Z = (X - μ)/σ = (310 - 266)/16 = 44/16 = 2.75

  18. Standardization Example • P(X > 310) = P(Z > 2.75) = 1 - P(Z < 2.75) = 1 - 0.9970 = 0.0030 • So, there is only a 0.3% chance of a pregnancy lasting longer than 310 days!
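
The gestation calculation can be sketched in code two ways: by standardizing first, as the slides do, and by asking the non-standard distribution directly. Both use `statistics.NormalDist` with the mean and SD from the slides; the variable names are illustrative.

```python
from statistics import NormalDist

gestation = NormalDist(mu=266, sigma=16)  # days, parameters from the slides

# Route 1: standardize, then use the standard Normal (the table approach).
z = (310 - gestation.mean) / gestation.stdev
p_via_z = 1 - NormalDist().cdf(z)

# Route 2: evaluate the N(266, 16^2) distribution directly.
p_direct = 1 - gestation.cdf(310)

print(f"z = {z}, P(X > 310) = {p_via_z:.4f} (direct: {p_direct:.4f})")
```

The two routes give the same probability, which is the point of standardization: a single table for N(0,1) is enough for any Normal distribution.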

  19. Reverse Standardization • Sometimes, we need to convert a standard normal Z into a non-standard normal X • Example: what is the length of pregnancy below which we have 10% of the population? • From the table, we see P(Z < -1.28) = 0.10 • Reverse standardization formula: X = σ·Z + μ • For Z = -1.28, we calculate X = -1.28·16 + 266 = 245.52 ≈ 246 days (8.2 months)
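
Reverse standardization corresponds to an inverse-CDF (quantile) lookup. In the sketch below, `reverse_standardize` is a hypothetical helper that applies the slide's formula, and `NormalDist.inv_cdf` stands in for reading the table backwards.

```python
from statistics import NormalDist

def reverse_standardize(z, mu, sigma):
    """Convert a standard Normal value z back to the N(mu, sigma^2) scale."""
    return sigma * z + mu

z10 = NormalDist().inv_cdf(0.10)  # the z with P(Z < z) = 0.10, about -1.28
days_exact = reverse_standardize(z10, mu=266, sigma=16)
days_table = reverse_standardize(-1.28, mu=266, sigma=16)  # table value rounded to 2 decimals

print(f"z = {z10:.4f}, X = {days_exact:.2f} days (with z = -1.28: {days_table:.2f})")
```

The unrounded quantile gives a value within a small fraction of a day of the slide's hand calculation; the difference comes only from rounding z to two decimal places.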

  20. Another Example • NCAA Division 1 SAT requirements: athletes are required to score at least 820 on the combined math and verbal SAT • In 2000, SAT scores were normally distributed with mean μ of 1019 and SD σ of 209 • What percentage of students have scores greater than 820? Z = (X - μ)/σ = (820 - 1019)/209 = -199/209 ≈ -0.95

  21. Another Example • P(X > 820) = P(Z > -0.95) = 1 - P(Z < -0.95) • P(Z < -0.95) = 0.17, so P(X > 820) = 0.83 • 83% of students meet NCAA requirements
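
A sketch of the NCAA calculation that also shows how little the two-decimal rounding of z matters. The parameters are those given on the slides; the variable names are illustrative.

```python
from statistics import NormalDist

sat = NormalDist(mu=1019, sigma=209)  # combined SAT scores in 2000, from the slides

z = (820 - sat.mean) / sat.stdev             # about -0.95 before rounding
p_exact = 1 - sat.cdf(820)                   # uses the unrounded z internally
p_rounded = 1 - NormalDist().cdf(round(z, 2))  # mimics a two-decimal table lookup

print(f"z = {z:.4f}")
print(f"P(X > 820): exact {p_exact:.4f}, with rounded z {p_rounded:.4f}")
```

Both versions round to the 83% reported on the slide.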

  22. SAT Verbal Scores • Now, just look at X = Verbal SAT score, which is normally distributed with mean μ of 505 and SD σ of 110 • What Verbal SAT score will place a student in the top 10% of the population?

  23. SAT Verbal Scores • From the table, P(Z > 1.28) = 0.10 • We need to reverse standardize to get X: X = σ·Z + μ = 110·1.28 + 505 = 645.8 ≈ 646 • So, a student needs a Verbal SAT score of 646 in order to be in the top 10% of all students
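
For an upper-tail question like this one, the quantile to look up is 1 minus the tail probability. The sketch below assumes the Verbal SAT parameters from the slides and checks the answer by computing the share of students above the cutoff.

```python
from statistics import NormalDist

verbal = NormalDist(mu=505, sigma=110)  # Verbal SAT scores, from the slides

# Top 10% means 90% of students score below the cutoff.
cutoff = verbal.inv_cdf(1 - 0.10)

# Sanity check: the share of students scoring above 646 should be close to 10%.
share_above_646 = 1 - verbal.cdf(646)

print(f"cutoff = {cutoff:.1f}, share above 646 = {share_above_646:.4f}")
```

Passing 0.10 instead of 0.90 to `inv_cdf` would return the bottom-10% cutoff, which is the usual slip with upper-tail problems.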

  24. Next Class - Lecture 8 • Chapter 5: Sampling Distributions
