Statistical Analysis – Chapter 4 Normal Distribution

shanta

Presentation Transcript


  1. Statistical Analysis – Chapter 4: Normal Distribution

  2. What is the normal curve? • In Chapter 2 we talked about histograms and modes • A set of values for one variable is normally distributed when its histogram (or line graph) has one peak (mode) and looks like a bell. Here is an example using height:

  3. Characteristics of the Normal Curve • Bell shaped, fading at the tails. In other words, more values are in the middle, and odd or unusual values fall at the tails • All (100%) of the data fits on the curve, with 50% before the mean and 50% after • About 68% of the data falls within -1 and +1 standard deviations of the mean • About 95% of the data falls between -2 and +2 standard deviations of the mean • The percentage of data between any two points is equal to the probability of randomly selecting a value between the two points (remember classical probability from Ch. 3)
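The 68% and 95% figures above can be checked numerically. The following is a minimal sketch, assuming Python's standard-library `statistics.NormalDist` stands in for the printed normal curve table; the helper name `share_within` is made up for illustration.

```python
from statistics import NormalDist

# Standard normal curve: mean 0, standard deviation 1.
std_normal = NormalDist(mu=0.0, sigma=1.0)

def share_within(k):
    """Share of the data lying within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2):
    print(f"within +/-{k} standard deviations: {share_within(k):.4f}")

# Half of the data lies below the mean.
print(f"below the mean: {std_normal.cdf(0):.2f}")
```

The two shares come out a little above 68% and 95%, which is why the slide figures are best read as approximations.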

  4. Standard Deviations and Z-Score • z-score = the number of standard deviations away from the mean. • z-score = (x - µ) / σ (x = the data value for which we want to know the z-score) • We use the characteristics of the normal curve, and the z-score, to find out the probability of a particular event or value occurring (remember classical probability from Chapter 3)
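As a small illustration of the z-score formula, here is a sketch in Python. The height figures (mean 64 inches, standard deviation 2.5 inches) are hypothetical numbers chosen for the example, not values from the slides.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies away from the mean mu."""
    return (x - mu) / sigma

# Hypothetical example: heights with mean 64 inches and standard deviation 2.5 inches.
# A height of 69 inches is 5 inches above the mean, i.e. 2 standard deviations.
print(z_score(69, 64, 2.5))

# Values below the mean give negative z-scores.
print(z_score(61.5, 64, 2.5))
```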

  5. Solving Normal Curve Problems Using Z-Scores (steps listed at bottom of p. 111) • Draw a normal curve, showing values for (-2 through +2) • Shade the area in question • Calculate the z scores and cutoffs (percentages asked for) • Use the z-scores and cutoffs to solve the normal curve problem

  6. Find Percentages on the Normal Curve Table Let’s do these questions as a class… • What is the percentage of data from z = 0 to z = 0.1? • What is the percentage of data from z = 0 to z = 2.16? • What is the percentage of data from z = -1.11 to z = 1.11? • What is the percentage of data above z = 1.24? • What is the percentage of data below z = -0.6? Answers • .0398…3.98% • .4846…48.46% • .3665 + .3665 = .733…73.3% • .50 - .3925 = .1075…10.75% • .50 - .2257 = .2743…27.43%
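The table lookups above can be reproduced in code. This is a sketch that assumes the printed table lists the area between the mean (z = 0) and z, which matches how the answers are built; `area_from_mean` is an illustrative helper, not something from the textbook.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

def area_from_mean(z):
    """Area under the curve between z = 0 and z (what the table lists)."""
    return std_normal.cdf(abs(z)) - 0.5

answers = {
    "z = 0 to z = 0.1": area_from_mean(0.1),
    "z = 0 to z = 2.16": area_from_mean(2.16),
    "z = -1.11 to z = 1.11": area_from_mean(-1.11) + area_from_mean(1.11),
    "above z = 1.24": 0.5 - area_from_mean(1.24),
    "below z = -0.6": 0.5 - area_from_mean(-0.6),
}

for question, share in answers.items():
    print(f"{question}: {share:.4f} ({share:.2%})")
```

Because the curve is symmetric, the same helper serves negative z-scores: the area between -1.11 and 0 equals the area between 0 and 1.11.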

  7. Working backwards from percentages… • When working backwards from percentages, we still use the normal table…but look for the percentage to give us the z-score… • What is the z-score associated with 10.2% of the data (measured from z = 0)? • What is the z-score(s) for the middle 30% of the normal curve? • What is the z-score that cuts off the upper 25% of the normal curve? Answers • z = 0.26 • z = -.39 to z = .39 • z = 0.67
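Working backwards is an inverse lookup. A minimal sketch, assuming `statistics.NormalDist.inv_cdf` from the Python standard library in place of scanning the printed table; the helper name `z_for_area_from_mean` is hypothetical.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

def z_for_area_from_mean(area):
    """z-score such that the area between z = 0 and z equals `area`."""
    return std_normal.inv_cdf(0.5 + area)

# 10.2% of the data between z = 0 and z.
z_a = z_for_area_from_mean(0.102)

# Middle 30%: 15% on each side of the mean, so the answer is -z_b to +z_b.
z_b = z_for_area_from_mean(0.15)

# Upper 25%: the remaining 25% lies between the mean and the cutoff.
z_c = z_for_area_from_mean(0.25)

print(f"(a) z = {z_a:.2f}")
print(f"(b) z = {-z_b:.2f} to z = {z_b:.2f}")
print(f"(c) z = {z_c:.2f}")
```

The printed table only resolves z to two decimals, so the code results are rounded the same way for comparison.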

  8. Let’s do Question 4.2 Use the normal curve table to determine the percentage of data in the normal curve • Between z = 0 and z = .82 • Above z = 1.15 • Between z = -1.09 and z = .47 • Between z = 1.53 and z = 2.78 Work backward in the normal curve table to solve the following: • 32% of the data in the normal curve can be found between z = 0 and z = ? • Find the z score associated with the lower 5% of the data. • Find the z scores associated with the middle 98% of the data.

  9. Question 4.2 Answers • 29.39% • 12.51% • 54.29% • 6.03% • Between z = 0 and z = .92, or between z = 0 and z = -.92

  10. Question 4.7 Use the normal curve table to determine the percentage of data in the normal curve • Between z = 0 and z = .38 • Above z = -1.45 • Above z = 1.45 • Between z = .77 and z = 1.92 • Between z = -.25 and z = 2.27 • Between z = -1.63 and z = -2.89 Work backward in the normal curve table to solve the following. • 15% of the data in the normal curve can be found between z = 0 and z = ? • Find the z score associated with the upper 73.57% of the data. • Find the z scores associated with the middle 95%

  11. Question 4.7 Answers • 14.80% • 92.65% • 7.35% • 19.32% • 58.71% • 4.97% • z = .39 or -.39 • z = -.63 • Between z = -1.96 and z = +1.96

  12. Binomial Distributions and Sampling Binomial means two categories in a population… • Males and females • Sports game players vs. Non sports game players • Incomes over 40,000 vs. incomes under 40,000 Quick note: Remember…for binomial distributions, we would visualize this data through a pie chart…because we do not have enough categories for a histogram…

  13. Sampling from a Two-Category Population • With two-category populations, we can describe the population by p – the percentage of values in one category • This is the same p from the last chapter on probability (classical probability)… P(event) ≈ s / n, where s = number of chances for success and n = total equally likely possibilities • We know (actually, statisticians know) that if we randomly sample from a population, then ps ≈ p, where ps is the proportion found in the sample

  14. Sampling Distribution • In order to know the odds of getting certain values from this particular binomial sample, we have to know the sampling distribution from this population. • Under certain conditions, the sampling distribution for a binomial value is normal (i.e. the distribution follows the normal curve). • When the sampling distribution is normal, then we can make predictions using our table and our z-scores

  15. Sampling from a Binomial Distribution • Suppose we defined a population (full-time FIT students who either shop at Hot Topic or do not), and we have made our measure of interest into a binomial distribution – those who shop at Hot Topic and those who do not. • Suppose over the last 10 years, marketers have surveyed the FIT population hundreds of times and found that the proportion of Hot Topic shoppers is p = .13 (the proportion of non-Hot Topic shoppers is 1 - p = .87).

  16. Sampling from a Binomial Distribution • But suppose sometime later, your manager asks you to lead another study. This time, you don’t have enough money to survey the whole population, and you have to get a sample. • Because so many studies have been done in the past, we can assume that the true proportion of Hot Topic shoppers is p = .13. Thus, because we know that ps ≈ p, your sample should have approximately the same value.

  17. Sampling from a Binomial Distribution • For each sample, we can use the number sampled, and the p value from the population, to predict the total number of Hot Topic shoppers. This is called the expected value. • Expected value = np • Thus, if we collected a sample of 200 FIT students, how many students would we expect to be Hot Topic shoppers? np = (200)(.13) = 26 • This expected value serves as the mean (µ) when we use the normal curve for your sample

  18. Binomial Distribution and the Normal Curve • Now, we need to decide if we can use the normal curve to solve problems… • If np > 5 and n(1 - p) > 5…then the sampling distribution will be approximately normally distributed. • So, our sample was 200 students. Is np > 5? Is n(1 - p) > 5? • Yes…and yes. np = (200)(.13) = 26 and n(1 - p) = (200)(1 - .13) = (200)(.87) = 174
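The expected value and the two-part check above translate directly into code. This is a small sketch; the function names are illustrative, and the second call uses a hypothetical sample of 20 students to show a case where the check fails.

```python
def expected_value(n, p):
    """Expected number of successes in a sample of size n."""
    return n * p

def normal_approx_ok(n, p):
    """Rule from the slides: both np and n(1 - p) must exceed 5."""
    return n * p > 5 and n * (1 - p) > 5

n, p = 200, 0.13
print(f"np = {expected_value(n, p):.0f}")
print(f"n(1 - p) = {n * (1 - p):.0f}")
print("normal curve usable:", normal_approx_ok(n, p))

# Hypothetical smaller sample: np = 2.6, so the check fails.
print("normal curve usable for n = 20:", normal_approx_ok(20, p))
```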

  19. Binomial Distribution and the Normal Curve • What do we mean when we say that a sampling distribution is normal? • Just like someone’s age is one value among many ages that we tally to make a histogram, we can tally many samples, get the p values of those samples, and construct a histogram from these values. • If we took, say, 1000 samples, and tallied the p values for Hot Topic shoppers, then those values, when turned into a histogram, should form a normal curve. Just like if we took the heights of 1000 women, and tallied those values to get a normal curve.
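The idea of tallying 1000 samples can be tried out with a simulation. The sketch below assumes each sampled student is independently a Hot Topic shopper with probability .13; the seed value and the bin width are arbitrary choices made for the example.

```python
import random
from statistics import mean, stdev

P_SHOPPER = 0.13      # population proportion from the slides
SAMPLE_SIZE = 200     # students per sample
NUM_SAMPLES = 1000    # number of repeated samples

rng = random.Random(42)  # fixed, arbitrary seed so the run is repeatable

def sample_count(n, p):
    """Draw one random sample of size n and count the shoppers in it."""
    return sum(1 for _ in range(n) if rng.random() < p)

counts = [sample_count(SAMPLE_SIZE, P_SHOPPER) for _ in range(NUM_SAMPLES)]
proportions = [c / SAMPLE_SIZE for c in counts]

# Crude text histogram of shopper counts, in bins four shoppers wide.
bins = {}
for c in counts:
    lower = c // 4 * 4
    bins[lower] = bins.get(lower, 0) + 1
for lower in sorted(bins):
    print(f"{lower:3d}-{lower + 3:3d} shoppers  {'#' * (bins[lower] // 10)}")

print(f"mean of sample proportions: {mean(proportions):.4f}")
print(f"standard deviation of sample proportions: {stdev(proportions):.4f}")
```

The rows of `#` marks should rise to a single peak near 26 shoppers and fall away on both sides, which is the bell shape the slide describes.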

  20. How to use the Binomial Distribution and the Normal Curve • Get the mean (µ)…the mean is the expected value (np) • Get the standard deviation (σ) = √(np(1 - p)) • Draw a normal curve using the mean and standard deviation • Use the “continuity correction factor,” and add +/- half a unit to the value we want to solve for • Get the z-scores = (x - µ) / σ • Use the normal curve table to solve the problem

  21. Why the “continuity correction factor”? • This is only for discrete values (where values occupy only distinct points). For example, in our study, there is no such thing as a “half” or “3/4” Hot Topic shopper. Either you are a shopper or not. Looking at how histograms are presented, you can see why we have to use the correction factor. • For the probability of getting a value equal to or greater than x (≥), subtract a half unit from x. • For the probability of getting a value equal to or less than x (≤), add a half unit to x. • For the probability of getting exactly x, get the z-scores for a half unit above and a half unit below x.
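The three correction rules above can be written as one small helper. This is a sketch: the function name `corrected_bounds` and its `kind` labels are invented for illustration, and an unbounded side of the interval is represented by infinity.

```python
from math import inf

def corrected_bounds(kind, x):
    """Continuity-corrected interval (low, high) for a whole-number value x.

    kind is one of:
      "at_least" -> x or more:  start half a unit below x
      "at_most"  -> x or fewer: end half a unit above x
      "exactly"  -> exactly x:  half a unit on either side of x
    """
    if kind == "at_least":
        return (x - 0.5, inf)
    if kind == "at_most":
        return (-inf, x + 0.5)
    if kind == "exactly":
        return (x - 0.5, x + 0.5)
    raise ValueError(f"unknown kind: {kind!r}")

print(corrected_bounds("exactly", 13))
print(corrected_bounds("at_least", 30))
print(corrected_bounds("at_most", 25))
```

Each whole number is treated as a histogram bar one unit wide, centered on that number, so the bar for 30 starts at 29.5 and the bar for 25 ends at 25.5.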

  22. Now let’s answer a Hot Topic Question… If you collected a sample of 200 FIT students… • What is the probability that exactly 13 will be Hot Topic shoppers? • What is the probability that you will have 30 or more Hot Topic shoppers? • What is the probability that you will have 25 or fewer Hot Topic shoppers?

  23. Question • What is the probability that exactly 13 will be Hot Topic shoppers? • What is the probability that you will have 30 or more Hot Topic shoppers? • What is the probability that you will have 25 or fewer Hot Topic shoppers? Answer • Get the mean (µ) = expected value = np = (200)(.13) = 26 • Get the standard deviation (σ) = √(np(1 - p)) = √(26(1 - .13)) = √(26(.87)) = √22.62 ≈ 4.76 • Draw a normal curve using the mean and standard deviation. • Use the continuity correction factor to correct x: (a) 12.5 and 13.5, (b) 29.5, (c) 25.5 • Get the z-scores: (a) -2.83 and -2.62, (b) .735, (c) -.105 • Solve the problem: (a) .4977 - .4956 = .0021, or about 0.2%, (b) .50 - .2704 ≈ .23, or 23%, (c) .50 - .0438 = .4562, or about 46% (table value for z = .11)
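The whole Hot Topic calculation can be run end to end. The sketch below assumes `statistics.NormalDist` in place of the printed table, so it uses unrounded z-scores and its results may differ from table-based answers in the third or fourth decimal place.

```python
from math import sqrt
from statistics import NormalDist

n, p = 200, 0.13

mu = n * p                      # mean = expected value = np
sigma = sqrt(n * p * (1 - p))   # standard deviation = sqrt(np(1 - p))
curve = NormalDist(mu, sigma)   # normal curve for the number of shoppers

# (a) exactly 13 shoppers: area between 12.5 and 13.5
p_exactly_13 = curve.cdf(13.5) - curve.cdf(12.5)

# (b) 30 or more shoppers: area above 29.5
p_30_or_more = 1 - curve.cdf(29.5)

# (c) 25 or fewer shoppers: area below 25.5
p_25_or_fewer = curve.cdf(25.5)

print(f"mean = {mu:.0f}, standard deviation = {sigma:.2f}")
print(f"(a) P(exactly 13)  = {p_exactly_13:.4f}")
print(f"(b) P(30 or more)  = {p_30_or_more:.4f}")
print(f"(c) P(25 or fewer) = {p_25_or_fewer:.4f}")
```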

  24. Now let’s do Question 4.16 as a class… In a marketing population of phone calls, 3% produced a sale. If this population proportion (p = 3%) can be applied to future phone calls, then out of 500 randomly monitored phone calls, • How many would you expect to produce a sale? • What is the probability of getting 11 to 14 sales? • What is the probability of getting 12 or fewer sales?

  25. Question 4.16 answers • Expected value = np = 500(.03) = 15 • 32.93% • 25.46%
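One way to see how well the normal curve stands in for the binomial here is to compute both. This sketch goes a step beyond the slides: it sets the normal-curve answers for Question 4.16 next to exact binomial probabilities built from `math.comb`. The helper `binom_pmf` is illustrative and not part of the textbook method.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 500, 0.03

mu = n * p                      # expected number of sales
sigma = sqrt(n * p * (1 - p))   # standard deviation of the number of sales
curve = NormalDist(mu, sigma)

def binom_pmf(k):
    """Exact binomial probability of exactly k sales in n calls."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Normal curve with continuity correction.
normal_11_to_14 = curve.cdf(14.5) - curve.cdf(10.5)
normal_at_most_12 = curve.cdf(12.5)

# Exact binomial sums over the same whole-number ranges.
exact_11_to_14 = sum(binom_pmf(k) for k in range(11, 15))
exact_at_most_12 = sum(binom_pmf(k) for k in range(0, 13))

print(f"expected sales = {mu:.0f}")
print(f"P(11 to 14):    normal {normal_11_to_14:.4f}, exact {exact_11_to_14:.4f}")
print(f"P(12 or fewer): normal {normal_at_most_12:.4f}, exact {exact_at_most_12:.4f}")
```

The normal-curve values should land close to the table-based answers on the slide, with small differences because the code does not round z to two decimals.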
