Topic 4

Topic 4 Statistical Inference: Confidence Intervals

Parameter and Statistic
A parameter is a number that describes the population. A statistic is a number that describes a sample from the population.
Examples of parameters: (1) mean (µ), (2) standard deviation (σ), (3) proportion (p), (4) correlation (ρ).
Examples of statistics: (1) sample mean (x̄), (2) sample standard deviation (s), (3) sample proportion (p̂), (4) sample correlation (r).

Statistical Inference
• Statistical inference provides methods for drawing conclusions about a population from sample data.
• These methods are:
  • Confidence intervals, which are used to estimate the value of a population parameter.
  • Tests of significance, which assess the evidence for a claim about a population.

Simple Conditions for Inference about a Mean (σ is known)
• The data are an SRS (simple random sample) from the population.
• The population is normally distributed.
• The population standard deviation σ is known.

Confidence Intervals
• A level C confidence interval for a parameter has two parts:
  • An interval calculated from the data, usually of the form Estimate ± Margin of Error. The estimate is our guess for the value of the unknown parameter, and the margin of error shows how accurate we believe that guess is.
  • A confidence level C, which gives the probability that the interval will capture the true parameter value in repeated samples. That is, C is the success rate of the method.
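The "success rate in repeated samples" reading of C can be checked with a small simulation. The sketch below is only an illustration: it assumes a normal population with made-up values µ = 100 and σ = 15, uses the standard z interval x̄ ± z*·σ/√n for a known σ, and the function name coverage is hypothetical.

```python
import random
from statistics import NormalDist

def coverage(mu=100.0, sigma=15.0, n=25, level=0.95, trials=4000, seed=0):
    """Fraction of simulated z intervals that contain the true mean mu."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z_star = NormalDist().inv_cdf((1 + level) / 2)
    margin = z_star * sigma / n ** 0.5
    hits = 0
    for _ in range(trials):
        # Draw one SRS of size n from N(mu, sigma) and compute its mean.
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = sum(sample) / n
        if x_bar - margin <= mu <= x_bar + margin:
            hits += 1
    return hits / trials

rate = coverage()
print(f"observed success rate: {rate:.3f}")  # expected to be close to 0.95
```

Any single interval either contains µ or it does not; the level C describes how often the method succeeds over many samples.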

Confidence Interval (CI) for the Mean µ
Draw an SRS of size n from N(µ, σ). A level C confidence interval for µ is
    x̄ ± z* · σ/√n
Here z* is called a critical value (CV); z* and –z* mark off the middle (100C)% of all values from N(0, 1). Note that the CI is written as est ± (cv)(std), where std = σ/√n is the standard deviation of x̄.
• There is a trade-off between the confidence level and the margin of error: to obtain a smaller margin of error from the same data, you must be willing to accept lower confidence.
• Increasing the sample size reduces the margin of error for any fixed confidence level.
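To make the trade-off concrete, here is a minimal sketch that computes z* from the confidence level and then the margin of error z*·σ/√n. The helper names z_critical and z_margin are illustrative, and σ = 15 and the sample sizes are arbitrary values chosen for the demonstration.

```python
from statistics import NormalDist

def z_critical(level):
    """z* such that the area between -z* and z* under N(0, 1) equals level."""
    return NormalDist().inv_cdf((1 + level) / 2)

def z_margin(level, sigma, n):
    """Margin of error of the z interval for a mean with known sigma."""
    return z_critical(level) * sigma / n ** 0.5

# Higher confidence from the same data means a wider interval.
for level in (0.90, 0.95, 0.99):
    print(level, round(z_critical(level), 3), round(z_margin(level, 15, 25), 2))

# A larger sample narrows the interval at a fixed confidence level.
for n in (25, 100, 400):
    print(n, round(z_margin(0.95, 15, n), 2))
```

Because the margin is proportional to 1/√n, quadrupling the sample size halves the margin of error.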

Examples of Calculating Confidence Intervals for the Mean
Here are the IQ scores of 31 seventh-grade girls randomly chosen from a school district:
114 100 104 89 102 91 114 114 103 105 108 130 120 132 111 128 118 119 86 72 111 103 74 112 107 103 98 96 112 112 93
The sample mean is 105.8387. Suppose that the standard deviation of IQ scores in the population is known to be 15. Find the 95% CI for the mean IQ score of all seventh-grade girls in the school district. (What is the population? What is the sample?) Interpret your result.
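One way to carry out this calculation is sketched below, using the z interval x̄ ± z*·σ/√n with σ = 15 as stated in the example. The variable names are illustrative only.

```python
from statistics import NormalDist

iq = [114, 100, 104, 89, 102, 91, 114, 114, 103, 105, 108, 130, 120, 132,
      111, 128, 118, 119, 86, 72, 111, 103, 74, 112, 107, 103, 98, 96,
      112, 112, 93]

n = len(iq)                            # 31 girls in the sample
x_bar = sum(iq) / n                    # sample mean
sigma = 15                             # population SD, taken as known
z_star = NormalDist().inv_cdf(0.975)   # critical value for 95% confidence
margin = z_star * sigma / n ** 0.5
low, high = x_bar - margin, x_bar + margin

print(f"x_bar = {x_bar:.4f}, margin = {margin:.2f}")
print(f"95% CI: ({low:.1f}, {high:.1f})")
```

The interval comes out at roughly 100.6 to 111.1. The interpretation is that the method used to produce it captures the true district-wide mean in 95% of repeated samples.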

Example: (Plasma Aldosterone in Dogs)
Aldosterone is a hormone involved in maintaining fluid balance in the body. In a veterinary study, 8 dogs with heart failure were treated with the drug Captopril, and plasma concentrations of aldosterone were measured before and after the treatment. Suppose that the before-after change (before – after) in concentration has a normal distribution with standard deviation 15. Display all values on a single graph, with paired values connected by a line. Find the 95% CI for the mean change.
Answer: –24.4 ± 12.95

Choosing the Sample Size
To achieve a margin of error of m for the level C confidence interval of the mean µ, the sample size has to be at least
    n = (z* · σ / m)²
rounded up to the next whole number.

Example of Sample Size Determination (Plasma Aldosterone in Dogs)
Aldosterone is a hormone involved in maintaining fluid balance in the body. In a veterinary study, 8 dogs with heart failure were treated with the drug Captopril, and plasma concentrations of aldosterone were measured before and after the treatment. Suppose that the before-after change (before – after) in concentration has a normal distribution with standard deviation 15. How large a sample of dogs would be needed to estimate the mean change in plasma concentration of aldosterone to within ± 8 with 95% confidence? That is, to reduce the margin of error to 8, how many more dogs are needed?
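A minimal sketch of this sample-size calculation follows, assuming the usual rule n ≥ (z*·σ/m)² rounded up to a whole number. The function name sample_size_for_mean is hypothetical.

```python
import math
from statistics import NormalDist

def sample_size_for_mean(margin, sigma, level):
    """Smallest n for which the z interval has margin of error <= margin."""
    z_star = NormalDist().inv_cdf((1 + level) / 2)
    return math.ceil((z_star * sigma / margin) ** 2)

n_needed = sample_size_for_mean(margin=8, sigma=15, level=0.95)
extra_dogs = n_needed - 8  # the study already has 8 dogs

print(n_needed, extra_dogs)
```

Under these assumptions (1.96 · 15 / 8)² is about 13.5, so 14 dogs are needed in total, which is 6 more than the original 8.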

Conditions for Inference About a Mean (σ is unknown)
• The data are an SRS from the population.
• The SRS comes from a population that has a normal distribution, or the sample size is large (n > 30).

The 1-Sample t Confidence Interval for µ
• When σ is unknown, a level C confidence interval for µ is
    x̄ ± t* · s/√n
  where s is the sample standard deviation and t* is the critical value for the t density curve with n – 1 degrees of freedom, with area C between –t* and t*.

Example: 1-Sample t Confidence Interval for µ
The height (in inches) of adult males in the United States is believed to be normally distributed with mean µ. The average height of a random sample of 25 American adult males is found to be x̄ = 69.72 inches, and the standard deviation of the 25 heights is found to be s = 4.15. What is the 90% t confidence interval for µ?
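The calculation can be sketched as follows, using the one-sample t interval x̄ ± t*·s/√n with n – 1 = 24 degrees of freedom. The Python standard library has no t distribution, so the critical value is typed in from a standard t table; the helper name t_interval is illustrative.

```python
# Critical value for 90% confidence with 24 degrees of freedom,
# copied from a standard t table (not computed here).
T_STAR_90_DF24 = 1.711

def t_interval(x_bar, s, n, t_star):
    """Return (low, high, margin) for the one-sample t interval."""
    margin = t_star * s / n ** 0.5
    return x_bar - margin, x_bar + margin, margin

low, high, margin = t_interval(x_bar=69.72, s=4.15, n=25, t_star=T_STAR_90_DF24)
print(f"69.72 ± {margin:.2f}  ->  ({low:.2f}, {high:.2f})")
```

With these inputs the margin is about 1.42 inches, giving an interval of roughly 68.30 to 71.14 inches.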

Example: 1-Sample t Confidence Interval for µ
Do students tend to improve their SAT mathematics (SAT-M) score the second time they take the test? A random sample of four students who took the test twice received the following scores.

Student         1    2    3    4
First score   450  520  720  600
Second score  440  600  720  630

Assuming that the change in SAT-M score (second score – first score) for the population of all students taking the test twice is normally distributed with mean µ, a 95% confidence interval for µ is
A. 25.0 ± 64.29.
B. 25.0 ± 56.09.
C. 25.0 ± 39.60.
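Because each student contributes two scores, the data are paired, and the one-sample t interval is applied to the four differences (second – first). The sketch below assumes that approach; the critical value for 3 degrees of freedom is taken from a standard t table, and the variable names are illustrative.

```python
from statistics import mean, stdev

first = [450, 520, 720, 600]
second = [440, 600, 720, 630]

# Work with the per-student change, second score minus first score.
diffs = [b - a for a, b in zip(first, second)]
n = len(diffs)
d_bar = mean(diffs)       # mean change
s_d = stdev(diffs)        # sample SD of the changes (divides by n - 1)

T_STAR_95_DF3 = 3.182     # 95% critical value, 3 df, from a t table
margin = T_STAR_95_DF3 * s_d / n ** 0.5

print(diffs, d_bar, round(s_d, 2), round(margin, 2))
```

The differences are –10, 80, 0 and 30, with mean 25 and standard deviation about 40.41, so the margin of error is about 64.3. Up to table rounding, this matches option A.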

Skip the Remaining Slides? • You read them.

Confidence Interval for the Proportion p
• Draw an SRS of size n from a large population that contains proportion p of successes. Denote the sample proportion of successes by p̂.
• A large-sample level C confidence interval for p is
    p̂ ± z* · √( p̂(1 – p̂) / n )
  where z* is the critical value such that z* and –z* mark the central area C under the standard normal density curve.

Example: Confidence Interval for a Proportion p
Roll a die 25 times; the side 1 appears 4 times. Find a 95% confidence interval for p, the probability of rolling a 1. (0.16 ± 0.144)
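As a check on the arithmetic, the large-sample interval p̂ ± z*·√(p̂(1 – p̂)/n) can be evaluated directly. This is a minimal sketch and the function name proportion_interval is hypothetical.

```python
from statistics import NormalDist

def proportion_interval(successes, n, level):
    """Return (p_hat, margin) for the large-sample z interval for p."""
    p_hat = successes / n
    z_star = NormalDist().inv_cdf((1 + level) / 2)
    margin = z_star * (p_hat * (1 - p_hat) / n) ** 0.5
    return p_hat, margin

p_hat, margin = proportion_interval(successes=4, n=25, level=0.95)
print(f"{p_hat:.2f} ± {margin:.3f}")
```

Here p̂ = 4/25 = 0.16, the standard error is √(0.16 · 0.84 / 25), about 0.0733, and the margin of error is about 0.144.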

Sample Size for Desired Margin of Error
How large a sample is needed to get a margin of error approximately equal to m? The sample size is
    n = (z* / m)² · p*(1 – p*)
where z* is the critical value such that z* and –z* mark the central area C under the standard normal density curve, and p* is a guessed value for the sample proportion.
• A safe sample size (the maximum sample size needed) can be obtained by choosing p* to be 0.5.
• p* is often estimated using a previous sample as a pilot study.

Example: A Gallup Poll asked a sample of Canadian adults if they thought the law should allow doctors to end the life of a patient who is in great pain and near death if the patient makes a request in writing. The poll included 270 people in Quebec, 221 of whom agreed that doctor-assisted suicide should be allowed.
(a) What is the margin of error of the large-sample 95% confidence interval for the proportion of all Quebec adults who think doctor-assisted suicide should be allowed?
(b) How large a sample is needed to get the common 3 percentage point margin of error? Use the previous sample as a pilot study to get p*.
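Both parts can be sketched in a few lines, assuming the large-sample margin z*·√(p̂(1 – p̂)/n) for part (a) and the rule n = (z*/m)²·p*(1 – p*), rounded up, for part (b). The variable names are illustrative.

```python
import math
from statistics import NormalDist

z_star = NormalDist().inv_cdf(0.975)   # 95% confidence

# (a) Margin of error from the Quebec sample.
n = 270
p_hat = 221 / n
margin = z_star * (p_hat * (1 - p_hat) / n) ** 0.5

# (b) Sample size for a 3-point margin, using p_hat as the pilot value p*.
m = 0.03
p_star = p_hat
n_needed = math.ceil((z_star / m) ** 2 * p_star * (1 - p_star))

# For comparison, the "safe" size obtained with p* = 0.5.
n_safe = math.ceil((z_star / m) ** 2 * 0.5 * 0.5)

print(round(p_hat, 4), round(margin, 4), n_needed, n_safe)
```

Under these assumptions the margin in (a) is about 0.046, or 4.6 percentage points, and part (b) calls for about 635 respondents. The safe choice p* = 0.5 gives a larger answer because p*(1 – p*) is largest at 0.5.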

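Comparing Two Population Means
• Conditions for inference comparing two means:
  • Need two independent SRSs, from two distinct populations.
  • Both populations are normally distributed or the sample sizes are large (> 30).
• Notation: population i (i = 1, 2) has mean µi and standard deviation σi; the SRS drawn from it has size ni, sample mean x̄i, and sample standard deviation si.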

Two Sample t Procedure: Confidence Interval
Draw an SRS of size n1 from a normal population with unknown mean µ1, and draw an independent SRS of size n2 from another normal population with unknown mean µ2. A level C confidence interval for µ1 – µ2 is given by
    (x̄1 – x̄2) ± t* · √( s1²/n1 + s2²/n2 )
where t* is the critical value with area C between –t* and t* under the t density curve with degrees of freedom equal to the smaller of n1 – 1 and n2 – 1.

Example: the 2-Sample t Confidence Interval Procedure
Scores this year on the SAT mathematics test (SAT-M) for students taking the test for the first time are believed to be normally distributed with mean µ1. For students taking the test for the second time, this year's scores are also believed to be normally distributed but with a possibly different mean µ2. We wish to estimate the difference. A random sample of the SAT-M scores of 100 students who took the test for the first time this year was obtained, and the mean of these 100 scores was x̄1 = 504.5, while the standard deviation was s1 = 100. A random sample of the SAT-M scores of 30 students who took the test for the second time this year was also obtained, and the mean of these 30 scores was x̄2 = 539.1, while the standard deviation was s2 = 90. A 95% confidence interval for µ2 – µ1 is
A. 34.6 ± 32.68.
B. 34.6 ± 39.34.
C. 34.6 ± 37.70.
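A sketch of the two-sample calculation follows, using the interval (x̄2 – x̄1) ± t*·√(s1²/n1 + s2²/n2) with the conservative degrees of freedom min(n1 – 1, n2 – 1) = 29. The critical value is copied from a standard t table and the helper name two_sample_margin is hypothetical.

```python
def two_sample_margin(s1, n1, s2, n2, t_star):
    """Margin of error of the two-sample t interval (unpooled variances)."""
    standard_error = (s1 ** 2 / n1 + s2 ** 2 / n2) ** 0.5
    return t_star * standard_error

T_STAR_95_DF29 = 2.045  # 95% critical value, 29 df, from a t table

estimate = 539.1 - 504.5  # x_bar2 - x_bar1
margin = two_sample_margin(s1=100, n1=100, s2=90, n2=30, t_star=T_STAR_95_DF29)

print(f"{estimate:.1f} ± {margin:.2f}")
```

The standard error is √(100 + 270), about 19.24, so the margin is about 39.34, which corresponds to option B. Option C is what results from using z* = 1.96 in place of t*.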

Example: the 2-Sample t Confidence Interval Procedure
• A sports writer wished to see if a football filled with helium travels farther, on average, than a football filled with air. To test this, the writer used eighteen adult male volunteers. These volunteers were randomly divided into two groups of nine subjects each. Group 1 kicked a football filled with helium to the recommended pressure. Group 2 kicked a football filled with air to the recommended pressure. The mean yardage for Group 1 was x̄1 = 300 yards with a standard deviation s1 = 8 yards. The mean yardage for Group 2 was x̄2 = 296 yards with a standard deviation s2 = 6 yards. Assume the two groups of kicks are independent. Let µ1 and µ2 represent the mean yardage we would observe for the entire population represented by the volunteers if all members of this population kicked, respectively, a helium- and air-filled football. Assuming two-sample t procedures are safe to use, a 90% confidence interval for µ1 – µ2 is (use the conservative value for the degrees of freedom)
A. 4 ± 5.5 yards.
B. 4 ± 6.2 yards.
C. 4 ± 7.7 yards.
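The conservative degrees-of-freedom rule is the key step here, so the sketch below makes it explicit. The small lookup table holds a single 90% critical value copied from a standard t table, and the names T_TABLE_90 and conservative_df are illustrative.

```python
# 90% two-sided critical values from a standard t table, keyed by df.
# Only the entry needed for this example is included.
T_TABLE_90 = {8: 1.860}

def conservative_df(n1, n2):
    """Smaller of n1 - 1 and n2 - 1, the conservative choice of df."""
    return min(n1 - 1, n2 - 1)

n1, x_bar1, s1 = 9, 300, 8   # helium group
n2, x_bar2, s2 = 9, 296, 6   # air group

df = conservative_df(n1, n2)
t_star = T_TABLE_90[df]
standard_error = (s1 ** 2 / n1 + s2 ** 2 / n2) ** 0.5
margin = t_star * standard_error

print(df, x_bar1 - x_bar2, round(margin, 1))
```

With 8 degrees of freedom and a standard error of √(100/9), about 3.33 yards, the margin is about 6.2 yards, which corresponds to option B. The interval 4 ± 6.2 contains 0.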
