
Sociology 601, Class 4: September 10, 2009


Presentation Transcript


  1. Sociology 601, Class 4: September 10, 2009 Chapter 4: Distributions • Probability distributions (4.1) • The normal probability distribution (4.2) • Sampling distributions (4.3, 4.4)

  2. 4.1: probability distributions • We study probability to get an idea of how well sample statistics match up to their population parameters • probability: the proportion of times that a particular outcome would occur in a long run of repeated observations • example: you go to Monte Carlo and watch people play roulette. What is the probability of observing the number “23” in a single spin of a roulette wheel with 38 slots? • probability distribution: a listing of possible outcomes for a variable, together with their probabilities
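A worked answer to the roulette example, as a quick check in Stata (a sketch; any of the 38 equally likely slots works the same way):

display 1/38    // probability of "23" on one spin ≈ .0263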

  3. Probability distributions for discrete variables: formulas • let y denote a possible outcome for variable Y, and let P(y) denote the probability of that outcome. • then 0 ≤ P(y) ≤ 1, and Σ P(y) = 1 (summing over all possible y) • the mean of a probability distribution: μ = Σ(y * P(y)) • why do we use μ instead of Ybar? • Is this equation compatible with our formula for a sample mean? • variance of a probability distribution: σ² = Σ((y − μ)² * P(y))
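A minimal worked instance of these formulas, computed in Stata (assumption: one roll of a fair six-sided die, so P(y) = 1/6 for y = 1, …, 6):

* μ = Σ(y * P(y)) for a fair die
display (1+2+3+4+5+6)/6    // 3.5
* σ² = Σ((y − μ)² * P(y))
display ((1-3.5)^2 + (2-3.5)^2 + (3-3.5)^2 + (4-3.5)^2 + (5-3.5)^2 + (6-3.5)^2)/6    // ≈ 2.92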

  4. Probability distributions for discrete variables: 3 flips of a coin
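The chart itself is not reproduced in this transcript; the distribution it tabulated follows directly from the eight equally likely sequences of 3 fair coin flips (y = number of heads):

y:      0     1     2     3
P(y):  1/8   3/8   3/8   1/8

μ = Σ(y * P(y)) = 0(1/8) + 1(3/8) + 2(3/8) + 3(1/8) = 1.5
σ² = Σ((y − μ)² * P(y)) = 2.25(1/8) + .25(3/8) + .25(3/8) + 2.25(1/8) = .75, so σ ≈ .87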

  5. Probability distributions for discrete variables: example (p. 83) we will estimate parameters from this chart:

  6. Calculating the mean, variance, and standard deviation of a probability distribution based on the previous chart:

  7. Probability distributions for continuous variables • So far we have described discrete probability distributions, where the variable can take on only a finite number of values. • As the number of possible values for the variable increases, the probability distribution becomes a continuous function. • In such cases, we must find areas under the density curve to obtain: • the population mean or standard deviation • the probability for a certain range of the y-variable.
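In symbols (a sketch of the idea; f(y) denotes the probability density function):

$$P(a \le Y \le b) = \int_a^b f(y)\,dy, \qquad \mu = \int y\, f(y)\,dy, \qquad \sigma^2 = \int (y - \mu)^2 f(y)\,dy$$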

  8. 4.2: The normal probability distribution Many social and natural variables have a distinctive continuous probability distribution when we measure them, sort of a ‘bell-shaped’ curve, or a normal distribution.

  9. Examples of normal probability distributions Graph on board: • Normal distribution for adult women’s heights:  = 64.3 inches,  = 2.8 inches • Normal distribution for adult men’s heights:  = 69.9 inches,  = 3.0 inches
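A sketch of how the board graph could be reproduced in Stata (assumptions: normalden(x, m, s) is Stata's normal density function, and the 55-to-80-inch range is an arbitrary plotting window):

twoway (function y=normalden(x, 64.3, 2.8), range(55 80)) ///
       (function y=normalden(x, 69.9, 3.0), range(55 80)), ///
       xtitle("height (inches)") legend(order(1 "women" 2 "men"))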

  10. Standardizing scores • Standardizing a score is taking a raw score, a mean, and a standard deviation, and translating the score into a number of standard deviations from the mean. • formula: z = (y − μ) / σ • examples: if y = μ, then z = 0 • if y = μ + σ, then z = 1 • if y = μ + 2σ, then z = 2 • if y = μ − 2σ, then z = −2

  11. Standardizing scores: Examples Calculate a z-score for each example (worked answers below): SAT score: y = 350, μ = 500, σ = 100 SAT score: y = 520, μ = 500, σ = 100 IQ score: y = 88, μ = 100, σ = 15 Woman's height: y = 71, μ = 65, σ = 3.5 Psychological test: y = −2.58, μ = 0, σ = 1
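Worked answers via Stata display (a sketch; each line applies z = (y − μ) / σ):

display (350 - 500)/100    // -1.5
display (520 - 500)/100    // .2
display (88 - 100)/15      // -.8
display (71 - 65)/3.5      // ≈ 1.71
display (-2.58 - 0)/1      // -2.58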

  12. General properties of the normal curve • The normal curve is symmetric about the mean • The normal curve is bell-shaped, with the highest probability occurring at the mean • for z from –1 to +1, the probability is about 0.68 • for z from –2 to +2, the probability is about 0.95 • for z from –3 to +3, the probability is about 0.997 • If a curve is not symmetrical, or if a z-score is inconsistent with the above probabilities, then it is not a normal curve. • any z-score is conceptually possible, because the tails of the normal curve never quite reach a probability of zero.
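These benchmark probabilities can be checked with Stata's cumulative normal function (a quick sketch using normprob, the function the Stata output slide below relies on):

display normprob(1) - normprob(-1)    // ≈ .6827
display normprob(2) - normprob(-2)    // ≈ .9545
display normprob(3) - normprob(-3)    // ≈ .9973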

  13. Formula for a normal probability distribution A normal probability distribution (e.g. the approximate probability distribution for the total of a roll of 100 dice) is based on the formula: f(y) = (1 / (σ√(2π))) * e^(−(y − μ)² / (2σ²)) • Note that μ and σ both appear in the formula: they are the parameters of the distribution. • This formula has no closed-form antiderivative, so it is difficult to calculate directly the probability that an observation will be between y1 and y2.

  14. A dilemma and a solution • The dilemma: the universe is filled with phenomena that have a probability distribution we can’t calculate! • The solution: since this distribution recurs so often, it is worth the effort to painstakingly estimate the probabilities associated with each part of the normal distribution, list them by z-scores, then put all the results in a table for everybody to use. (see Appendix A, page 668) • This is an important purpose of standardization.

  15. Using Table A (page 668) to estimate areas under the normal curve • You are given a z-score and asked to find a p-value Example: z = 1.53, p(z > 1.53) = ? • 1.) Move down to the row with the first decimal (1.5) • 2.) Move across to the column with the second decimal (.03) • 3.) Write the corresponding p-value in an inequality (p(z > 1.53) = .063, by chance alone) • For negative z-scores, use the same procedure but reverse the inequality. (p(z < −1.53) = .063, by chance alone)
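The same tail area can be checked in Stata (a sketch, using the cumulative normal rather than the table):

display 1 - normprob(1.53)    // ≈ .063
display normprob(-1.53)       // ≈ .063, by symmetry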

  16. Using Table A (page 668) to estimate areas under the normal curve Practice these examples: • what is p(z≥ 1.19) by chance alone? • what is p(z≤ - .04) by chance alone ? • what is p(-1 ≤z≤ 1) by chance alone? • what is p(z≤ -1.96) or p(z≥ 1.96) by chance alone? • what is p(|z|≥ 1.96) by chance alone?
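Answers for checking, computed in Stata (a sketch; the last two lines show that the two-tailed probability is just double one tail):

display 1 - normprob(1.19)            // ≈ .117
display normprob(-.04)                // ≈ .484
display normprob(1) - normprob(-1)    // ≈ .683
display normprob(-1.96)               // ≈ .025 in each tail
display 2*(1 - normprob(1.96))        // ≈ .05 for p(|z| ≥ 1.96)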

  17. Reading Stata computer output #1: going between z-statistics and p-values using DISPLAY NORMPROB and DISPLAY INVNORM. Note the small differences between these results and page 668!

display invnorm(.025)
-1.959964
display invnorm(.975)
1.959964
* to verify that +/-1.96 are the z-scores you want
display normprob(-1.96)
.0249979
display normprob(1.96)
.9750021

  18. Notes about working with the normal curve • The table for deriving probabilities only works for normal distributions. • If you have some other distribution, you can still calculate σ and z, but you can't match z to a p-value. • Axis references are often confusing in statistics books: • the x-axis often lists values for what we call the y-variable • the y-axis often has no scale listed at all. It probably should have values for probability per unit of the y-variable. • Tables are also confusing: • some texts provide tables for p(Z < z), while • other texts provide tables for p(Z > z). • To save space, texts don't provide information for z < 0; it is assumed that you understand that the distribution is symmetric.

  19. 4.3: Sampling distributions • Why would we care about a distribution of samples? • We can’t study a population, but we can study a sample. • We can’t know how well this sample reflects the population, but we can use probability theory to study how samples would tend to come out if we did know the characteristics of the population.

  20. Definitions: • Sampling distribution: a probability distribution that gives the probabilities of the possible values of a sample statistic (i.e. a relative frequency distribution of many sample means). • Standard error of a sampling distribution: a measure of the typical distance between a sample mean and the population mean • Standard deviation of a population: a measure of the typical distance between an observation and the population mean.

  21. Equations: • Mean of a sampling distribution: the mean of the sampling distribution of Ybar equals the population mean, μ • Standard error of a sampling distribution: σ(Ybar) = σ / √n, estimated in practice by s / √n • Example: estimate the standard error of this sample: • 1, 3, 5, 5, 5, 7, 9 • Is this estimate the true standard error of the population?
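A worked sketch of the example in Stata (assumption: the seven values are entered as a variable y; summarize leaves r(sd) and r(N) behind for the standard-error calculation):

clear
input y
1
3
5
5
5
7
9
end
summarize y
* estimated standard error = s / √n
display r(sd)/sqrt(r(N))    // ≈ 2.58 / √7 ≈ .98

Because s stands in for the unknown population σ, the result is an estimate of the standard error, not the true value.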

  22. An advantage of large samples: • The central limit theorem. As the sample size n grows, the sampling distribution of Y(bar) approaches a normal distribution. • This is true even for variables that are not normally distributed in the population, such as age or income!
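A minimal simulation sketch of the theorem in Stata (assumptions: 2,000 sample means of n = 30 draws from a right-skewed exponential population, generated as −ln(u) for a uniform u; runiform() is called uniform() in older Stata, and the seed is arbitrary):

clear
set seed 601
set obs 2000
generate ybar = 0
* accumulate the mean of 30 skewed draws, one draw per pass
forvalues i = 1/30 {
    quietly replace ybar = ybar + (-ln(runiform()))/30
}
summarize ybar            // mean ≈ 1, sd ≈ 1/sqrt(30) ≈ .18
histogram ybar, normal    // roughly bell-shaped despite the skewed population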

  23. Why is the central limit theorem a big deal? • When you use a sample statistic to guess a parameter, you will want to know how good your guess is. • If the distribution of sample means about the population mean is normal, you can estimate how far off a given sample mean might be. • With a moderate sample size, the sampling distribution is approximately normal, even if the underlying distribution is not! • However, you still may not have a large enough sample to estimate the parameter with the precision you want.

  24. Another advantage of large samples: • The law of large numbers. The bigger the sample, the closer (on average) the sample statistic is to the parameter. • In other words, as samples become larger, the variation between samples becomes smaller. • Note: the law of large numbers does not involve any sort of telos: a coin has no memory, so the 4th toss is still 50-50 even after three straight heads.

  25. The law of large numbers in action. • Here is the complete sampling distribution of possible sample means for up to four coin tosses • (score variable “heads” = “1” if heads, “0” if tails)
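The chart is not reproduced in this transcript; a reconstruction of the sampling distribution it described follows from the 2^n equally likely sequences of n tosses:

n = 1:  Ybar = 0, 1                 P = 1/2, 1/2
n = 2:  Ybar = 0, .5, 1             P = 1/4, 2/4, 1/4
n = 3:  Ybar = 0, 1/3, 2/3, 1       P = 1/8, 3/8, 3/8, 1/8
n = 4:  Ybar = 0, .25, .5, .75, 1   P = 1/16, 4/16, 6/16, 4/16, 1/16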

  26. The law of large numbers: the standard error of a sample mean shrinks as n increases • Recall the formula for the variance of a probability distribution: • σ² = Σ((y − μ)² * P(y)) • For n = 1, σ² = ((0 − .5)² * .5) + ((1 − .5)² * .5) = .25, so σ = .5 • For n = 2, σ₂² = .125, σ₂ ≈ .35 • For n = 4, σ₄² = .0625, σ₄ = .25 • The standard error is the standard deviation of a distribution of sample means. • This is not the same thing as the standard deviation of a single sample, or the standard deviation of a population. • The sample standard deviation does not shrink as n increases.
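These values match the general shortcut σ(Ybar) = σ / √n (a quick Stata check; σ = .5 is the standard deviation of a single fair coin toss, from the n = 1 line above):

display .5/sqrt(1)    // .5
display .5/sqrt(2)    // ≈ .354
display .5/sqrt(4)    // .25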

  27. Summary: Why we work with samples • On average, a statistic from a good random sample will have the same value as the corresponding population parameter. • With a larger sample, the sample statistic will be closer to the population parameter on average. • If the distribution of sample means is normal, one can make additional guesses about how close the sample statistic might be to the population parameter. • We assume the distribution of sample means is normal … • - If n > 30 (by the central limit theorem), or • - If the population is normally distributed
