Basic Quantitative Methods in the Social Sciences (AKA Intro Stats)

Basic Quantitative Methods in the Social Sciences (AKA Intro Stats) 02-250-01 Lecture 5

Sampling Distributions • Inferential Statistics generalizes findings obtained from samples to the populations that the samples were drawn from • Samples need to be representative of the populations they are drawn from – so we use random sampling

Random Sample • Random Sample: a sample in which each member of the population has an equal chance of being included • We cannot assume that a random sample is exactly representative of its population • E.g., randomly choosing 50 students from this class – their mean age may not be exactly the mean age of the entire class (the population – approx 230 students)

Random Sampling • Random sampling makes all the samples which could be drawn from the population equally likely (e.g., who is included in the 50 student sample) • Each of the possible samples of 50 students would have mean ages that would slightly differ from the population mean age • We measure this difference with sampling error

Sampling Error • Sampling Error: the difference between a statistic and the parameter it estimates • E.g., if the population mean age was 24 and the sample mean age was 21, we say we have a sampling error of 3 years

Sampling Error • Because we usually don’t collect data for an entire population, we must have some way of estimating the size of the sampling error and accounting for it when we generalize sample information to populations • We often obtain more samples to determine the sampling error

Sampling Distributions • If we draw 6 samples of 50 students from this class, we can obtain a better estimate of the true population mean age than if we only drew one sample • Suppose the mean ages for those 6 samples were as follows: 25, 23, 23, 25, 25, 26 • The mean of these 6 mean ages is 24.5

Sampling Distributions • Looking at the mean age of the first sample, 25 years, if we only had data for this one sample, 25 years would be our best estimate of the true population mean • By taking more than one sample, we calculate a more accurate estimate of the population mean, 24.5 years
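As a quick check of the arithmetic on these slides, the six sample means can be averaged in a few lines of Python. This is only a sketch; the variable names are illustrative and not part of the lecture.

```python
from statistics import mean

# Mean ages of the six samples of 50 students (values from the slide)
sample_means = [25, 23, 23, 25, 25, 26]

# The mean of the sample means is the combined estimate of the population mean
estimate = mean(sample_means)
print(estimate)  # 24.5
```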

Sampling Error • Since all of the ages are relatively close to each other, we can say with greater certainty that we have small sampling error for any one of the sample means • If the samples’ mean ages were much more dissimilar, any one of the sample age means would probably have a much higher sampling error

Sampling Error • This means that the variability of a statistic over repeated samplings gives us some indication of sampling error • If we continued to draw samples from the population until all possible samples had been drawn, and entered the statistic of interest (mean age) from each sample into a frequency distribution, the result would be a sampling distribution
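One rough way to see how the variability of sample means hints at sampling error is to compare the spread of two sets of sample means. The sketch below uses the six means from the earlier slide and a second, made-up set that is much more dissimilar. The second set is hypothetical and only there for contrast.

```python
from statistics import stdev

close_means = [25, 23, 23, 25, 25, 26]   # sample means from the earlier slide
spread_means = [18, 31, 22, 29, 20, 27]  # hypothetical, much more dissimilar means

# A larger standard deviation among sample means suggests larger sampling error
close_spread = stdev(close_means)
wide_spread = stdev(spread_means)
print(round(close_spread, 2), round(wide_spread, 2))  # 1.22 5.24
```

Both sets average 24.5, yet any single mean from the second set is likely to sit much further from the population mean.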

Sampling Distributions • Sampling Distribution: the distribution of a statistic over repeated sampling from a specified population • Using our previous example, the sampling distribution of the mean for this class is a distribution of the means of every possible sample of 50 students
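Listing every possible sample of 50 from a class of about 230 is not practical, but the idea can be approximated by drawing many random samples. The sketch below assumes a made-up population of 230 ages (not real class data) and 2000 repeated samples. Both choices are illustrative.

```python
import random
from statistics import mean

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: ages of 230 students (made-up values)
population = [random.randint(18, 35) for _ in range(230)]
population_mean = mean(population)

# Draw many random samples of 50 (without replacement) and record each mean age
sample_means = [mean(random.sample(population, 50)) for _ in range(2000)]
mean_of_means = mean(sample_means)

# The mean of the sample means should sit very close to the population mean
print(round(population_mean, 2), round(mean_of_means, 2))
```

The list `sample_means` is an approximation of the sampling distribution of the mean for this hypothetical population.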

Expected Value • The mean of a sampling distribution of $\bar{X}$ is known as the expected value of the mean ($\mu_{\bar{X}}$) = the mean of the sample means • We use the symbol $\mu$ instead of $\bar{X}$ for the mean of a sampling distribution because the sampling distribution is itself a population (of sample means)

Standard Error • The standard deviation of a sampling distribution is known as the standard error ($\sigma_{\bar{X}}$) = the standard amount of difference between $\bar{X}$ and $\mu$ that is reasonable to expect simply by chance • The mean of any sample we take can be plotted on the sampling distribution of $\bar{X}$ if we know $\mu_{\bar{X}}$ and $\sigma_{\bar{X}}$ • The sampling distribution of $\bar{X}$ is approximately a normal distribution when the samples are reasonably large

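Sampling Distribution • [Figure: a sampling distribution centred on $\mu_{\bar{X}} = \mu$, with a sample mean $\bar{X}$ obtained from one sample marked on it; the distance between the two is the sampling error]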

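Standard Error • The formula for standard error is as follows: $\sigma_{\bar{X}} = \sigma / \sqrt{n}$, where $\sigma$ is the population standard deviation and $n$ is the sample size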
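The standard error of the mean is conventionally computed as the population standard deviation divided by the square root of the sample size. A minimal sketch of that calculation follows. The function name and the example values are assumptions chosen for illustration, not figures from the lecture.

```python
import math

def standard_error(sigma, n):
    """Standard error of the mean: population SD divided by sqrt(sample size)."""
    return sigma / math.sqrt(n)

# Hypothetical values: population SD of 10, samples of 25
print(standard_error(10, 25))   # 2.0
# Larger samples give a smaller standard error
print(standard_error(10, 100))  # 1.0
```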

Sampling Distributions • We usually don’t know $\mu_{\bar{X}}$ and $\sigma_{\bar{X}}$ and must estimate $\sigma_{\bar{X}}$ • Sampling Distributions are the basis for many statistical tests (e.g., t-test – we’ll talk about this later) • Statistical tests are a mathematical way of testing a hypothesis

Hypothesis Testing • Hypothesis testing is a way of examining a statement about a relationship between independent and dependent variables: • Independent variable: the variable whose effects the experimenter is interested in studying • Dependent variable: the variable that the experimenter measures (the data)

Independent and Dependent Variables - Example • If an experimenter is interested in researching how hours of studying for an exam affect performance on a test, the variables are as follows: • Independent Variable (IV): hours spent studying • Dependent Variable (DV): performance on test (e.g., grade received)

Independent Variables • There are 2 broad types of IVs: • Treatment Variable: a treatment the experimenter applies to previously undifferentiated participants • E.g., certain participants are told to study for 5 hours and others are told to study for 2 hours • Categorical Variable: a characteristic that is inherent to, or pre-existing in, the participant • E.g., gender – you can’t assign someone a gender

Levels of IV • We also talk about the levels of IVs – how we break down the IV • E.g., if we are interested in studying the IV of hours spent studying, it could have 2 levels – 2 hours and 5 hours • Studying the IV of gender has 2 levels – male and female • The levels of an IV are compared on their DV scores to look for a difference in outcome – is there a difference in test performance between those who study for 5 hours and those who study for 2?
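To make the comparison of IV levels concrete, the sketch below computes the mean DV score at each level of the IV "hours spent studying". The test scores are invented purely for illustration and do not come from any study.

```python
from statistics import mean

# Hypothetical test scores (DV) for each level of the IV "hours spent studying"
scores_by_level = {
    "2 hours": [62, 70, 65, 68],
    "5 hours": [75, 80, 72, 77],
}

# Mean DV score at each level of the IV
level_means = {level: mean(scores) for level, scores in scores_by_level.items()}

# Observed difference in outcome between the two levels
difference = level_means["5 hours"] - level_means["2 hours"]
print(level_means, difference)
```

Whether a difference like this reflects the IV or only sampling error is the question that hypothesis testing addresses.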

Time to Think • A nursing researcher wants to know if giving TLC prolongs life in cancer patients. 50 cancer patients are divided into two groups: group A (n=25) is given TLC by their nurses, and group B (n=25) is not. What are the DV, the IV, and the levels of the IV? • A researcher wants to know if members of the Federal Liberal Party are wealthier than members of the Federal NDP. 100 members of each party are asked to submit financial statements. What are the DV, the IV, and the levels of the IV?

Null Hypothesis • Tests of hypotheses in science are decisions to retain or reject a null hypothesis (Ho) • Null hypothesis (Ho) : a statement of relationship between the IV and DV, usually a statement of no difference or no relationship – we assume there is no relationship between IV and DV

Null Hypothesis Examples • Men and women do not differ in IQ (men = women) • Hours spent studying do not affect test performance (2 hours = 5 hours) • Height does not affect weight (short = tall)

Null Hypotheses • Null hypotheses contain 3 components: • The IV comparison being made • The DV being measured • The null relationship between IV and DV (e.g., “do not differ”)

Alternative Hypothesis • Although not directly tested, the Alternative Hypothesis (Ha) does state a relationship, or effect, of the IV on the DV – this is often called the Research Hypothesis • E.g., • Ha: Men and women do differ in IQ (men ≠ women) • Ha: Women have higher IQs than men (women > men)

Directional Ha • Ha: Women have higher IQs than men (women > men) is a directional alternative hypothesis – we state that one level of the IV will have greater (or lesser) DV scores than the other level • When we make a directional alternative hypothesis, we have a reason (either based on past research or a theory) to predict the direction of the results (i.e., that a statistic at one level of the IV will be greater or less than the statistic at the other level of the IV) (note: the above example is hypothetical only)

Non-Directional Ha • A non-directional alternative hypothesis does not state the expected direction of effect: Ha: Men and women have differing IQs (women ≠ men) • We make a non-directional alternative hypothesis when we have no reason to predict the direction of the results. For instance, since there is no theory or research body that would suggest that women should have higher IQs than men, we would only predict that their IQs are different from men’s

Hypothesis Testing • Hypothesis testing looks at the observed difference in DV scores between the levels of the IV and compares this difference to the expected difference (Ho) • Any difference in value of the DV between the levels of the IV can be explained in 2 ways – the effect of the IV or sampling error

Hypothesis Testing • Testing the null hypothesis is a way of determining the probability that the observed outcome could be found if the null hypothesis were true • E.g., if we did find a difference between the IQs of men and women, what is the chance we would find this result if there is actually no difference between their IQs?

Confidence Levels • When this probability drops below a certain level (a criterion level), we call the result significant • This criterion level is known as the confidence level of the test, or alpha (α); many textbooks call α the significance level

Confidence Level • Confidence Level: a criterion level of probability (alpha, α), set by the experimenter, which acts as the reference for deciding whether to reject or retain the null hypothesis • Significant Result at .05: we reject the null hypothesis, knowing that a result this extreme would occur only 5% of the time if the null hypothesis were actually true

Confidence Level • The confidence level is set by the experimenter, but generally the convention is to use α = 0.05 or α = 0.01 • For α = 0.05, this means that there is a 5% chance we will reject the null hypothesis when it is actually true

Rejecting the Null Hypothesis • If the likelihood of observing this outcome when Ho is true is below the confidence level (α = 0.05 or α = 0.01), then we say that the result is significant and we reject the null hypothesis • Significant results reject Ho (there is a difference) • Non-significant results retain Ho (no difference has been demonstrated)
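The decision rule on this slide can be written as a short function. In this sketch, `p_value` is an assumed name for the probability of the observed outcome if Ho were true, and `decide` is a hypothetical helper, not something defined in the lecture.

```python
def decide(p_value, alpha=0.05):
    """Return the decision about Ho for an observed probability and criterion alpha."""
    if p_value < alpha:
        return "reject Ho (significant)"
    return "retain Ho (not significant)"

# The same observed probability can lead to different decisions at different alphas
print(decide(0.03))              # reject Ho (significant)
print(decide(0.03, alpha=0.01))  # retain Ho (not significant)
```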

Type I and Type II Errors • When we decide to retain or reject the null hypothesis, we never do so with 100% certainty that we are making the right decision – we make the decision at a chosen risk of error (the alpha level) • We can make an incorrect decision, resulting in 2 types of errors, Type I or Type II

Type I Errors • Type I Error: Rejection of the null hypothesis when it is true • We conclude that the IV affects or is related to the DV when in reality the result was due to sampling error • We see something that is not really there

Type I Error Example • If our null hypothesis is that men and women do not differ in IQ, the Type I error is: Finding a result that men and women do differ in IQ, when in reality they do not • We find this difference because of sampling error

Type II Errors • Type II Error: Retention of the null hypothesis when it is false • We conclude that the IV does not affect or is not related to the DV when in reality there is an effect or relationship • We fail to see something that is really there

Type II Error Example • If our null hypothesis is that men and women do not differ in IQ, the Type II error is: Finding a result that men and women do not differ in IQ, when in reality they do

Type I and Type II Errors

Type I and Type II Errors • The probability of making a Type I error is equal to the confidence level of the statistical test (α = 0.05 or α = 0.01) • When you lower the probability of making a Type I error (e.g., use α = 0.01 instead of α = 0.05) you increase the probability of making a Type II error
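A small simulation can illustrate why the Type I error rate matches alpha. The sketch below repeatedly samples from a population in which Ho is true and counts how often a two-tailed z-test rejects it. The population values (mean 100, SD 15), the sample size of 25, and the z formula used here are assumptions made for this illustration.

```python
import math
import random
from statistics import NormalDist

random.seed(0)  # fixed seed so the sketch is repeatable

MU, SIGMA, N = 100, 15, 25  # hypothetical population mean, SD, and sample size

def type_i_rate(alpha, trials=5000):
    """Share of samples, drawn with Ho true, that a two-tailed z-test rejects."""
    critical = NormalDist().inv_cdf(1 - alpha / 2)  # cutoff for each tail
    rejections = 0
    for _ in range(trials):
        sample = [random.gauss(MU, SIGMA) for _ in range(N)]
        sample_mean = sum(sample) / N
        # Distance of the sample mean from MU in standard-error units
        z = (sample_mean - MU) / (SIGMA / math.sqrt(N))
        if abs(z) > critical:
            rejections += 1
    return rejections / trials

rate_05 = type_i_rate(0.05)
rate_01 = type_i_rate(0.01)
print(rate_05, rate_01)  # close to 0.05 and 0.01 respectively
```

Every rejection in this simulation is a Type I error, because the samples really do come from a population with mean 100.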

Forget About It! • For this class, you do not need to know how to determine the numerical value of a Type II error, nor do you need to understand power • You do need to understand what a Type II error is

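Consider a Sampling Distribution of Arts Students’ GPAs. • [Figure: a sampling distribution centred on the population mean of 6, with an observed sample mean of 10 marked to its right; the distance between the two is the sampling error]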

What might this mean? • This sample’s mean (10) appears to be substantially larger than the population mean (6). Why might this be? • Perhaps there is something distinct about this sample such that it is not really part of this sampling distribution to begin with (e.g., maybe there are gifted arts students) • Alternatively, perhaps it’s just a fluke, and we just happened to have sampled a bunch of good arts students. Stated differently, perhaps this sample mean is part of the sampling distribution of arts students

Reminder • We can determine the proportion of scores (in this situation, sample means) that would fall to the right of the sample mean in question by looking at a normal distribution table (Table E.10). • To do so, we need to know the Z value of this sample mean. We will come back to this (but for the sake of clarity, note that we will be learning to calculate a z-test, which uses a slightly different formula than the z-score formula that you know)
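The table lookup described here can also be done in code. The sketch below takes the population mean (6) and sample mean (10) from the GPA example, but the standard error of 2 is a hypothetical value, since the lecture does not give one. It also assumes the Z value is the distance between the two means in standard-error units.

```python
from statistics import NormalDist

population_mean = 6   # from the GPA example
sample_mean = 10      # from the GPA example
standard_error = 2    # hypothetical value, not given in the lecture

# Z value of the sample mean, in standard-error units
z = (sample_mean - population_mean) / standard_error

# Proportion of sample means expected to fall to the right of this one
proportion_right = 1 - NormalDist().cdf(z)
print(z, round(proportion_right, 4))  # 2.0 0.0228
```

Under these assumed numbers, only about 2% of sample means would be this large by chance alone.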

One vs. Two Tailed Tests • The “tails” of a test set up our rejection region – they determine how we decide to retain or reject Ho • When we use a one-tailed test, we are testing the null hypothesis for a directional alternative hypothesis (e.g., Ha: women will have higher IQs than men) • We are only interested in whether or not women have higher IQs than men, not lower

Two-Tailed Tests • When we use a two-tailed test, we are testing the null hypothesis for a non-directional alternative hypothesis (e.g., Ha: women and men will have different IQs) • Here, we are interested in whether women have either higher or lower IQs than men

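One vs. Two Tailed Tests (using α = 0.05) • [Figure: rejection regions. Two-tailed test: 2.5% in each tail. One-tailed test: 5% in a single tail, either upper or lower]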
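For a statistic that follows the standard normal distribution, the cutoffs that mark these rejection regions can be computed directly. This is a sketch under that assumption. The t-tests mentioned later in the course use different cutoffs.

```python
from statistics import NormalDist

alpha = 0.05
z = NormalDist()  # standard normal distribution

# Two-tailed test: alpha is split, leaving 2.5% in each tail
two_tailed_cutoff = z.inv_cdf(1 - alpha / 2)
# One-tailed test: all 5% sits in one tail (the upper tail here)
one_tailed_cutoff = z.inv_cdf(1 - alpha)

print(round(two_tailed_cutoff, 2), round(one_tailed_cutoff, 3))  # 1.96 1.645
```

The one-tailed cutoff is closer to the centre, so a result in the predicted direction reaches significance more easily than it would in a two-tailed test.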

Two-Tailed Tests • Once we begin discussing t-tests, you will see that the value that determines whether or not our observed statistic falls above or below the α = 0.05 cutoff depends on a number of factors • For now, know that we reject Ho if our observed statistic is significantly different from our expected statistic (in a two-tailed test, either greater or less)

Test Statistics • A test statistic is a number calculated from the scores of a sample that allows us to test a Null Hypothesis and make a decision to reject or retain the Ho • We will be talking about various test statistics for the remainder of the term, and will begin with the z-statistic today

Z-scores Revisited • We know, by using the z-score formula, the probability of obtaining a score less than a given X value in a standard normal distribution • E.g., when
