Categorical Data Analysis Review for Final

Categorical Data Analysis Review for Final • Sucharita & Cookie • Spring 2013

Overview • Binary Logistic Regression • Resampling

Odds & Odds Ratio • Odds: frequency of ‘A’ divided by frequency of ‘not A’ • (see CD06, p. 6) The odds that a woman drinks are 524/358 = 1.464 • Interpretation: when odds > 1, the frequency of ‘A’ is larger than the frequency of ‘not A’ • Odds ratio: the odds of A in Group 1 divided by the odds of A in Group 2 • Used to measure the association between two dichotomous variables • Effect size (but not easy to interpret) • Interpretation of odds ratio: • (1) The odds of A in Group 1 are _____ times the odds of A in Group 2 • (2) The odds of A in Group 2 are (1 / _____) times the odds of A in Group 1 • Expected odds ratio when there is no effect: 1.00 • Test the significance of odds ratios with: • Chi-square • Wald (binary logistic regression)
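The odds arithmetic on this slide can be sketched in a few lines of Python. The women's counts (524 and 358) come from the slide; the men's counts are hypothetical placeholders, included only so that an odds ratio can be formed, and the function names are illustrative.

```python
def odds(count_a, count_not_a):
    """Odds of A: frequency of A divided by frequency of not-A."""
    return count_a / count_not_a


def odds_ratio(odds_group1, odds_group2):
    """Odds of A in Group 1 relative to the odds of A in Group 2."""
    return odds_group1 / odds_group2


# Counts from the slide: 524 women drink, 358 do not.
odds_women = odds(524, 358)

# HYPOTHETICAL counts for men (not from the source), for illustration only.
odds_men = odds(600, 300)

or_women_vs_men = odds_ratio(odds_women, odds_men)
or_men_vs_women = odds_ratio(odds_men, odds_women)  # reciprocal of the above

print(f"odds (women)     = {odds_women:.3f}")
print(f"OR women vs. men = {or_women_vs_men:.3f}")
print(f"OR men vs. women = {or_men_vs_women:.3f}")
```

Swapping the two groups always yields the reciprocal, which is what interpretation (2) on the slide expresses.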

Binary Logistic Regression • When: • DV is a binary categorical variable • Predictors are categorical, continuous, or both • Important statistics: • Wald statistic (p06 p.25): used to determine the statistical significance of a variable • Wald Alone: dummy vs. all other dummies, ignoring all other variables • Wald at Entry: dummy vs. referent group, controlling for all other variables • Odds ratio at Final Model: effect sizes

DATA • World Values Survey (between Sep 19th and Sep 29th, 2006) (N = 1174) • Predicted variable: • Generally speaking, would you say that most people can be trusted or that you need to be very careful in dealing with people? • Dichotomous DV: (0 = need to be careful, 1 = trusted) • Predictors of interest: • gender (1 = male, 2 = female) • age (continuous variable) • general political party preference (1 = Republican, 2 = Democrat, 3 = Independent, 4 = Non-partisan)

Check for the reference group • Reference group for political party preference?
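One way to see what a reference group does is to write out the dummy coding by hand. The sketch below assumes that Republican (code 1) is the referent, which is only inferred from the Research questions slides that compare other parties with Republicans; the actual setting should be confirmed in the output. The names `PARTY_LABELS`, `REFERENCE_CODE`, and `dummy_code` are illustrative.

```python
# Category codes as listed on the DATA slide.
PARTY_LABELS = {
    1: "Republican",
    2: "Democrat",
    3: "Independent",
    4: "Non-partisan",
}

# ASSUMPTION: Republican is the reference group (inferred, not stated).
REFERENCE_CODE = 1


def dummy_code(party_code):
    """Return one 0/1 indicator for each non-reference category."""
    if party_code not in PARTY_LABELS:
        raise ValueError(f"unknown party code: {party_code}")
    return {
        label: int(code == party_code)
        for code, label in PARTY_LABELS.items()
        if code != REFERENCE_CODE
    }


for code, label in PARTY_LABELS.items():
    print(label, dummy_code(code))
```

The reference group is the one row in which every indicator is 0, so each dummy's Exp(B) compares that category with the referent.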

Research questions • Q1: Did more or fewer respondents say that most people can be trusted than say that they need to be very careful in dealing with people? • Dichotomous DV: (0 = need to be careful, 1 = trusted) • Step 1: output → Block 0 • Step 2: report Wald (df = 1, N = 1174) = 45.239, p < .001, Exp(B) = .670 • Step 3: interpretation → The odds of trusting are .670. In other words, significantly fewer respondents said that most people can be trusted than said that they need to be very careful in dealing with people. • Odds of trusted = # trusted / # non-trusted = 471/703 = .670 • Odds of non-trusted = # non-trusted / # trusted = 703/471 = 1.49
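The Block 0 figures can be checked by hand from the two counts. The sketch below treats Block 0 as an intercept-only model, so Exp(B) equals the sample odds, and it uses the standard large-sample standard error of a log odds, sqrt(1/n1 + 1/n0). This is a hand calculation under those assumptions, not output from the statistical package.

```python
import math

# Counts from the slide.
n_trusted, n_careful = 471, 703

# Intercept-only model: Exp(B) is the sample odds of trusting.
odds_trusting = n_trusted / n_careful
b0 = math.log(odds_trusting)

# Large-sample standard error of a log odds.
se_b0 = math.sqrt(1 / n_trusted + 1 / n_careful)

# Wald statistic: squared ratio of the coefficient to its standard error.
wald = (b0 / se_b0) ** 2

print(f"Exp(B) = {odds_trusting:.3f}")
print(f"B      = {b0:.3f}")
print(f"Wald   = {wald:.3f}")
```

Under these assumptions the calculation agrees with the reported Exp(B) = .670 and Wald = 45.239 to rounding.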

Research questions • Q2: How is sex related to respondents’ perception that most people can be trusted? • Step 1: Output 03 → Block 1: Wald (df = 1, N = 1174) = .063, p = .802, Exp(B) = 1.030 • Step 2: interpretation → Sex was not significantly related to respondents’ perception that others can be trusted. • What are the odds of women saying others can be trusted compared to the odds of men saying the same? • What are the odds of men saying others can be trusted compared to the odds of women saying the same?
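The two closing questions differ only in which group sits in the numerator. The sketch below assumes that the reported Exp(B) compares women (coded 2) with men (coded 1); the slide does not state the direction, so that is an assumption to verify against the output. The function name is illustrative.

```python
def flip_odds_ratio(odds_ratio):
    """Odds ratio for the reverse comparison (swap numerator and denominator)."""
    return 1 / odds_ratio


exp_b_sex = 1.030  # reported Exp(B) for sex

# ASSUMPTION: Exp(B) is the odds for women divided by the odds for men.
or_women_vs_men = exp_b_sex
or_men_vs_women = flip_odds_ratio(exp_b_sex)

print(f"women vs. men: {or_women_vs_men:.3f}")
print(f"men vs. women: {or_men_vs_women:.3f}")
```

Because the Wald test was non-significant (p = .802), neither ratio differs reliably from 1.00.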

Research questions • Q3: How is age related to respondents’ perception that most people can be trusted? • Step 1: Output 02 → Block 1: Wald (df = 1, N = 1174) = 22.463, p < .001, Exp(B) = 1.017 • Step 2: interpretation → Age was significantly related to respondents’ perception that others can be trusted. On average, for each one-year increase in age, the predicted odds of trusting others are multiplied by 1.017. Older people are more likely than younger people to perceive others as trustworthy, on average. • On average, with a one-unit decrease in age, what are the odds of trusting others?
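For a continuous predictor, the odds multiplier for any change in age follows from the coefficient B = ln(Exp(B)). A minimal sketch, assuming the effect is linear in the log odds (which is what the logistic model itself assumes); `odds_multiplier` is an illustrative name.

```python
import math

exp_b_age = 1.017           # reported Exp(B) for age
b_age = math.log(exp_b_age)  # coefficient on the log-odds scale


def odds_multiplier(years):
    """Factor by which the predicted odds change for a given change in age."""
    return math.exp(b_age * years)


print(f"+1 year:   {odds_multiplier(1):.3f}")
print(f"-1 year:   {odds_multiplier(-1):.3f}")
print(f"+10 years: {odds_multiplier(10):.3f}")
```

A one-year decrease is the reciprocal of a one-year increase, and a ten-year change compounds the one-year multiplier ten times rather than adding it.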

Research questions • Q4: How is political party preference related to respondents’ perception that most people can be trusted? • Step 1: Output 01 → Block 1: Wald (df = 3, N = 1174) = 7.261, p = .064 • Step 2: interpretation → Overall, political party preference was not significantly related to respondents’ perception that others can be trusted. However, Democrats were less likely than Republicans to perceive others as trustworthy, Wald = 5.590, p < .05, Exp(B) = .699. The odds of Democrats perceiving others as trustworthy are .699 times the odds for Republicans. Furthermore, Non-partisans were less likely than Republicans to perceive others as trustworthy, Wald = 4.916, p < .05, Exp(B) = .696. The odds of Non-partisans perceiving others as trustworthy are .696 times the odds for Republicans.

Research questions • Q5: Does political party preference contribute beyond age and sex to predicting one’s perception that most people can be trusted? • Step 1: Output 03 → Block 1: Wald (df = 3, N = 1174) = 7.295, p = .063 • Step 2: interpretation → Controlling for sex and age, political party preference overall was not significantly related to respondents’ perception that others can be trusted. However, Democrats were less likely than Republicans to perceive others as trustworthy, Wald = 6.755, p < .01, Exp(B) = .671. The odds of Democrats perceiving others as trustworthy are .671 times the odds for Republicans. The odds ratio of Non-partisans over Republicans became non-significant when controlling for sex and age.
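The reported p-values can be recovered from the Wald statistics by treating each as a chi-square value with the stated degrees of freedom. The sketch below uses only the standard library and the closed-form upper-tail formulas for df = 1 and df = 3, the only cases that occur in these slides; `chi2_sf` is an illustrative helper, not a library function.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable (df = 1 or 3 only)."""
    if x < 0:
        raise ValueError("chi-square statistic must be non-negative")
    tail_df1 = math.erfc(math.sqrt(x / 2))
    if df == 1:
        return tail_df1
    if df == 3:
        return tail_df1 + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)
    raise ValueError("only df = 1 and df = 3 are implemented in this sketch")


# (label, Wald, df) as reported on the Research questions slides.
reported = [
    ("Q2 sex", 0.063, 1),
    ("Q4 party, overall", 7.261, 3),
    ("Q4 Democrat vs. Republican", 5.590, 1),
    ("Q4 Non-partisan vs. Republican", 4.916, 1),
    ("Q5 party, overall", 7.295, 3),
    ("Q5 Democrat vs. Republican", 6.755, 1),
]

for label, wald, df in reported:
    print(f"{label}: Wald = {wald}, df = {df}, p = {chi2_sf(wald, df):.3f}")
```

This makes the pattern on the slides concrete: the 3-df overall test misses the .05 threshold even though individual 1-df contrasts against Republicans fall below it.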

Resampling • When do you use it? • When parametric tests cannot be employed: • The assumptions for using a parametric test have been violated • You want to test a statistic not typically handled by parametric tests (e.g., medians)

Howell’s Program • How do you test the null hypothesis? • Randomization • Bootstrapping
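The two resampling approaches named here can be sketched with the standard library. This is not Howell's program; it is a minimal illustration on hypothetical scores, using medians because that is the example statistic mentioned for resampling. All function names, data values, and the number of resamples are assumptions.

```python
import random
import statistics


def randomization_test(group_a, group_b, n_shuffles=5000, seed=0):
    """Two-sided randomization test for a difference in medians."""
    rng = random.Random(seed)
    observed = statistics.median(group_a) - statistics.median(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_shuffles):
        # Under H0 the group labels are arbitrary, so reshuffle them.
        rng.shuffle(pooled)
        diff = statistics.median(pooled[:n_a]) - statistics.median(pooled[n_a:])
        if abs(diff) >= abs(observed):
            extreme += 1
    return observed, extreme / n_shuffles


def bootstrap_median_ci(sample, n_resamples=5000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for the median."""
    rng = random.Random(seed)
    medians = sorted(
        statistics.median(rng.choices(sample, k=len(sample)))  # with replacement
        for _ in range(n_resamples)
    )
    lower_idx = int((1 - level) / 2 * n_resamples)
    upper_idx = int((1 + level) / 2 * n_resamples) - 1
    return medians[lower_idx], medians[upper_idx]


# HYPOTHETICAL scores for two groups (not from the source).
group_a = [12, 15, 14, 19, 22, 17, 16, 21]
group_b = [10, 11, 13, 9, 14, 12, 15, 8]

observed_diff, p_value = randomization_test(group_a, group_b)
ci_low, ci_high = bootstrap_median_ci(group_a)

print(f"observed median difference = {observed_diff}")
print(f"randomization p-value      = {p_value:.4f}")
print(f"bootstrap 95% CI (group A median) = ({ci_low}, {ci_high})")
```

Randomization reshuffles group labels without replacement to build the distribution expected under the null hypothesis, while bootstrapping resamples the observed data with replacement to estimate the sampling variability of a statistic.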

NHST & p • Elements of p-value explanation • If Ho is true • and all assumptions are met • the probability of getting results this extreme or more extreme • is [p-value] • Ho is never true (Cohen, 1994)

T vs. F about p-value • p = .01… True or False • There is a 1% chance that the decision to reject Ho is wrong. • FALSE • Assuming Ho is true and the study is repeated many times, about 1% of these results will be even more inconsistent with Ho than the observed result. • TRUE
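The TRUE statement can be illustrated by simulation: when Ho is true and the same study is repeated many times, roughly 1% of the studies give a result at least as extreme as one with p = .01. The sketch assumes a one-sample z-test with known standard deviation; the sample size, number of repetitions, and seed are arbitrary choices.

```python
import math
import random


def two_sided_p(sample, mu0, sigma):
    """Two-sided p-value of a one-sample z-test with known sigma."""
    n = len(sample)
    z = (sum(sample) / n - mu0) / (sigma / math.sqrt(n))
    return math.erfc(abs(z) / math.sqrt(2))


rng = random.Random(2013)
n_studies, n_per_study = 20000, 25

# Every simulated study draws from a population in which Ho (mean = 0) is true.
p_values = [
    two_sided_p([rng.gauss(0.0, 1.0) for _ in range(n_per_study)], 0.0, 1.0)
    for _ in range(n_studies)
]

share_extreme = sum(p <= 0.01 for p in p_values) / n_studies
print(f"share of studies with p <= .01 under Ho: {share_extreme:.4f}")
```

The simulation conditions on Ho being true throughout, so it says nothing about the probability that a decision to reject Ho is wrong, which is why the first statement is FALSE.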

More T vs. F about p-value • The p-value is the probability that the null hypothesis is true. • The p-value is the probability that a finding is merely due to luck. • The p-value is the probability of falsely rejecting the null hypothesis. • 1 − p is the probability that a replicating experiment would yield the same conclusion. • The p-value is the probability that a replicating experiment would not yield the same conclusion. • 1 − p is the probability of the alternative hypothesis being true. • The p-value indicates the size or importance of the observed effect. • All False!

More T vs. F about p-value • T / F: A smaller p-value indicates a larger effect. Explain. • If all the conditions are the same (same population, same sample size, same variables), then a smaller p-value does indicate a larger effect size. • But NOT when you compare p-values from different populations, different sample sizes, or different variables.
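A quick calculation shows why p-values cannot be compared across different sample sizes. The sketch assumes a one-sample z-test in which the effect size d is the mean difference in standard deviation units, so that z = d·sqrt(n); the particular values of d and n are hypothetical.

```python
import math


def p_from_effect(d, n):
    """Two-sided p-value of a one-sample z-test with effect size d and sample size n."""
    z = d * math.sqrt(n)
    return math.erfc(abs(z) / math.sqrt(2))


# Same small effect, different sample sizes (hypothetical values).
p_small_effect_n25 = p_from_effect(0.2, 25)
p_small_effect_n400 = p_from_effect(0.2, 400)

# Larger effect, but a small sample.
p_large_effect_n16 = p_from_effect(0.5, 16)

print(f"d = 0.2, n = 25:  p = {p_small_effect_n25:.4f}")
print(f"d = 0.2, n = 400: p = {p_small_effect_n400:.6f}")
print(f"d = 0.5, n = 16:  p = {p_large_effect_n16:.4f}")
```

The small effect with n = 400 produces a far smaller p-value than the larger effect with n = 16, so a smaller p-value signals a larger effect only when sample size and everything else are held constant.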

99% Confidence Interval means… (True or False?) • There is a 99% chance that your interval captures the population mean, but we cannot be sure that it does! → True • The population mean falls between the confidence limits 99% of the time. → False: the population mean is an unknown fixed value; it is the interval that varies from sample to sample.
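The point that the interval varies while the population mean stays fixed can be checked by simulation. The sketch assumes a normal population with known standard deviation, so a z-based interval applies; the population values, sample size, number of repetitions, and seed are all hypothetical.

```python
import math
import random
from statistics import NormalDist

# Critical value for a two-sided 99% interval.
z_crit = NormalDist().inv_cdf(0.995)

rng = random.Random(99)
mu, sigma = 50.0, 10.0   # fixed (normally unknown) population values
n, reps = 30, 5000
half_width = z_crit * sigma / math.sqrt(n)

captured = 0
for _ in range(reps):
    sample = [rng.gauss(mu, sigma) for _ in range(n)]
    sample_mean = sum(sample) / n
    # The interval moves with each sample; mu never changes.
    if sample_mean - half_width <= mu <= sample_mean + half_width:
        captured += 1

coverage = captured / reps
print(f"share of intervals capturing the population mean: {coverage:.4f}")
```

About 99% of the simulated intervals capture the fixed mean, but any single interval either contains it or does not.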

Summer is almost here!
