
Inferential statistics








  1. Inferential statistics. Doing inferential statistics = doing pairwise comparisons of models. Every comparison answers a conceptual question, e.g.: Is there a relationship between X and Y? Is the difference between the experimental conditions statistically significant? We looked at several models …

  2. Inferential statistics. With each model, we complete two steps. Step 1: Make the best predictions, given the information you have. Step 2: Compute the total prediction error (sum of squared errors = SSE). Once we have two or more models, we can make pairwise comparisons. Ex: Self-complexity data set.
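The two steps are just arithmetic, so they can be sketched in a few lines. The deck's analyses are in R; below is a minimal Python sketch with made-up toy scores (not the slides' self-complexity data), for a mean-only model that predicts the sample mean for everyone:

```python
# Toy outcome scores (hypothetical, not the slides' data set)
y = [2, 4, 5, 5, 9]

# Step 1: best prediction with no predictors = the sample mean
y_bar = sum(y) / len(y)

# Step 2: total prediction error = sum of squared errors (SSE)
sse = sum((yi - y_bar) ** 2 for yi in y)

print(y_bar, sse)  # 5.0 26.0
```

Fitting a second model to the same scores and repeating Step 2 yields the second SSE needed for a pairwise comparison.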

  3. Model 0 (= the "Stupid Model" = the "Null Model"): swb = B0 + e ; (here: B0 = 0). Price: P = 0. Total Prediction Error: SSE = 438. Error for participant #7? e7 = 3.

  4. Model 1 (= the "Basic Model" = the "Mean-Only Model"): swb = b0 + e ; (here: b0 = 5). Price: P = 1. Total Prediction Error: SSE = 88. Error for participant #7? e7 = -2.

  5. Model Comparison 1. Compact model: swb = B0 + e = 0 + e ; P = 0 ; SSE = 438. Augmented model: swb = b0 + e = 5 + e ; P = 1 ; SSE = 88.

  6. Model Comparison 1. C: swb = B0 + e = 0 + e ; P = 0 ; SSE = 438 (Model 0). A: swb = b0 + e = 5 + e ; P = 1 ; SSE = 88 (Model 1).

  7. Model Comparison 1. C: swb = B0 + e = 0 + e ; P = 0 ; SSE = 438 (Model 0). A: swb = b0 + e = 5 + e ; P = 1 ; SSE = 88 (Model 1). Mathematical interpretations: It's worth it to estimate the additional parameter; the parameter b0 is reliably different from zero. Conceptual interpretation: The subjective well-being scores are on average reliably different from zero. Often not meaningful → good-bye, stupid model.
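The two SSEs in this comparison are tied together by the identity Σ(y − 0)² = Σ(y − ȳ)² + N·ȳ², so Model 0's error can be recovered from Model 1's. A quick Python check using only numbers from the slides (N = 14, ȳ = 5, SSE of Model 1 = 88):

```python
n, y_bar = 14, 5.0
sse_mean_model = 88.0                              # Model 1 (mean-only), from the slides
sse_null_model = sse_mean_model + n * y_bar ** 2   # add N * mean^2
print(sse_null_model)  # 438.0, matching Model 0's SSE
```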

  8. Model 1 (= the "Basic Model" = the "Mean-Only Model"): swb = b0 + e ; (here: b0 = 5). Price: P = 1. Total Prediction Error: SSE = 88 = SST = sum of squares total. SSE / (N - 1) = s². SSE = the part of the variance that we have not (yet) explained.
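SSE/(N − 1) = s² can be cross-checked against the descriptives shown later in the deck (sd of swb = 2.60). A small Python check:

```python
sse, n = 88.0, 14
s2 = sse / (n - 1)   # sample variance of swb
sd = s2 ** 0.5       # sample standard deviation
print(round(s2, 2), round(sd, 2))  # 6.77 2.6
```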

  9. Model 2: swb = b0 + b1*comp + e ; (here: b0 = 2.05 is the "intercept" and b1 = 1.18 is the "slope"). Price: P = 2. Total Prediction Error: SSE = 58.49. Error for participant #7? e7 = .7.

  10. Different models. Model 2: predicted swb = 2.05 + 1.18 * comp.
  comp:          0     1     2     3     4
  predicted swb: 2.05  3.23  4.41  5.59  6.77
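The predicted-value row of this table follows directly from the fitted coefficients. A Python sketch of the same computation (the deck itself uses R):

```python
b0, b1 = 2.05, 1.18  # intercept and slope of Model 2, from the slides
preds = [round(b0 + b1 * comp, 2) for comp in range(5)]
print(preds)  # [2.05, 3.23, 4.41, 5.59, 6.77]
```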

  11. Model Comparison 2. Compact model C: swb = b0 + e = 5 + e (equivalently: swb = b0 + 0*comp + e = 5 + 0 + e) ; P = 1 ; SSE = 88 (Model 1). Augmented model A: swb = b0 + b1*comp + e = 2.05 + 1.18*comp + e ; P = 2 ; SSE = 58.49 (Model 2).

  12. Model Comparison 2. C: swb = b0 + e = 5 + e ; P = 1 ; SSE = 88 (Model 1). A: swb = b0 + b1*comp + e = 2.05 + 1.18*comp + e ; P = 2 ; SSE = 58.49 (Model 2).

  13. Model Comparison 2. C: SSEC = 88 ; A: SSEA = 58.49.
  ηp² = partial eta squared = PRE = proportional reduction in error = proportion of variance explained
  = (SSEC - SSEA) / SSEC = (88 - 58.49) / 88 = 29.51 / 88 = .3353 ≈ .34.
  Variance explained by comp: 33.53%. Unexplained variance: 66.47%.
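PRE (= ηp²) needs only the two SSEs, so the slide's arithmetic is easy to reproduce. A Python check:

```python
sse_c, sse_a = 88.0, 58.49      # compact and augmented SSEs, from the slides
pre = (sse_c - sse_a) / sse_c   # proportional reduction in error
print(round(pre, 4))  # 0.3353
```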

  14. Effect sizes:
                 r    ηp² (PRE)
  Small effect   .1   .01
  Medium effect  .3   .09
  Large effect   .5   .25

  15. Model Comparison 2. C: swb = b0 + e = 5 + e ; P = 1 ; SSE = 88. A: swb = b0 + b1*comp + e = 2.05 + 1.18*comp + e ; P = 2 ; SSE = 58.49. F(1, 12) = 6.05 ; t(12) = 2.46 ; p < .03 ; ηp² = .34. Mathematical interpretations: It's worth it to estimate the additional parameter; the parameter b1 (the slope) is reliably different from zero. Conceptual interpretation: There is a statistically significant relationship between subjective well-being and self-complexity. The effect is quite large.

  16. Model Comparison 2. C: swb = 5 + e ; P = 1 ; SSE = 88 (Model 1). A: swb = 2.05 + 1.18*comp + e ; P = 2 ; SSE = 58.49 (Model 2).
  F = [SSR / (PA - PC)] / [SSEA / (N - PA)], with SSR = SSEC - SSEA ;
  PA - PC = "numerator (model) degrees of freedom" ; N - PA = "denominator (error) degrees of freedom".
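The F-ratio for this comparison divides the error reduction per added parameter by the remaining error per error df. Reproducing the deck's F and t from the slide's numbers (a Python sketch):

```python
sse_c, sse_a = 88.0, 58.49  # compact and augmented SSEs
p_a, p_c, n = 2, 1, 14      # parameter counts and sample size

ssr = sse_c - sse_a                            # error reduction (29.51)
f = (ssr / (p_a - p_c)) / (sse_a / (n - p_a))  # F(1, 12)
print(round(f, 2), round(f ** 0.5, 2))  # 6.05 2.46  (t = sqrt(F) when the numerator df is 1)
```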

  17. R script:
  m2 <- lm(swb ~ comp, data=d)
  summary(m2)
  lm.sumSquares(m2)

  18. > m2 <- lm(swb ~ comp, data=d)
  > summary(m2)
  Call:
  lm(formula = swb ~ comp, data = d)
  Residuals:
     Min     1Q Median     3Q    Max
  -3.472 -1.866  0.000  1.866  3.472
  Coefficients:
              Estimate Std. Error t value Pr(>|t|)
  (Intercept)   2.0491     1.3366   1.533    0.151
  comp          1.1804     0.4797   2.460    0.030 *
  ---
  Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
  Residual standard error: 2.208 on 12 degrees of freedom
  Multiple R-squared: 0.3353, Adjusted R-squared: 0.2799
  F-statistic: 6.054 on 1 and 12 DF, p-value: 0.03001
  > lm.sumSquares(m2)
                    SS dR-sqr pEta-sqr df      F p-value
  (Intercept) 11.45597 0.1302   0.1638  1 2.3503  0.1512
  comp        29.50897 0.3353   0.3353  1 6.0541  0.0300
  Error (SSE) 58.49103     NA       NA 12     NA      NA
  Total (SST) 88.00000     NA       NA NA     NA      NA
  Model comparison for the (Intercept) row: C: swb = B0 + b1*comp + e (B0 = 0) ; A: swb = b0 + b1*comp + e.
  Model comparison for the comp row: C: swb = b0 + e ; A: swb = b0 + b1*comp + e.
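Several numbers in this output can be cross-checked against one another: t = estimate/SE, residual standard error = sqrt(SSE/error df), and R² = SS(comp)/SS(total). Python arithmetic using only values printed above:

```python
t_comp = 1.1804 / 0.4797      # t value for comp
rse = (58.49103 / 12) ** 0.5  # residual standard error
r2 = 29.50897 / 88.0          # multiple R-squared
print(round(t_comp, 2), round(rse, 3), round(r2, 4))  # 2.46 2.208 0.3353
```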

  19. Write-up: Testing one continuous predictor against zero. We estimated a simple regression model in which we regressed participants' subjective well-being scores on their self-complexity scores. We observed a positive relationship between the two variables, b1 = 1.18, F(1, 12) = 6.05 [t(12) = 2.46], p < .03, ηp² = .34. [For every unit increase in self-complexity, participants' subjective well-being scores increased by 1.18 units.] As can be seen in Figure 1, the more individuals had a complex representation of the self, the more they reported feeling satisfied with their life.

  20. Publication-quality graph

  21. Different models. Model 2b: What happens if we "center" self-complexity?
  > d$compC <- d$comp - mean(d$comp)
  > describe(d)
        var  n mean   sd median trimmed  mad  min max range skew kurtosis   se
  swb     1 14  5.0 2.60    5.0     5.0 2.97  1.0 9.0   8.0    0    -1.59 0.70
  comp    2 14  2.5 1.28    2.5     2.5 1.48  0.2 4.8   4.6    0    -1.00 0.34
  compC   4 14  0.0 1.28    0.0     0.0 1.48 -2.3 2.3   4.6    0    -1.00 0.34

  22. Different models. Model 2b: swb = b0 + b1*compC + e ; (here: b0 = 5.00 is the "intercept" and b1 = 1.18 is the "slope"). P = 2 ; SSE = 58.49.

  23. Different models.
  Model 2:  swb = b0 + b1*comp + e  = 2.05 + 1.18*comp + e ;  P = 2 ; SSE = 58.49.
  Model 2b: swb = b0 + b1*compC + e = 5.00 + 1.18*compC + e ; P = 2 ; SSE = 58.49.

  24. Model Comparison 2b. Compact model C: swb = b0 + e = 5 + e ; P = 1 ; SSE = 88. Augmented model A: swb = b0 + b1*compC + e = 5.00 + 1.18*compC + e ; P = 2 ; SSE = 58.49.

  25. Model Comparison 2b. C: swb = b0 + e = 5 + e ; P = 1 ; SSE = 88. A: swb = b0 + b1*compC + e = 5.00 + 1.18*compC + e ; P = 2 ; SSE = 58.49.

  26. Take-home message: Centering an independent variable (subtracting a constant) doesn't change much. In particular, it does not affect the test of the regression coefficient associated with that variable.
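This can be verified with a tiny hand-rolled least-squares fit. A Python sketch on hypothetical toy data (not the slides' data), using the slope formula b1 = Σ(x − x̄)(y − ȳ)/Σ(x − x̄)²:

```python
def ols(x, y):
    # simple least squares: returns (intercept, slope)
    xb, yb = sum(x) / len(x), sum(y) / len(y)
    b1 = (sum((xi - xb) * (yi - yb) for xi, yi in zip(x, y))
          / sum((xi - xb) ** 2 for xi in x))
    return yb - b1 * xb, b1

x = [0, 1, 2, 3, 4]                       # hypothetical predictor
y = [1, 3, 4, 6, 6]                       # hypothetical outcome
xc = [xi - sum(x) / len(x) for xi in x]   # centered predictor

b0, b1 = ols(x, y)
b0c, b1c = ols(xc, y)
print(b1 == b1c)  # True: the slope (and hence its test) is unchanged
print(b0c)        # the intercept becomes the mean of y
```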

  27. Models with a single dichotomous predictor. Ex: data s_complexity 3 inter.dat
  > d <- lm.readDat ("data s_complexity 3 inter.dat")
  > describe(d)
       var  n mean   sd median trimmed  mad min max range skew kurtosis   se
  swb    1 14  5.0 2.60    5.0     5.0 2.97 1.0 9.0   8.0    0    -1.59 0.70
  comp   2 14  2.5 1.28    2.5     2.5 1.48 0.2 4.8   4.6    0    -1.00 0.34
  sex    3 14  1.5 0.52    1.5     1.5 0.74 1.0 2.0   1.0    0    -2.14 0.14
  > describeBy(d$swb, d$sex)
  group: 1
    var n mean  sd median trimmed  mad min max range skew kurtosis   se
  1   1 7 3.43 2.3      3    3.43 1.48   1   8     7 0.89    -0.55 0.87
  group: 2
    var n mean  sd median trimmed  mad min max range  skew kurtosis   se
  1   1 7 6.57 1.9      7    6.57 1.48   3   9     6 -0.59     -0.8 0.72

  28. > m2c <- lm(d$swb ~ d$sex)
  > summary(m2c)
  Call:
  lm(formula = d$swb ~ d$sex)
  Residuals:
      Min      1Q  Median      3Q     Max
  -3.5714 -1.2143  0.0000  0.5714  4.5714
  Coefficients:
              Estimate Std. Error t value Pr(>|t|)
  (Intercept)   0.2857     1.7833   0.160   0.8754      <- b0
  d$sex         3.1429     1.1279   2.787   0.0165 *    <- b1
  ---
  Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
  Residual standard error: 2.11 on 12 degrees of freedom
  Multiple R-squared: 0.3929, Adjusted R-squared: 0.3423
  F-statistic: 7.765 on 1 and 12 DF, p-value: 0.01645
  > lm.sumSquares(m2c)
                      SS dR-sqr pEta-sqr df      F p-value
  (Intercept)  0.1142857 0.0013   0.0021  1 0.0257  0.8754
  d$sex       34.5714286 0.3929   0.3929  1 7.7647  0.0165
  Error (SSE) 53.4285714     NA       NA 12     NA      NA
  Total (SST) 88.0000000     NA       NA NA     NA      NA

  29. Models with a single dichotomous predictor. Model 2c: swb = b0 + b1*sex + e ; (here: b0 = .29 and b1 = 3.14).
  sex:           0    1 (men)  2 (women)  3
  predicted swb: .29  3.43     6.57       9.71
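The table's predicted values come straight from the coefficients, exactly as with the continuous predictor. A Python check with the rounded coefficients:

```python
b0, b1 = 0.29, 3.14
preds = [round(b0 + b1 * sex, 2) for sex in range(4)]
print(preds)  # [0.29, 3.43, 6.57, 9.71]
# sex = 1 gives the men's mean (3.43); sex = 2 gives the women's mean (6.57)
```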

  30. Different models. Model 2c: swb = b0 + b1*sex + e ; (here: b0 = .29 is the "intercept" and b1 = 3.14 is the "slope"). P = 2 ; SSE = 53.43.

  31. Model Comparison 2c. Compact model C: swb = b0 + e = 5 + e ; P = 1 ; SSE = 88. Augmented model A: swb = b0 + b1*sex + e = .29 + 3.14*sex + e ; P = 2 ; SSE = 53.43.

  32. Model Comparison 2c. C: swb = b0 + e = 5 + e ; P = 1 ; SSE = 88. A: swb = b0 + b1*sex + e = .29 + 3.14*sex + e ; P = 2 ; SSE = 53.43.

  33. (R output repeated from slide 28.) The test of d$sex corresponds to the model comparison: C: swb = b0 + e ; A: swb = b0 + b1*sex + e.

  34. Model Comparison 2c. C: swb = b0 + e = 5 + e ; P = 1 ; SSE = 88. A: swb = b0 + b1*sex + e = .29 + 3.14*sex + e ; P = 2 ; SSE = 53.43. Mathematical interpretations: It's worth it to estimate the additional parameter; the parameter b1 (the slope) is reliably different from zero. Conceptual interpretation: There is a statistically significant relationship between subjective well-being and sex. The effect is quite large.
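The "quite large" claim follows from the same PRE formula as before, now with Model 2c's SSE. A Python check:

```python
sse_c, sse_a = 88.0, 53.43       # compact (mean-only) vs. augmented (sex) model
eta2 = (sse_c - sse_a) / sse_c   # etap^2 = PRE
print(round(eta2, 2))  # 0.39, a large effect by the slide-14 benchmarks
```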

  35. Write-up: Testing one dichotomous predictor against zero. We ran an independent samples t-test with subjective well-being as the outcome variable and sex as the predictor variable. The effect of sex was statistically significant, [b1 = 3.14,] t(12) = 2.79 [F(1, 12) = 7.77], p < .02, ηp² = .39. As can be seen in Figure 1, women (M = 6.57, SD = 1.90) reported higher subjective well-being than men (M = 3.43, SD = 2.30).

  36. Write-up: Testing one dichotomous predictor against zero

  37. Take-home message: We analyze dichotomous predictors just as we analyze continuous predictors. There is no fundamental difference between dichotomous predictors and continuous predictors.

  38. Different models. Model 2d: What happens if we "center" sex?
  d$sexC[d$sex==1] = -0.5
  d$sexC[d$sex==2] = +0.5
  > describe(d)
       var  n mean   sd median trimmed  mad  min max range skew kurtosis   se
  swb    1 14  5.0 2.60    5.0     5.0 2.97  1.0 9.0   8.0    0    -1.59 0.70
  comp   2 14  2.5 1.28    2.5     2.5 1.48  0.2 4.8   4.6    0    -1.00 0.34
  sex    3 14  1.5 0.52    1.5     1.5 0.74  1.0 2.0   1.0    0    -2.14 0.14
  sexC   4 14  0.0 0.52    0.0     0.0 0.74 -0.5 0.5   1.0    0    -2.14 0.14

  39. Different models. Model 2d: swb = b0 + b1*sexC + e ; (here: b0 = 5.00 is the "intercept" and b1 = 3.14 is the "slope"). P = 2 ; SSE = 53.43.

  40. Different models.
  Model 2c: swb = b0 + b1*sex + e  = .29 + 3.14*sex + e ;   P = 2 ; SSE = 53.43.
  Model 2d: swb = b0 + b1*sexC + e = 5.00 + 3.14*sexC + e ; P = 2 ; SSE = 53.43.
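Both parameterizations imply the same two group means, which is why P and SSE are identical. A quick Python check with the rounded coefficients:

```python
b1 = 3.14
m2c = [round(0.29 + b1 * s, 2) for s in (1, 2)]       # sex coded 1 (men), 2 (women)
m2d = [round(5.00 + b1 * s, 2) for s in (-0.5, 0.5)]  # sexC coded -.5, +.5
print(m2c, m2d)  # [3.43, 6.57] [3.43, 6.57]
```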

  41. Take-home message: Centering an independent variable doesn't change much, regardless of whether it is a continuous or a dichotomous independent variable. We center:
  - continuous variables by subtracting the mean: d$compC <- d$comp - mean(d$comp)
  - dichotomous variables by recoding them into -.5 and +.5: d$sexC[d$sex==1] = -0.5 ; d$sexC[d$sex==2] = +0.5

  42. Just for fun: ηp² = partial eta squared = PRE = proportional reduction in error = proportion of variance explained.

  43. Stuff to add (if time permits): Scaling a variable ; ANOVA table ; Alternative representation of a dichotomous predictor (two flat lines).
