
Experimental thinkaloud protocols: a new method for evaluating the validity of survey questions



  1. Experimental thinkaloud protocols: a new method for evaluating the validity of survey questions Patrick Sturgis National Centre for Research Methods (NCRM) and University of Southampton Paper presented at the New Measurement Issues in Survey Research meeting of the Survey Resources Network, 21 September 2010

  2. Do different questions measure the same thing? • Many important concepts are measured by different ‘standard’ questions in surveys: • Social/political trust • General health • Life happiness/satisfaction • Fear of crime/confidence in police • How to tell if they are ‘equivalent’? • How to tell which is the ‘best’ measure?

  3. Validity assessment strategies • Face/process validity • Correlation with criterion variables • Multi-trait-multi-method (MTMM) • Expert panels • Behaviour coding • Interviewer debrief • Thinkaloud protocols/cognitive interview

  4. Experimental thinkalouds • Randomly assign respondents to receive one or other version of the ‘same’ question • Follow-up with verbatim probe ‘what came to mind when answering last question?’ • Examine marginal distribution of cognitive frames by question type • Are people thinking of things they should be? • Use thinkaloud variables in regression model to predict earlier response • Which cognitive frames are most relevant in forming answers to the questions?
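
The analysis pipeline this slide describes can be illustrated with a short sketch. The snippet below is a minimal, hypothetical illustration on simulated data: the variable names (version, frame, trust) and the three cognitive-frame categories are assumptions for demonstration, not the study's actual coding frame.

```python
# Minimal sketch of an experimental thinkaloud analysis, on simulated data.
import numpy as np
import pandas as pd
from scipy.stats import chi2_contingency
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 1000
df = pd.DataFrame({
    # random assignment to one or other version of the 'same' question
    "version": rng.choice(["RGT", "TLA"], size=n),
    # cognitive frame coded from the verbatim probe (illustrative categories)
    "frame": rng.choice(["strangers", "neighbours", "friends_family"], size=n),
    # binary trust response to the earlier question (1 = trusting)
    "trust": rng.integers(0, 2, size=n),
})

# 1) Marginal distribution of cognitive frames by question version
counts = pd.crosstab(df["frame"], df["version"])
chi2, p, dof, _ = chi2_contingency(counts)
print(pd.crosstab(df["frame"], df["version"], normalize="columns").round(3))
print(f"chi2({dof}) = {chi2:.2f}, p = {p:.3f}")

# 2) Which frames predict the earlier trust response?
model = smf.logit("trust ~ C(frame)", data=df).fit(disp=False)
print(model.summary())
```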

  5. Example 1 - Trust

  6. Conceptions of Trust • Trust is a ‘good thing’ • Trusting citizens are good citizens (voting, volunteering, civic engagement) • Trusting societies are good societies (more democratic, egalitarian, > economic performance) • Trust ‘lubricates’ social and economic transactions • Reduces ‘monitoring costs’ and the need for contracts etc.

  7. The standard trust question • Generally speaking, would you say that most people can be trusted, or that you can't be too careful in dealing with people? • Most people can be trusted • Can’t be too careful • Usually credited to Rosenberg (1959), the ‘Rosenberg Generalized Trust’ (RGT) item

  8. The Local Area Trust item • How much do you trust people in your local area? • a lot • a fair amount • not very much • not at all • Reflects Putnam’s emphasis on trust being a property of local areas

  9. Trust by Question type • These items are both used more or less interchangeably as measures of generalized trust • Yet, they yield very different estimates of trust at the national level. e.g.: • Social Capital Community Benchmark survey: 47% most people can be trusted; 83% trust people in local area ‘some’ or ‘a lot’ • UK Taking Part survey: 44% most people can be trusted; 74% trust ‘many’ or ‘some’ of the people in their local area • Why such a large discrepancy in generalized trust (trust in strangers)?

  10. Research Design • Ipsos-MORI general population omnibus survey • Random selection of small areas, quota controlled selection of individuals • n=989 (fieldwork, November 2007) • Respondents randomly assigned to RGT or TLA item • In answering the last question, who came to mind when you were thinking about ‘most people’/ ‘people in your local area’?

  11. Distributions for trust questions

  12. Who comes to mind by RGT

  13. Who comes to mind by TLA

  14. Who came to mind – both questions

  15. Explanatory Models 1

  16. Explanatory Models 2

  17. The science of well-being “Now is the time for every government to collect data on a uniform basis on the happiness of its population…every survey of individuals should automatically measure their well-being, so that in time we can really say what matters to people and by how much. When we do, it will produce very different priorities for our society. ” Layard 2010, Science.

  18. Survey measures of subjective well-being • Tend to ask about ‘happiness’ or ‘satisfaction’ with life • And treat these as if they are measuring the same concept

  19. Happiness = Satisfaction? • Yes – time-series models show same pattern of effects (Blanchflower and Oswald, 2002) • No – happiness and satisfaction correlated but not equivalent in European Values Survey (Gundelach and Kreiner 2004)

  20. Mode effects • Widely different estimates of well-being across different surveys • Could mode be an explanatory factor? • Being unhappy with your life is not socially desirable (people may over-state happiness to an interviewer) • Conti and Pudney (2008) find higher ratings of satisfaction in interviewer relative to self-administered questions

  21. Design • Ipsos-MORI face-to-face omnibus survey (quota sample), April 2010 • n=2033 • Respondents randomly allocated to one of four groups: • Interviewer-administered life satisfaction • Self-administered life satisfaction • Interviewer-administered happiness • Self-administered happiness
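
A hedged sketch of how a 2x2 (question wording by administration mode) comparison like this might be analysed is shown below; the data, variable names, and model specification are illustrative assumptions rather than the study's actual analysis.

```python
# Sketch of a question-by-mode comparison on simulated 1-10 ratings.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 2000
df = pd.DataFrame({
    "question": rng.choice(["happiness", "satisfaction"], size=n),
    "mode": rng.choice(["interviewer", "self"], size=n),
    "rating": rng.integers(1, 11, size=n),  # 1-10 scale as on the show card
})

# Cell means for the four randomly allocated groups
print(df.groupby(["question", "mode"])["rating"].mean().round(2))

# Question-by-mode interaction: does the mode effect differ across wordings?
fit = smf.ols("rating ~ C(question) * C(mode)", data=df).fit()
print(fit.summary().tables[1])
```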

  22. Questions (from European Social Survey) All things considered, how happy would you say you are? Please answer using the scale on the card where 1 means ‘extremely unhappy’ and 10 means ‘extremely happy’. 1. Extremely unhappy . . 10. Extremely happy All things considered, how satisfied are you with your life as a whole nowadays? Please answer using the scale on the card where 1 means ‘extremely dissatisfied’ and 10 means ‘extremely satisfied 1. Extremely dissatisfied . . 10. Extremely satisfied

  23. Verbatims Now, thinking about your answer to the last question, please tell me what came to mind when thinking about your answer. There are no right or wrong answers; I just want you to tell me everything that came to mind in thinking about how happy you are. What else? PROBE FULLY

  24. Results 1: satisfaction = happiness?

  25. Raw distributions for happiness and satisfaction (means: 7.39 and 7.38)

  26. Satisfaction v Happiness - distributions (Pearson's chi-square, p=0.041)
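
For readers who want to reproduce this kind of comparison, the sketch below shows how a Pearson chi-square test on two response distributions can be computed; the counts are invented for illustration and are not the survey's data.

```python
# Chi-square comparison of two 1-10 response distributions (made-up counts).
import numpy as np
from scipy.stats import chi2_contingency

# Rows: happiness group, satisfaction group; columns: scale points 1..10
counts = np.array([
    [12,  8, 15, 30, 55,  90, 160, 250, 210, 130],
    [10, 12, 20, 35, 60, 100, 170, 230, 200, 120],
])
chi2, p, dof, expected = chi2_contingency(counts)
print(f"chi2({dof}) = {chi2:.2f}, p = {p:.3f}")
```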

  27. Satisfaction v Happiness by sex (means: male = 7.43, female = 7.34; p=0.047, p=0.394)

  28. Results 2: mode effects

  29. Mode effect by question - means

  30. Mode effect by question - distributions (p=0.209, p=0.015)

  31. Question*mode*sex - means

  32. Question*mode*sex - distributions (p=0.053, p=0.018, p=0.037, p=0.145)

  33. Prediction model

  34. Verbatim responses

  35. Verbatim responses • Verbatim responses coded to a descriptive frame with 111 codes • These were then allocated to one of 14 thematic codes
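
A small sketch of this two-stage coding step is given below, under the assumption that the detailed codes are collapsed into themes via a lookup table; the code labels and mapping are invented for illustration (the actual frame has 111 descriptive codes and 14 themes).

```python
# Collapsing detailed verbatim codes into thematic codes (illustrative subset).
import pandas as pd

# Detailed code -> thematic code (tiny, made-up fragment of the full mapping)
theme_map = {
    "money_worries": "finances",
    "redundancy": "work",
    "enjoys_job": "work",
    "children_doing_well": "family",
    "recent_bereavement": "bereavement",
}

verbatims = pd.DataFrame({
    "respondent": [1, 2, 3, 4, 5, 6],
    "detailed_code": ["money_worries", "enjoys_job", "children_doing_well",
                      "redundancy", "recent_bereavement", "enjoys_job"],
})
verbatims["theme"] = verbatims["detailed_code"].map(theme_map)

# Theme frequencies, which can then be compared across question versions
# or entered as predictors in a model of the earlier rating
print(verbatims["theme"].value_counts())
```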

  36. Thematic Codes

  37. Significant differences in thematic codes across questions

  38. Conclusions • Great deal of heterogeneity in the frames of reference people use in answering trust questions • Acquaintances more trusted than strangers • Problematic to assume these questions measure generalized trust • Local area question should not be used interchangeably with standard trust item
