
Rater Reliability


Presentation Transcript


  1. Rater Reliability How Good is Your Coding?

  2. Why Estimate Reliability? • Quality of your data • Number of coders or raters needed • Reviewers/Grant Applications

  3. For What Variables Do You Need Reliability Estimates? • Any variables with judgments • Ratings of any kind • Recordings, even of numbers or counts • Basically, all of them

  4. Data Collection (1) • 1 judge rates all targets. Reliability not estimable (NA1). • 2 judges, each rates a (different) half of the targets; or more than 2, each rating different targets. Again not estimable (NA2). • 2 judges, each rates all targets; or 3 or more, all rating all. Crossed design. Fixed effects. • 4 judges, with a different pair rating each target – every target rated by 2, but a different 2 for each target; or 3 or more judges, not all rating all targets. Nested design. Random effects.

  5. Data Collection (2) • Use a fully crossed design to estimate reliability (otherwise it will be hard to estimate, and you may have to hire help). Fully crossed is good for the final data collection too, but may not be feasible. • Use any design (crossed or nested) to collect the real data. • Use the proper reliability estimate for the design you finally used (fixed for crossed, random for nested, and the proper number of raters).

  6. Reliability Estimates • Kappa – categorical variables • Pearson r – okay for two raters • Intraclass correlations – best for most continuous variables. Two kinds: • ICC(2) – random – reduced by rater mean differences • ICC(3) – fixed – rater mean differences don't count • (see Shrout & Fleiss, 1979) • Cronbach's alpha = ICC(3,k)
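For reference, these are the two single-rater intraclass correlations written in the mean-square notation of Shrout & Fleiss (1979), which the illustration below also uses (BMS = between-target mean square, JMS = between-judge mean square, EMS = error mean square, k = judges, n = targets):

$$\mathrm{ICC}(2,1)=\frac{BMS-EMS}{BMS+(k-1)\,EMS+k(JMS-EMS)/n},\qquad \mathrm{ICC}(3,1)=\frac{BMS-EMS}{BMS+(k-1)\,EMS}$$

and, for the mean of k fixed judges, ICC(3,k) = (BMS − EMS)/BMS, which is Cronbach's alpha.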

  7. Estimation (1) • Use the data you collected to compute sums of squares for judge, target, and error. SAS GLM can do this for you. • Compute ICC(2,1) or ICC(3,1), depending on whether your final design will be random (nested) or fixed (crossed). • Apply the Spearman-Brown formula to estimate the reliability of your data.
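A minimal sketch of this step in Python rather than SAS (the function and variable names here are mine, not from the slides): it computes the judge, target, and error mean squares from an n-targets-by-k-raters matrix and then the two single-rater ICCs.

```python
import numpy as np

def anova_mean_squares(X):
    """Two-way (target x rater) mean squares for an n-targets x k-raters matrix
    with one rating per cell.  Returns (JMS, BMS, EMS) as in Shrout & Fleiss (1979)."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    grand = X.mean()
    t = X.mean(axis=1)                                   # target means
    r = X.mean(axis=0)                                   # rater (judge) means
    bms = k * np.sum((t - grand) ** 2) / (n - 1)         # between-target MS
    jms = n * np.sum((r - grand) ** 2) / (k - 1)         # between-judge MS
    resid = X - t[:, None] - r[None, :] + grand          # rater*target residuals
    ems = np.sum(resid ** 2) / ((n - 1) * (k - 1))       # error MS
    return jms, bms, ems

def icc_2_1(jms, bms, ems, k, n):
    """ICC(2,1): single random rater; rater mean differences reduce reliability."""
    return (bms - ems) / (bms + (k - 1) * ems + k * (jms - ems) / n)

def icc_3_1(bms, ems, k):
    """ICC(3,1): single fixed rater; rater mean differences do not count."""
    return (bms - ems) / (bms + (k - 1) * ems)
```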

  8. Estimation (2) • If you collected fully crossed data (all judges saw all targets for the entire study), you can treat each rater as a column (item) and each target or study as a row (person), and then compute Cronbach's alpha for those data as the rater reliability index. [Alpha = ICC(3,k).] • You can't do that if raters and targets are not crossed.
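A short sketch of that computation in Python (the helper name is mine), treating raters as items; for fully crossed data the variance-ratio form of alpha below is algebraically the same as ICC(3,k) = (BMS − EMS)/BMS.

```python
import numpy as np

def cronbach_alpha(X):
    """Cronbach's alpha for an n-targets x k-raters matrix (raters as items).
    For fully crossed ratings this equals ICC(3,k) = (BMS - EMS) / BMS."""
    X = np.asarray(X, dtype=float)
    k = X.shape[1]
    item_vars = X.var(axis=0, ddof=1)        # variance of each rater's column
    total_var = X.sum(axis=1).var(ddof=1)    # variance of the target (row) totals
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)
```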

  9. Illustration (1) 3 raters judge the rigor of 5 articles using a 1-to-5 scale.

  10. Illustration (2) SAS Input: one column for ratings, one for rater, one for target. SAS Program: GLM – rating equals rater, target, rater by target. SAS Output: sums of squares and mean squares for each source.

  Source          Type III SS   Mean Square
  Rater                  3.73          1.87   (JMS)
  Target                14.27          3.57   (BMS)
  Rater*Target           2.93           .37   (EMS)
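A rough Python/pandas parallel of that SAS step, assuming pandas and statsmodels are available; the ratings below are made-up placeholders, not the actual article ratings from the illustration. With one rating per rater-target cell the Residual row plays the role of the Rater*Target (EMS) term, and for a balanced, fully crossed design like this the Type II sums of squares match SAS's Type III values.

```python
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

# Long format, as in the slide: one column for ratings, one for rater, one for target.
# Hypothetical ratings, for illustration only.
long_data = pd.DataFrame({
    "rater":  ["r1", "r2", "r3"] * 5,
    "target": sum([[t] * 3 for t in ["a1", "a2", "a3", "a4", "a5"]], []),
    "rating": [3, 4, 2,  5, 5, 4,  2, 3, 2,  4, 4, 3,  1, 2, 1],
})

model = smf.ols("rating ~ C(rater) + C(target)", data=long_data).fit()
table = anova_lm(model, typ=2)                    # sum_sq and df per source
table["mean_sq"] = table["sum_sq"] / table["df"]  # mean square = SS / df
print(table)   # C(rater) -> JMS, C(target) -> BMS, Residual -> EMS
```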

  11. Illustration (3) Use mean squares to compute intraclass correlations.
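Plugging the mean squares from the GLM table above into the Shrout & Fleiss formulas gives roughly .74 for the fixed-rater ICC and .61 for the random-rater ICC (my arithmetic, not values shown on the original slide); a minimal check in Python:

```python
jms, bms, ems = 1.87, 3.57, 0.37   # from the GLM output above
k, n = 3, 5                        # 3 raters, 5 articles

icc_3_1 = (bms - ems) / (bms + (k - 1) * ems)                        # ~0.74, single fixed rater
icc_2_1 = (bms - ems) / (bms + (k - 1) * ems + k * (jms - ems) / n)  # ~0.61, single random rater
print(round(icc_3_1, 2), round(icc_2_1, 2))
```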

  12. Illustration (4) Use the Spearman-Brown formula to estimate the reliability of multiple raters and to estimate the number of raters needed for a desired level of reliability.
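A small sketch of both uses of the Spearman-Brown formula (the function names and the .95 target are mine); carrying over the single-rater ICC of about .74 from the illustration, the mean of the 3 raters has a reliability of roughly .90.

```python
import math

def spearman_brown(icc_1, m):
    """Reliability of the mean of m raters, given the single-rater ICC."""
    return m * icc_1 / (1 + (m - 1) * icc_1)

def raters_needed(icc_1, desired):
    """Smallest number of raters whose mean reaches the desired reliability."""
    return math.ceil(desired * (1 - icc_1) / (icc_1 * (1 - desired)))

icc_1 = 0.74                          # single fixed rater, from the illustration
print(spearman_brown(icc_1, 3))       # ~0.90: reliability of the 3-rater mean
print(raters_needed(icc_1, 0.95))     # raters needed to reach reliability .95
```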
