
Evaluating Health-Related Quality of Life Measures

Evaluating Health-Related Quality of Life Measures. Ron D. Hays, Ph.D. UCLA GIM & HSR February 10, 2014 (9:00-11:50 am) HPM 214, Los Angeles, CA. Where are we now in HPM 214? http://hpm214.med.ucla.edu/. Introduction Profile Measures Preference-Based Measures Designing Measures


Presentation Transcript


  1. Evaluating Health-Related Quality of Life Measures Ron D. Hays, Ph.D. UCLA GIM & HSR February 10, 2014 (9:00-11:50 am) HPM 214, Los Angeles, CA

  2. Where are we now in HPM 214? http://hpm214.med.ucla.edu/ • Introduction • Profile Measures • Preference-Based Measures • Designing Measures • Evaluating Measures • Use of Measures in HIV/AIDS • PROMIS/IRT • Course Review (Cognitive interview assignment due) • Final Exam (3/17/14)

  3. Four Levels of Measurement • Nominal (categorical) • Ordinal (rank) • Interval (numerical) • Ratio (numerical)

  4. Ordinal Scale • In general, would you say your health is … • Excellent? • Very good? • Good? • Fair? • Poor?

  5. Ordinal Scale • In general, would you say your health is … • 100 = Excellent? • 75 = Very good? [85] • 50 = Good? [60] • 25 = Fair? • 0 = Poor?

  6. Interval Scales • “Everyday” temperature scales • Fahrenheit • Centigrade • 20°C + 20°C = 40°C • 40°C ≠ 2 times as hot as 20°C • A 4-year-old is twice as old as a 2-year-old. If you subtract 1 from both of their ages, 4 becomes 3 and 2 becomes 1. The 4-year-old is still twice as old as the 2-year-old even though the new values are 3 versus 1, because “0” on the shifted scale no longer means zero years.

  7. Ratio Scales • Kelvin Temperature Scale (absolute 0) • Age • Days spent in hospital in last 30 days
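The contrast between interval and ratio scales on slides 6 and 7 can be checked numerically. The sketch below is only an illustration: the helper name `celsius_to_kelvin` is made up for this example, and the 273.15 offset is the standard Celsius-to-Kelvin conversion.

```python
def celsius_to_kelvin(celsius: float) -> float:
    """Convert a Celsius reading to Kelvin, a scale with an absolute zero."""
    return celsius + 273.15

# On the Celsius (interval) scale the ratio of the two readings is 2.0,
# but that ratio is not meaningful because 0 degrees C is an arbitrary zero.
ratio_celsius = 40.0 / 20.0

# On the Kelvin (ratio) scale the same two temperatures differ by about 7%.
ratio_kelvin = celsius_to_kelvin(40.0) / celsius_to_kelvin(20.0)

print(f"Celsius ratio: {ratio_celsius:.3f}")
print(f"Kelvin ratio:  {ratio_kelvin:.3f}")
```

Differences (40 - 20 = 20 degrees) are the same on both scales, which is what makes Celsius an interval scale. Only the Kelvin ratio describes how much "hotter" one temperature is than the other.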

  8. Measurement Range for HRQOL Measures Nominal Ordinal Interval Ratio

  9. Levels of Measurement and Their Properties

  10. Levels of Measurement and Their Properties

  11. Four Types of Data Collection Errors • Coverage Error: Does each person in the target population have an equal chance of selection? • Sampling Error: Only some members of the target population are sampled. • Nonresponse Error: Do people in the sample who respond differ from those who do not? • Measurement Error: Inaccuracy in answers given to survey questions.

  12. Characteristics of Good Measures • Acceptability • Variability • Reliability • Validity • Interpretability

  13. Indicators of Acceptability • Response rate • Administration time • Missing data (item, scale)

  14. Variability • Responses fall in each response category • Distribution approximates bell-shaped “normal” curve (about 68.3%, 95.4%, and 99.7% of scores within 1, 2, and 3 standard deviations of the mean)

  15. Reliability • Reliability is the degree to which the same score is obtained for the thing being measured (person, plant, or whatever) when that thing hasn’t changed. • Ratio of signal to noise
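The normal-curve percentages on this slide are the shares of a normal distribution within 1, 2, and 3 standard deviations of the mean. A minimal sketch of that calculation, using only the standard library (the function name `normal_coverage` is an illustrative choice):

```python
import math

def normal_coverage(z: float) -> float:
    """Proportion of a normal distribution lying within +/- z standard deviations."""
    return math.erf(z / math.sqrt(2.0))

for z in (1, 2, 3):
    print(f"within {z} SD of the mean: {normal_coverage(z):.1%}")
```

Comparing the observed share of scores in each band against these values is one quick check on how closely a scale's distribution approximates the normal curve.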

  16. Observed Score is:

  17. Flavors of Reliability • Inter-rater (rater) • Need 2 or more raters of the thing being measured • Test-retest (administrations) • Need 2 or more time points • Internal consistency (items) • Need 2 or more items

  18. Reliability Minimum Standards • 0.70 or above (for group comparisons) • 0.90 or higher (for individual assessment) • $SEM = SD \times (1 - \text{reliability})^{1/2}$ • 95% CI = true score ± 1.96 × SEM • If z-score = 0 and reliability = 0.90, then CI: -0.62 to +0.62 • Width of CI is 1.24 z-score units
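The standard error of measurement and confidence interval on this slide can be reproduced with a few lines. This is a sketch under the slide's own assumptions (scores in z-score units, so SD = 1); the function names `sem` and `ci95` are illustrative.

```python
import math

def sem(sd: float, reliability: float) -> float:
    """Standard error of measurement: SD * sqrt(1 - reliability)."""
    return sd * math.sqrt(1.0 - reliability)

def ci95(score: float, sd: float, reliability: float) -> tuple[float, float]:
    """95% confidence interval around a score: score +/- 1.96 * SEM."""
    half_width = 1.96 * sem(sd, reliability)
    return score - half_width, score + half_width

# Slide example: z-score of 0, SD of 1, reliability of 0.90.
low, high = ci95(0.0, 1.0, 0.90)
print(f"SEM = {sem(1.0, 0.90):.3f}")
print(f"95% CI = {low:.2f} to {high:.2f}, width = {high - low:.2f}")
```

Rerunning with a reliability of 0.70 widens the interval considerably, which is why the higher standard is recommended for decisions about individuals.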

  19. Hypothetical Ratings of Performance of Six Students in HPM 214 by Two Raters Using Excellent to Poor Scale [1 = Poor; 2 = Fair; 3 = Good; 4 = Very good; 5 = Excellent] • 1 = John (Good, Very good) • 2 = Ida (Very good, Excellent) • 3 = Di (Good, Good) • 4 = Claire (Fair, Poor) • 5 = Adriane (Excellent, Very good) • 6 = Ara (Fair, Fair) • (Target = 6 students; assessed by 2 raters)

  20. Kappa Coefficient of Agreement (Corrects for Chance)

  21. Cross-Tab of Ratings Rater 2

  22. Calculating KAPPA

  23. Guidelines for Interpreting Kappa

  24. Weighted Kappa (Linear and Quadratic) • $w_l = 1 - i / (k - 1)$ • $w_q = 1 - i^2 / (k - 1)^2$ • $i$ = number of categories by which the ratings differ • $k$ = number of categories • Linear weighted kappa = 0.52; quadratic weighted kappa = 0.77

  25. Intraclass Correlation and Reliability • Table columns: Model, Reliability, Intraclass Correlation • Table rows: One-way, Two-way mixed, Two-way random • BMS = Between-Ratee Mean Square • WMS = Within Mean Square • JMS = Item or Rater Mean Square • EMS = Ratee x Item (Rater) Mean Square • N = n of ratees • k = n of items or raters
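The weighted kappa values on this slide can be reproduced from the two raters' scores on slide 19. The sketch below assumes the usual chance-corrected form, kappa = (observed - expected) / (1 - expected), with the agreement weights defined above; the function name `weighted_kappa` and its `weight` argument are illustrative, not taken from the slides.

```python
from collections import Counter

def weighted_kappa(r1, r2, k, weight="linear"):
    """Cohen's kappa for two raters using categories 1..k.

    weight: "none" (exact agreement only), "linear", or "quadratic".
    """
    def w(a, b):
        diff = abs(a - b)  # number of categories the two ratings differ by
        if weight == "none":
            return 1.0 if diff == 0 else 0.0
        if weight == "linear":
            return 1.0 - diff / (k - 1)
        if weight == "quadratic":
            return 1.0 - diff ** 2 / (k - 1) ** 2
        raise ValueError(f"unknown weight: {weight}")

    n = len(r1)
    # Observed weighted agreement across the rated targets.
    observed = sum(w(a, b) for a, b in zip(r1, r2)) / n
    # Chance-expected weighted agreement from the raters' marginal counts.
    m1, m2 = Counter(r1), Counter(r2)
    expected = sum(m1[a] * m2[b] * w(a, b) for a in m1 for b in m2) / n ** 2
    return (observed - expected) / (1.0 - expected)

# Slide 19 ratings (1 = Poor ... 5 = Excellent): John, Ida, Di, Claire, Adriane, Ara.
rater1 = [3, 4, 3, 2, 5, 2]
rater2 = [4, 5, 3, 1, 4, 2]

print(f"unweighted: {weighted_kappa(rater1, rater2, 5, 'none'):.2f}")
print(f"linear:     {weighted_kappa(rater1, rater2, 5, 'linear'):.2f}")     # 0.52
print(f"quadratic:  {weighted_kappa(rater1, rater2, 5, 'quadratic'):.2f}")  # 0.77
```

The raters agree exactly on only two of six students, so the unweighted coefficient is low. Every disagreement is a single category apart, and the weighted versions give partial credit for those near misses.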

  26. Two-Way Random Effects (Reliability of Performance Ratings)
      Data (student, rater, rating): 01 1 3; 01 2 4; 02 1 4; 02 2 5; 03 1 3; 03 2 3; 04 1 2; 04 2 1; 05 1 5; 05 2 4; 06 1 2; 06 2 2
      Source                  df     SS      MS
      Students (BMS)           5    15.67    3.13
      Raters (JMS)             1     0.00    0.00
      Stud. x Raters (EMS)     5     2.00    0.40
      Total                   11    17.67
      2-way $R = \frac{6 (3.13 - 0.40)}{6 (3.13) + 0.00 - 0.40} = 0.89$
      ICC = 0.80

  27. Responses of Students to Two Questions about Their Health • 1 = John (Good, Very good) • 2 = Ida (Very good, Excellent) • 3 = Di (Good, Good) • 4 = Claire (Fair, Poor) • 5 = Adriane (Excellent, Very good) • 6 = Ara (Fair, Fair) • (Target = 6 students; assessed by 2 items)
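The mean squares and the two-way random effects reliability on this slide can be recomputed from the six students' ratings. In the sketch below the reliability formula follows the slide's arithmetic. The single-rater ICC formula is an assumption: it is the standard Shrout and Fleiss two-way random form, which is not spelled out in the transcript but does reproduce the slide's 0.80. The function name `two_way_mean_squares` is illustrative.

```python
def two_way_mean_squares(table):
    """Return (BMS, JMS, EMS) for a ratee-by-rater table with one rating per cell."""
    n, k = len(table), len(table[0])
    grand = sum(sum(row) for row in table) / (n * k)
    row_means = [sum(row) / k for row in table]
    col_means = [sum(row[j] for row in table) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between ratees
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between raters
    ss_total = sum((x - grand) ** 2 for row in table for x in row)
    ss_error = ss_total - ss_rows - ss_cols                  # ratee x rater

    bms = ss_rows / (n - 1)
    jms = ss_cols / (k - 1)
    ems = ss_error / ((n - 1) * (k - 1))
    return bms, jms, ems

# One row per student (John, Ida, Di, Claire, Adriane, Ara); columns are raters 1 and 2.
ratings = [[3, 4], [4, 5], [3, 3], [2, 1], [5, 4], [2, 2]]
n, k = len(ratings), len(ratings[0])
bms, jms, ems = two_way_mean_squares(ratings)

# Reliability of the average of the k raters (formula as used on the slide).
reliability = n * (bms - ems) / (n * bms + jms - ems)
# Single-rater ICC (assumed Shrout and Fleiss two-way random form).
icc = (bms - ems) / (bms + (k - 1) * ems + k * (jms - ems) / n)

print(f"BMS = {bms:.2f}, JMS = {jms:.2f}, EMS = {ems:.2f}")
print(f"two-way random reliability = {reliability:.2f}, ICC = {icc:.2f}")
```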

  28. Two-Way Mixed Effects (Cronbach’s Alpha)
      Data (respondent, item 1, item 2): 01 3 4; 02 4 5; 03 3 3; 04 2 1; 05 5 4; 06 2 2
      Source                  df     SS      MS
      Respondents (BMS)        5    15.67    3.13
      Items (JMS)              1     0.00    0.00
      Resp. x Items (EMS)      5     2.00    0.40
      Total                   11    17.67
      Alpha $= \frac{3.13 - 0.40}{3.13} = \frac{2.73}{3.13} = 0.87$
      ICC = 0.77
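As a cross-check on the ANOVA-based calculation, Cronbach's alpha can also be computed from item and total-score variances. The variance form used below, alpha = k/(k-1) × (1 - sum of item variances / variance of total score), is the common textbook definition rather than something stated on the slide. It is algebraically equivalent to (BMS - EMS) / BMS and gives the same 0.87 for these data. The function name `cronbach_alpha` is illustrative.

```python
from statistics import variance

def cronbach_alpha(responses):
    """Cronbach's alpha from a respondent-by-item table (variance formula)."""
    k = len(responses[0])
    # Sample variance of each item across respondents.
    item_variances = [variance([row[j] for row in responses]) for j in range(k)]
    # Sample variance of the summed scale score.
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1.0 - sum(item_variances) / total_variance)

# One row per respondent (John, Ida, Di, Claire, Adriane, Ara); columns are items 1 and 2.
responses = [[3, 4], [4, 5], [3, 3], [2, 1], [5, 4], [2, 2]]
alpha = cronbach_alpha(responses)
print(f"Cronbach's alpha = {alpha:.2f}")  # 0.87
```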

  29. Rating of 6 Students’ Health by 12 Family Members (2 per student) 1. John (fam1: Good, fam2: Very Good) 2. Ida (fam3: Very Good, fam4: Excellent) 3. Di (fam5: Good, fam6: Good) 4. Claire (fam7: Fair, fam8: Poor) 5. Adriane (fam9: Excellent, fam10: Very Good) 6. Ara (fam11: Fair, fam12: Fair) (Target = 6 students; assessed by 2 family members each)

  30. One-Way ANOVA (Reliability of Ratings of Students)
      Data (student, family member, rating; family members 10 to 12 appear as 0, 1, 2): 01 1 3; 01 2 4; 02 3 4; 02 4 5; 03 5 3; 03 6 3; 04 7 2; 04 8 1; 05 9 5; 05 0 4; 06 1 2; 06 2 2
      Source                  df     SS      MS
      Respondents (BMS)        5    15.67    3.13
      Within (WMS)             6     2.00    0.33
      Total                   11    17.67
      1-way $= \frac{3.13 - 0.33}{3.13} = \frac{2.80}{3.13} = 0.89$
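When each student is rated by a different pair of family members, raters are nested within students and only a one-way decomposition is possible. The sketch below recomputes BMS, WMS, and the one-way reliability (BMS - WMS) / BMS from the slide's data; the function name `one_way_mean_squares` is illustrative.

```python
def one_way_mean_squares(groups):
    """Return (BMS, WMS) when each target has its own set of k ratings."""
    n, k = len(groups), len(groups[0])
    grand = sum(sum(g) for g in groups) / (n * k)
    means = [sum(g) / k for g in groups]

    ss_between = k * sum((m - grand) ** 2 for m in means)
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    bms = ss_between / (n - 1)
    wms = ss_within / (n * (k - 1))
    return bms, wms

# One row per student; the two values are ratings by that student's two family members.
family_ratings = [[3, 4], [4, 5], [3, 3], [2, 1], [5, 4], [2, 2]]
bms, wms = one_way_mean_squares(family_ratings)
one_way_reliability = (bms - wms) / bms

print(f"BMS = {bms:.2f}, WMS = {wms:.2f}")
print(f"one-way reliability = {one_way_reliability:.2f}")  # 0.89
```

Because rater differences cannot be separated from error in this design, the within mean square absorbs both sources of variation.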

  31. Standardized Alpha for Different Numbers of Items and Average Inter-item Correlation
      Number of           Average Inter-item Correlation (r)
      Items (k)      .0      .2      .4      .6      .8     1.0
          2        .000    .333    .571    .750    .889   1.000
          4        .000    .500    .727    .857    .941   1.000
          6        .000    .600    .800    .900    .960   1.000
          8        .000    .667    .842    .923    .970   1.000
      $\alpha_{st} = \frac{k \cdot r}{1 + (k - 1) \cdot r}$

  32. Spearman-Brown Prophecy Formula • $\alpha_y = \frac{N \cdot \alpha_x}{1 + (N - 1) \cdot \alpha_x}$ • N = how much longer scale y is than scale x
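The table on this slide follows directly from the standardized alpha formula, alpha = k × r / (1 + (k - 1) × r). A short sketch that regenerates it (the function name `standardized_alpha` is illustrative):

```python
def standardized_alpha(k: int, r: float) -> float:
    """Standardized alpha for k items with average inter-item correlation r."""
    return k * r / (1 + (k - 1) * r)

correlations = (0.0, 0.2, 0.4, 0.6, 0.8, 1.0)
print("k    " + "  ".join(f"r={r:.1f}" for r in correlations))
for k in (2, 4, 6, 8):
    row = "  ".join(f"{standardized_alpha(k, r):.3f}" for r in correlations)
    print(f"{k:<4} {row}")
```

Reading across a row shows the effect of more homogeneous items, and reading down a column shows the effect of adding items at a fixed average correlation.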
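The Spearman-Brown formula is easy to apply in both directions: projecting the reliability of a lengthened scale, and solving for how much longer a scale must be to reach a target. The sketch below uses hypothetical input values (a reliability of 0.70 and a target of 0.90) that do not come from the slides. The inverse formula in `lengthening_factor` is obtained by rearranging the prophecy formula for N, and both function names are illustrative.

```python
def spearman_brown(alpha_x: float, n: float) -> float:
    """Projected reliability of a scale n times as long as one with reliability alpha_x."""
    return n * alpha_x / (1 + (n - 1) * alpha_x)

def lengthening_factor(alpha_x: float, alpha_y: float) -> float:
    """How many times longer the scale must be to move from alpha_x to alpha_y."""
    return alpha_y * (1 - alpha_x) / (alpha_x * (1 - alpha_y))

# Hypothetical example: a scale with reliability 0.70.
doubled = spearman_brown(0.70, 2)
needed = lengthening_factor(0.70, 0.90)
print(f"reliability if doubled in length: {doubled:.3f}")
print(f"length multiplier needed for 0.90: {needed:.2f}")
```

The projection assumes the added items are comparable in quality to the existing ones. A factor below 1 describes a shortened scale.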

  33. Example Spearman-Brown Calculations

  34. Number of Items and Reliability: Three Versions of the Mental Health Inventory (MHI)

  35. Multitrait Scaling Analysis • Internal consistency reliability • Item convergence • Item discrimination

  36. Item-scale correlation matrix

  37. Item-scale correlation matrix

  38. Validity • Does instrument measure what it is supposed to measure? • A “validated” instrument is a holy grail

  39. Reliability and Validity

  40. Threats to Validity • Acquiescent Response Set • Socially Desirable Response Set

  41. Listed below are a few statements about your relationships with others. How much is each statement TRUE or FALSE for you? • 1. I am always courteous even to people who are disagreeable. • 2. There have been occasions when I took advantage of someone. • 3. I sometimes try to get even rather than forgive and forget. • 4. I sometimes feel resentful when I don’t get my way. • 5. No matter who I’m talking to, I’m always a good listener. • Response options: Definitely true; Mostly true; Don’t know; Mostly false; Definitely false

  42. Two Types of Validity • Content Validity • Includes face validity • Construct Validity • Many Synonyms

  43. Content Validity • Does the measure adequately represent the domain? • Do items operationalize the concept? • Do items cover all aspects of the concept? • Does the scale name represent item content? • Face validity is the extent to which a measure “appears” to reflect what it is intended to measure • E.g., as judged by experts or by patient focus groups

  44. Construct Validity • Do scores on a measure relate to other variables in ways consistent with hypotheses?

  45. Evaluating Construct Validity • Cohen effect size rules of thumb (d = 0.2, 0.5, and 0.8) correspond to: small correlation = 0.100; medium correlation = 0.243; large correlation = 0.371 • $r = d / (d^2 + 4)^{0.5}$ • For d = 0.8: $r = 0.8 / (0.8^2 + 4)^{0.5} = 0.8 / (0.64 + 4)^{0.5} = 0.8 / (4.64)^{0.5} = 0.8 / 2.154 = 0.371$ • (Beware: r’s of 0.10, 0.30, and 0.50 are often cited as small, medium, and large.)

  46. Average HRQOL Scores for Comparison Groups and Deviation Scores for Patients With Chronic Conditions • From Stewart AL, Greenfield S, Hays RD, et al. Functional status and well-being of patients with chronic conditions: results from the Medical Outcomes Study. JAMA 1989;262:907-913.
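The conversion from Cohen's d to a correlation on this slide can be checked for all three rules of thumb. The sketch below uses the slide's formula, r = d / sqrt(d² + 4), which applies when the two groups being compared are of equal size; the function name `d_to_r` is illustrative.

```python
import math

def d_to_r(d: float) -> float:
    """Convert Cohen's d to a correlation, assuming two equal-sized groups."""
    return d / math.sqrt(d ** 2 + 4)

for label, d in (("small", 0.2), ("medium", 0.5), ("large", 0.8)):
    print(f"{label:<6} d = {d:.1f}  ->  r = {d_to_r(d):.3f}")
```

The converted values (0.100, 0.243, 0.371) are noticeably smaller than the 0.10, 0.30, and 0.50 benchmarks often quoted for correlations, which is the caution the slide raises.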

  47. Relative Validity Analyses • Form of "known groups" validity • Relative sensitivity of measure to important clinical difference • One-way between group ANOVA

  48. Relative Validity Example

  49. Responsiveness to Change • HRQOL measures should be responsive to interventions that change HRQOL • Need external indicators of change (anchors)

  50. Self-Report Indicator of Change • Overall, has there been any change in your asthma since the beginning of the study? • Much improved; Moderately improved; Minimally improved • No change • Minimally worse; Moderately worse; Much worse
