Establishing the Reliability and Validity of Outcomes Assessment Measures

Anthony R. Napoli, PhD, and Lanette A. Raymond, MA. Office of Institutional Research & Assessment, Suffolk County Community College. http://sccaix1.sunysuffolk.edu/Web/Central/IT/InstResearch/


Presentation Transcript


  1. Establishing the Reliability and Validity of Outcomes Assessment Measures • Anthony R. Napoli, PhD • Lanette A. Raymond, MA • Office of Institutional Research & Assessment, Suffolk County Community College • http://sccaix1.sunysuffolk.edu/Web/Central/IT/InstResearch/

  2. Validity defined • The validity of a measure indicates to what extent items measure some aspect of what they are purported to measure

  3. Types of Validity • Face Validity • Content Validity • Construct Validity • Criterion-Related Validity

  4. Face Validity • It looks like a test of *#%* • Not validity in a technical sense

  5. Content Validity • Incorporates quantitative estimates • Domain Sampling • The simple summing or averaging of dissimilar items is inappropriate

  6. Construct Validity • Indicated by correspondence of scores to other known valid measures of the underlying theoretical trait • Discriminant Validity • Convergent Validity

  7. Criterion-Related Validity • Represents performance in relation to particular tasks of discrete cognitive or behavioral objectives • Predictive Validity • Concurrent Validity

  8. Reliability defined • The reliability of a measure indicates the degree to which an instrument consistently measures a particular skill, knowledge base, or construct • Reliability is a precondition for validity

  9. Types of Reliability • Inter-rater (scorer) reliability • Inter-item reliability • Test-retest reliability • Split-half & alternate forms reliability
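
As a rough illustration of split-half reliability from the list above: the test is split into two halves, the half scores are correlated, and the correlation is stepped up to full test length. The sketch below assumes the Spearman-Brown formula for that step-up; the slides do not say which correction the authors used, and the function name is hypothetical.

```python
def spearman_brown(r_half: float) -> float:
    """Estimate full-length reliability from the correlation between two half-tests.

    Assumes the two halves are parallel (equal length and variance).
    """
    return 2 * r_half / (1 + r_half)


# A half-test correlation of 0.50 steps up to about 0.67 for the full test.
print(round(spearman_brown(0.50), 2))
```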

  10. Validity & Reliability in Plain English • Assessment results must represent the institution, program, or course • Evaluation of the validity and reliability of the assessment instrument and/or rubric will provide the documentation that they do

  11. Content Validity for Subjective Measures • The learning outcomes represent the program/course (domain sampling) • The instrument addresses the learning outcomes • There is a match between the instrument and the rubric • Rubric scores can be applied to the learning outcomes, and indicate the degree of student achievement within the program/course

  12. Inter-Scorer Reliability • Rubric scores can be obtained and applied to the learning outcomes, and indicate the degree of student achievement within the program/course consistently

  13. Content Validity for Objective Measures • The learning outcomes represent the program/course • The items on the instrument address specific learning outcomes • Instrument scores can be applied to the learning outcomes, and indicate the degree of student achievement within the program/course

  14. Inter-Item Reliability • Items that measure the same learning outcomes should consistently exhibit similar scores

  15. Content Validity (CH19) • A 12-item test measured students’ mastery of the objectives • Objective I: Write and decipher chemical nomenclature • Objective II: Solve both quantitative and qualitative problems • Objective III: Balance equations and solve mathematical problems associated with balanced equations • Objective IV: Demonstrate an understanding of intra-molecular forces

  16. Content Validity (CH19)

  17. Content Validity (SO11) • A 30-item test measured students’ mastery of the objectives • Objective I: Identify the basic methods of data collection • Objective II: Demonstrate an understanding of basic sociological concepts and social processes that shape human behavior • Objective III: Apply sociological theories to current social issues

  18. Content Validity (SO11)

  19. Content Validity (SO11)

  20. Inter-Rater Reliability: Fine Arts Portfolio • Rated criteria: Drawing, Design, Technique, Creativity, Artistic Process, Aesthetic Criteria, Growth, Portfolio Presentation • Scale: 5 = Excellent, 4 = Very Good, 3 = Satisfactory, 2 = Unsatisfactory, 1 = Unacceptable

  21. Inter-Rater Reliability: Fine Arts Portfolio
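
The transcript does not preserve the portfolio results or name the statistic the authors used. As a minimal sketch, two common ways to summarize agreement between two raters on a 1 to 5 rubric are the Pearson correlation of their scores and the proportion of exact matches. The scores below are invented for illustration only.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def exact_agreement(x, y):
    """Proportion of portfolios given the identical score by both raters."""
    return sum(a == b for a, b in zip(x, y)) / len(x)


# Hypothetical rubric scores (not from the presentation).
rater_a = [5, 4, 3, 2, 1]
rater_b = [4, 4, 3, 2, 2]

print(round(pearson_r(rater_a, rater_b), 3))
print(exact_agreement(rater_a, rater_b))
```

The correlation rewards raters who rank portfolios the same way even if one is systematically harsher, while exact agreement does not, so the two numbers answer slightly different questions.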

  22. Inter-Item Reliability (PC11) • A 20-item test measured students’ mastery of the objectives • Objectives: Demonstrate a satisfactory knowledge of: (1) the history, terminology, methods, & ethics in psychology; (2) concepts associated with the 5 major schools of psychology; (3) the basic aspects of human behavior including learning and memory, personality, physiology, emotion, etc…; (4) an ability to obtain and critically analyze research in the field of modern psychology

  23. Inter-Item Reliability (PC11) • Embedded-questions methodology • Inter-item or internal consistency reliability • KR-20, rtt = .71 • Mean score = 12.478 • Std Dev = 3.482 • Std Error = 0.513 • Mean grade = 62.4%
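
For readers unfamiliar with KR-20, here is a minimal sketch of the Kuder-Richardson Formula 20 for right/wrong (0/1) items. The item-level PC11 data are not in the transcript, so the response matrix is invented, and the choice of population variance (dividing by n) is an assumption; some texts divide by n - 1.

```python
def kr20(responses):
    """KR-20 for a list of students, each a list of 0/1 item scores."""
    n_students = len(responses)
    k = len(responses[0])  # number of items

    # Variance of students' total scores (population form, divides by n).
    totals = [sum(student) for student in responses]
    mean_total = sum(totals) / n_students
    var_total = sum((t - mean_total) ** 2 for t in totals) / n_students

    # Sum over items of p * q, where p is the proportion answering correctly.
    sum_pq = 0.0
    for j in range(k):
        p = sum(student[j] for student in responses) / n_students
        sum_pq += p * (1 - p)

    return (k / (k - 1)) * (1 - sum_pq / var_total)


# Hypothetical 4 students x 3 items (not from the presentation).
toy = [[1, 1, 0], [1, 0, 0], [1, 1, 1], [0, 0, 0]]
print(kr20(toy))
```

The reported mean grade is consistent with the other figures: 12.478 correct out of 20 items is 62.4%.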

  24. Inter-Item Reliability (PC11): Motivational Comparison • 2 Groups • Graded Embedded Questions • Non-Graded Form & Motivational Speech • Mundane Realism

  25. Inter-Item Reliability (PC11): Motivational Comparison • Graded condition produces higher scores (t(78) = 5.62, p < .001). • Large effect size (d = 1.27).
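
The slide does not say how d was computed. One common conversion for two independent groups of equal size is d = 2t / sqrt(df), and under that assumption the reported t and degrees of freedom reproduce the reported effect size. The function name below is hypothetical.

```python
import math


def cohens_d_from_t(t: float, df: int) -> float:
    """Approximate Cohen's d from an independent-samples t statistic.

    Assumes two groups of equal size, so that d = 2t / sqrt(df).
    """
    return 2 * t / math.sqrt(df)


# Values reported on the slide: t(78) = 5.62.
print(round(cohens_d_from_t(5.62, 78), 2))  # 1.27
```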

  26. Inter-Item Reliability (PC11): Motivational Comparison • Minimum competency 70% or better • Graded condition produces greater competency (Z = 5.69, p < .001).
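
The transcript gives neither the competency counts nor the exact Z test used. A plausible reading is a two-proportion z-test on the share of students reaching the 70% cutoff in each group; the sketch below shows that test with a pooled standard error. The counts are invented and are not meant to reproduce Z = 5.69.

```python
import math


def two_proportion_z(x1: int, n1: int, x2: int, n2: int) -> float:
    """z statistic for the difference between two independent proportions.

    x1, x2 are the numbers of competent students; n1, n2 are group sizes.
    Uses the pooled proportion for the standard error.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se


# Hypothetical counts: 30 of 40 graded vs. 10 of 40 non-graded students competent.
print(round(two_proportion_z(30, 40, 10, 40), 2))
```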

  27. Inter-Item Reliability (PC11): Motivational Comparison • In the non-graded condition this measure is neither reliable nor valid • KR-20 (non-graded) = 0.29

  28. Criterion-Related Concurrent Validity (PC11)

  29. “I am ill at these numbers.” -- Hamlet --

  30. “When you can measure what you are speaking about and express it in numbers, you know something about it; but when you cannot express it in numbers, your knowledge is of a meager and unsatisfactory kind.” -- Lord Kelvin -- “There are three kinds of lies: lies, damned lies, and statistics.” -- Benjamin Disraeli --
