Lessons from High-Stakes Licensure Examinations for Medical School Examinations


Presentation Transcript


  1. Lessons from High-Stakes Licensure Examinations for Medical School Examinations. Queen's University, 4 December 2008. Dale Dauphinee, MD, FRCPC, FCAHS

  2. Background: FAME Course. The goal today is to offer insights for those of you working at the undergraduate level, looking back on my two careers in assessment: undergraduate Associate Dean and CEO of the MCC!

  3. FAME Course Framework

  4. Elements of Talk
  • Process: be clear on why we are doing this!
  • Describe: the assessment steps, written down
  • Item design: key issues
  • Structure: be clear where decisions are made
  • Outcome: pass-fail or honours-pass-fail
  • Evaluation cycle: it is about improvement!
  • Getting into trouble
    • Problems in process: the questions to be asked
    • Never ask them after the fact: ANTICIPATE
    • Prevention

  5. Preparing a ‘Course’ Flow Chart
  • For whom and what?
  • What is the practice/curriculum model?
  • What method?
  • What is the blueprint and sampling frame? (see the sketch below)
  • To what resolution level will they answer?
  • Scoring and analysis
  • Decision making
  • Reporting
  • Due process
  HINT: Think project management! What are the intended steps?
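
To make the blueprint and sampling-frame step concrete, here is a minimal sketch, with invented domain names and item counts, that treats the blueprint as a set of target item counts per content domain and checks a draft test form against those targets. It illustrates the idea rather than anything described in the talk.

```python
from collections import Counter

# Hypothetical blueprint: target number of items per content domain (invented values).
blueprint = {"cardiology": 8, "respirology": 6, "endocrinology": 4, "ethics": 2}

# A draft test form: each item is tagged with the domain it samples.
draft_form = (["cardiology"] * 7 + ["respirology"] * 6 +
              ["endocrinology"] * 5 + ["ethics"] * 2)

def check_against_blueprint(form, blueprint):
    """Report domains where the assembled form over- or under-samples the blueprint."""
    counts = Counter(form)
    for domain, target in blueprint.items():
        actual = counts.get(domain, 0)
        if actual != target:
            print(f"{domain}: blueprint asks for {target} items, form has {actual}")

check_against_blueprint(draft_form, blueprint)
# cardiology: blueprint asks for 8 items, form has 7
# endocrinology: blueprint asks for 4 items, form has 5
```

The same idea extends to a two-way sampling grid (for example, domain by skill) when the frame has more than one dimension.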

  6. Classic Assessment Cycle (diagram): a cycle linking desired objectives or attributes, the educational program, assessment of performance, performance gaps, and program revisions.

  7. Change in the Hallmarks of Competence – Increasing Validity (diagram, adapted from van der Vleuten 2000): from 1960 to 2000, assessment climbed in professional or clinical authenticity, from knowledge assessment to problem-solving assessment to clinical skills assessment to practice assessment.

  8. Climbing the Pyramid
  • Does: performance assessment in vivo (undercover SPs, video, logs…)
  • Shows how: performance assessment in vitro (OSCE, SP-based tests…)
  • Knows how: (clinical) context-based tests (MCQ, essay type, oral…)
  • Knows: factual tests (MCQ, essay type, oral…)

  9. Traditional View (diagram relating curriculum, assessment, teacher and student). After van der Vleuten, 1999.

  10. An Alternative View (diagram relating the same elements: curriculum, assessment, teacher and student). After van der Vleuten, 1999.

  11. Traditional Assessment: What, Where & How – Traditional Tests/Tools at School (Student-Trainee Assessment)
  • Content: maps onto the domain and curriculum to which the results generalize – the basis of assessment
  • Where and who: within ‘set’ programs where candidates are in the same cohort
  • Measurement:
    • Test or tool testing time is long enough to yield reliable results (a brief sketch of why follows below)
    • Tests are comparable from administration to administration
    • Controlled environment – not complex
    • Can differences be attributed to the candidate? …and can ‘exam-based’ or error attribution be ruled out?
    • Adequate numbers per cohort
  In short: does the content map onto the domain? Is the test long enough to be reliable? Are differences attributable to the candidate? Are the tests comparable? The ideal test has all these qualities!
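
As background for the 'long enough to yield reliable results' point, here is a standard classical-test-theory sketch; it is general psychometric background rather than content from the slide. The observed score is modelled as a true score plus error, reliability is the share of observed-score variance due to true scores, and the Spearman-Brown formula shows why lengthening a test with comparable items raises reliability:

```latex
X = T + E, \qquad
\rho_{XX'} = \frac{\sigma^2_T}{\sigma^2_T + \sigma^2_E}, \qquad
\rho_{kk'} = \frac{k\,\rho_{XX'}}{1 + (k - 1)\,\rho_{XX'}}
```

Here k is the factor by which the test is lengthened: doubling (k = 2) a test whose reliability is 0.6 raises the predicted reliability to 1.2/1.6 = 0.75.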

  12. Principle: it is all about the context and purpose of your course, and then the intended use of the test score – or of the program! ‘There is no test for all seasons or for all reasons’.

  13. Written Tests: Designing Items – Key Concepts

  14. Principle: ∴ the case ‘prompts’ or item stems must create low-level simulations in the candidate’s mind of the performance situations that are about to be assessed.

  15. Classifying Constructed Formats
  • Cronbach (1984) defined constructed-response formats as a broad class of item formats in which the response is generated by the examinee rather than selected from a list of options.
  • Haladyna (1997) divided constructed-response formats into:
    • High-inference formats: require expert judgment about the trait being observed
    • Low-inference formats: the behaviour of interest is observed directly (short answer; checklists)

  16. Types of CR Formats*
  Low inference:
  • Work sampling (done in real time)
  • In-training evaluations (rating provided later)
  • Mini-CEX
  • Short answer
  • Clinical orals: structured
  • Essays (with a score key)
  • Key features (no menus)
  • OSCEs at the early UG level
  High inference:
  • Work-based 360s
  • OSCEs at the graduate level
  • Orals (not ‘old’ vivas)
  • Complex simulations: teams, interventions
  • Case-based discussions
  • Portfolios
  • Demonstration of procedures
  *Principle: all CR formats need lots of development planning – you can’t show up and wing it!

  17. What Do CRs Offer & What Must One Consider for Good CRs
  The CR format can provide:
  • An opportunity for candidates to generate/create a response
  • An opportunity to move beyond MCQs
  • A response evaluated by comparison to pre-developed criteria
  • Evaluation criteria with a range of values acceptable to the faculty of the course or the testing body
  CRs: other considerations
  • Writers/authors need training
  • Need a CR development process
  • Need a topic selection plan or blueprint
  • Need guidelines
  • Need a scoring rubric and analysis → reporting (a minimal scoring sketch follows below)
  • Need a content review process
  • Need a test assembly process
  • May encounter technical issues…
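
To make the 'pre-developed criteria' and 'scoring rubric' points concrete, here is a minimal sketch of scoring a constructed response against an analytic rubric. It is not the MCC's actual procedure; the criteria, point values and acceptable answers are invented for illustration.

```python
# Hypothetical analytic rubric for one constructed-response (key-features style) case.
rubric = [
    {"criterion": "identifies most likely diagnosis", "points": 2,
     "acceptable": {"pulmonary embolism", "pe"}},
    {"criterion": "orders appropriate initial test", "points": 2,
     "acceptable": {"ct pulmonary angiogram", "ctpa", "v/q scan"}},
    {"criterion": "starts anticoagulation", "points": 1,
     "acceptable": {"heparin", "low molecular weight heparin", "lmwh"}},
]

def score_response(answers, rubric):
    """Compare a candidate's short answers to pre-developed criteria; return (score, max)."""
    normalized = {a.strip().lower() for a in answers}
    score = sum(item["points"] for item in rubric
                if normalized & item["acceptable"])
    max_score = sum(item["points"] for item in rubric)
    return score, max_score

candidate_answers = ["PE", "CTPA"]                 # no anticoagulation mentioned
print(score_response(candidate_answers, rubric))   # (4, 5)
```

In practice the rubric and its acceptable-answer lists would come out of the development, content review and guideline steps listed above.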

  18. Moving to Clinical Assessment. Think of it as work assessment! Point: the validity of the scoring is key, because the scores are being used to judge clinical competence in certain domains!

  19. Clinical Assessment Issues: Presentation Grid
  • Context: clinical skills; work assessment
  • Overview: validating test scores; validating decisions
  • Examples: exit (final) OSCE; mini-CEX
  • Conclusion

  20. Key Pre-condition #1: What Is the Educational Goal?
  • And what level of resolution is expected?
  • Have you defined the purpose or goal of the evaluation and the manner in which the result will be used?
  • Learning point: you need to avoid Downing’s threats to validity:
    • Too few cases/items (construct under-representation)
    • Flawed cases/items (construct-irrelevant variance)
  • If not, you are not ready to proceed!

  21. Key Pre-condition #2: Be Clear About Due Process!
  • Ultimately, if the instrument is an ‘exit’ exam or an assessment to be used for promotion, clarity about ‘due process’ is crucial
  • For example: the student must know that he/she has the right to the last word; the ‘board’ must have followed acceptable standards of decision-making; etc.

  22. Practically, in 2008, validity implies … that in the interpretation of a test score, a series of assertions, assumptions and arguments are considered that support that interpretation!
  • ∴ Validation is a pre-decision assessment: specifying how you will consider and interpret the results as the ‘evidence’ to be used in the final ‘decision-making’!
  • In simple terms: for student promotion, a series of conditional steps (‘cautions’) is needed to document a ‘legitimate’ assessment ‘process’
  • ∴ These are the critical steps for a ‘valid’ process leading to the ultimate decision, i.e. making a pass/fail decision or providing a standing

  23. General Framework for Evaluating Assessment Methods – after Swanson
  • Evaluation: determining the quality of the performance observed on the test
  • Generalization: generalizing from performance on the test to other tests covering similar, but not identical, content
  • Extrapolation: inferring performance in actual practice from performance on the test
  Evaluation, generalization and extrapolation are like links in a chain: the chain is only as strong as the weakest link.

  24. Kane’s ‘Links in a Chain’ Defense – after Swanson (diagram): evaluation → generalization → extrapolation; the chain includes scoring and decision-making.

  25. Scoring: Deriving the Evidence
  • Content validity:
    • Performance- and work-based tests
    • Enough items/cases?
    • Match to the exam blueprint and its ultimate uses
    • The exam versus work-related assessment point
  • Direct measures of observed attributes
  • Key: is it being scored by items or by cases? (see the sketch below)
  • The observed score is compared to a target score
  • The item (case) must match the patient problem – and the candidates’ ability!
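
To illustrate the 'scored by items or by cases' distinction, here is a minimal sketch with invented OSCE checklist data. Pooling every checklist item across stations weights long checklists more heavily, whereas scoring case by case gives each station equal weight; which aggregation is defensible depends on the blueprint and on how the target score is defined.

```python
# Invented OSCE results: each station (case) has a checklist of 0/1 item scores.
stations = {
    "chest pain":  [1, 0, 0, 1, 0, 1, 0, 1],   # 8-item checklist, 4/8 correct
    "knee exam":   [1, 1, 1, 1],               # 4-item checklist, 4/4 correct
    "counselling": [1, 1, 0, 1, 1, 0],         # 6-item checklist, 4/6 correct
}

# Item-level score: pool every checklist item across all stations.
all_items = [x for items in stations.values() for x in items]
item_level = sum(all_items) / len(all_items)

# Case-level score: score each station as a proportion, then average the stations.
case_level = sum(sum(items) / len(items) for items in stations.values()) / len(stations)

print(f"item-level score: {item_level:.3f}")   # 0.667 (12 of 18 items)
print(f"case-level score: {case_level:.3f}")   # 0.722 (mean of 0.50, 1.00, 0.67)
```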

  26. Preparing the Evidence
  • From results to evidence: three inferences
    • Evaluate the performance – obtain a score
    • Generalize that score to a target score
    • Translate the target score into a verbal ‘description’
    • All three inferences must be valid
  • Process:
    • Staff roles versus decision-makers’ responsibilities/roles
    • Flawed items/cases
    • Flag unusual or critical events for the decision-makers
    • Prepare analyses
    • Comparison data

  27. Validating the Scoring – the Evidence
  • Validation is carried out in two stages:
    • Developmental stage: the process is nurtured and refined
    • Appraisal stage: the real thing – trial by fire!
  • The interpretive argument
    • Content validity: how do the scores function under the various required conditions?
    • Enough items/cases?
    • Eliminate flawed items/cases

  28. Observation of Performance with Real Patients (diagram: evaluation, generalization, extrapolation). Extrapolation holds if the candidate sees a variety of patients.

  29. Objective Structured Clinical Examination (OSCE) – Dave Swanson (diagram: evaluation, generalization, extrapolation).

  30. Stop and Re-consider …. What were the educational goals? AND How will the decision be used?

  31. The Decision-making Process
  • Standard setting: there are many methods (one common one is sketched below), but the keys are:
    • ultimate success
    • fidelity
    • the care with which the decision is executed is crucial – and must be documented
  • Helpful hint: standard setting can also be used to define faculty expectations for content and use – in advance of the test!
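
The slide does not name a specific standard-setting method, so purely as an illustration of one common approach, here is a sketch of the Angoff procedure with invented judge ratings: each judge estimates, item by item, the probability that a borderline candidate would answer correctly, and the cut score is the sum of the judges' average estimates.

```python
# Hypothetical Angoff ratings: for each item, each judge's estimated probability
# that a minimally competent (borderline) candidate answers correctly.
angoff_ratings = [
    [0.60, 0.70, 0.65],   # item 1: three judges' estimates
    [0.40, 0.50, 0.45],   # item 2
    [0.80, 0.75, 0.85],   # item 3
    [0.55, 0.60, 0.50],   # item 4
]

# Expected mark per item for a borderline candidate = mean across judges.
item_expectations = [sum(ratings) / len(ratings) for ratings in angoff_ratings]

# Cut score = sum of the item expectations (here, out of 4 marks).
cut_score = sum(item_expectations)
print(f"cut score: {cut_score:.2f} out of {len(angoff_ratings)} marks")   # 2.45 out of 4 marks
```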

  32. The Decision-making Process
  • Generic steps:
    • the exam was conducted properly;
    • the results are psychometrically accurate and valid;
    • establish the pass-fail point;
    • and consider each candidate’s results
  • The evaluative steps (shown in red on the slide) require a process that is deliberate and reflective, with open discussion
  • The decision steps (shown in black on the slide) are the decisions themselves
  • All members of the decision-making board must be ‘in’ – or else an escalation procedure needs to be established, in advance!

  33. Examples
  OSCE: MCC meeting steps
  • Overview: how the exam went
  • Review each station: discussion; decision – use all cases
  • Review the results ‘in toto’
  • Decide on the pass-fail point
  • Consider each person: decide pass-fail for specific challenging instances
  • Award standing, or a tentative decision
  • Comments
  Work-based: mini-CEX (a six-month rotation in PGY-1)
  • Construction steps: sampling grid? numbers needed (see the sketch below); score per case
  • Rating issues: global (preferred) vs. checklist; scale issues
  • Examiner strategy: not the same examiner each time; number needed; preparation
  • Awarding standing: pass-fail, or one of several parameters?
  • Comments
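
On the 'numbers needed' question, one common back-of-the-envelope approach is the Spearman-Brown prophecy formula; the reliability values below are assumptions for illustration, not figures from the talk. Given the reliability of a single mini-CEX encounter, it estimates how many comparable encounters are needed to reach a target overall reliability.

```python
def encounters_needed(single_encounter_reliability: float, target_reliability: float) -> float:
    """Spearman-Brown prophecy: number of comparable encounters needed to hit the target."""
    r, t = single_encounter_reliability, target_reliability
    return (t * (1 - r)) / (r * (1 - t))

# Assumed values for illustration only: a single mini-CEX rating with reliability ~0.25,
# aiming for an overall reliability of 0.80 across the rotation.
n = encounters_needed(0.25, 0.80)
print(f"about {n:.0f} encounters needed")   # about 12 encounters needed
```

A full generalizability study would separate examiner and case variance, but the prophecy formula gives a quick first estimate; round any fractional answer up to the next whole encounter.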

  34. Appeals vs. Remarking!
  • Again, a pre-defined process is needed
  • When tending toward a negative decision, the candidate has the right to the last word before the final decision
  • Where does that take place? You must plan this!
  • Differentiate decision-making from rescoring
  • This requires an independent ‘ombudsperson’
  • Other common issues

  35. Delivering the News
  • Depends on the purpose and the desired use – context driven
  • In a high-stakes situation at a specific faculty, you may want a two-step process
  • When tending toward a negative decision: the candidate has the right to the last word before a decision is made, i.e. the right to provide evidence that addresses the board’s concerns
  • Final decision
  • Comments/queries?

  36. Key Lessons: Re-cap
  • Purpose and use of the result
  • Overview of due process – in promotion
  • Overview of validity – prefer Kane’s approach
    • The scoring component of validity
    • Generalization and extrapolation
    • True-score variance ↑ and error variance ↓
    • The interpretation/decision-making components of validity
  • Know ‘due process’

  37. Are you ready?
  • Are the faculty clear on the ultimate use and purpose of the test or exam?
  • How will you track the issues to be resolved?
  • Have you defined the major feasibility challenges at your institution – and a plan to meet them?
  • Do you have a process to assure valid scoring and interpretation of the result?
  • Do you have support and back-up?

  38. Summary and Questions Thank You!

  39. References
  Clauser BE, Margolis MJ, Swanson DB (2008). Issues of Validity and Reliability for Assessments in Medical Education. In: Hawkins R, Holmboe ES, eds. Practical Guide to the Evaluation of Clinical Competence. Mosby.
  Pangaro L, Holmboe ES (2008). Evaluation Forms and Global Rating Forms. In: Hawkins R, Holmboe ES, eds. Practical Guide to the Evaluation of Clinical Competence. Mosby.
  Newble D, Dawson-Saunders B, Dauphinee WD, et al. (1994). Guidelines for Assessing Clinical Competence. Teaching and Learning in Medicine 6(3): 213-220.
  Kane MT (1992). An Argument-Based Approach to Validity. Psychological Bulletin 112(3): 527-535.
  Downing S (2003). Validity: on the meaningful interpretation of assessment data. Medical Education 37: 830-7.
  Norcini J (2003). Work based assessment. BMJ 326: 753-5.
  Smee S (2003). Skill based assessment. BMJ 326: 703-6.
