
Performance-Based Testing to Measure Ophthalmic Skills Using Computer Simulation


Presentation Transcript


  1. Performance-Based Testing to Measure Ophthalmic Skills Using Computer Simulation
  Authors
  • John T. LiVecchi, MD, Assistant Clinical Professor, Drexel University College of Medicine and University of Central Florida College of Medicine; Director of Oculoplastic Surgery, St. Luke’s Cataract & Laser Institute
  • William Ehlers, MD, Associate Professor, University of Connecticut Health Center, University of Connecticut
  • Lynn Anderson, PhD, Chief Executive Officer, Joint Commission on Allied Health Personnel in Ophthalmology (JCAHPO)
  Overview
  JCAHPO is a non-profit, non-governmental organization that provides certification of ophthalmic medical assistants and performs other educational and credentialing services. JCAHPO is governed by a Board of Directors composed of representatives from participating ophthalmic organizations and a public member. (April 2011)
  The authors have no financial interest in the subject matter of this poster.

  2. Abstract
  Purpose: To investigate the validity and reliability of an interactive computer-based simulation and to test a computer-automated scoring algorithm to replace hands-on clinical skill testing with live observers by assessing the knowledge and performance of ophthalmic technicians on clinical skills.
  Design: Validity and reliability study of videotaped ophthalmic technicians’ performance of computer simulations of 12 clinical skills.
  Participants: 50 JCAHPO candidates: Certified Ophthalmic Technician (COT®) or Certified Ophthalmic Medical Technologist (COMT®).
  Methods: Tests were conducted in July 2003 and again in August 2010 to evaluate ophthalmic technicians’ knowledge of and ability to perform 12 ophthalmic skills using high-fidelity computer simulations. Performance checklists on technique and task results were developed based on best practices. A scoring rationale was established to evaluate performance using weighted scores and computer-adapted algorithms. Candidate performance was evaluated by a computer-automated scoring system and by expert evaluations of video-computer recordings of the skills tests. Inter-rater reliability of the instruments was investigated by comparing the computer scoring with the ratings of two ophthalmic professional raters on each process step and on the task results. Agreement between the computer and the raters on a particular step had to be statistically significant by Chi-square analysis or reach 90% agreement or higher.
  Results: Of 80 process steps evaluated in seven COT skills, 71% of the process steps were found to be in agreement (statistically significant by Chi-square or the 90% agreement criterion) and 29% were found to be suspect. Similarly, of five COMT skills with 86 process steps evaluated, 75% were in agreement and 25% of the process steps were suspect. Given the high degree of agreement between the raters and the computer scoring, the inter-rater reliability was judged to be high.
  Conclusions: Our results suggest that computer performance scoring is a valid and reliable scoring system. This research found a high level of correspondence between human scoring and computer-automated scoring systems.

  3. Tasks Performed
  • Keratometry
  • Lensometry
  • Tonometry
  • Ocular Motility
  • Visual Fields
  • Retinoscopy
  • Refinement
  • Versions and Ductions
  • Pupil Assessment
  • Manual Lensometry with Prism
  • Ocular Motility with Prism
  • Photography with Fluorescein Angiography

  4. Simulation Design
  • Standardized skill checklists were created based on best practices.
  • Multiple scenarios were created for each skill and were randomly administered (see the sketch after this slide).
  • Interactive arrows allow candidates to manipulate the simulated equipment.
  • Fidelity (realistic and reliable) analysis assessed the degree to which the test simulation required the same behaviors as those required by the task. Necessary fidelity allows a person to:
    - Manipulate the simulation
    - Clearly understand where they are in the performance
    - Demonstrate capability on the evaluative criteria
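
As a rough illustration of the randomly administered scenarios described above, the sketch below draws one scenario at random for each skill. The data structure, scenario identifiers, and function name are hypothetical placeholders and are not taken from the actual JCAHPO simulation.

```python
import random

# Hypothetical sketch only: each skill has several equivalent scenarios,
# one of which is drawn at random each time the skill is administered.
SCENARIOS = {
    "Keratometry": ["scenario A", "scenario B", "scenario C"],
    "Lensometry": ["scenario A", "scenario B"],
    "Tonometry": ["scenario A", "scenario B", "scenario C"],
}

def administer_skill(skill: str) -> str:
    """Pick one scenario at random for the given skill."""
    return random.choice(SCENARIOS[skill])

for skill in SCENARIOS:
    print(skill, "->", administer_skill(skill))
```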

  5. Simulation Test Design Challenges
  Important considerations in the development of the simulation scoring included:
  • Accurate presentation of the skill through simulation
  • Presentation of correct alternative procedures
  • Presentation of incorrect alternative procedures:
    - Not performing the step correctly
    - Performing the steps out of order (a small order-check sketch follows this slide)
    - Arriving at the wrong answer even if the correct process is used
  • Scoring: differentiating exploration from intentional performance
  • Validation of all aspects of the simulation to ensure successful candidate navigation, usability, and fidelity
  • Candidate tutorial training to ensure confident interaction with the simulated equipment and tasks on the performance test
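
The out-of-order case noted above can be made concrete with a small sketch: given the sequence of steps a candidate actually performed, it flags any step whose required predecessor has not yet been completed, while leaving unconstrained steps free to occur in any order. The step names and prerequisite map are hypothetical placeholders, not the poster’s actual checklists.

```python
# Hypothetical sketch of an order check. Some steps may be completed in any
# order (empty prerequisite list); others must follow specific earlier steps.
PREREQUISITES = {
    "focus_eyepiece": [],
    "align_mires": ["focus_eyepiece"],
    "record_reading": ["align_mires"],
}

def order_errors(performed: list[str]) -> list[str]:
    """Return descriptions of steps performed before their prerequisites."""
    errors, seen = [], set()
    for step in performed:
        missing = [p for p in PREREQUISITES.get(step, []) if p not in seen]
        if missing:
            errors.append(f"{step} performed before {', '.join(missing)}")
        seen.add(step)
    return errors

print(order_errors(["align_mires", "focus_eyepiece", "record_reading"]))
# -> ['align_mires performed before focus_eyepiece']
```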

  6. Test Design, Simulation Scoring, and Rating
  • Candidate performance was evaluated on technique and results for each of the 12 ophthalmic tasks.
  • Procedural checklists were developed for all tasks based on best practices. Subject matter experts, including ophthalmologists and certified ophthalmic technician job incumbents, determined the criteria for judging correct completion of each procedural step and whether steps were completed in an acceptable process order. (In some cases, a procedural step could be completed in any order and still yield a satisfactory process.)
  • Each step on the performance checklists was analyzed to determine its importance, and a weighted point value was assigned for scoring. These weighted checklists were then used by the raters and the computer for scoring.
  • The values ranged from 6 points, for a step considered important but having little impact on satisfactory performance, to 21 points, for a step considered critical to satisfactorily completing the skill. A cut score was established for passing the skill performance. (A minimal scoring sketch follows this slide.)
  • Using the computer, candidates were tested on all skills. Candidate performance was scored by the computer, and a video-computer recording was created for evaluation by live rater observation.
  • Computer-automated scoring has a high correlation to live rater observation scoring. (1, 2)
  • The results were compared to determine the agreement between the computer scoring and the scoring of professional raters using the same checklists.
  • The accuracy of the skills test results was also evaluated. Each task’s results were compared to professional standards for performing the skill for each scenario presented within the simulation.
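
The weighted-checklist scoring described above might look roughly like the sketch below. The 6-to-21-point weight range and the use of a cut score come from this slide; the specific step names, weights, and cut score value are invented placeholders.

```python
# Hypothetical sketch of weighted checklist scoring. The 6-21 point range
# mirrors the poster (6 = important but low impact, 21 = critical); the
# steps, weights, and cut score below are placeholders.
CHECKLIST = {
    "position_patient": 6,
    "calibrate_instrument": 12,
    "take_measurement": 21,
    "record_result": 9,
}
CUT_SCORE = 36  # placeholder passing threshold

def score_candidate(steps_done_correctly: set[str]) -> tuple[int, bool]:
    """Sum the weights of correctly completed steps and compare to the cut score."""
    total = sum(w for step, w in CHECKLIST.items() if step in steps_done_correctly)
    return total, total >= CUT_SCORE

score, passed = score_candidate({"position_patient", "calibrate_instrument", "take_measurement"})
print(score, passed)  # 39 True
```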

  7. Validity Analysis
  • Computer simulation validity measures included content, user, and scoring validity.
  • Measurement of the candidate’s ability to accurately complete the task was based on the performance checklists.
  • To ensure that computer scoring and rater scoring were applied to the same candidate performance, each candidate’s performance of a computer simulation skill was recorded on video for viewing by the observers.
  • The scoring of the simulations was validated by comparing the candidates’ scores on each skill with job-incumbent professionals’ assessments of the candidates’ performance.
  • The raters were asked to evaluate whether the candidate performed each step correctly and whether the order of performing the steps was acceptable given the criteria presented in the checklist.
  • The computer scoring based on the criteria specified in the scoring checklists was compared to ophthalmic professionals’ judgments using the same checklists.

  8. Data Analysis
  • Test validity was high, with candidate pass rates over 80% on the various individual tasks.
  • Candidates were surveyed on their perceptions of how accurately the simulation portrayed the clinical skills they perform in daily job performance.
  • The inter-rater reliability of the instruments was analyzed by comparing the computer scoring of the candidates to the ratings of the two ophthalmic professionals using the same checklist, at the 95% confidence level.
  • Scores generated by the computer and scores generated by each rater were entered into a database as exhibited in Table 1 (Slide 9). A representative sample task (keratometry) is displayed.
  • The scores for a test’s overall process steps and the accuracy of results were compared.
  • The decision rule for determining the rater score compared with the computer score was as follows:
    - The scores of both raters had to agree with each other on a process step for a given candidate to be included in the analysis.
    - If the two raters did not agree, a third rater evaluated the process for the final analysis.
  • Table 2 (Slide 10) indicates representative results for inter-rater reliability for three tasks with agreement between the computer scoring and the rater scoring.
  • Chi-square and percentage-of-agreement analyses were used to determine statistical significance. (A sketch of this comparison follows this slide.)
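
A rough sketch of the agreement analysis is given below, assuming pass/fail (1/0) judgments per candidate for a single process step. It applies one reading of the poster’s decision rule (use the third rater when the first two disagree) and its two agreement criteria (a significant Chi-square, assumed here at the conventional 0.05 level, or at least 90% agreement); the data layout and function names are illustrative only.

```python
import numpy as np
from scipy.stats import chi2_contingency

def consensus(rater1, rater2, rater3):
    """One reading of the poster's decision rule: keep a candidate's rating
    when raters 1 and 2 agree; otherwise use the third rater's judgment."""
    return [r1 if r1 == r2 else r3 for r1, r2, r3 in zip(rater1, rater2, rater3)]

def step_agreement(computer, raters):
    """Percent agreement and Chi-square p-value for one process step,
    given pass/fail (1/0) scores per candidate from the computer and raters."""
    computer, raters = np.asarray(computer), np.asarray(raters)
    pct = float(np.mean(computer == raters))
    # 2x2 contingency table: computer pass/fail vs. rater pass/fail.
    table = np.array([[np.sum((computer == c) & (raters == r)) for r in (1, 0)]
                      for c in (1, 0)])
    if (table.sum(axis=0) == 0).any() or (table.sum(axis=1) == 0).any():
        p = float("nan")          # Chi-square undefined for an empty row/column
        significant = False
    else:
        _, p, _, _ = chi2_contingency(table)
        significant = p < 0.05    # assumed significance level
    # Step counted as "in agreement" if either criterion is met.
    return pct, p, significant or pct >= 0.90

# Illustrative use with made-up scores for ten candidates:
comp = [1, 1, 0, 1, 1, 0, 1, 1, 1, 0]
r1   = [1, 1, 0, 1, 0, 0, 1, 1, 1, 0]
r2   = [1, 1, 0, 1, 1, 0, 1, 1, 1, 0]
r3   = [1, 1, 0, 1, 1, 0, 1, 1, 1, 0]
print(step_agreement(comp, consensus(r1, r2, r3)))
```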

  9. Data Comparison of Computer Scoring and Rater Scoring Table 1

  10. Agreement Between the Computer Scoring and the Rater Scoring Table 2

  11. Results
  Validity
  • 90% of the candidates reported that the COT simulation accurately portrayed the clinical skills they perform for daily job performance.
  • 89% of the candidates reported that the COMT simulation accurately portrayed the clinical skills they perform for daily job performance.
  • The same scoring checklist was used by both the computer and the raters to judge candidate performance, assuring consistent and objective measurement rather than subjective judgment regarding candidate skills.
  Reliability
  • Of 80 process steps evaluated in seven COT skills, 71% of the process steps were found to be in agreement (statistically significant by Chi-square or the 90% agreement criterion) and 29% of the process steps were found to be suspect.
  • Of five COMT skills with 86 process steps evaluated, 75% were in agreement and 25% of the process steps were suspect.
  • Given the high degree of agreement between the raters and the computer scoring, the inter-rater reliability was judged to be high.

  12. Discussion and Conclusions
  Discussion
  Computer simulations are now commonly used for education and entertainment. The key to incorporating new technologies to improve skills assessment is to formally incorporate automated scoring of the individual performance steps identified in a checklist developed by subject matter experts, with each step weighted for importance and, where necessary, for performance of the steps in the correct order. High-fidelity computer simulations, with objective analysis of the correct completion of checklist steps and the determination of accurate test results, can provide accurate assessment of ophthalmic technicians’ clinical skills.
  Conclusion
  This comparative analysis demonstrates a high level of correspondence between human scoring and computer-automated scoring systems. Our results suggest that computer performance scoring is a valid and reliable system for assessing the clinical skills of ophthalmic technicians. This research further supports that computer simulation testing improves performance-based assessment by standardizing the examination and reducing observer bias. These findings are useful in evaluating and improving the training and certification of ophthalmic technicians.
  References
  1. Williamson, D. M., Mislevy, R. J., & Bejar, I. I. (2006). Automated Scoring of Complex Tasks in Computer-Based Testing: An Introduction. Mahwah, NJ: Lawrence Erlbaum Associates.
  2. Yang, Y., Buckendahl, C. W., Juszkiewicz, P. J., & Bhola, D. S. (2002). A review of strategies for validating computer automated scoring. Applied Measurement in Education, 15(4), 391.
