
www.enac.fr

11th ICAEA Forum: Rating for ICAO Language Proficiency Standards. ENAC, 6th-7th September 2011. Inter-rater Reliability. École nationale de l’aviation civile • The French Civil Aviation University. John Kennedy. www.enac.fr


Presentation Transcript


  1. 11th ICAEA Forum: Rating for ICAO Language Proficiency Standards
     ENAC, 6th-7th September 2011
     Inter-rater Reliability
     École nationale de l’aviation civile • The French Civil Aviation University
     John Kennedy
     www.enac.fr

  2. Presentation Outline
     - Remote rating
     - Recruitment and Training
     - Measures of Inter-rater Reliability
     - Inter-rater reliability: Data obtained
     - Towards further harmonization

  3. What is Remote Rating?
     - All tests are recorded
     - Sound files are sent to remote raters
     - Sound files are evaluated the following week

  4. Remote rating for the MTF_ALP
     The MTF (ENAC test approved by DGAC for ATCOs)
     Interlocutor elicits a thirty-minute rateable sample:
     - Section 1: Picture / Photo
     - Section 2: Listening to Pilot messages
     - Section 3: Short text (prompt for discussion)
     Administrator distributes recorded sound files to:
     - two separate raters who rate independently
     - a third rater (in the case where the two raters disagree)
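The distribution rule on this slide can be sketched in a few lines of Python. This is only an illustration, not ENAC's actual software: the function name `final_level` is hypothetical, and the sketch assumes that the third rater's level becomes the final result whenever the first two raters disagree, which is how slide 10 describes the third rater's role.

```python
from typing import Optional


def final_level(rater_a: int, rater_b: int, third_rater: Optional[int] = None) -> int:
    """Return the final level for one recording (hypothetical helper).

    Assumption: if the two independent raters agree, their level stands;
    otherwise the third rater's level is taken as the final result.
    """
    given = [rater_a, rater_b] + ([third_rater] if third_rater is not None else [])
    for level in given:
        # The ICAO rating scale runs from level 1 to level 6.
        if not 1 <= level <= 6:
            raise ValueError(f"invalid ICAO level: {level}")

    if rater_a == rater_b:
        return rater_a
    if third_rater is None:
        raise ValueError("raters disagree; a third rating is required")
    return third_rater


print(final_level(4, 4))     # both raters agree
print(final_level(4, 5, 5))  # disagreement settled by the third rater
```

Raising an error when a third rating is missing keeps a disagreement from silently producing a result.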

  5. Advantages of Remote Rating Procedure
     • Voice only
     • Objectivity / Anonymity (Protection)
     • Reduction of test anxiety
     • Flexible work schedules
     • Continuous monitoring of interlocutor behaviour
     • Continuous monitoring of individual rater performances

  6. Disadvantages of Remote Rating Procedure
     Only one that we have experienced so far:
     - Time required to produce results
     To be taken into consideration:
     - Administration involved
     - Cost of the procedure
     - Loneliness of the rating job

  7. Recruitment and Training of Team
     - Started with team of eight raters (aviation English teachers)
     - All experienced with language testing and ICAO scale
     - Initial two-day training / harmonization session (April 2009)
     - Regular refresher training (at least once a year)
     - Ad hoc feedback provided
     - All raters rate regularly (4-8 tests per month)
     - Initial training of operational raters (October 2010)
     - Candidates rated by one language teacher and one operational rater
     - Commitment of individual raters to the project

  8. Initial measures of Inter-rater Reliability
     [Table not preserved in this transcript. Footnotes: * Before rater training; ** After rater training]

  9. Rater Training Initial rater training took place after completion of the development / trialling phases. Initial rater training was based on ICAO speech samples (2005 DVD) and the corresponding MTF benchmark samples.

  10. Overall evaluation of rater performance
      - Last year, 29 out of 100 recordings were sent to a third rater for evaluation (who then gave the ‘correct’ level).
      - In other words, in 29 out of 100 tests, one of the two initial raters gave an ‘incorrect’ level.
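The figures on this slide can be turned into rates with a little arithmetic. The sketch below assumes, as the slide implies, that each of the 100 tests received exactly two initial ratings and that in each of the 29 referred tests exactly one of those two ratings differed from the third rater's level. The variable names are illustrative only.

```python
tests = 100      # recordings rated last year (from the slide)
referred = 29    # recordings sent to a third rater (from the slide)

# Assumption: every test receives two independent initial ratings.
initial_ratings = 2 * tests

# Share of tests where the two initial raters gave the same level.
exact_agreement = (tests - referred) / tests

# Assumption: in each referred test exactly one initial rating was 'incorrect',
# so 29 of the 200 initial ratings differ from the final level.
ratings_matching_final = (initial_ratings - referred) / initial_ratings

print(f"Exact agreement between the two raters: {exact_agreement:.1%}")
print(f"Initial ratings matching the final level: {ratings_matching_final:.1%}")
```

Under these assumptions the two raters agreed on 71 of 100 tests, while 171 of the 200 individual ratings (85.5%) matched the final level.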

  11. Performance of Individual Raters (2009 / 2010)

  12. Performance of Third Raters (2010 / 2011)

  13. Language Teachers / Operational Raters: Differences between Raters 2010/2011

  14. Understanding and improving the Procedure
      [Diagram: two cases, each with two rater marks (X) placed on a scale divided into a level 4 band and a level 5 band; the positions of the marks are not preserved in this transcript.]
      In which case above do the two raters demonstrate a marked difference?

  15. Example
      100 expert raters independently rate a candidate who is right on the boundary between levels 4 and 5.
      [Diagram: a single mark (X) on the boundary between the level 4 band and the level 5 band.]
      - How many raters will give level 4? Level 5?
      - What is the ‘correct’ level?
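The thought experiment on this slide can be explored with a small simulation. It is a sketch under stated assumptions, not a model taken from the presentation: each rater is assumed to perceive the candidate's position with symmetric Gaussian noise, the scale is simplified to two bands with the level 4 / level 5 boundary at 5.0, and `simulate_ratings`, `noise_sd`, and the seed are all hypothetical choices.

```python
import random


def simulate_ratings(true_position: float, n_raters: int = 100,
                     noise_sd: float = 0.25, seed: int = 0) -> dict:
    """Count how many simulated raters award level 4 and level 5.

    Assumptions (not from the slides): positions below 5.0 are rated level 4,
    positions at or above 5.0 are rated level 5, and each rater's judgement
    is the true position plus Gaussian noise.
    """
    rng = random.Random(seed)
    counts = {4: 0, 5: 0}
    for _ in range(n_raters):
        perceived = true_position + rng.gauss(0.0, noise_sd)
        counts[5 if perceived >= 5.0 else 4] += 1
    return counts


# Candidate exactly on the boundary: the split should be close to even.
print(simulate_ratings(true_position=5.0))

# Candidate in the middle of the level 4 band: most raters should give level 4.
print(simulate_ratings(true_position=4.5))
```

With symmetric noise, a candidate exactly on the boundary splits the raters roughly in half, so frequent 'disagreement' there says little about rater quality.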

  16. Borderline Cases – Further Understanding It is in such cases that ‘disagreement’ is to be expected. Thanks to remote rating, the procedure is subject to continuous monitoring to ensure the validity of the final results. Where ‘disagreement’ exists, the candidate will have the benefit of three entirely independent and subjective evaluations.

  17. Non-Borderline Cases
      - Inter-rater reliability was checked for non-borderline cases in the MTF and found to be higher than 95%.
      - This was done by choosing candidates that appeared to be right in the middle of the respective bands.
      [Diagram: three marks (X), one in each of the level 3, level 4 and level 5 bands.]
      - The corresponding sound files were then sent to all raters (acknowledgement: Sergey Melnichenko).
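The slide does not say how the 95% figure was calculated. One common choice is pairwise exact agreement: the share of rater pairs who gave the same level to the same sound file. The sketch below shows that calculation. The function name and the ratings are invented for illustration and are not ENAC data.

```python
from itertools import combinations


def pairwise_agreement(ratings_by_file: dict) -> float:
    """Share of rater pairs giving identical levels, pooled over all files."""
    agree = 0
    total = 0
    for levels in ratings_by_file.values():
        for a, b in combinations(levels, 2):
            total += 1
            agree += (a == b)
    if total == 0:
        raise ValueError("need at least one file rated by two or more raters")
    return agree / total


# Hypothetical ratings: three mid-band sound files, each rated by eight raters.
example = {
    "file_level_3": [3, 3, 3, 3, 3, 3, 3, 3],
    "file_level_4": [4, 4, 4, 4, 4, 4, 4, 4],
    "file_level_5": [5, 5, 5, 5, 5, 5, 5, 4],
}
print(f"Pairwise exact agreement: {pairwise_agreement(example):.1%}")
```

Pooling pairs across files weights each file by its number of rater pairs. Averaging per-file agreement would be an equally defensible alternative.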

  18. Towards further harmonization
      - The ICAO Speech Samples Rater Training Project: use in the next refresher training session
      - Contact with other test providers: cross-rating
      - Further and more detailed analysis of borderline cases: defining the borders

  19. Contacts
      - john.kennedy@enac.fr
      - scott.stroud@enac.fr
      - michael.odonoghue@enac.fr
      Thank you!
