Computerised Marking of Free-Text. Intelligent Assessment Technologies Ltd., April 2003

Presentation Transcript


  1. Computerised Marking of Free-Text. Intelligent Assessment Technologies Ltd., April 2003

  2. Overview • Introduction and background. • The challenge of marking free-text responses. • Computerised marking of free-text responses. • IAT Projects. • ExamOnline. • Key Stage 2 Science National Tests. • English Comprehension. • Testing Medical Knowledge.

  3. About IAT • The focus of the company is the computerised marking of free-text. • Free-text questions are prevalent in UK education. • Approximately 70% of UK exam questions require short free-text answers. • We call our marking engine AutoMark.

  4. IAT Projects Recent projects for : • SQA (Scottish Qualifications Authority). • QCA (Qualifications and Curriculum Authority). • University of Dundee, Faculty of Medicine. • nferNelson (part of Granada Learning). • Scottish Education Authorities. • www.ExamOnline.co.uk

  5. A Generic Technology

  6. Marking Complexity Electronic marking versus human marking.

  7. The Challenge of Marking Free-Text Responses

  8. Open-Ended Questions What sort of questions are we talking about ? Science • What will happen to the bulb when the switch is closed ? • Why are some flowers highly scented with brightly coloured petals ? • Explain what leaves do to help trees grow well. English Comprehension • Those were the days when children faded quickly. What is meant by faded in this sentence ? Medical Exams • What movement is affected by rupture of the supraspinatus tendon ?

  9. A Paper-Based Mark Scheme What does a paper-based mark scheme look like ? • What happens to the mass of a candle when it burns ? • Award ONE mark for a description of the mass decreasing: the mass decreased / got smaller / went down / got lower / got less / went small. • Do not give credit for: it got shorter.

  10. Free-Text Responses What sort of responses do we actually get ? What happens to the mass of a candle when it burns ?
      Predictable answers:
      1 it decreases
      1 it got less.
      1 It goes down.
      1 it got lighter.
      1 It got lower.
      1 the mass decreased
      Unusual answers:
      1 The mass of the candle reduced.
      1 the mass changed by going down.
      1 it went down 2grams every 30minites.
      1 The mass reduced until there was only liquid.
      1 The mass of the candle shrank each time it was measured, because the heat was melting the wax.
      And these…
      1 the mas of canld wint dowwn wen the canld is burning
      0 The mass gets smaller and bigger.
      1 It fell down-slanted.
      0 It didn’t go down.
      1 it was geting ligther.

  11. Marking Free-Text Responses By Computer

  12. Computerised Marking • How do we mark free-text responses by computer ? • IAT's marking engine does not operate on raw text, but on the output of a sentence analyser.

  13. A Computerised Mark Scheme How do we represent the mark scheme ? • Each mark scheme answer is represented as a template. • Each template specifies one particular form of acceptable or unacceptable answer.

  14. A Computerised Mark Scheme So what is a computerised mark scheme ? • A computerised mark scheme consists of a number of mark scheme templates. • Each template can match many different correct responses. • Many different templates are developed for each question.

  15. Building a Mark Scheme How do we develop a computerised mark scheme ?

  16. Marking The Responses The output of the sentence analyser is compared with the computerised mark scheme and a mark is awarded :
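The template idea in slides 13 to 16 can be illustrated with a minimal sketch. This is not AutoMark's implementation, which the slides do not describe in detail. The names `analyse`, `Template`, `CANDLE_SCHEME` and `mark` are hypothetical, the "sentence analyser" here is just a word splitter, and the word lists are assumptions drawn from the candle mark scheme on slide 9 and the credited responses on slide 10.

```python
import re
from dataclasses import dataclass


def analyse(response: str) -> set[str]:
    """Toy stand-in for a sentence analyser: lower-case the text and return its words."""
    return set(re.findall(r"[a-z]+", response.lower()))


@dataclass
class Template:
    slots: list[set[str]]      # every slot must be matched by at least one word
    acceptable: bool = True    # False models a "do not give credit" entry

    def matches(self, words: set[str]) -> bool:
        return all(slot & words for slot in self.slots)


# Hypothetical computerised mark scheme for the candle question.
CANDLE_SCHEME = [
    Template([{"decreased", "decreases", "reduced", "shrank"}]),
    Template([{"got", "gets", "went", "goes"},
              {"smaller", "down", "lower", "less", "small", "lighter"}]),
    Template([{"shorter"}], acceptable=False),
]


def mark(response: str, scheme: list[Template]) -> int:
    """Award 1 if an acceptable template matches and no unacceptable one does."""
    words = analyse(response)
    if any(t.matches(words) for t in scheme if not t.acceptable):
        return 0
    return int(any(t.matches(words) for t in scheme if t.acceptable))


for answer in ["it decreases", "It goes down.", "it got shorter"]:
    print(mark(answer, CANDLE_SCHEME), answer)
```

A bag-of-words matcher like this shows why one template can cover many phrasings, and also why a real sentence analyser is needed: it gives no credit to the misspelt "wint dowwn" response, and it wrongly credits "The mass gets smaller and bigger."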

  17. IAT Projects. ExamOnline.

  18. AutoMark and ExamOnline • AutoMark was originally developed to mark actual National Test questions in Science. • AutoMark is in daily use in schools in England and Wales : www.ExamOnline.co.uk

  19. ExamOnline • National Test questions look the same as on the real exam. • Answers in full text sentences. • Instant online marking and reporting.

  20. IAT Projects. Key Stage 2 Science National Tests. Collaborative study with Centre for Research into Primary Science and Technology (CRIPSAT) - Professor Terry Russell.

  21. Background CRIPSAT : • Test development agency for national curriculum assessment of science for pupils at age 11 during 1995-99. • Conducted an annual qualitative re-marking of responses. Collaboration : • A brief quantitative and qualitative study of the performance of AutoMark in marking a range of free-text responses.

  22. Overview. The Study : • KS2 Science at age 11. • Domain where errors in spelling, syntax, and semantics are at their most frequent. • “to etract the flys and other creatures” • “Because they want to atracted bugs” • “The more the force the more ferther the car will travel” • “it affects the distance bucuse the biger force and futher it goes back the futher it goes”

  23. The Experiments Performance of AutoMark measured using four KS2 Science National Test questions from the 1999 paper. • 120 responses were randomly selected for each item. • Hand-written pupil responses were faithfully transcribed. • Marks awarded by software compared to marks awarded by human markers. • Two separate experiments carried out : • Blind, and Moderation.

  24. Experiments – The Questions. Four items of varying degrees of open-endedness were selected, requiring : • Single word generation. • Single value generation. • Generation of a short explanatory sentence. • “Why are some wild flowers highly scented with brightly coloured petals?” • Description of a pattern in data (2 marks). • “Describe how the size of the starting force affects the distance moved by the car. ”

  25. The Blind Experiment. • Computerised mark schemes developed using QCA mark scheme as a guide. • Student responses submitted to AutoMark by CRIPSAT using a ‘blind’ clerical procedure. • Performance measured against human markers.

  26. The Moderation Experiment. • Computerised mark schemes are moderated using test data. • Account for unexpected but allowable responses / synonyms / phraseology. • Student responses re-submitted to AutoMark. • Performance measured against human markers. • Objective : • Identify system errors (not addressable by moderation).
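Both experiments measure AutoMark against human markers. The slides do not say which statistic was used. One simple possibility is exact agreement, the fraction of responses where the two marks coincide. The sketch below assumes that metric; the function name `exact_agreement` and the sample marks are hypothetical and are not data from the study.

```python
def exact_agreement(auto_marks: list[int], human_marks: list[int]) -> float:
    """Fraction of responses on which the software and the human marker agree."""
    if len(auto_marks) != len(human_marks):
        raise ValueError("both mark lists must cover the same responses")
    if not auto_marks:
        raise ValueError("at least one response is required")
    agreed = sum(a == h for a, h in zip(auto_marks, human_marks))
    return agreed / len(auto_marks)


# Made-up marks for four responses to one item (illustration only).
auto = [1, 0, 1, 1]
human = [1, 0, 0, 1]
print(f"{exact_agreement(auto, human):.0%}")
```

Running the same comparison before and after moderation would separate disagreements fixed by adjusting the mark scheme from those that remain as system errors.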

  27. Results – Overview.

  28. System Errors. Full paper available (email for a copy).

  29. IAT Projects. English Comprehension – A Pilot.

  30. Background to the Project Computerised marking of free-text responses to English Comprehension items at ages 9 and 13. The study addressed : • Marking performance with different question types. • Investigation of optimum sample size for configuration of the marking engine.

  31. The Teacher's Guide Mark Schemes • Single-word responses : the majority of acceptable answers are specified in the Teacher's Guide. • Short-phrase responses : the Teacher's Guide clearly specifies the meaning of correct answers, although students may phrase responses in many ways. • Explanatory answers : human markers use professional judgement when applying the Teacher's Guide marking guidelines to individual student responses. • Student responses were faithfully transcribed from original hand-written student scripts. There were 216 to 288 scripts available for each of 22 items.

  32. The Responses Short-phrase responses. Qu. You can tell the cyclist…was not a careful rider because.
      0 he could not see in the dark because he had no light
      0 he crashed into Laurie
      0 he didnt ware eany cloths that sow up in the dark
      1 he also crashed into Laurie’s mother
      1 he also hit his mother
      1 he crashed into his mum
      1 he had already knock down laurie’s mother
      1 he had hit his mother too
      1 he had hit his mum
      1 he had nocked down my mum
      1 he hit bothe Laurie and his mother
      1 he knocked down two people
      1 the cyclist was cycling at night and knocked down his mother too

  33. The Responses Explanatory sentence responses. Qu. What does plonked tell you about the way he put the cases down ?
      0 banged them down
      0 droped it gently
      0 he didn’t put them down carefully
      0 he frou our cases down beside us
      0 He just let them go and they fell on the suitcase carrier
      0 He poot the cases like Ban
      0 He thlung them down hard
      0 It tells me that he just troped them
      1 He droped them but ment it and plonch
      1 He droped them hevily
      1 He just trough them down
      1 He kind of threw them
      1 He through thir suitecases on the ground
      1 It means you just dump them down hard on the ground
      1 Plonked is throw
      1 Plonked tells me the way they were put down is droped
      1 That he thore then down
      1 that men’s that dropt
      1 plonk mean’s if you through something down

  34. The Responses Explanatory sentence responses. Qu. How did Laurie behave when he had pneumonia ?
      0 he acted as though he was cursed. Like a primadonna
      0 He behaved like it was a big show and every body should watch him
      1 He ‘milked it’, he got all he could out of it, he played it up
      1 He a over exagurated his illness
      1 He acted as though it was worse than it actually was so that he could get attention
      1 He acted like he was really really ill and that he was going to die when he was O.K. and was going to live
      1 He always made a mountain out of a mole hill when he had pneumonia
      1 He behaved like it was much biggar than it actually was
      1 he behaved differently. always making a big fuss over nothing
      1 He behaved very badly he over reacted
      1 He made a big deal about it
      1 He made a song and dance of it by trying to get as much sympathy as possible

  35. Marking Accuracy Single-word responses.

  36. Marking Accuracy Explanatory sentences.

  37. IAT Projects. Dundee Medical School.

  38. Background. • Medical School at the University of Dundee. • Learning-outcome-based formative test of medical knowledge (the Progress Test). • 3-hour test, rapid turn-around required to feed back strengths and weaknesses to students. • Items are short-answer free-text.

  39. The Paper-Based System. • 150 students per academic year, 750 students in total. • The test comprises 270 short-answer free-text items. • Human marking takes approximately 300 man-hours of academics' and consultants' time.

  40. A Computerised Pilot. • The pilot ran in November 2002. • 25 items, approximately 30 students. • Assessing test interface and marking accuracy. • Marking accuracy analysed by Dundee : > 98%.

  41. The Computerised System. • Web-based delivery of test. • 8 items per page. • Randomised delivery of items. • Roll-out April 2003. • Year 2 and Year 3 tested April 14th / 15th.

  42. Computerised Marking. • Marking carried out in batch after test completion. • Marking takes approximately 30 seconds per student per test on a 2.4GHz PC. • Year 2 and Year 3, 300 students marked in approximately 2½ hours.
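The two timing figures on this slide are consistent with each other, as a quick check shows. The variable names are illustrative; the inputs are the slide's own approximate values.

```python
SECONDS_PER_STUDENT = 30   # approximate marking time per student per test
STUDENTS = 300             # Year 2 and Year 3 combined

total_seconds = SECONDS_PER_STUDENT * STUDENTS
total_hours = total_seconds / 3600
print(total_hours)  # 2.5
```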

  43. Moderation. • The system enables human moderation of computerised marking. • Particularly for new questions – no sample scripts available. • Interface supports rapid moderation.

  44. Remaining on Dundee Project. • Testing of another 450 students. • Adding functionality to moderation interface. • Assessment of marking accuracy on a per-item basis. • Preparation for next year's test – new items.

  45. Conclusions • IAT have demonstrated the feasibility of computerised marking of free-text. • The technology can find application in both low and high stakes assessment. • The technology is now being used in Primary and Higher Education. • www.IntelligentAssessment.com
