
Listening tests: past, present and future John Field, CRELLA, University of Bedfordshire

Language Testing Forum 2013, Nottingham.


Presentation Transcript


  1. Listening tests: past, present and future. John Field, CRELLA, University of Bedfordshire. Language Testing Forum 2013, Nottingham.

  2. A problematic skill: Difficult to test because it is an extremely individual operation in terms of both listener and input. • Internalised. Takes place in the mind of the test taker. • Highly variable signal. Variable at the levels of phoneme – word – speaker.

  3. The value of a cognitive approach • It sheds light on what goes on in the mind of the test taker. • We need to know whether high-stakes tests actually test what they claim to test. Can a listening test, for example, accurately predict the ability of a test taker to study at an English-medium university? • At local level, we need to use tests to diagnose learner problems so that the tests can feed into learning. This is especially true of listening.

  4. Cognitive validation asks… • Does a test elicit from test takers the kind of process that they would use in a real-world context? In the case of listening, are we testing the kinds of process that listeners would actually use? • Or do the recordings and formats that we use lead test takers to behave differently from the way they would in real life?

  5. Phases of listening (Field 2008, 2013): input decoding, lexical search, parsing, meaning construction, discourse construction. [Diagram labels: speech signal, words, meaning.]

  6. Issues of cognitive validity • A. To what extent do the processes elicited by a test resemble real-world processes? • B. To what extent are the processes elicited by a test comprehensive enough to represent the range of processes that make up a skill? • C. Are the processes finely enough calibrated to reflect what a listener is capable of at the target level?

  7. The ghost of listening past: 1913-1974

  8. Word identification. Tick the word you hear the examiner say: [ ] hide [ ] heard [ ] hard [ ] hoard. Test taker hears: "I heard her telling him". Test taker chooses: A heard B hurt C hot D hotel. Test taker hears: "It’s hot all day long". Test taker chooses: A heard B hurt C hot D hotel. [Lower Certificate in English, 1972, quoted in Weir 2013]

  9. A cognitive perspective • Only taps into the lowest two levels of processing (phoneme recognition – lexical FORM). • The role of the phoneme as a perceptual unit has been much questioned. Processing is now viewed as taking place at multiple levels (including top-down word-level matches that overrule phoneme-level information: the "veshtable" effect). • And yet: we still use items based on minimal-pair phoneme perception in lower-level and young learner (YL) tests: The porter said that the train leaves at: A 9.15 B 9.50 C 5.15 D 5.50

  10. Dictation • Fear seized him / in the woods. / At one moment / it seemed to him / that enemy soldiers / were watching him / from behind the trees, / crawling out of the bushes. / He ran blindly, paying no attention / to the path / until he was out of breath. [Lower Certificate in English, June 1945, quoted in Weir 2013] • The passage will be read three times. During the first reading the candidates will write nothing down. It will be read a second time by groups of words, as divided by bars on the printed copy. … After each group, a pause will be made to allow the candidates to write it down. All essential punctuation will be given by the examiner.

  11. From a cognitive perspective • A classic divided attention task (writing vs speaking). Conversion from one modality (speech) to another (writing). • Little resemblance to any real-life listening task. • Natural processing (Jarvella, 1971) entails assembling words in order to parse them, then erasing them once they have been converted into a piece of information. Dictation requires test takers to hang on to words beyond the end of the phrase / clause. • Encourages test takers to focus attention at word level. This reinforces a tendency among listeners at B1 and below to focus on discrete words rather than chunks.

  12. And yet… • Dictation takes the spoken word as its point of departure, unlike today’s formats, which rely heavily on written items. • Present-day tasks such as gap filling also entail divided attention effects. • Dictation taps into lexical segmentation, where the listener has to detect word boundaries in connected speech. • Conclusion: there might be value in including in lower-level tests the transcription of clips of authentic speech. Such tests would show: a) the ability to segment words in connected speech; b) whether test takers can process words in chunks rather than just singly (a mark of progress towards B2 level).

  13. The ghost of listening present: ‘comprehension’ in listening and reading

  14. Listening test components • Recording • Recording as text • Format • Items

  15. Recordings: Does the input impose similar listening demands to those of a real-world speaker?

  16. Natural speech (Recording Level B2) • To what extent do these recordings resemble authentic everyday speech?

  17. Some conclusions on studio recordings • Actors adapt their delivery to fit punctuation. • They pause regularly at the ends of clauses. • There are few hesitation pauses. • There is no overlap between speakers.

  18. Solution: transcribe the speech as speech • M1: the long lunch hour has been replaced by the quick snack + according to a new survey ++ most people take just 30 minutes to eat in the middle of the day + many of us don’t even leave our desks • F: er I’m taking an hour today + but it’s normally sort of half an hour or 20 minutes. • M2: I pop out for about ten minutes + get something to eat + and then go back to my desk • M1: a survey at the start of the year + found that only one per cent of people in Britain + regularly take a full 60 minute break + this is very different from forty years ago + when offices everywhere stopped work at one o’clock + people went out to lunch + and didn’t return until two. • [loosely based on BBC Radio 4 broadcast]
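The pause notation in the transcript above can be processed mechanically, for example to check where an item writer has placed pauses. The following is a minimal Python sketch, assuming that "+" marks a short pause and "++" a longer one (a reading inferred from the example, not stated on the slide). The function name `split_on_pauses` is hypothetical.

```python
import re


def split_on_pauses(turn: str) -> list[tuple[str, int]]:
    """Split one transcribed turn into (chunk, pause_marks) pairs.

    pause_marks is the number of '+' symbols that follow the chunk;
    it is 0 for the final chunk of a turn.
    Assumption: '+' = short pause, '++' = longer pause.
    """
    # A capturing group makes re.split return the separators as well,
    # so the result alternates: text, marks, text, marks, ..., text.
    parts = re.split(r"\s*(\++)\s*", turn.strip())
    chunks = []
    for i in range(0, len(parts), 2):
        text = parts[i].strip()
        marks = len(parts[i + 1]) if i + 1 < len(parts) else 0
        if text:
            chunks.append((text, marks))
    return chunks


turn = "I pop out for about ten minutes + get something to eat + and then go back to my desk"
for chunk, marks in split_on_pauses(turn):
    print(marks, chunk)
```

Each chunk corresponds to a stretch of speech between pauses, which is closer to how a listener receives the input than a punctuated written script.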

  19. Solution: Specify speaker variables for item writers • Accent • Speech rate: speed and consistency • Pausing • Level and placing of focal stress • Number of speakers • Pitch of voice; familiarity of voice • Precision of articulation

  20. Recording-as-text: Is the recording content at an appropriate level for the expertise of the listener? Format: Does the task elicit processes which resemble those that a listener would use in a real-world listening event?

  21. Recording. You hear a man and a woman talking about going to the gym. What does the man say about going to the gym? A. It is too expensive for him. B. It takes too much of his time. C. It is too physically demanding. (FCE Handbook, 2008: 68)

  22. Recording as text Woman: So that didn’t last long, did it? Two weeks going to the gym and you’re already talking about giving it up… Man: Look, if you’re saying I’m not up to it, you’re wrong. I realise it’s very effective in working every muscle, and when I get started, it’s just like other sports. I don’t even mind feeling exhausted at the end. But, listen, you sort out your kit at home, lug it to the gym, queue to pay your entrance fee, then change and queue for the machines … when you could have been for a run straight from your home and then been free to get on with your life. Woman: Well, I think you’re wrong and you should make the effort to carry on.

  23. Recording as text 2 Woman: So that didn’t last long, did it? Two weeks going to the gym and you’re already talking about giving it up… Man: Look, if you’re saying I’m not up to it, you’re wrong. I realise it’s very effective in working every muscle, and when I get started, it’s just like other sports. I don’t even mind feeling exhausted at the end. But, listen, you sort out your kit at home, lug it to the gym, queue to pay your entrance fee, then change and queue for the machines… when you could have been for a run straight from your home and then been free to get on with your life. Woman: Well, I think you’re wrong and you should make the effort to carry on.

  24. Recording as text 2 Woman: So that didn’t last long, did it? Two weeks going to the gym and you’re already talking about giving it up… Man: Look, if you’re saying I’m not up to it, you’re wrong. I realise it’s very effective in working every muscle, and when I get started, it’s just like other sports. I don’t even mind feeling exhausted at the end. But, listen, you sort out your kit at home, lug it to the gym, queue to pay your entrance fee, then change and queue for the machines … when you could have been for a run straight from your home and then been free to get on with your life. Woman: Well, I think you’re wrong and you should make the effort to carry on.

  25. Recording as text • Test setters tend to base their tests on a written script which has not yet been recorded. • The linguistic criteria they employ rely heavily on lexical frequency and syntactic simplicity. • BUT in processing terms difficulty is often caused by: a. the density of ideas and the complexity of the links between them; b. the perceptual saliency of phrases and clauses.

  26. Recording difficulty: cognitive criteria • How frequent is the vocabulary? • How complex is the grammar? • How familiar is the topic? • How long is the recording? • How dense are the idea units in the recording? • How complex are the connections between idea units? • How clearly structured is the overall line of argument? • How concrete or abstract are the points made?

  27. Using conventional tasks • Provide items after a first playing of the recording and before a second. This ensures more natural listening, without preconceptions or advance information other than general context. • Keep items short. Loading difficulty on to items (especially MCQ ones) just biases the test in favour of reading rather than listening. • Favour tasks (e.g. multiple matching) that allow items to ignore the order of the recording and to focus on global meaning rather than local detail.

  28. Items: Do the items target a sufficiently wide range of levels of processing?

  29. Five phases of listening (Field 2008): decoding, word search, parsing, meaning construction, discourse construction. [Diagram labels: speech signal, words, meaning.]

  30. Targets. An item in a test can target any of these levels: • Decoding: She caught the (a) 9.15 (b) 9.50 (c) 5.15 (d) 5.50 train. • Lexical search: She went to London by ……. • Factual information: Where did she go and how? • Meaning construction: Was she keen on going by train? • Discourse construction: What two reasons did she give for going by train?

  31. Targeting levels of listening. Test takers at proficiency level B1 and below focus heavily on word recognition and have problems in processing language in chunks. In these tests, it may be desirable to focus items mainly on the first three areas. Higher-level tests should particularly target meaning representation and discourse representation.

  32. Information handling. But they don’t. Reason 1: Item writers tend to focus on discrete points of information. They do not target the connections between them. In real life, the listener has to build an information structure. Reason 2: It is the item writer who decides what is/is not important in a recording. In the real world, the listener has to identify major and minor points and ignore irrelevant points.

  33. Structure building (Gernsbacher, 1990) • Skilled listeners construct a hierarchical representation of a recording

  34. Structure building • Unskilled listeners focus their attention at local level. • They build a linear structure.

  35. A structure building task. Three types of pollution: 1. …… a. Example: …… b. Solution: …… 2. …… a. Cause: …… b. Result: Climate change 3. …… a. Result: …… b. Solution: ……

  36. The inflexibility of high-stakes tests. Large-scale high-stakes tests have major constraints which prevent them from testing listening in a way that fully represents the skill. • Reliability and ease of marking • Highly controlled test methods, using traditional formats that the candidate knows • Little attention possible to individual variation or alternative answers

  37. Advantages of more local tests and tasks. Smaller-scale tests afford the possibility of testing a wider range of listening processes with: • More open-ended questions • More scope for testing information handling • Marking on an individual basis • Possible acceptance of alternative answers

  38. Computer delivery. Computer-delivered tests offer the possibility of: • Controlling timing • Providing a first play of the recording before items become visible • Monitoring responses to direct the test taker towards a particular level of difficulty • Exploiting oral questions (including oral MCQ with short options)
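To illustrate what "monitoring responses to direct the test taker towards a particular level of difficulty" could look like in software, here is a minimal Python sketch. It assumes a simple one-step staircase rule over the CEFR levels (up one level after a correct response, down one after an incorrect one). That rule and the names `next_level` and `route` are illustrative assumptions, not a procedure described in the presentation or used by any particular exam.

```python
# CEFR proficiency levels, lowest to highest.
LEVELS = ["A1", "A2", "B1", "B2", "C1", "C2"]


def next_level(current: str, correct: bool) -> str:
    """Return the level of the next item under a one-step staircase rule.

    Assumption (hypothetical rule): move up after a correct response,
    down after an incorrect one, never leaving the A1-C2 range.
    """
    i = LEVELS.index(current)
    i = i + 1 if correct else i - 1
    return LEVELS[max(0, min(i, len(LEVELS) - 1))]


def route(start: str, responses: list[bool]) -> list[str]:
    """Trace the sequence of item levels presented for a list of responses."""
    path = [start]
    for correct in responses:
        path.append(next_level(path[-1], correct))
    return path


print(route("B1", [True, True, False]))
```

A real adaptive test would normally rely on calibrated item difficulties rather than a fixed staircase; the sketch only shows the routing idea.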

  39. An important issue for the future… • Testers need to find a means of validating listening tests by means of evidence external to the test. • This would entail establishing the listening proficiency of a listener by subjective assessment of performance. • Methods might include ‘Listen and speak’ activities or the separate assessment of listening performance within a speaking test.

  40. References • Field, J. (2008) Listening in the Language Classroom. Cambridge: Cambridge University Press. • Field, J. (2013) Cognitive validity. In Geranpayeh, A. & Taylor, L. (eds.) Examining Listening. Cambridge: Cambridge University Press. • Weir, C.J. (2013) Measured Constructs. Cambridge: Cambridge University Press.

  41. Thanks for listening jcf1000@dircon.co.uk University of Bedfordshire
