First and Second Language Models to Correct Preposition Errors

First and Second Language Models to Correct Preposition Errors • Matthieu Hermet, Alain Désilets • National Research Council of Canada

Preposition Errors • A good case study: • High error rate • More than 17% of errors in our dataset • Instance of function-word errors, correctable using corpus-based methods • Instance of interference errors

Preposition Errors • 2 major causes: • Confusion with a preposition of the same semantic class …à la conférence NAACL …at the NAACL conference …in the NAACL conference • Interference with L1 Écouter les intervenants Listen to the speakers Listen the speakers

Approaches • Rule-based: • Mal-rules: cost of manual creation • Syntactic constraint relaxation: parser-dependent • Corpus-based: • Language models: low coverage • Web as a corpus: better coverage • Still not enough: less than 40% of our data set

Approach • Interference errors may be hard to address properly through corpus-based methods • Corpus-based methods represent a model of L2 correctness only → To deal with interference errors, it may be advantageous to use a model which takes L1 into account

Roundtrip MT • Carry out a single round-trip translation at the level of a clause or sentence • Use a phrase-based translation system → Google Translate

Roundtrip MT • L1 (en): “Police arrived at the scene of the crime” • L2 (fr): “Les policiers sont arrivés à la scène de la crime.” • Send to phrase-based translation system • To L1: Policemen arrived at the crime scene • Back to L2: Les policiers sont arrivés sur les lieux du crime
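The round trip on this slide can be sketched in a few lines. This is only an illustration: `round_trip` and `stub_translate` are hypothetical names, and the stub simply replays the two translations shown on the slide instead of calling a real MT service.

```python
from typing import Callable

# Hypothetical translator signature: (text, source_lang, target_lang) -> text.
Translator = Callable[[str, str, str], str]


def round_trip(sentence: str, translate: Translator,
               l2: str = "fr", l1: str = "en") -> str:
    """Translate an L2 sentence into L1, then back into L2."""
    pivot = translate(sentence, l2, l1)
    return translate(pivot, l1, l2)


# Stand-in for a phrase-based MT system: it only knows the slide's example.
_CANNED = {
    ("fr", "en", "Les policiers sont arrivés à la scène de la crime."):
        "Policemen arrived at the crime scene",
    ("en", "fr", "Policemen arrived at the crime scene"):
        "Les policiers sont arrivés sur les lieux du crime",
}


def stub_translate(text: str, source: str, target: str) -> str:
    return _CANNED[(source, target, text)]


corrected = round_trip(
    "Les policiers sont arrivés à la scène de la crime.", stub_translate
)
print(corrected)
```

Passing the translator in as a function keeps the sketch independent of any particular MT service.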

Theory Les policiers sont arrivés à la scène du crime

Drawback • The round-trip translated sentence can show • A wrong translation: N’hésitez pas de me contacter → s’il vous plait contactez moi • A correct translation that uses the wrong preposition: J’ai de la difficulté de formuler des phrases → je trouve difficile de formuler des phrases • A wrong translation that uses the correct preposition: […] demandé à mon amie pour le corriger → […] demandé à mon amie de le fixer

Assessment • Correctness can take at least two forms: • Correct translation • Wrong translation but correct preposition • Two strategies for evaluation: • Clause: the roundtrip translation is a good correction, including the preposition • Prep: only the preposition is correct in the roundtrip translation

Assessment • In the Clause strategy, the RT translation is sent back as the correction • In the Prep strategy, we need a procedure to retrieve the preposition from the incorrect translation → Only the preposition is sent back as the correction

Prep • Greedy mining method to retrieve the preposition from the translation • Être proche à lui → être près de lui • The sequence match <prep à> lui == <prep de> lui validates the preposition de as a correction
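The slide does not spell out the mining procedure, so the following is only one plausible reading: take the word that follows the erroneous preposition in the learner's sentence and look for a preposition directly in front of the same word in the round-trip translation. The function name, the small preposition set, and the whitespace tokenization are assumptions made for illustration; elided forms such as d' are not handled.

```python
from typing import List, Optional

# Small illustrative preposition inventory (not the authors' list).
PREPOSITIONS = {
    "à", "de", "sur", "avec", "par", "pour", "dans", "en",
    "chez", "sous", "vers", "après", "avant",
}


def mine_preposition(original: List[str], error_index: int,
                     translated: List[str]) -> Optional[str]:
    """Return the preposition that precedes, in the translation, the word
    that follows the erroneous preposition in the original (or None)."""
    if error_index + 1 >= len(original):
        return None
    right_context = original[error_index + 1]
    for i in range(len(translated) - 1):
        if translated[i] in PREPOSITIONS and translated[i + 1] == right_context:
            return translated[i]
    return None


original = "être proche à lui".split()
translated = "être près de lui".split()
suggestion = mine_preposition(original, original.index("à"), translated)
print(suggestion)
```

On the slide's example, the shared right context is "lui", so "de" is the preposition proposed in place of "à".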

Unilingual • An instance of a corpus-based approach • Web as a probabilistic language model • Strength of an utterance measured in number of search hits • In practice, the Web’s coverage is incomplete • Impossible to discriminate when zero hits are returned for all alternatives → Syntactic pruning to maximize chances of hits

Pruning 1 • Sentence is parsed and reduced to a phrasal minimum around the preposition • S → VP or NP (or AP): I have lived in a small town all my life → lived in a small town; I’ll get a chance to meet people → a chance to meet • Words are lemmatized • Verbs to infinitive • Nouns to singular

Pruning 2 • Suppress unnecessary words • Adj, when attributive: To live in a small town → To live in a town; This is easy to understand → easy to understand • Adv, in all cases: Call immediately for help → call for help • NP or PP: Une fenêtre qui permet au soleil d’entrer → … qui permet d’entrer; … au soleil d’entrer
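The word-suppression step can be illustrated on tokens that have already been lemmatized and tagged. The tag names (`ADV`, `ADJ_ATTR`, and so on) and the `Token` type are hypothetical; a real system would take them from a parser, and the NP/PP splitting shown on the slide is not covered here.

```python
from typing import List, NamedTuple


class Token(NamedTuple):
    lemma: str
    pos: str  # hypothetical tags: VERB, NOUN, PREP, DET, PART, ADV, ADJ_ATTR, ADJ_PRED


# Adverbs are always dropped; adjectives only when attributive.
DROPPED_TAGS = {"ADV", "ADJ_ATTR"}


def prune(tokens: List[Token]) -> str:
    """Join the lemmas of the tokens that survive pruning."""
    return " ".join(t.lemma for t in tokens if t.pos not in DROPPED_TAGS)


call_for_help = prune([
    Token("call", "VERB"), Token("immediately", "ADV"),
    Token("for", "PREP"), Token("help", "NOUN"),
])
live_in_town = prune([
    Token("to", "PART"), Token("live", "VERB"), Token("in", "PREP"),
    Token("a", "DET"), Token("small", "ADJ_ATTR"), Token("town", "NOUN"),
])
print(call_for_help)
print(live_in_town)
```

Predicative adjectives keep a separate tag in this sketch so that "easy to understand" would survive pruning, as on the slide.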

Alternate prepositions • Once pruned, replace the erroneous preposition by alternates • Most common prepositions • De, sur, avec, par, pour, à • Prepositions of the same semantic class • Localization, temporal, cause, goal, manner, material, possession • 1 input sentence = as many sentences as there are alternate prepositions
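A minimal sketch of the generation step follows. The list of common prepositions is the one on the slide; the membership of the two semantic classes is an illustrative assumption, since the full category table is not reproduced in this transcript. Function names and the `<prep>` slot notation are likewise hypothetical, and elision (de + entrer → d'entrer) is not handled.

```python
from typing import Dict, List

# From the slide.
COMMON: List[str] = ["de", "sur", "avec", "par", "pour", "à"]

# Illustrative class membership only (assumption, not the authors' table).
SEMANTIC_CLASSES: Dict[str, List[str]] = {
    "localization": ["à", "dans", "en", "chez", "sur", "sous", "vers"],
    "temporal": ["à", "après", "avant", "en"],
}


def alternates(erroneous: str) -> List[str]:
    """Prepositions sharing a class with the erroneous one, then the common
    ones, without duplicates and without the erroneous preposition itself."""
    candidates: List[str] = []
    for members in SEMANTIC_CLASSES.values():
        if erroneous in members:
            candidates.extend(members)
    candidates.extend(COMMON)
    seen = {erroneous}
    unique: List[str] = []
    for prep in candidates:
        if prep not in seen:
            seen.add(prep)
            unique.append(prep)
    return unique


def generate_variants(pruned_phrase: str, erroneous: str) -> List[str]:
    """One variant of the pruned phrase per alternate preposition."""
    slot = f"<{erroneous}>"
    return [pruned_phrase.replace(slot, prep) for prep in alternates(erroneous)]


variants = generate_variants("permettre <à> entrer", "à")
print(len(variants), variants[:3])
```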

Preposition Categories

Unilingual • Input sentence: Il y a une grande fenêtre qui permet au soleil <à> entrer (there is a large window which lets the sun come in) • Syntactic pruning and lemmatization: permettre <à> entrer (let come in) + au soleil <à> entrer (the sun come in) • Generation of alternate prepositions • semantically related: dans, en, chez, sur, sous, au, dans, après, avant, en, vers • most common: de, avec, par, pour • Query and sort alternative phrases: permettre d'entrer: 119 000 hits; au soleil d’entrer: 397 hits; permettre avant entrer: 12 hits; au soleil avant entrer: 0 hits; permettre à entrer: 4 hits; … permettre en entrer: 2 hits; ... • → preposition <d'> is returned as correction
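The query-and-sort step can be replayed on the hit counts listed on this slide. How the counts of the two pruned phrases are combined is not stated; summing them per preposition is an assumption of this sketch, as are the function names. Only counts that appear on the slide are included.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

# (preposition, queried phrase, hits) as reported on the slide.
OBSERVED: List[Tuple[str, str, int]] = [
    ("d'", "permettre d'entrer", 119_000),
    ("d'", "au soleil d'entrer", 397),
    ("avant", "permettre avant entrer", 12),
    ("avant", "au soleil avant entrer", 0),
    ("à", "permettre à entrer", 4),
    ("en", "permettre en entrer", 2),
]


def rank_prepositions(rows: List[Tuple[str, str, int]]) -> List[Tuple[str, int]]:
    """Sum hits per preposition (assumed combination rule), highest first."""
    totals: Dict[str, int] = defaultdict(int)
    for prep, _phrase, hits in rows:
        totals[prep] += hits
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


def best_preposition(rows: List[Tuple[str, str, int]]) -> Optional[str]:
    """Top-ranked preposition, or None when every alternative has zero hits."""
    ranking = rank_prepositions(rows)
    if not ranking or ranking[0][1] == 0:
        return None
    return ranking[0][0]


print(best_preposition(OBSERVED))
```

Returning `None` when all totals are zero mirrors the case where the Web gives no evidence for any alternative.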

Results • Dataset: 133 sentences extracted from intermediate to advanced French as a Second Language (FSL) productions • Unilingual returns hits in only ~85% of cases • Impact of L1 on L2 inputs • Incompleteness of the Web as a language model

Hybrid • Agreement between the two strategies is only 65.4% • A third strategy to combine the two models • MT as a model of controlled incorrectness (here, anglicisms) • Web as a model of correctness

Hybrid • Triggered when the unilingual approach does not give any hits → Then send to roundtrip MT (Prep strategy) • Yields results of 82%
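The fallback logic of the hybrid strategy is simple enough to sketch directly. The two correctors are passed in as functions; their names and the convention that the unilingual corrector returns `None` when it gets no hits are assumptions made for this illustration.

```python
from typing import Callable, Optional

# Hypothetical corrector signature: sentence -> suggested preposition or None.
Corrector = Callable[[str], Optional[str]]


def hybrid_correct(sentence: str, unilingual: Corrector,
                   roundtrip_prep: Corrector) -> Optional[str]:
    """Use the Web-based suggestion when there is one; otherwise fall back
    to the preposition mined from the round-trip translation."""
    suggestion = unilingual(sentence)
    if suggestion is not None:
        return suggestion
    return roundtrip_prep(sentence)


# Stub correctors standing in for the two real components.
def no_hits(_sentence: str) -> Optional[str]:
    return None


def always_de(_sentence: str) -> Optional[str]:
    return "de"


print(hybrid_correct("être proche à lui", no_hits, always_de))
```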

Conclusion and Future Work • Unilingual and roundtrip MT perform equivalently • Hybrid approach seems relevant due to the different paradigms of the two approaches • More data • Enhance pruning • Study in the context of error detection • Extend MT approach to other error classes
