
An English Writing Assistant for Non Native Speakers

CorrecTools project (CAPRA: Compagnon d’Apprentissage et de Perfectionnement à la Rédaction en Anglais, i.e. a companion for learning and improving writing in English). M. Garnier, A. Rykner (Université Toulouse 2); P. Saint-Dizier (CNRS), France.

Presentation Transcript


  1. An English Writing Assistant for Non Native Speakers. CorrecTools project (CAPRA: Compagnon d’Apprentissage et de Perfectionnement à la Rédaction en Anglais). M. Garnier, A. Rykner (Université Toulouse 2); P. Saint-Dizier (CNRS), France

  2. Introduction
     • English: main language for international communication
       → Necessity for Non Native Speakers of English (NNS) to produce satisfactory English texts (personal/professional spheres)
     • Learning and practice: necessary requirements for long-term acquisition of writing skills
     • Each language and linguistic community encounters specific problems in writing English (‘language transfer’)
       Need for an automatic English writing assistant
     • Presentation of our project:
       • Aims and challenges
       • Corpus constitution and error analysis
       • Annotation of errors
       • Some results of the analysis of a Thai to English corpus

  3. 1. Aims and challenges
     • Presence of grammatical, lexical and stylistic errors in the productions of NNS:
       • They make comprehension difficult and damage credibility
       • Many errors are not handled by text editors such as MS Word
     • Didactic perspective: explanation of errors and grammar rules given alongside the corrections
     • Focus on pairs of languages (French to English, Thai to English):
       • Prototypicality of errors: easier correction process
       • Knowledge of the L1: more efficient analysis and correction of errors

  4. 2. Corpus constitution and error analysis
     • Exploratory corpus: emails, reports, scientific publications, web pages, blogs
       Parameters:
       • Variety of authors (professionals, researchers, students)
       • Different domains of production (business, research, personal sphere)
       • Different levels of control, i.e. amount of care devoted to the production of a document
     • First stage: manual detection, annotation, and correction of errors
     • Classification of errors: creation of a system of categories
     • Characteristics of the system:
       • Categories created according to linguistic criteria, i.e. NP, PP, VP, Clause and Sentence
       • Inclusion of two levels of subtypes of errors inside main categories
       • Inclusion of indications concerning broad linguistic parameters: Lexicon, Morpho-Syntax, Syntax, Semantics, Style
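
As a rough illustration of the category system described on this slide, the following Python sketch models one error type as a main category, two levels of subtype, and a broad linguistic parameter. The category and parameter names come from the slide; the class names, the subtype labels ("determiner", "omission") and the choice of Syntax as the parameter for that example are assumptions made for illustration only, not the project's actual inventory.

```python
from dataclasses import dataclass
from enum import Enum


class Category(Enum):
    """Main categories listed on the slide."""
    NP = "NP"
    PP = "PP"
    VP = "VP"
    CLAUSE = "Clause"
    SENTENCE = "Sentence"


class Parameter(Enum):
    """Broad linguistic parameters listed on the slide."""
    LEXICON = "Lexicon"
    MORPHO_SYNTAX = "Morpho-Syntax"
    SYNTAX = "Syntax"
    SEMANTICS = "Semantics"
    STYLE = "Style"


@dataclass(frozen=True)
class ErrorType:
    category: Category    # main category
    subtype: str          # first level of subtype (hypothetical label)
    sub_subtype: str      # second level of subtype (hypothetical label)
    parameter: Parameter  # broad linguistic parameter

    def label(self) -> str:
        """Return a compact, human-readable label for the error type."""
        return (f"{self.category.value}/{self.subtype}/{self.sub_subtype} "
                f"[{self.parameter.value}]")


# Hypothetical instance: the subtype labels and the parameter are guesses,
# not taken from the project's classification.
missing_det = ErrorType(Category.NP, "determiner", "omission", Parameter.SYNTAX)
print(missing_det.label())
```

Using enumerations for the closed sets (categories, parameters) and free strings for the subtypes reflects the fact that the slide fixes the former but does not list the latter.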

  5. 2. Corpus constitution and error analysis (2) • Example:

  6. 3. The annotation of errors • Errors are annotated using a standard XML formalism enriched with attributes • Schema designed so as to reflect cognitive strategies used by human correctors when detecting and correcting errors • Delimitation and characterization of errors:

  7. 3. The annotation of errors (2) • Delimitation and characterization of corrections:

  8. 3. The annotation of errors (3) • Example of an annotated error with multiple corrections: *The second stage has therefore two goals: [...] and the construction of the meaning utterance.
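
The annotated version of this example is not preserved in the transcript. Purely as a sketch of what an XML annotation with attributes and several candidate corrections could look like, the Python snippet below marks one span of the sentence and attaches two corrections. Every element name, attribute name and attribute value here (`sentence`, `error`, `span`, `correction`, `category`, `parameter`, `rank`), the choice of span, and the two corrections are hypothetical; they are not the CorrecTools schema.

```python
import xml.etree.ElementTree as ET


def annotate(before, error_text, after, corrections, category, parameter):
    """Build a <sentence> element with one annotated error (hypothetical schema)."""
    sentence = ET.Element("sentence")
    sentence.text = before  # text preceding the erroneous span
    error = ET.SubElement(
        sentence, "error", {"category": category, "parameter": parameter}
    )
    span = ET.SubElement(error, "span")  # delimitation of the error
    span.text = error_text
    for rank, text in enumerate(corrections, start=1):
        correction = ET.SubElement(error, "correction", {"rank": str(rank)})
        correction.text = text
    error.tail = after  # text following the erroneous span
    return sentence


# The span and both corrections are illustrative assumptions.
element = annotate(
    before="The second stage has therefore two goals: [...] and the construction of ",
    error_text="the meaning utterance",
    after=".",
    corrections=["the meaning of the utterance", "the utterance's meaning"],
    category="NP",
    parameter="Syntax",
)
xml_text = ET.tostring(element, encoding="unicode")
print(xml_text)
```

Keeping the original span and the corrections as sibling elements leaves the source text recoverable while allowing any number of ranked alternatives.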

  9. 4. Some results on a Thai-English corpus
     • Preliminary study conducted on a limited corpus of English texts written by Thai native speakers
     • Description of corpus:
       • 10 scientific abstracts
       • 1755 words
       • Various research domains and writers
     • Steps completed so far:
       • Detection of errors
       • Classification of errors
       • Highlighting several aspects of error distribution
     • Future steps:
       • Annotation of errors
       • Collaboration with Thai native speakers in order to study the extent of transfer effects
       • Towards a correction system?

  10. 4. Some results on a Thai-English corpus (2) • Distribution of errors according to broad linguistic parameters (number of subtypes of errors vs. number of errors in total for each axis) [Chart not preserved; surviving axis labels: Lexicon, MorphoSyntax]

  11. 4. Some results on a Thai-English corpus (3) • Distribution of errors according to main categories of our system (number of subtypes of errors vs. number of errors in total for each category)

  12. 4. Some results on a Thai-English corpus (4) • Distribution of errors according to subtypes of errors • Main types of errors: omission of determiner, omission of plural, erroneous subject/verb agreement, abusive NØN construction

  13. 4. Some results on a Thai-English corpus (5)
     • Omission of determiner:
       • *World of information technologies can be classifiied into 2 main groups. → The world of information technologies can be classified into 2 main groups.
     • Omission of plural:
       • *Reading from book and website is a way to diagnose diseases. → Reading from books and websites is a way to diagnose diseases.
     • Erroneous subject/verb agreement:
       • *Precision depend on noise in each website. → Precision depends on noise in each website.
     • Abusive NØN construction:
       • *It will decrease the plant quality. → It will decrease the quality of the plant / the plant’s quality.
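
To make the relation between an erroneous sentence and its correction concrete, the sketch below aligns the two at word level with Python's standard `difflib` module and lists the spans that differ. This is only an illustration of how error spans could be delimited automatically; it is an assumption, not the detection method used in the project, where detection was manual. The function name `token_edits` is hypothetical. The sentence pairs are taken from the slide, with the leading asterisk dropped; the NØN example is left out because the slide gives two alternative corrections.

```python
import difflib


def token_edits(erroneous, corrected):
    """Return (operation, erroneous_span, corrected_span) for each differing span."""
    a, b = erroneous.split(), corrected.split()
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    return [
        (tag, " ".join(a[i1:i2]), " ".join(b[j1:j2]))
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


# Example pairs from the slide (erroneous sentence, corrected sentence).
PAIRS = [
    ("World of information technologies can be classifiied into 2 main groups.",
     "The world of information technologies can be classified into 2 main groups."),
    ("Reading from book and website is a way to diagnose diseases.",
     "Reading from books and websites is a way to diagnose diseases."),
    ("Precision depend on noise in each website.",
     "Precision depends on noise in each website."),
]

for erroneous, corrected in PAIRS:
    print(token_edits(erroneous, corrected))
```

For the agreement example this yields a single replacement of "depend" by "depends". Such a surface alignment only locates the changed words; assigning a category such as omission of determiner would still require linguistic analysis.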

  14. Perspectives
     • French to English:
       • Extend the initial corpus
       • Investigate the relevance of learner corpora
       • Stabilize the classification system and the annotation schema
       • Focus on certain errors and start drafting rules for correction
       • Evaluate the needs of a population of users and the demand for such a tool
     • Thai to English:
       • Extend the initial corpus
       • Work with Thai researchers to evaluate the needs of potential users and assess the quality of the analyses proposed
       • Draft a roadmap for the continuation of the project in Thailand

  15. Kop khun khà! CorrecTools website: http://www.irit.fr/recherches/ILPL/webct/ct.html
