Anaphor Resolution in Norwegian

Gordana Ilic Holen
Institut for lingvistiske fag
Det historisk-filosofiske fakultet
Universitetet i Oslo
g.i.holen@hfstud.uio.no

Some technical data
• Hovedfag thesis (hovedfagsoppgave; incl. obligatory courses, a four-semester project)
• Aim: to build a system for resolving pronominal anaphors in Norwegian
• Supervisor: Janne Bondi Johannessen
• Implementation in Lisp (CLOS)
• To be finished by Christmas 2003
Fefor

Where did it start?
• Martin Hassel, 2000
    • Built an AR system for the Swedish pronouns han/honom/hans and hon/henne/hennes
• Differences
    • Planning to cover more pronouns
    • A different theoretical background

The Top List
• Han/ham/hans and hun/henne/hennes
    • Among the most used; not ambiguous
• Seg and selv
    • Syntactic solutions
• Den
    • Ambiguous with the determinative den (den gule bilen, "the yellow car")

The Top Wish List
• De
    • Ambiguous with the determinative de (de gule bilene, "the yellow cars")
    • Problems delimiting the antecedent
• Det
    • Problems in deciding whether det is pronominal
        • det gule huset ("the yellow house")
        • det regner ("it is raining")

Approach
To be based on
• Mitkov's Anaphora Resolution System, MARS (Mitkov 1996, 1998)
and partially on
• Resolution of Anaphora Procedure, RAP (Lappin & Leass 1994)

Why MARS and RAP
• Both made for English
• MARS: intuitive, fully automated
• RAP: high precision
• Flexible

MARS
• No parsing
• The AR module uses a list of preferences called antecedent indicators
    • Boosting
    • Impeding
• Fully automatic, not very high precision (60-61%)

MARS: The algorithm
• The text is POS tagged.
• NPs are extracted by an NP extractor.
• NPs that precede the anaphor (within a two-sentence scope) are located.
• Gender and number constraints are applied.
• Antecedent indicators are applied to the candidates that agree in gender and number; each indicator assigns a score of 2, 1, 0 or -1.
• The NP with the highest total score is proposed as the antecedent.
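The filtering and selection steps above can be sketched roughly as follows. This is only an illustration, not the thesis implementation (which is planned in Lisp): the `NP` record, its tag values, the reading of "two-sentence scope" as the anaphor's sentence plus the one before it, and the recency tie-break are all assumptions made for the example. POS tagging and NP extraction are taken as already done.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NP:
    """A noun phrase or pronoun, as delivered by a (hypothetical) NP extractor."""
    text: str
    sentence: int  # index of the sentence containing the phrase
    position: int  # token offset in the whole text
    gender: str    # hypothetical tag set, e.g. "masc", "fem", "neut"
    number: str    # "sg" or "pl"


def candidates(nps, anaphor, scope=2):
    """NPs that precede the anaphor within `scope` sentences and agree with it.

    Assumption: scope=2 means the anaphor's own sentence and the previous one.
    """
    return [
        np for np in nps
        if np.position < anaphor.position
        and anaphor.sentence - np.sentence < scope
        and np.gender == anaphor.gender
        and np.number == anaphor.number
    ]


def resolve(nps, anaphor, score, scope=2):
    """Propose the highest-scoring candidate, or None if no candidate survives.

    `score` is any function mapping an NP to its aggregate indicator score.
    Ties are broken in favour of the most recent NP (an assumption).
    """
    pool = candidates(nps, anaphor, scope)
    if not pool:
        return None
    return max(pool, key=lambda np: (score(np), np.position))
```

For "Per møtte Kari. Han hilste.", the candidates for han would be filtered down to Per alone, since Kari fails the gender constraint.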

MARS: Antecedent indicators (boosting)
• First noun phrases +1
• Indicating verbs +1
• Lexical reiteration +2 / +1
• Section heading preference +1
• Collocation match +2
• Immediate reference +2
• Sequential instructions +2
• Term preference +2

MARS: Antecedent indicators (impeding)
• Indefiniteness -1
• Prepositional NPs -1
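One way to picture how the boosting and impeding indicators add up to a candidate's aggregate score is the small sketch below. The dictionary keys are illustrative names, and detecting whether an indicator fires for a given NP is left out entirely. The split of lexical reiteration into +2 for two or more repetitions and +1 for a single repetition is an assumption about how the "+2 / +1" entry is meant.

```python
# Scores as listed on the slides; the key names are made up for this sketch.
BOOSTING = {
    "first_np": 1,
    "indicating_verb": 1,
    "section_heading": 1,
    "collocation_match": 2,
    "immediate_reference": 2,
    "sequential_instructions": 2,
    "term_preference": 2,
}
IMPEDING = {
    "indefiniteness": -1,
    "prepositional_np": -1,
}


def lexical_reiteration(repetitions):
    """Assumed reading of '+2 / +1': 2 for two or more repetitions, 1 for one."""
    if repetitions >= 2:
        return 2
    return 1 if repetitions == 1 else 0


def indicator_score(fired, repetitions=0):
    """Aggregate score of one candidate, given the indicators that fired for it."""
    weights = {**BOOSTING, **IMPEDING}
    return sum(weights[name] for name in fired) + lexical_reiteration(repetitions)
```

A sentence-initial, indefinite NP that also matches a collocation would receive 1 + 2 - 1 = 2 under this scheme.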

RAP
• A high-precision system (86% correctly resolved anaphors)
• Originally based on parsed text, but a version without parsing exists (Kennedy and Boguraev, 1996)
• The AR module: salience weighting

RAP: Salience weighting
• Salience factors:
    • Sentence recency 100
    • Subject emphasis 80
    • Head noun emphasis 80
    • Existential emphasis 70
    • Accusative emphasis 50
    • Non-adverbial emphasis 50
    • Indirect object and oblique complement emphasis 40
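As a rough illustration of salience weighting, a candidate's salience can be taken as the sum of the weights of the factors that apply to it. The sketch below uses the weights from the slide; the key names are hypothetical, the decision about which factors apply is assumed to come from elsewhere (a parser or a shallower analysis), and any further refinements of RAP are left out.

```python
# Weights as listed on the slide; the key names are made up for this sketch.
SALIENCE = {
    "sentence_recency": 100,
    "subject": 80,
    "head_noun": 80,
    "existential": 70,
    "accusative": 50,
    "non_adverbial": 50,
    "indirect_object_or_oblique": 40,
}


def salience(factors):
    """Sum of the weights of the salience factors that apply to a candidate."""
    return sum(SALIENCE[name] for name in factors)


def most_salient(candidate_factors):
    """Pick the candidate with the highest salience.

    `candidate_factors` maps a candidate (e.g. its text) to the set of
    factors that apply to it.
    """
    return max(candidate_factors, key=lambda c: salience(candidate_factors[c]))
```

Under these weights, a recent subject NP (100 + 80 + 80 + 50 = 310) outranks a recent direct object (100 + 50 + 80 + 50 = 280).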

Modifications
Both systems exist in versions with and without parsing, so the question of whether to parse is left open.
Start by using the Oslo Corpus for training and tuning:
• Experiment with the antecedent indicators and adjust them for Norwegian
• Try to combine them with RAP's salience factors

Open for suggestions
g.i.holen@hfstud.uio.no
