
Learning noun phrase coreference resolution


Presentation Transcript


  1. Learning noun phrase coreference resolution. Veronique Hoste, CNTS Language Technology Group, University of Antwerp

  2. Definition (Hirst, 81) Anaphora is the device of making in discourse an abbreviated reference to some entity in the expectation that the perceiver will be able to disabbreviate the reference and thereby determine the identity of the entity.

  3.–5. (The same definition repeated, with the key terms highlighted in turn: the abbreviated reference is the ANAPHOR, the entity it abbreviates is the ANTECEDENT or REFERENT, and recovering the link between them is ANAPHOR RESOLUTION.)

  6. In other words ... • Reference = the act of using a referring expression to refer to some extra-linguistic entity • Anaphor = refers to something in the text • If both the anaphor and its antecedent refer to the same extra-linguistic entity, they are coreferential • Anaphoric and coreferential relations do not always coincide (e.g. bound anaphora: “Most linguists prefer their own parsers.”)

  7. Example Kim Clijsters has won the Proximus Diamond Games in Antwerp. Belgium’s world number two secured her first title on home soil by making short work of defeating Italy’s Silvia Farina Elia. Clijsters broke Farina Elia’s second service game but her opponent broke back immediately and it wasn’t until the eighth game that the Belgian broke again to lead 5-3, from which she served out to take the set. It was Clijsters’s sixth victory over the Italian.

  8.–10. (The same example repeated; successive slides highlight the coreference chains, e.g. Kim Clijsters / Belgium’s world number two / her / the Belgian / she, and Silvia Farina Elia / her opponent / the Italian.)

  11. Why? A weakness in existing information extraction (IE) systems: Who: ….. What: ….. Where: ….. When: ….. How: …..

  12. Coreference resolution, a complex problem. Anaphora resolution draws on morphological and lexical knowledge, syntactic knowledge, semantic knowledge, discourse knowledge and real-world knowledge.

  13. Which anaphora? • Identity relation <-> type-token relation: “I prefer the red car, but my husband wanted the grey one.” <-> part-whole relation: “If the gas tank is empty, you should refuel the car.” • NPs • Personal and possessive pronouns • Definite and indefinite NPs

  14. Two data sets • ENGLISH: MUC-6 and MUC-7 • The only datasets which are publicly available • Extensively used for evaluation • Articles from WSJ and NYT • DUTCH: KNACK-2002 • First Dutch coreferentially annotated corpus • Articles from KNACK 2002 on different topics: national and international politics, science, culture, …

  15. MUC-6 and MUC-7 • Message Understanding Conference • Identity relation between NPs • MUC-6: 2141 coreferential NPs in train set and 2091 in test set • MUC-7: 2569 coreferential NPs in train set and 1728 in test set • E.g. Ng (2002): 35,895 train instances (4.4% pos.) and 22,699 test instances (3.9% pos.) for MUC-7

  16. KNACK-2002 • Annotation: adapted version of the MUC guidelines • Identity, bound, ISA and modality relations between NPs • Ca. 13,266 coreferential NPs • E.g.

  17. “Ongeveer een maand geleden stuurde <COREF ID="1">American Airlines</COREF> <COREF ID="2" MIN="toplui">enkele toplui</COREF> naar Brussel. <COREF ID="3" TYPE="IDENT" REF="1">De grote vliegtuigmaatschappij</COREF> had interesse voor DAT en wou daarover <COREF ID="4">de eerste minister</COREF> spreken. Maar <COREF ID="5" TYPE="IDENT" REF="4">Guy Verhofstadt</COREF> weigerde <COREF ID="6" TYPE="BOUND" REF="2">de delegatie</COREF> te ontvangen.” (“About a month ago, American Airlines sent some top executives to Brussels. The big airline was interested in DAT and wanted to talk to the prime minister about it. But Guy Verhofstadt refused to receive the delegation.”)

  18. Anaphora resolution: the practice

  19. Getting started: free text → tokenization → POS tagging → NP chunking → NER → relation finding

  20. Identification of the anaphors • Identification of pleonastic pronouns, e.g. “Hoe komt het dan dat hij zoveel invloed heeft in het Witte Huis” (“How come, then, that he has so much influence in the White House”) • Identification of pronouns referring to clauses, etc. • Identification of non-coreferential NPs, e.g. “Dat onvoorspelbare staten als schurkenstaten moeten worden behandeld: het zit al jaren in het gedachtegoed van Paul Wolfowitz ingebakken.” (“That unpredictable states should be treated as rogue states: that has been ingrained in Paul Wolfowitz’s thinking for years.”)

  21. Identification of the candidate antecedents • Determine the search scope • Anaphora/cataphora • N preceding (or following) sentences, depending on the type of the anaphor • 2 or 3 sentences for pronouns • a larger scope for other NPs with proper nouns or common nouns
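The type-dependent search scope can be sketched as follows. The window sizes and NP-type labels are illustrative (the slides only fix 2 or 3 sentences for pronouns), and cataphora (following sentences) is ignored in this sketch.

```python
# Illustrative sentence windows per anaphor type; only the pronoun value
# (2-3 sentences) comes from the slides, the others are assumptions.
SCOPE = {"pronoun": 3, "common": 10, "proper": 100}

def candidate_antecedents(ana_pos, nps):
    """Return the NPs preceding position ana_pos that fall inside the
    type-dependent sentence window. nps: list of (sent_idx, type, text)."""
    sent, np_type, _ = nps[ana_pos]
    window = SCOPE.get(np_type, 10)
    return [np for np in nps[:ana_pos] if sent - np[0] <= window]

nps = [(0, "proper", "Kim Clijsters"),
       (1, "common", "Belgium's world number two"),
       (4, "pronoun", "she")]
# Only the NP within 3 sentences of the pronoun "she" is kept.
print(candidate_antecedents(2, nps))
```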

  22. Approaches • The past: mostly knowledge-based techniques (constraints and preferences) e.g. Lappin & Leass (1994), Baldwin (CogNIAC, 1996) • Recently: machine learning (C4.5) Redefine coreference resolution as a CLASSIFICATION task.

  23. A classification-based approach • Given two entities in a text, NP1 and NP2, classify the pair as coreferent or not coreferent. • E.g. [Clijsters] broke [[Farina Elia]’s second service game] but [[her] opponent] broke back immediately. [her opponent] - [Farina Elia’s second service game]: not coreferential [her opponent] - [Farina Elia]: coreferential [her opponent] - [Clijsters]: not coreferential

  24. Selected features (41) • Positional features (e.g. dist_sent, dist_NP) • Local context features • Morphological and lexical features (e.g. i/j/ij-pron, j_demon, j_def, i/j/ij-proper, num_agree) • Syntactic features (e.g. i/j/ij_SBJ, appos) • String-matching features (comp_match, part_match, alias, same_head) • Semantic features (syn, hyper, same_NE, 4 features indicating semantic class)
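Three of the string-matching features (comp_match, part_match, same_head) can be approximated as below; the helper names and the naive head heuristic (last token) are mine, not the feature definitions used in the experiments.

```python
DETS = {"the", "a", "an"}

def content_tokens(np):
    """Lowercase tokens with determiners stripped."""
    return [t for t in np.lower().split() if t not in DETS]

def comp_match(np_i, np_j):
    """Complete match after removing determiners."""
    return content_tokens(np_i) == content_tokens(np_j)

def part_match(np_i, np_j):
    """Partial match: the NPs share at least one content token."""
    return bool(set(content_tokens(np_i)) & set(content_tokens(np_j)))

def same_head(np_i, np_j):
    """Same head noun, naively approximated by the last token."""
    return content_tokens(np_i)[-1] == content_tokens(np_j)[-1]

print(comp_match("the Belgian", "Belgian"))                            # True
print(part_match("Farina Elia's second service game", "Farina Elia"))  # True
print(same_head("her opponent", "the opponent"))                       # True
```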

  25. Positive, negative and test instances • Positive: combination of the anaphor with each preceding element in the coreference chain • Negative: combination of the anaphor with each preceding NP which is not part of the coreference chain • Test: all NPs starting from the second NP in the document are considered possible anaphors and paired with all preceding NPs as possible antecedents
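The instance construction described above can be sketched as follows, assuming the gold coreference chains are given as sets of NP indices (the function name and representation are illustrative):

```python
def make_instances(n_nps, chains):
    """chains: list of sets of NP indices over an ordered NP list.
    Returns (positives, negatives) as lists of (antecedent, anaphor) pairs."""
    chain_of = {}
    for chain in chains:
        for idx in chain:
            chain_of[idx] = chain
    pos, neg = [], []
    for ana in range(1, n_nps):           # every NP from the second onwards
        for ante in range(ana):           # paired with every preceding NP
            if ana in chain_of and ante in chain_of[ana]:
                pos.append((ante, ana))   # preceding element of the chain
            else:
                neg.append((ante, ana))   # preceding NP outside the chain
    return pos, neg

# 5 NPs; NPs 0, 2 and 4 form one coreference chain.
pos, neg = make_instances(5, [{0, 2, 4}])
print(pos)       # [(0, 2), (0, 4), (2, 4)]
print(len(neg))  # 7
```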

  26. Baseline experiments and optimization

  27. Two-step procedure • First step: validation • Application of Timbl and Ripper on the train set; 10-fold CV • Evaluation: accuracy, precision, recall, F-beta • Second step: testing • Training of Timbl and Ripper on the train set; testing on the test set • Selection of one positive instance in case of multiple positives (e.g. through application of ordered Ripper rules, clustering)
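The evaluation measures used in the validation step can be computed with a small helper; a minimal sketch, with beta = 1 giving the usual F1:

```python
def prf(tp, fp, fn, beta=1.0):
    """Precision, recall and F-beta from true/false positive and
    false negative counts (guards against empty denominators)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    b2 = beta * beta
    f = ((1 + b2) * precision * recall / (b2 * precision + recall)
         if precision + recall else 0.0)
    return precision, recall, f

p, r, f = prf(tp=80, fp=20, fn=40)
print(round(p, 2), round(r, 2), round(f, 2))  # 0.8 0.67 0.73
```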

  28. Algorithms compared • Ripper • Cohen, 95 • Rule Induction • Algorithm parameters: different class ordering principles; negative conditions or not; loss ratio values; cover parameter values • TiMBL • Memory-Based Learning • Algorithm parameters: ib1, igtree; overlap, mvdm; 5 feature weighting methods; 4 distance weighting methods; 10 values of k

  29. Baseline validation results

  30. Conclusions from baseline experiments • The concatenation of the NP-type classifiers is beneficial for Ripper, not for Timbl. • Low precision scores for Timbl (large number of false positives). The scores are up to 30% lower than the ones for Ripper. Reason: feature weighting? • Higher recall for Timbl: distinguishes better between true and false negatives.

  31. Optimization Confirmed hypothesis in previous research: The observed difference in accuracy between two algorithms can be easily overwhelmed by accuracy differences resulting from interactions of algorithm parameter settings and feature selection

  32. Optimization • Feature selection • backward elimination: start with all features and remove the features which do not contribute to prediction • bidirectional hillclimbing: start with the features with the highest gain ratio and perform both backward and forward selection • genetic algorithm: start with a random feature set • Parameter optimization • Joint optimization by a genetic algorithm

  33. Feature selection results

  34. Parameter optimization results TiMBL Ripper

  35. Genetic algorithms: initial population → population of candidate solutions → evaluation based on fitness → selection → generate a new population using crossover and mutation → (repeat) → best individual
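That loop can be sketched as a minimal genetic algorithm. The toy fitness function (sum of the genes) merely stands in for cross-validated classifier performance, and the population size, generation count and mutation rate are arbitrary choices for the example:

```python
import random

random.seed(0)
GENES, POP, GENERATIONS = 20, 30, 40

def fitness(ind):
    # Stand-in for evaluating a feature/parameter setting by cross-validation.
    return sum(ind)

# Initial population of random candidate solutions (genes in {0, 1, 2}).
population = [[random.randint(0, 2) for _ in range(GENES)] for _ in range(POP)]

for _ in range(GENERATIONS):
    # Evaluation and selection: keep the fitter half as parents.
    population.sort(key=fitness, reverse=True)
    parents = population[:POP // 2]
    # Crossover and mutation refill the population.
    children = []
    while len(children) < POP - len(parents):
        a, b = random.sample(parents, 2)
        cut = random.randrange(1, GENES)       # single-point crossover
        child = a[:cut] + b[cut:]
        if random.random() < 0.2:              # occasional mutation
            child[random.randrange(GENES)] = random.randint(0, 2)
        children.append(child)
    population = parents + children

best = max(population, key=fitness)
print(fitness(best))  # should approach the maximum of 2 * GENES = 40
```

Keeping the parents in the next population (elitism) guarantees the best fitness never decreases across generations.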

  36. GA individuals. Each individual encodes the algorithm parameters (feature weighting: values 0,1,2,3,4; neighbour weighting: values 0,1,2,3; k) together with one gene per feature (values: 0,1,2), e.g. 0 1 0 1 2 0 2 1 0 2 0 0 2 1 0 2 2 0 3 2 2.0288721872

  37. GA optimization results MUC6

  38. Optimization: summary • Is it worth the effort? • Yes, • optimization can lead to much larger classifier-internal variations than classifier-comparing variations • can lead to significant performance increases • leads to more reliable results • GAs are a feasible approach to search the space

  39. Anaphora resolution and the problem of skewed class distributions

  40. Problem • In an unbalanced data set, the majority class accounts for a large portion of all the instances, whereas the other class, the minority class, covers only a small part of them. • Many real-world data sets are highly unbalanced.

  41. ML and skewed data sets • Imbalanced data sets may result in poor performance of standard classification algorithms (e.g. decision tree learners, kNN, SVMs) • => the algorithms often generate classifiers that maximize the overall classification accuracy while completely ignoring the minority class • => or this may lead to a classifier with many small disjuncts, which tends to overfit the data

  42. Strategies for dealing with skewed data sets • Sampling • undersampling • oversampling • Adjusting misclassification costs (high cost to misclassification of the minority class) • Weighting of examples (focus on the minority class)
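The cost-adjustment strategy can be illustrated with a minimum-expected-cost decision rule: instead of predicting the most probable class, predict "coreferent" whenever the expected cost of a false negative outweighs that of a false positive. The cost values below are made up for the example:

```python
C_FN = 5.0   # cost of missing a coreferent pair (minority class) - assumed
C_FP = 1.0   # cost of a spurious link - assumed

def decide(p_coreferent):
    """Minimise expected cost: link iff p * C_FN > (1 - p) * C_FP,
    i.e. the decision threshold drops from 0.5 to C_FP / (C_FP + C_FN)."""
    return p_coreferent * C_FN > (1 - p_coreferent) * C_FP

print(decide(0.30))  # True: 0.30 exceeds the lowered threshold of ~0.17
print(decide(0.10))  # False
```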

  43. Sampling • Undersampling: examples from the majority class are removed; problem: possibly useful information is thrown away • Oversampling: examples from the minority class are duplicated; problem: no increase in information, risk of overfitting • General observation in the ML literature: undersampling leads to better performance; oversampling does not help
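Random undersampling as described above can be sketched as follows; the ratio parameter and the data layout are illustrative:

```python
import random

def undersample(instances, ratio=1.0, seed=42):
    """Remove majority-class (negative) instances until at most
    ratio * (number of positives) negatives remain.
    instances: list of (features, label) with label 1 = coreferent."""
    rng = random.Random(seed)
    pos = [inst for inst in instances if inst[1] == 1]
    neg = [inst for inst in instances if inst[1] == 0]
    keep = min(len(neg), int(ratio * len(pos)))
    return pos + rng.sample(neg, keep)

# 10 positive and 90 negative instances, mimicking the skew of the task.
data = [((n,), 1) for n in range(10)] + [((n,), 0) for n in range(90)]
balanced = undersample(data, ratio=2.0)
print(sum(1 for _, y in balanced if y == 0))  # 20 negatives remain
```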

  44. Skewedness

  45. Downsampling results

  46. Changing the loss ratio in Ripper • Loss ratio parameter: specifies the relative cost of false positives and false negatives • Focus on recall: loss ratio < 1 • Focus on precision: loss ratio > 1

  47. Skewedness: summary • Comparison of the sensitivity of Timbl and Ripper to the skewed data set (ML past: C4.5) • Both learners: large number of FN • Ripper has a much poorer performance on the minority class (Forgetting exceptions ?) • Ripper is also more sensitive to rebalancing • No particular downsampling level or loss ratio value leads to overall best performance => yet another optimization step ...
