
Class-based nominal semantic role labeling: a preliminary investigation



  1. Class-based nominal semantic role labeling: a preliminary investigation Matt Gerber Michigan State University, Department of Computer Science

  2. Introduction: semantic role labeling • Example: [Agent John] presented [Theme his findings] [Experiencer to the committee]. • Semantic role: the relation between a constituent and a predication • Task: automatically identify the semantic roles occurring in natural language • Open problem: which roles are the “right” ones?

  3. Introduction: PropBank (Kingsbury and Palmer, 2003) • Example: [Arg0 John] presented [Arg1 his findings] [Arg2 to the committee]. • Annotated corpus of semantic roles • Base corpus: TreeBank 2 (Marcus et al., 1993) • Evaluation: CoNLL Shared Task (Carreras and Màrquez, 2005) • Implications • QA: Kaisser and Webber (2007), Shen and Lapata (2007) • Coreference: Ponzetto and Strube (2006) • Information extraction: Surdeanu et al. (2003)

  4. Introduction: NomBank (Meyers, 2007) • Verbs are not the only lexical category with shallow semantic structure • Verbal: [Arg0 Judge Curry] [Predicate ordered] [Arg1 Edison] [Arg2 to make average refunds of about $45]. • Nominal: Judge Curry ordered [Arg0 Edison] to make average [Predicate refunds] [Arg1 of about $45]. • A more complete semantic interpretation of natural language

  5. Introduction: NomBank (Meyers, 2007) • Corpus information • Base corpus: TreeBank 2 • Distinct nominalizations: 4704 • Total attestations: ~115K • NomLex (Macleod et al., 1998) • Nominalization classes (22), for example: • Nom (deverbals): Sales departments then urged [Predicate abandonment] [Arg1 of the Pico Project]. • Partitive (part-whole): Hallwood owns about 11 [Predicate %] [Arg1 of Integra].

  6. Research objectives • Investigate the role of NomLex classes in automated NomBank SRL • Hypotheses • (1) Classes may exhibit consistent realizations of their arguments • (2) Modeling each class separately may result in more homogeneous training data and better SRL performance

  7. Outline • Nominalization interpretation: related work • NomBank SRL • Class-based NomBank SRL • Preliminary results and analysis • Conclusions and future work

  8. Nominalization interpretation: early work • Rule-based methods • Associate syntactic configurations with grammatical functions and semantic properties • Dahl et al. (1987) • Hull and Gomez (1996) • Meyers et al. (1998) • Statistical models: Lapata (2000) • Identify underlying subject/object • [subject satellite] observation • [object satellite] observation

  9. Nominalization interpretation: recent work • SemEval (Girju et al., 2007) • Semantic relations between nominals • Cause-Effect: laugh wrinkles • Instrument-Agency: laser printer • Product-Producer: honey bee • Origin-Entity: message (entity) from outer space (origin) • Theme-Tool: news conference • Part-Whole: the door of the car • Content-Container: the grocery bag

  10. Nominalization interpretation: recent work • NomBank SRL: Jiang and Ng (2006), Liu and Ng (2007) • Direct application of verbal SRL methods • Standard feature set • Maximum entropy modeling • Best overall f-measure score: 0.7283 • NomBank-specific features had little impact

  11. Overview of NomBank SRL • Full syntactic analysis (figure: parse tree over the example sentence) • Judge Curry ordered Edison to make average [Predicate refunds] of about $45.

  12. Overview of NomBank SRL • Argument identification • Binary classification problem: argument vs. non-argument (figure: parse tree with candidate nodes marked) • Judge Curry ordered [Edison] to make average [Predicate refunds] [of about $45].
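The identification step on slide 12 can be pictured as enumerating every constituent of the parse tree and attaching a binary label to each. The following is only a sketch of that framing: the toy bracketing, the tuple-based tree encoding, and the function names are assumptions made for illustration, not the parse or the code used in the presentation.

```python
# Toy constituency tree: (label, child, child, ...) with words as string leaves.
# The bracketing is an illustrative assumption, not the parse from the slide.
TREE = ("S",
        ("NP", "Judge", "Curry"),
        ("VP", "ordered",
         ("NP", "Edison"),
         ("S",
          ("VP", "to",
           ("VP", "make",
            ("NP", "average", "refunds",
             ("PP", "of", "about", "$45")))))))


def constituents(tree, start=0):
    """Return (spans, end), where spans lists (label, start, end) for every
    non-leaf node (end exclusive) and end is the index after the last word."""
    label, children = tree[0], tree[1:]
    spans, pos = [], start
    for child in children:
        if isinstance(child, str):
            pos += 1  # a leaf consumes exactly one word position
        else:
            child_spans, pos = constituents(child, pos)
            spans.extend(child_spans)
    spans.append((label, start, pos))
    return spans, pos


def identification_instances(tree, gold_argument_spans):
    """One binary instance per constituent: True if its span is a gold argument."""
    spans, _ = constituents(tree)
    return [(label, s, e, (s, e) in gold_argument_spans) for label, s, e in spans]


# Gold arguments of the predicate "refunds": [Edison] and [of about $45].
GOLD = {(3, 4), (8, 11)}
INSTANCES = identification_instances(TREE, GOLD)
POSITIVES = [inst for inst in INSTANCES if inst[3]]
```

In this toy tree, two of the nine candidate constituents are positive instances; a real system would compute features for each candidate and train a classifier on them.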

  13. Overview of NomBank SRL • Argument classification • 22-class problem: Arg0-Arg9, temporal, location, etc. (figure: parse tree with labeled argument nodes) • Judge Curry ordered [Arg0 Edison] to make average [Predicate refunds] [Arg1 of about $45].

  14. NomBank SRL features

  15. Class-based NomBank SRL • Simple method • Cluster nominalizations according to NomLex class membership • Train a logistic regression model for each class • Single-stage, 23-class strategy • Baseline feature set • Heuristic post-processing • Backoff model trained over all classes

  16. Class-based NomBank SRL • Model application • Input: Hallwood owns about 11 [Predicate %] of Integra. • NomLex lookup (entries such as abandonment: …, abatement: …, abduction: …, aberration: …, ability: …, abolition: …, abomination: …) • Class models: Nom, Partitive, Attribute, Relational, Backoff • Output: Hallwood owns about 11 [Predicate %] [Arg1 of Integra].
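The model-application step on slide 16 amounts to a lookup followed by a dispatch. Here is a minimal sketch of that flow. The dictionary standing in for NomLex, the stub labelers, and the rule for when the backoff model is chosen (no class entry, or no model for the class) are all assumptions for illustration; the slides do not spell out these details.

```python
# Hypothetical stand-in for NomLex: nominalization -> class name.
# The three entries follow the class examples given elsewhere in the slides.
NOMLEX_CLASS = {
    "abandonment": "Nom",
    "%": "Partitive",
    "attorney": "Relational",
}


def make_labeler(model_name):
    """Build a stub labeler; a real one would wrap a trained classifier."""
    def label(sentence):
        return f"{model_name} model labels: {sentence}"
    return label


# One (stub) model per class, plus a backoff model trained over all classes.
CLASS_MODELS = {name: make_labeler(name)
                for name in ("Nom", "Partitive", "Attribute", "Relational")}
BACKOFF = make_labeler("Backoff")


def select_model(predicate, lexicon=NOMLEX_CLASS,
                 class_models=CLASS_MODELS, backoff=BACKOFF):
    """Pick the class-specific model for a predicate, else the backoff model.

    Assumption: backoff applies when the predicate has no class entry or
    when no model exists for its class.
    """
    nom_class = lexicon.get(predicate)
    return class_models.get(nom_class, backoff)


RESULT = select_model("%")("Hallwood owns about 11 % of Integra.")
```

Keeping the lexicon, the per-class models, and the backoff model as separate arguments makes it easy to swap any of them when experimenting with different class partitions.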

  17. Preliminary results and analysis • Evaluation configuration • Training instances: WSJ 2-21 • Testing instances: WSJ 23 • Automatically generated parse trees for training and testing • Key observations • Overall performance • Per-class performance • Class-based gains over baseline

  18. Overall evaluation results • Per-class evaluation results (tables not preserved in the transcript)

  19. Per-class evaluation results • General observations • Negligible overall gains compared to Liu and Ng (2007), who reported an overall f-measure of 0.7283 • Some NomLex classes perform very well • Classes introduce gains as well as losses
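For readers unfamiliar with the metric, the f-measure quoted above is the harmonic mean of precision and recall over predicted arguments. The sketch below assumes that an argument counts as correct only when both its span and its label match the gold annotation; the slides do not state the exact matching criterion, and the function name and toy data are illustrative.

```python
def precision_recall_f1(gold, predicted):
    """Score predicted arguments against gold arguments.

    Each argument is a (start, end, label) triple; a prediction is correct
    only if the identical triple occurs in the gold set (an assumption).
    """
    gold, predicted = set(gold), set(predicted)
    correct = len(gold & predicted)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy example: one of two predictions matches one of two gold arguments.
GOLD_ARGS = {(3, 4, "Arg0"), (8, 11, "Arg1")}
PRED_ARGS = {(3, 4, "Arg0"), (6, 11, "Arg1")}
P, R, F1 = precision_recall_f1(GOLD_ARGS, PRED_ARGS)
```

In the toy example the second prediction has the right label but the wrong span, so it counts against both precision and recall.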

  20. Analysis: intra-class regularity • Hypothesis 1: classes may exhibit consistent realizations of their arguments • Relational class (F1=90.94) • Regularity: argument incorporation • [Arg2 Mr. Hunt’s] [Arg0/Predicate attorney] said his client welcomed the gamble. • 100% of Relational nominalizations have an incorporated Arg0 • Constitutes 38% of test arguments for the class

  21. Analysis: intra-class regularity • Hypothesis 1: classes may exhibit consistent realizations of their arguments • Partitive class (F1=79.85) • Regularity: presence of Arg0 • 86% of Partitive instances take a single Arg0 • Compare: 15% of Nom instances take a single Arg1
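Regularity statistics of this kind can be computed directly from the annotated instances of a class. The sketch below measures the fraction of instances that realize exactly one argument with a given label. The function name and the toy data are invented for illustration and do not reproduce the figures on the slide.

```python
def single_argument_rate(instances, label):
    """Fraction of instances whose argument labels are exactly [label].

    `instances` is a list with one entry per predicate instance, each entry
    being the list of argument labels annotated for that instance.
    """
    if not instances:
        return 0.0
    hits = sum(1 for labels in instances if list(labels) == [label])
    return hits / len(instances)


# Invented toy data: four predicate instances and their argument labels.
TOY_INSTANCES = [["Arg1"], ["Arg1"], ["Arg0", "Arg1"], ["Arg1"]]
RATE = single_argument_rate(TOY_INSTANCES, "Arg1")
```

Running the same function over each NomLex class separately would give the kind of per-class comparison drawn on the slide.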

  22. Analysis: class-based gains • Hypothesis 2: modeling each class separately may result in more homogeneous training data and better SRL performance • Improvements

  23. Analysis: class-based gains • Hypothesis 2: modeling each class separately may result in more homogeneous training data and better SRL performance • Losses

  24. Conclusions and future work • NomBank SRL based on classes derived from NomLex • Demonstrates negligible gains over Liu and Ng (2007) • Intra-class regularity leads to modest gains in some classes • NomLex ambiguity causes losses in others

  25. Conclusions and future work • In-depth class modeling • Identification of class-specific regularities not captured by the current feature set • Further partitioning of the Nom class? • NomLex class disambiguation

  26. Thanks! Any questions?

  27. References • Carreras, X. & Màrquez, L. (2005), 'Introduction to the CoNLL-2005 Shared Task: Semantic Role Labeling'. • Dahl, D. A.; Palmer, M. S. & Passonneau, R. J. (1987), Nominalizations in PUNDIT, in 'Proceedings of the 25th annual meeting on Association for Computational Linguistics', Association for Computational Linguistics, Morristown, NJ, USA, pp. 131--139. • Girju, R.; Nakov, P.; Nastase, V.; Szpakowicz, S.; Turney, P. & Yuret, D. (2007), SemEval-2007 Task 04: Classification of Semantic Relations between Nominals, in 'Proceedings of the 4th International Workshop on Semantic Evaluations'. • Hull, R. & Gomez, F. (1996), Semantic Interpretation of Nominalizations, in 'Proceedings of AAAI'. • Jiang, Z. & Ng, H. (2006), Semantic Role Labeling of NomBank: A Maximum Entropy Approach, in 'Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing'. • Kaisser, M. & Webber, B. (2007), Question Answering based on Semantic Roles, in 'ACL 2007 Workshop on Deep Linguistic Processing', Association for Computational Linguistics, Prague, Czech Republic, pp. 41--48. • Kingsbury, P. & Palmer, M. (2003), Propbank: the next level of treebank, in 'Proceedings of Treebanks and Lexical Theories'. • Lapata, M. (2000), The Automatic Interpretation of Nominalizations, in 'Proceedings of the Seventeenth National Conference on Artificial Intelligence and Twelfth Conference on Innovative Applications of Artificial Intelligence', AAAI Press / The MIT Press, pp. 716--721.

  28. References (cont’d) • Liu, C. & Ng, H. (2007), Learning Predictive Structures for Semantic Role Labeling of NomBank, in 'Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics', Association for Computational Linguistics, Prague, Czech Republic, pp. 208--215. • Macleod, C.; Grishman, R.; Meyers, A.; Barrett, L. & Reeves, R. (1998), Nomlex: A lexicon of nominalizations, in 'Proceedings of the Eighth International Congress of the European Association for Lexicography'. • Marcus, M.; Santorini, B. & Marcinkiewicz, M. A. (1993), 'Building a large annotated corpus of English: the Penn TreeBank', Computational Linguistics 19, 313--330. • Meyers, A. (2007), 'Annotation Guidelines for NomBank - Noun Argument Structure for PropBank', Technical report, New York University. • Meyers, A.; Macleod, C.; Yangarber, R.; Grishman, R.; Barrett, L. & Reeves, R. (1998), Using NOMLEX to produce nominalization patterns for information extraction, in 'Proceedings of the COLING-ACL Workshop on the Computational Treatment of Nominals'. • Ponzetto, S. P. & Strube, M. (2006), Exploiting semantic role labeling, WordNet and Wikipedia for coreference resolution, in 'Proceedings of the main conference on Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics', Association for Computational Linguistics, Morristown, NJ, USA, pp. 192--199. • Shen, D. & Lapata, M. (2007), Using Semantic Roles to Improve Question Answering, in 'Proceedings of the Conference on Empirical Methods in Natural Language Processing and on Computational Natural Language Learning', pp. 12--21. • Surdeanu, M.; Harabagiu, S.; Williams, J. & Aarseth, P. (2003), Using predicate-argument structures for information extraction, in 'Proceedings of the 41st Annual Meeting on Association for Computational Linguistics'.
