Class-based nominal semantic role labeling: a preliminary investigation

Matt Gerber, Michigan State University, Department of Computer Science

Introduction: semantic role labeling • The semantic role • Relation between a constituent and a predication • Example: [Agent John] presented [Theme his findings] [Experiencer to the committee]. • The task • Automatically identify semantic roles occurring in natural language • Problematic: which roles are the “right” ones?

Introduction: PropBank (Kingsbury and Palmer, 2003) • Annotated corpus of semantic roles • Example: [Arg0 John] presented [Arg1 his findings] [Arg2 to the committee]. • Base corpus: TreeBank 2 (Marcus et al., 1993) • Evaluation • CoNLL Shared Task (Carreras and Màrquez, 2005) • Implications • QA: Kaisser and Webber (2007), Shen and Lapata (2007) • Coreference: Ponzetto and Strube (2006) • Information extraction: Surdeanu et al. (2003)

Introduction: NomBank (Meyers, 2007) • Verbs are not the only lexical category with shallow semantic structure • Verbal: [Arg0 Judge Curry] [Predicate ordered] [Arg1 Edison] [Arg2 to make average refunds of about $45]. • Nominal: Judge Curry ordered [Arg0 Edison] to make average [Predicate refunds] [Arg1 of about $45]. • A more complete semantic interpretation of natural language

Introduction: NomBank (Meyers, 2007) • Corpus information • Base corpus: TreeBank 2 • Distinct nominalizations: 4704 • Total attestations: ~115K • NomLex (Macleod et al., 1998) • Nominalization classes (22) • Nom (deverbals) • Example: Sales departments then urged [Predicate abandonment] [Arg1 of the Pico Project]. • Partitive (part-whole) • Example: Hallwood owns about 11 [Predicate %] [Arg1 of Integra].

Research objectives • Investigate the role of NomLex classes in automated NomBank SRL • Hypotheses • (1) Classes may exhibit consistent realizations of their arguments • (2) Modeling each class separately may result in more homogeneous training data and better SRL performance

Outline • Nominalization interpretation: related work • NomBank SRL • Class-based NomBank SRL • Preliminary results and analysis • Conclusions and future work

Nominalization interpretation: early work • Rule-based methods • Associate syntactic configurations with grammatical functions and semantic properties • Dahl et al. (1987) • Hull and Gomez (1996) • Meyers et al. (1998) • Statistical models: Lapata (2000) • Identify underlying subject/object • [subject satellite] observation • [object satellite] observation

Nominalization interpretation: recent work • SemEval (Girju et al., 2007) • Semantic relations between nominals • Cause-Effect: laugh wrinkles • Instrument-Agency: laser printer • Product-Producer: honey bee • Origin-Entity: [Entity message] from [Origin outer-space] • Theme-Tool: news conference • Part-Whole: the door of the car • Content-Container: the grocery bag

Nominalization interpretation: recent work • NomBank SRL: Jiang and Ng (2006), Liu and Ng (2007) • Direct application of verbal SRL methods • Standard feature set • Maximum entropy modeling • Best overall f-measure score: 0.7283 • NomBank-specific features had little impact
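
The f-measure quoted above is the harmonic mean of precision and recall. The slides do not say how arguments are matched, so the following is only a minimal sketch that assumes the common exact-match convention: a predicted argument counts as correct when its predicate, token span, and label all equal a gold argument. The function name `f_measure`, the tuple layout, and the token offsets are illustrative assumptions, not part of the original work.

```python
def f_measure(gold, predicted):
    """Return (precision, recall, F1) over two collections of labeled arguments.

    Assumption: an argument is a hashable tuple and only exact matches count.
    """
    gold, predicted = set(gold), set(predicted)
    correct = len(gold & predicted)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Toy data: (predicate, first token, last token, label); offsets are hypothetical.
gold = [("refunds", 3, 3, "Arg0"), ("refunds", 8, 11, "Arg1")]
pred = [
    ("refunds", 3, 3, "Arg0"),   # correct
    ("refunds", 8, 11, "Arg2"),  # right span, wrong label
    ("refunds", 0, 1, "Arg0"),   # spurious argument
]
p, r, f1 = f_measure(gold, pred)
print(round(p, 4), round(r, 4), round(f1, 4))
```

With one correct argument out of three predictions and two gold arguments, precision is 1/3, recall is 1/2, and F1 is 0.4.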

Overview of NomBank SRL • Full syntactic analysis • Parse tree (figure) for: Judge Curry ordered Edison to make average [Predicate refunds] of about $45.

Overview of NomBank SRL • Argument identification • Binary classification problem • Argument • Non-argument • Example: Judge Curry ordered [Edison] to make average [Predicate refunds] [of about $45].

Overview of NomBank SRL • Argument classification • 22-class problem • Arg0-Arg9 • Temporal, location, etc. • Example: Judge Curry ordered [Arg0 Edison] to make average [Predicate refunds] [Arg1 of about $45].
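
The two stages can be read as two sets of targets defined over the nodes of the parse tree. The sketch below is an illustration under stated assumptions: the bracketing of the example sentence is hand-written (it is not the parser output used in the slides), token offsets count from zero, and the helper name `constituents` is hypothetical. It enumerates every constituent span as a candidate, then derives the binary identification target and the role label for the candidates marked as arguments in the example.

```python
# Hand-written bracketing of the example sentence (an assumption, for illustration).
tree = ("S",
        ("NP", ("NNP", "Judge"), ("NNP", "Curry")),
        ("VP", ("VBD", "ordered"),
         ("NP", ("NNP", "Edison")),
         ("S", ("VP", ("TO", "to"),
                ("VP", ("VB", "make"),
                 ("NP",
                  ("NP", ("JJ", "average"), ("NNS", "refunds")),
                  ("PP", ("IN", "of"),
                   ("NP", ("QP", ("RB", "about"), ("$", "$"), ("CD", "45"))))))))))


def constituents(node, start=0):
    """Return (next_position, spans) where spans lists (label, first, last)."""
    if isinstance(node, str):            # a leaf token occupies one position
        return start + 1, []
    label, *children = node
    spans, pos = [], start
    for child in children:
        pos, child_spans = constituents(child, pos)
        spans.extend(child_spans)
    spans.append((label, start, pos - 1))
    return pos, spans


# Gold arguments of the predicate "refunds", taken from the slide's bracketing.
gold = {(3, 3): "Arg0", (8, 11): "Arg1"}

_, spans = constituents(tree)
candidates = sorted({(first, last) for _, first, last in spans})

# Stage 1 target: is the candidate an argument at all?
identification = {span: span in gold for span in candidates}
# Stage 2 target: which role does an identified argument fill?
classification = {span: gold[span] for span in candidates if identification[span]}
print(len(candidates), classification)
```

In a trained system the two dictionaries would be predicted by classifiers over features of each candidate; here they are read off the gold annotation only to show what each stage is asked to produce.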

NomBank SRL features

Class-based NomBank SRL • Simple method • Cluster nominalizations according to NomLex class membership • Train a logistic regression model for each class • Single-stage, 23-class strategy • Baseline feature set • Heuristic post-processing • Backoff • Trained over all classes

Class-based NomBank SRL • Model application • Input: Hallwood owns about 11 [Predicate %] of Integra. • NomLex lookup (entries: abandonment: …, abatement: …, abduction: …, aberration: …, ability: …, abolition: …, abomination: …) selects one model: Nom, Partitive, Attribute, Relational, or Backoff • Output: Hallwood owns about 11 [Predicate %] [Arg1 of Integra].
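
The training and application steps can be sketched as a dictionary of per-class models plus a backoff model trained on everything. This is a rough illustration, not the system from the slides: `MajorityModel` is a deliberately trivial stand-in for the logistic regression models, each candidate is reduced to one made-up feature string, the lexicon is a three-entry toy, and the training rows are invented. The sketch also assumes that the backoff model handles predicates with no NomLex class, which the slides do not spell out.

```python
from collections import Counter, defaultdict


class MajorityModel:
    """Placeholder classifier (NOT logistic regression): most frequent label per feature."""

    def __init__(self, rows):
        self.by_feature = defaultdict(Counter)
        self.overall = Counter()
        for feature, label in rows:
            self.by_feature[feature][label] += 1
            self.overall[label] += 1

    def predict(self, feature):
        counts = self.by_feature.get(feature) or self.overall
        return counts.most_common(1)[0][0]


def train_class_models(training, nomlex_class):
    """training: iterable of (predicate, feature, label). Returns (models, backoff)."""
    grouped, everything = defaultdict(list), []
    for predicate, feature, label in training:
        everything.append((feature, label))
        cls = nomlex_class.get(predicate)
        if cls is not None:
            grouped[cls].append((feature, label))
    models = {cls: MajorityModel(rows) for cls, rows in grouped.items()}
    return models, MajorityModel(everything)   # backoff sees all classes


def label_candidate(predicate, feature, models, backoff, nomlex_class):
    """Pick the model for the predicate's class, else fall back to the backoff model."""
    model = models.get(nomlex_class.get(predicate), backoff)
    return model.predict(feature)


# Toy lexicon and invented training rows (feature strings are hypothetical).
nomlex_class = {"%": "Partitive", "abandonment": "Nom", "attorney": "Relational"}
training = [
    ("%", "PP-of", "Arg1"),
    ("abandonment", "PP-of", "Arg1"),
    ("abandonment", "NP-poss", "Arg0"),
    ("attorney", "NP-poss", "Arg2"),
    ("attorney", "NP-poss", "Arg2"),
]
models, backoff = train_class_models(training, nomlex_class)
print(label_candidate("%", "PP-of", models, backoff, nomlex_class))
print(label_candidate("abandonment", "NP-poss", models, backoff, nomlex_class))
print(label_candidate("refund", "NP-poss", models, backoff, nomlex_class))
```

In this toy data the Nom model and the backoff model disagree on the same feature, which is the kind of effect the class-specific models are meant to capture.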

Preliminary results and analysis • Evaluation configuration • Training instances: WSJ 2-21 • Testing instances: WSJ 23 • Automatically generated parse trees for training and testing • Key observations • Overall performance • Per-class performance • Class-based gains over baseline

Overall evaluation results

Per-class evaluation results

Per-class evaluation results • General observations • Negligible overall gains compared to Liu and Ng (2007), who reported overall f-measure of 0.7283 • Some NomLex classes perform very well • Classes introduce gains as well as losses

Analysis: intra-class regularity • Hypothesis 1: classes may exhibit consistent realizations of their arguments • Relational class (F1=90.94) • Regularity: argument incorporation • [Arg2 Mr. Hunt’s] [Arg0/Predicate attorney] said his client welcomed the gamble. • 100% of Relational nominalizations have an incorporated Arg0 • Constitutes 38% of test arguments for the class

Analysis: intra-class regularity • Hypothesis 1: classes may exhibit consistent realizations of their arguments • Partitive class (F1=79.85) • Regularity: presence of Arg1 • 86% of Partitive instances take a single Arg1 • Compare: 15% of Nom instances take a single Arg1
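
A regularity figure of this kind is a simple proportion: the share of a class's predicate instances whose arguments consist of exactly one argument with a given label. The sketch below shows the counting under the assumption that each instance is stored as a class name with its list of argument labels. The function name `single_argument_rate` and the data are invented for illustration and do not reproduce the percentages on the slide.

```python
def single_argument_rate(instances, nom_class, label):
    """Fraction of `nom_class` instances whose only argument carries `label`.

    instances: iterable of (class name, list of argument labels for one predicate).
    """
    in_class = [args for cls, args in instances if cls == nom_class]
    if not in_class:
        return 0.0
    hits = sum(1 for args in in_class if args == [label])
    return hits / len(in_class)


# Invented toy instances; real counts would come from the annotated corpus.
instances = [
    ("Partitive", ["Arg1"]),
    ("Partitive", ["Arg1"]),
    ("Partitive", ["Arg1"]),
    ("Partitive", ["Arg0", "Arg1"]),
    ("Nom", ["Arg1"]),
    ("Nom", ["Arg0", "Arg1"]),
    ("Nom", ["Arg0"]),
    ("Nom", []),
]
partitive_rate = single_argument_rate(instances, "Partitive", "Arg1")
nom_rate = single_argument_rate(instances, "Nom", "Arg1")
print(partitive_rate, nom_rate)
```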

Analysis: class-based gains • Hypothesis 2: modeling each class separately may result in more homogeneous training data and better SRL performance • Improvements

Analysis: class-based gains • Hypothesis 2: modeling each class separately may result in more homogeneous training data and better SRL performance • Losses

Conclusions and future work • NomBank SRL based on classes derived from NomLex • Demonstrates negligible gains over Liu and Ng (2007) • Intra-class regularity leads to modest gains in some classes • NomLex ambiguity causes losses in others

Conclusions and future work • In-depth class modeling • Identification of class-specific regularities not captured by the current feature set • Further partitioning of the Nom class? • NomLex class disambiguation

Thanks! Any questions?

References • Carreras, X. & Màrquez, L. (2005), 'Introduction to the CoNLL-2005 Shared Task: Semantic Role Labeling'. • Dahl, D. A.; Palmer, M. S. & Passonneau, R. J. (1987), Nominalizations in PUNDIT, in 'Proceedings of the 25th annual meeting on Association for Computational Linguistics', Association for Computational Linguistics, Morristown, NJ, USA, pp. 131--139. • Girju, R.; Nakov, P.; Nastase, V.; Szpakowicz, S.; Turney, P. & Yuret, D. (2007), SemEval-2007 Task 04: Classification of Semantic Relations between Nominals, in 'Proceedings of the 4th International Workshop on Semantic Evaluations'. • Hull, R. & Gomez, F. (1996), Semantic Interpretation of Nominalizations, in 'Proceedings of AAAI'. • Jiang, Z. & Ng, H. (2006), Semantic Role Labeling of NomBank: A Maximum Entropy Approach, in 'Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing'. • Kaisser, M. & Webber, B. (2007), Question Answering based on Semantic Roles, in 'ACL 2007 Workshop on Deep Linguistic Processing', Association for Computational Linguistics, Prague, Czech Republic, pp. 41--48. • Kingsbury, P. & Palmer, M. (2003), Propbank: the next level of treebank, in 'Proceedings of Treebanks and Lexical Theories'. • Lapata, M. (2000), The Automatic Interpretation of Nominalizations, in 'Proceedings of the Seventeenth National Conference on Artificial Intelligence and Twelfth Conference on Innovative Applications of Artificial Intelligence', AAAI Press / The MIT Press, pp. 716--721.

References (cont’d) • Liu, C. & Ng, H. (2007), Learning Predictive Structures for Semantic Role Labeling of NomBank, in 'Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics', Association for Computational Linguistics, Prague, Czech Republic, pp. 208--215. • Macleod, C.; Grishman, R.; Meyers, A.; Barrett, L. & Reeves, R. (1998), Nomlex: A lexicon of nominalizations, in 'Proceedings of the Eighth International Congress of the European Association for Lexicography'. • Marcus, M.; Santorini, B. & Marcinkiewicz, M. A. (1993), 'Building a large annotated corpus of English: the Penn TreeBank', Computational Linguistics 19, 313--330. • Meyers, A. (2007), 'Annotation Guidelines for NomBank - Noun Argument Structure for PropBank', Technical report, New York University. • Meyers, A.; Macleod, C.; Yangarber, R.; Grishman, R.; Barrett, L. & Reeves, R. (1998), Using NOMLEX to produce nominalization patterns for information extraction, in 'Proceedings of the COLING-ACL Workshop on the Computational Treatment of Nominals'. • Ponzetto, S. P. & Strube, M. (2006), Exploiting semantic role labeling, WordNet and Wikipedia for coreference resolution, in 'Proceedings of the main conference on Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics', Association for Computational Linguistics, Morristown, NJ, USA, pp. 192--199. • Shen, D. & Lapata, M. (2007), Using Semantic Roles to Improve Question Answering, in 'Proceedings of the Conference on Empirical Methods in Natural Language Processing and on Computational Natural Language Learning', pp. 12--21. • Surdeanu, M.; Harabagiu, S.; Williams, J. & Aarseth, P. (2003), Using predicate-argument structures for information extraction, in 'Proceedings of the 41st Annual Meeting on Association for Computational Linguistics'.
