
Template-Based Event Extraction




Presentation Transcript


  1. Template-Based Event Extraction Kevin Reschke – Aug 15th 2013 Martin Jankowiak, Mihai Surdeanu, Dan Jurafsky, Christopher Manning

  2. Outline • Recap from last time • Distant supervision • Plane crash dataset • Current work • Fully supervised setting • MUC4 terrorism dataset Underlying theme: Joint Inference Models

  3. Goal: Knowledge Base Population From a news corpus sentence such as “… Delta Flight 14 crashed in Mississippi killing 40 …”, populate a knowledge base record: <Plane Crash> <Flight Number = Flight 14> <Operator = Delta> <Fatalities = 40> <Crash Site = Mississippi>
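For concreteness, such a knowledge base record can be pictured as a plain mapping from slot names to fill values; the small Python sketch below is only an illustration (the variable name and the use of a dict are my own, not part of the talk).

    # Hypothetical sketch of one knowledge-base record for the plane-crash template.
    plane_crash_event = {
        "FlightNumber": "Flight 14",
        "Operator": "Delta",
        "Fatalities": 40,
        "CrashSite": "Mississippi",
    }
    # Extraction task: populate records like this from sentences such as
    # "... Delta Flight 14 crashed in Mississippi killing 40 ..."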

  4. Distant Supervision Use known events to automatically label training data. Training knowledge base entry: <Plane crash> <Flight Number = Flight 11> <Operator = USAir> <Fatalities = 200> <Crash Site = Toronto> Automatically labeled text: “One year after [USAir]Operator [Flight 11]FlightNumber crashed in [Toronto]CrashSite, families of the [200]Fatalities victims attended a memorial service in [Vancouver]NIL.”
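The labeling step can be sketched as projecting known slot fills onto candidate mentions in the text: a mention that matches a fill gets that slot as its label, and everything else gets NIL. The sketch below is a minimal, assumed illustration of this idea, not the actual labeling pipeline (which in practice needs matching beyond exact strings).

    # Minimal distant-supervision sketch: project known slot fills from a
    # knowledge-base record onto candidate mentions. Illustrative names only.
    def label_mentions(kb_record, mentions):
        # kb_record: slot -> fill string; mentions: candidate mention strings.
        fill_to_slot = {str(fill).lower(): slot for slot, fill in kb_record.items()}
        return [(m, fill_to_slot.get(m.lower(), "NIL")) for m in mentions]

    kb_record = {"Operator": "USAir", "FlightNumber": "Flight 11",
                 "CrashSite": "Toronto", "Fatalities": "200"}
    mentions = ["USAir", "Flight 11", "Toronto", "200", "Vancouver"]
    print(label_mentions(kb_record, mentions))
    # "Vancouver" is labeled NIL because it fills no slot in the known event.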

  5. Plane Crash Dataset 80 plane crashes from Wikipedia infoboxes. Training set: 32; Dev set: 8; Test set: 40 Corpus: Newswire data from 1989 – present.

  6. Extraction Models • Local Model • Train and classify each mention independently. • Pipeline Model • Classify sequentially; use previous label as feature. • Captures dependencies between labels. • E.g., Passengers and Crew go together: “4 crew and 200 passengers were on board.” • Joint Model • Searn Algorithm (Daumé III et al., 2009). • Jointly models all mentions in a sentence.
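To make the pipeline variant concrete, the sketch below classifies mentions left to right and feeds the previously predicted non-NIL label back in as a feature for the next mention; the classifier interface (a predict method over feature dicts) is an assumption for illustration, not the system from the talk.

    # Pipeline sketch: the previous non-NIL predicted label is a feature for the
    # next mention. `classifier` is assumed to expose predict(feature_dict).
    def classify_pipeline(classifier, mention_features):
        predictions, prev_label = [], "none"
        for feats in mention_features:
            feats = dict(feats, prev_label=prev_label)  # add the pipeline feature
            label = classifier.predict(feats)
            predictions.append(label)
            if label != "NIL":
                prev_label = label  # carry the last non-NIL label forward
        return predictions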

  7. Results

  8. Fully Supervised Setting: MUC4 Terrorism Dataset • 4th Message Understanding Conference (1992). • Terrorist activities in Latin America. • 1700 docs (train / dev / test = 1300 / 200 / 200). • 50/50 mix of relevant and irrelevant docs.

  9. MUC4 Task • 5 slot types: • Perpetrator Individual (PerpInd) • Perpetrator Organization (PerpOrg) • Physical Target (Target) • Victim (Victim) • Weapon (Weapon) • Task: Identify all slot fills in each document. • Don’t worry about differentiating multiple events.

  10. MUC4 Example THE ARCE BATTALION COMMAND HAS REPORTED THAT ABOUT 50 PEASANTS (Victim) OF VARIOUS AGES HAVE BEEN KIDNAPPED BY TERRORISTS (PerpInd) OF THE FARABUNDO MARTI NATIONAL LIBERATION FRONT [FMLN] (PerpOrg) IN SAN MIGUEL DEPARTMENT.

  11. MUC4 Example THE ARCE BATTALION COMMAND (NIL) HAS REPORTED THAT ABOUT 50 PEASANTS (Victim) OF VARIOUS AGES HAVE BEEN KIDNAPPED BY TERRORISTS (PerpInd) OF THE FARABUNDO MARTI NATIONAL LIBERATION FRONT [FMLN] (PerpOrg) IN SAN MIGUEL DEPARTMENT (NIL).

  12. Baseline Results • Local Mention Model • Multiclass logistic regression. • Pipeline Mention Model • Previous non-NIL label (or “none”) is feature for current mention.
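As a rough illustration of the local baseline, the sketch below trains a multiclass logistic regression over per-mention feature dictionaries with scikit-learn; the specific features and training examples are made up for illustration and are not the feature set used in the experiments.

    # Local mention baseline sketch: multiclass logistic regression over
    # per-mention features. Features and examples are placeholders.
    from sklearn.feature_extraction import DictVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    train_feats = [
        {"head": "peasants", "prev_word": "50"},
        {"head": "terrorists", "prev_word": "by"},
        {"head": "command", "prev_word": "battalion"},
    ]
    train_labels = ["Victim", "PerpInd", "NIL"]

    local_model = make_pipeline(DictVectorizer(), LogisticRegression(max_iter=1000))
    local_model.fit(train_feats, train_labels)
    print(local_model.predict([{"head": "peasants", "prev_word": "the"}]))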

  13. Observation 1: • Local context is insufficient. • Need sentence-level measure. (Patwardhan & Riloff, 2009) “Two bridges were destroyed . . .” • “. . . in Baghdad last night in a resurgence of bomb attacks in the capital city.” ✓ • “. . . and $50 million in damage was caused by a hurricane that hit Miami on Friday.” ✗ • “. . . to make way for modern, safer bridges that will be constructed early next year.” ✗

  14. Baseline Models + Sentence Relevance • Binary relevance classifier – unigram / bigram features • HardSent: • Discard all mentions in irrelevant sentences. • SoftSent: • Sentence relevance is feature for mention classification.
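A hedged sketch of this component: a binary relevance classifier over unigram/bigram counts, plus the two uses of its output (HardSent discards mentions in irrelevant sentences; SoftSent adds relevance as a mention feature). The scikit-learn choice, toy sentences, and function names are assumptions for illustration.

    # Sentence-relevance sketch: binary classifier with unigram/bigram features,
    # then HardSent / SoftSent uses of its prediction. Illustrative only.
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    sentences = ["Two bridges were destroyed in a resurgence of bomb attacks.",
                 "$50 million in damage was caused by a hurricane that hit Miami."]
    relevant = [1, 0]

    relevance_clf = make_pipeline(CountVectorizer(ngram_range=(1, 2)),
                                  LogisticRegression(max_iter=1000))
    relevance_clf.fit(sentences, relevant)

    def hard_sent(sentence, mentions):
        # HardSent: drop every mention in a sentence judged irrelevant.
        return mentions if relevance_clf.predict([sentence])[0] == 1 else []

    def soft_sent_features(sentence, mention_feats):
        # SoftSent: the sentence-relevance prediction becomes one more mention feature.
        rel = int(relevance_clf.predict([sentence])[0])
        return dict(mention_feats, sentence_relevant=rel)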

  15. Observation 2: • Sentence relevance depends on surrounding context. (Huang & Riloff, 2012) “Obama was attacked.” (political attack vs. terrorist attack) “He used a gun.” (weapon in terrorist event?)

  16. Joint Inference Models • Idea: Model sentence relevance and mention labels jointly – yield globally optimal decisions. • Machinery: Conditional Random Fields (CRFs). • Model joint probability of relevance labels and mention labels conditioned on input features. • Encode dependencies among labels. • Software: Factorie (http://factorie.cs.umass.edu) • Flexibly design CRF graph structures. • Learning / Classification algorithms with exact and approximate inference.
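For concreteness, one way to write down the joint conditional the slide describes, with factor names chosen to mirror the learned weights shown on the later Analysis slide (RelLabel linking a sentence’s relevance to its mentions’ labels, RelRel linking adjacent sentences’ relevance); the exact factorization used in the experiments may differ from this sketch:

    p(\mathbf{r}, \mathbf{m} \mid \mathbf{x}) = \frac{1}{Z(\mathbf{x})}
        \exp\Big( \sum_{i} \theta_{\mathrm{RelRel}}^{\top} f(r_i, r_{i+1})
                + \sum_{i} \sum_{j} \theta_{\mathrm{RelLabel}}^{\top} f(r_i, m_{i,j}, \mathbf{x}) \Big)

Here r_i is the relevance label of sentence i, m_{i,j} is the label of its j-th mention, and Z(\mathbf{x}) is the partition function.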

  17. First Pass • Fully joint model. [Factor graph: one sentence-relevance node S connected to mention nodes M M M.] • Approximate inference a likely culprit.

  18. Second Pass • Two linear-chain CRFs with relevance threshold. [Factor graph: a linear chain over sentence-relevance nodes S S S and a linear chain over mention nodes M M M, coupled by a relevance threshold.]
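A minimal sketch of how the relevance threshold could couple the two chains: the sentence chain produces a relevance probability per sentence, and only sentences above the threshold have their mentions labeled by the mention chain. The threshold value and both object interfaces are assumptions, not details from the talk.

    # Sketch of the relevance-threshold coupling between the two chains.
    RELEVANCE_THRESHOLD = 0.5  # assumed value, not from the talk

    def extract(sentences, relevance_chain, mention_chain):
        results = []
        for sent in sentences:
            p_rel = relevance_chain.marginal(sent)  # assumed interface
            if p_rel >= RELEVANCE_THRESHOLD:
                results.append(mention_chain.label(sent.mentions))  # assumed interface
            else:
                results.append(["NIL"] * len(sent.mentions))
        return results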

  19. Analysis • Many errors are reasonable extractions, but come from irrelevant documents. Example: “The kidnappers were accused of kidnapping several businessmen for high sums of money.” • Learned CRF model weights: • RelLabel<+, <NIL>> = -0.071687 • RelLabel<+, Vict> = 0.716669 • RelLabel<-, Vict> = -1.688919 • ... • RelRel<+, +> = -0.609790 • RelRel<+, -> = -0.469663 • RelRel<-, +> = -0.634649 • RelRel<-, -> = 0.572855

  20. Possibilities for improvement • Label-specific relevance thresholds. • Leverage coref (Skip-Chain CRFs). • Incorporate doc-level relevance signal.

  21. State of the art • Huang & Riloff (2012) • P / R / F1 : 0.58 / 0.60 / 0.59 • CRF sentence model with local mention classifiers. • Textual cohesion features to model sentence chains. • Multiple binary mention classifiers (SVMs).

  22. Future Work • Apply CRF models to plane crash dataset. • New terrorism dataset from Wikipedia. • Hybrid models: combine supervised MUC4 data with distant supervision on Wikipedia data.

  23. Thanks!
