
ACE Annotation


Presentation Transcript


  1. ACE Annotation Ralph Grishman New York University

  2. ACE (Automatic Content Extraction)
     • Government evaluation task for information extraction
     • 6 evaluations since 2000
       • next one Nov. 2005
     • incremental increases in task complexity
     • (Current) criteria for what to annotate:
       • interest to Government sponsors
       • good inter-annotator agreement
       • reasonable density of annotations
     • initially for news, now for a wider range of genres (trade-off between coverage and agreement)

  3. Types of Annotations
     • Entities
     • Relations
     • Events
     • Inter-annotator agreement measured by the ‘value’ metric
       • roughly 1.00 - % missing - % spurious
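To make the agreement score concrete, here is a minimal Python sketch of the ‘value’-style computation named on the slide, assuming simple unweighted counts of missing and spurious annotations; the function name is hypothetical, and the official ACE value metric additionally applies per-type weights and partial credit, which are omitted here.

```python
def annotation_value(n_reference, n_missing, n_spurious):
    """Rough sketch of the 'value'-style agreement score: start from 1.0
    and subtract the fractions of missing and spurious annotations
    relative to the reference set. (Hypothetical helper; the official
    ACE value metric also uses type weights and partial credit.)"""
    return 1.0 - n_missing / n_reference - n_spurious / n_reference

# Example: 100 reference annotations, 5 missed and 5 spurious -> 0.90
print(annotation_value(100, 5, 5))
```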

  4. Entities
     • Objects of the discourse
     • (Semantic) Types:
       • persons, organizations, geo-political entities, [non-political] locations, facilities, vehicles, weapons
     • Two levels of annotation:
       • mentions (individual names, nominals, pronouns)
       • entities (sets of coreferring mentions)
     • Inter-annotator agreement around 0.90
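As an informal illustration of the two annotation levels, the sketch below models a mention and an entity as plain Python dataclasses; the class and field names are invented for this example and do not follow the actual ACE file format.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class Mention:
    text: str              # surface string as it appears in the document
    level: str             # "name", "nominal", or "pronoun"
    span: Tuple[int, int]  # character offsets in the source text

@dataclass
class Entity:
    entity_type: str       # e.g. "person", "organization", "facility"
    mentions: List[Mention] = field(default_factory=list)  # coreferring mentions

# "Ralph Grishman ... the professor ... he" -> one entity, three mentions
grishman = Entity("person", [
    Mention("Ralph Grishman", "name", (0, 14)),
    Mention("the professor", "nominal", (30, 43)),
    Mention("he", "pronoun", (60, 62)),
])
```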

  5. Relations
     • Binary, generally static relationships between entities
     • Main types:
       • physical (location), part-whole, personal-social, org-affiliation, gen-affiliation, and agent-artifact
     • Example: the CEO of Microsoft (org-affiliation)
     • Inter-annotator agreement (given entities) around 0.75 - 0.80
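A relation annotation can be pictured as a typed link between two entities. The dictionaries below are a hypothetical, schema-free rendering of the slide's example, not the ACE file format.

```python
# Entities from the phrase "the CEO of Microsoft"
ceo = {"type": "person", "mentions": ["the CEO"]}
microsoft = {"type": "organization", "mentions": ["Microsoft"]}

# The binary, static relation linking them
relation = {
    "type": "org-affiliation",  # one of the main types listed above
    "arg1": ceo,                # the affiliated person
    "arg2": microsoft,          # the organization
}
```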

  6. Events
     • New for 2005
     • Types:
       • life (born/marry/die), movement, transaction, business (start/end), personnel (hire/fire), conflict (attack), contact (meet), justice
     • Example: China purchased two subs from Russia in 1998.
       • transfer-ownership: buyer = China, trigger = purchased, artifact = two subs, seller = Russia, time = 1998
     • Inter-annotator agreement (given entities) around 0.55 - 0.60
       • some events (born, hire/fire, justice) fairly clear-cut
       • others (attack, meet, move) hard to delimit
       • coreference sometimes hard
     • No causal / subevent linkage -- too hard (maybe in 2006?)
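For the transfer-ownership example, an event annotation ties a trigger word to a set of typed arguments. The sketch below is an informal rendering of that structure; the key names are illustrative only.

```python
# "China purchased two subs from Russia in 1998."
event = {
    "type": "transaction",
    "subtype": "transfer-ownership",
    "trigger": "purchased",        # word that anchors the event in the text
    "arguments": {
        "buyer": "China",
        "artifact": "two subs",
        "seller": "Russia",
        "time": "1998",
    },
}
```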

  7. Corpora
     • Genres
       • newswire and broadcast news
       • adding weblogs, conversational telephone, talk shows, usenet this year
     • Multi-lingual
       • English, Chinese, Arabic (since 2003)
     • Volume
       • 2004 set: 140 KW (thousand words) training, 50 KW test per language
     • Distributed by LDC

  8. A (Nearly) Semantic Annotation
     • Annotation criteria primarily truth-conditional, not linguistic
       • although annotations are linked back to text
         • e.g., event triggers
       • and some constraints are included to improve inter-annotator agreement
         • e.g., event arguments must be in the same sentence as the trigger
     • Event arguments are filled in using the ‘true beyond a reasonable doubt’ rule
       • “An attack in the Middle East killed two Israelis.”
       • Both the attack and die events are tagged as occurring in the Middle East
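Under the ‘true beyond a reasonable doubt’ rule, both events in the example receive the place argument even though “in the Middle East” syntactically modifies only the attack. A hypothetical rendering, with role names other than place chosen for illustration:

```python
# "An attack in the Middle East killed two Israelis."
# Both the attack event and the die event get the same place argument.
events = [
    {"type": "conflict", "subtype": "attack", "trigger": "attack",
     "arguments": {"place": "the Middle East"}},
    {"type": "life", "subtype": "die", "trigger": "killed",
     "arguments": {"place": "the Middle East", "victim": "two Israelis"}},
]
```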
