
ACE Annotation


Presentation Transcript


  1. ACE Annotation Ralph Grishman New York University

  2. ACE (Automatic Content Extraction)
     • Government evaluation task for information extraction
     • 6 evaluations since 2000
       • next one Nov. 2005
     • incremental increases in task complexity
     • (Current) criteria for what to annotate:
       • interest to Government sponsors
       • good inter-annotator agreement
       • reasonable density of annotations
     • initially for news, now for a wider range of genres (trade-off between coverage and agreement)

  3. Types of Annotations
     • Entities
     • Relations
     • Events
     • Inter-annotator agreement measured by the ‘value’ metric
       • roughly 1.00 - % missing - % spurious
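To make the agreement score concrete, here is a minimal Python sketch of the ‘value’-style computation named on the slide, assuming simple unweighted counts of missing and spurious annotations; the function name is hypothetical, and the official ACE value metric additionally applies per-type weights and partial credit, which are omitted here.

```python
def annotation_value(n_reference, n_missing, n_spurious):
    """Rough sketch of the 'value'-style agreement score: start from 1.0
    and subtract the fractions of missing and spurious annotations
    relative to the reference set. (Hypothetical helper; the official
    ACE value metric also uses type weights and partial credit.)"""
    return 1.0 - n_missing / n_reference - n_spurious / n_reference

# Example: 100 reference annotations, 5 missed and 5 spurious -> 0.90
print(annotation_value(100, 5, 5))
```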

  4. Entities
     • Objects of the discourse
     • (Semantic) Types:
       • persons, organizations, geo-political entities, [non-political] locations, facilities, vehicles, weapons
     • Two levels of annotation:
       • mentions (individual names, nominals, pronouns)
       • entities (sets of coreferring mentions)
     • Inter-annotator agreement around 0.90
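As an informal illustration of the two annotation levels, the sketch below models a mention and an entity as plain Python dataclasses; the class and field names are invented for this example and do not follow the actual ACE file format.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class Mention:
    text: str              # surface string as it appears in the document
    level: str             # "name", "nominal", or "pronoun"
    span: Tuple[int, int]  # character offsets in the source text

@dataclass
class Entity:
    entity_type: str       # e.g. "person", "organization", "facility"
    mentions: List[Mention] = field(default_factory=list)  # coreferring mentions

# "Ralph Grishman ... the professor ... he" -> one entity, three mentions
grishman = Entity("person", [
    Mention("Ralph Grishman", "name", (0, 14)),
    Mention("the professor", "nominal", (30, 43)),
    Mention("he", "pronoun", (60, 62)),
])
```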

  5. Relations
     • Binary, generally static relationships between entities
     • Main types:
       • physical (location), part-whole, personal-social, org-affiliation, gen-affiliation, and agent-artifact
     • Example: the CEO of Microsoft (org-affiliation)
     • Inter-annotator agreement (given entities) around 0.75 - 0.80
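A relation annotation can be pictured as a typed link between two entities. The dictionaries below are a hypothetical, schema-free rendering of the slide's example, not the ACE file format.

```python
# Entities from the phrase "the CEO of Microsoft"
ceo = {"type": "person", "mentions": ["the CEO"]}
microsoft = {"type": "organization", "mentions": ["Microsoft"]}

# The binary, static relation linking them
relation = {
    "type": "org-affiliation",  # one of the main types listed above
    "arg1": ceo,                # the affiliated person
    "arg2": microsoft,          # the organization
}
```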

  6. Events
     • New for 2005
     • Types:
       • life (born/marry/die), movement, transaction, business (start/end), personnel (hire/fire), conflict (attack), contact (meet), justice
     • Example: China purchased two subs from Russia in 1998.
       • transfer-ownership: buyer = China, trigger = purchased, artifact = two subs, seller = Russia, time = 1998
     • Inter-annotator agreement (given entities) around 0.55 - 0.60
       • some events (born, hire/fire, justice) fairly clear-cut
       • others (attack, meet, move) hard to delimit
       • coreference sometimes hard
     • No causal / subevent linkage -- too hard (maybe in 2006?)
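For the transfer-ownership example, an event annotation ties a trigger word to a set of typed arguments. The sketch below is an informal rendering of that structure; the key names are illustrative only.

```python
# "China purchased two subs from Russia in 1998."
event = {
    "type": "transaction",
    "subtype": "transfer-ownership",
    "trigger": "purchased",        # word that anchors the event in the text
    "arguments": {
        "buyer": "China",
        "artifact": "two subs",
        "seller": "Russia",
        "time": "1998",
    },
}
```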

  7. Corpora
     • Genres
       • newswire and broadcast news
       • adding weblogs, conversational telephone, talk shows, usenet this year
     • Multi-lingual
       • English, Chinese, Arabic (since 2003)
     • Volume
       • 2004 set: 140 KW (thousand words) training, 50 KW test per language
     • Distributed by LDC

  8. A (Nearly) Semantic Annotation
     • Annotation criteria primarily truth-conditional, not linguistic
       • although annotations are linked back to text
         • e.g., event triggers
       • and some constraints are included to improve inter-annotator agreement
         • e.g., event arguments must be in the same sentence as the trigger
     • Event arguments are filled in using the ‘true beyond a reasonable doubt’ rule
       • “An attack in the Middle East killed two Israelis.”
       • Both the attack and die events are tagged as occurring in the Middle East
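Under the ‘true beyond a reasonable doubt’ rule, both events in the example receive the place argument even though “in the Middle East” syntactically modifies only the attack. A hypothetical rendering, with role names other than place chosen for illustration:

```python
# "An attack in the Middle East killed two Israelis."
# Both the attack event and the die event get the same place argument.
events = [
    {"type": "conflict", "subtype": "attack", "trigger": "attack",
     "arguments": {"place": "the Middle East"}},
    {"type": "life", "subtype": "die", "trigger": "killed",
     "arguments": {"place": "the Middle East", "victim": "two Israelis"}},
]
```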
