
Dealing with Italian Temporal Expressions: the ITA-Chronos System

Matteo Negri, Fondazione Bruno Kessler - IRST, Trento, Italy (negri@itc.it). EVALITA 2007 - Evaluation of NLP Tools for Italian, Rome, Italy, September 10, 2007.


Presentation Transcript


  1. Dealing with Italian Temporal Expressions: the ITA-Chronos System
     Matteo Negri, Fondazione Bruno Kessler - IRST, Trento, Italy, negri@itc.it
     EVALITA 2007 - Evaluation of NLP Tools for Italian, Rome, Italy, September 10, 2007

  2. Outline
     • Chronos: a multilingual system for temporal expression (TE) recognition/normalization
     • System description
     • Some examples
     • Results at EVALITA 2007

  3. Chronos
     • Multilingual (ITA/ENG) tool for TE recognition and normalization according to the TIMEX2 standard
     • Approach
       • Rule-based system
       • ENG-Chronos: 1500 rules
       • ITA-Chronos: 981 rules
       • Six phases: Preprocessing, Detection, Bracketing, Information Gathering, Anchors Selection, Normalization
     • ENG-Chronos participated in TERN-04 with good results on the “Recognition+Normalization Task”
       • Ranked 2nd, with 76% TERN-Value (best system: 78%)

  4. ITA-Chronos: System Architecture
     [Diagram: ITA-Chronos architecture. Data labels: Plain Text, Intermediate Annotation, Tagged Text. Stage labels: Detection and Bracketing; Normalization. Components: Tokenization, POS Tagging, Multiwords Recognition; Detection (Basic Tagging Rules); Bracketing (Composition Rules); Information Gathering (Tagging Rules for: SET, Anchor_Dir, Anchor_Val, MOD, Type, T_Cat, Heur, Op, Quant, Val_Ext); Anchors Selection; Dates Normalization; Attributes Normalization.]

  5. STEP1: Preprocessing
     • The first phase of the process performs:
       • Tokenization
       • POS tagging
       • Multiwords recognition
     • The preprocessed input text is then passed to the TE detection phase, where around 400 tagging rules are in charge of finding all the TEs it contains.

  6. STEP2: Detection
     • Markable expressions are detected by looking for lexical triggers in the input text
       • “anno”, “oggi”, “Venerdì”, “Natale”, “quotidianamente”, “10/09/2007”, “1982”, etc.
     • Basic Tagging Rules
       • Regular expressions checking for word senses, parts of speech, symbols, or words satisfying specific predicates
     [Figure: tagging rule matching “Fra tre giorni” (“in three days”). Legend: “E” = preposition; “N” = numeral; TimeUnit-p is satisfied by “secondo”, “minuto”, “ora”, “giorno”, “settimana”, “mese”, etc.]
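
The kind of rule shown on this slide can be approximated in a few lines of Python. This is only a minimal sketch, not the actual Chronos rule format: the toy lexicons, the "TU" tag, and the function names `time_unit_p`, `coarse_tag`, and `detect` are all assumptions made for illustration, and a real system would rely on the POS tagger from the preprocessing step instead of word lists.

```python
import re

# Lemmas satisfying the TimeUnit-p predicate (list taken from the slide).
TIME_UNIT_LEMMAS = {"secondo", "minuto", "ora", "giorno", "settimana", "mese"}
# Hypothetical plural-to-singular table standing in for a lemmatizer.
PLURAL_TO_SINGULAR = {"secondi": "secondo", "minuti": "minuto", "ore": "ora",
                      "giorni": "giorno", "settimane": "settimana", "mesi": "mese"}
# Hypothetical toy lexicons standing in for POS tagger output.
PREPOSITIONS = {"fra", "tra", "in", "per"}
NUMERALS = {"un", "uno", "due", "tre", "quattro", "cinque"}


def time_unit_p(token):
    """Return True if the token is a time unit (singular or plural)."""
    word = token.lower()
    return PLURAL_TO_SINGULAR.get(word, word) in TIME_UNIT_LEMMAS


def coarse_tag(token):
    """Assign "E" (preposition), "N" (numeral), "TU" (time unit) or "O" (other)."""
    word = token.lower()
    if word in PREPOSITIONS:
        return "E"
    if word in NUMERALS or word.isdigit():
        return "N"
    if time_unit_p(word):
        return "TU"
    return "O"


def detect(text):
    """Return every three-token span matching preposition + numeral + time unit."""
    tokens = re.findall(r"\w+", text)
    tags = [coarse_tag(t) for t in tokens]
    return [" ".join(tokens[i:i + 3])
            for i in range(len(tokens) - 2)
            if tags[i:i + 3] == ["E", "N", "TU"]]
```

Calling `detect("Fra tre giorni parto per Roma.")` returns the single candidate `"Fra tre giorni"`. The sketch ignores ambiguity (for example, “secondo” can also be an ordinal or a preposition), which is one reason the slides mention word senses among the things a rule can check.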

  7. STEP3: Bracketing
     • Considers the context surrounding the detected triggers
       • “inizio”, “fine”, “prima”, “dopo”, “fa”, “successivo”, “precedente”, “durante”, “circa”, “almeno”, “3”, “sesto”, etc.
     • Composition rules
       • Handle conflicts between multiple possible taggings (e.g. when a recognized TE contains, overlaps, or is adjacent to one or more detected TEs)
     [Figure: composition rule for handling inclusions. For the input “Tutta la notte di sabato”, the candidate TEs “Tutta la notte”, “la notte”, “la notte di sabato”, and “sabato” are resolved into the single TE “Tutta la notte di sabato”.]
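
One simple way to picture a composition rule for inclusions is to discard every candidate span that lies entirely inside another candidate. The sketch below assumes candidates are represented as (start, end) token offsets with an exclusive end; that representation and the function name `resolve_inclusions` are illustrative choices, not taken from Chronos, and overlaps or adjacency would need further rules.

```python
def resolve_inclusions(spans):
    """Drop every span that is fully contained in another candidate span.

    Spans are (start, end) token offsets, end exclusive.
    """
    unique = sorted(set(spans))
    kept = []
    for span in unique:
        contained = any(
            other != span and other[0] <= span[0] and span[1] <= other[1]
            for other in unique
        )
        if not contained:
            kept.append(span)
    return kept


tokens = "Tutta la notte di sabato".split()
candidates = [
    (0, 3),  # Tutta la notte
    (1, 3),  # la notte
    (1, 5),  # la notte di sabato
    (4, 5),  # sabato
    (0, 5),  # Tutta la notte di sabato
]
surviving = [" ".join(tokens[a:b]) for a, b in resolve_inclusions(candidates)]
```

Here `surviving` contains only the widest expression, matching the outcome shown on the slide. Two disjoint candidates would both be kept, since neither includes the other.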

  8. STEP4: Information gathering
     • Goal: mine relevant information for normalization
     • Considers triggers and context to assign values to
       • TIMEX2 attributes (e.g. SET, MOD, ANCHOR_DIR)
       • TEMPORARY attributes (e.g. Type, T_Cat, Heur, Op, Quant)
     • This is done by running separate sets of specialized tagging rules
     • This information is stored in the Intermediate Annotation, which is the input to the normalization component

  9. Information Gathering: Example
     • TIMEX2 attributes
       • MOD: “più di”, “circa”, “oltre” …
       • SET: “ogni”, “tutti” …
       • ANCHOR_DIR: “prima”, “durante”, “dopo” …
     • TEMPORARY attributes
       • type: [T-ABS | T-REL]
       • t-cat: [second, minute, hour, day, …]
       • op: [=, +, -]
       • quant: [n≥0]
       • heur: [CR-DATE | PR-DATE]

  10-17. Information Gathering: Example (incremental build of one slide)
      Detected TE: “oltre tre anni dopo” (“more than three years later”)
      Attribute values are assigned one per step, using the trigger lists and value ranges of slide 9:
      • TIMEX2 attributes
        • MOD = MORE_THAN
        • ANCHOR_DIR = ENDING
      • TEMPORARY attributes
        • type = T-REL
        • t-cat = YEAR
        • op = +
        • quant = 3
        • heur = PR-DATE
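
The worked example above can be mimicked with plain lookup tables. This is a rough sketch under strong assumptions: the tables contain only the triggers needed for “oltre tre anni dopo”, the function name `gather` is hypothetical, and the rules for choosing `type` and `heur` are hard-coded guesses, because the slides do not say how Chronos decides them.

```python
# Trigger tables limited to the words of the worked example (values from the slides).
MOD_TRIGGERS = {"oltre": "MORE_THAN"}
ANCHOR_DIR_TRIGGERS = {"dopo": "ENDING"}
OP_TRIGGERS = {"dopo": "+"}
T_CAT_TRIGGERS = {"anno": "YEAR", "anni": "YEAR"}
QUANT_TRIGGERS = {"tre": 3}


def gather(te_text):
    """Collect TIMEX2 and temporary attributes for one detected TE."""
    attrs = {}
    for token in te_text.lower().split():
        if token in MOD_TRIGGERS:
            attrs["MOD"] = MOD_TRIGGERS[token]
        if token in ANCHOR_DIR_TRIGGERS:
            attrs["ANCHOR_DIR"] = ANCHOR_DIR_TRIGGERS[token]
        if token in OP_TRIGGERS:
            attrs["op"] = OP_TRIGGERS[token]
        if token in T_CAT_TRIGGERS:
            attrs["t-cat"] = T_CAT_TRIGGERS[token]
        if token in QUANT_TRIGGERS:
            attrs["quant"] = QUANT_TRIGGERS[token]
    # Assumption: an expression carrying an operator is relative (T-REL).
    if "op" in attrs:
        attrs["type"] = "T-REL"
        # Assumption: heur is fixed to PR-DATE, as in the slide's example.
        attrs["heur"] = "PR-DATE"
    else:
        attrs["type"] = "T-ABS"
    return attrs
```

For the example TE, `gather("oltre tre anni dopo")` yields the same seven attribute values listed on the slides. In the real system each attribute is filled by its own set of specialized tagging rules, which also look at the surrounding context and not only at single tokens.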

  18. Intermediate Annotation: Example
      Document: adige20041007_id413938
      Plain text: “…Così il 31 Luglio del 2002, quindi oltre tre anni dopo l’incidente, il giovane venne nuovamente ricoverato e sottoposto ad un intervento che si dimostrerà risolutivo…” (“…So on 31 July 2002, hence more than three years after the accident, the young man was hospitalized again and underwent an operation that would prove decisive…”)
      Intermediate annotation: …quindi <TIMEX2 MOD=“MORE_THAN” ANCHOR_DIR=“ENDING” type=“T-REL” t-cat=“YEAR” op=“+” quant=“3” heur=“PR-DATE”>oltre tre anni dopo</TIMEX2> l’incidente…
      [Diagram labels: Plain Text → Detection and Bracketing → Intermediate Annotation]

  19. STEP5: Anchors Selection
      • Goal: connect each detected T-REL to an appropriate anchor date
      • While the meaning of T-ABSs (“13 Marzo 2005”) is context-independent, T-RELs (“tre anni dopo”) can only be interpreted with respect to a reference TE
      • The “heur” attribute is used for this purpose
      • Two heuristics:
        • CR-DATE: connects a T-REL to the document’s creation date (found at the beginning of the document, or induced from the document’s name, e.g. “adige20041007_…”)
        • PR-DATE: connects a T-REL to the nearest detected TE with a compatible granularity (a “t-cat” with at least the same degree of specificity)
      [Figure: granularity compatibility for t-cat = “month”, shown against “month”, “week”, “day”, and “century”. By the rule above, “month”, “week”, and “day” qualify, while “century” is coarser and presumably does not.]
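
Both heuristics can be sketched compactly. Everything below is illustrative and not the Chronos implementation: the granularity ordering, the dictionary layout of a TE (`heur`, `t-cat`, `pos`, `val`), the reading of “nearest” as smallest token distance in either direction, and the function names are assumptions.

```python
import re
from datetime import date

# Assumed ordering from coarse to fine; a later index means more specific.
GRANULARITY = ["century", "year", "month", "week", "day", "hour", "minute", "second"]


def creation_date_from_name(doc_name):
    """CR-DATE fallback: induce the creation date from a YYYYMMDD run in the name."""
    match = re.search(r"(\d{4})(\d{2})(\d{2})", doc_name)
    if match is None:
        return None
    year, month, day = (int(g) for g in match.groups())
    return date(year, month, day)


def is_compatible(rel_cat, candidate_cat):
    """A candidate anchor must be at least as specific as the T-REL's t-cat."""
    return GRANULARITY.index(candidate_cat) >= GRANULARITY.index(rel_cat)


def select_anchor(rel, candidates, creation_date):
    """Return the anchor value for a T-REL according to its heur attribute."""
    if rel["heur"] == "CR-DATE":
        return creation_date
    usable = [c for c in candidates if is_compatible(rel["t-cat"], c["t-cat"])]
    if not usable:
        return None
    # Assumption: "nearest" means the smallest token distance in either direction.
    nearest = min(usable, key=lambda c: abs(c["pos"] - rel["pos"]))
    return nearest["val"]
```

With a T-REL of granularity "year" at position 10, a day-level candidate at position 4 is chosen even when a century-level candidate sits at position 9, because the coarser candidate is filtered out before distance is considered.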

  20. STEP6: Dates Normalization
      • Goal: fill the VAL attribute of each detected TE
      • T-ABSs: regular expressions considering their surface form (“1990s” → “199”)
      • T-RELs: rewriting rules considering
        • the anchor (e.g. “2002”)
        • the operator (“OP”) to be applied (e.g. “+”)
        • the quantity (“QUANT”) to be added/subtracted (e.g. “3”)
      • Example: “tre anni dopo” with anchor “2002”, operator “+”, and quantity “3” → 2005
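
The two normalization cases on this slide reduce to a pattern match and a small piece of arithmetic. The following sketch covers only decade expressions and year-level relative expressions; the function names are hypothetical and the real rewriting rules handle many more granularities.

```python
import re


def normalize_decade(surface):
    """T-ABS case: map a decade such as "1990s" to its TIMEX2-style VAL "199"."""
    match = re.fullmatch(r"(\d{3})0s", surface)
    return match.group(1) if match else None


def normalize_relative_year(anchor_val, op, quant):
    """T-REL case: apply the operator and quantity to the year of the anchor."""
    year = int(anchor_val[:4])
    if op == "+":
        return str(year + quant)
    if op == "-":
        return str(year - quant)
    return str(year)  # op "=": same year as the anchor
```

For the slide's example, `normalize_relative_year("2002", "+", 3)` gives `"2005"`, and `normalize_decade("1990s")` gives `"199"`.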

  21. ITA-Chronos at EVALITA 2007
      • Results over the EVALITA-07 test set (27’15’’ computation time, ~50 words/sec)
      • Higher scores on the MOD and SET attributes
        • These are activated by triggers that are easy to identify
      • Lower scores on ANCHOR_VAL and ANCHOR_DIR
        • These require the analysis of a larger context, e.g. including verb tense

  22. Web Demo: http://www.qallme.itc.it/server/chronos/italian
