
Temporal Information Extraction

Temporal Information Extraction. Inderjeet Mani imani@mitre.org. Outline. Introduction Linguistic Theories AI Theories Annotation Schemes Rule-based and machine-learning methods. Challenges Links. Motivation: Question-Answering. When is Ramadan this year?


Presentation Transcript


  1. Temporal Information Extraction Inderjeet Mani imani@mitre.org

  2. Outline • Introduction • Linguistic Theories • AI Theories • Annotation Schemes • Rule-based and machine-learning methods. • Challenges • Links

  3. Motivation: Question-Answering • When is Ramadan this year? • What was the largest U.S. military operation since Vietnam? • Tell me the best time of the year to go cherry-picking. • How often do you feed a pet gerbil? • Is Gates currently CEO of Microsoft? • Did the Enron merger with Dynegy take place? • How long did the hostage situation in Beirut last? • What is the current unemployment rate? • How many Iraqi civilian casualties were there in the first week of the U.S. invasion of Iraq? • Who was Secretary of Defense during the Gulf War?

  4. Motivation: Coherent and Faithful Summaries • Single-document sentence-extraction summarizers are plagued by dangling references, especially temporal ones • ..worked in recent summers.. • ..was the source of the virus last week.. • …where Morris was a computer science undergraduate until June.. • …..whose virus program three years ago disrupted… • Multi-document summarizers can be misled by the weakness of vocabulary-overlap methods, which leads to inappropriate merging of distinct events

  5. An Example Story • Feb. 18, 2004: Yesterday Holly was running a marathon when she twisted her ankle. David had pushed her. • 1. When did the running occur? Yesterday. • 2. When did the twisting occur? Yesterday, during the running. • 3. Did the pushing occur before the twisting? Yes. • 4. Did Holly keep running after twisting her ankle? Probably not. • [Timeline diagram; labels: 02172004, 02182004, run, twist ankle, push; relations: before, during, finishes]

  6. Temporal Information Extraction Problem • Input: A natural language discourse • Output: representation of events and their temporal relations • Example: Feb. 18, 2004: Yesterday Holly was running a marathon when she twisted her ankle. David had pushed her.

  7. IE Methodology • [Workflow diagram; boxes: Raw Corpus, Initial Tagger, Annotation Editor, Annotation Guidelines, Annotated Corpus, Machine Learning Program, Learned Rules, Rule Apply, Raw Corpus, Annotated Corpus] • Idea for Temporal IE: • Make progress by focusing on a particular top-down slice (i.e., time), using its rich structure

  8. [Diagram; labels: Events, Time, Language, Theories, AI & logic, Formal Linguistics]

  9. Linguistic Theories • Events • Event Structure (event subclasses and parts) • Tense (indicates location of event in time, via verb inflections, modals, auxiliaries, etc.) • Grammatical Aspect (indicates whether event is ongoing, finished, completed) • Time Adverbials • Relations between events and/or times • temporal relations • we will also need discourse relations

  10. Tense • All languages that have tense (in the semantic sense of locating events in time) can express location in time • Location can be expressed relative to a deictic center that is the current ‘moment’ of speech, or ‘speech time’, or ‘speech point’ • e.g., tomorrow, yesterday, etc. • Languages can also express temporal locations relative to a coordinate system • a calendar, e.g., 1991 (A.D.), • a cyclically occurring event, e.g., morning, spring, • an arbitrary event, e.g., the day after he married her. • A language may have tense in the above semantic sense, without expressing it using tense morphemes • Instead, aspectual morphemes and/or modals and auxiliaries may be used.

  11. Mandarin Chinese • Has semantic tense • Lacks tense morphemes • Instead, it uses ‘aspect’ markers to indicate whether an event is ongoing (-zhai, -le), completed (-wan), terminated (-le, -guo), or in a result state (-zhe) • But aspect markers are often absent 我 看 电视 wo kan dianshi* I watch / will watch / watched TV *Example from Congmin Min, MS Thesis, Georgetown, 2005.

  12. Burmese* • No semantic tense, but languages that lack semantic tense all have a realis/irrealis distinction. • Events that are ongoing or that were observed in the past are expressed by sentence-final realis particles –te, -tha, -ta, and –hta. • For unreal or hypothetical events (including future, present, and hypothetical past events), the sentence-final irrealis particles –me, -ma, and –hma are used. *Comrie, B. Tense. Cambridge, 1985.

  13. Tense as Anaphor: Reichenbach • A formal method for representing tense, based on which one can locate events in time • Tensed utterances introduce references to 3 ‘time points’ • Speech Time: S • Event Time: E • Reference Time: R • Example: I had [mailed the letter]E [when John came & told me the news]R, giving E < R < S • Three temporal relations are defined on these time points • at, before, after • 13 different relations are possible • N.B. the concept of ‘time point’ is an abstraction; it can map to an interval • [Timeline diagram: E, R, S along a time axis]
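
The count of 13 on the slide above can be checked by brute force. This is a minimal sketch, assuming each "relation" is one way of arranging E, R and S on a line using only "before" (<) and "at" (=); the function name `weak_orderings` is made up for illustration.

```python
from itertools import product

def weak_orderings(points=("E", "R", "S")):
    """Return every distinct arrangement of the points using '<' and '='."""
    seen = set()
    for ranks in product(range(len(points)), repeat=len(points)):
        # Normalise ranks so that (0, 2, 2) and (0, 1, 1) count as one ordering.
        levels = sorted(set(ranks))
        seen.add(tuple(levels.index(r) for r in ranks))
    result = []
    for canon in sorted(seen):
        groups = {}
        for name, level in zip(points, canon):
            groups.setdefault(level, []).append(name)
        # Points sharing a level are simultaneous; levels are ordered by '<'.
        result.append(" < ".join("=".join(groups[k]) for k in sorted(groups)))
    return result

orderings = weak_orderings()
for o in orderings:
    print(o)
```

The list includes "E < R < S", the arrangement given for "I had mailed the letter".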

  14. Reichenbachian Tense Analysis • Tense is determined by the relation between R and S: R=S, R<S, R>S • Aspect is determined by the relation between E and R: E=R, E<R, E>R • Relation of E relative to S not crucial • Represent R<S=E as E>R<S • Only 7 out of 13 relations are realized in English • 6 different forms, simple future being ambiguous • Progressive no different from simple tenses • But I was eating a peach > I ate a peach • [Table residue: E>R<S, E<R>S]
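
The two comparisons on this slide (R against S for tense, E against R for aspect) can be sketched as a small lookup. The labels "anterior", "simple" and "posterior" follow Reichenbach's usual terminology and do not appear on the slide; the function name and the numeric encoding of time points are assumptions made for illustration.

```python
def reichenbach(e, r, s):
    """Classify a tensed clause from numeric positions of E, R and S."""
    def rel(a, b):
        return "<" if a < b else ">" if a > b else "="

    # Tense depends only on where R sits relative to S.
    tense = {"<": "past", "=": "present", ">": "future"}[rel(r, s)]
    # Aspect depends only on where E sits relative to R.
    aspect = {"<": "anterior", "=": "simple", ">": "posterior"}[rel(e, r)]
    return tense, aspect

# "I had mailed the letter when John came": E < R < S
print(reichenbach(e=1, r=2, s=3))
```

Note that E is never compared with S directly, which matches the slide's remark that the relation of E to S is not crucial.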

  15. G: It is always going to be the case that .H: It always has been the case that .F: It will be at some point in the future be the case that . P: It was at some point in the past the case that .F = ¬G¬ P = ¬H¬ System Kt:(a)  H F  : What is, has always been going to be;(b  G P  : What is, will always have been;(c) H()  (H H): Whatever always follows from what always has been, always has been;(d) G()  (G G): Whatever always follows from what always will be, always will be. Priorean Tense Logic

  16. Tense as Operator: Prior • Free iteration captures many more tenses • I would have slept: PFP • But also expresses many non-NL tenses • PPPP: [It was the case]^4 John had slept

  17. Event Classes (Lexical Aspect) • ACCOMPLISHMENTS: build, cook, destroy • culminate (telic) • X is Ving does not entail that X has Ved. • John booked a flight in an hour, John stopped building a house • ACHIEVEMENTS: notice, win, blink, find, reach • instantaneous accomplishments • *John dies for an hour, *John wins for an hour, *John stopped reaching New York • STATIVES: know, sit, be clever, be happy, killing, accident • can refer to state itself (ingressive) John knows, or to entry into a state (inceptive) John realizes • *John is knowing Bill, *Know the answer, *What John did was know the answer • ACTIVITIES: walk, run, talk, march, paint • if it occurs in period t, a part of it (the same activity) must occur for most sub-periods of t • X is Ving entails that X has Ved • John ran for an hour, *John ran in an hour

  18. Aspectual Composition • Expressions of one class can be transformed into one of another class by combining with another expression. • e.g., an activity can be changed into an accomplishment by adding an adverbial phrase expressing temporal or spatial extent • I walked (activity) • I walked to the station / a mile / home (accomplishment) • I built my house (accomplishment). • I built my house for an hour (activity). • Moens & Steedman (1988) – implement aspectual composition in a transition network
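
The class shifts on this slide can be sketched as a lookup table. This is only an illustration of the idea and not the Moens & Steedman network itself; the modifier labels `extent` and `for-duration` and the function name `compose` are hypothetical.

```python
# Hypothetical transition table: (current class, modifier kind) -> resulting class.
TRANSITIONS = {
    ("activity", "extent"): "accomplishment",        # I walked -> I walked a mile
    ("accomplishment", "for-duration"): "activity",  # I built my house -> ... for an hour
}

def compose(event_class, modifiers):
    """Apply each modifier in turn; unknown combinations leave the class unchanged."""
    for modifier in modifiers:
        event_class = TRANSITIONS.get((event_class, modifier), event_class)
    return event_class

print(compose("activity", ["extent"]))
print(compose("accomplishment", ["for-duration"]))
```

A fuller network would also cover achievements and states, which this sketch leaves unchanged.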

  19. Example: Classifying Question Verbs • Androutsopoulos’s (2002) NLITDB system allows users to pose temporal questions in English to an airport database that uses a temporal extension of SQL • Verbs in single-clause questions with non-future meanings are treated as states • Does any tank contain oil? • Some verbs may be ambiguous between a (habitual) state and an accomplishment • Which flight lands on runway 2? • Does flight BA737 land on runway 2 this afternoon? • Activities are distinguished using the imperfective paradox: • Were any flights taxiing? implies that they taxied • Were any flights taxiing to gate 2? does not imply that they taxied. • So, taxi will be given • an activity verb sense, one that doesn’t expect a destination argument, and • an accomplishment verb sense, one that expects a destination argument.

  20. Grammatical Aspect • Perfective – focus on situation as a whole • John built a house • Imperfective – focus on internal phases of situation • John was building a house • [Diagram labels: built.a.h, was building.a.h]

  21. Inferring Temporal Relations • Yesterday Holly was running a marathon when she twisted her ankle. [FINISHES] David had pushed her. [BEFORE] • I had mailed the letter when John came & told me the news. [AFTER] • Simpson made the call at 3. Later, he was spotted driving towards Westwood. [AFTER] • Max entered the room. Mary stood up/was seated on the desk. [AFTER/OVERLAP] • Max stood up. John greeted him. [AFTER] • Max fell. John pushed him. [BEFORE] • Boutros-Ghali Sunday opened a meeting in Nairobi of .... He arrived in Nairobi from South Africa. [BEFORE] • John bought Mary some flowers. He picked out three red roses. [DURING]

  22. Linguistic Information Needed for Temporal IE • Events • Tense • Aspect • Time adverbials • Explicit temporal signals (before, since, at, etc.) • Discourse Modeling • For disambiguation of time expressions based on context • For tracking sequences of events (tense/aspect shifts) • For computing Discourse Relations • Commonsense Knowledge • For inferring Discourse Relations • For inferring event durations

  23. Narrative Ordering • Temporal Discourse Interpretation Principle (Dowty 1979) • Reference time for the current sentence is a time consistent with its time adverbials if any, or else it immediately follows the reference time of the previous sentence. • The overlap of statives is a pragmatic inference (hinting at a theory of defaults) • A man entered the White Hart. He was wearing a black jacket. Bill served him a beer. • Discourse Representation Theory (Kamp and Reyle 1993) • In successive past tense sentences which lack temporal adverbials, events advance the narrative forward, while states do not. • Overlapping statives come out of semantic inference rules • Neither theory explicitly represents discourse relations, though they are needed (e.g., 6-8 above)

  24. Discourse Representation Theory (example) • A man entered the White Hart. He was wearing a black jacket. Bill served him a beer. • Rpt := {} • e1, t1, x, y: enter(e1, x, y), man(x), y = theWhiteHart, $t_1 < n$, $e_1 \subseteq t_1$ • Rpt := e1 • e2, t2, x1, y1: PROG(wear(e2, x1, y1)), black-jacket(y1), x1 = x, $t_2 < n$, $e_2 \circ t_2$, $e_1 \subseteq e_2$ • e3, t3, x2, y2, z: serve(e3, x2, y2, z), beer(z), x2 = Bill, y2 = x, $t_3 < n$, $e_3 \subseteq t_3$ • Rpt := e3 • $e_1 < e_3$
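
The default behind the White Hart example (events move the reference point forward, states overlap it) can be sketched in a few lines. This is a simplification for illustration only: it ignores time adverbials and discourse relations, and the clause labels and function name are made up.

```python
def narrative_order(clauses):
    """clauses: list of (label, kind) pairs, with kind 'event' or 'state'."""
    relations = []
    reference = None  # label of the event currently serving as reference point
    for label, kind in clauses:
        if kind == "event":
            if reference is not None:
                relations.append((reference, "BEFORE", label))
            reference = label  # events advance the reference point
        elif reference is not None:
            # States do not advance the narrative; they overlap the reference event.
            relations.append((label, "OVERLAP", reference))
    return relations

story = [("enter", "event"), ("wear", "state"), ("serve", "event")]
print(narrative_order(story))
```

A default of this kind gets overridden in cases such as "Max fell. John pushed him.", which is why discourse relations are still needed.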

  25. Overriding Defaults • Lascarides and Asher (1993)*: temporal ordering is derived entirely from discourse relations (that link together DRS’s, based on SDRT formalism). • Example • Max switched off the light. The room was pitch dark. • Default inference: OVERLAP • Use an inference rule that if the room is dark and the light was just switched off, the switching off caused the room to become dark. • Inference: AFTER • Problem: requires large doses of world knowledge *L&P 1993

  26. Outline • Introduction • Linguistic Theories • AI Theories • Annotation Schemes • Rule-based and machine-learning methods. • Challenges • Links

  27. Time and Events in Logic • [Diagram relating Events, Time, Instants, and Intervals]

  28. Instant Ontology • Consider the event of John’s reading the book • Decompose into an infinite set of infinitesimal instants • Let T be a set of temporal instants. • Let < (BEFORE) be a temporal ordering relation between instants • Properties: irreflexive, antisymmetric, transitive, and complete • Antisymmetric => time has only one direction of movement • Irreflexive and Transitive => time is non-cyclical • Complete => < is a total ordering
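
The four properties listed for < can be tested directly on a small finite set of instants. This is a minimal sketch; the function name is invented, and "antisymmetric" is read here in the slide's sense that a < b and b < a never both hold.

```python
from itertools import product

def is_strict_total_order(elements, before):
    """Check irreflexivity, antisymmetry, transitivity and completeness of `before`."""
    elements = list(elements)
    irreflexive = all(not before(a, a) for a in elements)
    antisymmetric = all(not (before(a, b) and before(b, a))
                        for a, b in product(elements, repeat=2))
    transitive = all(before(a, c)
                     for a, b, c in product(elements, repeat=3)
                     if before(a, b) and before(b, c))
    complete = all(before(a, b) or before(b, a)
                   for a, b in product(elements, repeat=2) if a != b)
    return irreflexive and antisymmetric and transitive and complete

print(is_strict_total_order(range(5), lambda a, b: a < b))             # linear time
print(is_strict_total_order(range(3), lambda a, b: (a + 1) % 3 == b))  # cyclic "time"
```

The second call uses a made-up cyclic relation, which fails transitivity and so is rejected.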

  29. Instants -- Problem Where Truth Values Change • P = The race is on • T-R = the time of running the race • T-AR = the time after running the race • T-R and T-AR have to meet somewhere • If we choose instants, there is some instant x where T-R and T-AR meet • Either we have P and not P both true at x, or there is a truth value gap at x • This is called the Divided Instant Problem (D.I.P.) • [Diagram: T-R (P) and T-AR (not P) meeting at instant x, marked ?]

  30. Ordering Relations on Intervals • Unlike instants, where we have only <, we can have at least 3 ordering relations on intervals • Precedence $<$: $I_1 < I_2$ iff $\forall t_1 \in I_1, \forall t_2 \in I_2,\ t_1 < t_2$ (where $<$ on the right is defined over instants) • Temporal Overlap $O$: $I_1 \mathrel{O} I_2$ iff $I_1 \cap I_2 \neq \emptyset$ • Temporal Inclusion $\sqsubseteq$: $I_1 \sqsubseteq I_2$ iff $I_1 \subseteq I_2$
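
These three relations translate almost literally into set operations. This is a small sketch, assuming an interval is modelled as a finite set of integer instants; the intervals for running, twisting and pushing are invented numbers that echo the example story.

```python
def precedes(i1, i2):
    """I1 < I2: every instant of I1 is before every instant of I2."""
    return all(t1 < t2 for t1 in i1 for t2 in i2)

def overlaps(i1, i2):
    """I1 O I2: the intervals share at least one instant."""
    return bool(i1 & i2)

def included(i1, i2):
    """I1 is included in I2: every instant of I1 is also in I2."""
    return i1 <= i2

# Illustrative instants only: pushing, then running, with the twist at the end of the run.
push = frozenset({1, 2})
run = frozenset(range(3, 10))
twist = frozenset({9})

print(precedes(push, twist), overlaps(run, twist), included(twist, run))
```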

  31. Instants versus Intervals • Instants • We understand the idea of truth at an instant • In cases of continuous change, e.g., a tossed ball, we need a notion of a durationless event in order to explain the trajectory of the ball just before it falls • Intervals • We often conceive of time as broken up in terms of events which have a certain duration, rather than as an (infinite) sequence of durationless instants. • Many verbs do not describe instantaneous events, e.g., has read, ripened • Duration expressions like yesterday afternoon aren’t construed as instants

  32. Allen’s Interval-Based Ontology* • Instants are banished • So, avoids the divided instant problem • Short duration intervals will be instant-like • Uses 13 relations • Relations are mutually exclusive • All 13 relations can be expressed using meet: • $\forall X\, \forall Y\, [\mathrm{Before}(X, Y) \leftrightarrow \exists Z\, [\mathrm{meet}(X, Z) \land \mathrm{meet}(Z, Y)]]$ *James F. Allen, ‘Towards a General Theory of Action and Time’, Artificial Intelligence 23 (1984): 123–54.

  33. Allen’s 13 Temporal Relations • A is EQUAL to B (=) • A is BEFORE B; B is AFTER A (<, >) • A MEETS B; B is MET by A (m, mi) • A OVERLAPS B; B is OVERLAPPED by A (o, oi) • A STARTS B; B is STARTED by A (s, si) • A FINISHES B; B is FINISHED by A (f, fi) • A DURING B; B CONTAINS A (d, di)
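
A pair of intervals can be mapped to its relation by comparing endpoints. This is a minimal sketch, assuming each interval is a (start, end) pair of numbers with start < end; the function name `allen` is invented, and the return values reuse the abbreviations <, >, m, mi, o, oi, s, si, f, fi, d, di plus "=" for EQUAL.

```python
def allen(a, b):
    """Return the Allen relation of interval a to interval b (both with start < end)."""
    (a1, a2), (b1, b2) = a, b
    if a2 < b1:
        return "<"
    if b2 < a1:
        return ">"
    if a2 == b1:
        return "m"
    if b2 == a1:
        return "mi"
    if a1 == b1 and a2 == b2:
        return "="
    if a1 == b1:
        return "s" if a2 < b2 else "si"
    if a2 == b2:
        return "f" if a1 > b1 else "fi"
    if b1 < a1 and a2 < b2:
        return "d"
    if a1 < b1 and b2 < a2:
        return "di"
    # Remaining cases: the intervals partially overlap.
    return "o" if a1 < b1 else "oi"

# Illustrative numbers: the ankle twist finishes the run; the push is before the twist.
print(allen((9, 10), (0, 10)))
print(allen((-2, -1), (9, 10)))
```

Because the branches are checked in sequence and each pair falls into exactly one, the sketch reflects the point that the relations are mutually exclusive.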

  34. Temporal Closure: Sputlink* in TANGO *Verhagen (2005)

  35. AI Reasoning about Events • John gave a book to Mary • Situation Calculus • Holds(Have(John, book), t1) • Holds(Have(Mary, book), t2) • Holds(Have(Z, Y), Result(give(X, Y, Z), t)) • t_i are states • Concurrent actions cannot be represented • No duration of actions or delayed effects • Event Calculus • HoldsAt(Have(J, B), t1) • HoldsAt(Have(M, B), t2) • Terminates(e1, Have(J, B)) • Initiates(e1, Have(M, B)) • Happens(e, t) [t is a time point] • Involves non-monotonic reasoning • Handles frame problem using circumscription
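
The Event Calculus predicates for "John gave a book to Mary" can be mimicked with a small forward simulation. This is a rough sketch under strong assumptions: integer time points, one event at an invented time 5, and a plain loop standing in for circumscription, so it illustrates only persistence of fluents and not the full formalism.

```python
# Assumed toy narrative: the giving event happens at time point 5.
HAPPENS = [("give", 5)]                       # Happens(e, t)
INITIATES = {"give": {"Have(Mary, book)"}}    # Initiates(e, fluent)
TERMINATES = {"give": {"Have(John, book)"}}   # Terminates(e, fluent)
INITIALLY = {"Have(John, book)"}

def holds_at(fluent, t):
    """A fluent persists until some strictly earlier event initiates or terminates it."""
    holds = fluent in INITIALLY
    for event, when in sorted(HAPPENS, key=lambda h: h[1]):
        if when >= t:
            break  # only events before t can affect the fluent at t
        if fluent in INITIATES.get(event, ()):
            holds = True
        if fluent in TERMINATES.get(event, ()):
            holds = False
    return holds

print(holds_at("Have(John, book)", 3), holds_at("Have(Mary, book)", 6))
```

Fluents untouched by any event keep their initial value, which is the commonsense law of inertia in miniature.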

  36. Temporal Question-Answering using IE + Event Calculus • Mueller (2004)*: Takes instantiated MUC terrorist event templates and represents information in EC • Adds commonsense knowledge about terrorist domain • e.g., if a bomb explodes, it’s no longer activated • Commonsense knowledge includes frame axioms • e.g., if an object starts falling, then its height will be released from the commonsense law of inertia • Example temporal questions • Was the car dealership damaged before the high-power bombs exploded? Ans: No. • Requires reasoning that the damage did not occur at all times t prior to the explosion • Problem: requires large doses of world knowledge *Mueller, Erik T. (2004). Understanding script-based stories using commonsense reasoning. Cognitive Systems Research, 5(4), 307-340.

  37. Temporal Question Answering using IE + Temporal Databases • In NLITDB, semantic relation between a question event and the adverbial it combines with is inferred by a variety of inference rules. • State + ‘point’ adverbial • Which flight was queueing for runway 2 at 5:00 pm?: • state coerced to an achievement, viewed as holding at the time specified by the adverbial. • Activity + point adverbial • can mean that the activity holds at that time, or that the activity starts at that time, e.g., Which flight queued for runway 2 at 5:00 pm? • An accomplishment may indicate inception or termination • Which flight taxied to gate 4 at 5:00 pm? can mean the taxiing starts or ends at 5 pm.

  38. Outline • Introduction • Linguistic Theories • AI Theories • Annotation Schemes • Rule-based and machine-learning methods. • Challenges • Links

  39. IE Methodology • [Workflow diagram; boxes: Raw Corpus, Initial Tagger, Annotation Editor, Annotation Guidelines, Annotated Corpus, Machine Learning Program, Learned Rules, Rule Apply, Raw Corpus, Annotated Corpus]

  40. Events in NLP • Topic: well-defined subject for searching • document- or collection-level • Template: structure with slots for participant named entities • document-level • Mention: linguistic expression that expresses an underlying event • phrase-level (verb/noun)

  41. Event Characteristics • Can have temporal a/o spatial locations • Can have types • assassinations, bombings, joint ventures, etc. • Can have members • Can have parts • Can have people a/o other objects as participants • Can be hypothetical • Can have not happened

  42. MUC Event Templates Wall Street Journal, 06/15/88 MAXICARE HEALTH PLANS INC and UNIVERSAL HEALTH SERVICES INC have dissolved a joint venture which provided health services.

  43. ACE Event Templates • Four additional attributes for each event mention • Polarity (it did or did not occur) • Tense (past, present, future) • Modality (real vs. hypothetical) • Genericity (specific vs. generic) • Argument slots (4 -7) specific to each event • E.g., Trial-Hearing event has slots for the Defendant, Prosecutor, Adjudicator, Crime, Time, and Place. From Lisa Ferro @MITRE

  44. Mention-Level Events • Event expressions: • tensed verbs: has left, was captured, will resign • stative adjectives: sunken, stalled, on board • event nominals: merger, Military Operation, war • Dependencies between events and times: • Anchoring: John left on Monday. • Orderings: The party happened after midnight. • Embedding: John said Mary left.

  45. TIMEX2 (TIDES/ACE) Annotation Scheme • Time Points: <TIMEX2 VAL="2000-W42">the third week of October</TIMEX2> • Durations: <TIMEX2 VAL="PT30M">half an hour long</TIMEX2> • Indexicality: <TIMEX2 VAL="2000-10-04">tomorrow</TIMEX2> • He wrapped up a <TIMEX2 VAL="PT3H" ANCHOR_DIR="WITHIN" ANCHOR_VAL="1999-07-15">three-hour</TIMEX2> meeting with the Iraqi president in Baghdad <TIMEX2 VAL="1999-07-15">today</TIMEX2>. • Sets: <TIMEX2 VAL="XXXX-WXX-2" SET="YES" PERIODICITY="F1W" GRANULARITY="G1D">every Tuesday</TIMEX2> • Fuzziness: <TIMEX2 VAL="1990-SU">Summer of 1990</TIMEX2> • <TIMEX2 VAL="1999-07-15TMO">This morning</TIMEX2> • <TIMEX2 VAL="2000-10-31TNI" MOD="START">early last night</TIMEX2>
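
The indexical case (today, tomorrow) amounts to date arithmetic against the document date. This is a minimal sketch, assuming a three-word lexicon and day-level granularity; `OFFSETS` and `tag_indexical` are invented names and not part of any TIMEX2 tool.

```python
from datetime import date, timedelta

# Hypothetical mini-lexicon of deictic day expressions and their offsets in days.
OFFSETS = {"yesterday": -1, "today": 0, "tomorrow": 1}

def tag_indexical(expression, doc_date):
    """Wrap a deictic day expression in a TIMEX2 tag whose VAL is an ISO-style date."""
    offset = OFFSETS[expression.lower()]
    val = (doc_date + timedelta(days=offset)).isoformat()
    return f'<TIMEX2 VAL="{val}">{expression}</TIMEX2>'

# Document date taken from the example story: Feb. 18, 2004.
print(tag_indexical("Yesterday", date(2004, 2, 18)))
```

Expressions such as "every Tuesday" or "early last night" need the SET, PERIODICITY and MOD attributes shown on the slide and are outside this sketch.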

  46. TIMEX2 Inter-annotator Agreement • Georgetown/MITRE (2001) • 193 English docs, .79 F Extent, .86 F VAL • 5 annotators • Annotators deviate from guidelines, and produce systematic errors (fatigue?) • several years ago: PXY instead of PAST_REF • all day: P1D instead of YYYY-MM-DD • LDC (2004) • 49 English docs, .85 F Extent, .80 F VAL • 19 Chinese docs, .83 Extent • 2 annotators

  47. Example of Annotator Difficulties (TERN 2004*) • *Time Expression Recognition and Normalization Competition (timex2.mitre.org)

  48. TIMEX2 – A Mature Standard • Extensively debugged • Detailed guidelines for English and Chinese • Evaluated for English, Arabic, Chinese, Korean, Spanish, French, Swedish, and Hindi • Applied to news, scheduling dialogues, other types of data • Corpora available through ACE, MITRE

  49. Temporal Relations in ACE • Restricted to verbal events (verbs of scheduling, occurrence, aspect etc.) • The event and the timex must be in the same sentence • Eight temporal relations • Within: The bombing occurred [during] the night. • Holds: They were meeting [all] night. • Starting, Ending: The talks [ended (on)] Monday. • Before, After: The initial briefs have to be filed [by] 4 p.m. Tuesday • At-Beginning, At-End: Sharon met with Bill [at the start] of the three-day conference From Lisa Ferro @MITRE
