
A Word at a Time


Presentation Transcript


  1. A Word at a Time: Computing Word Relatedness using Temporal Semantic Analysis. Kira Radinsky, Eugene Agichtein, Evgeniy Gabrilovich, Shaul Markovitch

  2. Introduction
  • A rich source of information can be revealed by studying the patterns of word occurrence over time. Example: “peace” and “war”.
  • Corpus: New York Times over 130 years.
  • Word <=> time series of its occurrences in NYT articles.
  • Hypothesis: correlation between the time series of two words indicates a semantic relation.
  • Proposed method: Temporal Semantic Analysis (TSA).

  3. Introduction

  4. Introduction

  5. 1. TSA

  6. Temporal Semantic Analysis
  • 3 main steps:
  • Represent words as concept vectors.
  • Extract the temporal dynamics of each concept.
  • Extend the static representation with temporal dynamics.

  7. 1. Words as concept vectors

  8. 2. Temporal dynamics
  • c: a concept represented by a sequence of words wc1, …, wck.
  • d: a document.
  • ε: proximity relaxation parameter (ε = 20 in the experiments).
  • c appears in d if its words appear in d with a distance of at most ε words between each pair wci, wcj (see the sketch below).
  • Example: “Great Fire of London”.
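
A minimal sketch of this occurrence test, assuming simple token matching; the function name and the tokenization are illustrative, not from the slides:

```python
from itertools import product

def concept_appears(concept_words, doc_tokens, eps=20):
    """Check whether a concept (a sequence of words) appears in a document:
    every concept word must occur, and some choice of occurrence positions
    must keep each pair of words at most `eps` tokens apart."""
    positions = []
    for w in concept_words:
        occ = [i for i, tok in enumerate(doc_tokens) if tok == w]
        if not occ:
            return False  # a concept word is missing entirely
        positions.append(occ)

    # One position per word; all pairwise gaps <= eps iff max - min <= eps.
    return any(max(combo) - min(combo) <= eps
               for combo in product(*positions))

# Example: the concept "Great Fire of London" (stopword "of" dropped here).
doc = "the great fire of london destroyed much of the medieval city".split()
print(concept_appears(["great", "fire", "london"], doc))  # True
```

The exhaustive search over position combinations is exponential in the worst case; it illustrates the definition rather than a scalable implementation.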

  9. 2. Temporal dynamics
  • t1, …, tn: a sequence of consecutive discrete time points (days).
  • H = D1, …, Dn: the history, represented by a set of document collections, where Di is the collection of documents associated with time ti.
  • The dynamics of a concept c is the time series of its frequency of appearance in H (see the sketch below).
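
Following these definitions, the dynamics could be computed as below; whether the paper uses raw counts or relative frequencies is an assumption, and concept_appears is the hypothetical helper from the previous sketch:

```python
def concept_dynamics(concept_words, history, eps=20):
    """Time series of a concept's frequency over a history H = D1, ..., Dn,
    where each Di is a list of tokenized documents for time point ti."""
    series = []
    for day_docs in history:
        hits = sum(concept_appears(concept_words, doc, eps) for doc in day_docs)
        series.append(hits / max(len(day_docs), 1))  # relative frequency at ti
    return series
```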

  10. 3. Extend static representation

  11. 2. Using TSA for computing Semantic Relatedness

  12. Using TSA for computing Semantic Relatedness
  • Compare words by a weighted distance between the time series of their concept vectors.
  • Combine this with a static semantic similarity measure (one possible combination is sketched below).
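
The transcript does not spell out the combination rule; a simple linear blend, with an assumed mixing weight alpha, would look like this:

```python
def combined_relatedness(static_score, temporal_score, alpha=0.5):
    """Blend a static similarity score (e.g. cosine over concept vectors)
    with a temporal score from time-series comparison.
    Linear interpolation and alpha = 0.5 are illustrative choices."""
    return alpha * static_score + (1.0 - alpha) * temporal_score
```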

  13. Algorithm
  • t1, t2: words.
  • C(t1) = {c1, …, cn} and C(t2) = {c1, …, cm}: the sets of concepts of t1 and t2.
  • Q(c1, c2): a function that determines the relatedness between two concepts c1 and c2 using their dynamics (time series); a sketch of the aggregation follows.
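
The algorithm itself is on slide 14, which did not survive the transcript; the greedy best-match aggregation below is therefore an assumption about how C(t1), C(t2), and Q fit together:

```python
def temporal_relatedness(concepts1, concepts2, dynamics, Q):
    """Word-level relatedness from concept-level relatedness.
    concepts1, concepts2: the concept sets C(t1) and C(t2);
    dynamics: maps each concept to its time series;
    Q: relatedness of two concepts given their time series."""
    scores = []
    for c1 in concepts1:
        # Match each concept of t1 with its most related concept of t2.
        scores.append(max(Q(dynamics[c1], dynamics[c2]) for c2 in concepts2))
    return sum(scores) / len(scores)
```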

  14. Algorithm

  15. Cross Correlation
  • Pearson's product-moment coefficient: a statistical method for measuring the similarity of two random variables.
  • Example: “computer” and “radio”.
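
As a candidate Q over two dynamics series, the coefficient is a one-liner with NumPy; the example series here are made up:

```python
import numpy as np

def pearson_q(series_a, series_b):
    """Pearson product-moment correlation of two concept dynamics,
    usable as the concept-relatedness function Q."""
    return float(np.corrcoef(series_a, series_b)[0, 1])

# Made-up series standing in for the dynamics of "computer" and "radio".
print(pearson_q([1, 3, 2, 5, 4], [2, 4, 3, 6, 5]))  # 1.0: same series, shifted
```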

  16. Dynamic Time Warping
  • Measures the similarity of two time series that may differ in time scale but are similar in shape.
  • Used in speech recognition.
  • It defines a local cost matrix.
  • Temporal weighting function.
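
A textbook DTW sketch over two dynamics series; the slide's temporal weighting function is omitted because its form is not given in the transcript:

```python
import numpy as np

def dtw_distance(a, b):
    """Classic dynamic time warping: fill a cumulative-cost matrix from the
    local cost |a[i] - b[j]| and return the optimal alignment cost."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])  # local cost matrix entry
            D[i, j] = cost + min(D[i - 1, j],       # insertion
                                 D[i, j - 1],       # deletion
                                 D[i - 1, j - 1])   # match
    return float(D[n, m])

# Same shape, different time scale: DTW aligns them at zero cost.
print(dtw_distance([0, 1, 2, 1, 0], [0, 0, 1, 1, 2, 2, 1, 1, 0, 0]))  # 0.0
```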

  17. 3. Experiments

  18. Experiments: Setup
  • New York Times archive (1863–2004).
  • Each day: an average of 50 article abstracts.
  • 1.42 GB of text; 565,540 distinct words.
  • A new algorithm to automatically benchmark word-relatedness tasks.
  • The same vector representation for each method tested.
  • Comparison to human judgment (WS-353 and Amazon MTurk).

  19. TSA vs. ESA

  20. TSA vs. Temporal Word Similarity

  21. Word Frequency Effects

  22. Size of Temporal Concept Vector

  23. Conclusion
  • Two innovations: Temporal Semantic Analysis, and a new method for measuring the semantic relatedness of terms.
  • Many advantages: robust, tunable, and usable to study language evolution over time.
  • Significant improvements in computing word relatedness.
