Local and Global Algorithms for Disambiguation to Wikipedia

Presentation Transcript

  1. Local and Global Algorithms for Disambiguation to Wikipedia. Lev Ratinov (1), Dan Roth (1), Doug Downey (2), Mike Anderson (3). (1) University of Illinois at Urbana-Champaign, (2) Northwestern University, (3) Rexonomy. March 2011

  2. Information overload

  3. Organizing knowledge It’s a version of Chicago – the standard classic Macintosh menu font, with that distinctive thick diagonal in the “N”. Chicago was used by default for Mac menus through MacOS 7.6, and OS 8 was released mid-1997. Chicago VIII was one of the early 70s-era Chicago albums to catch my ear, along with Chicago II.

  4. Cross-document co-reference resolution It’s a version of Chicago – the standard classic Macintosh menu font, with that distinctive thick diagonal in the “N”. Chicago was used by default for Mac menus through MacOS 7.6, and OS 8 was released mid-1997. Chicago VIII was one of the early 70s-era Chicago albums to catch my ear, along with Chicago II.

  5. Reference resolution: (disambiguation to Wikipedia) It’s a version of Chicago – the standard classic Macintosh menu font, with that distinctive thick diagonal in the “N”. Chicago was used by default for Mac menus through MacOS 7.6, and OS 8 was released mid-1997. Chicago VIII was one of the early 70s-era Chicago albums to catch my ear, along with Chicago II.

  6. The “reference” collection has structure It’s a version of Chicago – the standard classic Macintosh menu font, with that distinctive thick diagonal in the “N”. Chicago was used by default for Mac menus through MacOS 7.6, and OS 8 was released mid-1997. Chicago VIII was one of the early 70s-era Chicago albums to catch my ear, along with Chicago II. Relations: Is_a, Is_a, Used_In, Released, Succeeded

  7. Analysis of Information Networks It’s a version of Chicago – the standard classic Macintosh menu font, with that distinctive thick diagonal in the “N”. Chicago was used by default for Mac menus through MacOS 7.6, and OS 8 was released mid-1997. Chicago VIII was one of the early 70s-era Chicago albums to catch my ear, along with Chicago II.

  8. Here – Wikipedia as a knowledge resource ... but we can use other resources. Relations: Is_a, Is_a, Used_In, Released, Succeeded

  9. Talk outline • High-level algorithmic approach. • Bi-partite graph matching with global and local inference. • Local Inference. • Experiments & Results • Global Inference. • Experiments & Results • Results, Conclusions • Demo

  10. Problem formulation - matching/ranking problem Text Document(s)—News, Blogs,… Wikipedia Articles

  11. Local approach Text Document(s)—News, Blogs,… Wikipedia Articles • Γ is a solution to the problem • A set of pairs (m,t) • m: a mention in the document • t: the matched Wikipedia Title

  12. Local approach Text Document(s)—News, Blogs,… Wikipedia Articles • Γ is a solution to the problem • A set of pairs (m,t) • m: a mention in the document • t: the matched Wikipedia Title Local score of matching the mention to the title

  13. Local + Global : using the Wikipedia structure Text Document(s)—News, Blogs,… Wikipedia Articles A “global” term – evaluating how good the structure of the solution is
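The objective described on slides 12 and 13 can be written compactly. This is a hedged reconstruction, since the formula itself did not survive in the transcript: the symbol \phi for the local score, the index i, and writing the global term as \Psi(\Gamma) are notation assumed here. Only Γ, the pairs (m, t), and the name Ψ come from the slides.

```latex
% Assumed notation: \phi = local mention-to-title score,
% \Psi(\Gamma) = global term scoring the structure of the whole solution.
\Gamma^{*} = \arg\max_{\Gamma} \left[ \sum_{i=1}^{N} \phi(m_i, t_i) + \Psi(\Gamma) \right]
```

Dropping the \Psi(\Gamma) term gives the purely local approach of slide 12.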

  14. Can be reduced to an NP-hard problem Text Document(s)—News, Blogs,… Wikipedia Articles

  15. A tractable variation Text Document(s)—News, Blogs,… Wikipedia Articles • Invent a surrogate solution Γ’; • disambiguate each mention independently. • Evaluate the structure based on pair-wise coherence scores Ψ(ti,tj)
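The two steps on this slide can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the toy scores, and the choice to sum Ψ over the surrogate titles of the other mentions are all hypothetical.

```python
from typing import Callable, Dict, List


def disambiguate(
    candidates: Dict[str, List[str]],
    local_score: Callable[[str, str], float],
    coherence: Callable[[str, str], float],
) -> Dict[str, str]:
    """Two-stage sketch: build a surrogate solution, then re-rank with coherence."""
    # Stage 1: surrogate solution, each mention disambiguated independently.
    surrogate = {
        m: max(titles, key=lambda t: local_score(m, t))
        for m, titles in candidates.items()
    }
    # Stage 2: local score plus pairwise coherence with the surrogate
    # titles chosen for the other mentions.
    result = {}
    for m, titles in candidates.items():
        context = [t for other, t in surrogate.items() if other != m]
        result[m] = max(
            titles,
            key=lambda t: local_score(m, t) + sum(coherence(t, c) for c in context),
        )
    return result


# Toy data (hypothetical numbers, for illustration only).
LOCAL = {
    ("Chicago", "Chicago_city"): 0.6,
    ("Chicago", "Chicago_font"): 0.4,
    ("Macintosh", "Macintosh"): 1.0,
}
COHERENCE = {frozenset(("Chicago_font", "Macintosh")): 0.5}

solution = disambiguate(
    {"Chicago": ["Chicago_city", "Chicago_font"], "Macintosh": ["Macintosh"]},
    local_score=lambda m, t: LOCAL[(m, t)],
    coherence=lambda a, b: COHERENCE.get(frozenset((a, b)), 0.0),
)
```

In this toy run the local score alone prefers the city sense, but coherence with the unambiguous "Macintosh" mention tips the decision to the font.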

  16. Talk outline • High-level algorithmic approach. • Bi-partite graph matching with global and local inference. • Local Inference. • Experiments & Results • Global Inference. • Experiments & Results • Results, Conclusions • Demo

  17. I. Baseline: P(Title|Surface Form), e.g. P(Title|“Chicago”)
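One way to estimate such a baseline is from counts of (surface form, title) pairs. The sketch below assumes those pairs come from hyperlink anchor text; the slide shows only the quantity P(Title|Surface Form), so the data source, the function name, and the toy counts are assumptions.

```python
from collections import Counter, defaultdict


def build_baseline(anchor_links):
    """Estimate P(title | surface form) by relative frequency.

    anchor_links: iterable of (surface_form, title) pairs (assumed input format).
    """
    counts = defaultdict(Counter)
    for surface, title in anchor_links:
        counts[surface][title] += 1
    return {
        surface: {t: c / sum(by_title.values()) for t, c in by_title.items()}
        for surface, by_title in counts.items()
    }


# Toy counts (hypothetical): 8 links to the city, 1 to the band, 1 to the font.
links = (
    [("Chicago", "Chicago_city")] * 8
    + [("Chicago", "Chicago_band")]
    + [("Chicago", "Chicago_font")]
)
baseline = build_baseline(links)
top_title = max(baseline["Chicago"], key=baseline["Chicago"].get)
```

A prior like this always picks the most frequent sense, which is why the later slides add context and text features.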

  18. II. Context(Title) Context(Charcoal)+= “a font called __ is used to”

  19. III. Text(Title) Just the text of the page (one per title)

  20. Putting it all together • City vs. Font: (0.99-0.0001, 0.01-0.2, 0.03-0.01) • Band vs. Font: (0.001-0.0001, 0.001-0.2, 0.02-0.01) • Training a ranking SVM: • Consider all title pairs. • Train a ranker on the pairs (learn to prefer the correct solution). • Inference = knockout tournament. • Key: abstracts over the text and learns which scores are important.
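The "knockout tournament" inference can be sketched as a single-elimination loop over candidate titles, with a pairwise ranker deciding each match. Everything below is illustrative: reading the tuples on the slide as per-title (baseline, context, text) scores is an assumption, and the linear weights are hypothetical rather than learned.

```python
def knockout(candidates, prefers):
    """Single-elimination tournament; prefers(a, b) is True if a beats b."""
    pool = list(candidates)
    if not pool:
        raise ValueError("need at least one candidate")
    while len(pool) > 1:
        next_round = []
        for i in range(0, len(pool) - 1, 2):
            a, b = pool[i], pool[i + 1]
            next_round.append(a if prefers(a, b) else b)
        if len(pool) % 2 == 1:
            next_round.append(pool[-1])  # odd candidate out gets a bye
        pool = next_round
    return pool[0]


# Per-title scores read off the slide's difference tuples (assumed reading).
FEATURES = {
    "Chicago_city": (0.99, 0.01, 0.03),
    "Chicago_band": (0.001, 0.001, 0.02),
    "Chicago_font": (0.0001, 0.2, 0.01),
}
WEIGHTS = (0.1, 5.0, 1.0)  # hypothetical ranker weights, not learned


def prefers(a, b):
    # Linear ranker applied to the feature difference of the two titles.
    diff = [x - y for x, y in zip(FEATURES[a], FEATURES[b])]
    return sum(w * d for w, d in zip(WEIGHTS, diff)) > 0


winner = knockout(["Chicago_city", "Chicago_band", "Chicago_font"], prefers)
```

With these made-up weights the context score dominates, so the font wins even though its baseline probability is tiny. That is the sense in which a ranker can learn which scores matter.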

  21. Example: font or city? Candidate titles: Text(Chicago_city), Context(Chicago_city); Text(Chicago_font), Context(Chicago_font). Mention in context: It’s a version of Chicago – the standard classic Macintosh menu font, with that distinctive thick diagonal in the “N”.

  22. Lexical matching: cosine similarity, TF-IDF weighting. Candidate titles: Text(Chicago_city), Context(Chicago_city); Text(Chicago_font), Context(Chicago_font). Mention in context: It’s a version of Chicago – the standard classic Macintosh menu font, with that distinctive thick diagonal in the “N”.
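A small, dependency-free sketch of TF-IDF weighted cosine similarity follows. The tokenization, the smoothed IDF variant, and the toy title texts are assumptions for illustration; the slide names only the technique.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """docs: list of token lists. Returns one sparse TF-IDF dict per document."""
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(doc))
    # Smoothed IDF (one common variant, assumed here) so shared tokens keep weight.
    idf = {tok: math.log(n / df[tok]) + 1.0 for tok in df}
    return [{tok: tf * idf[tok] for tok, tf in Counter(doc).items()} for doc in docs]


def cosine(u, v):
    dot = sum(w * v.get(tok, 0.0) for tok, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Toy texts (hypothetical stand-ins for the mention context and title pages).
mention = "version of chicago the standard classic macintosh menu font".split()
city_text = "chicago is a city in illinois on lake michigan".split()
font_text = "chicago is a sans-serif font designed for the macintosh menu".split()

vec_mention, vec_city, vec_font = tfidf_vectors([mention, city_text, font_text])
sim_city = cosine(vec_mention, vec_city)
sim_font = cosine(vec_mention, vec_font)
```

Here the mention context shares several informative tokens with the font text and only "chicago" with the city text, so `sim_font` comes out higher.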

  23. Ranking – font vs. city. Mention in context: It’s a version of Chicago – the standard classic Macintosh menu font, with that distinctive thick diagonal in the “N”. Scores for Text(Chicago_city), Context(Chicago_city): 0.2, 0.8, 0.5, 0.1. Scores for Text(Chicago_font), Context(Chicago_font): 0.2, 0.5, 0.3, 0.3.

  24. Train a ranking SVM. Mention in context: It’s a version of Chicago – the standard classic Macintosh menu font, with that distinctive thick diagonal in the “N”. Feature vector for Text(Chicago_city), Context(Chicago_city): (0.5, 0.2, 0.1, 0.8). Feature vector for Text(Chicago_font), Context(Chicago_font): (0.3, 0.2, 0.3, 0.5). Training example (city vector minus font vector, with its label): [(0.2, 0, -0.2, 0.3), -1]
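The training example on this slide is the element-wise difference of the two feature vectors, paired with a label. The sketch below reproduces that arithmetic. The helper name is hypothetical, and reading the label -1 as "the second title (the font) is the correct one" is an assumption consistent with the example sentence.

```python
def pairwise_example(first, second, first_is_correct):
    """Build one ranking-SVM style training pair: (first - second, label)."""
    # Rounding only hides floating-point noise in the subtraction.
    diff = tuple(round(a - b, 6) for a, b in zip(first, second))
    label = 1 if first_is_correct else -1
    return diff, label


city = (0.5, 0.2, 0.1, 0.8)  # vector shown for Chicago_city
font = (0.3, 0.2, 0.3, 0.5)  # vector shown for Chicago_font

# The mention refers to the font, so the first title (city) is not correct.
example = pairwise_example(city, font, first_is_correct=False)
```

Training on differences lets a standard binary classifier act as a ranker, because the sign of its output on `first - second` says which title it prefers.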

  25. Scaling issues – one of our key contributions. Candidate titles: Text(Chicago_city), Context(Chicago_city); Text(Chicago_font), Context(Chicago_font). Mention in context: It’s a version of Chicago – the standard classic Macintosh menu font, with that distinctive thick diagonal in the “N”.

  26. Scaling issues. Text(Chicago_city), Context(Chicago_city), Text(Chicago_font), Context(Chicago_font): this data is big, and it is loaded into memory from disk. Mention in context: It’s a version of Chicago – the standard classic Macintosh menu font, with that distinctive thick diagonal in the “N”.

  27. Improving performance. Rather than computing TF-IDF weighted cosine similarity, we want to train a classifier on the fly. But due to the aggressive feature pruning, we choose PrTFIDF. Candidate titles: Text(Chicago_city), Context(Chicago_city); Text(Chicago_font), Context(Chicago_font). Mention in context: It’s a version of Chicago – the standard classic Macintosh menu font, with that distinctive thick diagonal in the “N”.

  28. Performance (local only): ranking accuracy

  29. Talk outline • High-level algorithmic approach. • Bi-partite graph matching with global and local inference. • Local Inference. • Experiments & Results • Global Inference. • Experiments & Results • Results, Conclusions • Demo

  30. Co-occurrence(Title1,Title2) The city senses of Boston and Chicago appear together often.

  31. Co-occurrence(Title1,Title2) Rock music and albums appear together often

  32. Global ranking • How to approximate the “global semantic context” in the document? (What is Γ’?) • Use only non-ambiguous mentions for Γ’. • Use the top baseline disambiguation for NER surface forms. • Use the top baseline disambiguation for all the surface forms. • How to define relatedness between two titles? (What is Ψ?)

  33. Ψ: pair-wise relatedness between 2 titles • Normalized Google Distance • Pointwise Mutual Information
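The formulas on this slide did not survive in the transcript. The sketch below uses the standard set-based forms of both measures. Computing them over the sets of articles linked with each title is an assumption, as are the function names and the toy sets. The PMI function returns the probability ratio; take its logarithm for the usual log form.

```python
import math


def ngd(links_a, links_b, total_titles):
    """Normalized Google Distance over two link sets (lower = more related)."""
    common = len(links_a & links_b)
    if common == 0:
        return float("inf")  # no shared links: treat as maximally distant
    big = max(len(links_a), len(links_b))
    small = min(len(links_a), len(links_b))
    return (math.log(big) - math.log(common)) / (
        math.log(total_titles) - math.log(small)
    )


def pmi_ratio(links_a, links_b, total_titles):
    """P(a, b) / (P(a) * P(b)) over link sets (higher = more related)."""
    if not links_a or not links_b:
        return 0.0
    joint = len(links_a & links_b) / total_titles
    return joint / ((len(links_a) / total_titles) * (len(links_b) / total_titles))


# Toy link sets (hypothetical article ids) in a collection of 1000 titles.
TOTAL = 1000
boston_city = set(range(0, 100))
chicago_city = set(range(50, 150))           # shares 50 links with Boston
chicago_font = set(range(500, 520)) | {60}   # shares 1 link with Chicago city

d_cities = ngd(boston_city, chicago_city, TOTAL)
d_city_font = ngd(chicago_city, chicago_font, TOTAL)
r_cities = pmi_ratio(boston_city, chicago_city, TOTAL)
```

In this toy setup the two city senses are much closer under NGD than the city and the font, matching the intuition from the co-occurrence slides.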

  34. What is the best Γ’? (ranker accuracy, solvable mentions)

  35. Results – ranker accuracy (solvable mentions)

  36. Results: Local + Global

  37. Talk outline • High-level algorithmic approach. • Bi-partite graph matching with global and local inference. • Local Inference. • Experiments & Results • Global Inference. • Experiments & Results • Results, Conclusions • Demo

  38. Conclusions: • Dealing with a very large scale knowledge acquisition and extraction problem • State-of-the-art algorithmic tools that exploit the content & structure of the network. • Formulated a framework for Local & Global reference resolution and disambiguation into knowledge networks • Proposed local and global algorithms: state-of-the-art performance. • Addressed the scaling issue: a major issue. • Identified key remaining challenges (next slide).

  39. We want to know what we don’t know • Not dealt with well in the literature • “As Peter Thompson, a 16-year-old hunter, said ..” • “Dorothy Byrne, a state coordinator for the Florida Green Party…” • We train a separate SVM classifier to identify such cases. The features are: • All the baseline, lexical and semantic scores of the top candidate. • Score assigned to the top candidate by the ranker. • The “confidence” of the ranker on the top candidate with respect to the second-best disambiguation. • Good-Turing probability of out-of-Wikipedia occurrence for the mention. • Limited success; future research.
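Two of the listed features can be illustrated with a short sketch. The talk does not say how either is computed, so both functions are assumptions: the first uses the textbook Good-Turing estimate of unseen probability mass (count of items seen once, divided by total observations), and the second takes ranker "confidence" to be the score gap between the top two candidates.

```python
from collections import Counter


def good_turing_unseen_mass(title_counts):
    """Textbook Good-Turing estimate of unseen mass: N1 / N.

    title_counts: how often a surface form was seen linked to each title.
    """
    total = sum(title_counts.values())
    if total == 0:
        return 1.0  # nothing observed: all mass is unseen
    singletons = sum(1 for c in title_counts.values() if c == 1)
    return singletons / total


def ranker_margin(scores):
    """Gap between best and second-best ranker scores (assumed confidence)."""
    ordered = sorted(scores, reverse=True)
    if not ordered:
        raise ValueError("need at least one score")
    return ordered[0] - ordered[1] if len(ordered) > 1 else ordered[0]


# Hypothetical counts for one surface form: three known senses, two seen once.
counts = Counter({"title_a": 3, "title_b": 1, "title_c": 1})
unseen = good_turing_unseen_mass(counts)
margin = ranker_margin([0.9, 0.2, 0.6])
```

A high unseen mass combined with a small margin would suggest that the mention may refer to something outside Wikipedia, which is the case the separate classifier is meant to catch.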

  40. Comparison to the previous state of the art (all mentions, including OOW)

  41. Demo