
Roberto Navigli

BabelNet and beyond: a huge multilingual semantic network and its potential for interconnecting migration routes. Roberto Navigli. http://lcl.uniroma1.it. 16th June 2016 – Rome. Roberto Navigli. Associate Professor in the Department of Computer Science (Sapienza, Rome)


Presentation Transcript


  1. BabelNet and beyond: a huge multilingual semantic network and its potential for interconnecting migration routes Roberto Navigli http://lcl.uniroma1.it 16th June 2016 – Rome

  2. Roberto Navigli • Associate Professor in the Department of Computer Science (Sapienza, Rome) • Principal investigator of several projects: • ERC Starting Grant (MultiJEDI) • FP7 CSA (LIDER) • Google Focused Research Award (co-PI) • Managing a team of 10 researchers, 6 of whom are Ph.D. students BabelNet, Babelfy and Beyond! Roberto Navigli

  3. Outline of the talk • BabelNet: a huge multilingual semantic network • Babelfy: a state-of-the-art multilingual disambiguation system • What's next: how to help interconnect/detect migration flows with the aid of our technologies

  4. INTEGRATING KNOWLEDGE

  5. The resource diaspora

  6. The resource diaspora • There are many online dictionaries and encyclopedias • Each covers one or a limited number of languages • The knowledge found in different resources is often complementary: • To get coverage of more languages • To get additional information about the entry • To obtain links to geographical information • However, each resource provides a different meaning inventory

  7. BabelNet: Unifying Lexical Knowledge Resources into a Single Semantic Network • Key Objective 1: create knowledge for all languages [Figure labels: MultiWordNet, WOLF, BalkaNet, MCR, GermaNet, WordNet]

  8. WordNet [Miller et al., 1990; Fellbaum, 1998] [Figure labels: semantic relation, concepts]

  9. Wikipedia [The Web Community, 2001-today] [Figure labels: (unspecified) semantic relation, concepts]

  10. Merging entries from different resources into BabelNet • We collect lexicalizations, definitions, translations, images, etc. from each of the merged resources [Figure label: WordNet]

  11. BabelNet: concepts and semantic relations • We encode knowledge as a labeled directed graph: • Each vertex is a Babel synset (= synonym set) • Each edge is a semantic relation between synsets: • is-a (balloon is-a aircraft) • part-of (gasbag part-of balloon) • instance-of (Einstein instance-of physicist) • … • unspecified/relatedness (balloon related-to flight) • Example Babel synset: balloon (EN), Ballon (DE), aerostato (ES), aerostato (IT), pallone aerostatico (IT), mongolfière (FR)
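
The labeled directed graph described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Synset` and `SemanticGraph` classes and the string identifiers are hypothetical and are not BabelNet's API or its real synset IDs. The lemmas and relations are the ones listed on the slide.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Synset:
    """A vertex: one concept plus its lexicalizations in several languages."""
    sid: str            # hypothetical identifier, not a real BabelNet ID
    lemmas: tuple = ()  # (language code, lemma) pairs


class SemanticGraph:
    """Labeled directed graph: every edge carries a relation name."""

    def __init__(self):
        self.synsets = {}
        self.edges = defaultdict(list)  # source id -> [(label, target id)]

    def add_synset(self, sid, *lemmas):
        self.synsets[sid] = Synset(sid, tuple(lemmas))

    def add_relation(self, source, label, target):
        self.edges[source].append((label, target))

    def related(self, source, label):
        """Targets reachable from `source` through edges with this label."""
        return [t for (lab, t) in self.edges.get(source, []) if lab == label]


graph = SemanticGraph()
graph.add_synset(
    "balloon",
    ("EN", "balloon"), ("DE", "Ballon"), ("ES", "aerostato"),
    ("IT", "aerostato"), ("IT", "pallone aerostatico"), ("FR", "mongolfière"),
)
for sid in ("aircraft", "gasbag", "flight", "Einstein", "physicist"):
    graph.add_synset(sid, ("EN", sid))

graph.add_relation("balloon", "is-a", "aircraft")
graph.add_relation("gasbag", "part-of", "balloon")
graph.add_relation("Einstein", "instance-of", "physicist")
graph.add_relation("balloon", "related-to", "flight")

print(graph.related("balloon", "is-a"))  # ['aircraft']
```

Storing the label on the edge, rather than keeping one graph per relation, keeps typed relations and the unspecified relatedness links in a single structure.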

  12. What is BabelNet? • A merger of resources of different kinds:

  13. What is BabelNet? • A merger of resources of different kinds: • WordNet: the most popular computational lexicon of English • Open Multilingual WordNet: a collection of open wordnets • WoNeF: a French WordNet • Wikipedia: the largest collaborative encyclopedia • Wikidata: the largest collaborative knowledge base • Wiktionary: the largest collaborative dictionary • OmegaWiki: a medium-size collaborative multilingual dictionary • GeoNames: a worldwide geographical database • Microsoft Terminology: a computer science thesaurus • High-quality automatic sense-based translations

  14. What is BabelNet? • A merger of resources of different kinds:

  15. Why do we need BabelNet? • Multilinguality: the same concept is expressed in tens of languages

  16. Why do we need BabelNet? • Multilinguality: the same concept is expressed in tens of languages • Coverage: 271 languages and 14 million entries! • 6M concepts and 7.7M named entities • 119M word senses • 378M semantic relations (27 relations per concept on avg.) • 11M images associated with concepts • 41M textual definitions • 2M concepts with domains associated
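
As a quick consistency check on the coverage figures, the short sketch below recomputes the entry count and the average number of relations from the rounded numbers on the slide. It assumes the "14 million entries" are the concepts and named entities taken together, which the slide suggests but does not state outright.

```python
# Rounded figures as given on the slide.
concepts = 6.0e6
named_entities = 7.7e6
relations = 378e6

# Assumption: "entries" = concepts + named entities.
entries = concepts + named_entities
avg_relations = relations / entries

print(f"{entries / 1e6:.1f}M entries, {avg_relations:.1f} relations per entry")
# prints: 13.7M entries, 27.6 relations per entry
```

With the entry count rounded up to 14 million, the same division gives exactly 27, which matches the average quoted on the slide.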

  17. Why do we need BabelNet? • Multilinguality: the same concept is expressed in tens of languages • Coverage: 271 languages and 14 million entries! • Concepts and named entities together: dictionary and encyclopedic knowledge is semantically interconnected

  18. Why do we need BabelNet? • Multilinguality: the same concept is expressed in tens of languages • Coverage: 271 languages and 14 million entries! • Concepts and named entities together: dictionary and encyclopedic knowledge is semantically interconnected • "Dictionary of the future": semantic network structure with labeled relations, pictures, multilingual synsets

  19. Why do we need BabelNet? • Multilinguality: the same concept is expressed in tens of languages • Coverage: 271 languages and 14 million entries! • Concepts and named entities together: dictionary and encyclopedic knowledge is semantically interconnected • "Dictionary of the future": semantic network structure with labeled relations, pictures, multilingual synsets • Media coverage and prestigious prizes

  20. ADDRESSING LEXICAL AMBIGUITY

  21. Lexical ambiguity! • Thomas and Mario played as strikers in Munich.

  22. Word Sense Disambiguation and Entity Linking • Thomas and Mario are strikers playing in Munich

  23. Multilingual Joint Word Sense Disambiguation (MultiJEDI) • Key Objective 2: use all languages to disambiguate one

  24. So what?

  25. Step 1: Find all possible meanings of words • Ambiguity! • “Thomas and Mario are strikers playing in Munich” • Candidate meanings: Munich (City), Seth Thomas, Mario (Character), striker (Sport), Mario (Album), Striker (Video Game), Thomas Müller, FC Bayern Munich, Mario Gómez, Striker (Movie), Thomas (novel), Munich (Song)

  26. Step 2: Connect all the candidate meanings • Thomas and Mario are strikers playing in Munich

  27. Step 3: Extract a dense subgraph • Thomas and Mario are strikers playing in Munich

  28. Step 3: Extract a dense subgraph • Thomas and Mario are strikers playing in Munich

  29. Step 4: Select the most reliable meanings • “Thomas and Mario are strikers playing in Munich” • Candidate meanings shown: Munich (City), Seth Thomas, Mario (Character), striker (Sport), Mario (Album), Striker (Video Game), Thomas Müller, FC Bayern Munich, Mario Gómez, Striker (Movie), Thomas (novel), Munich (Song)
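
Steps 1 to 4 can be walked through on the example sentence with a toy sketch. It is not Babelfy's implementation. The candidate meanings are the ones listed on the slides, but the grouping by word, the links in `edges` and the pruning rule are assumptions made for illustration. Steps 3 and 4 are collapsed into one greedy loop that keeps discarding the least connected candidate of the most ambiguous word.

```python
# Step 1: candidate meanings per word (from the slides; grouping assumed).
candidates = {
    "Thomas": ["Seth Thomas", "Thomas Müller", "Thomas (novel)"],
    "Mario": ["Mario (Character)", "Mario (Album)", "Mario Gómez"],
    "strikers": ["striker (Sport)", "Striker (Video Game)", "Striker (Movie)"],
    "Munich": ["Munich (City)", "FC Bayern Munich", "Munich (Song)"],
}

# Step 2: links between candidate meanings. Hand-written assumption for
# this sketch; a real system would read them from the semantic network.
edges = {
    ("Thomas Müller", "FC Bayern Munich"),
    ("Mario Gómez", "FC Bayern Munich"),
    ("Thomas Müller", "striker (Sport)"),
    ("Mario Gómez", "striker (Sport)"),
    ("FC Bayern Munich", "Munich (City)"),
    ("Mario (Character)", "Striker (Video Game)"),
}


def degree(meaning, alive):
    """Number of links from `meaning` to candidates that are still alive."""
    return sum(
        1
        for a, b in edges
        if (a == meaning and b in alive) or (b == meaning and a in alive)
    )


def disambiguate(candidates):
    """Steps 3 and 4: greedily prune weak candidates, keep one per word."""
    remaining = {word: list(meanings) for word, meanings in candidates.items()}
    while True:
        ambiguous = [w for w, ms in remaining.items() if len(ms) > 1]
        if not ambiguous:
            break
        alive = {m for ms in remaining.values() for m in ms}
        # Look at the word that still has the most candidates ...
        word = max(ambiguous, key=lambda w: len(remaining[w]))
        # ... and drop its least connected candidate.
        weakest = min(remaining[word], key=lambda m: degree(m, alive))
        remaining[word].remove(weakest)
    return {word: meanings[0] for word, meanings in remaining.items()}


for word, meaning in disambiguate(candidates).items():
    print(f"{word} -> {meaning}")
```

Under these assumed links the football reading survives for all four words, because those candidates support each other while the album, movie, novel and song senses have no links to the rest of the sentence.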

  30. Experimental Results: Fine-grained (Multilingual) Disambiguation [Results table not preserved; datasets: SemEval-2007 task 17, SemEval-2013 task 12, Senseval-3]

  31. The Crazy Polyglot!

  32. Live demo – Crazy polyglot! • EN: In today's knowledge and information society • FR: le paysage lexicographique est plus hétérogène que jamais. • IT: Possono le risorse stand-alone competere • ES: con múltiples funciones, portale lexicográficas multilingüe y servicios web, • ZH: Web服务,定制的喜好和个人用户的个人资料?

  33. 1) Geographical named entities are interlinked • Each geographical entity comes with: • geolocation information • translations in dozens of languages • connections to other concepts and named entities (e.g. politicians, important places, concepts, events, etc.)

  34. 2) Named entities, events and actions are expressed in any language • We can process tweets, Facebook/Instagram/blog posts and identify these entities and interconnect them independently of the language they are expressed in • Οδεύουμε προς τη #Μακεδονία (We are moving to #Macedonia) • Greek police started phase 2 of #Idomeni evacuation, emptying camp near Polykastro-1,828 people 2B moved • إخلاء! حاصرت شرطة مكافحة الشغب محطة الغاز EKO! اللاجئين رافضا تركه! أي إشعار مسبق (EKO Evacuation! Riot police have surrounded EKO gas station! Refugees refusing to leave! No prior notice given) • سيتم نقل سكان Idomeni إلى مخيمات جديدة، بما في ذلك في ثاني أكبر مدينة، سالونيك. (Idomeni residents will be moved to new camps, including in the second-largest city, Thessaloniki.)

  35. 2) Named entities, events and actions are expressed in any language • We can process tweets, Facebook/Instagram/blog posts and identify these entities and interconnect them independently of the language they are expressed in • Οδεύουμε προς τη #Μακεδονία (We are moving to #Macedonia) • Greek police started phase 2 of #Idomeni evacuation, emptying camp near Polykastro-1,828 people 2B moved • إخلاء! حاصرت شرطة مكافحة الشغب محطة الغاز EKO! اللاجئين رافضا تركه! أي إشعار مسبق (EKO Evacuation! Riot police have surrounded EKO gas station! Refugees refusing to leave! No prior notice given) • سيتم نقل سكان Idomeni إلى مخيمات جديدة، بما في ذلك في ثاني أكبر مدينة، سالونيك. (Idomeni residents will be moved to new camps, including in the second-largest city, Thessaloniki.)
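
The idea of interconnecting posts independently of their language can be illustrated with a small sketch that maps surface forms in different scripts to one shared entity identifier and then indexes the posts by that identifier. Everything here is an assumption made for illustration: the `LEXICON` table and the `place:` identifiers are hypothetical stand-ins for a multilingual resource, and plain token lookup replaces real disambiguation. The posts are excerpts of the examples on the slide.

```python
import re
import unicodedata
from collections import defaultdict


def normalize(text):
    """Canonical Unicode form plus case folding, so lookups are stable."""
    return unicodedata.normalize("NFC", text).casefold()


# Hypothetical lexicon: surface form in any language -> shared entity ID.
LEXICON = {
    normalize(form): entity
    for form, entity in [
        ("Idomeni", "place:Idomeni"),
        ("Μακεδονία", "place:Macedonia"),
        ("Macedonia", "place:Macedonia"),
        ("Thessaloniki", "place:Thessaloniki"),
        ("سالونيك", "place:Thessaloniki"),
    ]
}

# (language code, text) pairs, excerpted from the slide.
posts = [
    ("el", "Οδεύουμε προς τη #Μακεδονία"),
    ("en", "Greek police started phase 2 of #Idomeni evacuation"),
    ("ar", "سيتم نقل سكان Idomeni إلى مخيمات جديدة، بما في ذلك في ثاني أكبر مدينة، سالونيك."),
]


def link_entities(text):
    """Return the set of entity IDs whose surface forms occur in the text."""
    tokens = re.findall(r"\w+", normalize(text))
    return {LEXICON[token] for token in tokens if token in LEXICON}


# Index: entity ID -> languages of the posts that mention it.
index = defaultdict(set)
for lang, text in posts:
    for entity in link_entities(text):
        index[entity].add(lang)

for entity in sorted(index):
    print(entity, sorted(index[entity]))
```

Because the index is keyed by the entity rather than by the word, the English and Arabic posts about Idomeni end up under the same key even though they share no vocabulary beyond the place name.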

  36. 3) Predicting where the migration flows are moving next • Intentions can be automatically identified and extracted from text • Including the next most popular actions and events (e.g. moving, evacuating, going back, etc.) • Integrated with GPS and satellite views of the places

  37. Summarizing… + helping detect migration flows with our technologies

  38. Roberto Navigli Linguistic Computing Laboratory http://lcl.uniroma1.it @RNavigli
