
Representation Learning for Word, Sense, Phrase, Document and Knowledge


Presentation Transcript


  1. Representation Learning for Word, Sense, Phrase, Document and Knowledge. Natural Language Processing Lab, Tsinghua University. Yu Zhao, Xinxiong Chen, Yankai Lin, Yang Liu, Zhiyuan Liu, Maosong Sun

  2. Contributors: Yankai Lin, Yu Zhao, Xinxiong Chen, Yang Liu

  3. ML = Representation + Objective + Optimization

  4. Good Representation is Essential for Good Machine Learning

  5. Representation Learning: Raw Data → Representation Learning → Machine Learning Systems. Yoshua Bengio. Deep Learning of Representations. AAAI 2013 Tutorial.

  6. NLP Tasks (Tagging / Parsing / Understanding) built on a stack of representations: Unstructured Text → Word Representation → Sense Representation → Phrase Representation → Document Representation → Knowledge Representation

  7. NLP Tasks (Tagging / Parsing / Understanding) built on a stack of representations: Unstructured Text → Word Representation → Sense Representation → Phrase Representation → Document Representation → Knowledge Representation

  8. Typical Approaches for Word Representation • 1-hot representation: the basis of the bag-of-words model. star = [0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, …], sun = [0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, …], hence sim(star, sun) = 0
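A minimal sketch of the 1-hot encoding in Python (the vocabulary, indices, and dimensionality below are illustrative): any two distinct words get orthogonal vectors, so their cosine similarity is always 0.

```python
import numpy as np

# 1-hot word vectors over a toy vocabulary (indices match the slide's example).
V = 13                      # vocabulary size
vocab = {"sun": 7, "star": 8}

def one_hot(word):
    v = np.zeros(V)
    v[vocab[word]] = 1.0
    return v

star, sun = one_hot("star"), one_hot("sun")
# Distinct 1-hot vectors never overlap, so cosine similarity is exactly 0:
print(star @ sun / (np.linalg.norm(star) * np.linalg.norm(sun)))  # 0.0
```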

  9. Typical Approaches for Word Representation • Count-based distributional representation
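A sketch of the count-based idea, assuming a simple symmetric context window (corpus and window size are toy choices): each word's vector is its row of co-occurrence counts, so words that appear in similar contexts get similar rows.

```python
import numpy as np

# Build a word-by-word co-occurrence matrix from a toy corpus.
corpus = [["the", "sun", "is", "a", "star"],
          ["the", "star", "shines", "like", "the", "sun"]]
window = 2

vocab = sorted({w for sent in corpus for w in sent})
idx = {w: i for i, w in enumerate(vocab)}
counts = np.zeros((len(vocab), len(vocab)))

for sent in corpus:
    for i, w in enumerate(sent):
        for j in range(max(0, i - window), min(len(sent), i + window + 1)):
            if j != i:
                counts[idx[w], idx[sent[j]]] += 1

# Each row of `counts` is one word's distributional vector.
def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)

print(cosine(counts[idx["sun"]], counts[idx["star"]]))
```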

  10. Distributed Word Representation • Each word is represented as a dense, real-valued vector in a low-dimensional space

  11. Typical Models of Distributed Representation: the Neural Language Model. Yoshua Bengio. A neural probabilistic language model. JMLR 2003.

  12. Typical Models of Distributed Representation: word2vec. Tomas Mikolov, et al. Distributed representations of words and phrases and their compositionality. NIPS 2013.
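A minimal usage sketch with the gensim implementation of word2vec (the corpus and hyperparameters are illustrative; `vector_size` and `epochs` are the gensim 4.x parameter names):

```python
from gensim.models import Word2Vec

# Toy corpus: any iterable of tokenized sentences works.
sentences = [["the", "sun", "is", "a", "star"],
             ["the", "star", "shines", "like", "the", "sun"]]

# sg=1 selects skip-gram; sg=0 would select CBOW.
model = Word2Vec(sentences, vector_size=50, window=2,
                 min_count=1, sg=1, epochs=50)

print(model.wv["sun"].shape)               # (50,): a dense, real-valued vector
print(model.wv.similarity("sun", "star"))  # cosine similarity in the space
```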

  13. Word Relatedness

  14. Semantic Space Encodes Implicit Relationships between Words: W("China") − W("Beijing") ≃ W("Japan") − W("Tokyo")
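A sketch of how such analogies are queried, assuming `wv` maps words to trained vectors (for instance, `model.wv` from the gensim sketch above): solve W(a) − W(b) ≃ W(x) − W(c) by ranking candidate words against W(a) − W(b) + W(c).

```python
import numpy as np

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12)

def analogy(wv, a, b, c, vocab):
    # Return the word x maximizing cos(W(x), W(a) - W(b) + W(c)).
    target = wv[a] - wv[b] + wv[c]
    candidates = [w for w in vocab if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(wv[w], target))

# With good embeddings: analogy(wv, "china", "beijing", "tokyo", vocab) == "japan"
```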

  15. Applications: Semantic Hierarchy Extraction Fu, Ruiji, et al. Learning semantic hierarchies via word embeddings. ACL 2014.

  16. Applications: Cross-lingual JointRepresentation Zou, Will Y., et al. Bilingual word embeddings for phrase-based machine translation. EMNLP 2013.

  17. Applications: Visual-Text Joint Representation Richard Socher, et al. Zero-Shot Learning Through Cross-Modal Transfer. ICLR 2013.

  18. Re-search, Re-invent: word2vec ≃ matrix factorization (MF), echoing the classic pairing of count-based distributional representation with SVD. Levy and Goldberg. Neural word embedding as implicit matrix factorization. NIPS 2014.
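A sketch of the paper's core observation, under toy assumptions: skip-gram with negative sampling implicitly factorizes a PMI matrix shifted by log k. Building the shifted positive PMI matrix from co-occurrence counts (e.g., the `counts` matrix from the earlier sketch) and factorizing it with SVD yields dense word vectors.

```python
import numpy as np

def sppmi_vectors(counts, k=5, dim=50):
    # Shifted positive PMI: max(PMI(w, c) - log k, 0), per Levy & Goldberg.
    total = counts.sum()
    pw = counts.sum(axis=1, keepdims=True) / total   # P(w)
    pc = counts.sum(axis=0, keepdims=True) / total   # P(c)
    with np.errstate(divide="ignore"):
        pmi = np.log((counts / total) / (pw * pc))
    sppmi = np.maximum(pmi - np.log(k), 0.0)
    # Truncated SVD gives the dense vectors (dim must not exceed vocab size).
    U, S, _ = np.linalg.svd(sppmi)
    return U[:, :dim] * np.sqrt(S[:dim])             # one row per word
```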

  19. NLP Tasks (Tagging / Parsing / Understanding) built on a stack of representations: Unstructured Text → Word Representation → Sense Representation → Phrase Representation → Document Representation → Knowledge Representation

  20. Word Sense Representation: "Apple" (the fruit vs. the company)

  21. Multiple Prototype Methods. J. Reisinger and R. Mooney. Multi-prototype vector-space models of word meaning. HLT-NAACL 2010. E. Huang, et al. Improving word representations via global context and multiple word prototypes. ACL 2012.

  22. Nonparametric Methods. Neelakantan, et al. Efficient Non-parametric Estimation of Multiple Embeddings per Word in Vector Space. EMNLP 2014.

  23. Joint Modeling of WSD and WSR: word sense disambiguation (WSD) and word sense representation (WSR) reinforce each other, e.g., resolving which sense of "Apple" appears in "Jobs founded Apple". Xinxiong Chen, et al. A Unified Model for Word Sense Representation and Disambiguation. EMNLP 2014.

  24. Joint Modeling of WSD and WSE (Word Sense Embedding)

  25. Joint Modeling of WSD and WSE: WSD on Two Domain-Specific Datasets

  26. NLP Tasks (Tagging / Parsing / Understanding) built on a stack of representations: Unstructured Text → Word Representation → Sense Representation → Phrase Representation → Document Representation → Knowledge Representation

  27. Phrase Representation • For high-frequency phrases, learn phrase representations by treating them as pseudo-words, e.g., Los Angeles → los_angeles (see the sketch below) • Many phrases are infrequent, and new phrases are coined constantly • We therefore build a phrase representation from its words, exploiting the semantic composition nature of language
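A sketch of the pseudo-word trick using gensim's phrase detector (the corpus, `min_count`, and `threshold` are toy values tuned to this tiny corpus): frequent bigrams are merged into single tokens before word2vec training, so the phrase gets its own vector.

```python
from gensim.models import Word2Vec
from gensim.models.phrases import Phrases, Phraser

# Toy corpus in which "los angeles" is a frequent collocation.
sentences = [["los", "angeles", "is", "a", "city"],
             ["i", "flew", "to", "los", "angeles"]] * 20

# Detect frequent bigrams and rewrite them as single pseudo-words.
bigrams = Phraser(Phrases(sentences, min_count=1, threshold=0.1))
merged = [bigrams[s] for s in sentences]   # "los angeles" -> "los_angeles"

model = Word2Vec(merged, vector_size=50, min_count=1)
print("los_angeles" in model.wv)           # True: the phrase has its own vector
```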

  28. Semantic Composition for Phrase Representation: v(neural) + v(network) → v(neural network)

  29. Semantic Composition for Phrase Representation: Heuristic Operations vs. the Tensor-Vector Model. Yu Zhao, et al. Phrase Type Sensitive Tensor Indexing Model for Semantic Composition. AAAI 2015.
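A sketch contrasting the heuristic operators with a generic bilinear (tensor) composition for a two-word phrase. The tensor form below is a simplified stand-in in the spirit of tensor-based models, not the exact AAAI 2015 formulation, and all parameters are random placeholders for learned ones.

```python
import numpy as np

d = 50
rng = np.random.default_rng(0)
u, v = rng.normal(size=d), rng.normal(size=d)   # word vectors (stand-ins)

add = u + v            # additive heuristic
mul = u * v            # element-wise multiplicative heuristic

# Generic tensor composition: p_i = sum_jk T[i, j, k] * u_j * v_k
T = rng.normal(size=(d, d, d)) * 0.01           # 3rd-order tensor (learned in practice)
p = np.einsum("ijk,j,k->i", T, u, v)
print(add.shape, mul.shape, p.shape)            # all (d,)
```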

  30. Semantic Composition for Phrase Representation: Model Parameters

  31. Visualization for Phrase Representation

  32. NLP Tasks (Tagging / Parsing / Understanding) built on a stack of representations: Unstructured Text → Word Representation → Sense Representation → Phrase Representation → Document Representation → Knowledge Representation

  33. Documents as Symbols for Document Representation (DR)

  34. Semantic Composition for DR: Convolutional Neural Networks (CNN)

  35. Semantic Composition for DR: Recurrent Neural Networks (RNN)

  36. Topic Model • Collapsed Gibbs sampling • Assigns each word in a document an approximate topic by repeatedly resampling each word's topic given all the other assignments (see the sketch below)
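A minimal collapsed Gibbs sampler for LDA, as a sketch (corpus, priors, and topic count are toy values): each sweep removes a word's current assignment and resamples it in proportion to (n_dk + α)(n_kw + β)/(n_k + Vβ).

```python
import numpy as np

rng = np.random.default_rng(0)
docs = [[0, 1, 2, 1], [2, 3, 3, 4], [0, 1, 4, 2]]   # word ids per document
V, K, alpha, beta = 5, 2, 0.1, 0.01                 # vocab size, topics, priors

# Random initialization and count tables.
z = [[int(rng.integers(K)) for _ in doc] for doc in docs]
ndk = np.zeros((len(docs), K)); nkw = np.zeros((K, V)); nk = np.zeros(K)
for d, doc in enumerate(docs):
    for i, w in enumerate(doc):
        ndk[d, z[d][i]] += 1; nkw[z[d][i], w] += 1; nk[z[d][i]] += 1

for _ in range(200):                                # Gibbs sweeps
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            k = z[d][i]                             # remove current assignment
            ndk[d, k] -= 1; nkw[k, w] -= 1; nk[k] -= 1
            # p(k) ∝ (n_dk + alpha) * (n_kw + beta) / (n_k + V * beta)
            p = (ndk[d] + alpha) * (nkw[:, w] + beta) / (nk + V * beta)
            k = int(rng.choice(K, p=p / p.sum()))   # resample this word's topic
            z[d][i] = k
            ndk[d, k] += 1; nkw[k, w] += 1; nk[k] += 1

print(z)  # approximate per-word topic assignments
```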

  37. Topical Word Representation. Yang Liu, et al. Topical Word Embeddings. AAAI 2015.
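A sketch of the simplest topical-word-embedding idea: tag each token with its topic assignment (e.g., from the Gibbs sampler above) and learn one vector per (word, topic) pseudo-word. The paper proposes more refined variants; this illustrates only the pseudo-word one, and the corpus and topic labels are toy values.

```python
from gensim.models import Word2Vec

docs   = [["apple", "banana", "fruit"], ["apple", "iphone", "company"]]
topics = [[0, 0, 0], [1, 1, 1]]   # per-token topic assignments (illustrative)

# Rewrite each token as word#topic so each sense-like usage gets its own vector.
tagged = [[f"{w}#{k}" for w, k in zip(doc, zs)] for doc, zs in zip(docs, topics)]
model = Word2Vec(tagged, vector_size=50, min_count=1)

print("apple#0" in model.wv, "apple#1" in model.wv)  # two vectors for "apple"
```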

  38. NLP Tasks (Tagging / Parsing / Understanding) built on a stack of representations: Unstructured Text → Word Representation → Sense Representation → Phrase Representation → Document Representation → Knowledge Representation

  39. Knowledge Bases and Knowledge Graphs • Knowledge is structured as a graph • Each node = an entity • Each edge = a relation • A triple (fact) = (head, relation, tail): head = subject entity, relation = relation type, tail = object entity • Typical knowledge bases • WordNet: linguistic KB • Freebase: world KB
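A minimal sketch of the triple view of a knowledge graph (the triples themselves are illustrative):

```python
# A knowledge graph as a set of (head, relation, tail) triples.
triples = [
    ("Beijing", "capital_of", "China"),
    ("Tokyo", "capital_of", "Japan"),
    ("WALL-E", "_has_genre", "Animation"),
]

entities = {e for h, _, t in triples for e in (h, t)}
relations = {r for _, r, _ in triples}
print(len(entities), len(relations))   # nodes and edge types of the graph
```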

  40. Research Issues • KGs are far from complete, so we need relation extraction • Relation extraction from text: information extraction • Relation extraction within a KG: knowledge graph completion • Issues: KGs are hard to manipulate • High dimensionality: 10^5~10^8 entities, 10^7~10^9 relational facts • Sparsity: few valid links • Noisy and incomplete • How: encode KGs into low-dimensional vector spaces

  41. Typical Models: the Neural Tensor Network (NTN), an energy-based model
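A sketch of the NTN score g(h, r, t) = u_r^T tanh(h^T W_r t + V_r [h; t] + b_r); all parameters below are random stand-ins for learned ones, and the dimensions are illustrative.

```python
import numpy as np

d, k = 20, 4                      # entity dimension, tensor slices
rng = np.random.default_rng(0)
h, t = rng.normal(size=d), rng.normal(size=d)   # entity vectors
W = rng.normal(size=(k, d, d))    # relation-specific tensor W_r^[1:k]
V = rng.normal(size=(k, 2 * d))   # standard feed-forward weights
b, u = rng.normal(size=k), rng.normal(size=k)

bilinear = np.einsum("i,kij,j->k", h, W, t)     # h^T W_r^[1:k] t
score = u @ np.tanh(bilinear + V @ np.concatenate([h, t]) + b)
print(score)                      # higher score = more plausible triple
```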

  42. TransE: Modeling Relations as Translations • For each (head, relation, tail), the relation works as a translation from head to tail

  43. TransE: Modeling Relations as Translations • For each (head, relation, tail), make h + r = t
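A sketch of the TransE score and its margin-based ranking loss (Bordes et al., NIPS 2013); the embeddings here are random stand-ins for vectors that training would learn by SGD, so that h + r lands close to t for valid triples.

```python
import numpy as np

d, gamma = 50, 1.0                # embedding dim, margin
rng = np.random.default_rng(0)
E = {e: rng.normal(size=d) for e in ["beijing", "china", "tokyo", "japan"]}
R = {"capital_of": rng.normal(size=d)}

def score(h, r, t):
    # Distance between translated head and tail; lower = more plausible.
    return np.linalg.norm(E[h] + R[r] - E[t])

# Margin loss: push a valid triple below a corrupted one by at least gamma.
pos = score("beijing", "capital_of", "china")
neg = score("beijing", "capital_of", "japan")   # corrupted tail
print(max(0.0, gamma + pos - neg))
```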

  44. Link Prediction Performance on FB15K (Freebase 15K):

  45. The Issue of TransE • It has difficulty modeling many-to-many relations: if h + r = t must hold for many different tails t, their embeddings are pushed toward the same point

  46. Modeling Entities/Relations in Different Spaces • Encode entities and relations in different spaces, and use a relation-specific matrix to project entities into the relation space. Yankai Lin, et al. Learning Entity and Relation Embeddings for Knowledge Graph Completion. AAAI 2015.

  47. Modeling Entities/Relations in Different Spaces • For each (head, relation, tail), make h W_r + r = t W_r, i.e., project head and tail into the relation space, then translate
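A sketch of the TransR score under the same convention: project head and tail into the relation-specific space before translating. All parameters are random stand-ins; W_r would be learned per relation.

```python
import numpy as np

d_e, d_r = 50, 30                  # entity space dim, relation space dim
rng = np.random.default_rng(0)
h, t = rng.normal(size=d_e), rng.normal(size=d_e)
r = rng.normal(size=d_r)
W_r = rng.normal(size=(d_e, d_r))  # relation-specific projection matrix

def transr_score(h, r, t, W_r):
    h_r, t_r = h @ W_r, t @ W_r    # project entities into relation space
    return np.linalg.norm(h_r + r - t_r)   # lower = more plausible

print(transr_score(h, r, t, W_r))
```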

  48. Cluster-based TransR (CTransR)

  49. Evaluation: Link Prediction. Which genre is the movie WALL-E? Predict the missing tail of (WALL-E, _has_genre, ?)

  50. Evaluation: Link Prediction. Which genre is the movie WALL-E? Top predicted tails for (WALL-E, _has_genre, ?): Animation, Computer animation, Comedy film, Adventure film, Science fiction, Fantasy, Stop motion, Satire, Drama
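A sketch of how such predictions are evaluated: score every candidate tail entity, sort, and record the rank of the true tail. The scorer can be any of the models above (e.g., the TransE `score` function from the earlier sketch); the toy scorer here is arbitrary. Metrics such as Mean Rank and Hits@10 on FB15K aggregate this rank over all test triples.

```python
def rank_of_true_tail(score, h, r, true_t, entities):
    # Lower score = more plausible, so sort ascending and find the true tail.
    ranked = sorted(entities, key=lambda t: score(h, r, t))
    return ranked.index(true_t) + 1

toy_score = lambda h, r, t: {"Animation": 0.1, "Drama": 0.7, "Satire": 0.9}[t]
print(rank_of_true_tail(toy_score, "WALL-E", "_has_genre", "Animation",
                        ["Animation", "Drama", "Satire"]))  # 1
```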
