
Presentation Transcript


  1. Attention (almost) from Scratch Paraphrase Identification using Attention Amir Hadifar Polytechnic University of Tehran

  2. Overview • What problem does attention solve? • What is Attention? • Some Applications

  3. [Recurrent Models of Visual Attention by Mnih et al. 2014] [Photo from videoblocks.com]

  4. [Diagram: an RNN unrolled over the sentence "I am a Sentence", one cell ("A") per word] [Cho et al. 2014; Sutskever et al. 2014; Goldberg 2017; Olah 2015] [RNN & attention diagrams derived from distill.pub]

  5. [Diagram: an RNN over "This is Good" produces the scores 1, 3, 1.2; a softmax turns them into the weights 0.10, 0.77, 0.13]
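The softmax step on this slide can be checked numerically; a minimal NumPy sketch (the `softmax` helper is our own illustration, not part of any library):

```python
import numpy as np

def softmax(scores):
    # Subtract the max before exponentiating for numerical stability.
    exp = np.exp(scores - np.max(scores))
    return exp / exp.sum()

# The alignment scores from the slide: 1, 3, 1.2
weights = softmax(np.array([1.0, 3.0, 1.2]))
# weights ≈ 0.10, 0.77, 0.13, matching the slide; they sum to 1,
# so they can be used to take a weighted average of the RNN states.
```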

  6. [Diagram: a sequence-to-sequence encoder-decoder, with the encoder's final state as the only link to the decoder] [Cho et al. 2014; Sutskever et al. 2014; Goldberg 2017; Olah 2015]

  7. [Diagram: an encoder ("A" cells) reads "Hello World" and a decoder ("B" cells) emits the Persian translation سلام دنیا ("Hello World")]

  8. [Photo from youtube.com]

  9. [Diagram: the same "Hello World" → سلام دنیا encoder-decoder, now with attention] [Bahdanau et al. 2015]

  10. [Diagram: at each decoder step ("B" cells), attention computes a weighted combination of all encoder states ("A" cells)] [Bahdanau et al. 2015]
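A minimal NumPy sketch of this additive (Bahdanau-style) attention step. The scoring function e_t = v_a^T tanh(W_a s + U_a h_t) follows the paper's notation; the dimensions and random values are illustrative stand-ins for a trained model:

```python
import numpy as np

rng = np.random.default_rng(0)
d, T = 4, 3                      # hidden size, number of encoder steps
H = rng.normal(size=(T, d))      # encoder hidden states h_1..h_T
s = rng.normal(size=(d,))        # current decoder state

# Attention parameters (trained jointly with the rest of the model).
W_a = rng.normal(size=(d, d))
U_a = rng.normal(size=(d, d))
v_a = rng.normal(size=(d,))

# Additive score per encoder step: e_t = v_a^T tanh(W_a s + U_a h_t)
e = np.tanh(s @ W_a + H @ U_a) @ v_a
alpha = np.exp(e - e.max()); alpha /= alpha.sum()  # softmax weights
context = alpha @ H              # weighted sum of encoder states
```

The `context` vector is then fed into the decoder alongside its previous state, so every output step can look back at all inputs instead of squeezing the whole sentence through one vector.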

  11. [Diagram: decoder ("B" cells) attending over encoder states ("A" cells)] [Luong et al. 2015]

  12. [Diagram: decoder attending over encoder states] [Luong et al. 2015]
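Luong et al.'s simplest scoring variant drops the extra additive network and scores each encoder state by a plain dot product with the decoder state; a sketch under the same illustrative random setup as before:

```python
import numpy as np

rng = np.random.default_rng(1)
H = rng.normal(size=(3, 4))      # encoder hidden states
s = rng.normal(size=(4,))        # decoder state

# Luong "dot" score: e_t = s^T h_t, no attention parameters at all.
e = H @ s
alpha = np.exp(e - e.max()); alpha /= alpha.sum()
context = alpha @ H
```

The paper also describes "general" (a learned bilinear matrix between s and h_t) and "concat" variants; the dot product is the cheapest and often works well when encoder and decoder share a hidden size.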

  13. [Diagram: decoder attending over encoder states]

  14. [Bahdanau et al. 2015]

  15. [Bahdanau et al. 2015]

  16. Premise: دیروز باران آمد ("It rained yesterday") Hypothesis: هوا ابری بود ("The weather was cloudy") → Entails / Contradicts / Neither [Images are from blog.fastforwardlabs.com; KDnuggets.com]

  17. [Diagram: word encoder ("A" cells) with word-level attention] [Li et al. 2015; Yang et al. 2016]

  18. [Diagram: word encoder with word-level attention producing a sentence vector] [Li et al. 2015; Yang et al. 2016]

  19. [Diagram: sentence encoder ("B" cells) with sentence-level attention, followed by a softmax classifier] [Li et al. 2015; Yang et al. 2016]
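The word-level attention in Yang et al. scores each word against a learned "informativeness" context vector u_w; the same recipe is then repeated over sentence vectors. A NumPy sketch of one level (parameter names follow the paper, values are random stand-ins for trained weights):

```python
import numpy as np

rng = np.random.default_rng(2)
T, d = 5, 8
H = rng.normal(size=(T, d))        # word-encoder outputs for one sentence

# Attention parameters: a one-layer projection plus context vector u_w.
W_w = rng.normal(size=(d, d))
b_w = rng.normal(size=(d,))
u_w = rng.normal(size=(d,))

u = np.tanh(H @ W_w + b_w)         # hidden representation of each word
e = u @ u_w                        # similarity to the context vector
alpha = np.exp(e - e.max()); alpha /= alpha.sum()
sentence_vec = alpha @ H           # sentence vector = weighted word states
```

Stacking the same block over sentence vectors yields the document vector that the final softmax classifies.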

  20. [Hierarchical Attention Networks for Document Classification by Yang et al. 2016]

  21.-24. [Diagram, built up over four slides: RNN encoders ("A" cells) reading the Persian question pair کم‌کردن چربی‌های بدن ("reducing body fat") and کاهش مقدار BMI ("lowering BMI")]

  25. [Diagram: a second RNN layer ("B" cells) over the padded question pair کم‌کردن چربی‌های بدن / کاهش مقدار BMI, followed by pooling each question into a fixed-size vector; 0 marks padding positions]

  26. [Diagram: an attention layer over both encoded questions ("B" cells), then a feed-forward network outputting Duplicate: Yes/No]
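Putting the pieces together, a toy sketch of the pipeline these slides describe: shared encoding, cross-attention between the two questions, pooling, and a feed-forward decision. The shapes, the max-pooling choice, and the single sigmoid output unit are simplifying assumptions for illustration, not the talk's exact architecture:

```python
import numpy as np

rng = np.random.default_rng(3)
d_in, d_h = 6, 8
W_enc = rng.normal(size=(d_in, d_h))  # stand-in for a shared trained encoder

Q1 = rng.normal(size=(4, d_in))       # word vectors, question 1
Q2 = rng.normal(size=(5, d_in))       # word vectors, question 2
H1, H2 = np.tanh(Q1 @ W_enc), np.tanh(Q2 @ W_enc)

# Cross-attention: every word of question 1 attends over question 2.
E = H1 @ H2.T
A = np.exp(E - E.max(axis=1, keepdims=True))
A /= A.sum(axis=1, keepdims=True)     # one softmax per Q1 word
H1_aligned = A @ H2                   # Q2 summarized from Q1's viewpoint

# Pooling collapses the variable-length questions to fixed vectors.
v1 = np.concatenate([H1.max(axis=0), H1_aligned.max(axis=0)])
v2 = H2.max(axis=0)

# A tiny feed-forward "Duplicate: Yes/No" head (one sigmoid unit here).
feats = np.concatenate([v1, v2])
w = rng.normal(size=feats.shape)
p_duplicate = 1.0 / (1.0 + np.exp(-feats @ w))
```

A real model would attend in both directions, learn all weights end to end, and use a deeper classification head, but the data flow is the one on the slide.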

  27. Performance for paraphrase identification on the Quora dataset [Rows 2 to 8 are taken from Tomar et al. 2017]

  28. According to Quora: the ground-truth labels contain some amount of noise. Error analysis for paraphrase identification on the Quora dataset [www.data.quora.com/First-Quora-Dataset-Release-Question-Pairs]

  29. Last words • Still much to explore • Skip connections • Other family members • Neural Turing Machines • Adaptive Computation Time [distill.pub/2016/augmented-rnns/]

  30. References • Y. Goldberg. (2017). Neural Network Methods for Natural Language Processing • D. Bahdanau, K. Cho, and Y. Bengio. (2015). Neural Machine Translation by Jointly Learning to Align and Translate • Z. Yang, D. Yang, C. Dyer, X. He, A. Smola, and E. Hovy. (2016). Hierarchical Attention Networks for Document Classification • I. Sutskever, O. Vinyals, and Q. Le. (2014). Sequence to Sequence Learning with Neural Networks • K. Cho, B. van Merrienboer, C. Gulcehre, F. Bougares, H. Schwenk, and Y. Bengio. (2014). Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation • Chris Olah. (2015). colah.github.io/posts/2015-08-Understanding-LSTMs • M. Luong, H. Pham, and C. Manning. (2015). Effective Approaches to Attention-based Neural Machine Translation • J. Li, M. Luong, and D. Jurafsky. (2015). A Hierarchical Neural Autoencoder for Paragraphs and Documents • Z. Wang, W. Hamza, and R. Florian. (2017). Bilateral Multi-Perspective Matching for Natural Language Sentences • G. Tomar, T. Duque, O. Täckström, J. Uszkoreit, and D. Das. (2017). Neural Paraphrase Identification of Questions with Noisy Pretraining

  31. Any Questions? Thanks for your Attention [Diagram: RNN "A" cells]
