NEVER-ENDING LANGUAGE LEARNER

NEVER-ENDING LANGUAGE LEARNER. Students: Nguyễn Hữu Thành, Phạm Xuân Khoái, Vũ Mạnh Cầm. Instructor: Lê Hồng Phương, PhD. Hà Nội, January 11, 2014. Idea: build a structured KB (knowledge base). What is a KB? Categories: cities, companies, sports teams, … Relations: hasOfficeIn(organisation, location)


Presentation Transcript


  1. NEVER-ENDING LANGUAGE LEARNER. Students: Nguyễn Hữu Thành, Phạm Xuân Khoái, Vũ Mạnh Cầm. Instructor: Lê Hồng Phương, PhD. Hà Nội, January 11, 2014

  2. Idea: Build a structured KB. • What is a KB (knowledge base)? • Categories: cities, companies, sports teams, … • Relations: hasOfficeIn(organisation, location) • Noun phrases • What is a structured KB?

  3. Idea: Structured Knowledge Base [Figure: fragment of a knowledge graph. Entities such as Toronto, Canada, Detroit, Maple Leafs, Red Wings, NHL, Stanley Cup, Toyota, Prius, and Globe and Mail are linked by typed relations such as "uses equipment", "plays in", "home town", "won", "acquired", "created", "competes with", "member", and "economic sector". Node types include city, country, company, team, stadium, league, hospital, politician, airport, radio, and paper.]

  4. Ideas: using Machine Learning • Machine learning: a branch of artificial intelligence concerned with the construction and study of systems that can learn from data.

  5. Ideas [Diagram labels: Initial ontology, Seed examples, Web, Human trainers, NELL, Knowledge Base (KB)]

  6. Ideas: the task • Run 24x7, forever • Each day: 1. Reading task: extract more facts from the web to populate the initial ontology. 2. Learning task: learn to read (perform task 1) better than yesterday.

  7. NELL Architecture [Diagram labels: Data Resources, Subsystem Components (CPL, CSEAL, CMC, RL), Candidate facts, Knowledge Integrator, Beliefs, Knowledge Base; steps numbered 1 to 3]

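  8. Coupled Pattern Learner (CPL) • Learns to extract category and relation instances and patterns from unstructured text. • Learns contextual patterns that act as high-precision extractors for each predicate. • E.g.: + "Trang An la ten mot co gai." (Trang An is the name of a girl.) + "Trang An la ten mot cong ty." (Trang An is the name of a company.) → The same noun phrase can fit several predicates; coupling between predicates is used to keep precision high.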

  9. Input/Output • Input: + A large text corpus + An initial ontology containing the seed information. • Output: + Proposed instances and contextual patterns for each predicate.

  10. Input: An ontology O, and a text corpus C
      Output: Trusted instances/patterns for each predicate
      for i = 1, 2, ..., ∞ do
        foreach predicate p in O do
          EXTRACT candidate instances/contextual patterns using recently promoted patterns/instances;
          FILTER candidates that violate coupling;
          RANK candidate instances/patterns;
          PROMOTE top candidates;
        end
      end
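
The EXTRACT / FILTER / RANK / PROMOTE loop on slide 10 can be sketched on a toy corpus. This is only an illustration under stated assumptions, not NELL's implementation: the corpus of (noun phrase, context) pairs, the names `cpl` and `_step`, the `top_k` cutoff, and ranking by co-occurrence count are all hypothetical, and the loop runs a fixed number of iterations instead of forever.

```python
from collections import Counter

# Hypothetical toy corpus: (noun phrase, context pattern) co-occurrences.
CORPUS = [
    ("Hanoi", "mayor of _"), ("Toronto", "mayor of _"),
    ("Hanoi", "flights to _"), ("Toronto", "flights to _"),
    ("Samsung", "CEO of _"), ("Nokia", "CEO of _"),
    ("Samsung", "products of _"), ("Nokia", "products of _"),
    ("Toronto", "based in _"), ("Nokia", "based in _"),
]
# Coupling constraint (assumed): city and company are mutually exclusive.
MUTEX = {"city": {"company"}, "company": {"city"}}


def _step(corpus, trusted, known, mutex, top_k, src, dst):
    """One EXTRACT/FILTER/RANK/PROMOTE pass for every predicate.

    trusted[p] holds promoted items found at tuple index `src`;
    known[p] holds promoted items found at tuple index `dst`.
    Returns an updated copy of `known`.
    """
    pairs = set(corpus)
    updated = {p: set(v) for p, v in known.items()}
    for p in known:
        rival_trusted, rival_known = set(), set()
        for q in mutex.get(p, ()):
            rival_trusted |= trusted[q]
            rival_known |= known[q]
        # EXTRACT: candidates that co-occur with trusted items of p.
        support = Counter(pair[dst] for pair in pairs if pair[src] in trusted[p])
        # FILTER: drop candidates that also co-occur with a rival predicate.
        blocked = {pair[dst] for pair in pairs if pair[src] in rival_trusted}
        candidates = [c for c in support
                      if c not in blocked and c not in rival_known
                      and c not in known[p]]
        # RANK by support (ties broken alphabetically), then PROMOTE the top few.
        candidates.sort(key=lambda c: (-support[c], c))
        updated[p] |= set(candidates[:top_k])
    return updated


def cpl(seeds, corpus, mutex, iterations=2, top_k=2):
    instances = {p: set(s) for p, s in seeds.items()}
    patterns = {p: set() for p in seeds}
    for _ in range(iterations):
        # Promoted instances propose patterns, then patterns propose instances.
        patterns = _step(corpus, instances, patterns, mutex, top_k, src=0, dst=1)
        instances = _step(corpus, patterns, instances, mutex, top_k, src=1, dst=0)
    return instances, patterns


instances, patterns = cpl({"city": {"Hanoi"}, "company": {"Samsung"}}, CORPUS, MUTEX)
print(instances)
print(patterns)
```

In this toy run the context "based in _" occurs with both a city and a company, so the coupling filter keeps it from being promoted for either predicate.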

  11. Example: "Samsung vừa tung clip chế nhạo sản phẩm mới của Nokia." (Samsung has just released a clip mocking Nokia's new product.)

  12. Coupled SEAL (CSEAL) [Diagram labels: Beliefs, Internet, CSEAL, New candidate facts]

  13. Coupled SEAL • SEAL (Set Expander for Any Language): expands entities automatically by utilizing resources from the Web • CSEAL adds mutual-exclusion and type-checking constraints

  14. Coupled SEAL • Coupled SEAL: a semi-structured extractor • Queries the internet with sets of beliefs from each category or relation; mines lists and tables for instances • Uses mutual-exclusion relationships to provide negative examples for filtering overly general lists and tables • Issues 5 queries per category and 10 queries per relation, fetching 50 web pages per query • Probabilities are assigned as in CPL
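
The mutual-exclusion filter on slide 14 can be illustrated with a small sketch. It assumes the lists have already been mined from web pages (no real queries are issued); the function name `filter_lists`, the `min_seeds` threshold, and the sample data are hypothetical and only meant to show how beliefs from a mutually exclusive category act as negative examples.

```python
def filter_lists(category, candidate_lists, beliefs, mutex, min_seeds=2):
    """Propose new instances of `category` from already-extracted lists.

    A list is used only if it contains at least `min_seeds` current beliefs
    of the category and no belief of a mutually exclusive category.
    """
    positives = beliefs[category]
    negatives = set()
    for other in mutex.get(category, ()):
        negatives |= beliefs[other]

    proposed = set()
    for items in candidate_lists:
        items = set(items)
        if len(items & positives) < min_seeds:
            continue  # too little overlap with what is already believed
        if items & negatives:
            continue  # overly general list: mixes in a rival category
        proposed |= items - positives
    return proposed


# Hypothetical beliefs and mined lists.
beliefs = {"city": {"Toronto", "Detroit", "Hanoi"}, "company": {"Toyota", "GM"}}
mutex = {"city": {"company"}, "company": {"city"}}
lists = [
    ["Toronto", "Detroit", "Ottawa"],            # clean city list
    ["Toronto", "Hanoi", "Toyota", "Skydome"],   # contains a company: rejected
    ["Toronto", "Paris"],                        # only one known city: skipped
]
new_cities = filter_lists("city", lists, beliefs, mutex)
print(new_cities)
```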

  15. Coupled SEAL • Example: [figure not preserved in the transcript]

  16. Coupled Morphological Classifier (CMC) [Diagram labels: KB, Data Resources, CMC, New candidate facts] • CMC classifies noun phrases (NPs) based on various morphological features (words, capitalization, affixes).

  17. Coupled Morphological Classifier • Ex1: Bach Mai hotel → hotel(Bach Mai) • Ex2: Mai → person(Mai) • Ex3: tradition → noun(tradition)

  18. Coupled Morphological Classifier • Beliefs from the KB are used as training instances • CMC examines candidate facts proposed by other components and classifies up to 30 new beliefs per predicate in each iteration
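
The kind of morphological features mentioned on slide 16 (words, capitalization, affixes) can be sketched as follows. The feature names and the 3-character affix length are assumptions made for illustration; the slides do not specify the exact feature set or the classifier trained on it.

```python
def morph_features(noun_phrase):
    """Return a set of simple morphological features for a noun phrase."""
    words = noun_phrase.split()
    if not words:
        return set()
    feats = {"word=" + w.lower() for w in words}
    # Capitalization: is every word capitalized?
    feats.add("capitalized=" + str(all(w[0].isupper() for w in words)))
    # Affixes: first 3 characters of the first word, last 3 of the last word.
    feats.add("prefix3=" + words[0][:3].lower())
    feats.add("suffix3=" + words[-1][-3:].lower())
    return feats


for np in ["Bach Mai hotel", "Mai", "tradition"]:
    print(np, sorted(morph_features(np)))
```

Features like `word=hotel` are the sort of evidence that would let a classifier trained on KB beliefs label "Bach Mai hotel" as a hotel.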

  19. Rule Learner (RL) [Diagram labels: Beliefs, Candidate facts, RL, New candidate facts] • RL takes the categories and relations in the KB as input and infers new relations for the KB.

  20. Rule Learner • Example 1: playSport(Rooney, football) → athlete(Rooney), sport(football) • Example 2: isCapital(Hanoi, Vietnam), liveIn(Thanh, Hanoi), roommate(Thanh, Khoai), roommate(Khoai, Cam) → liveIn(Thanh, Vietnam), roommate(Thanh, Cam), liveIn(Khoai, Hanoi), …
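
Example 2 on slide 20 can be reproduced with a few hand-written rules applied until no new facts appear. This is a sketch only: the three rules below are assumptions chosen to match the example, whereas the Rule Learner is meant to learn such rules from the KB. The function name `apply_rules` is hypothetical.

```python
FACTS = {
    ("isCapital", "Hanoi", "Vietnam"),
    ("liveIn", "Thanh", "Hanoi"),
    ("roommate", "Thanh", "Khoai"),
    ("roommate", "Khoai", "Cam"),
}


def apply_rules(facts):
    """Forward-chain three assumed rules until a fixed point is reached."""
    derived = set(facts)
    while True:
        new = set()
        for (r1, a, b) in derived:
            for (r2, c, d) in derived:
                # liveIn(x, city) and isCapital(city, country) -> liveIn(x, country)
                if r1 == "liveIn" and r2 == "isCapital" and b == c:
                    new.add(("liveIn", a, d))
                # roommate(x, y) and roommate(y, z) -> roommate(x, z), x != z
                if r1 == "roommate" and r2 == "roommate" and b == c and a != d:
                    new.add(("roommate", a, d))
                # roommate(x, y) and liveIn(x, place) -> liveIn(y, place)
                if r1 == "roommate" and r2 == "liveIn" and a == c:
                    new.add(("liveIn", b, d))
        if new <= derived:
            return derived
        derived |= new


inferred = apply_rules(FACTS) - FACTS
for fact in sorted(inferred):
    print(fact)
```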

  21. Rule Learner • Some kinds of Rule Learner Systems: OneR, Ridor, PART, JRip, ConjunctiveRule. • Clip: https://www.youtube.com/watch?v=5On-tDeu2ic

  22. Initial result • Running 24x7 since January 12, 2010 • Inputs: • ontology defining >600 categories and relations • 10-20 seed examples of each • 100,000 web search queries per day • ~5 minutes/day of human guidance • Result: • KB with >15 million candidate beliefs, growing daily • learning to reason, as well as read • automatically extending its ontology

  23. Initial result • Demo: • http://rtw.ml.cmu.edu/rtw/kbbrowser/beverage:beer

  24. References • NELL article: http://www.cs.cmu.edu/~acarlson/papers/carlson-aaai10.pdf • http://rtw.ml.cmu.edu/rtw/kbbrowser/beverage:beer • http://videolectures.net/akbcwekex2012_mitchell_language_learning/ • Tom Mitchell’s seminar: http://www.youtube.com/watch?v=51q2IajH94A • RL: http://mydatamining.wordpress.com/2008/04/14/rule-learner-or-rule-induction/
