
Named Entity Recognition in Tweets: TwitterNLP

Twitter NLP. Named Entity Recognition in Tweets: TwitterNLP. Ludymila Lobo.


Presentation Transcript


  1. Twitter NLP. Named Entity Recognition in Tweets: TwitterNLP. Ludymila Lobo

  2. Resources • Reading material • Ritter, Alan; Clark, Sam; Mausam; Etzioni, Oren. Named Entity Recognition in Tweets. Available from the Association for Computational Linguistics website at https://aclweb.org/anthology/D/D11/D11-1141.pdf • http://www.academia.edu/1128304/Shallow_parsing_as_part-of-speech_tagging • Twitter NLP Tool • https://github.com/aritter/twitter_nlp • Application with Twitter NLP • statuscalendar.com • Collecting Tweets • https://dev.twitter.com • http://www.webdevdoor.com/jquery/twitter-feed-authentication-search/ • https://github.com/abraham/twitteroauth • http://sourceforge.net/projects/xampp/

  3. Why Twitter? • Large amount of data (even more than the Library of Congress in Washington, D.C., with its 151 million items)* • Real-time information, sometimes more up to date than news articles • http://pt.wikipedia.org/wiki/Library_of_Congress *Hachman (2011)

  4. Challenges • Noisy and informal text • Wide diversity of entity types (companies, products, bands, teams, movies, etc.), most of which are relatively infrequent, so a sample of tweets contains only a few examples of each • Lack of context • http://twitter.com

  5. Tool • https://github.com/aritter/twitter_nlp • Unzip the file, then type in a Linux terminal: • sh build.sh

  6. Tool • statuscalendar.com

  7. How it works • POS (Part of Speech) -> NLP, clustering. Example entries (word: tags): best: ADJ ADV NP V • better: ADJ ADV V DET • close: ADV ADJ V N • cut: V N VN VD • even: ADV DET ADJ V • grant: NP N V • hit: V VD VN N DET • Chunking (shallow parsing). Example: @paulwalk/o It/b-np 's/b-vp the/b-np view/i-np from/b-pp where/b-advp I/b-np 'm/b-vp living/i-vp for/b-pp two/b-np weeks/i-np

  8. How it works • POS (Part of Speech) -> NLP, clustering • Capitalization classifier: predicts whether or not a tweet is informatively capitalized (using SVM learning) • NER (Named Entity Recognition) • Chunking (shallow parsing) • Example: "Tom Hanks was awesome in Forrest Gump", where "Tom Hanks" is labeled actor and "Forrest Gump" is labeled movie
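To make the capitalization classifier more concrete, here is a minimal Python sketch of the kind of tweet-level features such a classifier could be trained on. The feature names and the function `capitalization_features` are hypothetical and are not taken from the slides or from the twitter_nlp code; the SVM training step itself is omitted.

```python
def capitalization_features(tweet: str) -> dict:
    """Compute simple, illustrative capitalization statistics for one tweet.

    These features are assumptions made for illustration only; they are
    not the feature set used by the twitter_nlp tool.
    """
    # Keep only whitespace-separated tokens that contain at least one letter.
    tokens = [t for t in tweet.split() if any(c.isalpha() for c in t)]
    if not tokens:
        return {
            "frac_initial_cap": 0.0,
            "frac_all_caps": 0.0,
            "lowercase_i_count": 0,
            "first_token_capitalized": False,
        }
    n = len(tokens)
    # Tokens whose first character is an uppercase letter.
    initial_cap = sum(t[0].isupper() for t in tokens)
    # Tokens longer than one character written entirely in uppercase.
    all_caps = sum(len(t) > 1 and t.isupper() for t in tokens)
    # A lowercase pronoun "i" suggests the author ignores capitalization.
    lowercase_i = sum(t == "i" for t in tokens)
    return {
        "frac_initial_cap": initial_cap / n,
        "frac_all_caps": all_caps / n,
        "lowercase_i_count": lowercase_i,
        "first_token_capitalized": tokens[0][0].isupper(),
    }


if __name__ == "__main__":
    print(capitalization_features("Tom Hanks was awesome in Forrest Gump"))
    print(capitalization_features("TOM HANKS WAS AWESOME"))
```

A tweet written entirely in uppercase or entirely in lowercase gives extreme values for these fractions, which is the kind of signal a classifier could use to decide that capitalization is not informative.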

  9. Tool • Input: @cityofcalgary: Free swimming and golf tomorrow for @cbc Sports Day in Canada #yyc #sportsday http://ow.ly/2G4sf • Output: @cityofcalgary/O :/O Free/O swimming/O and/O golf/O tomorrow/O for/O @cbc/O Sports/B-other Day/I-other in/O Canada/B-geo-loc #yyc/O #sportsday/O http://ow.ly/2G4sf/O • Input: Adam Beyer: Swedish Techno Pioneer: When it comes to his own DJing and sound, he's slightly more diverse and likes... • Output: Adam/B-person Beyer/I-person :/O Swedish/O Techno/O Pioneer/O :/O When/O it/O comes/O to/O his/O own/O DJing/O and/O sound/O ,/O he/O 's/O slightly/O more/O diverse/O and/O likes/O
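The tagged output on this slide pairs each token with a B-/I-/O label (B- starts an entity, I- continues it, O is outside any entity). As a rough sketch, the Python below groups such output into entities. The function names `parse_tagged` and `extract_entities` are hypothetical helpers written for this example and are not part of twitter_nlp; the code assumes every whitespace-separated item has the form token/TAG.

```python
def parse_tagged(line):
    """Split 'token/TAG' items into (token, tag) pairs.

    rsplit on the last slash keeps slashes that belong to the token,
    for example inside URLs.
    """
    pairs = []
    for item in line.split():
        token, tag = item.rsplit("/", 1)
        pairs.append((token, tag))
    return pairs


def extract_entities(pairs):
    """Group B-/I- tagged tokens into (entity text, entity type) tuples."""
    entities = []
    current_tokens, current_type = [], None
    for token, tag in pairs:
        if tag.startswith("B-"):
            # A new entity starts; close the previous one if it is open.
            if current_tokens:
                entities.append((" ".join(current_tokens), current_type))
            current_tokens, current_type = [token], tag[2:]
        elif tag.startswith("I-") and current_tokens and tag[2:] == current_type:
            # Continuation of the entity that is currently open.
            current_tokens.append(token)
        else:
            # "O" or an inconsistent I- tag: close any open entity.
            if current_tokens:
                entities.append((" ".join(current_tokens), current_type))
            current_tokens, current_type = [], None
    if current_tokens:
        entities.append((" ".join(current_tokens), current_type))
    return entities


if __name__ == "__main__":
    tagged = (
        "@cityofcalgary/O :/O Free/O swimming/O and/O golf/O tomorrow/O "
        "for/O @cbc/O Sports/B-other Day/I-other in/O Canada/B-geo-loc "
        "#yyc/O #sportsday/O http://ow.ly/2G4sf/O"
    )
    print(extract_entities(parse_tagged(tagged)))
```

For the first example on the slide, this yields the entities "Sports Day" (other) and "Canada" (geo-loc).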

  10. How to retrieve data from Twitter? • https://dev.twitter.com

  11. <?php
      session_start();
      require_once("twitteroauth/twitteroauth/twitteroauth.php"); // Path to the twitteroauth library

      $search = "wpi OR #WPI"; // Search query
      $notweets = 50;          // Number of tweets to request

      // Placeholder credentials: replace with your own application's values
      $consumerkey = "123456";
      $consumersecret = "123456";
      $accesstoken = "123456";
      $accesstokensecret = "123456";

      // Create an authenticated connection to the Twitter API
      function getConnectionWithAccessToken($cons_key, $cons_secret, $oauth_token, $oauth_token_secret) {
          $connection = new TwitterOAuth($cons_key, $cons_secret, $oauth_token, $oauth_token_secret);
          return $connection;
      }

      $connection = getConnectionWithAccessToken($consumerkey, $consumersecret, $accesstoken, $accesstokensecret);

      // Encode "#" so that hashtags survive in the query string
      $search = str_replace("#", "%23", $search);
      $tweets = $connection->get("https://api.twitter.com/1.1/search/tweets.json?q=".$search."&count=".$notweets);

      // Print the search results as JSON
      echo json_encode($tweets);
      ?>
      http://www.webdevdoor.com/jquery/twitter-feed-authentication-search/
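The script on this slide encodes only the "#" character by hand, while other characters in a query, such as spaces, are also reserved in URLs. As a hedged illustration (not part of the original slides), the Python sketch below builds the same search URL with full percent-encoding using the standard library. The helper name `build_search_url` is hypothetical, and the sketch only constructs the URL: a real request would still need OAuth signing, which the twitteroauth library handles in the PHP version.

```python
from urllib.parse import quote, urlencode

# Endpoint taken from the slide's PHP script.
SEARCH_URL = "https://api.twitter.com/1.1/search/tweets.json"


def build_search_url(query: str, count: int) -> str:
    """Return the search URL with every reserved character percent-encoded."""
    params = {"q": query, "count": count}
    # quote_via=quote encodes spaces as %20 instead of "+".
    return SEARCH_URL + "?" + urlencode(params, quote_via=quote)


if __name__ == "__main__":
    print(build_search_url("wpi OR #WPI", 50))
```

With the query from the slide, the spaces become %20 and the "#" becomes %23, matching what the manual str_replace does for hashtags.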

  12. How to retrieve data from Twitter? • Authentication library: https://github.com/abraham/twitteroauth • Download it and place it in the same folder as the code

  13. How to retrieve data from Twitter? • Install XAMPP: http://sourceforge.net/projects/xampp/

  14. How to retrieve data from Twitter? • Copy the project folder to C:\xampp\htdocs

  15. How to retrieve data from Twitter? • Open http://localhost/TwitterStreams/tweet.php in a browser
