Instagram #Hashtag Sentiment Analysis

Presentation Transcript

  1. Instagram #Hashtag Sentiment Analysis
     Jared Plumb, Nipun Gunawardena, Nan Xiao, Hao Zhang

  2. Can we successfully predict the sentiment of an Instagram hashtag?

  3. Instagram Overview
     • Instagram: photo-sharing social network
     • Each post can contain a caption and hashtags
     • Hashtags group posts into categories
     • Hashtags express extra information/emotion about the post

  4. Hashtag Overview
     • Composed of words, phrases, and acronyms
     • Often contain misspellings, made-up words, and slang/vernacular
     Examples: #love #myfriendsarehotterthanyourfriends #likeabos #ugly #depresstion #selfharmmm

  5. Sentiment Analysis Overview
     • Natural language processing method used to identify sentiment within text
     • Early work by Pang [1] analyzed movie reviews from IMDB
     • Later work by Davidov [2] analyzed the sentiment of Twitter posts using hashtags and smileys

  6. Our Process
     • Analyze the sentiment of individual Instagram hashtags using a Naïve Bayes classifier
     • Naïve Bayes:
       • Popularized by early spam detection
       • Simple but powerful
       • While other methods (e.g., SVM) may often work better, Naïve Bayes is often used as a baseline

  7. Naïve Bayes Classifier
     • Assume independence and generalize
     • Convert to algorithm
     [Slide equations not preserved in the transcript]
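The equations on this slide did not survive in the transcript. As a hedged reconstruction, the standard textbook form of the two steps named above ("assume independence and generalize", then "convert to algorithm") is given below. The symbols are assumptions and not necessarily the slide's notation: c is a sentiment class (positive or negative) and x_1, ..., x_n are the features of a hashtag, such as its n-grams.

```latex
\begin{align*}
% Bayes' rule for a class c given features x_1, ..., x_n
P(c \mid x_1, \dots, x_n) &= \frac{P(c)\, P(x_1, \dots, x_n \mid c)}{P(x_1, \dots, x_n)} \\
% "Naive" step: treat the features as conditionally independent given c
P(x_1, \dots, x_n \mid c) &\approx \prod_{i=1}^{n} P(x_i \mid c) \\
% Decision rule: the denominator is the same for every class, so it drops out
\hat{c} &= \arg\max_{c} \; P(c) \prod_{i=1}^{n} P(x_i \mid c)
\end{align*}
```

In practice the product is usually evaluated as a sum of logarithms to avoid floating-point underflow.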

  8. Training Data
     • Pang and Lee’s movie reviews
       • Approximately 1400 evenly split positive and negative movie reviews from IMDB
       • Only used a subset of this data
     • Hu and Liu positive/negative word lists
       • Approximately 6800 unevenly split popular positive and negative words from the English language
       • Includes common online misspellings
     • Positive/negative hashtags we mined
       • Trained randomly on 20% of this data
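The slide does not say how the random 20% was drawn, so the following is only a minimal sketch of one common approach: shuffle the labeled examples and cut off the first fifth. The function name `sample_fraction`, the fixed seed, and returning the remaining 80% as a held-out set are all assumptions made for illustration.

```python
import random


def sample_fraction(examples, fraction=0.2, seed=0):
    """Split examples into a random sample of about `fraction` and the remainder.

    Hypothetical helper: the slides do not describe the sampling procedure.
    """
    rng = random.Random(seed)      # fixed seed so the split is repeatable
    shuffled = list(examples)      # copy, so the caller's data is not reordered
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * fraction)
    return shuffled[:cut], shuffled[cut:]


if __name__ == "__main__":
    # Placeholder hashtags, not the authors' mined data.
    hashtags = [("#tag%d" % i, "pos" if i % 2 else "neg") for i in range(100)]
    train, held_out = sample_fraction(hashtags)
    print(len(train), len(held_out))
```

A simple shuffle does not guarantee that positive and negative labels stay balanced in the sample; a stratified split would be needed for that.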

  9. N-grams
     • N contiguous “units” of language
     • Words are typically used as units in sentiment analysis
       2gram(“I like turtles”) = [“I like”, “like turtles”]
     • We used characters as units
     • Through experiment, we found 3-grams or 4-grams worked best
     • Removed non-alphanumeric characters
       4gram(“Love it”) = [“Love”, “ovei”, “veit”]
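The character n-gram step, combined with a Naïve Bayes classifier, can be sketched roughly as follows. This is not the authors' code: the names `char_ngrams` and `NaiveBayes`, the lowercasing inside the classifier, and the add-one (Laplace) smoothing are assumptions made to keep the example self-contained.

```python
import math
import re
from collections import Counter, defaultdict


def char_ngrams(text, n=4):
    """Character n-grams of text after removing non-alphanumeric characters."""
    cleaned = re.sub(r"[^A-Za-z0-9]", "", text)
    return [cleaned[i:i + n] for i in range(len(cleaned) - n + 1)]


class NaiveBayes:
    """Multinomial Naive Bayes over character n-grams (illustrative sketch)."""

    def __init__(self, n=4):
        self.n = n
        self.class_counts = Counter()            # label -> number of training items
        self.gram_counts = defaultdict(Counter)  # label -> n-gram -> count
        self.vocab = set()                       # every n-gram seen in training

    def train(self, examples):
        """examples: iterable of (text, label) pairs."""
        for text, label in examples:
            self.class_counts[label] += 1
            for gram in char_ngrams(text.lower(), self.n):  # lowercasing is an assumption
                self.gram_counts[label][gram] += 1
                self.vocab.add(gram)

    def predict(self, text):
        """Return the label with the highest log-probability score."""
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            score = math.log(count / total)  # log prior
            # Add-one smoothing (an assumption) keeps unseen n-grams from zeroing the score.
            denom = sum(self.gram_counts[label].values()) + len(self.vocab)
            for gram in char_ngrams(text.lower(), self.n):
                score += math.log((self.gram_counts[label][gram] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


if __name__ == "__main__":
    print(char_ngrams("Love it"))  # the slide's 4-gram example
    clf = NaiveBayes(n=3)
    # Toy training data for illustration only.
    clf.train([("love", "pos"), ("lovely", "pos"), ("hate", "neg"), ("hateful", "neg")])
    print(clf.predict("#loveee"), clf.predict("#hater"))
```

Character n-grams let a misspelled or elongated hashtag share features with its correctly spelled form, which may be why they suit hashtags better than whole-word units.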

  10. Results – Pos/Neg Wordlists

  11. Results – Movie Reviews

  12. Results – Hashtags

  13. Conclusion/Interesting Notes
     • The positive/negative word list performs best
     • Hashtag training data may do better if it draws on more popular hashtags
     • Movie reviews don’t perform well
     • At first glance, Instagram is overwhelmingly positive
     • Sentiment analysis may have an effective 80% accuracy limit
     • Neutral posts weren’t counted

  14. Thanks! #Questions?

  15. References
     • [1] Pang, Bo, Lillian Lee, and Shivakumar Vaithyanathan. "Thumbs up?: sentiment classification using machine learning techniques." Proceedings of the ACL-02 conference on Empirical methods in natural language processing-Volume 10. Association for Computational Linguistics, 2002. v1.1.
     • [2] Davidov, Dmitry, Oren Tsur, and Ari Rappoport. "Enhanced sentiment learning using twitter hashtags and smileys." Proceedings of the 23rd International Conference on Computational Linguistics: Posters. Association for Computational Linguistics, 2010.
     • [3] http://www.informationweek.com/software/information-management/expert-analysis-is-sentiment-analysis-an-80--solution/d/d-id/1087919?
     • [4] http://www.socialmediaexplorer.com/social-media-monitoring/never-trust-sentiment-accuracy-claims/

  16. Image Sources
     • [1] http://a2.mzstatic.com/us/r30/Purple/v4/11/0a/6c/110a6c60-3bf1-0f8d-089b-ab82407774ad/mzl.ikicqhss.png