
A Systematic Framework for Sentiment Identification by Modeling User Social Effects. Kunpeng Zhang, Assistant Professor, Department of Information and Decision Sciences, University of Illinois at Chicago, kzhang6@uic.edu.


Presentation Transcript


  1. A Systematic Framework for Sentiment Identification by Modeling User Social Effects. Kunpeng Zhang, Assistant Professor, Department of Information and Decision Sciences, University of Illinois at Chicago, kzhang6@uic.edu

  2. Agenda • Introduction • Problem statement • Methodology • Experiments and results • Conclusion and future work A World-Class Education, A World-Class City

  3. Co-authors • Yi Yang, Ph.D. student at Northwestern University • Aaron Sun, Research Scientist, Samsung Research America • Hengchang Liu, Assistant Professor at University of Science and Technology of China

  4. Introduction • User generated content on social media platforms • Data analysis for intelligent marketing decisions • Voice of consumers • Positive / negative aspects

  5. Problem Statement • Given a sentence (usually user-generated content on social media platforms, such as comments on Facebook, tweets on Twitter, reviews on Amazon.com, etc.), we classify it into one of three categories: • Positive: directly or indirectly praises something, e.g. “I love it! (^_^)” • Negative: directly or indirectly criticizes something, e.g. “We don’t like it at all.” • Objective: expresses no sentiment, or states a fact, e.g. “Apple will release a new iPhone in the next two months.”

  6. Previous Work • Bag-of-words approaches • Collecting keywords [5, 7, 21, 26] • Rule-based methods • From the perspective of language characteristics [6, 22] • Machine learning based methods • Sentence-level and document-level [7, 8, 10, 29] • However, none of them considers user social effects.

  7. Methodology • Systematic framework • Classification problem • 4 major features: • Peer influence • User preference • User profile • Textual sentiment

  8. Methodology 1 – User Preference (UserPref) • User preference partly reflects user sentiment. • Item-based collaborative filtering on a user-item matrix • Row: user (millions) • Column: brand (thousands) • The element $m_{ij}$ is 1 if user $i$ “likes” brand $j$, otherwise 0: $M = \begin{pmatrix} m_{11} & m_{12} & \cdots & m_{1n} \\ m_{21} & m_{22} & \cdots & m_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ m_{m1} & m_{m2} & \cdots & m_{mn} \end{pmatrix}$ • Note: a “like” means liking a brand on Facebook, following a brand on Twitter, giving a high rating to a product on Amazon, etc.
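The binary user-item matrix described on this slide can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `build_user_item_matrix` and the sample users and brands are made up for the example, and a dense list of lists is used only for readability (at millions of users a sparse structure would be needed).

```python
def build_user_item_matrix(likes):
    """Build a binary matrix where entry [i][j] is 1 if user i likes brand j.

    `likes` is an iterable of (user, brand) pairs. Names are hypothetical.
    """
    users = sorted({user for user, _ in likes})
    brands = sorted({brand for _, brand in likes})
    row_of = {user: i for i, user in enumerate(users)}
    col_of = {brand: j for j, brand in enumerate(brands)}

    # Start from all zeros, then mark each observed "like" with 1.
    matrix = [[0] * len(brands) for _ in users]
    for user, brand in likes:
        matrix[row_of[user]][col_of[brand]] = 1
    return users, brands, matrix


# Made-up sample data for illustration only.
likes = [
    ("alice", "Apple"), ("alice", "Nike"),
    ("bob", "Apple"),
    ("carol", "Nike"), ("carol", "Delta"),
]
users, brands, matrix = build_user_item_matrix(likes)
for user, row in zip(users, matrix):
    print(user, row)
```

Rows follow the sorted user order and columns the sorted brand order, so the same pair list always yields the same matrix.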

  9. Methodology 1 – User Preference (UserPref) • Two important issues when using collaborative filtering • Data sparsity • Integrate multiple low-level items into fewer high-level items • “Mac” and “iPhone” → “Computer and Electronics” • Similarity calculation and preference prediction • Which similarity measure is better? • Cosine, Pearson correlation, Tanimoto coefficient, log-likelihood based, Euclidean distance based • Weighted sum strategy to approximate user preference
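Two of the similarity measures listed on this slide, and a weighted-sum prediction, can be sketched as below. The slides do not give the exact formula the authors used, so this follows the common item-based form (similarity-weighted average of the user's other entries) as an assumption; the function names and the toy matrix are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def tanimoto(a, b):
    """Tanimoto (Jaccard) coefficient for binary vectors."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 0.0


def predict_preference(matrix, user, item, sim=tanimoto):
    """Weighted-sum estimate of how much `user` prefers `item`.

    Each other item contributes the user's entry for it, weighted by its
    similarity to the target item (assumed form, not from the slides).
    """
    columns = list(zip(*matrix))  # one tuple per item (brand)
    weighted = total = 0.0
    for j, column in enumerate(columns):
        if j == item:
            continue
        s = sim(columns[item], column)
        weighted += s * matrix[user][j]
        total += abs(s)
    return weighted / total if total else 0.0


# Toy user-item matrix: 4 users (rows) by 3 brands (columns).
toy = [
    [1, 1, 0],
    [1, 1, 0],
    [0, 1, 1],
    [1, 0, 0],
]
print(predict_preference(toy, user=3, item=1))
print(predict_preference(toy, user=3, item=1, sim=cosine))
```

Passing the similarity function as a parameter makes it easy to compare measures on held-out data, which is the question the slide raises.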

  10. Methodology 2 – Peer Influence (PeerInf) • Herding behavior in social psychology. • We assume that if most of the previous comments in a discussion are positive, the next commenter is likely to give a positive comment, and similarly for the negative case. • We randomly pick 1,000 posts from 5 different Facebook pages and 1,000 discussion threads from 5 different airlines on the Flyertalk.com forum. The average number of comments per post and per thread is 794 and 32, respectively. • The sentiments are identified by the state-of-the-art textual algorithm.
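The herding assumption suggests a simple feature: the share of each sentiment among the comments that precede the current one. The transcript does not preserve the authors' actual peer-influence model, so the sketch below is only one plausible encoding; the function names and the sample thread are hypothetical.

```python
LABELS = ("positive", "negative", "objective")


def peer_influence(previous_labels):
    """Fraction of earlier comments carrying each sentiment label."""
    n = len(previous_labels)
    if n == 0:
        # First comment in a thread: no peers to be influenced by.
        return {label: 0.0 for label in LABELS}
    return {label: previous_labels.count(label) / n for label in LABELS}


def thread_features(labels):
    """Peer-influence features for every comment, using only earlier comments."""
    return [peer_influence(labels[:i]) for i in range(len(labels))]


# Made-up thread: sentiment labels in posting order.
thread = ["positive", "positive", "negative", "positive"]
for position, features in enumerate(thread_features(thread)):
    print(position, features)
```

Only comments posted before the current one are counted, so the feature never leaks the label being predicted.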

  11. Methodology 2 – Peer Influence

  12. Methodology 2 – Peer Influence Modeling

  13. Methodology 3 – User Profile (GenCat) • Female users are more positive than male users, and fashion pages have a higher percentage of positive sentiment than politician pages on Facebook and Twitter.

  14. Methodology 4 – Textual Sentiment (TextSent) • State-of-the-art textual sentiment identification algorithm • Ensemble method integrating three individual algorithms • Semantic rules based on language characteristics • Numeric strength computing • Bag-of-words • Accuracy: ~86%
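The slide says three individual algorithms are combined into an ensemble but does not say how. One common choice is majority voting, sketched below purely as an assumption; the function name `majority_vote` and the tie-breaking rule (fall back to "objective") are hypothetical.

```python
from collections import Counter


def majority_vote(votes, tie_break="objective"):
    """Return the most common label, or `tie_break` when the top labels tie."""
    counts = Counter(votes).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return tie_break
    return counts[0][0]


# Hypothetical outputs of the rule-based, numeric-strength, and
# bag-of-words classifiers for a single sentence.
print(majority_vote(["positive", "positive", "negative"]))
print(majority_vote(["positive", "negative", "objective"]))
```

With three voters and three classes a three-way split is possible, which is why an explicit tie-breaking rule is needed.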

  15. Experiments and Results • Data collection • Facebook: posts, comments, likes, user profiles • Twitter: tweets, followers, user profiles • Amazon: products and reviews • Flyertalk (airline discussion forum): discussions • Data cleaning • Remove spam users

  16. Experiments and Results • Features of the learning model for the 4 datasets and their differences. Topic is derived from the raw Facebook category. “×”: missing; “√”: present.

  17. Experiments and Results • Similarity measure check. • MAE and RMSE to compare the average estimation error between real preference and predicted preference • Hadoop-based collaborative filtering implemented with Mahout. • Takes 34 and 21 minutes to approximate user preferences for Facebook and Twitter, respectively • Does not complete within 10 hours on a single CPU.
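The two error metrics named on this slide have standard definitions, shown here in a small sketch. The sample values are made up and are not results from the presentation.

```python
import math


def mae(actual, predicted):
    """Mean absolute error between real and predicted preferences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error; penalizes large errors more than MAE."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


# Made-up preferences: 1 means "likes", 0 means "does not like".
actual = [1, 0, 1, 0]
predicted = [0.5, 0.5, 1.0, 0.0]
print(mae(actual, predicted))
print(rmse(actual, predicted))
```

Running each candidate similarity measure through the same held-out split and comparing these two numbers is one way to decide which measure fits a dataset best.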

  18. Experiments and Results • Facebook data • Twitter data • Amazon.com data

  19. Experiments and Results • Classification accuracy (SS: semantic + syntactic features used in [28])

  20. Conclusion and Future Work • We propose a systematic framework to identify social media sentiments by modeling user social effects: user preference, peer influence, user profile, and textual sentiment itself. • However, • More networked data could be incorporated. • More efficient algorithms are needed to calculate user preference.

  21. Thank you
