Using a Sentiment Map for Visualizing Credibility of News Sites on the Web

Yukiko Kawai*, Yusuke Fujita*, Tadahiko Kumamoto**, Jianwei Zhang*, Katsumi Tanaka*** * Kyoto Sangyo University, Japan ** Chiba Institute of Technology, Japan *** Kyoto University, Japan

Presentation Transcript


  1. Using a Sentiment Map for Visualizing Credibility of News Sites on the Web Yukiko Kawai*, Yusuke Fujita*, Tadahiko Kumamoto**, Jianwei Zhang*, Katsumi Tanaka*** * Kyoto Sangyo University, Japan ** Chiba Institute of Technology, Japan *** Kyoto University, Japan

  2. Outline • Background • Research goal • System overview • Offline processing • Online processing • Experimental evaluation • Conclusion and future work

  3. Background • A question: What is your attitude towards the “Iraq war”? Agree or disagree? • To answer this question, I want to read some news to form an opinion about this topic. • Rapid spread of web news sites (e.g., MSN, Google News) • Different sites may have different opinions about the topic

  4. Background • A misconception may arise if the sites’ tendencies are not known in advance • A reader asks a single news site “Is the Iraq war right or wrong?” • If it is a pro-war site: “I agree with this war” • If it is an anti-war site: “I disagree with this war” • Showing the sentiment tendencies of sites (positive/negative graphs for Site A and Site B) improves information credibility • “Well, I now have opinions from different sites” • This may lead to a more fair-minded judgment

  6. A concept of sentiment map • Query: “Iraq war” • Sentiment graphs (positive/negative) are mapped to the location of each news site • Top-ranked articles from each news site • Demonstration

  8. System overview • Offline processing (preprocessing) • Crawling of web news sites: Yomiuri (Osaka), Yomiuri (Tokyo), Asahi (Tokyo), ... • News articles collection • Morphological analysis • tf-idf value calculation • Sentiment values calculation, using a sentiment dictionary • Results stored in an articles database (including tf-idf, sentiment values) • Online processing (runtime processing), triggered by a query • 1) Retrieve articles from each news site • 2) Rank the articles based on tf-idf in each site • 3) Calculate the average of sentiment values for each site • 4) Generate a sentiment map

  10. Offline processing • News articles collection • Crawl news articles from various news sites and store them in the DB • News articles analysis • Eliminate HTML tags • Perform morphological analysis to extract nouns, verbs, and adjectives • Calculate the tf-idf value of each extracted word j for each news article pi • Attach a sentiment vector to each news article • Use a sentiment dictionary • Symbols used for tf-idf: Fj: the frequency of word j in article pi; Fall: the total number of words in pi; N: the total number of articles; Nj: the number of articles that include j
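The tf-idf formula itself did not survive in this transcript. The sketch below assumes the common form (Fj / Fall) * log(N / Nj), which is consistent with the four symbols defined on the slide but is not confirmed by it. The function name `tf_idf` and the sample tokens are hypothetical.

```python
import math
from collections import Counter

def tf_idf(articles):
    """Return one {word: tf-idf} dict per article.

    articles: list of token lists (already reduced to nouns, verbs, adjectives).
    Assumed formula (not confirmed by the slides): (Fj / Fall) * log(N / Nj).
    """
    n_articles = len(articles)              # N: number of all articles
    doc_freq = Counter()                    # Nj: number of articles including word j
    for tokens in articles:
        doc_freq.update(set(tokens))

    scores = []
    for tokens in articles:
        counts = Counter(tokens)            # Fj: frequency of word j in this article
        total = len(tokens)                 # Fall: number of all words in this article
        scores.append({
            word: (freq / total) * math.log(n_articles / doc_freq[word])
            for word, freq in counts.items()
        })
    return scores

# Hypothetical, already-tokenized articles.
articles = [
    ["war", "iraq", "attack"],
    ["war", "peace", "talk"],
    ["baseball", "pitch", "win"],
]
scores = tf_idf(articles)
```

Under this assumed formula, a word that appears in every article gets a score of zero, so site-specific vocabulary dominates the ranking.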

  11. Sample of sentiment dictionary • Four sentiment dimensions e = a, b, c, d, each contrasting two opposite sentiments (e1 ⇔ e2) • Example entry: Oc(death) = 0.260 • Sentiment value Oe(w) of an entry word w • A value between 0 and 1 (e.g., 0: dark, 1: bright) • Calculated by analyzing co-occurrence with the original sentiment words, based on 200 million articles of Nikkei newspapers

  12. Calculation of Sentiment value Oe(w) • Sentiments and their corresponding original sentiment words (e1, e2) • The sentiment value is computed from: • df(e): the number of occurrences of the original sentiment words e • df(e&w): the number of co-occurrences of the original sentiment words e and an entry word w

  13. Calculation of Sentiment value Oe(w) • Sentiments and their corresponding original sentiment words (e1, e2) • Sentiment value of the word “death” on dimension c: Oc(death) = 0.260 • Because df(“comfortable” & “death”), df(“peaceful” & “death”), df(“slow” & “death”) << df(“tension” & “death”), df(“emergency” & “death”)
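The sentiment value formula is missing from this transcript. As a rough sketch only, the code below assumes one plausible form built from the two quantities on the slides: the conditional probability P(w|e) = df(e&w) / df(e), combined as P(w|e1) / (P(w|e1) + P(w|e2)). This form, the function name `sentiment_value`, the neutral fallback of 0.5, and all counts are assumptions, not the authors' stated method.

```python
def sentiment_value(df_e1, df_e2, df_e1_w, df_e2_w):
    """Assumed form: P(w|e1) / (P(w|e1) + P(w|e2)), with P(w|e) = df(e&w) / df(e).

    df_e1, df_e2: occurrence counts of the original sentiment words on each side.
    df_e1_w, df_e2_w: co-occurrence counts of those words with the entry word w.
    """
    p1 = df_e1_w / df_e1
    p2 = df_e2_w / df_e2
    if p1 + p2 == 0:
        return 0.5  # assumption: no co-occurrence evidence is treated as neutral
    return p1 / (p1 + p2)

# Invented counts for illustration: the entry word co-occurs far more often
# with the e2-side words than with the e1-side words.
value = sentiment_value(df_e1=1000, df_e2=1000, df_e1_w=20, df_e2_w=60)
```

With these invented counts the value falls well below 0.5, which matches the direction of the “death” example: weak co-occurrence with the e1 words and strong co-occurrence with the e2 words pulls the value towards 0.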

  14. Sentiment vector O(TEXT) of a news article • A news article text is denoted TEXT • TEXT contains n keywords • keywords = {w} • Each sentiment value Oe(TEXT) is computed from the keywords of TEXT • Sentiment vector O(w) of the article for the keyword w
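The aggregation formula for Oe(TEXT) is not preserved here. The sketch below assumes the simplest possibility, an unweighted mean of the dictionary values Oe(w) over the article's keywords. The backup slides mention a weight Me(w), so the actual method may well be a weighted average. The function name `article_sentiment` and every dictionary value except Oc(death) = 0.260 are invented for illustration.

```python
def article_sentiment(keywords, dictionary, dimensions=("a", "b", "c", "d")):
    """Return {dimension: value} for one article.

    Assumption: Oe(TEXT) is the plain mean of Oe(w) over keywords found in the
    dictionary; an article with no known keywords is treated as neutral (0.5).
    """
    vector = {}
    for e in dimensions:
        values = [dictionary[w][e] for w in keywords if w in dictionary]
        vector[e] = sum(values) / len(values) if values else 0.5
    return vector

# Hypothetical dictionary; only the "c" value for "death" comes from the slides.
dictionary = {
    "death":   {"a": 0.20, "b": 0.30, "c": 0.26, "d": 0.40},
    "victory": {"a": 0.90, "b": 0.70, "c": 0.74, "d": 0.60},
}
vector = article_sentiment(["death", "victory", "unknown"], dictionary)
```

Keywords missing from the dictionary are skipped rather than counted as neutral, so rare words do not dilute the result.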

  16. Online processing • When a user enters query keywords, • Retrieve news articles including the keywords • Rank articles based on tf-idf values for each news site • Calculate the average of sentiment vectors of top n articles for each site • Attach sentiment graphs to corresponding locations of news sites • Also present a list of articles grouped by each site
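The online steps above can be sketched as follows. This is a minimal illustration assuming a single-keyword query and an in-memory list in place of the articles database. The record layout (`site`, `tfidf`, `sentiment`), the function name `sentiment_map`, and the sample values are hypothetical.

```python
from collections import defaultdict

def sentiment_map(articles, query, top_n=2):
    """Average the sentiment vectors of the top-n articles per site.

    articles: list of dicts with keys
      "site" (str), "tfidf" ({word: score}), "sentiment" ({dimension: value}).
    """
    # 1) Retrieve the articles that contain the query keyword, grouped by site.
    by_site = defaultdict(list)
    for article in articles:
        if query in article["tfidf"]:
            by_site[article["site"]].append(article)

    result = {}
    for site, hits in by_site.items():
        # 2) Rank the site's articles by the tf-idf value of the query keyword.
        ranked = sorted(hits, key=lambda a: a["tfidf"][query], reverse=True)
        top = ranked[:top_n]
        # 3) Average the sentiment vectors of the top-n articles.
        dims = top[0]["sentiment"].keys()
        result[site] = {
            d: sum(a["sentiment"][d] for a in top) / len(top) for d in dims
        }
    return result

# Invented sample records.
articles = [
    {"site": "Site A", "tfidf": {"war": 0.30}, "sentiment": {"c": 0.2}},
    {"site": "Site A", "tfidf": {"war": 0.10}, "sentiment": {"c": 0.4}},
    {"site": "Site A", "tfidf": {"war": 0.05}, "sentiment": {"c": 0.9}},
    {"site": "Site B", "tfidf": {"war": 0.20}, "sentiment": {"c": 0.7}},
    {"site": "Site B", "tfidf": {"peace": 0.20}, "sentiment": {"c": 0.1}},
]
result = sentiment_map(articles, "war", top_n=2)
```

The returned per-site averages are what step 4 would draw as graphs at each site's location. The drawing itself is omitted because the slides do not describe the rendering.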

  18. Experimental evaluation • Query: Daisuke Matsuzaka • A famous Japanese Major League player • A reviewer read all the retrieved articles from the different news sites and judged the sentiment of each news site • positive, negative or neutral • For comparison, the numeric sentiment values produced by our system are categorized into discrete values • positive, negative or neutral

  19. Experimental evaluation • Precision is about 70% • There exist some distinctions among different news sites
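The slides do not give the cut-offs used to turn numeric values into positive, negative or neutral, nor the exact definition of precision. The sketch below assumes a symmetric margin around 0.5 and measures the fraction of cases where the system's label matches the reviewer's. The names `categorize` and `agreement`, the margin of 0.1, and the sample data are all hypothetical and do not reproduce the reported result.

```python
def categorize(value, margin=0.1):
    """Map a sentiment value in [0, 1] to a discrete label (assumed thresholds)."""
    if value > 0.5 + margin:
        return "positive"
    if value < 0.5 - margin:
        return "negative"
    return "neutral"

def agreement(system_values, reviewer_labels):
    """Fraction of items where the system's label equals the reviewer's label."""
    matches = sum(
        categorize(v) == label for v, label in zip(system_values, reviewer_labels)
    )
    return matches / len(reviewer_labels)

# Invented example: four site/dimension judgments.
system_values = [0.80, 0.30, 0.55, 0.65]
reviewer_labels = ["positive", "negative", "neutral", "neutral"]
score = agreement(system_values, reviewer_labels)
```

The choice of margin directly changes how many values count as neutral, so any agreement figure depends on it.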

  21. Conclusion and future work • Conclusion • Developed a system called sentiment map for visualizing the sentiment distinction of different news sites • Tested its effectiveness • A prototype: http://klab.kyoto-su.ac.jp/~fujita/cgi-bin/Fuzilla/News/ • Future work • More experiments • Sentiment analysis of readers and information recommendation based on it

  22. Thank you for your attention

  23. Sample of sentiment dictionary e = a, b, c, d Se(w): impression value Me(w): weight Sc(death) = 0.260 Mc(death) = 1.306

  24. Sentiment value Oe(w) of an entry word w • Original impression words and their correspondence with sentiments (e1, e2) • Sentiment value Oe(w) of an entry word w • A value between 0 and 1 (1: positive, 0: negative) • Calculated by analyzing the co-occurrence with the original impression words, based on the Nikkei Newspaper Full Text Database (about 200 million articles)

  25. Sentiment value Oe(w) of an entry word w • Se(w): impression value • Me(w): weight • Sentiment value of the word “death” on dimension c: Oc(death) = 0.260 • Co-occurrence of “comfortable” and “death”, “peaceful” and “death” << co-occurrence of “tension” and “death”, “emergency” and “death”

  26. A proposition of sentiment map • Query: “scandal” • Sentiment map for each news site (graph axis from positive 0.5 through 0 to negative -0.5) • Top-ranked articles from each news site • Demonstration
