
Text Emotion Recognition on micro-blog

Text Emotion Recognition on micro-blog. Student: Yi-An Chu. Advisor: Chung-Hsien Wu. Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan, Taiwan. Outline: Introduction, Background, Motivation, Related Work, Proposed Approach, LDA-based Model.


Presentation Transcript


  1. Text Emotion Recognition on micro-blog. Student: Yi-An Chu. Advisor: Chung-Hsien Wu. Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan, Taiwan

  2. Outline • Introduction • Background • Motivation • Related Work • Proposed Approach • LDA-based Model • Experiments • Primary experiment results • Conclusion • Future Work

  3. Background • Emotion recognition is an important component of affective computing and has been applied to many kinds of media: • Speech • Facial expression • Text • Physiological signals • Among these media, the advantages of text include: • Easy to collect from the web • Non-acted emotional data can be collected from social networks

  4. Background • Micro-blogging is a newly emerging and popular form of social media with the following characteristics: • Limit on the maximum length of a message • Messages are posted frequently • Large amount of unstructured or missing data • Well-known micro-blogs: Facebook, Plurk, Twitter • User intention [Akshay. 2007]: • Daily chatter • Conversations • Sharing information/URLs

  5. Background • Potential applications of emotion recognition on micro-blogs: • Human-computer interaction • Music recommendation • Product recommendation • Mental health care • Personality analysis

  6. Motivation • Short statements in micro-blogs are harder to classify because they lack context, yet they often include words that are linked directly to sentiment • On a blog: 太開心了! 報告弄了一個多禮拜,經過多次修改,總算弄完了 回想起來… ("So happy! The report took more than a week and went through many revisions, and it is finally done. Looking back…") • On a micro-blog: 呼~終於弄完了 ("Phew, finally done")

  7. Related Work • Research on micro-blogs focuses on two aspects: • Opinion mining (sentiment analysis) • investigates people's opinions toward specific topics or products • Sharing URLs/information • Topic extraction • Emotion detection (sentiment detection) • helps improve human-computer interaction • Daily chat

  8. Related Work • Emotion detection granularity: • Document-level [Yang. 2007] • determining the overall emotion of an article • how to take sentence-level information into account • Blogs (e.g. LiveJournal) • Sentence-level • determining the overall emotion of a sentence • how to take word-level information into account • Micro-blogs (e.g. Twitter, Plurk) • Word-level [Chen. 2012] • Emotion dictionary construction (e.g. NTUSD) • Polarity-oriented prediction

  9. Related Work • For sentence-level emotion detection: • Feature extraction • Emotion dictionary, polarity words • Point-wise mutual information (PMI) • N-gram, POS • Modeling approach • Support vector machine • Naïve Bayes • Maximum entropy • Conditional random field • PMI is the most common feature at the sentence level [Soo. 2010] [Chou. 2010] [Lin. 2012] [Wang. 2011]
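
Since PMI is singled out as the most common sentence-level feature, a minimal sketch of how a word-emotion PMI table could be computed may help. It assumes post-level co-occurrence counts and a base-2 logarithm; the function names `pmi_table` and `top_k` are hypothetical and are not taken from the cited work.

```python
import math
from collections import Counter


def pmi_table(posts):
    """Compute PMI(word, emotion) from labeled posts.

    posts: list of (tokens, emotion) pairs. Counts are per post
    (a word is counted once per post), which is an assumption here.
    Only (word, emotion) pairs that actually co-occur get a score.
    """
    n = len(posts)
    word_count = Counter()
    emotion_count = Counter()
    joint_count = Counter()
    for tokens, emotion in posts:
        emotion_count[emotion] += 1
        for word in set(tokens):
            word_count[word] += 1
            joint_count[(word, emotion)] += 1
    return {
        (word, emotion): math.log2(
            (c / n) / ((word_count[word] / n) * (emotion_count[emotion] / n))
        )
        for (word, emotion), c in joint_count.items()
    }


def top_k(pmi, emotion, k):
    """Return the k words with the highest PMI for one emotion."""
    scored = [(w, s) for (w, e), s in pmi.items() if e == emotion]
    scored.sort(key=lambda pair: (-pair[1], pair[0]))  # ties broken by word
    return [w for w, _ in scored[:k]]
```

A word that appears equally often under every emotion scores 0, while a word concentrated in one emotion scores above 0 for that emotion.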

  10. Related Work-Cognitive theory • Emotion Association Rules [Wu, 2006]

  11. Problem I • Most words are not directly associated with emotions • In micro-blogs, words may share a similar meaning (topic) and posts may be of a similar type (event), both of which may be related to emotion • Event: a series of posts • [Figure: timeline showing Event 1, Event 2, Event 3]

  12. Problem I • Topic: 食物 (food) • 捏 飯糰 真的 很 有 快感 ("Shaping rice balls is really satisfying") • 胃 在 翻騰 這 炸醬麵 怎麼 永遠 都 吃 不 完 ("My stomach is churning; why can I never finish these zhajiang noodles?") • Event: 寫作業 (doing homework) • 我 的 程式 怎麼 一直 出現 很 好笑 的 bug ("Why does my program keep producing ridiculous bugs?") • 子集合 什麼 鬼 呀 ("Subsets? What on earth?") • 又 要 來 寫 java 作業 了 ("Time to write Java homework again") • 終於 debug 掉 了 ("Finally debugged it")

  13. Approach I • We propose a two-layer LDA-based model that considers event and topic information for emotion recognition • [Figure: two-layer graphical model. Second layer, Emotion-Event Model: emotions e1, e2, …, en (E) and events v1, v2, …, vi (V). First layer, Event-Word Model: events (V), topics z1, z2, …, zj (Z) and words w1, w2, …, wk (W)] • E: Emotion • V: Event • Z: Topic • W: Word
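
The slide gives the layering only as a diagram. One plausible reading, offered here as an assumption rather than as the author's stated formulation, is that each layer supplies one conditional distribution and that the likelihood of a word under an emotion is obtained by summing over events and topics:

```latex
P(w \mid e) \;=\; \sum_{v \in V} P(v \mid e) \sum_{z \in Z} P(z \mid v)\, P(w \mid z)
```

Under this reading, $P(v \mid e)$ would come from the Emotion-Event Model and $P(z \mid v)\,P(w \mid z)$ from the Event-Word Model. The factorization itself is assumed; the slides do not state it.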

  14. System Framework • [Figure: system flow chart] • Training phase: Training Corpus, sentence, CKIP Segment, Stop Word/Stemming (with Stop Word Corpus), two-layer LDA (Event-Word Model Training, Emotion-Event Model Training), Emotion Model Training • Resulting models: Event-Word Model, Emotion-Event Model, Emotion Recognition Model • Test phase: Test Corpus, sentence, CKIP Segment, Stop Word/Stemming, Compound Word Prediction, Emotion Recognition, Emotion Output

  15. Stop Word/Stemming • First 100 words from Sinica

  16. System Framework • [Figure: same flow chart as slide 14]

  17. Event-Word Training • Approximate inference • Gibbs sampling

  18. Emotion-Event Training • Approximate inference • Gibbs sampling
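
Both training steps name Gibbs sampling as the approximate inference method. As a point of reference, here is a minimal sketch of collapsed Gibbs sampling for a plain single-layer LDA. It is not the two-layer model from the slides. The function name `lda_gibbs` is hypothetical, and `alpha` and `beta` follow the usual convention (document-topic prior and topic-word prior), which may differ from how the slides assign those symbols.

```python
import random


def lda_gibbs(docs, n_topics, vocab_size, alpha, beta, n_iter, seed=0):
    """Collapsed Gibbs sampling for plain LDA (illustrative only).

    docs: list of documents, each a list of word ids in [0, vocab_size).
    Returns (doc_topic, topic_word) count matrices.
    """
    rng = random.Random(seed)
    doc_topic = [[0] * n_topics for _ in docs]                # n(d, z)
    topic_word = [[0] * vocab_size for _ in range(n_topics)]  # n(z, w)
    topic_total = [0] * n_topics                              # n(z)
    assign = []

    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(n_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assign.append(zs)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the current assignment from the counts.
                z = assign[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Unnormalized conditional P(z = k | everything else).
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                # Record the new assignment.
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    return doc_topic, topic_word
```

In a setup like the one described, each event (or post) would play the role of a document, and the number of topics and iterations would be taken from the experiment settings.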

  19. System Framework • [Figure: same flow chart as slide 14]

  20. Emotion Model • [Figure: plate diagram; symbols shown: |Z|, |E|, |P|, φ, θ, β, δ, λ, α, |B|, |V|, |p|, ρ, ε, z, w, γ]

  21. Emotion Model • Emotion probability • Word probability • Traditional approach

  22. System Framework • [Figure: same flow chart as slide 14]

  23. Compound Word Prediction • Calculate the similarity between a compound word and each topic • CKIP algorithm for unknown words

  24. Compound Word Learning • χ² (chi-square) score [Chen. 2004] [Lu. 2007] [Qiu. 2009]
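
The slide names the chi-square score without showing how it is applied. The sketch below shows only the standard statistic for a 2x2 contingency table; treating the four cells as co-occurrence counts between a candidate word and a class is an assumption, and the function name `chi_square_2x2` is hypothetical.

```python
def chi_square_2x2(a, b, c, d):
    """Chi-square statistic for a 2x2 contingency table.

    a: word present, class present    b: word present, class absent
    c: word absent,  class present    d: word absent,  class absent
    Returns 0.0 when any marginal total is zero (statistic undefined).
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denom
```

A score of 0 means the word and the class look independent in the counts; larger values indicate a stronger association.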

  25. Corpus • Annotation flow chart: Posts, Event Boundary Detection, Emotion Annotation within Each Event

  26. Corpus • 2015 posts • 1585 events • 3 emotion labels: • Positive: 335 posts • Negative: 669 posts • Neutral: 1011 posts

  27. Experiment Setup • Number of topics: 4 • Iterations: 2000 (default) • Alpha: 0.1 • Beta: 50/|Z|

  28. Experiment Result • Without event merging • Trained on 2015 posts

  29. Experiment Result • With event merging • Trained on 1585 events

  30. Experiment Result • Without any neutral posts • Trained on 709 events

  31. Discussion • Neutral overlaps heavily with positive and negative • Events may contain too few posts • Currently 1.26 posts per event on average • It may be hard to detect events from post content alone • We could try grouping posts into events by time period

  32. Thank you for your attention

  33. Experiment Result • With event merging • Trained on 1585 events

  34. Experiment

  35. System Framework • [Figure: system flow chart] • Inputs: Training Corpus, E-HowNet, EGRs • Training phase: sentence, word, Semantic Class Definition, Seed Lexicon Construction, Compound Word Learning, Semantic Class Labeling, Model Training • Resources and models: Semantic Class Definition, Semantic Class Lexicons, Emotion Recognition Model • Test phase: Test Corpus, sentence, Semantic Class Labeling, Compound Word Prediction, Emotion Recognition, Emotion Output

  36. Semantic Class Labeling • algorithm

  37. System Framework • [Figure: same flow chart as slide 35]

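  38. Model Training • Consider syntactic structure • Temporal and causal adverbs, e.g. 終於 (finally), 否則 (otherwise), 但是 (but) • Labeled data are grouped into tuples (action, object, sentiment), separated at function words • 今天本來很開心的但是下午就很低落 ("I was happy today, but in the afternoon I felt down") • (null, null, [正面情感] (positive emotion)), (null, null, [負面情感] (negative emotion)) • CRF-based model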
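
To make the tuple grouping concrete, here is a small rule-based sketch. It is not the CRF-based model named on the slide. It only illustrates how a segmented sentence could be cut at function words and reduced to (action, object, sentiment) tuples. The function name `group_tuples`, the lexicon layout, the set of splitting words, and the segmentation of the example sentence are all hypothetical.

```python
def group_tuples(tokens, lexicon, splitters):
    """Group labeled words into (action, object, sentiment) tuples.

    tokens:    segmented sentence (list of words)
    lexicon:   word -> (slot, label), slot in {"action", "object", "sentiment"}
    splitters: function words that close the current tuple
    Empty slots stay None, mirroring "null" on the slide.
    """
    def empty():
        return {"action": None, "object": None, "sentiment": None}

    def flush(slots, out):
        # Emit a tuple only if at least one slot was filled.
        if any(value is not None for value in slots.values()):
            out.append((slots["action"], slots["object"], slots["sentiment"]))

    tuples = []
    current = empty()
    for token in tokens:
        if token in splitters:
            flush(current, tuples)
            current = empty()
        elif token in lexicon:
            slot, label = lexicon[token]
            current[slot] = label
    flush(current, tuples)
    return tuples


# Example from the slide, with an assumed segmentation.
tokens = ["今天", "本來", "很", "開心", "的", "但是", "下午", "就", "很", "低落"]
lexicon = {
    "開心": ("sentiment", "positive emotion"),  # [正面情感]
    "低落": ("sentiment", "negative emotion"),  # [負面情感]
}
result = group_tuples(tokens, lexicon, splitters={"但是", "否則"})
```

With these inputs the sentence is cut at 但是, so the positive and the negative sentiment end up in separate tuples, as in the slide's example.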

  39. Emotion Recognition • Most likely emotion can be detected

  40. Corpus • 10 users with their Plurk post histories • over 10000 posts • over 800 posts per user • From 2010 to 2012

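  41. Proposed Approach I: concept • Based on appraisal theory and emotion generation rules, we divide emotion-related words into 3 categories and 16 semantic labels • For the previous example: • 感冒 (a cold) => [有害的東西] (harmful thing); 消失 (disappear) => [失去] (loss) • 傷心 (heartbroken) => [負面情感] (negative emotion); 難過 (sad) => [負面情感] (negative emotion) • "今天真是令人<難過>又<傷心>的一天" ("Today is such a sad and heartbreaking day") => [負面情感][負面情感] • "<感冒>症狀終於<消失>了" ("The cold symptoms have finally gone") => [有害的東西][失去] • Different features for text classification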
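
The difference between polarity features and semantic-label features can be shown with two lookup tables built from the slide examples. This is a minimal sketch: the table contents come from the two example sentences only, the segmentation is assumed, and the function name `label_sequence` is hypothetical.

```python
# Polarity dictionary: every listed word is simply "negative".
POLARITY = {
    "難過": "negative",
    "傷心": "negative",
    "感冒": "negative",
    "消失": "negative",
}

# Semantic labels as given on the slide (English glosses of the labels).
SEMANTIC = {
    "難過": "negative emotion",  # [負面情感]
    "傷心": "negative emotion",  # [負面情感]
    "感冒": "harmful thing",     # [有害的東西]
    "消失": "loss",              # [失去]
}


def label_sequence(tokens, table):
    """Replace each known word by its label; unknown words are skipped."""
    return [table[token] for token in tokens if token in table]


# Assumed segmentations of the two example sentences.
sad_day = ["今天", "真是", "令人", "難過", "又", "傷心", "的", "一天"]
cold_gone = ["感冒", "症狀", "終於", "消失", "了"]

polarity_features = (label_sequence(sad_day, POLARITY),
                     label_sequence(cold_gone, POLARITY))
semantic_features = (label_sequence(sad_day, SEMANTIC),
                     label_sequence(cold_gone, SEMANTIC))
```

With polarity alone the two sentences produce identical label sequences, so a classifier cannot tell them apart. With semantic labels the sequences differ, which is the distinction the slide is pointing at.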

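  42. Proposed Approach II: concept • We use semantic classification to map words to their corresponding semantic labels • For the previous example: • "我把報告<寫完>了" ("I finished writing the report") => [達成] (achievement) • "我把報告<弄完>了" ("I got the report done") => [達成] • "我把報告<弄出來>了" ("I got the report out") => [達成]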

  43. Appraisal theory • The nature of an emotional reaction can best be predicted from the individual's appraisal of an antecedent situation, object, or event • The elements considered in the appraisal process are called appraisal criteria: • suddenness • familiarity • predictability • intrinsic pleasantness, etc.

  44. Semantic Class Definition • Emotion generation rules (EGRs) are employed to extract the semantic labels

  45. Baseline SVM • 5-fold cross-validation • Features: • Top-100 PMI words for each emotion • Binary features • Training: 2863 posts; test: 715 posts • [Figure: results for negative, positive, neutral]
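
The baseline's feature step can be sketched as follows. Each post becomes a 0/1 vector over a fixed vocabulary, for example the selected top-PMI words. The fold-size helper is included only to check the slide's numbers: 2863 + 715 = 3578 posts, and a 5-fold split of 3578 gives folds of 715 or 716. Both function names are hypothetical, and the SVM itself is left out.

```python
def binary_features(tokens, vocabulary):
    """Return a 0/1 vector marking which vocabulary words occur in a post."""
    present = set(tokens)
    return [1 if word in present else 0 for word in vocabulary]


def fold_sizes(n_items, n_folds):
    """Sizes of the folds when n_items are split as evenly as possible."""
    base, extra = divmod(n_items, n_folds)
    return [base + 1 if i < extra else base for i in range(n_folds)]
```

With a held-out fold of 715 posts, the remaining 2863 posts form the training set, which matches the split reported on the slide.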

  46. PMI Analysis • Of the top 11 positive PMI words, 5 are in the seed sentiment lexicon • Words not in the sentiment lexicon would perform poorly

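  47. Motivation • The performance of emotion recognition is limited when only word polarity is used • With an emotion dictionary: • 今天真是令人難過又傷心的一天 ("Today is such a sad and heartbreaking day"): 難過 => [負面] (negative), 傷心 => [負面] (negative); result: 負面情緒 (negative emotion), 正確結果 (correct) • 感冒症狀終於消失了 ("The cold symptoms have finally gone"): 感冒 => [負面] (negative), 消失 => [負面] (negative); result: 負面情緒 (negative emotion), 錯誤結果 (incorrect)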

  48. Contributions • Compound word learning for emotion recognition • Learning approach • Seed lexicon construction • Related work: PMI (category)-based and dictionary-based approaches • Relationship between sentence tags and segmented words • Duplicated emotion keywords across emotions • More and more non-sentiment words are included in the dictionary, which degrades recognition performance • EGR -> semantic labels (categories) • Input: post • Process: post -> words (by CKIP), word -> semantic label • Rank and selection criterion • Frequency and quartile • Learning with the SINICA Affix/Suffix Dictionary • Chi-square similarity (with context?)

  49. Problem • Posts on micro-blogs are full of missing data • e.g. foreign words, homonyms, proper nouns, compound words • Three most dominant types of Chinese missing data in a balanced corpus [Chen, 2000]: • compound nouns (about 51%) • compound verbs (about 34%) • proper names (about 15%) • Phenomena of compound words: • A compound word comprises more than one morpheme • Sparse data with similar meaning and emotion • "我把報告<寫完>了" ("I finished writing the report") • "我把報告<弄完>了" ("I got the report done") • "我把報告<弄出來>了" ("I got the report out")
