Selecting Attributes for Sentiment Classification Using Feature Relation Networks
This paper presents a novel approach to sentiment classification using Feature Relation Networks (FRN), which integrates both semantic and syntactic relationships among n-gram features. The study addresses challenges in sentiment analysis, such as noise and redundancy in large feature sets. Through extensive experimentation across various datasets, the FRN method demonstrated significant improvements in classification accuracy and computational efficiency compared to traditional methods. The findings highlight the potential for FRN in advanced sentiment analysis applications.
Presentation Transcript
Selecting Attributes for Sentiment Classification Using Feature Relation Networks
Presenter: Jian-Ren Chen
Authors: Ahmed Abbasi, Stephen France, Zhu Zhang, and Hsinchun Chen
2011, IEEE TKDE
Outline
• Motivation
• Objectives
• Methodology
• Experiments
• Conclusions
• Comments
Motivation
Sentiment analysis has emerged as a method for mining opinions from text archives.
It is a challenging problem:
• it requires large quantities of linguistic features
• integrating these heterogeneous n-gram categories into a single feature set introduces noise, redundancy, and computational limitations
• sentiment has both polarity and intensity, e.g. “I don’t like you” vs. “I hate you”
n-gram (Markov model)
• Weather: sunny, cloudy, rainy
• 美麗 (“beautiful”) vs. 美痢 (a homophone misspelling)
• “HAPAX” and “DIS” tags: “I hate Jim” is replaced with “I hate HAPAX”
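The tag replacement on this slide can be sketched in a few lines. The slide does not define the tags, so this sketch assumes that HAPAX marks a word seen once in the corpus and DIS a word seen twice. The function names and the toy corpus are hypothetical and are not taken from the paper.

```python
from collections import Counter

def tag_rare_words(docs):
    """Replace once-occurring tokens with HAPAX and twice-occurring ones with DIS.

    Assumption: HAPAX = corpus frequency 1, DIS = corpus frequency 2.
    """
    counts = Counter(tok for doc in docs for tok in doc)

    def tag(tok):
        if counts[tok] == 1:
            return "HAPAX"
        if counts[tok] == 2:
            return "DIS"
        return tok

    return [[tag(tok) for tok in doc] for doc in docs]

def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

# Toy corpus (invented for illustration).
docs = [
    ["i", "hate", "jim"],
    ["i", "hate", "rain"],
    ["i", "love", "rain"],
    ["i", "hate", "mondays"],
]
tagged = tag_rare_words(docs)
print(tagged[0])             # tokens of the first document after tagging
print(ngrams(tagged[0], 2))  # bigrams built on the tagged tokens
```

Tagging before n-gram extraction means rare names such as “Jim” collapse into one shared feature instead of producing many one-off n-grams.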
Objectives
• Feature Relation Network (FRN) considers semantic information and also leverages the syntactic relationships between n-gram features.
• The goal is enhanced sentiment classification on extended sets of heterogeneous n-gram features.
Methodology - Subsumption Relations
A subsumes B (A → B)
Example: “I love chocolate”
• unigrams: I, LOVE, CHOCOLATE
• bigrams: I LOVE, LOVE CHOCOLATE
• trigrams: I LOVE CHOCOLATE
What about the bigrams and trigrams? It depends on their weight: a higher-order n-gram is kept only if its weight exceeds that of its more general lower-order counterparts by threshold t.
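A minimal sketch of the subsumption rule follows, under two assumptions: a lower-order n-gram subsumes a higher-order one when it appears contiguously inside it, and each n-gram is compared against the original weights of all its subsumers. The feature weights are invented for illustration, and the function names are hypothetical rather than the paper's.

```python
def subsumes(lower, higher):
    """True if the tokens of `lower` occur contiguously inside `higher`."""
    lo, hi = lower.split(), higher.split()
    if len(lo) >= len(hi):
        return False
    return any(hi[i:i + len(lo)] == lo for i in range(len(hi) - len(lo) + 1))

def apply_subsumption(weights, t):
    """Keep an n-gram only if it beats every lower-order subsumer by at least t."""
    kept = {}
    for feat, w in weights.items():
        parent_weights = [weights[o] for o in weights if subsumes(o, feat)]
        if all(w - pw >= t for pw in parent_weights):
            kept[feat] = w
    return kept

# Invented weights for the "I love chocolate" example (not from the paper).
weights = {
    "i": 0.01,
    "love": 0.30,
    "chocolate": 0.10,
    "i love": 0.302,
    "love chocolate": 0.40,
    "i love chocolate": 0.41,
}
kept = apply_subsumption(weights, t=0.05)
print(sorted(kept))
```

With these numbers, “i love” adds almost nothing over “love” and is dropped, while “love chocolate” clears the threshold and is kept. Unigrams have no subsumers, so they always survive this step.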
Methodology - Parallel Relations
A is parallel to B (A - B)
• POS tag: “ADMIRE_VP” → “like”
• semantic class: “SYN-Affection” → “love”
If A and B have a correlation coefficient greater than some threshold p, one of the attributes is removed to avoid redundancy.
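The parallel-relation filter can be sketched as a greedy pass over features. The slide only says that “one of the attributes is removed”, so keeping the higher-weighted one is an assumption here, as is the use of Pearson correlation over per-document occurrence vectors. The data and function names are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

def apply_parallel(occurrences, weights, p):
    """Visit features by descending weight; drop any feature whose correlation
    with an already-kept feature exceeds p (assumed tie-breaking rule)."""
    kept = []
    for feat in sorted(weights, key=weights.get, reverse=True):
        if all(pearson(occurrences[feat], occurrences[k]) <= p for k in kept):
            kept.append(feat)
    return kept

# Invented per-document occurrence vectors and weights (not from the paper).
occurrences = {
    "love": [1, 1, 0, 0, 1, 0],
    "SYN-Affection": [1, 1, 0, 0, 1, 0],
    "like": [0, 1, 0, 1, 0, 0],
}
weights = {"love": 0.30, "SYN-Affection": 0.25, "like": 0.20}
kept = apply_parallel(occurrences, weights, p=0.90)
print(kept)
```

Here “SYN-Affection” fires on exactly the same documents as “love”, so it is treated as redundant, while “like” is uncorrelated with “love” and stays.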
Experiments - Parameters
• t: 0.0005, 0.005, 0.05, and 0.5
• p: 0.80, 0.90, and 1.00
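The values above form a 4 × 3 grid of threshold settings. The snippet below only enumerates the combinations; how each setting was evaluated is not described on the slide, so no evaluation step is shown. The variable names are hypothetical.

```python
from itertools import product

T_VALUES = (0.0005, 0.005, 0.05, 0.5)  # subsumption threshold t
P_VALUES = (0.80, 0.90, 1.00)          # parallel-relation threshold p

# Every (t, p) pair that a parameter sweep would need to cover.
grid = list(product(T_VALUES, P_VALUES))
for t, p in grid:
    print(f"t={t}, p={p}")
```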
Conclusions
• FRN had significantly higher best accuracy and best percentage within-one across three testbeds.
• The ablation and parameter testing results show that the subsumption and parallel relation thresholds play an important role.
Comments
• Advantages: accurate and computationally efficient
• Disadvantage: sensitive to ablation and parameter settings
• Applications: sentiment classification, feature selection