
Ensemble Learning for Sentiment Analysis

Ensemble Learning for Sentiment Analysis. Robert Christensen, Haibo Ding, Mengyang Wang, Fei Luo. Dec 10, 2013. Sentiment analysis is a research area of NLP that analyzes people’s opinions, sentiments, emotions, etc. One basic task is to classify the polarity of a given text.


Presentation Transcript


  1. Ensemble Learning for Sentiment Analysis Robert Christensen, Haibo Ding, Mengyang Wang, Fei Luo Dec 10 2013

  2. Sentiment Analysis • Research area of NLP • Analyzes people’s opinions, sentiments, emotions, etc. • One basic task is to classify the polarity of a given text. • Why? • Business • People’s opinions influence our behavior (the choices we make and what we buy)

  3. The Problem • Sentiment polarity classification • Classify the given text as positive or negative • In our experiments, we classify movie reviews • For example: • Positive: “A deep and meaningful film” • Negative: “It’s like watching a nightmare made flesh.”

  4. Why Ensemble Learning? • INTUITION: Combining the predictions of multiple classifiers (an ensemble) is often more accurate than relying on a single classifier. • Justification: • It is easy to find quite good “rules of thumb” but hard to find a single highly accurate prediction rule. • If the training set is small and the hypothesis space is large, there may be many equally accurate classifiers. • Exhaustive global search in the hypothesis space is expensive, so we can instead combine the predictions of several locally accurate classifiers.
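The intuition on this slide can be checked with a little arithmetic. The sketch below is not from the slides: it assumes every classifier is correct with the same probability p and that their errors are independent, which is an idealization. Classifiers trained on the same data tend to make correlated errors, so real gains are usually smaller. The function name is hypothetical.

```python
from math import comb


def majority_vote_accuracy(p: float, n: int) -> float:
    """Probability that a majority of n classifiers is correct.

    Assumption (not from the slides): each classifier is correct with
    probability p, independently of the others. n must be odd so that
    a tie between the two classes cannot occur.
    """
    if n % 2 == 0:
        raise ValueError("use an odd number of classifiers to avoid ties")
    needed = n // 2 + 1  # smallest number of correct votes that wins
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(needed, n + 1)
    )


# Accuracy of the vote grows with ensemble size when p > 0.5.
for n in (1, 3, 5, 11):
    print(n, round(majority_vote_accuracy(0.7, n), 4))
```

With p = 0.7, three voters already reach 0.784 under these assumptions; the independence assumption is what does the work, which is why diversity among the stage-1 classifiers matters.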

  5. Ensemble Learning

  6. Ensemble with different algorithms • Train Set → Classification algorithms (CA 1, CA 2, …, CA n) → Model 1, Model 2, …, Model n • Test Set → Model 1, Model 2, …, Model n → Result 1, Result 2, …, Result n → Ensemble → Final results
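The flow on this slide can be sketched in a few lines. This is a minimal illustration, not the authors' system: the slide does not say how the "Ensemble" box combines results, so averaging the positive-class probabilities is an assumption here, and the keyword-based models are hypothetical stand-ins for trained classifiers.

```python
from typing import Callable, Sequence

# A "model" is any function that maps a text to P(positive).
Model = Callable[[str], float]


def ensemble_predict(models: Sequence[Model], text: str) -> int:
    """Average the models' probabilities; return 1 (positive) or 0 (negative)."""
    avg = sum(m(text) for m in models) / len(models)
    return 1 if avg >= 0.5 else 0


def keyword_model(positive_words: set, negative_words: set) -> Model:
    """Hypothetical stand-in for a trained model: counts cue words."""
    def predict(text: str) -> float:
        tokens = text.lower().split()
        pos = sum(t in positive_words for t in tokens)
        neg = sum(t in negative_words for t in tokens)
        return (pos + 1) / (pos + neg + 2)  # smoothed share of positive cues
    return predict


# Three different "models" applied to the same input, then combined.
models = [
    keyword_model({"deep", "meaningful"}, {"nightmare"}),
    keyword_model({"meaningful", "great"}, {"nightmare", "dull"}),
    keyword_model({"deep"}, {"flesh", "nightmare"}),
]
print(ensemble_predict(models, "A deep and meaningful film"))
print(ensemble_predict(models, "It's like watching a nightmare made flesh."))
```

Any combiner with the same signature (for example, a majority vote or a trained stage-2 classifier) could replace the averaging step.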

  7. Ensemble on sampled data • Train Set → Random sampling → sampletrain 1, sampletrain 2, …, sampletrain n → Same classification algorithm → Model 1, Model 2, …, Model n • Test Set → Model 1, Model 2, …, Model n → Result 1, Result 2, …, Result n → Ensemble → Final results
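The random sampling step might look like the sketch below. The slides do not say whether sampling was done with or without replacement, nor how large each sample was; this example assumes sampling with replacement (as in bagging), and the function name, sample size, and seed are all hypothetical.

```python
import random
from typing import List, Sequence, Tuple

Example = Tuple[str, int]  # (review text, label) with 1 = positive, 0 = negative


def sample_training_sets(
    train: Sequence[Example], n_samples: int, sample_size: int, seed: int = 0
) -> List[List[Example]]:
    """Draw n_samples random training sets, each of sample_size examples.

    Assumption: examples are drawn with replacement, so one review may
    appear more than once in a sample.
    """
    rng = random.Random(seed)  # fixed seed keeps the samples reproducible
    return [rng.choices(train, k=sample_size) for _ in range(n_samples)]


train = [
    ("A deep and meaningful film", 1),
    ("It's like watching a nightmare made flesh.", 0),
    ("hypothetical positive review", 1),
    ("hypothetical negative review", 0),
]
samples = sample_training_sets(train, n_samples=3, sample_size=4)
for i, sample in enumerate(samples, start=1):
    print(f"sampletrain {i}: {len(sample)} examples")
```

Each sample would then be used to train one model with the same classification algorithm, and the models' outputs combined as on the slide.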

  8. Experiments and Results • Data • Stanford sentiment analysis data (downloaded from the web) • Train set size: 6920 • Test set size: 1821 • Development set size: 872 • Classification algorithms • Naïve Bayes, SVM, MaxEnt, Logistic Regression • Measurement • Classification accuracy
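Classification accuracy is the fraction of test examples whose predicted label matches the gold label. A minimal sketch follows; the function name is hypothetical and the labels in the usage lines are made up for illustration.

```python
from typing import Sequence


def accuracy(predicted: Sequence[int], gold: Sequence[int]) -> float:
    """Fraction of positions where the predicted label equals the gold label."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)


# Made-up labels: three of four predictions match.
print(accuracy([1, 0, 1, 1], [1, 0, 0, 1]))

# With 1821 test reviews, one review changes accuracy by 100/1821 points.
print(round(100 / 1821, 3))
```

On a test set of 1821 reviews, a single review is worth about 0.055 percentage points, which is useful context when comparing systems whose accuracies differ only slightly.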

  9. Baseline Results • Features: • Bag of words Table 1. Baseline Results
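A bag-of-words representation discards word order and keeps only how often each word occurs. The sketch below shows one way to build it; the lowercasing and the regular-expression tokenizer are assumptions, since the slides do not describe the preprocessing used.

```python
import re
from collections import Counter
from typing import List


def bag_of_words(text: str) -> Counter:
    """Map a text to word counts (assumed tokenizer: lowercase alphanumerics)."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def to_vector(bag: Counter, vocabulary: List[str]) -> List[int]:
    """Turn word counts into a fixed-length count vector over a vocabulary."""
    return [bag[word] for word in vocabulary]


reviews = [
    "A deep and meaningful film",
    "It's like watching a nightmare made flesh.",
]
bags = [bag_of_words(r) for r in reviews]
vocabulary = sorted(set().union(*bags))  # every word seen in the reviews
for review, bag in zip(reviews, bags):
    print(to_vector(bag, vocabulary), review)
```

Vectors of this form are what a classifier such as Naïve Bayes or MaxEnt would consume as input features.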

  10. Results of Ensemble Method 1 • Ensemble with different classification algorithms • Features: • Probabilities from stage-1 classifiers (NB, SVM, MaxEnt) Table 2. Performance of our ensemble system using different classifiers.
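Using stage-1 probabilities as features means the stage-2 classifier sees one number per stage-1 model for each review. The sketch below illustrates the idea with a small logistic regression trained by stochastic gradient descent. The slides do not name the stage-2 classifier or say which data it was trained on, so both choices here are assumptions, and the probability values are made up for illustration only.

```python
import math
from typing import List, Sequence, Tuple


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def train_stage2(
    features: Sequence[Sequence[float]],
    labels: Sequence[int],
    epochs: int = 500,
    lr: float = 0.5,
) -> Tuple[List[float], float]:
    """Fit logistic regression weights on stage-1 probabilities with SGD."""
    w = [0.0] * len(features[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            p = sigmoid(b + sum(wi * xi for wi, xi in zip(w, x)))
            err = p - y  # gradient of the log loss with respect to the logit
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b


def predict_stage2(w: Sequence[float], b: float, x: Sequence[float]) -> int:
    """Return 1 (positive) or 0 (negative) for one row of stage-1 probabilities."""
    return 1 if sigmoid(b + sum(wi * xi for wi, xi in zip(w, x))) >= 0.5 else 0


# Hypothetical P(positive) from three stage-1 classifiers (NB, SVM, MaxEnt)
# on held-out reviews, with gold labels. These numbers are invented.
held_out_features = [
    [0.9, 0.8, 0.7], [0.6, 0.7, 0.4], [0.8, 0.4, 0.9],
    [0.2, 0.3, 0.1], [0.4, 0.2, 0.6], [0.3, 0.5, 0.2],
]
held_out_labels = [1, 1, 1, 0, 0, 0]

w, b = train_stage2(held_out_features, held_out_labels)
print(predict_stage2(w, b, [0.85, 0.9, 0.8]))
print(predict_stage2(w, b, [0.1, 0.2, 0.15]))
```

Training the stage-2 classifier on data the stage-1 classifiers did not see (for example, a development set) is a common precaution against overfitting to their training-set confidence; whether the authors did this is not stated.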

  11. Results of Ensemble Method 2 • Ensemble by sampling train data • Features: • Probabilities from classifiers trained on samples (using MaxEnt classifier) • Stage-1 classifiers’ accuracy: 76.83% (max), 70.47% (avg) Table 3. Performance of our ensemble system using sampled train data.

  12. Results of Ensemble Method 2 • Ensemble by sampling train data • Features: • Probabilities from classifiers trained on samples (using NB classifier) • Stage-1 classifiers’ accuracy: 78.03% (max), 71.56% (avg) Table 4. Performance of our ensemble system using sampled train data.

  13. Conclusion • The ensemble method slightly improves accuracy over the individual stage-1 classifiers. • The ensemble method provides a flexible way to combine multiple trained classifiers. • Future work includes studying the effectiveness of various stage-2 classifier features.

  14. References • http://classes.engr.oregonstate.edu/eecs/fall2011/cs434/notes/ensemble.pdf • Richard Socher et al. Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank. EMNLP 2013. • B. Pang and L. Lee. Opinion Mining and Sentiment Analysis. Foundations and Trends in Information Retrieval, 2008.
