
Sentence-level Subjectivity Analysis

CS581 Project Presentation by Ece Egemen. Subjective sentences vs. objective sentences: «The attempt is courageous, even if the result is wildly uneven.»





Presentation Transcript


  1. Sentence-level Subjectivity Analysis CS581 Project Presentation by Ece Egemen

  2. Sentence-level Subjectivity Analysis • Subjective sentences vs. objective sentences • «The attempt is courageous, even if the result is wildly uneven.» • «Frodo and Sam take Gollum prisoner and continue on to Mordor on the mission to destroy the One Ring.» • Supervised learning approaches • Require extremely large training data • Self-training semi-supervised approach • Requires a small labeled training set • Large unlabeled test set

  3. Similar Work • B. Wang, B. Spencer, C. X. Ling, H. Zhang, “Semi-supervised self-training for sentence subjectivity classification” • Features • Subjectivity Clues • OpinionFinder • Subjectivity Patterns • Sundance Information Extraction System • Pronouns, modal verbs, adjectives, cardinal numbers, adverbs • In the forms of (0, 1, >=2) • Parser • OpenNLP • Self-training • Underlying Classifier • Naive Bayes, C4.4, C4.5, Naive Bayes Tree (NBTree) • Selection Metric • Confidence Degree • Value Difference Metric (VDM) • Evaluates the distance between instances from the differences among feature conditional probability estimates
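The Value Difference Metric mentioned on this slide can be illustrated with a minimal sketch. It assumes the common simplified form for nominal features, in which the distance between two instances is the sum, over features and classes, of the absolute difference between the conditional class probabilities of their feature values. The exact variant used by Wang et al. may differ; the function names, the exponent `q`, and the toy data are hypothetical.

```python
from collections import Counter, defaultdict

def fit_vdm(X, y):
    """Estimate P(class | feature index, feature value) from labeled rows."""
    classes = sorted(set(y))
    counts = defaultdict(Counter)  # (feature index, value) -> class counts
    for row, label in zip(X, y):
        for i, value in enumerate(row):
            counts[(i, value)][label] += 1
    probs = {}
    for key, ctr in counts.items():
        total = sum(ctr.values())
        probs[key] = {c: ctr[c] / total for c in classes}
    return classes, probs

def vdm_distance(a, b, classes, probs, q=1):
    """Sum over features and classes of |P(c | a_i) - P(c | b_i)| ** q."""
    dist = 0.0
    for i, (va, vb) in enumerate(zip(a, b)):
        pa = probs.get((i, va), {})  # unseen values contribute probability 0
        pb = probs.get((i, vb), {})
        dist += sum(abs(pa.get(c, 0.0) - pb.get(c, 0.0)) ** q for c in classes)
    return dist

# Toy data (assumption): each feature is a count bucket 0, 1, or 2 (meaning >= 2).
X = [(0, 2), (0, 1), (2, 0), (1, 0)]
y = ["obj", "obj", "subj", "subj"]
classes, probs = fit_vdm(X, y)

near = vdm_distance((0, 2), (0, 1), classes, probs)  # both look "obj"
far = vdm_distance((0, 2), (2, 0), classes, probs)   # "obj"-like vs "subj"-like
```

Two instances whose feature values predict the same class end up close together even when the raw values differ, which is what makes the metric usable for ranking unlabeled instances.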

  4. Approach • Features • Subjectivity Clues • Clues listed in the work by Riloff et al. (2003) • Strong subjectivity clues: indicate subjective opinion • e.g., charming, flattering, masterpiece, unbelievably • Weak subjectivity clues: may indicate subjective or objective opinion • e.g., abandoned, evaluation, impress, precise • Subjectivity Patterns • Similar to AutoSlog-TS (Riloff, 1996) • Exhaustively search for all patterns in the training set • Assignment of patterns • Frequency and subjectivity percentage • Manual • Syntactic templates and example patterns:
       Syntactic template        Example pattern
       <subj> passive-verb       <subj> was satisfied
       <subj> active-verb        <subj> complained
       passive-verb prep <np>    was worried about <np>
     • Frequencies of all previous features • Number of x / length of the sentence
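The frequency feature on this slide (number of x / length of the sentence) can be sketched as follows. This is a minimal illustration only: it uses the example clue words from the slide rather than the full clue lists, and it assumes whitespace tokenization and a sentence length measured in tokens, whereas the project itself relied on a parser. The name `clue_features` is hypothetical.

```python
# Example clue words taken from the slide; the real clue lists are far larger.
STRONG_CLUES = {"charming", "flattering", "masterpiece", "unbelievably"}
WEAK_CLUES = {"abandoned", "evaluation", "impress", "precise"}

def clue_features(sentence):
    """Return raw clue counts and counts normalized by sentence length."""
    # Assumption: split on whitespace and strip surrounding punctuation.
    tokens = [t.strip(".,!?;:«»\"'").lower() for t in sentence.split()]
    tokens = [t for t in tokens if t]
    n = len(tokens)
    strong = sum(t in STRONG_CLUES for t in tokens)
    weak = sum(t in WEAK_CLUES for t in tokens)
    return {
        "strong_count": strong,
        "weak_count": weak,
        "strong_freq": strong / n if n else 0.0,
        "weak_freq": weak / n if n else 0.0,
    }

features = clue_features("A charming, precise masterpiece.")
```

Normalizing by length keeps a long sentence with a single clue from looking as subjective as a short sentence built almost entirely from clues.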

  5. Approach cont'd • Training process • Underlying classifier • Naive Bayes and J48 decision tree were tried • Support Vector Machines (SVM) • Best supervised result, with 77.2632% correctly classified instances • Self-training • Train with the underlying classifier • Take the unlabeled data that the classifier labeled with a high enough confidence degree and add it to the training data • Iterate until the data runs out, a set number of iterations is reached, or classification accuracy falls below a set percentage • Data • Pang and Lee (2004)'s subjectivity data set v1.0
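The self-training loop described on this slide can be sketched as below. This is not the project's implementation: a toy one-dimensional nearest-centroid classifier stands in for the SVM so that the example stays self-contained, and the accuracy-based stopping rule is omitted. All names, the 0.90 threshold default, and the toy data are assumptions for illustration.

```python
def train_centroids(X, y):
    """Toy stand-in classifier: one mean value per class (1-D features)."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}

def predict_with_confidence(centroids, x):
    """Return (label, confidence); confidence is a normalized inverse distance."""
    weights = {lab: 1.0 / (abs(x - c) + 1e-9) for lab, c in centroids.items()}
    label = max(weights, key=weights.get)
    return label, weights[label] / sum(weights.values())

def self_train(X_lab, y_lab, X_unlab, threshold=0.90, max_iter=10):
    """Move confidently labeled instances from the pool into the training set."""
    X_lab, y_lab, pool = list(X_lab), list(y_lab), list(X_unlab)
    for _ in range(max_iter):
        if not pool:
            break  # no unlabeled data left
        model = train_centroids(X_lab, y_lab)
        remaining, added = [], 0
        for x in pool:
            label, conf = predict_with_confidence(model, x)
            if conf >= threshold:
                X_lab.append(x)
                y_lab.append(label)
                added += 1
            else:
                remaining.append(x)
        pool = remaining
        if added == 0:
            break  # nothing passed the threshold in this round
    return train_centroids(X_lab, y_lab), X_lab, y_lab, pool

model, X_final, y_final, leftover = self_train(
    [0.0, 1.0], ["obj", "subj"], [0.05, 0.95, 0.5]
)
```

In this toy run the two instances near the class means are absorbed in the first round, while the ambiguous instance at 0.5 never reaches the threshold and stays in the pool, so the loop stops once a round adds nothing.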

  6. Key Aspects • Features • Only subjectivity clues and subjectivity patterns • No polarity information (from SentiWordNet etc.) • Frequency added • Parser • StanfordNLP • Classifier • SVM • Possibility of creating highly accurate training data

  7. Results • Results with SVM

  8. Results cont'd • After a few iterations, the classifier hits a wall • We can label a large amount of data with high confidence • What can we do to increase coverage of the data?

  9. Future Work • The Value Difference Metric can be applied • May improve overall performance and allow a good comparison • The process of learning extraction methods can be improved • Another semi-supervised approach • We can experiment with different datasets • Multi-Perspective Question Answering Corpus (MPQA) • Measuring our performance and allowing a good comparison • Automation of the self-training process • This may solve the problem we had • We will be able to choose the number of instances added to the training set at each iteration • Improve the learning process at each iteration • Choosing a confidence threshold of 0.90 seems fine but is not enough

  10. References
     • B. Wang, B. Spencer, C. X. Ling, and H. Zhang, “Semi-supervised self-training for sentence subjectivity classification,” Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 5032 LNAI, pp. 344–355, 2008.
     • E. Riloff and J. Wiebe, “Learning extraction patterns for subjective expressions,” Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), 2003.
     • E. Riloff, J. Wiebe, and T. Wilson, “Learning subjective nouns using extraction pattern bootstrapping,” Proceedings of the Conference on Natural Language Learning (CoNLL), pp. 25–32, 2003.
     • B. Pang and L. Lee, “A sentimental education: sentiment analysis using subjectivity summarization based on minimum cuts,” Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics (ACL), 2004.
     • E. Riloff, “Automatically generating extraction patterns from untagged text,” Proceedings of the AAAI 1996, 1996.
     • R. Remus, “Improving sentence-level subjectivity classification through readability measurement,” Proceedings of the 18th International Nordic Conference of Computational Linguistics (NODALIDA-2011), 2011.
