
Ling 570 Day 9: Text Classification and Sentiment Analysis



  1. Ling 570 Day 9: Text Classification and Sentiment Analysis

  2. Outline • Questions on HW #3 • Discussion of Project #1 • Text Classification • Sentiment Analysis

  3. Project #1

  4. Your goal: political text analysis • Take a document, predict whether it is more Republican or Democratic • We have harvested blog posts from: • The Democratic National Committee • The Republican National Committee • Fox News • The Huffington Post

  5. First task • Can you reconstruct the party affiliation of a given document? • We will gather some novel posts, held out from your training data • You predict the political party of each of these posts to the best of your ability

  6. Second task • Is the media biased? Is a particular news source biased? • Using the classifier that you’ve learned, see whether documents from a particular news source seem to be left- or right-leaning. • What features are most indicative of the party of a given document? • Do you think your classifier is effective in detecting media bias? Why or why not?

  7. Text Classification

  8. Text classification • Also known as “text categorization” • Often an instance of supervised learning • Start with a large body of pre-classified data • Try to map new documents into one of these classes

  9. Text classification • [Slide diagram: labeled training documents are used to train a classifier over a set of classes (often hierarchical); a test document, e.g. “We transcribed the samples of this unusual language in IPA…”, is then mapped to one of the classes]

  10. Classification methods • Manual • Yahoo, back in the day, had a manually curated hierarchy of useful web content • Can be very accurate, consistent… • …but it’s very expensive • Need to move to automatic methods

  11. Text categorization • Given: • A document d ∈ X • X is the set of all possible documents • But we need to represent them usefully somehow! • Often we have a high-dimensional representation • A fixed set of categories C = {c1, …, cJ} • Determine: • The category γ(d) ∈ C of a new document d

  12. Machine learning: Supervised classification • Given: • Instance descriptions x ∈ X • A set of outcomes Y • A training set of labeled pairs (x, y) • Determine: • A classifier f: X → Y • Text classification is a clear instance of this problem

  13. Bayesian methods • Learning based on probability theory • Bayes’ theorem plays a big role • Build a generative model that approximates how the data is produced • Prior probability of each class • Model gives a posterior probability of output given inputs • Naïve Bayes: • Bag of features (generally words) • Assumes features are conditionally independent given the class

  14. Bag of words representation According to a study published in the October issue of Current Biology entitled 'Spontaneous human speech mimicry by a cetacean,' whales can talk. Not to burst your bubble ring or anything, but now that we've suckered you in, let's clarify what we mean by 'talk.' A beluga whale named 'NOC' (he was named for an incredibly annoying sort of Canadian gnat), that lived at the National Marine Mammal Foundation (NMMF) in San Diego up until his death five years ago, had been heard making some weird kinds of vocalizations. At first, nobody was sure that it was him: divers hearing what sounded like 'two people were conversing in the distance just out of range for our understanding.' But then one day, a diver in NOC's tank left the water after clearly hearing someone tell him to get out. It wasn't someone, though: it was some whale, and that some whale was NOC.

  15. Bag of words representation
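The representation itself is easy to compute; a minimal sketch (hypothetical Python, not from the slides):

```python
from collections import Counter

def bag_of_words(text):
    """Map a document to unordered word counts; all word-order information is discarded."""
    return Counter(text.lower().split())   # crude whitespace tokenization

print(bag_of_words("Whales can talk . Not to burst your bubble ring ...").most_common(3))
```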

  16. Bayes’ Rule for text classification • For a document d and a class c, expand the joint probability both ways: P(c, d) = P(c | d) P(d) and P(c, d) = P(d | c) P(c) • So… P(c | d) P(d) = P(d | c) P(c) • Divide by P(d) to get Bayes’ rule, written out below
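Since the slide equations did not survive extraction, here is the build written out as a math block (standard Bayes' rule, reconstructed):

```latex
P(c \mid d)\,P(d) \;=\; P(c, d) \;=\; P(d \mid c)\,P(c)
\qquad\Longrightarrow\qquad
P(c \mid d) \;=\; \frac{P(d \mid c)\,P(c)}{P(d)}
```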

  20. Back to text classification • Bayes’ rule gives us a score for each class: P(c | d) = P(d | c) P(c) / P(d) • P(science) is just… • …the count of science docs / total docs • But how do we model the whole matrix P(d | c)?

  24. The “Naïve” part of Naïve Bayes • Assume that everything is conditionally independent given the class: P(d | c) = P(w1, …, wn | c) ≈ P(w1 | c) · P(w2 | c) · … · P(wn | c)

  25. Return of smoothing… • P(whale | science) is… • The number of science documents containing “whale” • Divided by the number of science documents • What is P(w | science) for a word w that appears in no science document? • 0! Need to smooth… • What would Add-One (Laplace) smoothing look like? (a sketch follows)
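A minimal sketch of the whole classifier under these definitions (hypothetical Python, not from the course materials): document-level "contains word" counts as on the slide, Add-One smoothing to avoid the zero, and log-probabilities to prevent underflow.

```python
import math
from collections import Counter, defaultdict

def train(docs):
    """docs: list of (tokens, label) pairs. Collect document-level word counts."""
    n_docs = Counter()              # number of training documents per class
    df = defaultdict(Counter)       # df[c][w] = number of class-c docs containing w
    vocab = set()
    for tokens, c in docs:
        n_docs[c] += 1
        for w in set(tokens):       # presence/absence, so count each word once per doc
            df[c][w] += 1
            vocab.add(w)
    return n_docs, df, vocab

def p_word(w, c, n_docs, df):
    # Add-One (Laplace) smoothing: never returns 0, even for unseen (word, class) pairs
    return (df[c][w] + 1) / (n_docs[c] + 2)

def classify(tokens, n_docs, df, vocab):
    total = sum(n_docs.values())
    scores = {}
    for c in n_docs:
        logp = math.log(n_docs[c] / total)              # prior: class docs / total docs
        for w in set(tokens) & vocab:                   # for brevity, score only present words;
            logp += math.log(p_word(w, c, n_docs, df))  # a full Bernoulli model also scores absences
        scores[c] = logp
    return max(scores, key=scores.get)

# Toy usage with made-up documents
docs = [("whales can talk and sing".split(), "science"),
        ("the senate passed the bill".split(), "politics")]
model = train(docs)
print(classify("a whale can talk".split(), *model))     # -> "science"
```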

  30. Exercise

  31. Benchmark dataset #1: 20 newsgroups • 18,000 documents from 20 distinct Usenet newsgroups • Newsgroups: a now mostly unused technology for sharing textual information, with hierarchical topical groups
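If you want to poke at the data yourself, scikit-learn ships a loader for this corpus (assumes internet access on first call):

```python
from sklearn.datasets import fetch_20newsgroups

# Downloads and caches the corpus on first use
train = fetch_20newsgroups(subset="train")
print(len(train.data), "documents,", len(train.target_names), "groups")
print(train.target_names[:5])
```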

  32. Results:

  33. Evaluation methods • “macro”-averaging: • Compute Precision and Recall for each category • Take average of per-category precision and recall values


  35. Evaluation methods • What is the analogue of precision and recall for multiclass classification? • We can still compute precision and recall as usual for each category • Then pool the counts across categories and compute precision and recall from the totals • This is called “micro-averaging”, and it focuses on document-level accuracy (a sketch of both averages follows)
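A sketch of the two averages (hypothetical Python; the per-class counts are made up for illustration):

```python
def macro_micro(per_class):
    """per_class: {label: (tp, fp, fn)} counts from a multiclass classifier."""
    # Macro: average the per-class precision/recall values (every class counts equally)
    ps = [tp / (tp + fp) for tp, fp, fn in per_class.values()]
    rs = [tp / (tp + fn) for tp, fp, fn in per_class.values()]
    macro = (sum(ps) / len(ps), sum(rs) / len(rs))
    # Micro: pool the raw counts first (frequent classes dominate -> document-level view)
    tp = sum(c[0] for c in per_class.values())
    fp = sum(c[1] for c in per_class.values())
    fn = sum(c[2] for c in per_class.values())
    micro = (tp / (tp + fp), tp / (tp + fn))
    return macro, micro

# Illustrative counts: one large, easy class and one small, hard class
print(macro_micro({"science": (90, 10, 10), "politics": (5, 1, 20)}))
```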

  36. Feature selection

  37. Sentiment Analysis

  38. Sentiment Analysis • Consider movie reviews: • Given a review from a site like Rotten Tomatoes, try to detect whether the reviewer liked the movie • Some observations: • Humans can quickly and easily identify sentiment • Often easier than performing topic classification • Suspicion: certain words may be indicative of sentiment

  39. Simple Experiment [Pang, Lee, Vaithyanathan, EMNLP 2002] • Ask two grad students to come up with a list of words charged with sentiment • Create a very simple, deterministic classifier based on this: • Count the number of positive and negative hits • Break ties to increase accuracy (see the sketch below) • Compare to automatically extracted lists
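A sketch of that decision rule (hypothetical Python; the word lists here are illustrative stand-ins, not the students' actual lists):

```python
POSITIVE = {"dazzling", "brilliant", "gripping", "moving"}   # illustrative only
NEGATIVE = {"bad", "cliched", "boring", "slow"}              # illustrative only

def lexicon_classify(tokens, tie_break="positive"):
    """Count positive vs. negative hits; ties fall to a fixed class
    (Pang et al. broke ties in whichever direction maximized accuracy)."""
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return tie_break   # ties are frequent with short lists -- a real weakness

print(lexicon_classify("a gripping but slow film".split()))  # tie -> "positive"
```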

  41. Toward more solid machine learning • The prior decision rule was very heuristic • Just counting the number of charged words • Ties are a significant issue • What happens when we shift to something more complex? • Naïve Bayes • Maximum Entropy (aka logistic regression, aka log-linear models) • Support Vector Machines

  43. Experimental results Baseline was 69% accuracy. Here we get just under 79% with all words, using raw frequency counts as features. What happens when we use binary (presence/absence) features instead? (a sketch of the comparison follows)
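One way to run that comparison (a hypothetical scikit-learn sketch, not the paper's actual setup; the stand-in data below should be replaced with the real movie-review corpus):

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import MultinomialNB

# Stand-in data; substitute the real reviews and pos/neg labels here
reviews = ["a dazzling and moving film", "boring , cliched , and slow",
           "gripping from start to finish", "a bad film , plain and simple"] * 50
labels = ["pos", "neg", "pos", "neg"] * 50

for binary in (False, True):
    vec = CountVectorizer(binary=binary)   # binary=True keeps only presence/absence
    X = vec.fit_transform(reviews)
    acc = cross_val_score(MultinomialNB(), X, labels, cv=5).mean()
    print("binary" if binary else "frequency", round(acc, 3))
```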

  44. Experimental results Unigrams are pretty good – what happens when we add bigrams?

  45. Experimental results Why are bigrams alone worse than unigrams and bigrams together?

  46. Experimental results
