
Towards a Robust Metric of Opinion


Presentation Transcript


  1. Towards a Robust Metric of Opinion Kamal Nigam, Matthew Hurst Intelliseek Applied Research Center AAAI Spring Symposium on Exploring Attitude and Affect in Text 2004

  2. Abstract • This paper describes an automated system for detecting polar expressions about a topic of interest. • The two elementary components are: • a shallow NLP polar language extraction system • a machine learning based topic classifier. • Based on these components, we discuss how to measure the overall sentiment about a particular topic as expressed in online messages authored by many different people. • The goal of this research is to create text analysis techniques that facilitate real-world market research over first-person commentary from the internet.

  3. Introduction • The general approach is: • Segment the corpus into individual expressions (at the sentence level). • Use a general-purpose polar language module and a topic classifier to identify individual polar expressions about the topic of interest. • Aggregate these individual expressions into a single score (a minimal sketch of this pipeline follows below).
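
As a concrete illustration of the three-stage approach, here is a minimal Python sketch. The helper names (split_sentences, is_topical, is_polar) and the final ratio score are hypothetical placeholders for illustration, not the authors' actual interfaces:

```python
# A minimal sketch of the three-stage pipeline: segment, classify, aggregate.
# split_sentences, is_topical, and is_polar are hypothetical placeholders.

def score_topic(messages, split_sentences, is_topical, is_polar):
    """Aggregate sentence-level polarity judgments into a single score."""
    positive, negative = 0, 0
    for message in messages:
        for sentence in split_sentences(message):
            if not is_topical(sentence):
                continue
            polarity = is_polar(sentence)  # returns '+', '-', or None
            if polarity == '+':
                positive += 1
            elif polarity == '-':
                negative += 1
    # One simplistic aggregate (see slide 18): the positive/negative ratio.
    return positive / max(negative, 1)
```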

  4. Classes of Polar Expressions • Focus on two general aspects of expression. • Opinion: statements used by the author to communicate opinion reflect a personal state. • I love this car. (positive opinion) • I hate the BrightScreen LCD's resolution. (negative opinion) • Evaluative factual: statements that are objective but describe what is generally held to be a desirable or undesirable state. • The screen is broken. (describing an undesirable state) • My BrightScreen LCD had many dead pixels. (negatively oriented) • These judgments are context and time dependent.

  5. Definition of Polar Expressions • The first dimension is explicit versus implicit language. • Explicit expressions include direct statements (I like it.) and subjective evaluative language (It is good.) • Implicit expressions, on the other hand, involve sarcasm, irony, idiom, and other deeper cultural referents. • Ex: It's great if you like dead batteries. • The next dimension is the object about which an opinion is being expressed. This may be an established 'real world' concept, or it may be a hypothetical 'possible worlds' concept. • Ex: I love my new car (real world) • I am searching for the best possible deal (possible worlds)

  6. Recognizing Polar Language • A third aspect that concerns us is that of modality, conditionals, and temporal aspects of the language used. • "I might enjoy it" has a clear evaluative element (enjoy it) but does not express a definite opinion. • "If it were larger..." describes a condition, not a held opinion. • A final matter is attribution. • "I think you will like it." might mean the author likes it and assumes the reader has the same taste, and so on.

  7. Definition of Polar Expressions • A polar phrase extraction system was implemented. • A lexicon is developed which is tuned to the domain being explored. • Each item in the lexicon is a pairing of a word and its part of speech (POS).

  8. Recognizing Polar Language • The interpretation phase carries out two tasks. • First, a tokenization and tagging step that attaches POS and polarity annotations (a sketch follows below): • This car is really great. • {this, car, is, really, great} • {this_DT, car_NN, is_VB, really_RR, great_JJ} • {this_DT, car_NN, is_VB, really_RR, great_JJ;+} • Second, the elevation of semantic information from lower constituents to higher ones, applying negation logic where appropriate, and assembling larger constituents from smaller ones. Rules are applied in a fixed order.
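
The tagging step can be illustrated with a small sketch, assuming a lexicon keyed by (word, POS) pairs as described on slide 7. The lexicon entries here are illustrative, and the POS tagger itself is taken as given:

```python
# A minimal sketch of lexicon-based polarity tagging. The lexicon content
# is a hypothetical example; tagged input is assumed to come from a POS tagger.

# Hypothetical domain lexicon: (word, POS) -> polarity
LEXICON = {
    ("great", "JJ"): "+",
    ("love", "VB"): "+",
    ("broken", "JJ"): "-",
}

def tag_tokens(tagged_tokens):
    """Attach polarity marks to POS-tagged tokens, e.g. great_JJ -> great_JJ;+"""
    out = []
    for word, pos in tagged_tokens:
        polarity = LEXICON.get((word.lower(), pos))
        out.append(f"{word}_{pos};{polarity}" if polarity else f"{word}_{pos}")
    return out

# Reproduces the slide's example annotation:
print(tag_tokens([("this", "DT"), ("car", "NN"), ("is", "VB"),
                  ("really", "RR"), ("great", "JJ")]))
# ['this_DT', 'car_NN', 'is_VB', 'really_RR', 'great_JJ;+']
```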

  9. Recognizing Polar Language • Chunking phase: • {(this DT) DET, (car NN) BNP, (is VB) BVP, (really RR, great JJ;+) BADJP} • The chunked input is then processed to form higher-order groupings from a limited set of syntactic patterns. • The simple syntactic patterns used to combine semantic features are: • Predicative modification (it is good), • Attributive modification (a good car), • Equality (it is a good car), and • Polar clause (it broke my car). • Negation of the following types is captured by the system (see the simplified sketch below): • Verbal attachment (it is not good, it isn't good), • Adverbial negatives (I never really liked it, it is never any good), • Determiners (it is no good), and • Superordinate scope (I don't think they made their best offer).
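
Negation handling can be sketched in simplified form. Note that the real system applies negation logic over syntactic constituents during interpretation; the flat token window used here is an illustrative simplification, not the authors' implementation:

```python
# A simplified sketch of negation logic: the polarity elevated from a
# constituent is flipped when a negator appears within its (flat) scope.

NEGATORS = {"not", "n't", "never", "no"}

def constituent_polarity(tokens, polarity):
    """Flip the polarity of a constituent if a negator appears within it."""
    if any(t.lower() in NEGATORS for t in tokens):
        return "+" if polarity == "-" else "-"
    return polarity

print(constituent_polarity(["it", "is", "not", "good"], "+"))  # '-'
print(constituent_polarity(["it", "is", "no", "good"], "+"))   # '-'
```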

  10. Topic Detection in Online Messages • Winnow classifier (Littlestone 1988; Blum 1997; Dagan, Karov, & Roth 1997), an online learning algorithm that finds a linear separator between the class of documents that are topical and the class of documents that are irrelevant. • Documents are modeled with the standard bag-of-words representation, which discards the ordering of words and notices only whether or not a word occurs in a document. • c_w(x) = 1 if word w occurs in document x, and c_w(x) = 0 otherwise. • f_w is the weight for feature w. • The classifier computes h(x) = Σ_w f_w c_w(x). If h(x) > V for a threshold V, the classifier predicts topical; otherwise it predicts irrelevant. • The basic Winnow algorithm proceeds as follows: Initialize all f_w to 1. For each labeled document x in the training set, calculate h(x). • If the document is topical, but Winnow predicts irrelevant, update each weight f_w where c_w(x) = 1 by f_w *= 2. • If the document is irrelevant, but Winnow predicts topical, update each weight f_w where c_w(x) = 1 by f_w /= 2. • (A minimal implementation follows below.)
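
The update rule above translates directly into code. This is a minimal sketch assuming binary word-occurrence features and a caller-supplied threshold V; the training-data format is an assumption for illustration:

```python
# A minimal implementation of the Winnow algorithm described on this slide.
# Features are binary word-occurrence indicators c_w(x); weights are f_w.

def train_winnow(documents, vocabulary, threshold, epochs=1):
    """documents: list of (set_of_words, is_topical) pairs."""
    weights = {w: 1.0 for w in vocabulary}          # initialize all f_w to 1
    for _ in range(epochs):
        for words, is_topical in documents:
            h = sum(weights[w] for w in words if w in weights)
            predicted_topical = h > threshold
            if is_topical and not predicted_topical:
                for w in words:                     # promote: f_w *= 2
                    if w in weights:
                        weights[w] *= 2
            elif not is_topical and predicted_topical:
                for w in words:                     # demote: f_w /= 2
                    if w in weights:
                        weights[w] /= 2
    return weights

def predict(weights, words, threshold):
    """Predict topical iff h(x) = sum of f_w over occurring words exceeds V."""
    return sum(weights.get(w, 0.0) for w in words) > threshold
```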

  11. If a document is judged by the classifier to be irrelevant, we predict that all sentences in that document are also irrelevant. If a document is judged to be topical, then we further examine each sentence in that document. • Given each sentence and our text classifier, we simply form a bag-of-words representation of the sentence as if an entire document consisted of that single sentence. We then run the classifier on the derived pseudo-document (see the sketch below). • If the classifier predicts topical, then we label the sentence as topical; otherwise we label the sentence as irrelevant. • Note the distribution mismatch between training samples (whole documents) and testing samples (single sentences). • This adapts a given document classifier into a high-precision/low-recall sentence classifier. • The number of topical training documents is relatively small. • Because a sentence is much shorter than an average document, the words in the sentence need to be very topical in order for the classifier to predict positive. • This leads directly to a loss in recall.
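
A short sketch of this document-then-sentence cascade, reusing the hypothetical predict helper from the Winnow sketch above:

```python
# A minimal sketch of sentence-level classification via pseudo-documents.
# Each sentence is represented as its own bag of words; the whole-document
# judgment gates the sentence-level pass, as described on the slide.

def topical_sentences(sentences_as_word_lists, weights, threshold):
    """Return only the sentences the classifier judges topical."""
    # Gate on the whole document first: if it is irrelevant, so is every sentence.
    all_words = set().union(*sentences_as_word_lists)
    if not predict(weights, all_words, threshold):
        return []
    # Otherwise classify each sentence as if it were an entire document.
    return [s for s in sentences_as_word_lists
            if predict(weights, set(s), threshold)]
```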

  12. Intersection of Topic and Polarity • We assert that a sentence judged to be polar and also judged to be topical is indeed expressing polarity about the topic.

  13. Experimental Testbed • 34,000 automotive-domain messages from online resources (Usenet, online message boards, etc.). • Trained the topic classifier and prepared the polarity lexicon. • Hand-labeled a separate random sample of 822 messages for topicality; 88 (11%) were topical. • Hand-labeled all 1,298 sentences in the 88 topical messages for topicality and polarity (positive and/or negative). • For the 7,649 sentences in messages that were not topical, every sentence was automatically labeled as topically irrelevant, and thus as containing no topical polarity either. • Evaluate the polarity module in isolation using the 1,298 sentences with complete polarity labels. • Use the full dataset for evaluating topic detection and its combination with polarity. • To evaluate the difficulty of the task for humans, a second labeler repeated the hand-labeling on the 1,298 sentences.

  14. Experimental Results • Sentences predicted as topical positive: • The B&W display is great in the sun. • Although I really don't care for a cover, I like what COMPANY-A has done with the rotating screen, or even better yet, the concept from COMPANY-B with the horizontally rotating screen and large foldable keyboard. • At that time, superior screen. • Sentences predicted as topical negative: • Compared to the PRODUCT's screen this thing is very very poor. • I never had a problem with the PRODUCT-A, but did encounter the "Dust/Glass Under The Screen Problem" associated with PRODUCT-B. • broken PRODUCT screen • It is very difficult to take a picture of a screen. • In multimedia I think the winner is not that clear when you consider that PRODUCT-A has a higher resolution screen than PRODUCT-B and built in camera.

  15. Experimental Results • Precision is very similar to human agreement. • Recall is significantly lower than human performance, largely due to indirect language (implicit polar language; see the definition above). • Recall of negative polarity is quite a bit lower than that of positive polarity. • This confirms the observation that the language used for negative commentary is much more varied than that used for positive commentary.

  16. Experimental Results • Recall at the sentence level is lower than at the message level. • Adding anaphora resolution could improve recall. • Used the truth data to test the basic assumption for correlating topic and polar language: that any expression of polarity in a topical sentence is expressing polarity about the topic itself. • Of topical sentences containing positive polarity, 91% (90/99) were positive about the topic itself; for negative polarity the figure was 80% (60/75). • These figures are an upper bound on the precision of the combined system.

  17. Experimental Results • The precision for both positive and negative topical extraction is lower than for positive and negative extraction in isolation.

  18. Metric for Topic and Polarity: Discussion • How can the polarity and topic modules be used to compile an aggregate score for a topic based on expressions contained in the data? • We envision an aggregate topical orientation metric to be a function of: • The total number of expressions in a domain • The true frequency of expressions that are topical • The true frequency of topical expressions that are positively oriented for that topic • The true frequency of topical expressions that are negatively oriented for that topic • A simplistic metric might be just the ratio of positive to negative expressions about a topic. • The metric should not only support a single estimate of orientation, but also include some confidence in its measure. This will allow us to compare two or more competing topics of interest and say with a quantifiable probability that one topic is more favorably received than another (see the sketch below).
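
One way to attach a confidence to the simplistic ratio metric is a Beta posterior over the positive fraction, in the spirit of the Bayesian framing on the next slide. This particular choice of posterior is an illustrative assumption, not the authors' specified method:

```python
# A minimal sketch: compare two topics' orientations with a quantifiable
# probability, using Beta posteriors over the positive fraction.
import random

def orientation_samples(n_pos, n_neg, draws=10000):
    """Posterior samples of P(positive | polar) under a uniform Beta prior."""
    return [random.betavariate(n_pos + 1, n_neg + 1) for _ in range(draws)]

def prob_a_more_favorable(pos_a, neg_a, pos_b, neg_b):
    """Estimate P(topic A is more positively oriented than topic B)."""
    a = orientation_samples(pos_a, neg_a)
    b = orientation_samples(pos_b, neg_b)
    return sum(x > y for x, y in zip(a, b)) / len(a)

# e.g. prob_a_more_favorable(90, 20, 60, 30) -> a quantifiable probability
```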

  19. Metric for Topic and Polarity: Discussion • Estimate the true frequencies of topic and polarity as an exercise in Bayesian statistics. • The model we propose has a set of parameters that are fixed for each domain and topic: • With probability P(topic), any expression will be written about the topic of interest. • With probability P(pos|topic), any topical expression will be positive about the topic. • With probability P(neg|topic), any topical expression will be negative about the topic.

  20. Metric for Topic and Polarity: Discussion • We observe expressions only through the output of our topic and polarity modules, which cloud the data: • With probability P(topic,falsepos) we observe a truly irrelevant expression as a topical one. • With probability P(topic,falseneg) we observe a truly topical expression as being irrelevant. • With probability P(pos,falsepos) we observe a positive topical expression when there is none. • With probability P(pos,falseneg) we do not observe a positive topical expression when there really is one. • With probability P(neg,falsepos) we observe a negative topical expression when there is none. • With probability P(neg,falseneg) we do not observe a negative topical expression when there really is one. • Using this explicit generative process for the data, we can use our observed data with standard statistical techniques to estimate the true P(topic), P(pos|topic), and P(neg|topic) (a minimal correction sketch follows below).
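
A minimal sketch of such a correction: with false-positive rate fp and false-negative rate fn, the observed rate is p_obs = p_true * (1 - fn) + (1 - p_true) * fp, which inverts to the estimator below. The numeric values are placeholders; in practice the error rates would be estimated from the hand-labeled data:

```python
# A minimal sketch: recover a true rate from a rate observed through a
# noisy classifier, given its false-positive and false-negative rates.

def true_rate(p_obs, fp, fn):
    """Method-of-moments estimate of the true rate from the observed one."""
    p = (p_obs - fp) / (1.0 - fp - fn)
    return min(max(p, 0.0), 1.0)   # clip to a valid probability

# e.g. an observed 15% topical rate with fp=0.05, fn=0.20:
print(true_rate(0.15, fp=0.05, fn=0.20))   # ~0.133
```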

  21. Conclusion and Future Work • A number of steps are required to complete the system and improve its performance: • Improve polarity recognition. • Improve recall for topic detection. • Implement and test the metric described above.
