
Learning Subjective Nouns using Extraction Pattern Bootstrapping



Presentation Transcript


1. Learning Subjective Nouns using Extraction Pattern Bootstrapping
   Ellen Riloff, Janyce Wiebe, Theresa Wilson
   Presenter: Gabriel Nicolae

2. Subjectivity – the Annotation Scheme
   • http://www.cs.pitt.edu/~wiebe/pubs/ardasummer02/
   • Goal: identify and characterize expressions of private states in a sentence.
   • Private state = opinions, evaluations, emotions and speculations.
   • Each private state is also judged for strength: low, medium, high, extreme.
   • Annotation gold standard: a sentence is
     • subjective if it contains at least one private-state expression of medium or higher strength
     • objective otherwise.
   • Example: "The time has come, gentlemen, for Sharon, the assassin, to realize that injustice cannot last long."
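The sentence-level gold standard is a simple decision rule, sketched below in Python. This is illustrative only: it assumes a sentence's annotations are available as (expression, strength) pairs, and the function and variable names are hypothetical rather than part of the annotation scheme.

```python
# Minimal sketch of the sentence-level gold standard (illustrative assumptions:
# annotations arrive as (expression, strength) pairs).
STRENGTH_ORDER = ["low", "medium", "high", "extreme"]

def gold_label(private_state_annotations):
    """Subjective if any private-state expression has medium or higher strength."""
    for _expression, strength in private_state_annotations:
        if STRENGTH_ORDER.index(strength) >= STRENGTH_ORDER.index("medium"):
            return "subjective"
    return "objective"
```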

3. Using Extraction Patterns to Learn Subjective Nouns – Meta-Bootstrapping (1/2)
   • (Riloff and Jones 1999)
   • Mutual bootstrapping:
     • Begin with a small set of seed words that represent a targeted semantic category (e.g. 10 words that represent LOCATIONS) and an unannotated corpus.
     • Produce thousands of extraction patterns for the entire corpus (e.g. "<subject> was hired").
     • Compute a score for each pattern based on the number of seed words among its extractions.
     • Select the best pattern; all of its extracted noun phrases are labeled with the target semantic category.
     • Re-score the extraction patterns using the original seed words plus the newly labeled words, and repeat (a minimal sketch of this loop follows the slide).
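A minimal Python sketch of the mutual bootstrapping loop described on this slide. It assumes the extraction patterns have already been generated and are represented as a mapping from each pattern to the set of noun phrases it extracts, and it simplifies the pattern score to a raw count of seed words among the extractions (the original system uses a more elaborate metric). All names are illustrative.

```python
# Minimal sketch of mutual bootstrapping (Riloff and Jones 1999), with
# simplified pattern scoring.

def mutual_bootstrapping(pattern_extractions, seed_words, iterations=50):
    """pattern_extractions: {pattern: set of extracted noun phrases}
       seed_words: initial members of the target semantic category."""
    lexicon = set(seed_words)
    used_patterns = set()
    for _ in range(iterations):
        # Score every unused pattern by how many known category members it extracts.
        scores = {
            p: len(extractions & lexicon)
            for p, extractions in pattern_extractions.items()
            if p not in used_patterns
        }
        if not scores:
            break
        best = max(scores, key=scores.get)
        if scores[best] == 0:
            break
        # Label all noun phrases extracted by the best pattern as category members.
        lexicon |= pattern_extractions[best]
        used_patterns.add(best)
        # Patterns are re-scored on the next iteration with the enlarged lexicon.
    return lexicon, used_patterns
```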

4. Using Extraction Patterns to Learn Subjective Nouns – Meta-Bootstrapping (2/2)
   • Meta-bootstrapping:
     • After normal bootstrapping finishes, all nouns that were put into the semantic dictionary are re-evaluated.
     • Each noun is assigned a score based on how many different patterns extracted it.
     • Only the 5 best nouns are allowed to remain in the dictionary; the others are discarded.
     • Mutual bootstrapping is then restarted (see the sketch below).
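A sketch of the meta-bootstrapping layer, reusing the mutual_bootstrapping function from the previous sketch. The noun score here is simply the number of distinct patterns that extracted the noun, matching the slide's description; the remaining details (round counts, parameter names) are illustrative assumptions.

```python
# Sketch of meta-bootstrapping around mutual bootstrapping; reuses
# mutual_bootstrapping() from the previous sketch.

def meta_bootstrapping(pattern_extractions, seed_words, rounds=10, keep=5):
    lexicon = set(seed_words)
    for _ in range(rounds):
        # Run ordinary mutual bootstrapping from the current lexicon.
        candidates, _ = mutual_bootstrapping(pattern_extractions, lexicon)
        new_nouns = candidates - lexicon
        if not new_nouns:
            break

        # Re-evaluate every new noun: how many different patterns extracted it?
        def support(noun):
            return sum(noun in extracted for extracted in pattern_extractions.values())

        # Keep only the top nouns; discard the rest and restart bootstrapping.
        best_new = sorted(new_nouns, key=support, reverse=True)[:keep]
        lexicon |= set(best_new)
    return lexicon
```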

5. Using Extraction Patterns to Learn Subjective Nouns – Basilisk
   • (Thelen and Riloff 2002)
   • Begin with an unannotated text corpus and a small set of seed words for a semantic category.
   • Bootstrapping:
     • Basilisk automatically generates a set of extraction patterns for the corpus and scores each pattern based upon the number of seed words among its extractions; the best patterns go into the Pattern Pool.
     • All nouns extracted by a pattern in the Pattern Pool go into the Candidate Word Pool. Basilisk scores each noun based upon the set of patterns that extracted it and their collective association with the seed words.
     • The top 10 nouns are labeled with the targeted semantic class and are added to the dictionary.
     • The bootstrapping process then repeats (a sketch of one cycle follows the slide).
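A rough sketch of one Basilisk cycle, using the same data representation as the earlier sketches. The pattern and word scores here are simplified stand-ins for the scoring functions used in the real system; pool size and words-per-cycle are parameters chosen for illustration.

```python
import math

# Sketch of one Basilisk bootstrapping cycle (Thelen and Riloff 2002), with
# simplified scoring.

def basilisk_cycle(pattern_extractions, lexicon, pool_size=20, words_per_cycle=10):
    # 1. Score each pattern by the number of lexicon words among its extractions
    #    and keep the best patterns in the Pattern Pool.
    def pattern_score(p):
        return len(pattern_extractions[p] & lexicon)
    pattern_pool = sorted(pattern_extractions, key=pattern_score, reverse=True)[:pool_size]

    # 2. Every noun extracted by a pool pattern goes into the Candidate Word Pool.
    candidates = set().union(*(pattern_extractions[p] for p in pattern_pool)) - lexicon

    # 3. Score each candidate by the collective association of its patterns
    #    with the lexicon (an average-log-style measure).
    def word_score(noun):
        hits = [len(pattern_extractions[p] & lexicon) + 1
                for p in pattern_pool if noun in pattern_extractions[p]]
        return sum(math.log2(h) for h in hits) / len(hits) if hits else 0.0

    # 4. The top-scoring nouns are labeled with the semantic class and added.
    best = sorted(candidates, key=word_score, reverse=True)[:words_per_cycle]
    lexicon |= set(best)
    return lexicon
```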

6. Using Extraction Patterns to Learn Subjective Nouns – Experimental Results
   • The graph tracks accuracy as bootstrapping progressed.
   • Accuracy was high during the initial iterations but tapered off as bootstrapping continued: after 20 words, both algorithms were 95% accurate; after 100 words, Basilisk was 75% accurate and MetaBoot 81%; after 1000 words, MetaBoot was 28% accurate and Basilisk 53%.

7. Creating Subjectivity Classifiers – Subjective Noun Features
   • Naïve Bayes classifier using the nouns as features. Sets:
     • BA-Strong: the set of StrongSubjective nouns generated by Basilisk
     • BA-Weak: the set of WeakSubjective nouns generated by Basilisk
     • MB-Strong: the set of StrongSubjective nouns generated by Meta-Bootstrapping
     • MB-Weak: the set of WeakSubjective nouns generated by Meta-Bootstrapping
   • For each set, a three-valued feature: presence of 0, 1, or ≥2 words from that set (see the sketch below).
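A sketch of how the four three-valued noun features could be computed. It assumes each lexicon is a Python set of nouns and each sentence is a list of lowercased tokens; the function and dictionary names are illustrative.

```python
# Three-valued subjective-noun features (0, 1, or "two or more").

def noun_set_feature(tokens, noun_set):
    """Return 0, 1, or 2, where 2 stands for two or more nouns from the set."""
    count = sum(1 for t in tokens if t in noun_set)
    return min(count, 2)

def subj_noun_features(tokens, lexicons):
    """lexicons: {'BA-Strong': set, 'BA-Weak': set, 'MB-Strong': set, 'MB-Weak': set}"""
    return {name: noun_set_feature(tokens, nouns) for name, nouns in lexicons.items()}
```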

8. Creating Subjectivity Classifiers – Previously Established Features
   • (Wiebe, Bruce, O'Hara 1999)
   • Sets:
     • subjStems: a set of stems positively correlated with the subjective training examples
     • objStems: a set of stems positively correlated with the objective training examples
   • For each set, a three-valued feature: the presence of 0, 1, or ≥2 members of the set.
   • A binary feature for each of: presence in the sentence of a pronoun, an adjective, a cardinal number, a modal other than "will", an adverb other than "not" (see the sketch below).
   • Other features drawn from other researchers' work.
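A sketch of the binary part-of-speech features, under the assumption that each sentence is available as (token, tag) pairs with Penn Treebank tags; the tag mapping and feature names are assumptions for illustration, not taken from the paper.

```python
# Binary features: pronoun, adjective, cardinal number, modal other than
# "will", adverb other than "not" (Penn Treebank tags assumed).

def wbo_binary_features(tagged_tokens):
    feats = {
        "pronoun": 0, "adjective": 0, "cardinal": 0,
        "modal_not_will": 0, "adverb_not_not": 0,
    }
    for token, tag in tagged_tokens:
        word = token.lower()
        if tag in ("PRP", "PRP$"):
            feats["pronoun"] = 1
        elif tag.startswith("JJ"):
            feats["adjective"] = 1
        elif tag == "CD":
            feats["cardinal"] = 1
        elif tag == "MD" and word != "will":
            feats["modal_not_will"] = 1
        elif tag.startswith("RB") and word != "not":
            feats["adverb_not_not"] = 1
    return feats
```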

9. Creating Subjectivity Classifiers – Discourse Features
   • subjClues = all sets defined before except objStems
   • Four features:
     • ClueRate_subj for the previous and following sentences
     • ClueRate_obj for the previous and following sentences
   • A feature for sentence length (an illustrative sketch follows the slide).
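The slide does not restate how ClueRate is computed, so the sketch below assumes a simple definition: the number of clue instances in a sentence divided by its length. Treat that definition, and all names here, as illustrative assumptions rather than the paper's exact formula.

```python
# Discourse features around the current sentence; ClueRate definition is an
# assumption (clue instances normalized by sentence length).

def clue_rate(tokens, clues):
    return sum(1 for t in tokens if t in clues) / max(len(tokens), 1)

def discourse_features(prev_sent, next_sent, cur_sent, subj_clues, obj_stems):
    return {
        "clue_rate_subj_prev": clue_rate(prev_sent, subj_clues),
        "clue_rate_subj_next": clue_rate(next_sent, subj_clues),
        "clue_rate_obj_prev": clue_rate(prev_sent, obj_stems),
        "clue_rate_obj_next": clue_rate(next_sent, obj_stems),
        "sentence_length": len(cur_sent),
    }
```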

10. Creating Subjectivity Classifiers – Classification Results
   • The results of Naïve Bayes classifiers trained with different combinations of features.
   • Using both WBO and SubjNoun achieves better performance than either one alone.
   • The best results are achieved with all the features combined.
   • Another classifier, with higher precision, can be obtained by classifying a sentence as subjective if it contains any of the StrongSubjective nouns (see the sketch below):
     • 87% precision
     • 26% recall
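The high-precision rule in the last bullet is straightforward to express. The sketch assumes the StrongSubjective nouns are available as a single set and the sentence as a list of lowercased tokens; the reported figures (87% precision, 26% recall) come from the slide, not from this code.

```python
# Rule-based classifier: subjective if the sentence contains any
# StrongSubjective noun.

def rule_classifier(tokens, strong_subjective):
    return "subjective" if any(t in strong_subjective for t in tokens) else "objective"
```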
