
Presentation Transcript


  1. Self-Selection

  2. Self-Selection and Information Role of Online Product Reviews. Xinxin Li and Lorin Hitt, The Wharton School, University of Pennsylvania. Workshop on Information Systems and Economics (WISE 2004)

  3. Outline • Introduction • Data Collection • Trend in Consumer Reviews • Impact of Consumer Reviews on Book Sales • Theory Model and Implications • Conclusion

  4. Introduction • Word of mouth has long been recognized as a major driver of product sales. • eBay-like online reputation systems: a large body of existing work • Product review websites: very little systematic research

  5. Self-Selection Problem • The efficacy of consumer-generated product reviews may be limited for at least two reasons. • Firms may manipulate online rating services by paying individuals to provide high ratings. • Reported ratings may be inconsistent with the preferences of the general population. • Ratings of products may reflect consumer taste as well as quality.

  6. Major Research Questions • Early adopters may have significantly different preferences than later adopters, which creates trends in ratings as products diffuse. • We consider whether consumers account for these biases in ratings when making purchase decisions.

  7. Data Collection • A random sample of 2651 hardback books was collected from “Books in Print”, covering books published from 2000-2004 that also have reviews on Amazon. • Book characteristic information • title • author • publisher • publication date • category • publication date for corresponding paperback editions • consumer reviews • Sales-related data (every Friday from March to July in 2004) • sales rank • price • the number of consumer reviews • the average review • shipping availability

  8. Trend in Consumer Reviews • The Box-Cox model, a fixed-effects specification of the form (AvgRating_it^λ − 1)/λ = β0 + β1·T + u_i + ε_it, where • AvgRating_it: the average review for book i at time t • T: the time difference between the date the average review was posted and the date the book was released • u_i: the idiosyncratic characteristics of each individual book that remain constant over time

  9. Trend in Consumer Reviews (contd.)

  10. Impact of Consumer Reviews on Book Sales • Sales rank is a log-linear function of book sales with a negative slope.
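The slide does not reproduce the conversion equation; the usual log-linear specification in this literature, written here as a sketch with α and β as assumed parameter names, is:

```latex
\ln(\mathit{SalesRank}_{it}) = \alpha + \beta \,\ln(\mathit{Sales}_{it}), \qquad \beta < 0
```

With β < 0, a better (lower) rank corresponds to higher sales, so observed ranks can stand in for unobserved sales up to the constants α and β.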

  11. Impact of Consumer Reviews on Book Sales (contd.) • All estimates are significant and have the expected sign. • With other demand-related factors controlled for, the time-variant component RT has a significant impact on book sales when consumers compare different books within the same time period.

  12. Theory Model and Implications • An individual consumer’s preferences over the product can be characterized by two components (x_i, q_i). • The element x_i is known by each consumer before purchasing and represents the consumer’s preference over product characteristics that can be inspected before purchase. • The element q_i measures the quality of the product for consumer i. • q^e: expected quality
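The slide implies, but does not write out, a purchase rule; a minimal sketch, in which the additive utility form, the price p, and the threshold rule are assumptions:

```latex
U_i = x_i + q_i, \qquad \text{consumer } i \text{ buys iff } \; x_i + q^{e} \ge p
```

Because q_i is unobserved before purchase, the decision rests on q^e, which consumers infer from posted reviews, so any bias in reviews feeds directly into purchase decisions.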

  13. Conclusion • Our findings highlight the significance of product design and early-period product promotion.

  14. Self-Selection Bias in Reputation Systems Mark Kramer MITRE Corporation IFIPTM ’07

  15. Outline • Introduction • Expectation and Self-Selection • Avoiding Bias in Reputation Management Systems • Conclusion

  16. Motivation • Can a reputation system based on user ratings accurately signal the quality of a resource?

  17. Ratings Bias • Reputation systems appear to be inherently biased towards better-than-average ratings • Amazon: average rating 3.9 out of 5 stars • Netflix Prize data set: average rating 3.6 out of 5 stars

  18. Ratings Bias (contd.) 87% of ratings are 3 or higher

  19. Possible Reasons for Positive Bias • People don’t like to be critical • People don’t understand the rating system or cannot calibrate themselves • Lake Wobegon effect: most movies are “better than average” • The number of ratings for quality movies far exceeds the number of ratings for poor movies

  20. The SpongeBob Effect • Oscar winners 2000-2005: average rating 3.7 stars • SpongeBob DVDs: average rating 4.1 stars • If the SpongeBob effect is common, then ratings do not accurately signal the quality of the resource

  21. What is Happening Here? • People choose movies they think they will like, and often they are right • Ratings only tell us that “fans of SpongeBob like SpongeBob” • Self-selection • Oscar winners draw a wider audience • Their ratings are much more representative of the general population

  22. What is Happening Here? (contd.) • There might be a tendency to downplay the problem of biased ratings • you already "know" whether or not you would like the SpongeBob movie • you could look at written reviews • one could get personalized guidance from a recommendation engine

  23. Importance of Self-Selection Bias • BizRate: 44% of consumers consult opinion sites before making online purchases • High ratings are the norm and contain little information • Written reviews can also be biased • Discarding numerical (star) ratings would eliminate an important time-saver • Consumers have no idea what “discount” to apply to ratings to get a true idea of quality • No recommendation engine will ever totally replace browsing as a method of resource selection

  24. Model of Self-Selection Bias • Two groups: • Evaluation group E • Feedback group F, where F ⊆ E • Consider the binary situation: • E = Expect to be satisfied (T/F) • S = Are satisfied • R = Resource selected (and reviewed) • P(S) = probability of satisfaction within the evaluation group E • P(S|R) = probability of satisfaction within the feedback group F • If P(R|E) > P(R|~E) (self-selection) and P(S|E) > P(S|~E) (realization of expectations), then P(S|R) > P(S) (biased rating)
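The implication can be checked with a small Monte Carlo sketch; every probability below is a hypothetical value chosen only to satisfy the two premises, not a number from the talk:

```python
import random

random.seed(42)

N = 100_000
satisfied_all = satisfied_reviewers = n_reviewers = 0

for _ in range(N):
    expects = random.random() < 0.5                          # E: expects to be satisfied
    selects = random.random() < (0.8 if expects else 0.1)    # P(R|E) > P(R|~E): self-selection
    satisfied = random.random() < (0.9 if expects else 0.3)  # P(S|E) > P(S|~E): realized expectations
    satisfied_all += satisfied
    if selects:
        n_reviewers += 1
        satisfied_reviewers += satisfied

print(f"P(S)   = {satisfied_all / N:.3f}")                   # satisfaction in evaluation group E
print(f"P(S|R) = {satisfied_reviewers / n_reviewers:.3f}")   # satisfaction in feedback group F
```

With these values the analytic answers are P(S) = 0.60 and P(S|R) ≈ 0.83, so the feedback group over-reports satisfaction exactly as the inequality chain predicts.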

  25. Expected Utility and Self-Selection • Some distribution of expected utility in the evaluation group E • The resource will be selected only if expected utility is positive • Very high reviews can shift the expected utility curve to the right and increase the number of people selecting the resource • The “swing” group has a greater chance of disappointment [Figure: distribution of expected utility across the evaluation group; x-axis: expected utility, y-axis: # people]

  26. Effect of Biased Rating: Example • 10 people see SpongeBob’s 4-star rating • 3 are already SpongeBob fans, rent the movie, and award 5 stars • 6 already know they don’t like SpongeBob and do not see the movie • The last person doesn’t know SpongeBob, is impressed by the high rating, rents the movie, and rates it 1 star Result: • The average rating remains unchanged: (5+5+5+1)/4 = 4 stars • 9 of the 10 consumers did not really need the rating system • The only consumer who actually used the rating system was misled

  27. Paradox of Subjective Reputation “Accurate ratings render ratings inaccurate” • The purpose of reputation systems is to increase consumer satisfaction • Do better than random selection • The mechanism is self-selection • If self-selection works, ratings will become positively biased • In the limit, all ratings will be 5-star ratings • Self-selection bias (the SpongeBob effect) distorts the information needed for accurate self-selection • The rating system defeats itself

  28. Dynamics of Ratings Paradox • Accurate, complete prior information → good self-selection → happy consumers → positively biased ratings • Inaccurate or biased prior information → poor self-selection → mix of happy and unhappy consumers → unbiased ratings • Each outcome feeds the next round’s prior information, so the system cycles between the two states

  29. Example of Reputation Dynamics • A resource with uniformly distributed satisfaction between 0 and 100 • Successive groups decide whether to use the resource, based on its rating • The number selecting the resource is proportional to the average rating

  30. Example of Reputation Dynamics (contd.) [Figure: simulated average rating over time; panels: “Fans first” and “Random people first”]
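Slide 29’s setup can be simulated directly; the group size, number of rounds, neutral starting rating, and the rule that the most enthusiastic members of each group are the ones who self-select are all assumptions filling in details the slides leave open:

```python
import random

random.seed(7)

def simulate(arrival_order, rounds=10, group_size=500):
    """Successive groups decide based on the running average rating (0-100);
    the share of each group that selects is proportional to that average,
    and within a group the highest-satisfaction members self-select."""
    ratings, history = [], []
    avg = 50.0  # neutral rating before any reviews exist
    for r in range(rounds):
        group = arrival_order[r * group_size:(r + 1) * group_size]
        n_select = int(group_size * avg / 100)              # proportional to rating
        selectors = sorted(group, reverse=True)[:n_select]  # self-selection
        ratings.extend(selectors)                           # selectors post ratings
        avg = sum(ratings) / len(ratings)
        history.append(round(avg, 1))
    return history

population = [random.uniform(0, 100) for _ in range(10_000)]
fans_first = sorted(population, reverse=True)  # biggest fans arrive first
random_first = population                      # arrival order is random

print("fans first:  ", simulate(fans_first))
print("random first:", simulate(random_first))
```

Under the fans-first order the average rating starts near the top of the scale and drifts down as less enthusiastic adopters arrive; under the random order it settles lower from the start, which appears to be the contrast the two panels were drawn to show.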

  31. Ideas for Bias-Resistant Reputation Systems • Use more demographics • Kids like SpongeBob, most adults do not • Self-selection is still at work within each demographic subgroup • Demographics might not create useful groups with different preferences • Make personalized recommendations • Yes, but people still like to browse • Recommendations based on biased ratings might fail • The Netflix recommendation engine has large error • Use written reviews • Self-selection bias is still present

  32. Bias-Resistant Reputation System • We want P(S) but we collect data on P(S|R) • S = Are satisfied with resource • R = Resource selected (and reviewed) • However, P(S|E,R) ≈ P(S|E) • The likelihood of satisfaction depends primarily on the expectation of satisfaction, not on the selection decision • If we can collect the prior expectation, the gap between the evaluation group and the feedback group disappears • whether you select the resource or not doesn’t matter

  33. Bias-Resistant Reputation System • Before viewing: “I think I will:” • Love this movie • Like this movie • It will be just OK • Somewhat dislike this movie • Hate this movie • After viewing: “I liked this movie:” • Much more than expected • More than expected • About the same as I expected • Less than I expected • Much less than I expected • Prior expectations segment reviewers into big fans, everyone else, and skeptics
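One way to turn these two questions into a bias-resistant score is to report, for each expectation segment, how the resource performed relative to expectations; a minimal sketch, in which the numeric encodings and segment cutoffs are assumptions rather than anything specified in the talk:

```python
from collections import defaultdict
from statistics import mean

# Numeric encodings of the two questions (assumed; the slide gives only the answer choices).
BEFORE = {"hate": 1, "somewhat dislike": 2, "just ok": 3, "like": 4, "love": 5}
AFTER = {"much less": -2, "less": -1, "about the same": 0, "more": 1, "much more": 2}

def segment(prior: int) -> str:
    """Bucket reviewers by prior expectation, as on the slide."""
    if prior <= 2:
        return "skeptics"
    if prior == 5:
        return "big fans"
    return "everyone else"

def summarize(reviews):
    """reviews: iterable of (before_answer, after_answer) pairs.
    Returns, per expectation segment, the mean performance relative to
    expectations -- an estimate of P(S|E) rather than the biased P(S|R)."""
    by_segment = defaultdict(list)
    for before, after in reviews:
        by_segment[segment(BEFORE[before])].append(AFTER[after])
    return {seg: mean(vals) for seg, vals in by_segment.items()}

# Example: two fans got roughly what they expected; one skeptic was pleasantly surprised.
print(summarize([("love", "about the same"),
                 ("love", "more"),
                 ("somewhat dislike", "much more")]))
# -> {'big fans': 0.5, 'skeptics': 2.0}
```

Because each score is anchored to the reviewer’s own prior, fans of SpongeBob rating SpongeBob no longer inflate the headline number; a high score now means the resource beat expectations even among skeptics.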

  34. Conclusions • Self-selection bias exists in most cases of consumer choice • Bias means that user ratings do not reflect the distribution of satisfaction in the evaluation group • Consumers have no idea what “discount” to apply to ratings to get a true idea of quality • Many current rating systems may be self-defeating • Accurate ratings promote self-selection, which leads to inaccurate ratings • Collecting prior expectations may help address this problem
