Vote Calibration in Community Question-Answering Systems
Bee-Chung Chen (LinkedIn), Anirban Dasgupta (Yahoo! Labs), Xuanhui Wang (Facebook), Jie Yang (Google)
SIGIR 2012
This work was conducted when all authors were affiliated with Yahoo!.
Why I Present This Paper
• Vote bias exists on many social media platforms
• This paper tackles a problem in a relatively old context, CQA, from a new perspective: crowdsourced identification of quality content
Outline
• Motivation
• Related Work
• Data Set
• Vote Calibration Model
• Exploratory Analysis
• Features
• Experimental Results
• Conclusion
Community Question Answering
A crowdsourced alternative to search engines for providing information
Community Question Answering
• Commercial spam: can mostly be tackled by conventional machine learning
• Low-quality content: difficult for machines to detect!
• Crowdsourcing quality-content identification
Voting Mechanism
• Content quality
• User expertise
Votes in Yahoo! Answers
• The asker votes for the best answer
• If the asker does not vote for a best answer within a certain period, other users in the community vote
• Thumb-up or thumb-down votes on each individual answer
• However… are users' votes always unbiased?
Potential Bias
• Voting more positively for friends' answers
• Using votes to show appreciation instead of to identify high-quality content
• Gaming the system to obtain high status: multiple accounts voting for one another
• On opinion questions, voting for answers that share the same opinion
• …
Potential Bias
• Trained human editors judged answers based on a set of well-defined guidelines
• Raw user votes have low correlation with the editorial judgments
Motivation
• Propose the problem of vote calibration in CQA systems
• Based on exploratory data analysis, identify a variety of potential factors that bias the votes
• Develop a supervised, content-agnostic model for vote calibration
Related Work
• Predicting the user-voted best answer
  • Assumption: readily available user-voted best answers are ground truth
• Predicting editorial judgments
  • User votes are used as features, but calibration of each individual vote has not been studied
• Content-agnostic user expertise estimation
Dataset
• Editorial data
  • Sampled questions and answers from Yahoo! Answers
  • Each answer is given a quality grade according to a pre-determined set of editorial guidelines: excellent, good, fair, bad
  • 21,525 editorially judged answers on 7,372 questions
Dataset
• The distribution of editorial grades for best answers is not very different from that for non-best answers: low correlation between users' best-answer votes and answer quality
  • A significant percentage (>70%) of best answers are not even good
  • Many non-best answers are actually good or excellent
Dataset
• Numeric quality scores: excellent = 1, good = 0.5, fair = 0, bad = −0.5
• Voting data: 1.3M questions, 7.0M answers, 0.5M asker best-answer votes, 2.1M community best-answer votes, 9.1M thumb-up/down votes
Vote Calibration Model
• Three types of votes
  • Asker votes: best-answer votes by the asker
    • +1 for the best answer
    • −1 for other answers
  • CBA votes: community best-answer votes
    • +1 from a voter for the answer they vote best
    • −1 from that voter for other answers
  • Thumb votes: thumb-up and thumb-down
    • +1 for thumb-up
    • −1 for thumb-down
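The signed encoding above can be sketched as a small tally over raw vote records. A minimal sketch, assuming a hypothetical (answer_id, vote_type, positive) record layout that is not from the paper:

```python
from collections import defaultdict

def signed(positive):
    """+1 for a positive signal (chosen as best answer / thumb-up),
    -1 for a negative one (not chosen / thumb-down)."""
    return 1 if positive else -1

def tally(votes):
    """Sum signed vote values per (answer, vote type) pair.

    `votes` is an iterable of (answer_id, vote_type, positive) records,
    with vote_type one of "asker", "cba", "thumb" (assumed layout)."""
    totals = defaultdict(int)
    for answer_id, vote_type, positive in votes:
        totals[(answer_id, vote_type)] += signed(positive)
    return dict(totals)
```

Keeping the three vote types separate, rather than mixing them into one sum, is what lets the model weight each type differently later.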
Average Vote of an Answer
• The average type-t vote value of an answer is smoothed with pseudo votes that act as a prior
• [equation not preserved in the slide text]
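The smoothing named on this slide, pseudo votes acting as a prior on the per-type average, can be sketched as follows; the parameter names and exact form are assumptions, not the paper's equation:

```python
def smoothed_avg(vote_values, prior_mean=0.0, pseudo_count=1.0):
    """Average of an answer's calibrated type-t vote values, pulled toward
    prior_mean by pseudo_count pseudo votes (assumed form of the smoothing)."""
    return (prior_mean * pseudo_count + sum(vote_values)) / (
        pseudo_count + len(vote_values)
    )
```

With no votes the estimate falls back to the prior; as real votes accumulate, they dominate the pseudo votes.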
Quality Prediction Function
• Calibrated vote aggregation model with a bias term, an answer-level component, and a user-level component [equation not preserved in the slide text]
• Quality prediction: a weighted sum of the answer-level and user-level average vote values of all vote types on an answer
Training Algorithm
• Determine the model parameters by minimizing a loss function over the editorial quality scores [equation not preserved in the slide text]
• Use gradient descent to find the parameters
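A generic version of such a training loop, assuming a plain squared-error loss between a weighted-sum prediction and the editorial scores (excellent = 1 … bad = −0.5); the paper's full model also calibrates individual votes, which this sketch omits:

```python
def predict(weights, bias, features):
    """Quality prediction: bias plus a weighted sum of per-type
    average-vote features for one answer."""
    return bias + sum(w * x for w, x in zip(weights, features))

def train(data, n_features, lr=0.05, epochs=200):
    """Fit weights and bias by stochastic gradient descent on a
    squared-error loss (a generic sketch, not the paper's exact loss).

    `data` is a list of (feature_vector, editorial_score) pairs."""
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        for features, target in data:
            err = predict(weights, bias, features) - target
            bias -= lr * err                      # gradient step on the bias
            for i, x in enumerate(features):
                weights[i] -= lr * err * x        # gradient step on each weight
    return weights, bias
```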
Self Voting
• Self votes account for 33% of all CBA votes
• For users who cast at least 20 votes, the percentage of self votes exceeds 40%
Interaction Bias
• A chi-squared statistic and a randomization test show that past interactions could be useful features for vote calibration
Features
• Voter features
• Relation features
Feature Transformation
• For each count feature C, consider log(1 + C) as an additional feature
• For each ratio feature R, include a quadratic term R²
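The two transformations can be sketched directly (the feature ordering here is illustrative):

```python
import math

def expand_features(counts, ratios):
    """Augment each count feature C with log(1 + C) and each ratio
    feature R with a quadratic term R**2."""
    feats = []
    for c in counts:
        feats += [c, math.log1p(c)]   # log(1 + C) tames heavy-tailed counts
    for r in ratios:
        feats += [r, r * r]           # R**2 lets a linear model bend on ratios
    return feats
```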
Experimental Results
• User-level expert ranking: how well we rank users based on the predicted user-level scores
• Answer ranking: how well we rank answers based on the predicted answer-level scores
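The slide does not name the ranking metrics; one standard way to score a predicted ranking against editorial scores is Kendall's tau, sketched here for illustration (not necessarily the paper's metric):

```python
def kendall_tau(predicted, editorial):
    """Kendall rank correlation between predicted and editorial scores
    over the same items: (#concordant - #discordant) / #pairs."""
    n = len(predicted)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            # A pair is concordant if both score lists order it the same way.
            sign = (predicted[i] - predicted[j]) * (editorial[i] - editorial[j])
            if sign > 0:
                concordant += 1
            elif sign < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)
```

Identical orderings give 1.0, fully reversed orderings give −1.0, and ties shrink the magnitude toward 0.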
Conclusion
• Introduced the vote calibration problem for CQA
• Proposed a set of features that capture bias, based on an analysis of users' voting behavior
• Supervised calibrated models outperform their non-calibrated counterparts
Thanks • Q & A