Fine-tuning Ranking Models: a two-step optimization approach. Vitor. Text Learning Meeting, CMU. Jan 29, 2008. With invaluable ideas from ….
Motivation • Rank, Rank, Rank… • Web retrieval, movie recommendation, NFL draft, etc. • Einat’s contextual search • Richard’s set expansion (SEAL) • Andy’s context-sensitive spelling correction algorithm • Selecting seeds in Frank’s political blog classification algorithm • Ramnath’s Thunderbird extension for • Email Leak prediction • Email Recipient suggestion
Help your brothers! • Try Cut Once!, our Thunderbird extension • It is working reasonably well, including with Gmail accounts • We need feedback.
Email Recipient Recommendation: results over 36 Enron users
Email Recipient Recommendation (threaded) [Carvalho & Cohen, ECIR-08]
Aggregating Rankings [Aslam & Montague, 2001]; [Ogilvie & Callan, 2003]; [Macdonald & Ounis, 2006] • Many “Data Fusion” methods • 2 types: • Normalized scores: CombSUM, CombMNZ, etc. • Unnormalized scores: BordaCount, Reciprocal Rank Sum, etc. • Reciprocal Rank Sum: score a document by summing, over all input rankings, the inverse of its rank in each ranking.
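As a concrete illustration, Reciprocal Rank Sum fusion fits in a few lines of Python (a minimal sketch; the function name `reciprocal_rank_fusion` is mine, not from the papers cited above):

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings):
    """Aggregate several rankings by summing 1/rank of each item.

    `rankings` is a list of ranked lists (best first); an item absent
    from a ranking simply contributes nothing for that ranking.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] += 1.0 / rank
    # Sort items by aggregated score, best first
    return sorted(scores, key=scores.get, reverse=True)

# "b" wins: ranked 2nd by one ranker and 1st by the other
# (score 1/2 + 1/1 = 1.5, vs. 1/1 + 1/3 for "a")
fused = reciprocal_rank_fusion([["a", "b", "c"], ["b", "c", "a"]])
# fused == ["b", "a", "c"]
```

Because only ranks are used, no score normalization across the input rankers is needed, which is why this method sits in the "unnormalized" family above.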
Aggregated Ranking Results [Carvalho & Cohen, ECIR-08]
Intelligent Email Auto-completion: results for the TO+CC+BCC and CC+BCC tasks
Can we do better? • Not by adding other features, but by using better ranking methods • Machine learning to improve ranking (“learning to rank”) • Many (recent) methods: ListNet, Perceptrons, RankSVM, RankBoost, AdaRank, Genetic Programming, Ordinal Regression, etc. • Mostly supervised • Generally small training sets • Learning-to-rank workshop at SIGIR-07 (Einat was on the PC)
Pairwise-based Ranking • Goal: induce a ranking function f(d) such that, for a query q with documents d1, d2, d3, …, dT ranked in that order, f(d1) > f(d2) > … > f(dT) • We assume a linear function f(d) = w·d • Therefore, the constraints are: w·di > w·dj, i.e., w·(di − dj) > 0, for every pair where di is ranked above dj
Ranking with Perceptrons • Nice convergence properties and mistake bounds • bound on the number of mistakes/misranks • Fast and scalable • Many variants [Collins, 2002; Gao et al., 2005; Elsas et al., 2008] • Voting, averaging, committee, pocket, etc. • General update rule: on a misranked pair (di should rank above dj but w·di ≤ w·dj), update w ← w + (di − dj) • Here: averaged version of the perceptron
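The averaged-perceptron update described above can be sketched as follows (a minimal illustration under the pairwise setup of the previous slide; function and variable names are mine):

```python
import numpy as np

def averaged_rank_perceptron(pairs, n_features, epochs=10):
    """Averaged perceptron on pairwise preferences.

    `pairs` is a list of (d_plus, d_minus) feature-vector pairs where
    d_plus should be ranked above d_minus. On a misrank
    (w . (d_plus - d_minus) <= 0) apply the update w += d_plus - d_minus.
    The returned model is w averaged over all steps, which is
    typically more stable than the final w.
    """
    w = np.zeros(n_features)
    w_sum = np.zeros(n_features)
    steps = 0
    for _ in range(epochs):
        for d_plus, d_minus in pairs:
            diff = d_plus - d_minus
            if w.dot(diff) <= 0:   # misranked pair: perceptron update
                w = w + diff
            w_sum += w             # accumulate for averaging
            steps += 1
    return w_sum / steps
```

The ranking function is then simply f(d) = w·d with the averaged w.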
Rank SVM [Joachims, KDD-02], [Herbrich et al., 2000] • Equivalent to maximizing AUC • Equivalent to: minimize ½‖w‖² + C Σ ξij subject to w·(di − dj) ≥ 1 − ξij and ξij ≥ 0, for every pair where di should rank above dj
Loss Functions • SVMrank: pairwise hinge loss max(0, 1 − w·(di − dj)); convex • SigmoidRank: sigmoid loss 1 / (1 + exp(s·w·(di − dj))), a smooth approximation to the misrank indicator; not convex
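The two pairwise losses can be written down directly. Here is a minimal sketch (function names are mine; `margin` stands for w·(di − dj), and `s` is the sigmoid sharpness parameter, set to 2 in some of the experiments later in the talk):

```python
import math

def hinge_pair_loss(margin):
    """RankSVM's pairwise hinge loss: a convex upper bound on the
    0/1 misrank indicator."""
    return max(0.0, 1.0 - margin)

def sigmoid_pair_loss(margin, s=2.0):
    """SigmoidRank's pairwise loss: a smooth but non-convex
    approximation to the 0/1 misrank indicator. Larger `s` makes the
    sigmoid a sharper step around margin = 0."""
    return 1.0 / (1.0 + math.exp(s * margin))
```

Note the trade-off: the hinge loss keeps the problem convex but penalizes correctly ranked pairs with small margins, while the sigmoid loss tracks the misrank count closely (it is near 1 for badly misranked pairs and near 0 for well-ranked ones) at the cost of convexity.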
Fine-tuning Ranking Models • Step 1: learn a base ranking model with a base ranker, e.g., RankSVM, Perceptron, etc. • Step 2: obtain the final model by fine-tuning the base model with SigmoidRank • Non-convex: minimizing a very close approximation to the number of misranks
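The fine-tuning step can be sketched as plain gradient descent on the sigmoid loss, initialized from the base ranker's weights (a simplified illustration of the two-step idea, not the authors' exact implementation; all names are mine):

```python
import numpy as np

def finetune_sigmoid_rank(w_base, pairs, s=2.0, lr=0.1, epochs=50):
    """Step 2 of the two-step approach (sketch): starting from a base
    ranker's weights `w_base`, run gradient descent on the SigmoidRank
    loss sum over pairs of 1 / (1 + exp(s * w.(d_plus - d_minus))),
    a smooth approximation to the number of misranked pairs.

    The loss is non-convex, so the base model serves as a (hopefully
    good) starting point rather than starting from scratch.
    """
    w = np.array(w_base, dtype=float)
    for _ in range(epochs):
        grad = np.zeros_like(w)
        for d_plus, d_minus in pairs:
            diff = d_plus - d_minus
            sig = 1.0 / (1.0 + np.exp(s * w.dot(diff)))  # per-pair loss
            # gradient of the per-pair loss w.r.t. w:
            # d/dw [sigma(-s * w.diff)] = -s * sig * (1 - sig) * diff
            grad += -s * sig * (1.0 - sig) * diff
        w -= lr * grad
    return w
```

Because each gradient term vanishes for pairs that are already ranked with a large margin, fine-tuning mostly adjusts the model around pairs near the decision boundary, which is the intuition behind refining a reasonable base model rather than optimizing the sigmoid loss from a random start.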
Results in CC prediction (36 Enron users)
Set Expansion (SEAL) Results [Wang & Cohen, ICDM-2007] [ListNet: Cao et al., ICML-07]
Learning Curve, TO+CC+BCC task (Enron, user lokay-m)
Learning Curve, CC+BCC task (Enron, user campbel-m)
Regularization Parameter (s = 2) on TREC3, TREC4, and Ohsumed
Some Ideas • Instead of number of misranks, optimize other loss functions: • Mean Average Precision, MRR, etc. • Rank Term: • Some preliminary results with Sigmoid-MAP • Does it work for classification?