Learning to Rank for Information Retrieval
Presentation Transcript
Learning to Rank for Information Retrieval Liang Du Supervised by Prof. Yi-Dong Shen
Outline • Motivation • Learning to Rank • Related work
Motivation • We are drowning in information, but starving for knowledge. Information is nothing without retrieval • Search engines are widely used tools • The key component of a search engine is its ranking model
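IR Evaluation • Various measures are used • MAP (Mean Average Precision) • NDCG (Normalized Discounted Cumulative Gain) • WTA (Winner Takes All) • MRR (Mean Reciprocal Rank)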
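The measures listed above can be sketched in a few lines of Python. This is a minimal illustration, not the presenter's code: the function names are hypothetical, average precision is normalized by the number of relevant documents that appear in the ranked list (which assumes every relevant document was retrieved), and NDCG uses the common exponential gain 2^g - 1 with a log2 position discount. MAP and MRR are the means of the per-query values over a set of queries.

```python
import math

def average_precision(rels):
    """rels: binary relevance labels in ranked order (1 = relevant)."""
    hits, total = 0, 0.0
    for position, rel in enumerate(rels, start=1):
        if rel:
            hits += 1
            total += hits / position  # precision at this relevant position
    # Assumption: all relevant documents are present in the ranked list.
    return total / hits if hits else 0.0

def reciprocal_rank(rels):
    """1 / position of the first relevant document, or 0 if there is none."""
    for position, rel in enumerate(rels, start=1):
        if rel:
            return 1.0 / position
    return 0.0

def winner_takes_all(rels):
    """1 if the top-ranked document is relevant, else 0."""
    return 1.0 if rels and rels[0] else 0.0

def dcg(gains, k=None):
    """Discounted cumulative gain over graded relevance labels."""
    gains = gains[:k] if k is not None else gains
    return sum((2 ** g - 1) / math.log2(i + 1)
               for i, g in enumerate(gains, start=1))

def ndcg(gains, k=None):
    """DCG divided by the DCG of the ideal (sorted) ordering."""
    ideal = dcg(sorted(gains, reverse=True), k)
    return dcg(gains, k) / ideal if ideal > 0 else 0.0

def mean_over_queries(metric, per_query_labels):
    """MAP = mean_over_queries(average_precision, ...); MRR likewise."""
    values = [metric(labels) for labels in per_query_labels]
    return sum(values) / len(values) if values else 0.0
```

For example, `mean_over_queries(average_precision, [[1, 0, 1], [0, 1]])` averages the per-query average precision of two ranked lists.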
Ranking Model • Conventional Ranking Models • Similarity-based models (like the vector space model) • Probabilistic models (like the language model) • Hyperlink-based models (like PageRank)
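As one concrete instance of a similarity-based model, the sketch below ranks documents by the cosine similarity between raw term-frequency vectors. It is a simplified illustration only: the function names are hypothetical, tokenization is plain whitespace splitting, and no IDF weighting or length normalization beyond the cosine itself is applied.

```python
import math
from collections import Counter

def cosine_similarity(query, document):
    """Cosine between raw term-frequency vectors of two texts."""
    q = Counter(query.lower().split())
    d = Counter(document.lower().split())
    dot = sum(q[term] * d[term] for term in q)  # missing terms count as 0
    norm = (math.sqrt(sum(v * v for v in q.values()))
            * math.sqrt(sum(v * v for v in d.values())))
    return dot / norm if norm else 0.0

def rank(query, documents):
    """Return documents sorted by decreasing similarity to the query."""
    return sorted(documents,
                  key=lambda doc: cosine_similarity(query, doc),
                  reverse=True)
```

Note that this model has no parameters to learn; practical variants add weighting schemes whose parameters must be tuned by hand.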
Discussions on Conventional Ranking Models • For a particular model – Parameter tuning is usually difficult, especially when there are many parameters to tune. • For comparison between two models – Given a test set, it is difficult to compare two models fairly when one is over-tuned (over-fitted) and the other is not. • For a collection of models – Hundreds of models have been proposed in the literature. – It is non-trivial to combine them effectively.
Learning to Rank • Machine learning is an effective tool for ranking • To automatically tune parameters • To combine multiple sources of evidence • To avoid over-fitting (regularization framework, structural risk minimization, etc.)
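To make these three points concrete, the sketch below learns the weights of a linear combination of evidence scores by gradient descent on an L2-regularized squared error. It is an illustrative assumption, not a method from the slides: the function names, the hyperparameter defaults, and the idea that each document is described by a small feature vector (for example a text-match score and a link-based score) are all hypothetical.

```python
def score(weights, features):
    """Linear ranking function: weighted sum of evidence scores."""
    return sum(w * x for w, x in zip(weights, features))

def train_linear_ranker(features, labels, l2=0.01, lr=0.1, epochs=500):
    """Fit weights by batch gradient descent.

    Objective: mean of 0.5 * (score - label)^2 plus 0.5 * l2 * ||w||^2.
    The l2 term is the regularizer that discourages over-fitting.
    """
    n = len(features)
    weights = [0.0] * len(features[0])
    for _ in range(epochs):
        grad = [l2 * w for w in weights]  # gradient of the penalty term
        for x, y in zip(features, labels):
            err = score(weights, x) - y
            for j, xj in enumerate(x):
                grad[j] += err * xj / n  # gradient of the data term
        weights = [w - lr * g for w, g in zip(weights, grad)]
    return weights
```

With toy data in which relevance follows the first feature, the learned weight on that feature should dominate, and a larger `l2` shrinks all weights toward zero.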
Categorization-1 • Criterion: relation to conventional machine learning • Learning to rank reduced to conventional machine learning • Ranking that exploits properties unique to IR
Categorization-2 • Criterion: basic unit of learning • Pointwise (Input: single documents) • Pairwise (Input: document pairs) • Listwise (Input: document collections)
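The difference between the three input units is easiest to see in the shape of the loss function. The sketch below gives one representative loss per category; the specific choices (squared error, a RankNet-style logistic loss on score differences, and a ListNet-style top-one cross entropy) are examples picked for illustration, not losses named in the slides, and the function names are hypothetical.

```python
import math

def pointwise_loss(score, label):
    """Input: one document. Squared error between score and label."""
    return (score - label) ** 2

def pairwise_loss(score_preferred, score_other):
    """Input: a document pair. Logistic loss on the score difference,
    small when the preferred document is scored higher."""
    return math.log(1.0 + math.exp(-(score_preferred - score_other)))

def _softmax(values):
    shift = max(values)  # subtract the max for numerical stability
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def listwise_loss(scores, labels):
    """Input: all documents for a query. Cross entropy between the
    softmax of the labels and the softmax of the scores."""
    target = _softmax(labels)
    predicted = _softmax(scores)
    return -sum(t * math.log(p) for t, p in zip(target, predicted))
```

A pointwise loss sees each document in isolation, a pairwise loss only sees relative order within a pair, and a listwise loss is computed over the whole result list for a query.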
Hot Area • More than 40 papers directly on learning to rank, and 100+ related papers, have been published in top conferences and journals in the past five years.
Related Areas • Machine learning (provides theoretical analysis and practical algorithms for ranking) • Data mining (supplies evidence and algorithms for ranking) • Information retrieval (provides the test bed and application background)