
Usefulness of Quality Click-through Data for Training






Presentation Transcript


  1. Usefulness of Quality Click-through Data for Training Craig Macdonald, Iadh Ounis Department of Computing Science University of Glasgow, Scotland, UK {craigm,ounis}@dcs.gla.ac.uk WSCD 2009

  2. Outline • Abstract • Introduction • Select training query • Rank strategy • Experiments • Conclusions & Future work

  3. Abstract • Modern IR systems often employ document weighting models with many parameters • Obtaining good settings for these parameters requires training data • This work uses click-through data for training • The resulting settings are compared with those obtained from human relevance judgements

  4. Introduction • Parameters affect the selection and ordering of results • Much research in recent years has developed new methods for training models with many parameters • For instance, by attempting to directly optimise rank-based evaluation measures • The wider Learning to Rank field combines machine learning and information retrieval

  5. Introduction • Traditionally, training finds a setting of the parameters using a set of training queries and their corresponding relevance judgements • Test set: a set of unseen queries • Deriving relevance judgements is expensive

  6. Introduction • Examine how quality click-through data can be used for training • Using the data in aggregate form means that no individual user is treated as absolutely correct • Perform an analysis of the usefulness of sampling training data from a large query log • Three different sampling strategies are investigated, and results drawn across three user search tasks

  7. Select training query • Data set: MSN Search Asset Data collection • Query log of 15 million queries with click-through documents • 7 million unique queries • Users clicked on documents which are in the GOV Web test collection • 25,375 remain after restricting to the GOV Web test collection
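To make the preparation step concrete, here is a minimal sketch of turning the raw query log into aggregated query-document click counts restricted to the GOV collection. It assumes the log can be read as (query, clicked document) events; the names `click_events` and `gov_docids` are illustrative, not from the paper.

```python
from collections import defaultdict

def aggregate_clicks(click_events, gov_docids):
    """click_events: iterable of (query, clicked_docid) pairs from the log.
       gov_docids: set of document ids present in the GOV test collection."""
    counts = defaultdict(int)
    for query, docid in click_events:
        if docid in gov_docids:          # keep only clicks on GOV documents
            counts[(query, docid)] += 1  # aggregate over all users and sessions
    return counts                        # pseudo relevance labels for training
```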

  8. Select training query • Classified Web search queries into three categories: • Navigational queries: 15-25% • Informational queries: 60% • Transactional queries: 25-35% • The most frequent queries are often navigational

  9. Select training query • Head-First • Rank the most commonly clicked query-document pairs and select the top 1000 pairs • Unbiased Random • Select 1000 random queries from the unique query list, providing a random sample of both frequent and infrequent queries • Biased Random • Select 1000 random queries from the query list (with repetitions); the queries in this sample are more likely to be frequent
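A minimal sketch of the three sampling strategies above, under the same assumption that the log is a list of (query, clicked document) pairs; the function names are illustrative, not the authors' code.

```python
import random
from collections import Counter

def head_first(click_events, k=1000):
    # top-k most frequently clicked (query, document) pairs
    return [pair for pair, _ in Counter(click_events).most_common(k)]

def unbiased_random(click_events, k=1000):
    # sample from the unique query list: frequent and infrequent queries equally likely
    unique_queries = list({q for q, _ in click_events})
    return random.sample(unique_queries, min(k, len(unique_queries)))

def biased_random(click_events, k=1000):
    # sample query occurrences from the raw log (with repetitions):
    # frequent queries are more likely to be drawn
    return [random.choice(click_events)[0] for _ in range(k)]
```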

  10. Select training query • TREC Web tracks investigated user retrieval tasks in the Web setting • Home page finding task • Named page finding task • Topic distillation task • For the TREC Web track tasks, each task forms a test collection comprising a shared corpus of Web documents (the GOV corpus in this case), a set of queries, and corresponding binary relevance judgements made by human assessors • Relevance assessment is expensive • Training data can instead be derived automatically from click-through logs

  11. Rank strategy • Use textual features from the documents • PL2F field-based weighting model • c_f is a hyper-parameter for each field controlling the term frequency normalisation • w_f controls the contribution of the field
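For reference, here is a sketch of how a PL2F term score could be computed, following the usual DFR formulation (Normalisation 2F applied per field, then the PL2 formula). The paper only names the model, so the exact form below is my reading of the literature rather than the authors' implementation.

```python
import math

def pl2f_term_score(tf_per_field, field_len, avg_field_len, w, c, F_t, N):
    """tf_per_field, field_len, avg_field_len, w, c: dicts keyed by field name
       (e.g. 'body', 'anchor', 'title'); F_t: collection frequency of the term;
       N: number of documents in the collection."""
    # Normalisation 2F: weighted, length-normalised term frequency over fields
    tfn = sum(
        w[f] * tf_per_field[f] * math.log2(1.0 + c[f] * avg_field_len[f] / field_len[f])
        for f in tf_per_field if field_len[f] > 0
    )
    if tfn <= 0:
        return 0.0
    lam = F_t / N  # mean term frequency in the collection (Poisson parameter)
    # PL2 information content, divided by (tfn + 1) as a risk adjustment
    return (1.0 / (tfn + 1.0)) * (
        tfn * math.log2(tfn / lam)
        + (lam - tfn) * math.log2(math.e)
        + 0.5 * math.log2(2.0 * math.pi * tfn)
    )

# A document's score for a query is the sum of this term score over the query
# terms, each weighted by its query term frequency.
```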

  12. Training PL2F • 6 parameters: w_body, w_anchor, w_title, c_body, c_anchor, c_title • Train the parameters using simulated annealing to directly optimise a given evaluation measure on a training set of queries • Simulated annealing over all parameters at once would be very time consuming • The independence of the c_f parameters is exploited to perform concurrent optimisations
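A minimal simulated-annealing sketch for tuning the six parameters by directly maximising an evaluation measure such as MAP. The `evaluate` callback, the cooling schedule, and the perturbation step are illustrative assumptions, not details from the paper; because each c_f only affects its own field's normalisation, the per-field c parameters could also be tuned in separate, concurrent runs, as the slide notes.

```python
import math
import random

def simulated_annealing(evaluate, training_queries, init, steps=500, temp0=1.0):
    """evaluate(params, queries) -> evaluation measure (higher is better);
       init: dict such as {'w_body': 1.0, ..., 'c_title': 1.0}."""
    current = dict(init)
    best = dict(init)
    current_score = best_score = evaluate(current, training_queries)
    for i in range(steps):
        temp = temp0 * (1.0 - i / steps)        # linear cooling schedule
        candidate = dict(current)
        key = random.choice(list(candidate))    # perturb one parameter at a time
        candidate[key] = max(1e-3, candidate[key] * random.uniform(0.8, 1.25))
        score = evaluate(candidate, training_queries)
        # always accept improvements; accept worse moves with a
        # temperature-dependent probability to escape local optima
        if score > current_score or random.random() < math.exp(
            (score - current_score) / max(temp, 1e-6)
        ):
            current, current_score = candidate, score
            if score > best_score:
                best, best_score = candidate, score
    return best, best_score
```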

  13. Experiments • Compare the parameter settings obtained from click-through training with those obtained from real human relevance judgements • Baseline: trained using a mixed set of TREC Web task queries, with human relevance judgements
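The paper does not spell out which significance test underlies the comparisons on the next slide, so the paired t-test below is only an assumption (a Wilcoxon signed-rank test would be an equally plausible choice); it simply contrasts per-query scores under the two parameter settings.

```python
from scipy import stats

def compare_settings(per_query_clickthrough, per_query_trec):
    """Both arguments: per-query scores for the same queries under the
       click-through-trained and TREC-trained parameter settings."""
    t_stat, p_value = stats.ttest_rel(per_query_clickthrough, per_query_trec)
    return t_stat, p_value  # p_value < 0.05 => significant difference
```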

  14. Experiments • In 8 cases, retrieval performance drops when using the click-through training compared to training on the TREC mixed query tasks • In 5 cases, it is significantly better • In 23 cases, there are no significant differences

  15. Experiments • The random samples are, in general, more effective than the head-first sample • High performance on the home page finding and named page finding tasks • MAP appears to be marginally better when training using click-through data • This is due to the high number of queries which have only one clicked document in the training set

  16. Conclusions • Our results show that training on click-through data is usually as good as training on a bona fide relevance-assessed TREC dataset, and occasionally significantly better

  17. Future work • Could be expanded to train many more features: • Document features • Link analysis • URL length • Directly integrating learned features into the ranking strategy
