
On Understanding and Classifying Web Queries






Presentation Transcript


  1. On Understanding and Classifying Web Queries Prepared for: Telcordia Contact: Steve Beitzel Applied Research steve@research.telcordia.com April 14, 2008

  2. Overview • Introduction: Understanding Queries • Query Log Analysis • Automatic Query Classification • Conclusions

  3. Problem Statement • A query contains more information than just its terms • Search is not just about finding relevant documents – users have: • Target task (information, navigation, transaction) • Target topic (e.g., news, sports, entertainment) • General information need • User queries are simply an attempt to express all of the above in a couple of terms

  4. Popular Web Queries

  5. Problem Statement (2) • Current search systems focus mainly on the terms in the queries • Systems do not focus on extracting target task & topic information from user queries • We propose two techniques for improving understanding of queries • Large-Scale Query Log Analysis • Automatic Query Classification • This information can be used to improve general search effectiveness and efficiency

  6. Query Log Analysis • Introduction to Query Log Analysis • Our Approach • Key Findings • Conclusions

  7. Introduction • Web query logs are a source of information on users’ behaviors on the web • Analysis of logs’ contents may allow search services to better tailor their products to serve users’ needs • Existing query log analysis focuses on high-level, general measurements such as query length and frequency

  8. Our Approach • Examine several aspects of the query stream over time: • Total query volume • Topical trends by category: • Popularity (Topical Coverage of the Query Stream) • Stability (Pearson Correlation of Frequencies)
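A minimal sketch of the stability measure named above: Pearson correlation between a category's query-frequency distributions in two time windows. The helper name and the category data are illustrative, not from the study.

```python
from collections import Counter
import math

def pearson(freqs_a, freqs_b):
    """Pearson correlation between two query-frequency distributions
    (hypothetical helper; the query counts below are illustrative)."""
    queries = set(freqs_a) | set(freqs_b)
    xs = [freqs_a.get(q, 0) for q in queries]
    ys = [freqs_b.get(q, 0) for q in queries]
    n = len(queries)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0

# Example: how stable is a "sports"-category query set between two days?
monday  = Counter({"nfl scores": 120, "yankees": 80, "nba": 40})
tuesday = Counter({"nfl scores": 95,  "yankees": 70, "nba": 60})
print(pearson(monday, tuesday))
```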

  9. Query Log Characteristics • Analyzed two AOL search service logs: • One full week of queries from December, 2003 • Six full months of queries; Sept. 2004-Feb. 2005 • Some light pre-processing was done: • Case differences, punctuation, & special operators removed; whitespace trimmed • Basic statistics: • Queries average 2.2 terms in length • Only one page of results is viewed 81% of the time • Two pages: 18% • Three or more: 1%
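A small sketch of the kind of light pre-processing the slide describes (lowercasing, removing punctuation and special operators, trimming whitespace). The operator list here is an assumption, not the one used in the study.

```python
import re
import string

def normalize(query: str) -> str:
    """Light query normalization: lowercase, drop assumed special operators,
    strip punctuation, and collapse whitespace."""
    q = query.lower()
    q = re.sub(r'[+\-"]|site:|url:|link:', ' ', q)   # illustrative operator set
    q = q.translate(str.maketrans('', '', string.punctuation))
    return ' '.join(q.split())

print(normalize('  "NFL Scores" site:espn.com  '))   # -> 'nfl scores espncom'
```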

  10. Traffic Volume Over a Day

  11. Category Breakdown • Query lists for each category formed by a team of human editors • Query stream classified by exactly matching each query to category lists

  12. Category Popularity Over a Day

  13. Category Popularity Over Six Months

  14. Key Findings • Some topical categories vary substantially more in popularity than others over an average day • Some topics are more popular during particular times of the day • others have a more constant level of interest • Most individual categories are substantially less divergent over longer periods • Still, some seasonal changes appear (Sports, Holidays)

  15. Pearson Correlations for Selected Categories Over A Day

  16. Pearson Correlations for Selected Categories Over Six Months

  17. Key Findings • The actual query sets received within a topical category change over time, and the degree of change differs by category • At very large time scales, new trends become apparent: • Climatic (Seasonal) • Holidays • Sports-related • Several major events fall within the studied six-month period, causing high divergence in some categories • Long-term trends like these can potentially be very useful for query routing & disambiguation

  18. Summary • Query Stream contains trends that are independent of volume fluctuation • Query Stream exhibits different trends depending on the timescale being examined • Future work may be able to leverage these trends for improvement in areas such as • Caching strategies • Query disambiguation • Query routing & classification

  19. Automatic Query Classification • Introduction: Query Classification • Motivations & Prior Work • Our approach • Results & Analysis • Conclusions • Future Work

  20. Introduction • Goal: develop an approach that can associate a query with relevant topical categories • Automatic classifiers help a search service decide when to use specialized databases • Specialized databases may provide tailored, topic-specific results

  21. Problem Statement • Current search systems focus mainly on the terms in the queries • No focus on extracting topic information • Manual query classification is expensive • Does not take advantage of the large supply of unlabeled data available in query logs

  22. Prior Work • Much early text classification was document-based • Query Classification: • Manual (human assessors) • Automatic • Clustering Techniques – doesn’t help identify topics • Supervised learning via retrieved documents • Still expensive – retrieved documents must be classified

  23. Automatic Query Classification Motivations • Web queries have very few features • Achieving and sustaining classification recall is difficult • Web query logs provide a rich source of unlabeled data; we must harness these data to aid classification

  24. Our Approach • Combine three methods of classification: • Labeled Data Approaches: • Manual (exact-match lookup using labeled queries) • Supervised Learning (Perceptron trained with labeled queries) • Unlabeled Data Approach: • Unsupervised Rule Learning with unlabeled data from a large query log • Disjunctive Combination of the above

  25. Approach #1 - Exact-Match to Manual Classifications • A team of editors manually classified approximately 1M popular queries into 18 topical categories • General topics (sports, health, entertainment) • Mostly popular queries • Pros • Expect high precision from exact-match lookup • Cons • Expensive to maintain • Very low classification recall • Not robust to changes in the query stream
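A minimal sketch of the exact-match lookup described above: a query receives a category only if it appears verbatim in an editor-built list. The category lists and queries here are illustrative.

```python
# Hypothetical editor-built lists: category -> set of normalized queries.
CATEGORY_LISTS = {
    "sports":        {"nfl scores", "yankees tickets"},
    "entertainment": {"movie times", "american idol"},
    "health":        {"flu symptoms"},
}

def exact_match_classify(query: str) -> set:
    """Return every category whose editorial list contains the query verbatim.
    High precision, but recall is limited to queries the editors listed."""
    q = query.strip().lower()
    return {cat for cat, queries in CATEGORY_LISTS.items() if q in queries}

print(exact_match_classify("NFL scores"))   # {'sports'}
print(exact_match_classify("rare query"))   # set()  (the recall problem)
```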

  26. Approach #2 - Supervised Learning with a Perceptron • Goal: achieve higher levels of recall than human efforts • Supervised Learning • Used heavily in text classification • Bayes, Perceptron, SVM, etc… • Use manually classified queries to train a classifier • Pros: • Leverages available manual classifications for training • Finds features that are good predictors of a class • Cons: • Entirely dependent on the quality and quantity of manual classifications • Does not leverage unlabeled data
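A schematic of the per-category perceptron idea: one binary classifier per topic, with query terms as features. The feature choice, training data, and class names are illustrative assumptions, not the study's setup.

```python
from collections import defaultdict

class QueryPerceptron:
    """Binary perceptron for one topical category over query-term features
    (a sketch; the actual features and training regime may differ)."""
    def __init__(self, epochs=10):
        self.w = defaultdict(float)
        self.bias = 0.0
        self.epochs = epochs

    def score(self, query):
        return sum(self.w[t] for t in query.split()) + self.bias

    def fit(self, labeled):            # labeled: [(query, +1 or -1), ...]
        for _ in range(self.epochs):
            for query, y in labeled:
                if y * self.score(query) <= 0:      # mistake-driven update
                    for t in query.split():
                        self.w[t] += y
                    self.bias += y

    def predict(self, query):
        return self.score(query) > 0

sports = QueryPerceptron()
sports.fit([("nfl playoff scores", +1), ("cheap flights", -1),
            ("yankees schedule", +1), ("flu symptoms", -1)])
print(sports.predict("nfl schedule"))   # True for this toy data
```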

  27. Approach #3 - Unsupervised Rule Learning Using Unlabeled Data • We have query logs with very large numbers of queries • Must take advantage of millions of users showing us how they look for things • Build on manual efforts • Manual efforts tell us some words from each category • Find words associated with each category • Learn how people look for topics, e.g. “what words do users use to find musicians or lawn-mowers”

  28. Unsupervised Rule Learning Using Unlabeled Data (2) • Find good predictors of a class based on how users phrase queries related to certain categories • Use those words to predict new members of each category • Apply the notion of selectional preferences to find weighted rules for classifying queries automatically

  29. Selectional Preferences: Step 1 • Obtain a large log of unlabeled web queries • View each query as pairs of lexical units: • <head, tail> • Only applicable to queries of 2+ terms • Queries with n terms form n-1 pairs • Example: “directions to DIMACS” forms two pairs: • <directions, to DIMACS> and <directions to, DIMACS> • Count and record the frequency of each pair
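A minimal sketch of Step 1: splitting an n-term query at every internal boundary to form its n-1 <head, tail> pairs, matching the "directions to DIMACS" example above.

```python
def head_tail_pairs(query: str):
    """Split an n-term query at every internal boundary, yielding n-1
    <head, tail> pairs (single-term queries yield nothing)."""
    terms = query.split()
    return [(" ".join(terms[:i]), " ".join(terms[i:]))
            for i in range(1, len(terms))]

print(head_tail_pairs("directions to DIMACS"))
# [('directions', 'to DIMACS'), ('directions to', 'DIMACS')]
```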

  30. Selectional Preferences: Step 2 • Obtain a set of manually labeled queries • Check the heads and tails of each pair to see if they appear in the manually labeled set • Convert each <head, tail> pair into: • <head, CATEGORY> (forward preference) • <CATEGORY, tail> (backward preference) • Discard <head, tail> pairs for which there is no category information at all • Sum counts for all contributing pairs and normalize by the number of contributing pairs
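A sketch of Step 2, assuming a lookup from labeled query string to its categories; the labeled entries and counts are illustrative, and the final normalization by the number of contributing pairs is omitted for brevity.

```python
from collections import Counter

# Hypothetical labeled lookup: normalized query -> set of categories.
LABELED = {"dimacs": {"PLACES"}, "directions to dimacs": {"TRAVEL"}}

def mine_preferences(pair_counts):
    """Replace the labeled side of each <head, tail> pair with its category,
    producing forward (<head, CATEGORY>) and backward (<CATEGORY, tail>)
    preference counts; pairs with no labeled side are discarded."""
    forward, backward = Counter(), Counter()
    for (head, tail), count in pair_counts.items():
        for cat in LABELED.get(tail.lower(), ()):
            forward[(head, cat)] += count          # forward preference
        for cat in LABELED.get(head.lower(), ()):
            backward[(cat, tail)] += count         # backward preference
    return forward, backward

pairs = Counter({("directions", "to DIMACS"): 3, ("directions to", "DIMACS"): 3})
print(mine_preferences(pairs))
```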

  31. Selectional Preferences: Step 2

  32. Selectional Preferences: Step 3 • Score each preference using Resnik’s Selectional Preference Strength formula (a rendering is sketched below) • where u represents a category, as found in Step 2 • S(x) is the sum of the weighted scores for every category associated with a given lexical unit
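The formula image did not survive the transcript; below is a plausible rendering using the standard form of Resnik's selectional preference strength, consistent with the description above (u ranges over categories, x is a lexical unit). The exact weighting used in the talk may differ.

```latex
% Selectional preference strength of a lexical unit x (assumed standard form):
S(x) = \sum_{u} P(u \mid x)\,\log \frac{P(u \mid x)}{P(u)}
% Per-category weighted score (selectional association), summed by S(x):
A(x, u) = \frac{P(u \mid x)\,\log \frac{P(u \mid x)}{P(u)}}{S(x)}
```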

  33. Selectional Preferences: Step 4 • Use the mined preferences from Step 2 and the weighted scores from Step 3 to assign classifications to unseen queries
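A sketch of applying mined rules to a new query, using weights taken from the example rules on the next slide. Summing matching rule weights and thresholding is an illustrative scoring choice, not necessarily the one used in the study.

```python
# Mined rules (weights from the examples on the next slide): unit -> {category: weight}
FORWARD  = {"harley chicks with": {"PORN": 5.681},
            "harley all stainless": {"AUTOS": 3.448, "SHOPPING": 0.021}}
BACKWARD = {"getaway bargain": {"PLACES": 0.877, "SHOPPING": 0.047, "TRAVEL": 0.862}}

def classify(query: str, threshold: float = 0.5):
    """Score every category whose forward rule matches a query prefix or whose
    backward rule matches a suffix, then keep categories above a threshold."""
    scores = {}
    terms = query.lower().split()
    for i in range(1, len(terms)):
        head, tail = " ".join(terms[:i]), " ".join(terms[i:])
        for cat, w in FORWARD.get(head, {}).items():
            scores[cat] = scores.get(cat, 0.0) + w
        for cat, w in BACKWARD.get(tail, {}).items():
            scores[cat] = scores.get(cat, 0.0) + w
    return {cat: s for cat, s in scores.items() if s >= threshold}

print(classify("harley all stainless exhaust"))   # {'AUTOS': 3.448}
```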

  34. Selectional Preference Rule Examples • Forward Rules: • harlem club X — ENT->0.722, PLACES->0.378, TRAVEL->1.531 • harley all stainless X — AUTOS->3.448, SHOPPING->0.021 • harley chicks with X — PORN->5.681 • Backward Rules: • X gets hot wont start — AUTOS->2.049, PLACES->0.594 • X getaway bargain — PLACES->0.877, SHOPPING->0.047, TRAVEL->0.862 • X getaway bargain hotel and airfare — PLACES->0.594, TRAVEL->2.057

  35. Combined Approach • Each approach exploits different qualities of our query stream • A natural next step is to combine them • How similar are the approaches?
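A sketch of the disjunctive combination from slide 24: a query receives a category if any of the three classifiers assigns it. The classifier interfaces (each returning a set of category names) and the stand-in lambdas are assumptions for illustration.

```python
def combined_classify(query, classifiers):
    """Disjunctive combination: union of the category sets returned by the
    exact-match, perceptron, and selectional-preference classifiers."""
    labels = set()
    for clf in classifiers:
        labels |= clf(query)
    return labels

# Illustrative stand-ins for the three approaches:
classifiers = [
    lambda q: {"SPORTS"} if "nfl" in q.lower() else set(),    # exact match
    lambda q: set(),                                          # perceptron
    lambda q: {"SPORTS", "NEWS"} if "scores" in q else set(), # SP rules
]
print(combined_classify("nfl scores", classifiers))   # {'SPORTS', 'NEWS'}
```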

  36. Evaluation Metrics • Classification Precision: • #true positives / (#true positives + #false positives) • Classification Recall: • #true positives / (#true positives + #false negatives) • F-Measure: weighted harmonic mean of precision and recall; higher values of beta put more emphasis on recall (formula below)
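The F-measure formula itself did not survive the transcript; the standard weighted form implied by the bullet is:

```latex
F_{\beta} = \frac{(1 + \beta^{2}) \cdot P \cdot R}{\beta^{2} \cdot P + R}
% P = precision, R = recall; beta > 1 emphasizes recall, beta < 1 precision.
```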

  37. Experimental Data Sets • Separate collections for training and testing: • Training: • Nearly 1M web queries manually classified by a team of editors • Grouped non-exclusively into 18 topical categories, and trained each category independently • Query log of several hundred million queries used for forming SP rules • Testing: • 20,000 web queries classified by human assessors • ~30% agreement with classifications in training set • 25% of the testing set was set aside for tuning the perceptron & SP classifiers

  38. Effectiveness of each approach

  39. Performance of Classifiers at varying levels of Beta

  40. KDD Cup 2005 • 2005 KDD Cup task was Query Classification • 800,000 queries and 67 topical categories • 800 queries judged by three assessors • Top performers used information from retrieved documents • Retrieved result snippets for aiding classification decisions • Top terms from snippets and documents used for query expansion • Systems evaluated on precision and F1

  41. KDD Cup Experiments • We mapped our manual classifications onto the KDD Cup category set • Obviously an imperfect mapping • Our categories are general, e.g., “Sports” • KDD Cup categories are specific, e.g., “Sports-Baseball” • Running a retrieval pass is prohibitively expensive • We relied only on our general manual classifications and queries in the log

  42. KDD Cup Results

  43. Conclusions • Our system successfully makes use of large amounts of unlabeled data • The Selectional Preference rules allow us to classify a significantly larger portion of the query stream than manual efforts alone • Excellent potential for further improvements

  44. Future Work • Expand available classification features per query • Mine web query logs for related terms and patterns • More intelligent combination methods • Learned combination functions • Voting algorithms • Utilize external sources of information • Patterns and trends from query log analysis • Topical ontology lookups • Use automatic query classification to improve effectiveness and efficiency in a production search system

  45. Related Bibliography • Journals • S. Beitzel, et al., “Temporal Analysis of a Very Large Topically Categorized Query Log”, Journal of the American Society for Information Science and Technology (JASIST), Vol. 58, No. 2, 2007. • S. Beitzel, et al., “Automatic Classification of Web Queries Using Very Large Unlabeled Query Logs”, ACM Transactions on Information Systems (TOIS), Vol. 25, No. 2, April 2007. • Conferences • S. Beitzel, et al., “Hourly Analysis of a Very Large Topically Categorized Web Query Log”, ACM-SIGIR, July 2004. • S. Beitzel, et al., “Automatic Query Classification”, ACM-SIGIR, August 2005. • S. Beitzel, et al., “Improving Automatic Query Classification via Semi-supervised Learning”, IEEE-ICDM, November 2005.

  46. Questions? • Thanks!
