
Query Suggestion Using Hitting Time

Query Suggestion Using Hitting Time. Qiaozhu Mei † , Dengyong Zhou ‡ , Kenneth Church ‡ † University of Illinois at Urbana-Champaign ‡ Microsoft Research, Redmond. Motivating Examples. Sports center. MSG. 1. Difficult for a user to express information need


Presentation Transcript


  1. Query Suggestion Using Hitting Time Qiaozhu Mei †, Dengyong Zhou ‡, Kenneth Church ‡ † University of Illinois at Urbana-Champaign ‡ Microsoft Research, Redmond

  2. Motivating Examples • Query "MSG": sports center or food additive? • 1. Difficult for a user to express an information need • 2. Difficult for a search engine to infer the information need • Query suggestions: accurate in expressing the information need; easy to infer the information need

  3. Motivating Examples (Cont.) Welcome to the hotel california

  4. Motivating Examples: Personalization • Query: MSR • Candidates: Metropolis Street Racer; Magnetic Stripe Reader; Molten salt reactor; Mars Sample Return; …; Mountain safety research • Actually looking for Microsoft Research…

  5. Research Questions • How can we generate query suggestions in a principled way? • Can we generate personalized query suggestions using the same method? • Can this method be generalized to other search related tasks?

  6. Rest of This Talk • Random Walk, Hitting Time, and Bipartite Graph • Generating Query Suggestion • Personalized Query Suggestion • Experiments • Discussion and Summary

  7. Random Walk and Hitting Time • Hitting time T_A: the first time that the random walk is at a vertex in A • Mean hitting time h_i^A: the expectation of T_A given that the walk starts from vertex i [Figure: from vertex i the walk moves to vertex j with probability 0.7 and to vertex k with probability 0.3; A is the target set]

  8. Computing Hitting Time • T_A: the first time that the random walk is at a vertex in A • h_i^A: the expectation of T_A given that the walk starts from vertex i • For the example vertex i, which moves to j with probability 0.7 and to k with probability 0.3: h_i^A = 0.7 h_j^A + 0.3 h_k^A + 1 • Apparently, h_i^A = 0 for those vertices i in A • Iterative computation
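The iterative computation on slide 8 can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function name `hitting_time`, the dictionary representation of the graph, and the fixed iteration count are all assumptions. Only the out-edges of vertex i (0.7 to j, 0.3 to k) come from the slide; the remaining edges of the toy graph are made up so that the example runs.

```python
def hitting_time(transitions, targets, iterations=100):
    """Approximate the mean hitting time from every vertex to the set `targets`.

    transitions: dict mapping each vertex to {neighbor: transition probability}
                 (every vertex, including targets, must appear as a key)
    targets:     set of vertices forming A
    """
    h = {v: 0.0 for v in transitions}
    for _ in range(iterations):
        new_h = {}
        for v, neighbors in transitions.items():
            if v in targets:
                # A walk that starts inside A has already hit A.
                new_h[v] = 0.0
            else:
                # One step, plus the expected remaining time from the next vertex.
                new_h[v] = 1.0 + sum(p * h[u] for u, p in neighbors.items())
        h = new_h
    return h


# Toy graph: i -> j (0.7) and i -> k (0.3) as on the slide;
# the edges j -> A and k -> A are hypothetical.
toy = {"i": {"j": 0.7, "k": 0.3}, "j": {"A": 1.0}, "k": {"A": 1.0}, "A": {}}
h = hitting_time(toy, targets={"A"})
print(h["i"], h["j"], h["k"])
```

Stopping after a fixed number of iterations gives a truncated estimate; slide 22 notes that a few iterations already capture most of the value.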

  9. Bipartite Graph and Hitting Time • Bipartite graph: edges between V1 and V2; no edge inside V1 or V2; edges are weighted; e.g., V1 = query, V2 = Url • Expected proximity of query i to the query A: hitting time of i → A, h_i^A • Convert to a directed graph, or even collapse one group [Figure: weighted bipartite graph over V1 and V2, with an example edge weight w(i, j) = 3]
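One plausible reading of "collapse one group" is to walk query → Url → query in two steps, normalizing edge weights by vertex degree at each step, so that only query-to-query transition probabilities remain. The sketch below makes that assumption explicit; the function name `query_transitions` and the toy click counts are hypothetical and not taken from the slides.

```python
from collections import defaultdict


def query_transitions(clicks):
    """Collapse a weighted query-Url bipartite graph into query-to-query
    transition probabilities.

    clicks: dict mapping (query, url) -> edge weight w(query, url)
    Assumed rule: p(i -> j) = sum over urls k of w(i, k)/d_i * w(k, j)/d_k,
    where d_i and d_k are the weighted degrees of query i and url k.
    """
    query_degree = defaultdict(float)
    url_degree = defaultdict(float)
    queries_by_url = defaultdict(list)
    for (query, url), weight in clicks.items():
        query_degree[query] += weight
        url_degree[url] += weight
        queries_by_url[url].append((query, weight))

    p = defaultdict(lambda: defaultdict(float))
    for (qi, url), weight in clicks.items():
        step_to_url = weight / query_degree[qi]
        for qj, back_weight in queries_by_url[url]:
            p[qi][qj] += step_to_url * back_weight / url_degree[url]
    return {q: dict(row) for q, row in p.items()}


# Hypothetical click counts, for illustration only.
toy_clicks = {("a", "u1"): 3, ("b", "u1"): 1, ("b", "u2"): 2}
p = query_transitions(toy_clicks)
print(p)
```

Each row of the result sums to 1, so it can be used directly as the transition table of a random walk over queries.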

  10. Generate Query Suggestion • Construct a (kNN) subgraph from the query log data (of a predefined number of queries/urls) • Compute transition probabilities p(i → j) • Compute hitting time h_i^A • Rank candidate queries using h_i^A [Figure: query-Url click graph with queries aa (marked T), mexiana, american airline; Urls www.aa.com, www.theaa.com/travelwatch/planner_main.jsp, en.wikipedia.org/wiki/Mexicana; example edge weights 300 and 15]
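The last two steps (compute hitting time, then rank) can be sketched as below, assuming the query-to-query transition probabilities have already been computed. The function name `suggest`, its parameters, and the toy probabilities are hypothetical. Candidates with the smallest hitting time to the target query are ranked first.

```python
def suggest(transitions, target, top_n=3, iterations=20):
    """Rank candidate queries by their (truncated) hitting time to `target`.

    transitions: dict mapping query -> {query: transition probability}
    Neighbors that fall outside the subgraph are treated as contributing 0.
    """
    h = {q: 0.0 for q in transitions}
    for _ in range(iterations):
        h = {
            q: 0.0 if q == target
            else 1.0 + sum(p * h.get(r, 0.0) for r, p in row.items())
            for q, row in transitions.items()
        }
    candidates = [q for q in h if q != target]
    # A smaller hitting time means the candidate is closer to the target query.
    return sorted(candidates, key=lambda q: h[q])[:top_n]


# Made-up transition probabilities, for illustration only.
toy = {
    "target": {"target": 1.0},
    "near": {"target": 0.8, "near": 0.2},
    "far": {"near": 0.5, "far": 0.5},
}
ranking = suggest(toy, "target")
print(ranking)
```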

  11. Intuition • Why does it work? • A url is close to a query if freq(q, url) dominates the number of clicks on this url (most people use q to access url) • A query is close to the target query if it is close to many urls that are close to the target query

  12. Personalized Query Suggestion • Queries are ambiguous • Different user → different information need → different query suggestions • Simple approach: build the graph and compute hitting time based solely on the user’s history • Data sparseness • E.g., you cannot see a query if you never used it • Alternative: modify the bipartite graph instead of rebuilding it entirely

  13. Personalize the Bipartite Graph • Key: How to compute • From w(url, user, query): sparse data! • Compute a smoothed p(Url | User, Query) • Introduce a pseudo (personalized) query, e.g., “aa” + user • Reweight edges using personalized probabilities [Figure: query-Url graph with queries aa (marked T), alcoholics anonymous, american airline, and the pseudo query P (“aa” + user); Urls www.aa.com, www.theaa.com/travelwatch/planner_main.jsp, en.wikipedia.org/wiki/Alcoholics_Anonymous, www.alcoholics-anonymous.org]

  14. Personalization with Backoff (Mei and Church 08) • Full personalization (156.111.188.243): sparse data! • Personalization with backoff: 156.111.188.*, 156.111.*.*, 156.*.*.* • No personalization (*.*.*.*): lose the opportunity • We don’t have enough data for everyone! • Back off to classes of users (e.g., IP)
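A smoothed p(Url | User, Query) with backoff over IP prefixes could look like the sketch below. The slides do not say how the backoff levels are combined, so the linear interpolation, the weights, and the names `ip_prefixes` and `smoothed_click_prob` are all assumptions made for illustration; the click records are made up.

```python
from collections import defaultdict


def ip_prefixes(ip):
    """Backoff classes of an IPv4 address, from most to least specific."""
    parts = ip.split(".")
    return [".".join(parts[:n] + ["*"] * (4 - n)) for n in range(4, -1, -1)]


def smoothed_click_prob(log, ip, query, weights=(0.4, 0.3, 0.15, 0.1, 0.05)):
    """Estimate p(url | user, query) by interpolating over backoff levels.

    log:     iterable of (ip, query, url) click records
    weights: hypothetical interpolation weights, one per backoff level
    """
    levels = ip_prefixes(ip)
    counts = [defaultdict(float) for _ in levels]
    for record_ip, record_query, url in log:
        if record_query != query:
            continue
        record_levels = ip_prefixes(record_ip)
        for k, level in enumerate(levels):
            if record_levels[k] == level:
                counts[k][url] += 1.0

    probs = defaultdict(float)
    used_weight = 0.0
    for weight, level_counts in zip(weights, counts):
        total = sum(level_counts.values())
        if total == 0:
            continue  # no clicks observed at this level; skip it
        used_weight += weight
        for url, n in level_counts.items():
            probs[url] += weight * n / total
    if used_weight == 0:
        return {}
    # Renormalize over the levels that actually had data.
    return {url: p / used_weight for url, p in probs.items()}


# Made-up click records, for illustration only.
toy_log = [
    ("156.111.188.243", "msr", "url_a"),
    ("10.0.0.1", "msr", "url_b"),
]
probs = smoothed_click_prob(toy_log, "156.111.188.243", "msr")
print(ip_prefixes("156.111.188.243"))
print(probs)
```

The resulting probabilities can then be used to reweight the edges of the pseudo query introduced on slide 13.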

  15. Experiments • Query Suggestion using Query Logs • commercial search engine log (1.5 years) • 637 million queries; 585 million urls • Query-click bipartite graph • Author/keyword suggestion using DBLP • titles and authors from DBLP • 110k papers, 580k authors • Coauthor graph, keyword graph, author-keyword bipartite graph • Baselines: nearest neighbor; personalized pagerank

  16. Result: Query Suggestion Query = friends

  17. Result: Query Suggestion (II) Query = aa Query = ranknet

  18. Results: Personalized Query Suggestion Query = msr

  19. Result: Author Suggestion • Query = Jon Kleinberg • Favor students, especially current students • (personalized Pagerank is similar) • Famous researchers + former students

  20. Result: Keyword Suggestion

  21. Result: Keyword Suggestion for Author Query = Michael I. Jordan Query = Jiawei Han

  22. Discussions • Hitting time effectively boosts infrequent queries • Nearest neighbor and personalized pagerank favor frequent queries • Fast convergence: a few iterations and a subgraph get most of the value • No parameters to tune • Can be generalized to many other tasks (on different graphs)

  23. Ranking on Query Log Graph and Search Tasks • Query → Query: query suggestion • Url → Url: finding related pages, e.g., www.cs.jhu.edu/~brill → research.microsoft.com/users/brill • IP → IP: finding similar users • Url → Query: annotation, summarization, ad terms • Query → Url: search • IP, Query → Url: personalized search • IP, Query → Query: personalized query suggestion • Many other opportunities!

  24. Summary • Generate query suggestions using hitting time on query-click graph • Personalized query suggestion • Generalizable to other search tasks • Future work: • Different types of graphs: e.g., query sessions • Combine with other features • Large scale evaluation

  25. Thanks!
