
Query Classification

Query Classification and KDDCUP 2005, by Qiang Yang and Dou Shen. Covers query classification and online advertisement, and query classification (QC) as machine learning. Inspired by the KDDCUP'05 competition: classify a query into a ranked list of categories, with queries collected from real search engines.


Presentation Transcript


  1. Query Classification and KDDCUP 2005 (Qiang Yang, Dou Shen)

  2. Query Classification and Online Advertisement

  3. QC as Machine Learning
     Inspired by the KDDCUP'05 competition:
     • Classify a query into a ranked list of categories
     • Queries are collected from real search engines
     • Target categories are organized in a tree, with each node being a category

  4. How to do it?

  5. Solutions: Query Enrichment + Staged Classification
     • Solution 1: Query/Category Enrichment
     • Solution 2: Bridging Classifier

  6. Query enrichment
     • Textual information: title, snippet, full text
     • Category information: category

  7. Classifiers
     • Map by word matching
       • Direct and extended matching
       • High precision, low recall
     • SVM
       • Apply synonym-based classifiers to map Web pages from ODP to the target taxonomy
       • Obtain <pages, target category> pairs as the training data
       • Train SVM classifiers for the target categories
       • Higher recall
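A minimal sketch of what mapping by word matching could look like, assuming a source label such as a directory category name is matched against keywords of each target category. The category names, keyword sets, synonym sets, and the rule that a direct hit counts double are all hypothetical choices for illustration; the slides do not specify them.

```python
import re
from typing import Dict, List, Set

# Hypothetical target categories. DIRECT holds words taken from the category
# name itself; EXTENDED holds hand-picked synonyms (an assumption, standing in
# for whatever synonym source the authors used).
DIRECT: Dict[str, Set[str]] = {
    "Computers/Hardware": {"hardware", "computer"},
    "Sports/Tennis": {"tennis"},
}
EXTENDED: Dict[str, Set[str]] = {
    "Computers/Hardware": {"laptop", "cpu", "motherboard"},
    "Sports/Tennis": {"wimbledon", "racket"},
}


def map_label(source_label: str) -> List[str]:
    """Return target categories whose keywords occur in source_label, best first.

    Scoring is an illustrative assumption: each direct match counts 2,
    each extended (synonym) match counts 1.
    """
    tokens = set(re.findall(r"[a-z0-9]+", source_label.lower()))
    scores: Dict[str, int] = {}
    for category, direct_words in DIRECT.items():
        extended_words = EXTENDED.get(category, set())
        score = 2 * len(tokens & direct_words) + len(tokens & extended_words)
        if score > 0:
            scores[category] = score
    # Sort by descending score, breaking ties alphabetically.
    return sorted(scores, key=lambda c: (-scores[c], c))
```

Labels that share no keyword with any target category map to nothing, which is one way to see why this kind of matching trades recall for precision.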

  8. Bridging Classifier
     • Problem with Solution 1: when the target is changed, training needs to be repeated!
     • Solution: connect the target taxonomy and queries by taking an intermediate taxonomy as a bridge

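  9. Bridging Classifier (Cont.)
     • How to connect? The relation between the query and a target category is built from:
       • The relation between the intermediate categories and the target category
       • The relation between the query and the intermediate categories
       • The prior probability of each intermediate category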
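A minimal sketch of how such a bridge could be computed, assuming the score of a target category C_T for a query q is obtained by summing over intermediate categories C_I: score(C_T | q) is proportional to the sum of p(C_T | C_I) * p(q | C_I) * p(C_I). The function names, dictionary layout, and numbers are hypothetical.

```python
from typing import Dict, List


def bridge_scores(
    p_q_given_ci: Dict[str, float],
    p_ci: Dict[str, float],
    p_ct_given_ci: Dict[str, Dict[str, float]],
) -> Dict[str, float]:
    """Unnormalized score of each target category for one query.

    p_q_given_ci[ci]      : relation between the query and intermediate category ci
    p_ci[ci]              : prior probability of intermediate category ci
    p_ct_given_ci[ci][ct] : relation between intermediate ci and target ct
    """
    scores: Dict[str, float] = {}
    for ci, likelihood in p_q_given_ci.items():
        weight = likelihood * p_ci[ci]
        for ct, p in p_ct_given_ci[ci].items():
            scores[ct] = scores.get(ct, 0.0) + p * weight
    return scores


def rank_targets(scores: Dict[str, float]) -> List[str]:
    """Target categories ordered from highest to lowest score."""
    return sorted(scores, key=lambda ct: (-scores[ct], ct))


# Toy numbers, invented for illustration only.
example_scores = bridge_scores(
    p_q_given_ci={"I1": 0.5, "I2": 0.1},
    p_ci={"I1": 0.4, "I2": 0.6},
    p_ct_given_ci={"I1": {"T1": 0.9, "T2": 0.1}, "I2": {"T1": 0.2, "T2": 0.8}},
)
```

Under this reading, only `p_ct_given_ci` depends on the target taxonomy, so swapping in a new target means recomputing that table while the query-to-intermediate part stays untouched.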

  10. Category Selection for Intermediate Taxonomy
      Category selection for reducing complexity:
      • Total Probability (TP)
      • Mutual Information
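The slide names Total Probability (TP) without defining it. One plausible reading, assumed here, is that each intermediate category is scored by the sum of its conditional probabilities over all target categories, and only the top-scoring intermediate categories are kept. The function name and data layout are hypothetical, and the Mutual Information variant is not sketched.

```python
from typing import Dict, List


def select_by_total_probability(
    p_ct_given_ci: Dict[str, Dict[str, float]], k: int
) -> List[str]:
    """Keep the k intermediate categories with the largest summed p(ct | ci).

    Assumption: p_ct_given_ci[ci] lists only the target categories that ci
    maps to, so the values need not sum to 1.
    """
    totals = {ci: sum(dist.values()) for ci, dist in p_ct_given_ci.items()}
    return sorted(totals, key=lambda ci: (-totals[ci], ci))[:k]
```

Pruning the intermediate taxonomy this way shrinks the sum that the bridging step has to evaluate for every query.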

  11. Experiment: Data Sets & Evaluation
      • KDDCUP
        • Started in 1997, KDD Cup is the leading data mining and knowledge discovery competition in the world, organized by ACM SIGKDD
      • KDDCUP 2005
        • Task: categorize 800K search queries into 67 categories
        • Three awards: (1) Performance Award; (2) Precision Award; (3) Creativity Award
        • Participation: 142 registered groups; 37 solutions submitted by 32 teams
        • Evaluation data
          • 800 queries randomly selected from the 800K query set
          • 3 human labelers labeled the entire evaluation query set
        • Evaluation measurements: Precision and Performance (F1)
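A minimal sketch of the two evaluation measurements, assuming precision and recall are pooled over all (query, category) pairs and that the final numbers are plain averages over the three labelers. Both assumptions, along with the function names and the toy data, are illustrative and not taken from the slides.

```python
from typing import Dict, List, Set, Tuple

Labels = Dict[str, Set[str]]  # query -> set of categories


def precision_recall_f1(predicted: Labels, gold: Labels) -> Tuple[float, float, float]:
    """Pooled precision, recall and F1 of predicted labels against one labeler."""
    correct = sum(len(predicted.get(q, set()) & cats) for q, cats in gold.items())
    n_predicted = sum(len(predicted.get(q, set())) for q in gold)
    n_gold = sum(len(cats) for cats in gold.values())
    precision = correct / n_predicted if n_predicted else 0.0
    recall = correct / n_gold if n_gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def average_over_labelers(predicted: Labels, labelers: List[Labels]) -> Tuple[float, float]:
    """Mean precision and mean F1 across labelers (assumed aggregation rule)."""
    results = [precision_recall_f1(predicted, gold) for gold in labelers]
    n = len(results)
    return sum(r[0] for r in results) / n, sum(r[2] for r in results) / n


# Toy data, invented for illustration only.
predicted = {"q1": {"A", "B"}, "q2": {"C"}}
labeler_1 = {"q1": {"A"}, "q2": {"C", "D"}}
labeler_2 = {"q1": {"A", "B"}, "q2": {"C"}}
mean_precision, mean_f1 = average_over_labelers(predicted, [labeler_1, labeler_2])
```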

  12. Experiment Results: Compare Different Methods
      • Comparison among our own methods
      • Comparison with other teams in KDDCUP 2005 (from different groups)

  13. Result of Bridging Classifiers
      • Using a bridging classifier allows the target classes to change freely without the need to retrain the classifier!
      • Performance of the bridging classifier with different granularity of the intermediate taxonomy

  14. Target-transfer Learning
      • A classifier, once trained, stays constant
        • When target classes change, the classifier needs to be retrained with new data
          • Too costly
          • Not online
      • Bridging classifier:
        • Allows the target to change
        • Application: advertisements come and go, but our query-target mapping need not be retrained!
        • We call this the target-transfer learning problem

  15. Task 2: Can a computer do this?

  16. Data: Web Search Queries
      • Consider the following search queries
        • “AAAI”
        • “Machine Learning”
        • “Constraint Reasoning”

  17. AAAI 07, joint work with D. Shen, J. Sun, M. Qin, Z. Chen et al.
      • Queries have different granularity
        • Car vs. BMW; BMW vs. Audi
      • Can we organize the queries into hierarchies?
      • Benefits of building query hierarchies
        • Provide online query suggestion
        • Query classification
        • Query clustering
      • Difficulties of building query hierarchies
        • Queries are short
        • The hierarchical structure cannot be pre-defined

  18. Clickthrough Data
      [Figure labels: Clickthrough Data; Search Engines]

  19. Intuitive Ideas
      Our goal: mine the query hierarchies from clickthrough data
      • If two queries are related to each other, they should share some of the same or similar clicked Web pages;
      • For two queries qi and qj, qi is more general if most of the clicked pages of qj have similar pages to some clicked pages of qi, while not the other way around
      • If a query is specific, the contents of its clicked pages are relatively consistent,
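The generality test between two queries can be sketched roughly as follows. This is only an illustration of the intuition, assuming each clicked page is represented as a set of terms, page similarity is Jaccard overlap, and "most" and "not the other way around" are turned into the thresholds `high` and `low`. All of these choices, and the function names, are hypothetical.

```python
from typing import List, Set

Page = Set[str]  # a clicked page represented as a set of terms (assumption)


def jaccard(a: Page, b: Page) -> float:
    """Jaccard overlap between two term sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def coverage(pages_from: List[Page], pages_to: List[Page], sim_threshold: float = 0.5) -> float:
    """Fraction of pages_from that have at least one similar page in pages_to."""
    if not pages_from:
        return 0.0
    covered = sum(
        1
        for page in pages_from
        if any(jaccard(page, other) >= sim_threshold for other in pages_to)
    )
    return covered / len(pages_from)


def is_more_general(
    clicks_i: List[Page], clicks_j: List[Page], high: float = 0.6, low: float = 0.4
) -> bool:
    """True if qi looks more general than qj.

    Most clicked pages of qj must have a similar page among qi's clicks,
    while few of qi's clicked pages have a similar page among qj's clicks.
    """
    return coverage(clicks_j, clicks_i) >= high and coverage(clicks_i, clicks_j) <= low


# Toy clickthrough data, invented for illustration only.
car_clicks = [
    {"car", "dealer", "price"},
    {"bmw", "car", "sedan"},
    {"audi", "car", "suv"},
    {"car", "insurance", "quote"},
]
bmw_clicks = [{"bmw", "car", "sedan"}, {"bmw", "sedan", "car", "review"}]
```

With this toy data, `is_more_general(car_clicks, bmw_clicks)` holds while the reverse does not, matching the "Car vs. BMW" example of differing granularity.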
