Automatic Query Reformulation with Syntactic Operators to Alleviate Search Difficulty
Learn about automatic query reformulation using syntactic operators to improve search effectiveness. Explore the model, features, and experiments demonstrating the benefits of this approach.
Automatic Query Reformulation with Syntactic Operators to Alleviate Search Difficulty • Huizhong Duan, Rui Li, ChengXiang Zhai • University of Illinois at Urbana-Champaign
Introduction • Search Engine • The most important tool for getting information. • We use it every day. • Queries • We are trained to use keyword queries. • Advanced Query Syntax • No idea what it is…
Advanced Query Syntax • Necessity Operator • E.g. green tree +street • I’m looking for a street! • Phrase Operator • E.g. “green tree street” • Not green street with trees! • Synonym Operator • E.g. green tree ~street • Hmm, I’m not sure it’s a street/road/avenue… • …… • Syntactic Operator, Syntactic Query, Syntactic Reformulation
Syntactic Operators • Extend our ability to express our information needs. • Potentially useful in formulating more effective queries.
Syntactic Operators • Are very effective if used appropriately. • Rarely used by ordinary users. • Difficult to use due to the lack of knowledge of the dataset. • Question: Can we automatically formulate syntactic queries given users’ keyword queries?
Problem formulation • Input: a keyword query q, a syntactic operator op, and a target performance metric M. • Goal: to find a ranked list of syntactic reformulations of q through the use of op: S_op(q) = {q_1, q_2, …, q_n | M(q_1) > M(q_2) > … > M(q_n)}. • Tasks: • implicit refine: use q, q_1, q_2, …, q_m with probabilities. • explicit refine: output the top-ranked query q_1 if M’(q_1) = M(q_1) - M(q) > 0, or otherwise the original query q. • diagnose query: users resort to help with an ineffective keyword query (negative / pseudo negative feedback is available)
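The explicit-refine rule above can be sketched in a few lines of Python. This is only an illustration: `explicit_refine` and `predicted_metric` are hypothetical names, and `predicted_metric` stands in for an estimate of M, on the assumption that the true metric is not observable at query time. The toy scores are made up.

```python
from typing import Callable, List


def explicit_refine(
    query: str,
    candidates: List[str],
    predicted_metric: Callable[[str], float],
) -> str:
    """Return the top-ranked reformulation q_1 if its predicted gain
    M'(q_1) = M(q_1) - M(q) is positive; otherwise return the original query."""
    if not candidates:
        return query
    # q_1 is the candidate with the highest predicted metric value.
    best = max(candidates, key=predicted_metric)
    gain = predicted_metric(best) - predicted_metric(query)
    return best if gain > 0 else query


# Toy, made-up scores used only to exercise the rule.
toy_scores = {
    "green tree street": 0.30,
    '"green tree" street': 0.45,
    'green "tree street"': 0.20,
}
chosen = explicit_refine(
    "green tree street",
    ['"green tree" street', 'green "tree street"'],
    toy_scores.get,
)
print(chosen)
```

Falling back to the original query when the predicted gain is not positive keeps the reformulation from hurting queries that already work well.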
The Model • Learning to rank • Learns a scoring function to score each sample • Pairwise or Listwise loss function • The score indicates the ranking • Score each candidate reformulation with the learned model • “green tree street” • “green tree” street • green “tree street” • green tree street
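The candidate list on this slide can be reproduced with a small enumeration. The sketch below assumes that each candidate quotes at most one contiguous span of two or more terms, which matches the four candidates shown for this three-term query; the slides do not state the full candidate space. `phrase_candidates` and `rank_candidates` are hypothetical helper names, and `score` is a placeholder for the learned scoring function, not the authors' model. Straight quotes are used in place of the slide's curly quotes.

```python
from typing import Callable, List


def phrase_candidates(query: str) -> List[str]:
    """Enumerate phrase-operator reformulations that quote one contiguous
    span of at least two terms, plus the original keyword query."""
    terms = query.split()
    n = len(terms)
    candidates = []
    for start in range(n):
        for end in range(start + 2, n + 1):
            quoted = '"' + " ".join(terms[start:end]) + '"'
            candidates.append(" ".join(terms[:start] + [quoted] + terms[end:]))
    candidates.append(query)  # the unmodified query is also a candidate
    return candidates


def rank_candidates(candidates: List[str], score: Callable[[str], float]) -> List[str]:
    """Order candidates by descending score, as a learned ranker would."""
    return sorted(candidates, key=score, reverse=True)


for candidate in phrase_candidates("green tree street"):
    print(candidate)
```

In a learning-to-rank setup, each candidate would first be mapped to a feature vector, and the learned function would score that vector rather than the raw string.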
The features • Difficulty
The features • Distinguishability
The features • Negativity • Corresponds to a scenario where users resort to the reformulation only when they are not satisfied with the result from the keyword query • Negative feedback or pseudo negative feedback is available
Combining operators • Operator-Combination • predict syntactic queries with different operators jointly • Result-Combination • predict each operator separately and select the reformulation with the best predicted performance.
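Result-Combination can be illustrated with a minimal sketch. It assumes that one predictor exists per operator and that their predicted performance values are comparable across operators, which the slides do not spell out. `result_combination`, the operator labels, and the toy predictors are all hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple


def result_combination(
    candidates_by_op: Dict[str, List[str]],
    predictors: Dict[str, Callable[[str], float]],
) -> Optional[Tuple[str, str, float]]:
    """Score each operator's candidates with that operator's own predictor,
    then return (operator, reformulation, predicted score) for the best one."""
    best: Optional[Tuple[str, str, float]] = None
    for op, candidates in candidates_by_op.items():
        predict = predictors[op]
        for candidate in candidates:
            score = predict(candidate)
            if best is None or score > best[2]:
                best = (op, candidate, score)
    return best


# Toy, made-up predictors used only to exercise the selection step.
toy_candidates = {
    "phrase": ['"green tree" street', 'green "tree street"'],
    "necessity": ["green tree +street", "+green tree street"],
}
toy_predictors = {
    "phrase": {'"green tree" street': 0.41, 'green "tree street"': 0.22}.get,
    "necessity": {"green tree +street": 0.47, "+green tree street": 0.18}.get,
}
print(result_combination(toy_candidates, toy_predictors))
```

Operator-Combination would instead place candidates from all operators into a single pool scored by one jointly trained model.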
Experiments • Automatic reformulation: works in the negative feedback scenario • Necessity operator: more useful for long queries • Phrase operator: more useful for short queries • Result-Combination: better than Operator-Combination • Syntactic reformulation: yields further improvement over existing negative feedback methods
Case studies • Discover representative keywords/phrases
Case studies • Discover undermatched concepts
Case studies • Eliminate ambiguities caused by matching keywords separately
Conclusion • Automatic query reformulation through the use of syntactic operators • Formulate automatic syntactic reformulation as a supervised learning problem under the framework of learning to rank • Propose a set of effective features to represent the characteristics of syntactic queries • The method is general and applicable to more syntactic operators