MEASUREMENT AND MODELING OF A WEB-BASED QUESTION ANSWERING SYSTEM

Chunyi Peng, Zaoyang Gong, Guobin Shen. Microsoft Research Asia. HotWeb 2006.

Presentation Transcript


  1. MEASUREMENT AND MODELING OF A WEB-BASED QUESTION ANSWERING SYSTEM • Chunyi Peng, Zaoyang Gong, Guobin Shen • Microsoft Research Asia • HotWeb 2006

  2. When you have a question… • Solve it yourself! – Ooh, out of our scope! • Usually, search it! – A common and good way in many cases, but • Search engines typically return pages of links, not direct answers. • Sometimes it is very difficult for people to describe their questions precisely. • Not all information is readily available on the web. • So, ask! – A natural and effective way • Question-Answering (QA) utilizes grassroots intelligence and collaboration • Especially useful for acquiring specific information.

  3. So, our goals… • Measurement and modeling of a real large-scale QA system • How does a real QA system work? • What are the typical user behaviors and their impacts? • Seek a better QA system • How to design a QA system? • How to make performance tradeoffs?

  4. iAsk (http://iask.sina.com.cn) • A topic-based web-QA system • Question lifecycle: • questioning -> wait for reply -> confirmation (closed) • Provides optimal reply selection and reply rewarding

  5. Measurement Results • Data set • Two months (Nov 22, 2005 to Jan 23, 2006) • 350K questions and 2M replies • 220K users, 1901 topics • Measurements of • Question/reply patterns over time • Question/reply patterns over topics • Question/reply patterns across users • Question/reply incentive mechanisms

  6. Behavior Pattern over Time • On an hourly scale: a consistent usage pattern

  7. Behavior Pattern over Topics • Topic characteristics • P: Popularity (#Q) (Zipf-Popularity) • questioning and replying activities • Q: Question Proneness (#Q/#U) • the likelihood that a user will ask a question • R: Reply Proneness (#R/#U) • the likelihood that a user will reply to a question • Our measurement shows that topic characteristics vary widely and that users behave quite differently across topics.
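
The three topic characteristics on this slide can be computed from an activity log roughly as follows. This is a minimal sketch, not the authors' code: the `(topic, user, kind)` event format is hypothetical, and #U is assumed here to mean the number of distinct users active in a topic, which the slides do not spell out.

```python
from collections import defaultdict

def topic_characteristics(events):
    """events: iterable of (topic, user, kind), kind being "Q" or "R" (assumed format)."""
    questions = defaultdict(int)   # #Q per topic
    replies = defaultdict(int)     # #R per topic
    users = defaultdict(set)       # distinct users per topic (assumed meaning of #U)
    for topic, user, kind in events:
        users[topic].add(user)
        if kind == "Q":
            questions[topic] += 1
        elif kind == "R":
            replies[topic] += 1
    stats = {}
    for topic, members in users.items():
        n_users = len(members)
        stats[topic] = {
            "popularity": questions[topic],                    # P = #Q
            "question_proneness": questions[topic] / n_users,  # Q = #Q / #U
            "reply_proneness": replies[topic] / n_users,       # R = #R / #U
        }
    return stats

# Toy log, invented purely for illustration.
events = [
    ("sports", "u1", "Q"),
    ("sports", "u2", "R"),
    ("sports", "u3", "R"),
    ("health", "u1", "Q"),
    ("health", "u1", "Q"),
    ("health", "u4", "R"),
]
stats = topic_characteristics(events)
```

Sorting topics by `popularity` and plotting rank against count on log-log axes would be one way to check the Zipf-like shape the slide mentions.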

  8. Behavior Pattern across Users • Active and non-active users • about 9% of users contribute 80% of replies vs. about 22% of users contributing 80% of questions • asymmetric questioning/replying pattern • 4.7% altruists vs. 17.7% free-riders • Narrow user interests • #topic (Q): 1.8 • #topic (R): 3.3
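
A concentration figure such as "about 9% of users contribute 80% of replies" can be obtained by sorting users by activity and accumulating until the target share is reached. The sketch below shows that calculation on invented per-user counts; the function name and the data are illustrative assumptions, not taken from the measurement.

```python
def user_share_for_activity(counts, target=0.8):
    """Smallest fraction of users whose combined activity reaches `target` of the total."""
    ordered = sorted(counts, reverse=True)  # most active users first
    total = sum(ordered)
    if total == 0:
        return 0.0
    running = 0
    for rank, count in enumerate(ordered, start=1):
        running += count
        if running >= target * total:
            return rank / len(ordered)
    return 1.0

# Invented reply counts for ten users (not measured data).
reply_counts = [60, 25, 5, 4, 2, 2, 1, 1, 0, 0]
share = user_share_for_activity(reply_counts)
```

Running the same function over per-user question counts and per-user reply counts gives the two numbers the slide contrasts.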

  9. Performance Metrics • Reply-Rate • how likely a user's question is to receive a reply • Reply-Number • how likely the question is to get the expected answer • Reply-Latency • how quickly the user gets an answer
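
One possible way to compute the three metrics from a question log is sketched below. The slide gives only informal definitions, so several choices here are assumptions: Reply-Rate is taken as the fraction of questions with at least one reply, Reply-Number as the mean number of replies over all questions, and Reply-Latency as the mean time to the first reply over replied questions. The log layout is hypothetical.

```python
from statistics import mean

def qa_metrics(questions):
    """questions: non-empty dict of question id -> (ask_time_hr, [reply_time_hr, ...])."""
    replied = [(asked, times) for asked, times in questions.values() if times]
    # Assumed definition: share of questions that got at least one reply.
    reply_rate = len(replied) / len(questions)
    # Assumed definition: average reply count over all questions.
    reply_number = mean(len(times) for _, times in questions.values())
    # Assumed definition: average delay until the first reply, replied questions only.
    reply_latency = mean(min(times) - asked for asked, times in replied) if replied else None
    return reply_rate, reply_number, reply_latency

# Toy log with times in hours, invented for illustration.
log = {
    "q1": (0.0, [2.0, 5.0, 30.0]),
    "q2": (1.0, [4.0]),
    "q3": (3.0, []),
    "q4": (6.0, [7.0, 8.0]),
}
rate, number, latency = qa_metrics(log)
```

A "within 24 hrs" variant would drop any reply later than 24 hours after its question before computing the same quantities.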

  10. iAsk Performance • Long-term performance: • Reply-Rate: 99.8% • Reply-Number: about 5 • Reply-Latency: about 10 hr • Within 24 hr: • Reply-Rate: 85% • Reply-Number: about 4 • Reply-Latency: about 6 hr • In summary, the performance is quite satisfactory, except that users sometimes need to tolerate a relatively long delay

  11. Measurement on Incentive Mechanism

  12. Modeling • Question arrival distribution: Poisson • Reply behavior: an approximately exponentially decaying model • Performance formula • Define dynamic performance
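
The two modeling ingredients named on this slide can be illustrated with a small sketch. It is not the authors' model, whose formulas are not in the transcript: the arrival rate, the decay constant `tau_hr`, and both function names are assumptions chosen only for illustration. Poisson arrivals are generated from exponential inter-arrival gaps, and an exponentially decaying reply density proportional to exp(-t/tau) implies that a fraction 1 - exp(-t/tau) of the eventual replies has arrived by time t.

```python
import math
import random

def poisson_arrival_times(rate_per_hr, horizon_hr, rng):
    """Arrival times of a Poisson process, built from exponential inter-arrival gaps."""
    times, t = [], 0.0
    while True:
        t += rng.expovariate(rate_per_hr)
        if t > horizon_hr:
            return times
        times.append(t)

def fraction_replied_by(t_hr, tau_hr):
    """Share of eventual replies arrived by t_hr, assuming density ~ exp(-t / tau_hr)."""
    return 1.0 - math.exp(-t_hr / tau_hr)

rng = random.Random(0)  # fixed seed so the run is repeatable
# Hypothetical parameters: 10 questions per hour over one day.
arrivals = poisson_arrival_times(rate_per_hr=10.0, horizon_hr=24.0, rng=rng)
# Hypothetical decay constant of 6 hours.
early_share = fraction_replied_by(24.0, tau_hr=6.0)
```

Fitting `rate_per_hr` and `tau_hr` to measured arrival and reply timestamps would be the natural next step, which this sketch does not attempt.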

  13. Parameter Impact

  14. Possible Improvements • Active or push-based question delivery • Better webpage layout, e.g. adding shortcuts • Better incentive mechanism • Utilize the power of social networks

  15. Conclusions • Web-QA, which leverages grassroots intelligence and collaboration, is hot and getting hotter… • Our measurement and model reveal that a QA system's QoS depends heavily on three key factors: user scale, user reply probability, and system design artifacts, e.g. webpage design. • The current simple web-QA system achieves acceptable performance, but there is still room for improvement

  16. Backup

  17. Behavior Pattern over Topics • Topic characteristics • P--Popularity (#Q) (Zipf-Popularity)

  18. Behavior Pattern over Topics • Topic characteristics • P: Popularity (#Q), Zipf-Popularity • Q: Question Proneness (#Q/#U) • R: Reply Proneness (#R/#U)

  19. Narrow User Interest Scope

  20. Reply distribution (measured)

  21. Static Performance Formula • Reply-Rate • Reply-Number • Reply-Latency

  22. Dynamic Performance Formula Define dynamic performance We have,
