
Segmentation of Multi-Sentence Questions: Towards Effective Question Retrieval in cQA Services

Kai Wang, Zhao-Yan Ming, Xia Hu, Tat-Seng Chua. SIGIR ’10. Speaker: Hsin-Lan, Wang. Date: 2011/03/07.


Presentation Transcript


  1. Segmentation of Multi-Sentence Questions: Towards Effective Question Retrieval in cQA Services Kai Wang, Zhao-Yan Ming, Xia Hu, Tat-Seng Chua SIGIR’10 Speaker: Hsin-Lan, Wang Date: 2011/03/07

  2. Outline • Introduction • Question Sentence Detection • Sequential Pattern Mining • Syntactic Shallow Pattern Mining • Model Learning • Multi-Sentence Question Segmentation • Building Graphs for Question Threads • Propagating the Closeness Scores • Segmentation-aided Retrieval • Experiment • Conclusion

  3. Introduction • cQA: Community-based Question Answering services

  4. Introduction • This paper introduces a new graph-based approach to segmenting multi-sentence questions. • Basic idea: • Detect question sentences • Measure the closeness score • Model their relationships to form a graph • Use the graph to propagate the closeness scores • Group topically related sentences

  5. Question Sentence Detection • Human-generated content on the Web is usually informal. • Solution: use salient sequential and syntactic patterns as features to build a question detector.

  6. Question Sentence Detection • Sequential Pattern Mining • A sequential pattern is also referred to as a Labeled Sequential Pattern, written S→C, where C is the class label assigned to the sequence S. • A sequence is a series of tokens from a sentence, and the class is binary: {Q, NQ}.

  7. Question Sentence Detection • Sequential Pattern Mining • The purpose is to extract a set of frequent subsequences of words that are indicative of questions. • POS tags replace all tokens except some keywords, e.g. <any1, know, what>→<any1, VB, what>
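
The generalization step on this slide can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the keyword set `KEYWORDS`, the toy tag lookup `TOY_POS`, and both helper names are assumptions made up for the example, and a real system would use a trained POS tagger instead of a dictionary.

```python
# Hypothetical keyword list: tokens kept verbatim instead of being POS-tagged.
KEYWORDS = {"any1", "what", "how", "why", "who", "where", "when"}

# Toy tag lookup standing in for a real POS tagger (assumption for illustration).
TOY_POS = {"know": "VB", "is": "VBZ", "the": "DT", "answer": "NN"}


def generalize(tokens):
    """Replace every non-keyword token with its POS tag."""
    return [t if t in KEYWORDS else TOY_POS.get(t, "UNK") for t in tokens]


def contains_pattern(sequence, pattern):
    """Return True if `pattern` occurs in `sequence` as an ordered,
    not necessarily contiguous, subsequence."""
    it = iter(sequence)
    # `item in it` consumes the iterator, so order is enforced.
    return all(item in it for item in pattern)


tokens = ["any1", "know", "what", "the", "answer", "is"]
generalized = generalize(tokens)
print(generalized)
print(contains_pattern(generalized, ["any1", "VB", "what"]))
```

A sentence whose generalized token sequence contains a mined pattern could then have the corresponding feature switched on for the question detector.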

  8. Question Sentence Detection • Syntactic Shallow Pattern Mining

  9. Question Sentence Detection • Model Learning • Questions exhibit certain patterns, but it is unnatural to identify characteristic patterns for non-questions. • Solution: one-class SVM • Training data: sentences ending with question marks are assumed to form an initial set of positive examples.
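
The bootstrapping assumption for the training data can be illustrated with a short sketch. The function name and the sample sentences are invented for this example; the slide does not say how the authors filter or clean this initial set.

```python
def initial_positive_set(sentences):
    """Pick sentences that end with a question mark as initial positive examples."""
    return [s for s in sentences if s.rstrip().endswith("?")]


sentences = [
    "Does any1 know what this means?",
    "I bought it last week.",
    "Please tell me how to fix it",
    "Is it worth repairing? ",
]
positives = initial_positive_set(sentences)
print(positives)
```

Note that the third sentence is a request phrased without a question mark, so it is not in the initial set. Only positive examples are collected here, which is the setting a one-class learner is designed for.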

  10. Multi-Sentence Question Segmentation • Building Graphs for Question Threads • Vq: question sentence vertex set; Vc: context sentence vertex set • Model the question thread as a weighted graph (V,E).

  11. Multi-Sentence Question Segmentation • Building Graphs for Question Threads • Directed edge (u→v): • KL-divergence • Coherence • Coreference
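
As a rough illustration of the KL-divergence term, the sketch below compares smoothed unigram distributions of two sentences. The additive smoothing, its constant `alpha`, the helper names, and the sample sentences are all assumptions for this example; the slide does not specify how the language models are estimated or how the divergence is turned into an edge weight.

```python
import math
from collections import Counter


def unigram_model(tokens, vocab, alpha=1.0):
    """Additively smoothed unigram distribution over `vocab`.
    The smoothing constant `alpha` is an assumption for this sketch."""
    counts = Counter(tokens)
    total = len(tokens) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def kl_divergence(p, q):
    """D_KL(p || q) = sum over w of p(w) * log(p(w) / q(w))."""
    return sum(p[w] * math.log(p[w] / q[w]) for w in p)


u = "how do i reset my router".split()
v = "my router keeps dropping the wifi connection every night".split()
vocab = sorted(set(u) | set(v))
p_u, p_v = unigram_model(u, vocab), unigram_model(v, vocab)

print(kl_divergence(p_u, p_v))  # divergence in the direction u -> v
print(kl_divergence(p_v, p_u))  # generally differs from the line above
```

KL-divergence is asymmetric, so the two directions generally give different values, which is consistent with its appearing under directed edges.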

  12. Multi-Sentence Question Segmentation • Building Graphs for Question Threads • Undirected edge (u-v): • Cosine Similarity • Distance : proportional to the number of sentences between u and v.
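
The two undirected-edge signals on this slide could be computed along the following lines. This is a sketch under stated assumptions: raw term-frequency vectors are used for the cosine (the slide does not say which term weighting the authors use), and `scale` is a hypothetical proportionality constant.

```python
import math
from collections import Counter


def cosine_similarity(tokens_a, tokens_b):
    """Cosine similarity between raw term-frequency vectors."""
    a, b = Counter(tokens_a), Counter(tokens_b)
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    norm = norm_a * norm_b
    return dot / norm if norm else 0.0


def sentence_distance(index_u, index_v, scale=1.0):
    """Distance proportional to the number of sentences between u and v.
    Adjacent sentences have no sentence between them, so their distance is 0."""
    return scale * max(abs(index_u - index_v) - 1, 0)


thread = [
    "my router keeps dropping the connection".split(),
    "i already updated the firmware".split(),
    "how do i reset my router".split(),
]
print(cosine_similarity(thread[0], thread[2]))
print(sentence_distance(0, 2))
```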

  13. Multi-Sentence Question Segmentation • Building Graphs for Question Threads • Undirected edge (u-v): • Coherence • Coreference

  14. Multi-Sentence Question Segmentation • Propagating the Closeness Scores

  15. Multi-Sentence Question Segmentation • Propagating the Closeness Scores • Sort the edges in Er by closeness score: <e1, e2, …, en>. • The extraction process terminates at em when one of the following criteria is met:

  16. Multi-Sentence Question Segmentation • Propagating the Closeness Scores • Example: the final edge set {(q1,c1), (q2,c2), (q1,c2)} yields the question segments (q1–c1, c2) and (q2–c2).
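
The grouping in this example can be reproduced with a few lines of code. The function name is made up for illustration, and the sketch covers only the final grouping step, not the edge extraction that precedes it.

```python
from collections import defaultdict


def group_segments(edges):
    """Group context sentences under the question sentence each is linked to."""
    segments = defaultdict(list)
    for question, context in edges:
        segments[question].append(context)
    return dict(segments)


final_edges = [("q1", "c1"), ("q2", "c2"), ("q1", "c2")]
segments = group_segments(final_edges)
print(segments)  # {'q1': ['c1', 'c2'], 'q2': ['c2']}
```

The context sentence c2 ends up in both segments, so question segments are allowed to share context.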

  17. Multi-Sentence Question Segmentation • Segmentation-aided Retrieval

  18. Experiments • Evaluation of Question Detection • Dataset: collected by issuing getByCategory API queries to Yahoo! Answers. • Three datasets are generated: • Pattern Mining Set: 350k sentences extracted from 60k question threads. • Training Set: 130k sentences from another 60k question threads. • Testing Set: 2004 question sentences and 2039 non-question sentences tagged by two annotators.

  19. Experiments • Evaluation of Question Detection

  20. Experiments • Direct Assessment of Multi-Sentence Question Segmentation via User Study

  21. Experiments • Performance Evaluation on Question Retrieval with Segmentation Model

  22. Conclusion • The paper presents a new approach to segmenting multi-sentence questions. • It separates question sentences from non-question sentences and aligns them according to their closeness scores.
