Question Classification (cont’d)


Presentation Transcript


  1. Question Classification (cont’d) Ling573 NLP Systems & Applications April 12, 2011

  2. Upcoming Events • Two seminars: Friday 3:30 • Linguistics seminar: • Janet Pierrehumbert: Northwestern • Example-Based Learning and the Dynamics of the Lexicon • AI Seminar: • Patrick Pantel: MSR • Associating Web Queries with Strongly-Typed Entities

  3. Roadmap • Question classification variations: • SVM classifiers • Sequence classifiers • Sense information improvements • Question series

  4. Question Classification with Support Vector Machines • Hacioglu & Ward 2003 • Same taxonomy, training, test data as Li & Roth

  5. Question Classification with Support Vector Machines • Hacioglu & Ward 2003 • Same taxonomy, training, test data as Li & Roth • Approach: • Shallow processing • Simpler features • Strong discriminative classifiers

  7. Features & Processing • Contrast: (Li & Roth) • POS, chunk info; NE tagging; other sense info

  8. Features & Processing • Contrast: (Li & Roth) • POS, chunk info; NE tagging; other sense info • Preprocessing: • Only letters, convert to lower case, stopped, stemmed

  9. Features & Processing • Contrast: (Li & Roth) • POS, chunk info; NE tagging; other sense info • Preprocessing: • Only letters, convert to lower case, stopped, stemmed • Terms: • Most informative 2000 word N-grams • IdentiFinder NE tags (7 or 9 tags)
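As a rough illustration of this shallow pipeline (not the authors' code), the sketch below keeps letters only, lowercases, drops stopwords, crudely strips suffixes, and then collects the most frequent word n-grams as features; the stop list, stemming rule, vocabulary size, and toy questions are illustrative stand-ins for the actual stop list, stemmer, and training data:

```python
import re
from collections import Counter

def preprocess(question):
    """Letters only, lowercase, stop, and crudely stem (toy stand-ins)."""
    stopwords = {"the", "a", "an", "of", "is", "does", "do"}
    tokens = re.findall(r"[a-z]+", question.lower())
    tokens = [t for t in tokens if t not in stopwords]
    # Toy suffix stripping in place of a real stemmer.
    return [re.sub(r"(ing|ed|s)$", "", t) if len(t) > 4 else t
            for t in tokens]

def top_ngrams(questions, n=2, vocab_size=2000):
    """Collect the most frequent word n-grams to use as features."""
    counts = Counter()
    for q in questions:
        toks = preprocess(q)
        counts.update(tuple(toks[i:i + n]) for i in range(len(toks) - n + 1))
    return [gram for gram, _ in counts.most_common(vocab_size)]

qs = ["Who wrote Hamlet?", "How many dogs pull a sled at the Iditarod?"]
print(top_ngrams(qs))
```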

  10. Classification & Results • Employs support vector machines for classification • Best results: Bi-gram, 7 NE classes

  11. Classification & Results • Employs support vector machines for classification • Best results: Bi-gram, 7 NE classes • Better than Li & Roth w/POS+chunk, but no semantics • Fewer NE categories better • More categories, more errors

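A minimal sketch of this setup: bag-of-n-gram features fed to a linear SVM. The questions and coarse labels below are toy stand-ins for the Li & Roth training data, and scikit-learn's LinearSVC stands in for the paper's SVM implementation:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Toy training set; labels follow the coarse answer-type taxonomy.
train_qs = [
    "Who wrote Hamlet ?",
    "Who is the CEO of IBM ?",
    "How many dogs pull a sled ?",
    "How much does a rhino weigh ?",
    "Where is the Louvre ?",
    "Where is Lake Victoria ?",
]
train_labels = ["HUMAN", "HUMAN", "NUMERIC", "NUMERIC", "LOCATION", "LOCATION"]

# Unigram + bigram counts into a one-vs-rest linear SVM.
clf = make_pipeline(CountVectorizer(ngram_range=(1, 2)), LinearSVC())
clf.fit(train_qs, train_labels)
print(clf.predict(["Who painted the Mona Lisa ?"])[0])
```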

  14. Enhanced Answer Type Inference … Using Sequential Models • Krishnan, Das, and Chakrabarti 2005 • Improves QC with CRF extraction of ‘informer spans’

  15. Enhanced Answer Type Inference … Using Sequential Models • Krishnan, Das, and Chakrabarti 2005 • Improves QC with CRF extraction of ‘informer spans’ • Intuition: • Humans identify Atype from few tokens w/little syntax

  16. Enhanced Answer Type Inference … Using Sequential Models • Krishnan, Das, and Chakrabarti 2005 • Improves QC with CRF extraction of ‘informer spans’ • Intuition: • Humans identify Atype from few tokens w/little syntax • Who wrote Hamlet?

  18. Enhanced Answer Type Inference … Using Sequential Models • Krishnan, Das, and Chakrabarti 2005 • Improves QC with CRF extraction of ‘informer spans’ • Intuition: • Humans identify Atype from few tokens w/little syntax • Who wrote Hamlet? • How many dogs pull a sled at Iditarod?

  20. Enhanced Answer Type Inference … Using Sequential Models • Krishnan, Das, and Chakrabarti 2005 • Improves QC with CRF extraction of ‘informer spans’ • Intuition: • Humans identify Atype from few tokens w/little syntax • Who wrote Hamlet? • How many dogs pull a sled at Iditarod? • How much does a rhino weigh?

  22. Enhanced Answer Type Inference … Using Sequential Models • Krishnan, Das, and Chakrabarti 2005 • Improves QC with CRF extraction of ‘informer spans’ • Intuition: • Humans identify Atype from few tokens w/little syntax • Who wrote Hamlet? • How many dogs pull a sled at Iditarod? • How much does a rhino weigh? • Single contiguous span of tokens

  23. Enhanced Answer Type Inference … Using Sequential Models • Krishnan, Das, and Chakrabarti 2005 • Improves QC with CRF extraction of ‘informer spans’ • Intuition: • Humans identify Atype from few tokens w/little syntax • Who wrote Hamlet? • How many dogs pull a sled at Iditarod? • How much does a rhino weigh? • Single contiguous span of tokens • How much does a rhino weigh?

  24. Enhanced Answer Type Inference … Using Sequential Models • Krishnan, Das, and Chakrabarti 2005 • Improves QC with CRF extraction of ‘informer spans’ • Intuition: • Humans identify Atype from few tokens w/little syntax • Who wrote Hamlet? • How many dogs pull a sled at Iditarod? • How much does a rhino weigh? • Single contiguous span of tokens • How much does a rhino weigh? • Who is the CEO of IBM?
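The intuition can be made concrete by annotating each question with its informer span as a (start, end) token range. The slides highlighted the spans visually, and that highlighting is lost in this transcript, so the particular annotations below are illustrative guesses rather than the paper's gold spans:

```python
# Each example pairs tokenized question text with a hand-annotated
# informer span, given as a half-open [start, end) token range.
examples = [
    ("Who is the CEO of IBM ?".split(), (3, 4)),                   # "CEO"
    ("How many dogs pull a sled at Iditarod ?".split(), (2, 3)),   # "dogs"
    ("What is Bill Clinton 's wife 's profession ?".split(), (7, 8)),
]

for tokens, (start, end) in examples:
    assert 0 <= start < end <= len(tokens)   # one contiguous span
    print(" ".join(tokens), "->", " ".join(tokens[start:end]))
```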

  25. Informer Spans as Features • Sensitive to question structure • What is Bill Clinton’s wife’s profession?

  27. Informer Spans as Features • Sensitive to question structure • What is Bill Clinton’s wife’s profession? • Idea: Augment Q classifier word ngrams w/IS info

  28. Informer Spans as Features • Sensitive to question structure • What is Bill Clinton’s wife’s profession? • Idea: Augment Q classifier word ngrams w/IS info • Informer span features: • IS ngrams

  29. Informer Spans as Features • Sensitive to question structure • What is Bill Clinton’s wife’s profession? • Idea: Augment Q classifier word ngrams w/IS info • Informer span features: • IS ngrams • Informer ngram hypernyms: • Generalize over words or compounds

  30. Informer Spans as Features • Sensitive to question structure • What is Bill Clinton’s wife’s profession? • Idea: Augment Q classifier word ngrams w/IS info • Informer span features: • IS ngrams • Informer ngram hypernyms: • Generalize over words or compounds • WSD?

  31. Informer Spans as Features • Sensitive to question structure • What is Bill Clinton’s wife’s profession? • Idea: Augment Q classifier word ngrams w/IS info • Informer span features: • IS ngrams • Informer ngram hypernyms: • Generalize over words or compounds • WSD? No
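The informer-span feature set can be sketched as follows: word n-grams restricted to the informer span, plus hypernyms of the informer words. The HYPERNYMS table is a tiny hand-made stand-in for WordNet lookups, and, as on the slide, all hypernyms are taken without word sense disambiguation:

```python
# Toy stand-in for WordNet hypernym lookups.
HYPERNYMS = {
    "profession": ["occupation", "activity"],
    "dogs": ["canine", "animal"],
    "ceo": ["executive", "person"],
}

def informer_features(tokens, span):
    """IS unigrams/bigrams plus hypernyms of informer words (no WSD)."""
    start, end = span                       # half-open [start, end) range
    informer = [t.lower() for t in tokens[start:end]]
    feats = [f"IS={w}" for w in informer]
    feats += [f"IS={a}_{b}" for a, b in zip(informer, informer[1:])]
    for w in informer:                      # all hypernyms, no disambiguation
        feats += [f"HYP={h}" for h in HYPERNYMS.get(w, [])]
    return feats

toks = "What is Bill Clinton 's wife 's profession ?".split()
print(informer_features(toks, (7, 8)))
# ['IS=profession', 'HYP=occupation', 'HYP=activity']
```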

  32. Effect of Informer Spans • Classifier: Linear SVM + multiclass

  33. Effect of Informer Spans • Classifier: Linear SVM + multiclass • Notable improvement for IS hypernyms

  34. Effect of Informer Spans • Classifier: Linear SVM + multiclass • Notable improvement for IS hypernyms • Better than all hypernyms – filter sources of noise • Biggest improvements for ‘what’, ‘which’ questions

  36. Perfect vs CRF Informer Spans

  37. Recognizing Informer Spans • Idea: contiguous spans, syntactically governed

  38. Recognizing Informer Spans • Idea: contiguous spans, syntactically governed • Use sequential learner w/syntactic information

  39. Recognizing Informer Spans • Idea: contiguous spans, syntactically governed • Use sequential learner w/syntactic information • Tag spans with B(egin), I(nside), O(utside) • Employ syntax to capture long range factors

  40. Recognizing Informer Spans • Idea: contiguous spans, syntactically governed • Use sequential learner w/syntactic information • Tag spans with B(egin), I(nside), O(utside) • Employ syntax to capture long range factors • Matrix of features derived from parse tree

  41. Recognizing Informer Spans • Idea: contiguous spans, syntactically governed • Use sequential learner w/syntactic information • Tag spans with B(egin), I(nside), O(utside) • Employ syntax to capture long range factors • Matrix of features derived from parse tree • Cell: x[i,l], i is position, l is depth in parse tree (only 2 levels used) • Values: • Tag: POS or constituent label at that position • Num: number of preceding chunks with same tag
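A rough sketch of how the x[i,l] matrix might be tabulated: for each token position and each of the two parse levels, record the covering label (Tag) and how many earlier chunks at that level carry the same label (Num). The per-level chunk sequences for the example are hand-written here; a real system would read them off a parser's output:

```python
def build_matrix(levels):
    """levels: per parse level, a list of (label, token_width) chunks."""
    matrix = []
    for chunks in levels:
        row, seen = [], {}
        for label, width in chunks:
            num = seen.get(label, 0)         # same-label chunks seen so far
            seen[label] = num + 1
            row += [(label, num)] * width    # (Tag, Num) for each token
        matrix.append(row)
    return matrix

# "Who is the CEO of IBM ?" -- level 1: POS tags; level 2: chunks.
level1 = [("WP", 1), ("VBZ", 1), ("DT", 1), ("NN", 1), ("IN", 1),
          ("NNP", 1), (".", 1)]
level2 = [("WHNP", 1), ("VP", 1), ("NP", 2), ("PP", 2), (".", 1)]
x = build_matrix([level1, level2])
print(x[1][3])   # ('NP', 0): token "CEO" sits in the first NP at level 2
```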

  42. Parser Output • Parse

  43. Parse Tabulation • Encoding and table:

  44. CRF Indicator Features • Cell: • IsTag, IsNum: e.g. y_4=1 and x[4,2].tag=NP • Also, IsPrevTag, IsNextTag

  45. CRF Indicator Features • Cell: • IsTag, IsNum: e.g. y_4=1 and x[4,2].tag=NP • Also, IsPrevTag, IsNextTag • Edge: • IsEdge(u,v): y_{i-1}=u and y_i=v • IsBegin, IsEnd

  46. CRF Indicator Features • Cell: • IsTag, IsNum: e.g. y_4=1 and x[4,2].tag=NP • Also, IsPrevTag, IsNextTag • Edge: • IsEdge(u,v): y_{i-1}=u and y_i=v • IsBegin, IsEnd • All features improve

  47. CRF Indicator Features • Cell: • IsTag, IsNum: e.g. y_4=1 and x[4,2].tag=NP • Also, IsPrevTag, IsNextTag • Edge: • IsEdge(u,v): y_{i-1}=u and y_i=v • IsBegin, IsEnd • All features improve • Question accuracy: Oracle: 88%; CRF: 86.2%
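The indicator features can be sketched as feature dictionaries per position: IsTag/IsNum fire on a cell's contents at each level, IsPrevTag/IsNextTag on neighbouring cells, and IsEdge on the (y_{i-1}, y_i) label pair. Feature names are illustrative, and the CRF training itself is not shown:

```python
def cell_features(matrix, i):
    """Indicator features for token position i over both parse levels."""
    feats = {}
    for l, row in enumerate(matrix, start=1):
        tag, num = row[i]
        feats[f"IsTag[{l}]={tag}"] = 1
        feats[f"IsNum[{l}]={num}"] = 1
        if i > 0:
            feats[f"IsPrevTag[{l}]={row[i - 1][0]}"] = 1
        if i < len(row) - 1:
            feats[f"IsNextTag[{l}]={row[i + 1][0]}"] = 1
    return feats

def edge_features(prev_label, label):
    """Indicator on the (y_{i-1}, y_i) label transition."""
    return {f"IsEdge=({prev_label},{label})": 1}

# Hand-written two-level matrix for "Who is the CEO".
matrix = [[("WP", 0), ("VBZ", 0), ("DT", 0), ("NN", 0)],
          [("WHNP", 0), ("VP", 0), ("NP", 0), ("NP", 0)]]
print(cell_features(matrix, 3))
print(edge_features("O", "B"))
```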

  48. Question Classification Using Headwords and Their Hypernyms • Huang, Thint, and Qin 2008 • Questions: • Why didn’t WordNet/Hypernym features help in L&R?

  49. Question Classification Using Headwords and Their Hypernyms • Huang, Thint, and Qin 2008 • Questions: • Why didn’t WordNet/Hypernym features help in L&R? • Best results in L&R - ~200,000 feats; ~700 active • Can we do as well with fewer features?

  50. Question Classification Using Headwords and Their Hypernyms • Huang, Thint, and Qin 2008 • Questions: • Why didn’t WordNet/Hypernym features help in L&R? • Best results in L&R - ~200,000 feats; ~700 active • Can we do as well with fewer features? • Approach: • Refine features:
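To give the flavour of the headword idea, the heuristic below grabs the first content noun after the wh-phrase. It is a crude hand-made stand-in: the paper extracts the syntactic head of the question's noun phrase from a parse, which this rule does not do:

```python
import re

def headword(question):
    """Very rough heuristic: first content word after the wh-phrase."""
    skip = {"what", "which", "who", "is", "are", "was", "were",
            "the", "a", "an", "name", "of"}
    for tok in re.findall(r"[a-z']+", question.lower()):
        if tok not in skip:
            return tok
    return None

print(headword("What is the capital of France?"))   # capital
```

The headword (and, in the paper, its WordNet hypernyms) then serves as a compact feature in place of hundreds of thousands of sparse n-gram features.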
