Prior Knowledge Driven Domain Adaptation Gourab Kundu, Ming-wei Chang, and Dan Roth


Presentation Transcript


  1. Prior Knowledge Driven Domain Adaptation. Gourab Kundu, Ming-wei Chang, and Dan Roth

Domain Adaptation
• Problem: The performance of statistical systems drops significantly when they are tested on a domain different from the training domain. Example: in the CoNLL 2007 shared task, the annotation standard differed between the source and target domains.
• Motivation: Prior knowledge is cheap and readily available for many domains.
• Solution: Use prior knowledge about the target domain for better adaptation.

Constrained Conditional Model
• Incorporate prior knowledge as constraints c = {Cj(.)}.
• Learn the weight vector w ignoring c.
• Impose the constraints c at inference time.

PDA-KW
• Incorporate target-domain-specific knowledge c' = {C'k(.)} as constraints.
• Impose the constraints c and c' at inference time: adaptation without retraining.
• For POS tagging, we have no domain-independent knowledge; for SRL, we use some domain-independent knowledge. Example: two arguments cannot overlap.

POS Tagging
• Example: "I eat fruits ." is tagged PRP VB NNS .
• When a POS tagger trained on the WSJ domain is tested on the Bio domain, F1 drops 9%.

Semantic Role Labeling (SRL)
• Example: "I eat fruits ." is labeled A0 V A1.
• When an SRL system trained on the WSJ domain is tested on OntoNotes, F1 drops 18%.

Prior Knowledge on OntoNotes
• Be verbs are unseen in the training domain.
• If a be verb is immediately followed by another verb, there can be no core argument. Example: "John is eating."
• If a be verb is followed by the word "like", the core arguments A0 and A1 are possible. Example: "And he's like why's the door open?"
• Otherwise, A1 and A2 are possible. Example: "John is a good man."

Experimental Results
• After adding knowledge, POS tagging error is reduced by 42%; SRL error is reduced by 25% on be verbs and by 9% on all verbs.

Comparison with JiangZh07
• Without using any labeled data, prior knowledge reduces error by 38% compared with using 300 labeled sentences.
• Without using any labeled data, prior knowledge recovers 72% of the accuracy gain from adding 2730 labeled sentences.
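The CCM recipe described on the poster (learn the weight vector w while ignoring the constraints, then impose c and c' only at inference time) can be sketched with a toy tagger. Everything below is an illustrative assumption: the per-token scores stand in for a learned w, the hyphen rule is one of the poster's BioMed constraints, and brute-force search over tag sequences stands in for the ILP or Viterbi inference a real system would use.

```python
from itertools import product

# Hypothetical per-(word, tag) scores standing in for w . phi(x, y).
SCORES = {
    ("H-ras", "NN"): 0.2, ("H-ras", "NNP"): 0.6,
    ("gene", "NN"): 0.9, ("gene", "VB"): 0.1,
}
TAGS = ["NN", "NNP", "VB"]

def score(words, tags):
    # Total model score of a tag sequence (unseen pairs score 0).
    return sum(SCORES.get((w, t), 0.0) for w, t in zip(words, tags))

def hyphen_constraint(words, tags):
    # Target-domain rule from the poster: hyphenated compounds are NN.
    return all(t == "NN" for w, t in zip(words, tags) if "-" in w)

def constrained_argmax(words, constraints):
    # Impose the constraints only at inference time: search the (small)
    # output space and keep assignments that satisfy every constraint.
    best, best_s = None, float("-inf")
    for tags in product(TAGS, repeat=len(words)):
        if all(c(words, tags) for c in constraints):
            s = score(words, tags)
            if s > best_s:
                best, best_s = list(tags), s
    return best

print(constrained_argmax(["H-ras", "gene"], []))                  # model alone
print(constrained_argmax(["H-ras", "gene"], [hyphen_constraint])) # with knowledge
```

With no constraints the model keeps its (wrong, source-domain) preference for NNP on "H-ras"; adding the constraint flips it to NN without retraining, which is exactly the PDA-KW setting.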
PDA-ST
• Motivation: The constraints are accurate but apply rarely. Can we generalize to cases where the constraints did not apply?
• Solution: Embed the constraints into self-training.
• Ds: source-domain labeled data. Du: target-domain unlabeled data. Dt: target-domain test data.
• (Figure: frame file of the "be" verb.)

Self-training
• Motivation: How good is self-training without knowledge?
• Same as PDA-ST, except that the red-boxed line of the algorithm (the constrained labeling step) is replaced with its unconstrained counterpart.

Prior Knowledge on BioMed
• Hyphenated compounds are tagged as NN. Example: H-ras
• Digit-letter combinations are tagged as NN. Example: CTNNB1
• A hyphen is tagged as HYPH.
• Any word unseen in the source domain that is followed by the word "gene" is tagged as NN. Example: ras gene
• If a word never appears with the tag NNP in the training data, predict NN instead of NNP. Example: polymerase chain reaction ( PCR )

Annotation wiki: "Only names of persons, locations, etc. are proper nouns, which are very few. Gene, disease, and drug names, etc. are marked as common nouns."

Conclusion
• Prior knowledge gives results competitive with using labeled data.
• Future work: improve the results for self-training; find theoretical justifications for self-training; apply PDA to more tasks and domains. Suggestions?

References
• J. Jiang and C. Zhai. Instance Weighting for Domain Adaptation in NLP. ACL 2007.
• G. Kundu and D. Roth. Adapting Text Instead of the Model: An Open Domain Approach. CoNLL 2011.
• J. Blitzer, R. McDonald, and F. Pereira. Domain Adaptation with Structural Correspondence Learning. EMNLP 2006.

This research is sponsored by ARL and DARPA under the Machine Reading program.
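The PDA-ST idea of embedding constraints into self-training can be sketched as follows. This is a minimal sketch under loud assumptions: the unigram tagger stands in for the paper's actual learner, the NNP default for unseen words is an assumption, and only one of the poster's BioMed rules is shown. The point is the loop structure: pseudo-labels drawn from the unlabeled target data Du are forced to satisfy the constraints before they are added to the training set, so the retrained model generalizes the rule to cases the rule itself never fires on.

```python
from collections import Counter, defaultdict

def train(pairs):
    # Stand-in for learning the weight vector w: a unigram tagger that
    # remembers the most frequent tag for each word.
    counts = defaultdict(Counter)
    for word, tag in pairs:
        counts[word][tag] += 1
    return {w: c.most_common(1)[0][0] for w, c in counts.items()}

def digit_letter_rule(word, tag):
    # BioMed rule from the poster: digit-letter combinations
    # (e.g. CTNNB1) should be tagged NN.
    if any(ch.isdigit() for ch in word) and any(ch.isalpha() for ch in word):
        return "NN"
    return tag

def predict(model, words, rules):
    # Tag each word, then impose the target-domain constraints on the output.
    tags = []
    for w in words:
        t = model.get(w, "NNP")  # default for unseen words (assumption)
        for rule in rules:
            t = rule(w, t)
        tags.append(t)
    return tags

def pda_st(Ds, Du, rules, rounds=2):
    # PDA-ST sketch: self-training in which the pseudo-labels taken from
    # the unlabeled target data Du already satisfy the constraints.
    labeled = list(Ds)
    for _ in range(rounds):
        model = train(labeled)
        for sent in Du:
            labeled.extend(zip(sent, predict(model, sent, rules)))
    return train(labeled)

Ds = [("John", "NNP"), ("eats", "VBZ"), ("protein", "NN")]  # toy source data
Du = [["CTNNB1", "eats"]]                                   # toy target data
model = pda_st(Ds, Du, [digit_letter_rule])
print(model["CTNNB1"])  # NN: the constraint propagated into the model
```

Dropping the `rules` argument from the `predict` call turns this into the plain self-training baseline the poster compares against.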
