Prior Knowledge Driven Domain Adaptation Gourab Kundu, Ming-wei Chang, and Dan Roth


Presentation Transcript


  1. Prior Knowledge Driven Domain Adaptation. Gourab Kundu, Ming-wei Chang, and Dan Roth

Domain Adaptation
• Problem: The performance of statistical systems drops significantly when they are tested on a domain different from the training domain. Example: in the CoNLL 2007 shared task, the annotation standard differed between the source and target domains.
• Motivation: Prior knowledge is cheap and readily available for many domains.
• Solution: Use prior knowledge about the target domain for better adaptation.

Constrained Conditional Model
• Incorporate prior knowledge as constraints c = {Cj(.)}.
• Learn the weight vector w ignoring c.
• Impose the constraints c at inference time.

PDA-KW
• Incorporate target-domain-specific knowledge c' = {C'k(.)} as constraints.
• Impose the constraints c and c' at inference time: adaptation without retraining.
• For POS tagging, we have no domain-independent knowledge; for SRL, we use some domain-independent knowledge. Example: two arguments cannot overlap.

POS Tagging
• Example: "I eat fruits ." is tagged PRP VB NNS .
• When a POS tagger trained on the WSJ domain is tested on the Bio domain, F1 drops 9%.

Semantic Role Labeling (SRL)
• Example: "I eat fruits ." is labeled A0 V A1.
• When an SRL system trained on the WSJ domain is tested on OntoNotes, F1 drops 18%.

Prior Knowledge on OntoNotes
• Be verbs are unseen in the training domain.
• If a be verb is immediately followed by another verb, there can be no core argument. Example: "John is eating."
• If a be verb is followed by the word "like", the core arguments A0 and A1 are possible. Example: "And he's like why's the door open?"
• Otherwise, A1 and A2 are possible. Example: "John is a good man."

Experimental Results
• After adding knowledge, POS tagging error is reduced by 42%; SRL error is reduced by 25% on be verbs and by 9% on all verbs.

Comparison with JiangZh07
• Without using any labeled data, prior knowledge reduces error by 38% compared with using 300 labeled sentences.
• Without using any labeled data, prior knowledge recovers 72% of the accuracy gain from adding 2730 labeled sentences.
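The CCM recipe described on the poster (learn the weight vector w while ignoring the constraints, then impose c and c' only at inference time) can be sketched with a toy tagger. Everything below is an illustrative assumption: the per-token scores stand in for a learned w, the hyphen rule is one of the poster's BioMed constraints, and brute-force search over tag sequences stands in for the ILP or Viterbi inference a real system would use.

```python
from itertools import product

# Hypothetical per-(word, tag) scores standing in for w . phi(x, y).
SCORES = {
    ("H-ras", "NN"): 0.2, ("H-ras", "NNP"): 0.6,
    ("gene", "NN"): 0.9, ("gene", "VB"): 0.1,
}
TAGS = ["NN", "NNP", "VB"]

def score(words, tags):
    # Total model score of a tag sequence (unseen pairs score 0).
    return sum(SCORES.get((w, t), 0.0) for w, t in zip(words, tags))

def hyphen_constraint(words, tags):
    # Target-domain rule from the poster: hyphenated compounds are NN.
    return all(t == "NN" for w, t in zip(words, tags) if "-" in w)

def constrained_argmax(words, constraints):
    # Impose the constraints only at inference time: search the (small)
    # output space and keep assignments that satisfy every constraint.
    best, best_s = None, float("-inf")
    for tags in product(TAGS, repeat=len(words)):
        if all(c(words, tags) for c in constraints):
            s = score(words, tags)
            if s > best_s:
                best, best_s = list(tags), s
    return best

print(constrained_argmax(["H-ras", "gene"], []))                  # model alone
print(constrained_argmax(["H-ras", "gene"], [hyphen_constraint])) # with knowledge
```

With no constraints the model keeps its (wrong, source-domain) preference for NNP on "H-ras"; adding the constraint flips it to NN without retraining, which is exactly the PDA-KW setting.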
PDA-ST
• Motivation: The constraints are accurate but apply rarely. Can we generalize to cases where the constraints did not apply?
• Solution: Embed the constraints into self-training.
• Ds: source-domain labeled data. Du: target-domain unlabeled data. Dt: target-domain test data.
• (Figure: frame file of the "be" verb.)

Self-training
• Motivation: How good is self-training without knowledge?
• Same as PDA-ST, except that the red-boxed line of the algorithm (the constrained labeling step) is replaced with its unconstrained counterpart.

Prior Knowledge on BioMed
• Hyphenated compounds are tagged as NN. Example: H-ras
• Digit-letter combinations are tagged as NN. Example: CTNNB1
• A hyphen is tagged as HYPH.
• Any word unseen in the source domain that is followed by the word "gene" is tagged as NN. Example: ras gene
• If a word never appears with the tag NNP in the training data, predict NN instead of NNP. Example: polymerase chain reaction ( PCR )

Annotation wiki: "Only names of persons, locations, etc. are proper nouns, which are very few. Gene, disease, and drug names, etc. are marked as common nouns."

Conclusion
• Prior knowledge gives results competitive with using labeled data.
• Future work: improve the results for self-training; find theoretical justifications for self-training; apply PDA to more tasks and domains. Suggestions?

References
• J. Jiang and C. Zhai. Instance Weighting for Domain Adaptation in NLP. ACL 2007.
• G. Kundu and D. Roth. Adapting Text Instead of the Model: An Open Domain Approach. CoNLL 2011.
• J. Blitzer, R. McDonald, and F. Pereira. Domain Adaptation with Structural Correspondence Learning. EMNLP 2006.

This research is sponsored by ARL and DARPA under the Machine Reading program.
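The PDA-ST idea of embedding constraints into self-training can be sketched as follows. This is a minimal sketch under loud assumptions: the unigram tagger stands in for the paper's actual learner, the NNP default for unseen words is an assumption, and only one of the poster's BioMed rules is shown. The point is the loop structure: pseudo-labels drawn from the unlabeled target data Du are forced to satisfy the constraints before they are added to the training set, so the retrained model generalizes the rule to cases the rule itself never fires on.

```python
from collections import Counter, defaultdict

def train(pairs):
    # Stand-in for learning the weight vector w: a unigram tagger that
    # remembers the most frequent tag for each word.
    counts = defaultdict(Counter)
    for word, tag in pairs:
        counts[word][tag] += 1
    return {w: c.most_common(1)[0][0] for w, c in counts.items()}

def digit_letter_rule(word, tag):
    # BioMed rule from the poster: digit-letter combinations
    # (e.g. CTNNB1) should be tagged NN.
    if any(ch.isdigit() for ch in word) and any(ch.isalpha() for ch in word):
        return "NN"
    return tag

def predict(model, words, rules):
    # Tag each word, then impose the target-domain constraints on the output.
    tags = []
    for w in words:
        t = model.get(w, "NNP")  # default for unseen words (assumption)
        for rule in rules:
            t = rule(w, t)
        tags.append(t)
    return tags

def pda_st(Ds, Du, rules, rounds=2):
    # PDA-ST sketch: self-training in which the pseudo-labels taken from
    # the unlabeled target data Du already satisfy the constraints.
    labeled = list(Ds)
    for _ in range(rounds):
        model = train(labeled)
        for sent in Du:
            labeled.extend(zip(sent, predict(model, sent, rules)))
    return train(labeled)

Ds = [("John", "NNP"), ("eats", "VBZ"), ("protein", "NN")]  # toy source data
Du = [["CTNNB1", "eats"]]                                   # toy target data
model = pda_st(Ds, Du, [digit_letter_rule])
print(model["CTNNB1"])  # NN: the constraint propagated into the model
```

Dropping the `rules` argument from the `predict` call turns this into the plain self-training baseline the poster compares against.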
