
Bridged Refinement for Transfer Learning




Presentation Transcript


  1. Bridged Refinement for Transfer Learning XING Dikan, DAI Wenyuan, XUE Gui-Rong, YU Yong Shanghai Jiao Tong University {xiaobao,dwyak,grxue,yyu}@apex.sjtu.edu.cn

  2. Outline • Motivation • Problem • Solution • Assumption • Method • Improvement and Final Solution • Experiment • Conclusion

  3. Overview • Motivation • Problem • Solution • Assumption • Method • Improvement and Final Solution • Experiment • Conclusion

  4. Motivation • Email spam filtering: decide whether a given mail is spam or not. • Training data (mailbox topics: football, basketball) • Test data (mailbox topics: classical music, pop music)

  5. Motivation • New events always occur: news in 2006, commercial or politics; news in 2007, commercial or politics. • Solution? Labeling new data again and again is costly. • Therefore, we try to utilize the old labeled data while taking the shift of distribution into consideration. [Transfer useful information]

  6. Overview • Motivation • Problem • Solution • Assumption • Method • Improvement and Final Solution • Experiment • Some other solutions

  7. Problem • We want to solve a classification problem. • The set of target categories is fixed. • Main difference from traditional classification: • The training data and test data are governed by two slightly different distributions. • We do not need labeled data in the new test data distribution.

  8. Illustrative Example (figure): documents about sports and music; + marks a normal mail, − marks a spam mail.

  9. Overview • Motivation • Problem • Solution • Assumption • Method • Improvement and Final Solution • Experiment • Some other solutions

  10. Overview • Motivation • Problem • Solution • Assumption • Method • Improvement and Final Solution • Experiment • Some other solutions

  11. Assumption • P(c|d) does not change: Ptrain(c|d) = Ptest(c|d), since • the set of target categories is fixed • each target category is definite • P(c|di) ~ P(c|dj) when di ~ dj, where ~ means "similar", i.e. close to each other • Consistency • Mutual reinforcement principle

  12. Overview • Motivation • Problem • Solution • Assumption • Method • Improvement and Final Solution • Experiment • Some other solutions

  13. Method: Refinement • UConfc: scores of a base classifier, coarse-grained (Unrefined Confidence score of category c) • M: adjacency matrix, Mij = 1 if di is a neighbor of dj (rows then L1-normalized) • RConfc: Refined Confidence score of category c • The mutual reinforcement principle yields: RConfc = α · M · RConfc + (1 − α) · UConfc, where α is a trade-off coefficient.
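A minimal sketch of this refinement step in Python (the NumPy-based kNN graph construction, the cosine similarity, and the function names are assumptions for illustration, not the authors' implementation):

```python
import numpy as np

def knn_adjacency(X, k):
    """M[i, j] = 1 if d_i is among d_j's k nearest neighbors; each row is then L1-normalized."""
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)   # L2-normalize document vectors
    sim = Xn @ Xn.T                                     # cosine similarity (an assumed measure)
    np.fill_diagonal(sim, -np.inf)                      # a document is not its own neighbor
    M = np.zeros_like(sim)
    for j in range(X.shape[0]):
        M[np.argsort(sim[:, j])[-k:], j] = 1.0          # mark d_j's k nearest neighbors
    row_sums = M.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0] = 1.0                       # guard against all-zero rows
    return M / row_sums

def refine(uconf, M, alpha=0.7, iters=50):
    """Iterate RConf = alpha * M @ RConf + (1 - alpha) * UConf."""
    rconf = uconf.copy()
    for _ in range(iters):
        rconf = alpha * (M @ rconf) + (1 - alpha) * uconf
    return rconf
```

Here `uconf` is an (n_documents × n_categories) matrix of base-classifier scores; because each row of M sums to 1 and α < 1, the iteration is a contraction and converges.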

  14. Method: Refinement • Refinement can be regarded as reaching a consistency under the mixture distribution. • Why not try to reach a consistency under the distribution of the test data?

  15. Illustrative Example

  16. Overview • Motivation • Problem • Solution • Assumption • Method • Improvement and Final Solution • Experiment • Some other solutions

  17. Method: Bridged Refinement • Bridged Refinement • Refine towards the mixture distribution • Refine towards the target distribution.
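A hedged sketch of the two-step bridge, reusing the `knn_adjacency` and `refine` helpers sketched above (how the training documents' confidence scores are initialized, e.g. one-hot from their known labels, is an assumption):

```python
import numpy as np

def bridged_refinement(X_train, X_test, uconf, k=10, alpha=0.7):
    """uconf: unrefined confidence scores for all documents, training rows first, then test rows."""
    # Step 1: refine towards the mixture distribution (training + test documents together).
    X_mix = np.vstack([X_train, X_test])
    rconf_mix = refine(uconf, knn_adjacency(X_mix, k), alpha)

    # Step 2: refine towards the target (test) distribution, starting from the bridged scores.
    n_train = X_train.shape[0]
    rconf_test = refine(rconf_mix[n_train:], knn_adjacency(X_test, k), alpha)

    # Final label for each test document: the category with the highest refined confidence.
    return rconf_test.argmax(axis=1)
```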

  18. Outline • Motivation • Problem • Solution • Assumption • Method • Improvement and Final Solution • Experiment • Conclusion

  19. Experiment • Data set • Base classifiers • Different refinement styles • Performance • Parameter sensitivity

  20. Experiment: Data set • Source • SRAA • Simulated autos (simauto) • Simulated aviation (simaviation) • Real autos (realauto) • Real aviation (realaviation) • 20 Newsgroup • Top level categories: rec, talk, sci, comp • Reuters-21578 • Top level categories: org, places, people

  21. Experiment: Data set • Re-construction: 11 data sets, each specifying the positive and negative categories used for the training data and the test data.

  22. Experiment: Base classifier • Supervised • Generative model: Naïve Bayes classifier • Discriminative model: Support vector machines • Semi-supervised: • Transductive support vector machines
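A small sketch of how a Naïve Bayes base classifier could produce the unrefined confidence scores UConf (the scikit-learn usage here is an assumption for illustration; the paper does not prescribe a particular library):

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

def base_confidence(train_texts, train_labels, test_texts):
    """Fit Naive Bayes on the old-distribution training data and score the new test data."""
    vec = CountVectorizer()
    X_train = vec.fit_transform(train_texts)
    X_test = vec.transform(test_texts)
    clf = MultinomialNB().fit(X_train, train_labels)
    return clf.predict_proba(X_test)   # UConf: one confidence per category per test document
```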

  23. Experiment: Refinement Style • No refinement (base) • One step • Refine directly on the test distribution (Test) • Refine on the mixture distribution only (Mix) • Two steps • Bridged Refinement (Bridged)

  24. Performance: On SVM (figure: error rates of Base, Test, Mix, and Bridged) • Test (2nd) and Mix (3rd) vs. Base (1st) • Test (2nd) vs. Bridged (1st): different starting point

  25. Performance: NB and TSVM

  26. Parameter: K Whether di is regarded as a neighbor of dj is decided by checking whether di is in dj’s k-nearest neighbor set.

  27. Parameter: α (figure: error rate vs. different values of α)

  28. Convergence • The refinement formula can be solved in closed form or in an iterative manner (see the sketch below).
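A sketch contrasting the two ways of solving the fixed-point equation (the tolerance and the dense linear solve are illustrative assumptions); the closed form follows from rearranging RConf = α·M·RConf + (1 − α)·UConf into (I − αM)·RConf = (1 − α)·UConf:

```python
import numpy as np

def refine_closed_form(uconf, M, alpha=0.7):
    """Solve (I - alpha * M) RConf = (1 - alpha) * UConf directly."""
    n = M.shape[0]
    return np.linalg.solve(np.eye(n) - alpha * M, (1 - alpha) * uconf)

def refine_iterative(uconf, M, alpha=0.7, tol=1e-6, max_iters=200):
    """Fixed-point iteration; converges because alpha < 1 and M is row-stochastic."""
    rconf = uconf.copy()
    for _ in range(max_iters):
        nxt = alpha * (M @ rconf) + (1 - alpha) * uconf
        if np.abs(nxt - rconf).max() < tol:
            break
        rconf = nxt
    return rconf
```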

  29. Outline • Motivation • Problem • Solution • Assumption • Method • Improvement and Final Solution • Experiment • Conclusion

  30. Conclusion • Task: Transfer useful information from training data to the same classification task of the test data, while training and test data are governed by two different distributions. • Approach: Take the mixture distribution as a bridge and make a two-step refinement.

  31. Thank you Please ask in slow and simple English 

  32. Backup 1: Transductive • The boundary after either step of refinement is actually never calculated explicitly; it is hidden in the refined labels of the data points. • It is drawn explicitly in the examples only for a clearer illustration.

  33. Backup 2: n-step • One important problem we leave unsolved: how to describe a distribution λ D_train + (1 − λ) D_test? • One solution is sampling in a generative manner, but then the result depends on the random numbers drawn during the generative process, which may make the solution unstable and hard to reproduce.
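One way to make the generative-sampling idea concrete (a hedged sketch; λ, the sample size, and the seeded random generator are illustrative assumptions):

```python
import numpy as np

def sample_mixture(X_train, X_test, lam, n_samples, seed=0):
    """Draw documents from lambda * D_train + (1 - lambda) * D_test."""
    rng = np.random.default_rng(seed)        # fixing the seed makes the sample reproducible
    rows = []
    for _ in range(n_samples):
        if rng.random() < lam:               # with probability lambda, pick a training document
            rows.append(X_train[rng.integers(len(X_train))])
        else:                                # otherwise pick a test document
            rows.append(X_test[rng.integers(len(X_test))])
    return np.vstack(rows)
```

Different seeds give different mixtures, which is exactly the instability the slide points out.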

  34. Backup 3: Why the mutual reinforcement principle? • If d_j has a high confidence of being in category c, then d_i, a neighbor of d_j, should also receive a high confidence score for c.
