Ambiguous Nodes in Networked Data based on Measuring Reliable Neighboring Probabilities
Advisor: Prof. Sing Ling Lee
Student: Chao Chih Wang
Date: 2013.01.04
Ambiguous Nodes in Networked Data based on Measuring Reliable Neighboring Probabilities
Outline
• Introduction
  • Network data
  • Traditional vs. networked data classification
  • Collective classification
  • ICA
• Problem
• Our Method
  • Collective Inference With Ambiguous Node (CIAN)
• Experiments
• Conclusion
Introduction – Network Data
• traditional data: instances are independent of each other
• network data: instances may be related to each other
• applications: emails, web pages, paper citations
Introduction • traditional vs. network data classification
[Figure: the same nodes A–H with class labels 1 and 2, classified independently (traditional) and with their link structure taken into account (networked)]
Introduction – Collective Classification
• To classify interrelated instances using both content features and link features.
[Figure: a node's feature vector is its content features concatenated with a link feature giving the proportion of each class among its neighbors, e.g. (1/2, 1/2, 0) for a node with one class-1 neighbor and one class-2 neighbor]
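The link feature described above can be sketched as the proportion of each class among a node's currently labeled neighbors (a minimal illustration; the function name and class encoding are my own):

```python
from collections import Counter

def link_feature(neighbor_labels, classes):
    """Proportion of each class among a node's currently labeled neighbors."""
    counts = Counter(neighbor_labels)
    total = len(neighbor_labels)
    if total == 0:
        return [0.0] * len(classes)
    return [counts[c] / total for c in classes]

# A node with one class-1 neighbor and one class-2 neighbor, as on the slide:
feat = link_feature([1, 2], classes=[1, 2, 3])  # -> [0.5, 0.5, 0.0]
```

This vector is appended to the node's content features before the local classifier is applied.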
Introduction – ICA
• ICA: Iterative Classification Algorithm
Initial: train the local classifier; use content features to predict the unlabeled instances (step 1)
Iterate {
  for each unlabeled instance {
    set the unlabeled instance's link feature (step 2)
    use the local classifier to re-predict the unlabeled instance
  }
}
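The two ICA steps can be sketched structurally as follows. The local classifier is abstracted into two assumed callables (`content_pred` for the content-only bootstrap, `combined_pred` for content plus link features); these names and the toy classifier in the usage example are illustrative, not from the slides:

```python
from collections import Counter

def ica(nodes, edges, known, content_pred, combined_pred, n_iter=5):
    """Iterative Classification Algorithm (structural sketch).

    known: dict node -> training label.
    content_pred(node) -> class predicted from content features only.
    combined_pred(node, neighbor_labels) -> class from content + link features.
    """
    nbrs = {n: [] for n in nodes}
    for u, v in edges:
        nbrs[u].append(v)
        nbrs[v].append(u)

    # Step 1: bootstrap unlabeled nodes from content features alone.
    pred = dict(known)
    for n in nodes:
        if n not in known:
            pred[n] = content_pred(n)

    # Step 2: iterate, refreshing link features from current predictions.
    for _ in range(n_iter):
        for n in nodes:
            if n in known:
                continue
            pred[n] = combined_pred(n, [pred[m] for m in nbrs[n]])
    return pred

# Toy usage: the content classifier guesses class 2, but the neighborhood
# majority (both labeled neighbors are class 1) corrects node B.
result = ica(["A", "B", "C"], [("A", "B"), ("B", "C")], {"A": 1, "C": 1},
             content_pred=lambda n: 2,
             combined_pred=lambda n, ls: Counter(ls).most_common(1)[0][0])
```

Here the iteration converges because B's two labeled neighbors agree; with conflicting neighborhoods ICA can oscillate, which motivates the ambiguous-node handling below.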
Introduction – ICA Example
[Figure: a labeled training graph with classes 1–3 and an unlabeled test graph; link features computed from neighbors, e.g. A: (2/3, 0, 1/3) in training vs. (1/3, 1/3, 1/3) at test time, B: (1/2, 1/2, 0) vs. (1/4, 1/2, 1/4)]
Problem – Ambiguous Nodes
An ambiguous node's neighborhood gives conflicting evidence, so the classifier
• may label it with the wrong class
• judges the label with difficulty
• can make a mistake that then propagates
[Figure: node E's neighbors split between classes 1 and 2 — is E class 1 or 2?]
Problem – using ICA
[Figure: training and test graphs containing an ambiguous node B; B's uncertain label skews its neighbors' link features — e.g. A's becomes (2/3, 1/3, 0) — so ICA misclassifies nodes around B]
Idea
• Make a new prediction for the neighbors of each unlabeled instance
• Use probabilities to compute the link feature
• Retrain the CC classifier
Our Method – Method #1: compute the link feature from probabilities
General method: each neighbor contributes equally, so a node A with neighbors labeled 1, 2, and 3 gets link feature Class 1: 1/3, Class 2: 1/3, Class 3: 1/3.
Our method: weight each neighbor by its prediction probability. With neighbors predicted (1, 80%), (2, 60%), (3, 70%):
Class 1: 80/(80+60+70), Class 2: 60/(80+60+70), Class 3: 70/(80+60+70)
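Method #1 can be sketched as follows (the function name is my own; a labeled training neighbor could simply be passed with probability 1.0):

```python
def weighted_link_feature(neighbor_preds, classes):
    """Link feature weighted by each neighbor's prediction confidence.

    neighbor_preds: list of (predicted_class, probability) pairs.
    """
    weights = {c: 0.0 for c in classes}
    for label, p in neighbor_preds:
        weights[label] += p
    total = sum(weights.values())
    if total == 0:
        return [0.0] * len(classes)
    return [weights[c] / total for c in classes]

# Slide example: neighbors predicted (1, 80%), (2, 60%), (3, 70%)
# -> roughly [80/210, 60/210, 70/210]
feat = weighted_link_feature([(1, 0.80), (2, 0.60), (3, 0.70)], [1, 2, 3])
```

A low-confidence neighbor thus contributes less to the link feature than a confident one, instead of the equal 1/3 share the general method would give it.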
Our Method – Method #2: predict the unlabeled instance's neighbors again.
[Figure: two graphs with the same structure. Node B was originally predicted (1, 70%); re-predicting B yields (2, 60%) when B is merely ambiguous and (2, 90%) when B is a noise node]
Our Method – Method #2
• Predict the unlabeled instance's neighbors again:
  • every neighbor is re-predicted in the first iteration
  • if the new prediction differs from the original label: do not adopt it this iteration; re-predict again next iteration
  • if the new prediction agrees with the original label: average the two probabilities; no need to re-predict next iteration
Example: A originally (2, 60%), new prediction (2, 80%) → averaged to (2, 70%); C originally (2, 60%), new prediction (1, 80%) → not adopted.
Our Method – Method #2
[Figure: node x (true label 2) with neighbors w, y, z; x's neighbors' predictions change between (1, 60%), (3, 60%), (2, 70%) and the new predictions (2, 70%), (3, 60%), (2, 80%)]
When x's original prediction (1, 50%) disagrees with its new prediction (2, 60%), three ways to handle the disagreement:
• Method A: keep the original prediction (1, 50%) — do not change the class
• Method B: adopt the new prediction (2, 60%) — change the class
• Method C: keep the label but drop its probability to 0, i.e. (1, 0%) — do not adopt this iteration
If x is an ambiguous (or noise) node: Method B > Method C > Method A.
If x is not an ambiguous (or noise) node: Method A > Method C > Method B.
Methods A and B are both too extreme, so we choose Method C.
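The adoption rule from the last two slides can be sketched as follows. This is my reading of the slides: on disagreement the original label is kept with probability 0 for this iteration (Method C) and the neighbor is re-predicted next time; on agreement the two probabilities are averaged and re-prediction stops:

```python
def adopt(original, new):
    """Method #2 adoption rule (Method C) -- sketch, my reading of the slides.

    original, new: (class_label, probability) pairs for a neighbor.
    Returns the pair to use for link features this iteration, plus a flag
    saying whether the neighbor must be re-predicted next iteration.
    """
    label, p = original
    new_label, new_p = new
    if label != new_label:
        # Disagreement: do not adopt -- keep the label with probability 0
        # so it contributes nothing to the link feature this iteration.
        return (label, 0.0), True
    # Agreement: average the two confidences; no further re-prediction.
    return (label, (p + new_p) / 2), False
```

Because the weighted link feature of Method #1 multiplies by probability, a (label, 0%) pair effectively removes a suspect neighbor for one iteration without committing to either extreme.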
Our Method – Method #2
[Chart: accuracy comparison of the re-prediction variants]
Our Method – Method #3: retrain the CC classifier
[Figure: after the initial ICA pass labels the test nodes, their predictions with probabilities — e.g. D (3, 70%), A (1, 80%), E (2, 60%), C (1, 90%) — are added to the training data and the local classifier is retrained]
CIAN Example – Ambiguous
[Figure: same graph as the ICA example with an ambiguous node B. ICA fixes B's contribution via the link feature (1/2, 1/2, 0); CIAN re-predicts B and uses probability-weighted link features, recovering the correct labels for B's neighbors]
CIAN Example – Noise
[Figure: same graph with a noise node B whose label is simply wrong. Re-predicting B produces a confident disagreement — e.g. (1, 70%) vs. (2, 70%) — and CIAN's probability-weighted link features limit the damage compared with ICA's fixed (1/2, 1/2, 0)]
CIAN
• CIAN: Collective Inference With Ambiguous Node
Initial: train the local classifier; use content features to predict the unlabeled instances (step 1)
Iterate {
  for each unlabeled instance A {
    for each neighbor nb of A {  (step 2)
      if nb needs to be predicted again:
        (class label, probability) = local classifier(nb)
    }
    set A's link feature  (step 3)
    (class label, probability) = local classifier(A)  (step 4)
  }
  retrain the local classifier  (step 5)
}
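The five CIAN steps can be put together in one structural sketch. The `predict` and `retrain` callables stand in for the local classifier and its retraining; their signatures, and the bookkeeping of which neighbors still need re-prediction, are my own framing of the slides:

```python
def cian(nodes, edges, known, predict, retrain, n_iter=5):
    """Structural sketch of the CIAN loop (steps 1-5); interfaces are mine.

    known: dict node -> training label (its probability is taken as 1.0).
    predict(node, neighbor_preds) -> (class_label, probability)
    retrain(preds) -> None   # step 5: refit the local classifier
    """
    nbrs = {n: [] for n in nodes}
    for u, v in edges:
        nbrs[u].append(v)
        nbrs[v].append(u)

    # Step 1: bootstrap unlabeled nodes from content features alone.
    preds = {n: ((known[n], 1.0) if n in known else predict(n, []))
             for n in nodes}
    repredict = {n for n in nodes if n not in known}  # all, first iteration

    for _ in range(n_iter):
        for n in nodes:
            if n in known:
                continue
            # Step 2: re-predict neighbors that still need it.
            for m in nbrs[n]:
                if m in repredict:
                    new = predict(m, [preds[k] for k in nbrs[m]])
                    if new[0] == preds[m][0]:
                        # Agreement: average and stop re-predicting (Method C).
                        preds[m] = (new[0], (new[1] + preds[m][1]) / 2)
                        repredict.discard(m)
                    else:
                        preds[m] = (preds[m][0], 0.0)  # not adopted
            # Steps 3-4: link feature from (label, prob) pairs, then classify.
            preds[n] = predict(n, [preds[m] for m in nbrs[n]])
        retrain(preds)  # step 5
    return preds
```

With a real local classifier, `predict` would build the probability-weighted link feature of Method #1 from `neighbor_preds` before classifying; the stub below only exercises the control flow.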
Experiments – Experimental Setting
Fixed parameters throughout; compare CO, ICA, and CIAN.
Experiments
• 1. misclassified nodes: proportion of misclassified nodes (0%–30%, 80%)
• 2. ambiguous nodes: NB vs. SVM
• 3. misclassified and ambiguous nodes: proportion of misclassified and ambiguous nodes (0%–30%, 80%)
• 4. iteration & stability: number of iterations
Experiments – 1. misclassified • CiteSeer
Experiments – 1. misclassified • WebKB-texas
Experiments – 1. misclassified • WebKB-washington
Experiments – 1. misclassified • 80% of misclassified nodes
Experiments – 2. ambiguous • Cora [two charts; max ambiguous nodes: 429 and 356]
Experiments – 2. ambiguous • CiteSeer [two charts; max ambiguous nodes: 590 and 365]
Experiments – 2. ambiguous • WebKB-texas [two charts; max ambiguous nodes: 52 and 20]
Experiments – 2. ambiguous • WebKB-washington [two charts; max ambiguous nodes: 33 and 31]
Experiments – 2. ambiguous ‧ How many ambiguous nodes do NB and SVM have in common?
Experiments – 3. misclassified and ambiguous • CiteSeer
Experiments – 3. misclassified and ambiguous • WebKB-texas
Experiments – 3. misclassified and ambiguous • WebKB-washington
Experiments – 3. misclassified and ambiguous • 80% of misclassified and ambiguous nodes
Experiments ‧ When is the accuracy of ICA lower than that of CO?
Experiments – 4. iteration & stable • CiteSeer
Experiments – 4. iteration & stable • WebKB-texas
Experiments – 4. iteration & stable • WebKB-washington