The study introduces a novel approach to learning from labeled and unlabeled examples: adding random noise to the edge weights in the graph mincut process in order to extract confidence scores for the classifications. Experiments on a handwritten-digits dataset show that the most confident predictions achieve roughly half the error of standard mincut. Future work includes experiments on other datasets and comparisons with similar algorithms.
Improving the Graph Mincut Approach to Learning from Labeled and Unlabeled Examples Avrim Blum, John Lafferty, Raja Reddy, Mugizi Rwebangira
Outline • Often we have little labeled data but lots of unlabeled data • Graph mincut: based on the belief that most ‘close’ examples share the same classification • Problem: mincut does not indicate where its classifications are most confident • Our approach: add random noise to the edges to extract confidence scores
Obtain s-t mincut (figure: graph with labeled + and − nodes; the minimum cut separates the two sides)
Classification (figure: each unlabeled node takes the label of the side of the cut it falls on)
Goal • Obtain a measure of confidence for each classification Our approach • Add random noise to the edges • Run mincut several times • For each unlabeled example, take a majority vote over the runs
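The three steps above can be sketched as a small, self-contained program. This is a minimal illustration, not the authors' implementation: the Edmonds–Karp max-flow routine, the multiplicative noise model, and all function and parameter names (`min_cut_side`, `randomized_mincut`, `noise`, `runs`) are assumptions made for the sketch.

```python
import random
from collections import deque, defaultdict

def min_cut_side(cap, s, t):
    """Edmonds-Karp max-flow; returns the set of nodes on s's side of the
    minimum s-t cut. `cap` maps (u, v) -> capacity (both directions for
    an undirected graph)."""
    flow = defaultdict(float)
    adj = defaultdict(set)
    for u, v in cap:
        adj[u].add(v)
        adj[v].add(u)

    def residual(u, v):
        return cap.get((u, v), 0.0) + flow[(v, u)] - flow[(u, v)]

    while True:
        # BFS for a shortest augmenting path in the residual graph.
        parent = {s: None}
        q = deque([s])
        while q and t not in parent:
            u = q.popleft()
            for v in adj[u]:
                if v not in parent and residual(u, v) > 1e-12:
                    parent[v] = u
                    q.append(v)
        if t not in parent:
            # No augmenting path left: nodes reachable from s are s's side.
            return set(parent)
        # Reconstruct the path, find its bottleneck, and push flow.
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(residual(u, v) for u, v in path)
        for u, v in path:
            flow[(u, v)] += bottleneck

def randomized_mincut(edges, pos, neg, runs=100, noise=0.5, seed=0):
    """Perturb the edge weights, recompute the s-t mincut, and record for
    each unlabeled node the fraction of runs that place it on the positive
    side (the majority vote / confidence score)."""
    rng = random.Random(seed)
    nodes = {u for e in edges for u in e} - {pos, neg}
    votes = {v: 0 for v in nodes}
    for _ in range(runs):
        cap = {}
        for (u, v), w in edges.items():
            # Multiplicative random noise on each weight (assumed model).
            w_noisy = w * (1.0 + noise * rng.random())
            cap[(u, v)] = w_noisy
            cap[(v, u)] = w_noisy
        s_side = min_cut_side(cap, pos, neg)
        for node in votes:
            if node in s_side:
                votes[node] += 1
    return {v: votes[v] / runs for v in votes}
```

On a toy chain `+ —2.0— a —1.0— b —2.0— −`, the cut always severs the weak middle edge, so `a` is voted positive in every run and `b` in none; on less clear-cut nodes the vote fraction lands between 0 and 1 and serves as the confidence score.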
Experiments • Digits data set (each digit is a 16 × 16 integer array) • 100 labeled examples • 3900 unlabeled examples • 100 runs of mincut
Conclusions • 3% error on the 80% of the data where the algorithm is most confident • Standard mincut gives 6% error on all the data • Future Work • Conduct further experiments on other data sets • Compare with the similar algorithm of Jerry Zhu • Investigate the properties of the distribution obtained by sampling minimum cuts in this way
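The "3% error on 80% of the data" figure implies ranking examples by how lopsided their vote is and evaluating only the most confident fraction. A hypothetical helper (the function name, the `frac` parameter, and the +1/−1 label encoding are all assumptions, not from the slides) could look like:

```python
def accuracy_on_most_confident(conf, labels, frac=0.8):
    """Given per-node positive-vote fractions `conf` and true labels
    (+1 / -1), evaluate accuracy on the `frac` most confident nodes,
    where confidence is the distance of the vote fraction from 0.5."""
    ranked = sorted(conf, key=lambda v: abs(conf[v] - 0.5), reverse=True)
    keep = ranked[: max(1, int(len(ranked) * frac))]
    correct = sum(1 for v in keep if (conf[v] >= 0.5) == (labels[v] > 0))
    return correct / len(keep)
```

Sweeping `frac` from 0 to 1 traces out an accuracy-coverage curve, which is how a confidence score like this is typically judged.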