Semi-Supervised Learning Using Randomized Mincuts
Avrim Blum, John Lafferty, Raja Reddy, Mugizi Rwebangira
Outline • Often have little labeled data but lots of unlabeled data. • We want to use the relationships between the unlabeled examples to guide our predictions. • Idea: "Similar examples should generally be labeled similarly."
Obtain s-t mincut • [figure: graph over the labeled (+, −) and unlabeled examples, with the minimum cut separating the + side from the − side]
Classification • [figure: the unlabeled examples are labeled + or − according to the side of the cut they fall on]
Confidence on the predictions? • Plain mincut gives no indication of which examples it is most confident about. Solution • Add random noise to the edges. • Run mincut several times. • For each unlabeled example, take a majority vote (a sketch follows below).
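A minimal sketch of this noisy-mincut voting procedure, assuming an undirected weighted graph G built over the labeled and unlabeled examples (the function name, the super-source/sink labels "S"/"T", and the noise level are illustrative choices, not from the slides):

```python
import random
import networkx as nx

def randomized_mincut_vote(G, pos_nodes, neg_nodes, n_runs=20, noise=0.1):
    """Perturb edge weights, repeat the s-t mincut, and take a majority vote.

    G is an undirected nx.Graph with a 'weight' attribute on each edge.
    Returns, for every unlabeled node, the fraction of runs that placed it
    on the positive side (a confidence score in [0, 1]).
    """
    unlabeled = [v for v in G.nodes if v not in pos_nodes and v not in neg_nodes]
    votes = {v: 0 for v in unlabeled}

    for _ in range(n_runs):
        # Build a directed capacity graph with randomly perturbed edge weights.
        H = nx.DiGraph()
        for u, v, d in G.edges(data=True):
            w = d.get("weight", 1.0) + random.uniform(0.0, noise)  # add random noise
            H.add_edge(u, v, capacity=w)
            H.add_edge(v, u, capacity=w)
        # Tie positively labeled examples to a super-source and
        # negatively labeled examples to a super-sink with infinite capacity.
        for p in pos_nodes:
            H.add_edge("S", p, capacity=float("inf"))
        for n in neg_nodes:
            H.add_edge(n, "T", capacity=float("inf"))

        _, (source_side, _) = nx.minimum_cut(H, "S", "T")
        for v in unlabeled:
            if v in source_side:
                votes[v] += 1

    return {v: votes[v] / n_runs for v in votes}
```

The margin of the vote (how far each fraction is from 1/2) then serves as the confidence measure discussed on the next slide.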
Motivation • Margin of the vote gives a measure of the confidence. • Ideally we would like to assign a weight to each cut in the graph (a higher weight to small cuts) and then take a vote over all cuts in the graph according to their weights. • We don't know how to do this, but we can view randomized mincut as an approximation to it.
Related Work – Gaussian Processes • Zhu, Ghahramani and Lafferty (ICML 2003). • Each unlabeled example receives a label that is the average of its neighbors' labels. • Equivalent to minimizing the squared difference of labels across edges.
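One way to write that objective (the symbols w_{ij} for edge weights, f for the label assignment, and y for the given labels are our notation, not from the slides): the harmonic-function approach minimizes

```latex
E(f) = \tfrac{1}{2} \sum_{i,j} w_{ij}\,\bigl(f(i) - f(j)\bigr)^2
\quad \text{subject to } f(i) = y_i \text{ on the labeled examples,}
```

and the minimizer satisfies f(i) = \sum_j w_{ij} f(j) / \sum_j w_{ij} at every unlabeled node, i.e. each unlabeled example's value is the weighted average of its neighbors.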
How to construct the graph? • K-nn • Graph may not have small separators. • How to learn k? • Connect all points within distance δ • Can have disconnected components. • δ is hard to learn. • Minimum Spanning Tree • No parameters to learn. • Gives connected, sparse graph. • Seems to work well on most datasets.
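A rough sketch of two of the graph constructions above (the function name and parameters are illustrative; assumes SciPy and scikit-learn and Euclidean distances):

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.sparse.csgraph import minimum_spanning_tree
from sklearn.neighbors import kneighbors_graph

def build_graphs(X, k=5):
    """Build a k-NN graph and a minimum-spanning-tree graph over the examples X."""
    # k-NN graph: connect each example to its k nearest neighbors (k must be chosen).
    knn = kneighbors_graph(X, n_neighbors=k, mode="distance")

    # MST graph: no parameters to tune, always connected and sparse.
    dists = squareform(pdist(X))        # dense Euclidean distance matrix
    mst = minimum_spanning_tree(dists)  # sparse matrix holding the MST edges
    return knn, mst
```

Either adjacency matrix can then be symmetrized and used as the weighted graph on which the mincut is computed.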
Experiments • ONE VS TWO: 1128 examples (8 × 8 array of integers, Euclidean distance). • ODD VS EVEN: 4000 examples (16 × 16 array of integers, Euclidean distance). • PC VS MAC: 1943 examples (20 newsgroups dataset, TFIDF distance).
Conclusions • We can get useful estimates of the confidence of our predictions. • Often get better accuracy than plain mincut. • Minimum spanning tree gives good results across different datasets.
Future Work • Sample complexity lower bounds (i.e. how much unlabeled data do we need to see?). • Better way of sampling mincuts? Reference • A. Blum, J. Lafferty, M. R. Rwebangira and R. Reddy. "Semi-Supervised Learning Using Randomized Mincuts", ICML 2004 (to appear).