Cutting-edge Methods for Active Learning with Minimal Labels
Lots of data, very few labels • Choose which unknowns to have labeled • Methods differ in how they choose the next unknown • Goal: find the best classifier using a very small number of labeled examples
Maximum Curiosity • Generate new training sets by taking the known data and adding an assumed value for each unknown • Run each set through a learner and compute statistics on the results • Assume that the pairing with the highest r value (cross-validated correlation coefficient) is the correct one
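The selection loop on this slide can be sketched roughly as follows. The slide does not name a learner or a cross-validation scheme, so this sketch assumes a leave-one-out 1-nearest-neighbour learner and Pearson's r between the held-out predictions and the labels; all function names here are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation; returns 0.0 when either side has no variance."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def loo_predictions(points, labels):
    """Leave-one-out predictions from a 1-nearest-neighbour learner (assumed learner)."""
    preds = []
    for i, p in enumerate(points):
        others = (j for j in range(len(points)) if j != i)
        nearest = min(others, key=lambda j: math.dist(p, points[j]))
        preds.append(labels[nearest])
    return preds

def cross_validated_r(points, labels):
    """r between held-out predictions and the (partly assumed) labels."""
    return pearson_r(loo_predictions(points, labels), labels)

def maximum_curiosity(known_x, known_y, unknown_x, candidate_labels=(0, 1)):
    """Try every (unknown, assumed label) pairing; keep the one with the highest r."""
    best = None
    for i, x in enumerate(unknown_x):
        for label in candidate_labels:
            r = cross_validated_r(known_x + [x], known_y + [label])
            if best is None or r > best[0]:
                best = (r, i, label)
    return best  # (r, index of chosen unknown, assumed label)

# Toy 1-D data, invented for illustration only.
known_x = [[0.0], [0.2], [1.0], [1.2]]
known_y = [0, 0, 1, 1]
unknown_x = [[0.1], [0.6]]
best_r, best_index, best_label = maximum_curiosity(known_x, known_y, unknown_x)
```

Ties are broken in favour of the first pairing tried, which is a choice of this sketch rather than something the slide specifies.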
Terrible Graphics • Additive Curiosity • Variant of Maximum Curiosity: sum, not max
Minimum Marginal Hyperplane • Based on Support Vector Machines (SVMs) • After training an SVM on the known data, pick the unknowns closest to the decision boundary and repeat • Takes advantage of the geometric features of SVMs
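The "closest to boundary" step might look like the sketch below. It assumes a linear SVM has already been trained elsewhere and exposes its hyperplane as a weight vector `w` and bias `b`; training and the relabel-and-repeat loop are left out, and the function names are hypothetical.

```python
import math

def distance_to_hyperplane(x, w, b):
    """Geometric distance from point x to the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return abs(sum(wi * xi for wi, xi in zip(w, x)) + b) / norm

def closest_to_boundary(unlabeled, w, b, k=1):
    """Indices of the k unlabeled points nearest the decision boundary."""
    order = sorted(
        range(len(unlabeled)),
        key=lambda i: distance_to_hyperplane(unlabeled[i], w, b),
    )
    return order[:k]

# Hyperplane and points invented for illustration; in practice w and b
# would come from the SVM trained on the currently labeled data.
w, b = [1.0, -1.0], 0.0
unlabeled = [[3.0, 0.0], [1.0, 0.9], [-2.0, 2.0]]
picked = closest_to_boundary(unlabeled, w, b, k=1)
```

After the picked points are labeled, the SVM would be retrained and the selection repeated.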
Maximum Entropy • Calculate entropy of assumed datasets • Assume that the most informative item is that which is most uncertain (highest entropy)
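As a rough illustration of picking the highest-entropy item: the sketch below assumes that some model already supplies a predicted class distribution for each unlabeled item, which the slide does not spell out. The item names and probabilities are made up.

```python
import math

def shannon_entropy(probs):
    """Shannon entropy in bits of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def most_uncertain(predicted):
    """Key of the item whose predicted distribution has the highest entropy."""
    return max(predicted, key=lambda k: shannon_entropy(predicted[k]))

# Hypothetical predicted class probabilities for three unlabeled items.
predicted = {
    "a": [0.95, 0.05],
    "b": [0.55, 0.45],
    "c": [0.80, 0.20],
}
query = most_uncertain(predicted)
```

Here the item with probabilities closest to an even split is the one sent for labeling.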
Entropic Tradeoff • Choose a mix of easily classified and highly informative unknowns • At each step, choose both the highest-entropy and the lowest-entropy unknowns to be labeled
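One step of this tradeoff could be sketched as below, again assuming a model provides a predicted class distribution per unlabeled item. The function names and data are hypothetical.

```python
import math

def shannon_entropy(probs):
    """Shannon entropy in bits of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def entropic_tradeoff_pair(predicted):
    """Return (lowest-entropy key, highest-entropy key) for one selection step."""
    ranked = sorted(predicted, key=lambda k: shannon_entropy(predicted[k]))
    return ranked[0], ranked[-1]

# Hypothetical predicted class probabilities for three unlabeled items.
predicted = {
    "a": [0.95, 0.05],
    "b": [0.55, 0.45],
    "c": [0.80, 0.20],
}
easy, informative = entropic_tradeoff_pair(predicted)
```

Both returned items would be sent for labeling in the same step: one that the model finds easy and one that it finds most uncertain.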