180 likes | 323 Vues
Extending the Multi-Instance Problem to Model Instance Collaboration. Anjali Koppal Advanced Machine Learning December 11, 2007. Multi-Instance Learning: Set-up. Given: A set of labeled data points (training set) and unlabeled data points (test set)
E N D
Extending the Multi-Instance Problem to Model Instance Collaboration Anjali KoppalAdvanced Machine LearningDecember 11, 2007
Multi-Instance Learning: Set-up Given: A set of labeled data points (training set) and unlabeled data points (test set) Bag: Each data point is called a bag and is associated with a label (‘+’ or ‘-’) Instance: A Bag is described by a set of Instances. Each instance is described by a vector of features. Every instance also has a (hidden/unknown) label. Problem: Predict the class of an unlabelled bag. +
Multi-Instance Learning: Set-up Given: A set of labeled data points (training set) and unlabeled data points (test set) Bag: Each data point is called a bag and is associated with a label (‘+’ or ‘-’) Instance: A Bag is described by a set of Instances. Each instance is described by a vector of features. Every instance also has a (hidden/unknown) label. Problem: Predict the class of an unlabelled bag. +
Multi-Instance Learning: Set-up Given: A set of labeled data points (training set) and unlabeled data points (test set) Bag: Each data point is called a bag and is associated with a label (‘+’ or ‘-’) Instance: A Bag is described by a set of Instances.Each instance is described by a vector of features. Every instance also has a (hidden/unknown) label. Problem: Predict the class of an unlabelled bag. +
Multi-Instance Learning: Example beach Bag -> An image Instance -> A ‘region’ in the image Label -> {‘beach’ ‘not beach’} sky water sand http://www.adrhi.com/Waimanalo-Beach.jpg
Framework for Solving MI Problems + Key Assumption: A bag is positive if at least one of its instances is predicted to be positive. A bag is negative if all its instances are predicted to be negative. – – + – – – – – – – – – – – – – –
Framework for Solving MI Problems:Diversity Density Diversity Density of an instance: A measure of how close an instance is to instances of different positive bags while also being far away from all instances of negative bags. The instance with the maximum DD value is the most positive instance. O. Maron and T. Lozano-Pérez. A framework for multiple-instance learning. In Advances in Neural Information Processing Systems 10, pages 570-576. Cambridge, MA: MIT Press, 1998
Using DD to Solve MIL Problems • Approach #1: Maron et al: • Find the point in instance space with the maximum DD value. Call this p. • Use the max DD value to compute a distance threshold t. • A test bag b with instances x_i is classified as positive if min_i(distance(x_i, p)) < t ? ? d3 d1 p d2 d4 ? ?
Using DD to Solve MIL Problems • Approach #1:Maron et al: • Find the point in instance space with the maximum DD value. Call this p. • Use the max DD value to compute a distance threshold t. • A test bag b with instances x_i is classified as positive if min_i(distance(x_i p)) < t – + d1 < t d3 > t p d2 > t d4 > t – –
DD-SVM Approach #2:(Y. Chen and J. Wang, image categorization by learning and reasoning with regions, The Journal of Machine Learning Research, 5, p.913-939) The previous approach uses one instance point (the one with the globally maximal DD value) as a prototype for all positive instances. Idea: Represent the positive instance space by a set of instance prototypes (locally max DD values). A bag is represented by a vector of minimum distances to each of the prototypes. d2 d1 d1 d2 d3 d4 d5 Now train a SVM in the “bag-feature” space. d3 d4 d5
Extending DD-SVM : Motivation Instance Prototypes (Approach #2) are an improvement over single prototypes (Approach #1) because they allow for diversity in positive instances. e.g: A ‘beach scene’ could contain “water” or “sand” or “sky”. sky water sand
Extending DD-SVM : Motivation However Approach #2 assumes to some extent that the instances contribute independently to the bag’s class. e.g: The classification rule for ‘beach scene’ might be: “has water and sand but not only one”. For what is a beach without sand…
Extending DD-SVM : Proposal Instead of computing the minimum distance between an instance prototype and any instance in a bag compute the minimum distance between an instance prototype and a linear combination of all instances in the bag. Instance Prototype α2 α3 α1 α4 α5 For bag B with instances bi find the values of αis such thatdist( Σαibi instancePrototype) is minimized – this is a standard QP Problem.
Extending DD-SVM : Initial Results Artificial Data Set Instances come from two normal distributions N1 and N2. number of bags = 200 with each bag containing 2 instances.Positive bags contain one instance from each of N1 and N2.Negative bags contain two instances from N2 (with probability t)
Extending DD-SVM: Initial Results SVM used: LIBSVM RBF Kernel Correct_Label DD_label DD_conf EDD_label EDD_conf 1 -1 0.5483 1 0.5000 1 -1 0.6220 1 0.5868 1 -1 0.5593 1 0.5535 1 -1 0.5329 1 0.5000 1 1 0.6002 -1 0.5396 1 -1 0.5205 1 0.6843 1 -1 0.5379 1 0.5199 1 -1 0.5152 1 0.5817 -1 1 0.5442 -1 0.5075 -1 -1 0.5834 1 0.5414 -1 1 0.5109 -1 0.5146 -1 1 0.5388 -1 0.5071 -1 -1 0.5106 1 0.5661
Extending DD-SVM: Initial Results • Musk 2 Data Set (UCI) • 102 bags (molecules) containing instances represented by 166 dimensions (conformations). The 2 classes are ‘musk’ and ‘non-musk’ • Average number of instances per bag ~ 64 – computationally very slow to calculate instance prototypes. Worked only with bags containing fewer than 20 instances (there were 63 such bags).
Conclusions & Future Work • The Extended DD-SVM is a modification of the DD-SVM algorithm to model the influence of groups of instances instead of single, independent instances in multi instance labeling. • Initial results were interesting, but need to run more expansive tests to better understand its performance. • Useful to find datasets where a collaborative model is intuitivelya good one (unclear if that is true for the MUSK dataset). • Other possible directions:-- Another way of extending DD-SVM: use the average distance of k closest instances to a prototype.-- Another way to model collaborations: convex combinations of nearest neighbours (instead of all instances).
Application µ-rNA mRNA TS1 TS2 TS3 Bag ~ mRNAInstance ~ target sites (TS)Positive instance ~ legitimate TSPositive bag ~ protein expression