An opposition to:
Context-Based Vision System for Place and Object Recognition
Contextual Models for Object Detection Using BRFs
Authors: Antonio Torralba, Kevin P. Murphy, William T. Freeman, and Mark A. Rubin
Opponent: Carlos Vallespi
Paper claims
• Claims to recognize 63 different locations.
• Claims to categorize new environments.
• Claims to help object recognition by suggesting presence and location.
Place recognition
Is the classifier really doing anything?
• Temporal information is available.
• An HMM helps the classifier a lot.
• Knowing the current state, only 2-3 choices are possible at any time.
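The point about temporal information can be made concrete with a toy HMM forward filter. This is a minimal sketch, not the paper's model: the three place names, the transition table, and the likelihood values are all hypothetical. It shows that when only a few transitions are allowed from the current state, even a classifier that weakly prefers the wrong place cannot override the temporal prior.

```python
def forward_step(belief, transition, likelihood):
    """One step of HMM forward filtering over discrete places.

    belief:     dict place -> P(place at t-1 | past observations)
    transition: dict place -> dict next_place -> P(next_place | place)
    likelihood: dict place -> P(current observation | place)
    """
    # Predict: push the previous belief through the allowed transitions.
    predicted = {place: 0.0 for place in likelihood}
    for prev, p_prev in belief.items():
        for nxt, p_trans in transition[prev].items():
            predicted[nxt] += p_prev * p_trans

    # Update: weight by the classifier's likelihood and renormalize.
    unnorm = {place: predicted[place] * likelihood[place] for place in likelihood}
    total = sum(unnorm.values())
    if total == 0.0:
        raise ValueError("observation is impossible under the model")
    return {place: value / total for place, value in unnorm.items()}


# Hypothetical layout: office <-> corridor <-> lobby (no direct office-lobby door).
transition = {
    "office": {"office": 0.8, "corridor": 0.2},
    "corridor": {"office": 0.1, "corridor": 0.8, "lobby": 0.1},
    "lobby": {"corridor": 0.2, "lobby": 0.8},
}
belief = {"office": 1.0, "corridor": 0.0, "lobby": 0.0}

# Hypothetical, nearly uninformative classifier output that favors the lobby.
likelihood = {"office": 0.30, "corridor": 0.30, "lobby": 0.40}

posterior = forward_step(belief, transition, likelihood)
best_place = max(posterior, key=posterior.get)
```

Here the lobby gets zero posterior mass because it is unreachable from the office in one step, so the transition structure alone decides the answer.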
Simple place recognition with SIFT: database
Comparing with SIFT: 74 matches
Comparing with SIFT: some correct matches
Comparing with SIFT: correctly, no matches
Comparing with SIFT
• No incorrect matches.
• Just one weak match (22 matches).
• With 9 locations provided, accuracy on the test set was 100%.
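The match-counting baseline on these slides can be sketched as follows. The slides do not say how matches were filtered, so the nearest/second-nearest distance ratio test, the `ratio` and `min_matches` values, and both function names are assumptions. The 2-D tuples are toy stand-ins for real 128-dimensional SIFT descriptors.

```python
import math


def count_ratio_matches(query, reference, ratio=0.8):
    """Count query descriptors that pass the distance ratio test.

    A query descriptor counts as a match when its nearest reference
    descriptor is clearly closer than the second nearest one.
    """
    if len(reference) < 2:
        return 0
    count = 0
    for q in query:
        dists = sorted(math.dist(q, r) for r in reference)
        if dists[0] < ratio * dists[1]:
            count += 1
    return count


def recognize_place(query, database, min_matches=2):
    """Return (place, match_count); place is None when too few matches."""
    best_place, best_count = None, 0
    for place, descriptors in database.items():
        n = count_ratio_matches(query, descriptors)
        if n > best_count:
            best_place, best_count = place, n
    if best_count < min_matches:
        return None, best_count
    return best_place, best_count


# Toy database: one descriptor set per known location (hypothetical values).
database = {
    "kitchen": [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)],
    "office": [(50.0, 50.0), (60.0, 50.0), (50.0, 60.0)],
}

known = recognize_place([(0.1, 0.0), (9.9, 0.1), (0.0, 10.2)], database)
unknown = recognize_place([(100.0, 100.0)], database)
```

Returning `None` below the threshold corresponds to the "no matches" case on the slides: a view of an unknown place is rejected rather than forced onto the closest known location.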
Scene categorization
• The paper claims to categorize 17 unseen scenarios.
• We have seen other scene categorization methods in the past that also worked well (with up to 13 classes):
  • Bag-of-words approaches (using textons, for instance).
  • Histogram-based approaches.
  • Torralba’s paper (using image frequencies).
• They average local features over the image with a sliding window.
  • In fact, this is just a sort of histogram approach (nothing new).
• The DB does not seem very generic. They do not compare with other methods.
• It performs poorly, except when the HMM is used:
Object presence and location
• Their own images speak for themselves ;)
• A file cabinet is expected to be seen in almost the entire image.
• Most of the objects that are strongly expected do not show up.
Object presence and location
• Their own images speak for themselves ;)
• Except for the building (and I am sure I could get something similar by averaging all the bounding boxes of buildings), all the others are wrong… even the sky.
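The bounding-box averaging baseline mentioned for the building case can be sketched like this. It is only an illustration of the idea: the function name, the box format (half-open pixel ranges `x0, y0, x1, y1`), and the toy boxes are assumptions, not data from the paper.

```python
def average_box_map(boxes, width, height):
    """Fraction of training boxes that cover each pixel.

    The result is a location prior that ignores image content entirely.
    Boxes are (x0, y0, x1, y1) with half-open ranges [x0, x1) x [y0, y1).
    """
    if not boxes:
        raise ValueError("need at least one bounding box")
    heat = [[0.0] * width for _ in range(height)]
    for x0, y0, x1, y1 in boxes:
        # Clip each box to the image before accumulating.
        for y in range(max(0, y0), min(height, y1)):
            for x in range(max(0, x0), min(width, x1)):
                heat[y][x] += 1.0
    n = len(boxes)
    return [[value / n for value in row] for row in heat]


# Hypothetical building annotations on a 4x4 image grid.
boxes = [(0, 0, 4, 2), (1, 0, 3, 3)]
prior = average_box_map(boxes, width=4, height=4)
```

If a map like `prior` already resembles the paper's predicted building locations, the contextual model adds little beyond the dataset's annotation statistics.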
Conclusions
• Place recognition:
  • It seems to be an easy problem that can be solved by simpler methods, without temporal information.
  • An HMM alone could have done similar work.
• Scene categorization:
  • Suspicious DB.
  • Only works because of the temporal information.
• Object presence and location:
  • It just does not work.