
Presented by Arshad Jamal, Rajesh Dhania, Vinkal Vishnoi

Active hashing and its application to image and text retrieval. Yi Zhen, Dit-Yan Yeung. Published in DMKD, Feb 2012.


Presentation Transcript


  1. Active hashing and its application to image and text retrieval
     Yi Zhen, Dit-Yan Yeung, published in DMKD, Feb 2012
     Presented by Arshad Jamal, Rajesh Dhania, Vinkal Vishnoi

  2. Introduction
     • Computing similarity plays a fundamental role
     • Hashing-based methods have gained popularity for large-scale similarity search
     • Taxonomy of similarity search methods (from the slide diagram):
       - Tree based: suitable for low dimensions
       - Hashing based: data independent or data dependent
       - Data-dependent hashing: unsupervised or semi-supervised
     • This paper proposes a novel framework for active hashing

  3. Related work
     • Locality Sensitive Hashing
       - Goal is to assign similar binary codes to data points that are close in feature space (random linear projection + thresholding)
       - Code length can become quite large
     • Spectral Hashing
       - Performs spectral decomposition to learn the hash functions
       - Assumes the data are uniformly distributed
     • Active Learning
       - Identifies the most informative unlabeled data and presents them to human experts for labeling
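The LSH recipe named on slide 3 (random linear projection followed by thresholding) can be sketched in a few lines. This is a minimal illustration, not the presenters' code: the function names `lsh_codes` and `hamming`, the Gaussian projection matrix, and the zero threshold are all assumptions.

```python
import numpy as np

def lsh_codes(X, n_bits, seed=0):
    """Return an (n_samples, n_bits) array of 0/1 codes via random projections."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    # Assumption: one Gaussian random direction per bit.
    W = rng.standard_normal((X.shape[1], n_bits))
    # Threshold each projection at zero to get one bit.
    return (X @ W > 0).astype(np.uint8)

def hamming(a, b):
    """Number of bit positions in which two codes differ."""
    return int(np.count_nonzero(a != b))

# Toy data: the second point lies in the same direction as the first,
# the third point lies in the opposite direction.
X = np.array([[1.0, 0.0], [2.0, 0.0], [-1.0, 0.0]])
codes = lsh_codes(X, n_bits=16)
print(hamming(codes[0], codes[1]), hamming(codes[0], codes[2]))
```

Points in the same direction share every bit, while opposite points differ in every bit. Because the directions are random rather than learned, many bits may be needed to separate nearby points, which is the code-length concern raised on the slide.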

  4. Related Work: Semi-supervised Hashing
     • Given N normalized data points of D dimensions
     • Learn K hash functions to generate a K-bit binary code
     • Build two sets of point pairs: S (similar) and D (dissimilar)
     • Together they characterize the semantic similarity
     • Hash functions are learned by maximizing an objective function (equation shown on the slide)
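The objective on slide 4 is not preserved in the transcript. As a hedged reconstruction, the empirical-fitness term of semi-supervised hashing in Wang et al. (2010a) has roughly the form below, assuming codes in {-1, +1} and linear hash functions; the calligraphic S and D denote the similar and dissimilar pair sets (D here is not the data dimension). The full SSH objective also adds a regularizer favoring balanced, uncorrelated bits, which is omitted here.

```latex
J(H) = \sum_{k=1}^{K} \left[
         \sum_{(x_i, x_j) \in \mathcal{S}} h_k(x_i)\, h_k(x_j)
         \;-\;
         \sum_{(x_i, x_j) \in \mathcal{D}} h_k(x_i)\, h_k(x_j)
       \right],
\qquad
h_k(x) = \operatorname{sgn}\!\left( w_k^{\top} x \right)
```

Each similar pair that receives the same bit adds 1, and each dissimilar pair that receives the same bit subtracts 1, so maximizing J rewards codes that agree on S and disagree on D.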

  5. Limitations of SSH
     • Point pairs from both the S and D sets are considered equally important
     • For multi-class data, dissimilar pairs drawn from a nearby class and from a distant class contribute the same weight
     • More dissimilar points will spoil the learned hash function
     • [Figure: three classes C1, C2, C3]

  6. Active Hashing (Greedy AH)
     • Tries to overcome the limitations of SSH by picking the most informative points
     • Algorithm: three main steps
     • Given labeled and unlabeled data points (L, U) and a candidate set C:
       select the most informative points A from C → get A labeled by an expert → update L, U, C → train the hash functions based on L and U
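The loop on slide 6 can be sketched as follows. This is only an outline under stated assumptions: `train`, `certainty` and `oracle` are hypothetical callables supplied by the caller (standing in for the hash-function learner, the informativeness score and the human expert), and the candidate set C is taken to be every unlabeled point.

```python
import numpy as np

def greedy_active_hashing(X, labels, train, certainty, oracle, rounds, m):
    """Repeat select -> label -> update -> retrain; return the model and labels."""
    labels = dict(labels)                      # L: index -> label
    model = train(X, labels)
    for _ in range(rounds):
        # Assumption: the candidate set C is every unlabeled point.
        candidates = [i for i in range(len(X)) if i not in labels]
        if not candidates:
            break
        f = certainty(model, X[candidates])
        order = np.argsort(f, kind="stable")[:m]
        picked = [candidates[j] for j in order]   # A: least certain candidates
        for i in picked:
            labels[i] = oracle(i)                 # expert labels A, updating L and U
        model = train(X, labels)                  # retrain on the updated sets
    return model, labels

# Toy stand-ins (hypothetical): a fixed projection as the "trained" model.
X = np.array([[2.0], [0.1], [-3.0], [-0.2]])
true_labels = [1, 1, 0, 0]
train = lambda X, labels: np.array([[1.0]])
certainty = lambda W, Xc: np.linalg.norm(Xc @ W, axis=1)
oracle = lambda i: true_labels[i]

model, labels = greedy_active_hashing(X, {0: 1}, train, certainty, oracle, rounds=1, m=2)
print(sorted(labels))
```

In the toy run, the two points closest to the decision boundary (indices 1 and 3) are the ones sent to the oracle.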

  7. Greedy AH: Selecting data points
     • Based on the hash function of the SSH model
     • Intuitively, the term inside the hash function (shown on the slide) indicates the certainty of x
     • Data certainty (DC) is denoted f
     • Data points with the smallest f are the most informative points
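The selection rule on slide 7 (pick the points with the smallest certainty f) can be illustrated as below. The slide's formula is not preserved in the transcript, so the definition used here, f(x) = ||Wᵀx||₂ for linear hash functions sgn(wₖᵀx), is an assumption; the names `data_certainty` and `select_least_certain` are hypothetical.

```python
import numpy as np

def data_certainty(X, W):
    """Certainty of each row of X: norm of its projections before thresholding.

    Assumption: f(x) = ||W^T x||_2, with one column of W per hash function.
    """
    return np.linalg.norm(np.asarray(X, dtype=float) @ W, axis=1)

def select_least_certain(X, W, m):
    """Indices of the m candidates with the smallest certainty."""
    f = data_certainty(X, W)
    return np.argsort(f, kind="stable")[:m]

W = np.array([[1.0, 0.0], [0.0, 1.0]])  # two toy hash directions
C = np.array([[3.0, 4.0], [0.1, 0.0], [1.0, 1.0], [0.0, -0.2]])
picked = select_least_certain(C, W, m=2)
print(picked.tolist())
```

Points whose projections sit close to zero could flip bits under a small change of W, which is why a small f marks a point as informative.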

  8. Batch mode Active Hashing
     • Selecting points one by one is inefficient and suboptimal
     • A set of points is selected and processed together to learn the hash functions
     • µ is an indicator vector marking which points of C are selected
     • f is a vector of normalized certainty values in C
     • K is a positive semi-definite similarity matrix defined on C
     • Choose the M examples with the largest µ
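The batch-mode objective on slide 8 is not preserved in the transcript, so the sketch below does not reproduce the paper's optimization over µ. It swaps in a simple greedy heuristic built from the same ingredients (the certainty vector f and the similarity matrix K) to show the trade-off being made: prefer uncertain points, but avoid picking points that are similar to each other. The function name `batch_select` and the weight `lam` are assumptions.

```python
import numpy as np

def batch_select(f, K, m, lam=1.0):
    """Greedily pick m indices with low certainty f and low mutual similarity K."""
    f = np.asarray(f, dtype=float)
    K = np.asarray(K, dtype=float)
    available = np.ones(len(f), dtype=bool)
    penalty = np.zeros_like(f)          # total similarity to points already chosen
    chosen = []
    for _ in range(m):
        # Unavailable points get an infinite score so they are never re-picked.
        score = np.where(available, f + lam * penalty, np.inf)
        i = int(np.argmin(score))
        chosen.append(i)
        available[i] = False
        penalty += K[:, i]
    return chosen

# Points 0 and 1 are both uncertain but nearly identical to each other.
f = [0.10, 0.12, 0.50, 0.30]
K = np.array([[1.0, 0.9, 0.0, 0.0],
              [0.9, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.1],
              [0.0, 0.0, 0.1, 1.0]])
print(batch_select(f, K, m=2, lam=1.0), batch_select(f, K, m=2, lam=0.0))
```

With `lam=0` the rule reduces to taking the least certain points, which here selects the near-duplicate pair; with a positive `lam` the second pick moves to a less redundant point.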

  9. Experimental evaluation-I
     • Image retrieval (MNIST data set): results reported for different parameter settings
     • Text retrieval (20 Newsgroups (NEWS) data set)
     • Random selection vs BMAH: BMAH improves performance

  10. Experimental evaluation-II
     • Image retrieval (MNIST data set)
     • BMAH vs GAH: BMAH takes less time

  11. References
     • Andoni A, Indyk P (2006) Near-optimal hashing algorithms for approximate nearest neighbor in high dimensions. In: Proceedings of the 47th annual IEEE symposium on foundations of computer science, FOCS ’06, IEEE Computer Society, Washington, pp 459–468
     • Weiss Y, Torralba A, Fergus R (2008) Spectral hashing. In: Koller D, Schuurmans D, Bengio Y, Bottou L (eds) Advances in neural information processing systems 21, NIPS 21, The MIT Press, Cambridge, MA, pp 1753–1760
     • Wang J, Kumar S, Chang S-F (2010a) Semi-supervised hashing for scalable image retrieval. In: Proceedings of IEEE conference on computer vision and pattern recognition, pp 3424–3431
     • Salakhutdinov R, Hinton GE (2009) Semantic hashing. Int J Approx Reason 50:969–978

     Thanks
