Presentation By: Salman Ahmad (270279)

A structured learning framework for content-based image indexing and visual query (Joo-Hwee, Jesse S. Jin). Motivation: to perform content-based image retrieval on non-specific images in a broad domain.

Presentation Transcript


  1. A structured learning framework for content-based image indexing and visual query (Joo-Hwee, Jesse S. Jin) • Presentation by: Salman Ahmad (270279)

  2. Introduction • Motivation • To perform content-based image retrieval on non-specific images in a broad domain.

  3. Literature Review • Semantic labeling approach by Town & Sinclair: labeling into 11 visual categories by an ANN. • Monotonic tree approach for classifying an image into semantic regions (8 in total). • Associating images with words, which does not scale to images with diverse content.

  4. Image Retrieval Cycle • [Diagram labels: User, Storage, Retrieval, Result, User images, Query term comparison with features, Feature extraction, Feature & Image storage]

  5. Semantic Gap • Semantic Extraction • Requires object recognition & scene understanding • Monotonic tree • Semantic Interpretation • Pre-query (manual annotation), query, post-query (relevance feedback)

  6. Semantic Gap • [Diagram labels: Image, Object recognition, User search term & requirement, Low level feature extraction]

  7. Structured learning for Image Indexing • Based on Semantic Support Regions (SSRs): salient image patches that exhibit semantic meaning. • SSRs are learned a priori and detected during image indexing. • No region segmentation step is required. • Images are indexed onto the classification space spanned by the semantic labels.

  8. Semantic Support Region (SSR) • Introduced to address the issue of high content diversity • Modular view-based object detectors • Generate a spatial semantic signature • Similarity-based and fuzzy-logic-based query processing. • Not restricted to the main area of attention in the image.

  9. Semantic Support Region (SSR) • [Figure: example patches for the 26 SSR labels] Face, Figure, Crowd, Skin, Clear, Cloudy, Blue, Floor, Green, Flower, Branch, Far, Rocky, Old, City, Far, Sand, Grass, Pool, Pond, River, Wall, Wooden, China, Fabric, Light

  10. Semantic Support Regions (SSR) • SSRs are segmentation-free image regions. • They have semantic meanings. • Detected from tessellated image blocks • Reconciled across multiple resolutions • Aggregated spatially

  11. SSR learning • Use of Support Vector Machines • Features employed • Color (YIQ) – 6 dimensions • Texture (Gabor coefficients) – 60 dimensions • 26 classes of SSR • 8 super classes (people (4), sky (3), ground (3), water (3), foliage (3), mountain (2), building (3), interior (5)) • Kernel – polynomial with degree 2 and a constant. • Total data for training & testing – 554 image regions from 138 images. • Training data – 375 image regions from 105 images. • Test data – 179 image regions.
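The kernel named on the slide above can be illustrated with a minimal sketch. It assumes the common form (x · y + c)^2 for a degree-2 polynomial kernel with a constant; the function name `poly2_kernel`, the default `c = 1.0`, and the placeholder feature values are hypothetical and not taken from the presentation.

```python
def poly2_kernel(x, y, c=1.0):
    """Degree-2 polynomial kernel with a constant term: (x . y + c) ** 2.

    The value of c is an assumption; the slides only say "a constant".
    """
    if len(x) != len(y):
        raise ValueError("feature vectors must have the same length")
    dot = sum(a * b for a, b in zip(x, y))
    return (dot + c) ** 2


# Hypothetical SSR feature vector: 6 color values followed by 60 texture values.
color = [0.5] * 6
texture = [0.1] * 60
z = color + texture  # 66 dimensions in total
```

An SVM trained with such a kernel evaluates it between the input feature vector and each support vector, so the 66-dimensional layout only has to be consistent between training and detection.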

  12. SSR Detection

  13. SSR Detection • Feature vectors zc and zt (for color & texture) • 3 color maps and 30 texture maps from Gabor coefficients. • Windows of different scales are used for scale invariance. • Each pixel consolidates the SSR classification vector Ti(z)
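To make the idea of scanning with windows of different scales concrete, here is a minimal sketch that enumerates window positions. The 240 x 360 image size and the 60, 50, 30 and 20 pixel window sizes appear elsewhere in the slides; the 40 pixel scale, the 10 pixel stride, and the names `scan_windows` and `SCALES` are assumptions made for illustration.

```python
def scan_windows(height, width, size, stride):
    """Yield the (top, left) corner of every size x size window that fits."""
    if size <= 0 or stride <= 0:
        raise ValueError("size and stride must be positive")
    for top in range(0, height - size + 1, stride):
        for left in range(0, width - size + 1, stride):
            yield top, left


# Window sizes from largest to smallest; 40 is assumed to sit between 50 and 30.
SCALES = [60, 50, 40, 30, 20]

# Number of windows per scale for a 240 x 360 image with a hypothetical stride of 10.
windows_per_scale = {
    size: sum(1 for _ in scan_windows(240, 360, size, 10)) for size in SCALES
}
```

Each window would be described by its color and texture features and passed to the SSR classifiers; that step is omitted here.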

  14. Multiscale Reconciliation • Objects are detected in different regions of the image • Fuses multiple SSRs detected at different image scales • Two detection maps are compared at a time (from 60 x 60 & 50 x 50 to 30 x 30 & 20 x 20) • The smallest scan windows consolidate the result
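The pairwise comparison of detection maps can be sketched as a fold over the scales. The slide does not state the fusion rule, so the rule used here (per pixel, keep the classification vector with the larger peak response) is an assumption chosen only to show the control flow; `reconcile_pair` and `reconcile_all` are hypothetical names.

```python
from functools import reduce


def reconcile_pair(map_a, map_b):
    """Fuse two detection maps of equal shape.

    Each map is a grid (list of rows) of per-pixel classification vectors.
    Assumed rule: keep the vector whose strongest class response is larger.
    """
    return [
        [va if max(va) >= max(vb) else vb for va, vb in zip(row_a, row_b)]
        for row_a, row_b in zip(map_a, map_b)
    ]


def reconcile_all(maps):
    """Fold a list of detection maps pairwise, in the order given."""
    return reduce(reconcile_pair, maps)
```

Passing the maps ordered from the largest window to the smallest mirrors the slide's progression from 60 x 60 & 50 x 50 down to 30 x 30 & 20 x 20.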

  15. Spatial Aggregation • Summarizes the reconciled detection map over larger spatial regions. • The Spatial Aggregation Map (SAM) allows variable emphasis (weights). • SAMs are invariant to image rotation & translation • SAMs are affected only slightly by changes in angle of view, changes of scale, and occlusion.
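As a rough illustration of summarizing a detection map over larger regions, the sketch below averages per-pixel classification vectors inside each cell of a grid. The 3 x 3 grid matches the tessellation mentioned later in the slides; plain averaging (with no per-region weights) and the name `spatial_aggregation` are assumptions.

```python
def spatial_aggregation(det_map, grid=3):
    """Average per-pixel class vectors inside each cell of a grid x grid layout.

    det_map is a list of rows, each a list of equal-length class vectors.
    Returns a grid x grid nested list of averaged class vectors.
    """
    rows, cols = len(det_map), len(det_map[0])
    n_classes = len(det_map[0][0])
    sam = [[[0.0] * n_classes for _ in range(grid)] for _ in range(grid)]
    counts = [[0] * grid for _ in range(grid)]
    for r in range(rows):
        for c in range(cols):
            br, bc = r * grid // rows, c * grid // cols  # cell containing the pixel
            counts[br][bc] += 1
            for k, value in enumerate(det_map[r][c]):
                sam[br][bc][k] += value
    for br in range(grid):
        for bc in range(grid):
            if counts[br][bc]:
                sam[br][bc] = [v / counts[br][bc] for v in sam[br][bc]]
    return sam
```

A weighted variant would multiply each cell by an emphasis factor before comparison, which is one way to read the "variable emphasis" bullet.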

  16. Spatial Aggregation Map

  17. Scalability • Modular nature • Independent training of binary detectors. • Parallel computation of feature maps. • Simultaneous detection of multiple SSRs • Concurrent spatial aggregation by different nodes in the SAM. • Retraining of SVMs when new SSRs are added.

  18. Query Methods • Low-level features • QBE (Query By Example) • QBC (Query By Canvas) • Semantic information • QBK (Query By Keywords) • QBS (Query By Sketches) • QBSI (Query By Spatial Icons)

  19. Query Formulation & Parsing • QBME (Query By Multiple Examples) • Image similarity is computed from the similarity between their tessellated blocks. • Larger blocks for similar semantics but different spatial arrangement. • Smaller blocks for spatial specificity. • City block distance provides the best performance.
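A minimal sketch of block-based similarity with the city block (L1) distance follows. The mapping from distance to similarity, the averaging over blocks, and taking the best match over the query examples are all assumptions made for illustration; the function names are hypothetical.

```python
def city_block(u, v):
    """City block (L1) distance between two equal-length vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))


def block_similarity(u, v):
    """Map L1 distance to [0, 1], assuming each vector sums to 1."""
    return 1.0 - city_block(u, v) / 2.0


def image_similarity(blocks_a, blocks_b):
    """Average similarity over corresponding tessellated blocks."""
    sims = [block_similarity(u, v) for u, v in zip(blocks_a, blocks_b)]
    return sum(sims) / len(sims)


def qbme_score(query_examples, candidate):
    """Score a candidate by its best match among the query examples (assumed)."""
    return max(image_similarity(q, candidate) for q in query_examples)
```

Using fewer, larger blocks makes the score tolerant of spatial rearrangement, while more, smaller blocks make it spatially specific, as the slide notes.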

  20. Query Formulation & Parsing • QBSI (Query By Spatial Icons) • Spatial arrangement of visual semantics • A visual query term (VQT) Q specifies a region R for SSR i. • VQTs can be chained. • Two-level is-a hierarchy of SSRs • Use of max for abstract visual semantics.
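The use of max for abstract semantics in a two-level is-a hierarchy can be sketched as follows. The grouping of labels under "sky" and "water" is inferred from the label and super-class lists on earlier slides, and averaging over the blocks of region R is an assumption; `SUPER_CLASSES`, `semantic_value` and `vqt_value` are hypothetical names.

```python
# Assumed fragment of the two-level is-a hierarchy (super class -> SSR labels).
SUPER_CLASSES = {
    "sky": ["clear", "cloudy", "blue"],
    "water": ["pool", "pond", "river"],
}


def semantic_value(block, label):
    """Value of a label in one block; a super class takes the max of its children."""
    if label in SUPER_CLASSES:
        return max(block.get(child, 0.0) for child in SUPER_CLASSES[label])
    return block.get(label, 0.0)


def vqt_value(index, label, region):
    """Evaluate one visual query term: label within region.

    index maps (row, col) blocks to {label: value}; region is a list of blocks.
    Averaging over the region is an assumption.
    """
    values = [semantic_value(index.get(cell, {}), label) for cell in region]
    return sum(values) / len(values)
```

For example, a term asking for "sky" in the top row would pass the three top-row block coordinates as `region`.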

  21. Query Formulation & Parsing • A disjunctive normal form of VQTs can be used (with or without negation). • Fuzzy operations to remove the uncertainty in values. • The vocabulary for QBSI is limited by the semantics • A graphical interface is provided for VQTs • Images are indexed with a 3 x 3 spatial tessellation and 26 SSRs.
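A query in disjunctive normal form over visual query terms can be evaluated with fuzzy operators, as in this minimal sketch. It assumes the standard choices (min for AND, max for OR, 1 - x for negation); the slides only say that fuzzy operations are used, so these operators, the query encoding, and the term names are assumptions.

```python
def eval_dnf(query, values):
    """Evaluate a DNF query with fuzzy operators.

    query: list of conjunctions; each conjunction is a list of
           (term_name, negated) pairs.
    values: dict mapping term_name to a truth value in [0, 1].
    """
    def literal(name, negated):
        v = values[name]
        return 1.0 - v if negated else v

    return max(
        min(literal(name, negated) for name, negated in conjunction)
        for conjunction in query
    )


# Hypothetical term values for one image.
values = {"sky_top": 0.8, "water_bottom": 0.3, "people_center": 0.6}

# (sky_top AND water_bottom) OR (people_center AND NOT sky_top)
query = [
    [("sky_top", False), ("water_bottom", False)],
    [("people_center", False), ("sky_top", True)],
]
score = eval_dnf(query, values)
```

Images would then be ranked by this score, highest first.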

  22. Experimental Results • Tested on consumer images • More challenging & complex • Diverse content • Faded, overexposed, blurred, dark • Different focuses, distances and occlusion. • 2400 heterogeneous photos of a single family taken over a span of 5 years • Indoor and outdoor settings • Resolution of 256 x 384 converted to 240 x 360 • No pre-selection of images.

  23. QBME Experiment • 24 semantic queries for 2400 images • Ground truth based on the opinions of 3 subjects • Comparison with a feature-based approach (CTO). • Best-performing parameters selected

  24. QBSI Experiment • 15 QBSI queries for 2400 photos • [Figure: query examples for QBSI]

  25. QBSI Experiment Results • [Figure: precision on top retrieved images for the QBSI experiment]
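Precision on the top retrieved images is conventionally computed as the fraction of the first k results that are relevant. The sketch below shows that calculation; the function name and the sample identifiers are hypothetical, and no values from the experiment are reproduced.

```python
def precision_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the top-k ranked items that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    top = ranked_ids[:k]
    hits = sum(1 for item in top if item in relevant_ids)
    return hits / k


# Hypothetical ranking and ground-truth set for one query.
ranking = ["img_a", "img_b", "img_c", "img_d"]
relevant = {"img_a", "img_c"}
p_at_2 = precision_at_k(ranking, relevant, 2)
```

Note that this sketch divides by k even when fewer than k items are returned, which is one common convention.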

  26. Advantages of QBSI • Explicit specification of visual semantics with combination • Better and more accurate expression than sketches and visual icons.

  27. Conclusion & Future Work • SSR allows image indexing based on local semantics without region segmentation • A unique and powerful query language. • Extendable to other domains like medical images.

  28. Questions
