
Automatic Image Annotation and Retrieval using Cross-Media Relevance Models

J. Jeon, V. Lavrenko and R. Manmatha, Center for Intelligent Information Retrieval, Computer Science Department, University of Massachusetts Amherst. 1096304144 鄭志毅.


Presentation Transcript


  1. Automatic Image Annotation and Retrieval using Cross-Media Relevance Models • J. Jeon, V. Lavrenko and R. Manmatha • Center for Intelligent Information Retrieval, Computer Science Department, University of Massachusetts Amherst • 1096304144 鄭志毅

  2. Introduction (1) • What is image retrieval? Given a database of images and a query string (e.g. words), which images are described by the words? • Query string: “jet”

  3. Introduction (2) • Query by example: QBIC (IBM), Photobook (MIT), VisualSEEk (Columbia)

  4. Introduction (3) • What is image annotation (1)? (Object recognition) • Each region has a word that describes it

  5. Introduction (4) • What is image annotation (2)? Given an image, what are the words that describe it? (Use a set of words to annotate the whole image)

  6. Outline • Preprocessing • Cross-Media Relevance Model • Experiment • Conclusions

  7. Preprocessing (1): Segmentation • Normalized cuts segmentation: local, region-based; b1, b2, … = vectors of image features, one per region • Grid segmentation: global, grid-based • Notation: xi = vector of image features; x = {x1, x2, …} = vector of feature vectors; wi = one word; w = {w1, w2, …} = vector of words

  8. Preprocessing (2): Feature extraction • Extract 30 features from each region [22] • Shape and position (6): area, x, y, boundary_len^2/area, convexity, moment of inertia • Color moments (12): mean RGB (3), RGB standard deviation (3), mean L*a*b (3), L*a*b standard deviation (3) • Texture (12): oriented energy in 30-degree increments • These 30-feature vectors are later clustered into blobs
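As an illustration of the color-moment part of the feature vector, here is a minimal sketch that computes the per-channel mean and standard deviation of a region's RGB pixels. The function name `color_moments`, the pixel representation (a list of RGB tuples) and the use of the population standard deviation are assumptions for this example; the slides do not say which estimator was used.

```python
from statistics import fmean, pstdev

def color_moments(pixels):
    """Return [mean R, mean G, mean B, std R, std G, std B] for one region.

    `pixels` is a list of (r, g, b) tuples. The population standard
    deviation is used here; that choice is an assumption.
    """
    channels = list(zip(*pixels))           # three tuples: all R, all G, all B
    means = [fmean(c) for c in channels]
    stdevs = [pstdev(c) for c in channels]
    return means + stdevs

# Toy region with two pixels (hypothetical values).
region = [(10, 20, 30), (20, 20, 50)]
features = color_moments(region)
```

The same pattern would apply to the L*a*b values after a color-space conversion, which is not shown here.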

  9. Preprocessing (3): Clustering into blobs • Use k-means to cluster the region feature vectors (k = 500) • This yields a map of clusters; each cluster in the map is called a “blob”, so every segment is assigned to one of 500 blobs
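The quantization step can be sketched as follows: a plain k-means over region feature vectors, after which each region is replaced by the index of its nearest centroid (its blob). This is a minimal pure-Python sketch, not the authors' implementation; the helper names, the random initialization, the fixed iteration count and the toy 2-dimensional data are all assumptions. The slides use 30-dimensional vectors and k = 500.

```python
import random

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(centroids, vec):
    """Index of the centroid closest to `vec`; this index is the blob id."""
    return min(range(len(centroids)),
               key=lambda i: squared_distance(centroids[i], vec))

def kmeans(vectors, k, iterations=20, seed=0):
    """Plain k-means; returns k centroids."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in vectors:
            clusters[nearest(centroids, v)].append(v)
        for i, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    return centroids

# Toy data: four regions with 2 features each (hypothetical values).
regions = [(0.0, 0.0), (0.2, 0.0), (5.0, 5.0), (5.2, 5.0)]
centroids = kmeans(regions, k=2)
blob_ids = [nearest(centroids, r) for r in regions]
```

After this step an image is no longer a set of real-valued vectors but a short list of discrete blob ids, which is what lets blobs be treated like words in the model that follows.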

  10. Preprocessing (4): Final representation • Each unannotated image is a set of blobs: I = {b1, b2, b3, …} • Each training-set image also has one to five keywords: J = {b1, b2, b3, …, bm; w1, w2, …, wn}, where words are counted by term frequency (tf)

  11. Cross-Media Relevance Models • Estimating the relevance model: the joint distribution of words and blobs • Find the probability P(w, b1, …, bm) of observing word w together with the image regions b1, …, bm (from information retrieval and language modeling: the relevance model) • Example: to annotate an image with blobs b1, b2, b3, b4, compute P(w | b1, b2, b3, b4) for candidate words such as grass, tiger, water, road • If the top three probabilities are for the words grass, water, tiger, then annotate the image with grass, water, tiger

  12. Relevance Models • Annotation • Joint distribution computed as an expectation over the training set J • Given J, the events are independent
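The two statements on this slide can be written out as a worked equation. This is a reconstruction from the bullets rather than a copy of the slide's formula: the symbol $\mathcal{T}$ for the training set is introduced here, and the prior $P(J)$ over training images is left unspecified because the slide does not state it.

```latex
\begin{align}
P(w, b_1, \dots, b_m)
  &= \sum_{J \in \mathcal{T}} P(J)\, P(w, b_1, \dots, b_m \mid J)
  && \text{(expectation over the training set)} \\
  &= \sum_{J \in \mathcal{T}} P(J)\, P(w \mid J) \prod_{i=1}^{m} P(b_i \mid J)
  && \text{(independence given } J\text{)}
\end{align}
```

For annotation, the conditional $P(w \mid b_1, \dots, b_m)$ is proportional to this joint probability, so words can be ranked by the joint directly.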

  13. Image Annotation • Compute P(w|I) for each word w • Probabilistic annotation: annotate the image with every word w in the vocabulary, together with its probability • Fixed-length annotation: take the top 3 or 4 words for every image and annotate the image with them
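A minimal sketch of this annotation step is given below. The slides do not say how P(w|J) and P(b|J) are estimated, so this example assumes relative frequencies in J interpolated with training-set-wide frequencies, a uniform prior P(J), and |J| counted as blobs plus words. The function names and the weights `alpha` and `beta` are hypothetical.

```python
from collections import Counter

def smoothed(count_in_j, size_j, count_in_t, size_t, weight):
    """Interpolate an image-level frequency with a training-set frequency."""
    return (1 - weight) * count_in_j / size_j + weight * count_in_t / size_t

def annotate(image_blobs, training, vocabulary, alpha=0.1, beta=0.9, top=3):
    """Return the `top` words with the highest P(w | image_blobs).

    `training` is a list of (blob Counter, word Counter) pairs, one per
    training image J. alpha and beta are assumed smoothing weights.
    """
    all_blobs, all_words = Counter(), Counter()
    for blobs, words in training:
        all_blobs.update(blobs)
        all_words.update(words)
    size_t = sum(all_blobs.values()) + sum(all_words.values())
    prior = 1.0 / len(training)  # uniform P(J): an assumption

    joint = {}
    for w in vocabulary:
        total = 0.0
        for blobs, words in training:
            size_j = sum(blobs.values()) + sum(words.values())
            p = prior * smoothed(words[w], size_j, all_words[w], size_t, alpha)
            for b in image_blobs:  # blobs are independent given J
                p *= smoothed(blobs[b], size_j, all_blobs[b], size_t, beta)
            total += p
        joint[w] = total

    norm = sum(joint.values())
    if norm == 0:  # e.g. the image contains a blob never seen in training
        return []
    ranked = sorted(joint.items(), key=lambda kv: kv[1], reverse=True)
    return [(w, p / norm) for w, p in ranked[:top]]

# Toy training set with two images (hypothetical blob ids and keywords).
training = [
    (Counter([1, 2]), Counter(["tiger", "grass"])),
    (Counter([3, 4]), Counter(["jet", "sky"])),
]
vocabulary = ["tiger", "grass", "jet", "sky"]
top_two = annotate([1, 2], training, vocabulary, top=2)
full = annotate([1, 2], training, vocabulary, top=len(vocabulary))
```

Calling `annotate` with `top=len(vocabulary)` gives the probabilistic annotation; a small `top` gives the fixed-length annotation.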

  14. Image Retrieval • Language modeling approach: given a query Q, the probability of drawing Q from image I is P(Q|I), the product of P(w|I) over the words w in Q • Alternatively, use the probabilistic annotation • Rank images according to this probability
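The ranking step can be sketched as a query-likelihood computation over precomputed annotation probabilities. This sketch assumes that the query words are drawn independently, so that P(Q|I) is the product of P(w|I) over the query words; the function names, the image ids and the probability values are hypothetical.

```python
import math

def query_likelihood(query_words, word_probs):
    """P(Q|I) as a product of per-word probabilities (independence assumed)."""
    return math.prod(word_probs.get(w, 0.0) for w in query_words)

def rank_images(query_words, annotations):
    """Return image ids sorted by decreasing P(Q|I).

    `annotations` maps image id -> {word: P(word | image)}.
    """
    scores = {img: query_likelihood(query_words, probs)
              for img, probs in annotations.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Toy probabilistic annotations for two images (hypothetical values).
annotations = {
    "img_a": {"tiger": 0.6, "grass": 0.3, "jet": 0.1},
    "img_b": {"tiger": 0.1, "grass": 0.2, "jet": 0.7},
}
ranking = rank_images(["tiger", "grass"], annotations)
```

For long queries, summing log probabilities instead of multiplying would avoid numerical underflow; the product is kept here to mirror the wording of the slide.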

  15. Experiment Dataset • [22] 5,000 images from 50 Corel Stock Photo CDs (4,500 in the training set, 500 in the test set) • Segmentation using normalized cuts, followed by quantization, ensures that there are 1-10 blobs for each image • Each image was also assigned 1-5 keywords • 371 words and 500 blobs in total

  16. Evaluation for a single word (comparison with two other image annotation models) • Nc = number of test images correctly predicted for the word • N = number of test images predicted to have the word • Nr = number of test images actually annotated with the word • precision = Nc / N, recall = Nc / Nr • Comparison of 3 models: the graph shows mean precision and recall of the 3 models for 70 one-word queries
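The per-word metrics defined on this slide can be computed as in the sketch below, using precision = Nc / N and recall = Nc / Nr. The function name, the dictionary-of-sets data layout and the toy annotations are assumptions for illustration; returning 0.0 when a denominator is zero is also a convention chosen here, not something the slide specifies.

```python
def precision_recall(predicted, actual, word):
    """Precision and recall of one word over a set of test images.

    `predicted` and `actual` map image id -> set of words.
    """
    retrieved = {i for i, ws in predicted.items() if word in ws}  # size N
    relevant = {i for i, ws in actual.items() if word in ws}      # size Nr
    n_c = len(retrieved & relevant)                               # Nc
    precision = n_c / len(retrieved) if retrieved else 0.0
    recall = n_c / len(relevant) if relevant else 0.0
    return precision, recall

# Toy test set with four images (hypothetical annotations).
predicted = {1: {"tiger"}, 2: {"tiger"}, 3: {"jet"}, 4: {"grass"}}
actual = {1: {"tiger"}, 2: {"grass"}, 3: {"jet"}, 4: {"tiger"}}
p, r = precision_recall(predicted, actual, "tiger")
```

Averaging these two numbers over all query words gives the mean precision and recall that the comparison graph reports.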

  17. Examples • Annotation examples (CMRM) • Retrieval examples (CMRM): top 4 images for the queries “tiger” and “pillar”

  18. Conclusions • Large amounts of labeled training and test data are needed • Better feature extraction or the use of continuous features will probably improve the results
