
Automation and Quality in Image Digital Libraries with Annotations

Automation and Quality in Image Digital Libraries with Annotations. Edward Fox, Uma Murthy and Ricardo Torres. Florence, Italy, 17 February 2007. Outline: Acknowledgements; Digital Libraries; Scenarios, Requirements; Superimposed Information; Content Based Information Retrieval; CBISC, SIERRA


Presentation Transcript


  1. Automation and Quality in Image Digital Libraries with Annotations • Edward Fox, Uma Murthy and Ricardo Torres • Florence, Italy, 17 February 2007

  2. Outline • Acknowledgements • Digital Libraries • Scenarios, Requirements • Superimposed Information • Content Based Information Retrieval • CBISC, SIERRA • Theory, Quality • References • Summary

  3. Acknowledgements: Students • Pavel Calado, Yuxin Chen, Fernando Das Neves, Shahrooz Feizabadi, Robert France, Marcos Gonçalves, Doug Gorton, Nithiwat Kampanya, Rohit Kelapure, S.H. Kim, Neill Kipp, Aaron Krowne, Bing Liu, Ming Luo, Roberto Marchesini, Paul Mather, Sudarshan Murthy, Uma Murthy, Sanghee Oh, Ananth Raghavan, Unni Ravindranathan, Ryan Richardson, Rao Shen, Ohm Sornil, Hussein Suleman, Ricardo da Silva Torres, Srinivas Vemuri, Wensi Xi, Seungwon Yang, Baoping Zhang, Qinwei Zhu, …

  4. Acknowledgements: Faculty, Staff • Lillian Cassel, Lois Delcambre, Debra Dudley, Roger Ehrich, Joanne Eustis, Weiguo Fan, James Flanagan, C. Lee Giles, Sandy Grant, Eric Hallerman, Eberhard Hilf, John Impagliazzo, Filip Jagodzinski, Douglas Knight, Deborah Knox, Alberto Laender, David Maier, Gail McMillan, Claudia Medeiros, Manuel Perez-Quinones, Jeff Pomerantz, Naren Ramakrishnan, Layne Watson, Barbara Wildemuth, …

  5. Other Collaborators (Selected) • Brazil: FUA, UFMG, UNICAMP • Case Western Reserve University • Emory, Notre Dame, Oregon State • Germany: Univ. Oldenburg • Mexico: UDLA (Puebla), Monterrey • College of NJ, Hofstra, Penn State, Villanova • Portland State University • University of Arizona, University of Florida, Univ. of Illinois, University of Virginia • VTLS (slides on digital repositories, NDLTD)

  6. Acknowledgements: Support ACM, Adobe, AOL, CAPES, CNI, CNPq, CONACyT, DFG, FAEPEX, FAPESP, IBM, IMLS, Microsoft, NASA, NDLTD, NLM, NSF (IIS-9986089, 0080748, 0086227, 0307867, 0325579, 0532825, 0535057, 0535060; ITR-0325579; DUE-0121679, 0121741, 0136690, 0333531, 0333601, 0435059), OCLC, SOLINET, SUN, SURA, UNESCO, US Dept. Ed. (FIPSE), VTLS, …

  7. Outline • Acknowledgements • Digital Libraries • Scenarios, Requirements • Superimposed Information • Content Based Information Retrieval • CBISC, SIERRA • Theory, Quality • References • Summary

  8. Digital Libraries --- Objectives • World Lit.: 24hr / 7day / from desktop • Integrated “super” information systems: 5S: Table of related areas and their coverage • Ubiquitous, Higher Quality, Lower Cost • Education, Knowledge Sharing, Discovery • Disintermediation -> Collaboration • Universities Reclaim Property • Interactive Courseware, Student Works • Scalable, Sustainable, Usable, Useful

  9. Alliteration • 5S: Societies (Users; Collaboration, Web 2.0) • Scenarios (Workflow, Stories; Services, Components) • Spaces: GIS • Structures: DBMS • Streams: DSMS • 3C: Content (Content Management Systems) • Context (Link Structure, NLP, Mental models) • Criticism, commentary (Annotation, Talmud; Cataloging, indexing; Abstracting; Summarizing; Secondary literature)

  10. Outline • Acknowledgements • Digital Libraries • Scenarios, Requirements • Superimposed Information • Content Based Information Retrieval • CBISC, SIERRA • Theory, Quality • References • Summary

  11. Consider this scenario 1. Ingrid is a graduate student in the Fisheries department doing research on freshwater fish. 2. In a field visit, she finds a unique-looking fish and wants to know more. 3. She wants to search the dept. DB for related information based on others’ observations. She also wants to enter new information about the fish into the DB. Source: http://umd.edu/

  12. EKEY: The electronic key for identifying freshwater fishes

  13. Next, Ingrid works on an assignment to gain familiarity with the capabilities of a new Biodiversity Information System. She is required to make the system help her with her complex integrated information need: • “Retrieve fish descriptions of all fish whose shape is similar to that shown in the figure below, which belong to genus “Notropis”, which have “large eyes” and “dorsal stripe”, and have been observed within the catchments of the “Tennessee” river.”

  14. Here is another scenario … • An archeologist wants to write commentaries on artifacts discovered in the field • Using an Archeology digital library in his study, he wants to be able to: • Manually annotate images (and parts) • Search for images (and parts), and annotations • Automatically annotate/tag similar images (and parts) • Share annotations and images Sources: http://www.bewegende-plaatjes.net, http://www.dorsetforyou.com, http://www.archaeology.org

  15. Functionality required • Digital Library (DL) users need, but get little, assistance with these tasks: • Selecting and annotating images and parts of images • Preserving the original context of information • Manual and automated annotation • Content-based image retrieval of images and parts of images (+ GIS + metadata + text …), with machine learning of the proper set of descriptors • Sharing selections and annotations

  16. New Microsoft Research grant • Virginia Tech and UNICAMP (Brazil) • Fisheries & Wildlife, Computer Science • Tablet PCs: Content-Based Image Retrieval + Superimposed Information

  17. Outline • Acknowledgements • Digital Libraries • Scenarios, Requirements • Superimposed Information • Content Based Information Retrieval • CBISC, SIERRA • Theory, Quality • References • Summary

  18. Superimposed information (SI) • New interpretation of existing information • New content, new structures • Focuses on • Information at sub-document granularity • Information from heterogeneous sources (multimedia content) • Working with information in situ

  19. Origin of SI • This basic need had been addressed in diverse ways, with varying degrees of success, for many years: • concordances, annotations, comments • bookmarks, concept maps, digital annotations, … • The term “SI” was coined in 1999 by researchers, currently collaborating with us, now at Portland State University • Lois Delcambre • David Maier

  20. Layers in an SI system (Source: ICDE04 presentation by Murthy et al.)

  21. Benefits • Specificity of reference • Flexibility • Identifying interesting (parts of) objects • Making connections between selections • Managing collections of selections • References sub-document information • Preservation of context • Facilitates easy sharing of information
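The benefits listed above (specific references, sub-document granularity, preserved context) can be illustrated with a small data structure. This is a minimal sketch only: the `Mark` and `Annotation` classes, their fields, and the file names are hypothetical and are not taken from any SI system named in this talk.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class Mark:
    """Hypothetical reference to a region of a base image, left in situ."""
    source_uri: str                    # where the base document lives
    region: Tuple[int, int, int, int]  # x, y, width, height in pixels


@dataclass
class Annotation:
    """Superimposed content attached to one or more marks."""
    text: str
    marks: List[Mark] = field(default_factory=list)


def annotations_for(source_uri: str, annotations: List[Annotation]) -> List[Annotation]:
    """Return the annotations that reference any part of the given source."""
    return [a for a in annotations
            if any(m.source_uri == source_uri for m in a.marks)]


# Illustrative data: two regions of one image and one region of another.
fin = Mark("images/fish-001.jpg", (40, 10, 60, 30))
eye = Mark("images/fish-001.jpg", (12, 22, 8, 8))
other = Mark("images/fish-002.jpg", (0, 0, 50, 50))
notes = [
    Annotation("dorsal stripe visible", [fin]),
    Annotation("large eyes", [eye]),
    Annotation("compare fin shapes", [fin, other]),
]
hits = annotations_for("images/fish-002.jpg", notes)
```

Because a mark stores only an address into the base image, the original stays untouched and several annotations can share or connect the same selection.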

  22. Superimposed Applications • SIMPEL: A SuperImposed Multimedia Presentation Editor and pLayer • Enhanced CMapTools

  23. Combining CBIR and SI • Associate images and parts of images, with related information such as annotations, hyperlinks, metadata records, etc. • Perform CBIR on images and parts of images that have been annotated • Combine text- (on annotations and other associated text information) and content-based (image content) search for more effective retrieval of images and parts of images
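One simple way to combine text-based and content-based search is a weighted sum of the two scores per image. The sketch below assumes both scores are already normalized to [0, 1]. The function name, the weight, and the sample scores are illustrative assumptions, not the method used in the systems described here.

```python
from typing import Dict


def fuse_scores(text_scores: Dict[str, float],
                content_scores: Dict[str, float],
                text_weight: float = 0.5) -> Dict[str, float]:
    """Weighted sum of two score maps; a missing score counts as 0.0."""
    if not 0.0 <= text_weight <= 1.0:
        raise ValueError("text_weight must be in [0, 1]")
    ids = set(text_scores) | set(content_scores)
    return {
        i: text_weight * text_scores.get(i, 0.0)
        + (1.0 - text_weight) * content_scores.get(i, 0.0)
        for i in ids
    }


# Illustrative scores: img2 matches moderately on both, img1 only on text.
text_scores = {"img1": 0.9, "img2": 0.2}
content_scores = {"img2": 0.8, "img3": 0.6}
fused = fuse_scores(text_scores, content_scores, text_weight=0.5)
ranking = sorted(fused, key=fused.get, reverse=True)
```

With equal weights, an image supported by both evidence sources can outrank one that scores highly on only one of them.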

  24. Outline • Acknowledgements • Digital Libraries • Scenarios, Requirements • Superimposed Information • Content Based Information Retrieval • CBISC, SIERRA • Theory, Quality • References • Summary

  25. Content-Based Image Retrieval (CBIR) • Retrieve images similar to a user-defined specification or pattern (e.g., shape sketch, image example) • Goal: To support image retrieval based on content properties (e.g., shape, color or texture), usually encoded into feature vectors

  26. Textual information retrieval Query on Google using Sunset and Rio de Janeiro Query result

  27. Content Based Information Retrieval

  28. Effective Image Description + Feature Extraction • R, G, B channels → Feature Vector [0.98, 0.91, 0.73, ……]

  29. Image descriptors • Image Descriptor

  30. Example: Histogram • Frequency count of each individual color • Most commonly used color feature representation Corresponding histogram Image Source: Andrade, D.
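A color histogram as described on this slide can be sketched in a few lines. The version below quantizes each RGB channel into a small number of bins and normalizes the counts, so images of different sizes are comparable. The bin count and the sample pixels are assumptions chosen for illustration.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each in 0..255


def color_histogram(pixels: Sequence[Pixel], bins_per_channel: int = 2) -> List[float]:
    """Normalized frequency of each quantized RGB color."""
    counts = [0] * (bins_per_channel ** 3)
    for r, g, b in pixels:
        # Map each channel value 0..255 to a bin index 0..bins_per_channel-1.
        rb = r * bins_per_channel // 256
        gb = g * bins_per_channel // 256
        bb = b * bins_per_channel // 256
        counts[(rb * bins_per_channel + gb) * bins_per_channel + bb] += 1
    total = len(pixels)
    if total == 0:
        return [0.0] * len(counts)
    return [c / total for c in counts]


# Two reddish pixels, one blue, one black.
pixels = [(255, 0, 0), (250, 10, 5), (0, 0, 255), (0, 0, 0)]
hist = color_histogram(pixels)
```

The resulting list is one possible feature vector for the image: half of the mass falls in the "red" bin, and a quarter each in the "blue" and "black" bins.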

  31. Texture Descriptors

  32. Contour Saliences

  33. Contour Segment Saliences

  34. Multiscale Fractal Dimension • Complex geometric shapes • Defined by simple algorithms • Non integer dimension • Invariant under scaling
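To make the idea of a fractal dimension concrete, the sketch below estimates a box-counting dimension: count the grid boxes of side s that touch the shape, then fit the slope of log N(s) against log(1/s). This is a simpler, swapped-in technique used only for illustration. It is not the multiscale fractal dimension descriptor from the talk, and all names and test shapes are assumptions.

```python
import math
from typing import List, Set, Tuple

Point = Tuple[int, int]


def box_count(points: Set[Point], box_size: int) -> int:
    """Number of box_size x box_size grid cells containing at least one point."""
    return len({(x // box_size, y // box_size) for x, y in points})


def box_counting_dimension(points: Set[Point], box_sizes: List[int]) -> float:
    """Least-squares slope of log N(s) against log(1/s)."""
    xs = [math.log(1.0 / s) for s in box_sizes]
    ys = [math.log(box_count(points, s)) for s in box_sizes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


# Sanity checks on non-fractal shapes: a line segment and a filled square.
size = 64
line = {(x, 0) for x in range(size)}
square = {(x, y) for x in range(size) for y in range(size)}
sizes = [1, 2, 4, 8, 16]
line_dim = box_counting_dimension(line, sizes)
square_dim = box_counting_dimension(square, sizes)
```

A line and a filled square give slopes of 1 and 2. An irregular contour would fall between these values, which is the non-integer dimension the slide refers to.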

  35. Multiscale Fractal Dimension (Experiments)

  36. Tensor Scale Descriptor • Introduced by Punam et al. in 2003. • For a pixel p, it is the largest ellipse centered at p within the same homogeneous region. • It extracts local structure information (thickness, orientation, and anisotropy).

  37. Tensor Scale Image (figure; orientation labels: 90°, 180°)

  38. Tensor Scale Image

  39. Tensor Scale Descriptor

  40. Tensor Scale Descriptor

  41. A typical CBIR system • Interface: Data Insertion, Query Specification, Visualization • Query-processing Module: Feature Vector Extraction, Similarity Computation, Ranking • Image Database: Images, Feature Vectors • Input: Query Pattern • Output: Similar Images
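The query-processing steps named on this slide (feature vector extraction, similarity computation, ranking) can be sketched as follows, assuming feature vectors have already been extracted and stored. Euclidean distance is used here as one common choice; the talk does not state which distance is used. The database contents are made up for illustration.

```python
import math
from typing import Dict, List, Sequence, Tuple


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Distance between two feature vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_similar(query_vector: Sequence[float],
                 database: Dict[str, Sequence[float]],
                 top_k: int = 3) -> List[Tuple[str, float]]:
    """Return the top_k (image_id, distance) pairs, nearest first."""
    distances = [(image_id, euclidean(query_vector, vec))
                 for image_id, vec in database.items()]
    distances.sort(key=lambda pair: pair[1])
    return distances[:top_k]


# Hypothetical image database: image id -> stored feature vector.
database = {
    "a": [0.98, 0.91, 0.73],
    "b": [0.10, 0.20, 0.30],
    "c": [0.90, 0.85, 0.70],
}
results = rank_similar([1.0, 0.9, 0.7], database, top_k=2)
result_ids = [image_id for image_id, _ in results]
```

A linear scan like this is enough for small collections. Larger image databases would typically need an index structure, which is outside the scope of this sketch.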

  42. Outline • Acknowledgements • Digital Libraries • Scenarios, Requirements • Superimposed Information • Content Based Information Retrieval • CBISC, SIERRA • Theory, Quality • References • Summary

  43. CBISC • An OAI-compliant component that supports queries on image collections using content-based image retrieval • May be customized to support different image collections
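Since CBISC is described as OAI-compliant, a client would presumably talk to it through HTTP requests carrying a `verb` argument, as OAI-PMH requests do. The sketch below only builds such request URLs. The verb names `ListDescriptors` and `GetImages` appear on the architecture slide of this talk, but the endpoint and the `descriptor` and `query_image` parameters are hypothetical placeholders, not the component's documented interface.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

BASE_URL = "http://example.org/cbisc"  # hypothetical endpoint


def build_request(verb: str, **params: str) -> str:
    """Build a request URL with the verb first, followed by extra arguments."""
    query = urlencode({"verb": verb, **params})
    return f"{BASE_URL}?{query}"


# Ask which descriptors the component offers.
list_url = build_request("ListDescriptors")

# Ask for images similar to a query image (parameter names are assumptions).
images_url = build_request(
    "GetImages",
    descriptor="color_histogram",
    query_image="fish-001.jpg",
)
parsed = parse_qs(urlsplit(images_url).query)
```

Keeping request construction in one helper makes it easy to point the same client at a differently customized image collection by changing only the base URL.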

  44. CBISC in ETANA

  45. CBISC Descriptor Training

  46. Mediator System’s Architecture • Interface • Data Insertion Module • Query Processing Module • Databases (DBMS, GIS): Geo. DB, Image DB, Metadata

  47. Interface: Query Specification, Visualization • Query Mediator: Analysis, Merging, Execution • Geographic Data Search Component (GDSC): Web Feature Server (WFS); Eco Collection, Geo Collection (Taxonomic Trees, Maps, Metadata); HTTP Requests (GetCapabilities, GetFeatureType, GetFeature) • OAI Content-Based Image Search Component (CBISC): BIS Manager; Image Collections (Images, Image Descriptors, Image Metadata); HTTP Requests (ListDescriptors, GetImages) • Metadata-Based Search Component (ESSEX): HTTP Request (keywords)
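The mediator's role can be sketched as: send one sub-query to each search component, then merge the answers. The version below merges by set intersection, which suits a conjunctive query such as Ingrid's (similar shape AND genus AND river catchment). The stub components, their names, and the returned ids are hypothetical stand-ins for CBISC, ESSEX, and GDSC. The real merging strategy is not described in these slides.

```python
from typing import Callable, Dict, List, Set

SearchComponent = Callable[[str], Set[str]]


def mediate(sub_queries: Dict[str, str],
            components: Dict[str, SearchComponent]) -> List[str]:
    """Run each sub-query on its component and intersect the result sets."""
    result_sets = []
    for name, sub_query in sub_queries.items():
        if name not in components:
            raise KeyError(f"no component registered for {name!r}")
        result_sets.append(components[name](sub_query))
    if not result_sets:
        return []
    return sorted(set.intersection(*result_sets))


# Stub components that ignore the query text and return fixed ids.
components: Dict[str, SearchComponent] = {
    "image": lambda q: {"fish1", "fish2", "fish3"},
    "metadata": lambda q: {"fish2", "fish3", "fish4"},
    "geographic": lambda q: {"fish2", "fish3", "fish9"},
}
sub_queries = {
    "image": "shape similar to sketch",
    "metadata": "genus Notropis",
    "geographic": "Tennessee river catchments",
}
merged = mediate(sub_queries, components)
```

Intersection keeps only records that every component agrees on. A ranked merge of per-component scores would be an alternative when partial matches should still be shown.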
