The Multimedia Semantic Web

Bill Grosky • Multimedia Information Systems Laboratory • University of Michigan-Dearborn • Dearborn, Michigan

Contents • Introduction • CBR – Where are we? • Multimedia annotation • Context-rich environments • Semantic web • Our work • Anglograms • Finding latent semantics • Using text for improved image search • Using images for improved text search • Web page structure • A cross-modal theory of linked document semantics

CBR – Where are We? • Development of feature-based techniques for content-based retrieval is a mature area, at least for images • CBR researchers should now concentrate on extracting semantics from multimedia documents so that retrievals using concept-based queries can be tailored to individual users • The semantic gap • (Semi)-automated multimedia annotation

Multimedia Annotation • Multimedia annotations should be semantically rich • Multiple semantics • A social theory based on how multimedia information is used • This can be discovered by placing multimedia information in a natural, context-rich environment

Context-Rich Environments • Structural context – Author’s contribution • Document’s author places semantically similar pieces of information close to each other • User can cluster together semantically similar pieces of information • Dynamic context – User’s contribution • Short browsing sub-paths are semantically coherent

Context-Rich Environments • The Web is a perfect example of a context-rich environment • Develop multimedia annotations through cross-modal techniques • Audio • Images • Text • Video

Semantic Web • This program overlaps another very important current research topic, the semantic web • Web page annotations are the backbone of this research effort • We have something very important to offer to this area • Multimedia documents • Deriving multiple semantics for a single document • Combining our efforts will enrich both communities

Semantic Web • “The Semantic Web is a new initiative to transform the web into a structure that supports more intelligent querying and browsing, both by machines and by humans. This transformation is to be supported through the generation and use of metadata constructed via web annotation tools using user-defined ontologies that can be related to one another.” Somewhere on the web

Semantic Web • [Figure: semantic web architecture, based on www.semanticweb.org. Components shown: end user, community portal, agents, inference engine, ontology articulation toolkit, ontology construction tool, ontologies, web-page annotation tool, annotated web pages, metadata repository]

Semantic Web • Plan a vacation within the next month • Bill instructed his semantic web agent through his handheld browser. • An agent retrieved Bill’s vacation profile from his travel agent, retrieved Bill’s availability from his calendar, checked availability of airlines, hotels and restaurants, and made all the necessary arrangements.

Semantic Web • Multimedia semantic web • Plan a vacation close to where [image] is being exhibited.

Anglograms • Image object • Entire image • Some meaningful portion of an image • semcon • Point-based features • corner points • color histograms

Anglograms Point feature map for shape

Anglograms Point feature map for color

Anglograms Voronoi diagram of n = 18 sites

Anglograms Delaunay triangulation of n = 18 sites (the dual graph of the Voronoi diagram)

Anglograms • Delaunay triangulation of a set of n points • O(n log n) algorithm • Invariance of Delaunay triangles of a set of points to • translation • rotation • scaling

Anglograms • Spatial layout of point set • Anglogram • Computed by discretizing and counting the angles of the Delaunay triangles • Which angles are counted? • O(max(n, #bins)) algorithm • What is bin size?
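
The counting step above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes the Delaunay triangles are already available as index triples, and the function names, the `which` switch (all three angles versus the two largest) and the 10-degree bin size are hypothetical choices standing in for the two open questions on the slide.

```python
import math

def triangle_angles(p, q, r):
    """Return the three interior angles (in degrees) of triangle pqr."""
    def angle_at(a, b, c):
        # Angle at vertex a between edges a->b and a->c.
        v1 = (b[0] - a[0], b[1] - a[1])
        v2 = (c[0] - a[0], c[1] - a[1])
        dot = v1[0] * v2[0] + v1[1] * v2[1]
        cross = v1[0] * v2[1] - v1[1] * v2[0]
        return math.degrees(math.atan2(abs(cross), dot))
    return [angle_at(p, q, r), angle_at(q, r, p), angle_at(r, p, q)]

def anglogram(points, triangles, bin_size=10.0, which="all"):
    """Histogram of triangle angles; bin_size and which are assumed parameters."""
    n_bins = math.ceil(180 / bin_size)
    hist = [0] * n_bins
    for i, j, k in triangles:
        angles = sorted(triangle_angles(points[i], points[j], points[k]))
        if which == "two_largest":
            angles = angles[1:]
        for a in angles:
            hist[min(int(a // bin_size), n_bins - 1)] += 1
    return hist

def similarity_transform(points, angle_deg, scale, shift):
    """Rotate, scale and translate a point set (used to check invariance)."""
    t = math.radians(angle_deg)
    c, s = math.cos(t), math.sin(t)
    return [(scale * (c * x - s * y) + shift[0],
             scale * (s * x + c * y) + shift[1]) for x, y in points]

# Four points and their two Delaunay triangles, written out by hand.
points = [(0, 0), (4, 0), (1, 3), (5, 3)]
triangles = [(0, 1, 2), (1, 3, 2)]

hist = anglogram(points, triangles)
moved = similarity_transform(points, angle_deg=30, scale=2.5, shift=(7, -3))
hist_moved = anglogram(moved, triangles)
```

Because rotation, translation and uniform scaling leave the triangle angles unchanged, `hist` and `hist_moved` agree, provided no angle sits right on a bin boundary.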

A set of 26 points Delaunay triangulations of the point set and its two transformed variants

Anglograms • Computation of color anglogram of an image • Divide image evenly into a number of M*N non-overlapping blocks • Each individual block is abstracted as a unique feature point labeled with its spatial location and dominant colors

Anglograms • Computation of color anglogram of an image • Point feature map • Normalized feature points, after adjusting any two neighboring feature points to a fixed distance • Construct Delaunay triangulation for each set of feature points labeled with identical color

Anglograms • Computation of color anglogram of an image • Compute anglogram based on each Delaunay triangulation • Color anglogram for image • Concatenating all the anglograms together
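
The first step of this procedure, abstracting each block as a labeled feature point, can be sketched as follows. This is a simplified illustration under stated assumptions: the image is a 2D list of already quantized color labels for a single channel (for example hue), one dominant label is kept per block, and the function name `point_feature_map` is hypothetical. The normalization and per-color Delaunay triangulation steps are not shown.

```python
from collections import Counter

def point_feature_map(image, rows, cols):
    """Group block centers by the dominant color label of each block.

    image: 2D list of quantized color labels (assumed input format).
    Returns a dict mapping color label -> list of (x, y) block centers.
    """
    height, width = len(image), len(image[0])
    block_h, block_w = height // rows, width // cols
    features = {}
    for r in range(rows):
        for c in range(cols):
            block = [image[y][x]
                     for y in range(r * block_h, (r + 1) * block_h)
                     for x in range(c * block_w, (c + 1) * block_w)]
            dominant, _ = Counter(block).most_common(1)[0]
            center = ((c + 0.5) * block_w, (r + 0.5) * block_h)
            features.setdefault(dominant, []).append(center)
    return features

# A 4x4 toy image divided into 2x2 blocks.
image = [
    [1, 1, 2, 2],
    [1, 3, 2, 2],
    [5, 5, 2, 7],
    [5, 5, 2, 2],
]
features = point_feature_map(image, rows=2, cols=2)
```

Each list in `features` is one point set; a Delaunay triangulation and an anglogram would then be computed per color label and the resulting histograms concatenated.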

Anglograms Pyramid image

Anglograms Hue component

Anglograms Saturation component

Anglograms Point feature map

Anglograms Feature points of hue 2

Anglograms Delaunay triangulation of hue 2

Anglograms Delaunay triangulation of saturation 5

Anglograms Anglogram of saturation 5

Finding Latent Semantics • We want to transform low-level features to a higher level of meaning • Used for dimension reduction in QBIC • Searching in high-dimensional spaces • More importantly, it creates clusters of co-occurring features • So-called concepts

Finding Latent Semantics • Latent Semantic Analysis (LSA) was introduced to overcome a fundamental problem in textual information retrieval • Users want to retrieve on the basis of conceptual content • Individual words provide unreliable evidence about conceptual meanings • Synonymy • Many ways to refer to the same object • Polysemy • Most words have more than one distinct meaning

Finding Latent Semantics • Searching for documents concerning automobiles • Users tend to use the key-word automobile • A statistical analysis determines that the key-words automobile and car tend to co-occur • LSA will therefore also retrieve documents in which the key-word car appears but the key-word automobile does not

Finding Latent Semantics • Term-document association • It is assumed that there exists some underlying latent semantic structure in the data that is partially obscured by the randomness of term choice • By semantic structure we mean the correlation structure in which individual terms appear in documents • Semantic implies only the fact that terms in a document may be taken as referents to the document itself or to its topic • Statistical techniques are used to estimate this latent semantic structure, and to get rid of obscuring noise

Finding Latent Semantics • Singular-value decomposition (SVD) • Take a large matrix of term-document association • Construct a semantic space wherein terms and documents that are closely associated are placed near to each other • SVD allows the arrangement of the space to reflect the major associative patterns and ignore smaller, less important influences • As a result, terms that did not actually appear in a document may still end up close to the document, if that is consistent with the major patterns of association • Position in the space serves as the semantic indexing • Retrieval proceeds by using the terms in a query to identify a point in the semantic space, and documents in its neighborhood are returned as relevant results

Finding Latent Semantics • Term-document matrix • d documents • t terms • Represented by a t × d term-document matrix A • Each document is represented by a column • document vector • Each term is represented by a row • term vector
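
As a small illustration of this layout, the sketch below builds a t × d count matrix from a toy corpus. The corpus, the whitespace tokenization and the raw-count weighting are assumptions made for the example; the slides do not specify how the entries of A are weighted.

```python
def term_document_matrix(documents):
    """Return (terms, A) where A[i][j] counts term i in document j."""
    tokenized = [doc.lower().split() for doc in documents]
    terms = sorted({word for doc in tokenized for word in doc})
    row_of = {term: i for i, term in enumerate(terms)}
    # t rows (terms) by d columns (documents), initialised to zero.
    A = [[0] * len(documents) for _ in terms]
    for j, doc in enumerate(tokenized):
        for word in doc:
            A[row_of[word]][j] += 1
    return terms, A

documents = [
    "car engine repair",
    "automobile engine",
    "flower petal flower",
]
terms, A = term_document_matrix(documents)

# Row i is the term vector of terms[i]; column j is the document vector.
document_vector_1 = [row[1] for row in A]
```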

Finding Latent Semantics • SVD is a dimension reduction technique • Reduced-rank approximation to both column space and row space • Find a rank-k approximation to matrix A with minimal change to that matrix for a given value of k • This decomposition exists for any matrix A

Finding Latent Semantics • SVD of a term-document matrix A • A = U Σ V^T • A is t × d • U is a t × r matrix with orthonormal columns, where r is rank(A) • The columns of U are a basis for the column space of A • The columns of U are eigenvectors of the matrix A A^T (for its nonzero eigenvalues) • Σ is an r × r diagonal matrix having the singular values σ1 ≥ σ2 ≥ … ≥ σr of A in order along its diagonal • Σ² is the diagonal matrix of the nonzero eigenvalues of A A^T or A^T A • V^T is an r × d matrix with orthonormal rows • The rows of V^T are a basis for the row space of A • The columns of V are eigenvectors of the matrix A^T A
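
These properties can be checked numerically. The sketch below assumes NumPy is available and uses a small made-up matrix; `numpy.linalg.svd` returns min(t, d) singular values, so the factors are trimmed to the numerical rank r to match the t × r, r × r and r × d shapes. The tolerance `1e-10` is an arbitrary choice for the example.

```python
import numpy as np

# A toy 4 x 3 term-document matrix (t = 4 terms, d = 3 documents).
A = np.array([
    [1, 0, 1],
    [0, 1, 1],
    [1, 1, 1],
    [0, 0, 1],
], dtype=float)

U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Keep only the r nonzero singular values and their vectors.
r = int(np.sum(s > 1e-10))
U, s, Vt = U[:, :r], s[:r], Vt[:r, :]

# A = U Sigma V^T
reconstructed = U @ np.diag(s) @ Vt

# Squared singular values are the nonzero eigenvalues of A^T A.
eigenvalues = np.sort(np.linalg.eigvalsh(A.T @ A))[::-1][:r]

print("rank:", r)
print("singular values:", s)
```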

Finding Latent Semantics • Dimensions: A (t × d) = U (t × r) Σ (r × r) V^T (r × d)

Finding Latent Semantics • A special rank-k approximation, A_k • A_k = U_k Σ_k V_k^T • U_k • First k columns of U • Σ_k • First k diagonal values of Σ • V_k^T • First k rows of V^T
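
The rank-k truncation and the automobile/car retrieval example from the earlier slide can be combined in one sketch. Everything concrete here is an assumption made for illustration: the five-term toy matrix, k = 2, the projection of the query as U_k^T q, and cosine similarity as the score. NumPy is assumed to be available.

```python
import numpy as np

terms = ["automobile", "car", "engine", "flower", "petal"]
# Columns are documents; document 1 contains "car" but not "automobile".
A = np.array([
    [1, 0, 1, 0, 0],
    [0, 1, 1, 0, 0],
    [1, 1, 1, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
], dtype=float)

def rank_k_factors(A, k):
    """Return U_k, the first k singular values, and V_k^T."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k], s[:k], Vt[:k, :]

def cosine_scores(query, docs):
    """Cosine similarity between a query vector and each column of docs."""
    denom = np.linalg.norm(query) * np.linalg.norm(docs, axis=0)
    denom[denom == 0] = 1.0  # avoid division by zero for empty vectors
    return (query @ docs) / denom

k = 2
Uk, sk, Vtk = rank_k_factors(A, k)
A_k = Uk @ np.diag(sk) @ Vtk  # rank-k approximation of A

# Query containing only the term "automobile".
q = np.zeros(len(terms))
q[terms.index("automobile")] = 1.0

raw_scores = cosine_scores(q, A)                            # plain term matching
lsa_scores = cosine_scores(Uk.T @ q, np.diag(sk) @ Vtk)     # k-dimensional space

print("raw:", np.round(raw_scores, 3))
print("lsa:", np.round(lsa_scores, 3))
```

In the raw term space, document 1 shares no term with the query and scores zero. In the two-dimensional space it lands next to the other automobile documents, while the two flower documents stay orthogonal to the query.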

Finding Latent Semantics • Reduce the rank to 3

Finding Latent Semantics Query Score
