Semantic Multimedia Web

Semantic Multimedia Web • Ansgar Scherp • Based on slides by Carsten Saathoff, Raphael Troncy, and Lynda Hardman

The story so far... • MMDB as an extension of ORDBMS • Information retrieval as the basis for queries • Feature extraction to describe content • Feature transformation to obtain a more compact representation • Focus on low-level features • Distances and similarities • Indexing of features for fast access

Problems of traditional MMDBs • Data structures and schemas are mostly proprietary • An MMDB is typically set up for a single application • Later integration with other applications is hard • Ad-hoc integration is practically impossible • Strong focus on low-level features • Semantic gap: no direct mapping between low-level features and the semantics of an image • Retrieval is primarily similarity-based • Almost all studies show that this does not satisfy users • Users want to pose semantic queries

Metadata

Metadata (2) • Keywords • GPS information • Camera data • Date

Metadata • Data about data: author, creation date, keywords, ... • How to represent it? Relational schema, XML, Semantic Web technologies • Metadata should be interoperable: Web, desktop, intranets • Many applications need to communicate

Overview • Semantic Web + Multimedia • Semantic gap • Canonical Processes for Multimedia Production • MPEG-7 and COMM • Problems with MPEG-7 • Core Ontology on Multimedia (COMM) • Linked Open Data • KAT – K-Space Annotation Tool • Semi-automatic efficient annotation • Scenarios

Semantic Web on one slide • Ontology: Employee rdfs:subClassOf Person; PostDoc rdfs:subClassOf Employee; Professor rdfs:subClassOf Employee; cooperatesWith with rdfs:domain and rdfs:range • Metadata (instances linked to their classes via rdf:type):
  <swrc:PostDoc rdf:ID="person_sha">
    <swrc:name>Siegfried Handschuh</swrc:name>
    ...
  </swrc:PostDoc>
  <swrc:Professor rdf:ID="person_sst">
    <swrc:name>Steffen Staab</swrc:name>
    ...
  </swrc:Professor>
  <swrc:cooperatesWith rdf:resource="http://www.uni-koblenz.de/~staab/#person_sst"/>
• Web pages (URL): http://www.deri.ie/~sha swrc:cooperatesWith http://www.uni-koblenz.de/~staab
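
The class hierarchy and instance data on this slide can be sketched as a tiny in-memory triple set with transitive subclass inference. This is only an illustration in plain Python, not an RDF library. The `swrc:` prefix on Person and Employee and the shortened instance identifiers are assumptions made for the example.

```python
# Triples are (subject, predicate, object) tuples; prefixed names are plain strings.
# Assumption: Person and Employee live in the swrc namespace like PostDoc and Professor.
TRIPLES = {
    ("swrc:Employee", "rdfs:subClassOf", "swrc:Person"),
    ("swrc:PostDoc", "rdfs:subClassOf", "swrc:Employee"),
    ("swrc:Professor", "rdfs:subClassOf", "swrc:Employee"),
    ("person_sha", "rdf:type", "swrc:PostDoc"),
    ("person_sha", "swrc:name", "Siegfried Handschuh"),
    ("person_sst", "rdf:type", "swrc:Professor"),
    ("person_sst", "swrc:name", "Steffen Staab"),
    ("person_sha", "swrc:cooperatesWith", "person_sst"),
}


def superclasses(cls, triples):
    """Return all classes reachable from cls via rdfs:subClassOf (transitively)."""
    found, stack = set(), [cls]
    while stack:
        current = stack.pop()
        for s, p, o in triples:
            if s == current and p == "rdfs:subClassOf" and o not in found:
                found.add(o)
                stack.append(o)
    return found


def instances_of(cls, triples):
    """Return subjects typed as cls, either directly or through a subclass."""
    result = set()
    for s, p, o in triples:
        if p == "rdf:type" and (o == cls or cls in superclasses(o, triples)):
            result.add(s)
    return result


PERSONS = instances_of("swrc:Person", TRIPLES)
PROFESSORS = instances_of("swrc:Professor", TRIPLES)
```

Neither instance is typed as a Person explicitly; both are found only because the subclass edges are followed.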

Semantic Web for Multimedia • Multimedia ontology, domain ontology • ResearchMeeting := ≥1 depicts.Researcher • Figure labels: IsWeb @ Bad Kreuznach 2007, http://isweb.uni-koblenz.de/, ResearchMeeting depicts Researcher, rdf:type, depicts, http://kodemaniak.de/foaf.rdf, http://wiamis2008.org, hasName, WIAMIS 2008 in Klagenfurt, “Carsten Saathoff” • Query: Show me all images of research meetings!
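
The axiom on this slide says that anything depicting at least one Researcher counts as a ResearchMeeting. A minimal sketch of answering the example query with that rule follows. All identifiers (`img:...`, `person:...`, `thing:...`) and the second image are hypothetical and only serve the illustration.

```python
# Hypothetical annotation triples: one image depicts a researcher, one does not.
TRIPLES = {
    ("img:kreuznach2007", "depicts", "person:carsten"),
    ("person:carsten", "rdf:type", "Researcher"),
    ("img:clocktower", "depicts", "thing:tower"),
    ("thing:tower", "rdf:type", "Building"),
}


def instances_of(cls, triples):
    """Subjects that are explicitly typed as cls."""
    return {s for s, p, o in triples if p == "rdf:type" and o == cls}


def research_meetings(triples):
    """Apply ResearchMeeting := >=1 depicts.Researcher to the triple set."""
    researchers = instances_of("Researcher", triples)
    return {s for s, p, o in triples if p == "depicts" and o in researchers}


RESULT = research_meetings(TRIPLES)
```

No image is annotated as a research meeting directly; the answer follows from the domain ontology.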

Semantic gap • 0010EE -> bluish • 0033FE -> bluish • Visually similar, but semantically different!
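
To make the point concrete, the two colour codes can be compared as low-level features. The sketch below converts them to RGB and measures their Euclidean distance. The `is_bluish` rule is a crude assumption for illustration, not a definition from the lecture.

```python
import math


def hex_to_rgb(code):
    """Convert a six-digit hex colour such as '0010EE' to an (r, g, b) tuple."""
    return tuple(int(code[i:i + 2], 16) for i in (0, 2, 4))


def is_bluish(rgb):
    """Crude assumed rule: blue is the strictly dominant channel."""
    r, g, b = rgb
    return b > r and b > g


A = hex_to_rgb("0010EE")
B = hex_to_rgb("0033FE")

# Euclidean distance in RGB space, and the largest possible distance for scale.
DISTANCE = math.dist(A, B)
MAX_DISTANCE = math.dist((0, 0, 0), (255, 255, 255))
```

The distance is small compared with the maximum, so a similarity-based retrieval treats both images alike, yet the feature says nothing about what is depicted.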

Semantic gap • Visually similar, semantically similar, but... • USA • Italy

Levels of semantics • Generic objects: a person • Generic scene: people talking to each other • Specific objects: Churchill • Specific scene: Churchill, Roosevelt, and Stalin sitting together • Abstract objects: Churchill, Prime Minister, GB, ... • Abstract scene: Big Three, Yalta Conference, WWII, ... • Axis label: Knowledge

Overview of Canonical Processes

Example 2: Vox Populi Video Sequences Generation • Stefano Bocconi, Frank Nack • Interview with America: video footage with interviews and background material about the opinion of American people after 9-11, http://www.interviewwithamerica.com • Example question: What do you think of the war in Afghanistan? “I am never a fan of military action, in the big picture I don’t think it is ever a good thing, but I think there are circumstances in which I certainly can’t think of a more effective way to counter this sort of thing…”

Vox Populi Premeditate Process • Analogous to the pre-production process in the film industry • Static versus dynamic video artifact • Output • Script, planning of the videos to be captured • Questions to the interviewee prepared • Profiles of the people interviewed: education, age, gender, race • Locations where the interviews take place

Vox Populi Annotations • Contextual • Interviewee (social), locations • Descriptive • Question asked and transcription of the answers • Filmic continuity, examples: • gaze direction of speaker (left, centre, right) • framing (close-up, medium shot, long shot) • Rhetorical • Rhetorical Statement • Argumentation model: Toulmin model

Vox Populi Statement Annotations • Statement formally annotated: • <subject> <modifier> <predicate> • E.g. “war best solution” • A thesaurus containing: • Terms on the topics discussed (155) • Relations between terms: similar (72), opposite (108), generalization (10), specialization (10) • E.g. war opposite diplomacy

Toulmin Model • Elements: Data, Claim, Qualifier, Warrant, Backing, Condition, Concession • Counts: 57 Claims, 16 Data, 4 Concessions, 3 Warrants, 1 Condition

Vox Populi Query Interface

Vox Populi Organize Process • Using the thesaurus, create a graph of related statements • nodes are the statements (corresponding to video segments): “war best solution”, “diplomacy best solution”, “war not solution” • edges are either support or contradict
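
The graph construction can be sketched as follows. The slides do not state the exact rule for deriving edges, so the rule here is an assumption: two statements are related if they differ only in terms the thesaurus marks as opposite, and an odd number of such differences means contradict while an even number means support. The thesaurus entry for best/not is also assumed; only war/diplomacy appears in the slides.

```python
from itertools import combinations

# Statements as (subject, modifier, predicate) triples, as in the annotation format.
STATEMENTS = [
    ("war", "best", "solution"),
    ("diplomacy", "best", "solution"),
    ("war", "not", "solution"),
]

# Thesaurus fragment: war/diplomacy is from the slides, best/not is an assumed entry.
OPPOSITE = {frozenset(("war", "diplomacy")), frozenset(("best", "not"))}


def relate(a, b):
    """Return 'support', 'contradict', or None under the assumed parity rule."""
    opposites = 0
    for term_a, term_b in zip(a, b):
        if term_a == term_b:
            continue
        if frozenset((term_a, term_b)) in OPPOSITE:
            opposites += 1
        else:
            return None  # terms are unrelated in the thesaurus: no edge
    if opposites == 0:
        return None  # identical statements need no edge
    return "contradict" if opposites % 2 == 1 else "support"


def build_graph(statements):
    """Map each related pair of statements to its edge label."""
    edges = {}
    for a, b in combinations(statements, 2):
        label = relate(a, b)
        if label is not None:
            edges[(" ".join(a), " ".join(b))] = label
    return edges


GRAPH = build_graph(STATEMENTS)
```

Each node corresponds to a video segment, so walking the support and contradict edges yields candidate sequences of interview clips.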

Result of Vox Populi Query • I am not a fan of military actions • I cannot think of a more effective solution • War has never solved anything • Two billions dollar bombs on tents

Vox Populi Processes

Canonical Processes 101 • Canonical: reduced to the simplest and most significant form possible without loss of generality • Formalization of each process in UML diagrams • Process • Process artifacts • Process actors • External world artifacts

Create Media Asset • Process where media assets are captured, generated or transformed

Semantic Annotate • The annotation uses some controlled vocabularies • Subject matter annotations of your photos • Rhetorical annotations in Vox Populi

Package • Process where process artifacts are logically and physically packed

Canonical Processes Possible Flow

Sum Up • Community agreement, not “yet another model” • Large proportion of the functionality provided by multimedia applications can be described in terms of this model • Initial step towards the definition of open web-based data structures for describing and sharing semantically annotated media assets

MPEG-7 • ISO standard of the MPEG community • Uniform format for storing multimedia metadata: structure, features, semantics • Extremely (!) extensive • MPEG-21 builds on it, with a focus on DRM etc. • Unlike MPEG-1 to MPEG-4, it has nothing to do with encoding

MPEG-7 (2) • Based on XML • An MPEG-7 description is a hierarchy • Descriptors describe properties of multimedia data • Structure: video -> shots -> frames; images -> segments • Semantics • Low-level features • To preserve flexibility, descriptors can be combined in many different ways.

Big Three

Issues • Recognizers: Winston Churchill, Franklin Roosevelt, Josef Stalin • How do you formulate a query to get images showing Churchill et al.? • First shot (XPath): //StillRegion[.//Keyword="Churchill" or .//Keyword="Roosevelt" or .//Keyword="Stalin"] • MPEG-7 description:
  <Mpeg7>
    <Description xsi:type="ContentEntityType">
      <MultimediaContent xsi:type="ImageType">
        <Image>
          <SpatialDecomposition>
            <StillRegion id="SR1">
              <TextAnnotation>
                <KeywordAnnotation xml:lang="en">
                  <Keyword>Churchill</Keyword>
                </KeywordAnnotation>
              </TextAnnotation>
            </StillRegion>
            <StillRegion id="SR2">
              <Semantic>
                <Label>
                  <Name>Roosevelt</Name>
                </Label>
              </Semantic>
            </StillRegion>
            <StillRegion id="SR3">
              <Semantic>
                <Definition>
                  <StructuredAnnotation>
                    <WhatObject>
                      <Name xml:lang="en">Stalin</Name>
                    </WhatObject>
                  </StructuredAnnotation>
                </Definition>
              </Semantic>
            </StillRegion>
            ...
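
Why the first-shot query falls short can be checked with a small script. The sketch below uses Python's standard `xml.etree.ElementTree` on a simplified version of the description: namespaces, `xsi:type` attributes, and the outer wrapper elements are dropped, which is an assumption made to keep the example self-contained. The helper `regions_matching` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Simplified description: three regions, each annotated in a different MPEG-7 style.
DOC = """
<Mpeg7>
  <Image>
    <SpatialDecomposition>
      <StillRegion id="SR1">
        <TextAnnotation><KeywordAnnotation>
          <Keyword>Churchill</Keyword>
        </KeywordAnnotation></TextAnnotation>
      </StillRegion>
      <StillRegion id="SR2">
        <Semantic><Label><Name>Roosevelt</Name></Label></Semantic>
      </StillRegion>
      <StillRegion id="SR3">
        <Semantic><Definition><StructuredAnnotation>
          <WhatObject><Name>Stalin</Name></WhatObject>
        </StructuredAnnotation></Definition></Semantic>
      </StillRegion>
    </SpatialDecomposition>
  </Image>
</Mpeg7>
"""

NAMES = {"Churchill", "Roosevelt", "Stalin"}


def regions_matching(root, paths):
    """Return ids of StillRegions where any of the given paths holds a wanted name."""
    hits = []
    for region in root.iter("StillRegion"):
        texts = {el.text for path in paths for el in region.findall(path)}
        if texts & NAMES:
            hits.append(region.get("id"))
    return hits


root = ET.fromstring(DOC)

# Keyword-only lookup, corresponding to the first-shot XPath.
FIRST_SHOT = regions_matching(root, [".//Keyword"])

# Lookup that enumerates every annotation variant used in this document.
ALL_VARIANTS = regions_matching(
    root, [".//Keyword", ".//Label/Name", ".//WhatObject/Name"]
)
```

The keyword-only lookup finds one region out of three. The complete lookup works only because every annotation variant is listed by hand, and it breaks again as soon as a tool uses yet another descriptor combination.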

Problems with MPEG-7? • Annotations are not interoperable! • Ambiguities: several ways to express semantically identical annotations • Descriptors can be combined in many ways • Complex queries must take all alternatives into account

Capabilities and Maturity Levels • Former situation: no standard, no vocabulary; manual 1:1 agreement on format and semantics; tight coupling of data and applications • Current situation (MPEG-7): standard vocabulary; manual 1:1 agreement on MPEG-7 vocabulary; tight coupling of data and applications • Future / desired situation (COMM, core ontology): standard vocabulary; pre-defined meaning; ad-hoc coupling of data and applications • Axes: Integration, Automation • COMM: next part of the lecture

Ontology Stack • Foundational Ontologies • Span across multiple fields, each covering multiple domains • Modelling of the most abstract concepts like event, object, ... • Core Ontologies • Situated in one field, but span multiple domains • Can be based on foundational ontologies • Example fields: events, annotation, communication, ... • Domain Ontologies • For a specific domain, e.g., fishery, human body, etc.

[Figure. Legend: Challenge, Building Block. Labels: COMM; Requirements on a high-quality MM ontology; MPEG-7]

Requirements for COMM • Reusability: design a core ontology for any multimedia-related application • MPEG-7 compliance: support the most important description tools • Extensibility: enable inclusion of further description tools (even those that are not part of MPEG-7!) and media types • Separation of concerns: clear separation of domain knowledge and knowledge about structure • Modularity: enable customization of the multimedia ontology • High degree of axiomatization: ensure interoperability through machine-accessible semantics • Figure labels: <Mpeg7> ... </Mpeg7>; Josef Stalin Recognizer, Churchill Recognizer, FaceDetector, AuthoringTool, PhotoManager, MusicManager; audio descriptors, visual descriptors, decomposition, SemanticAnnotation, CompoundDocument, TextDescriptor

Is MPEG-7 a good Basis for a high Quality Ontology? • Shortcomings of badly modelled ontologies [Oberle et al., 2006]: 1) Conceptual ambiguity: difficulties in understanding the meaning of concepts and their relations 2) Poor axiomatization: axiomatization of well-defined concepts is missing 3) Loose design: presence of modelling artefacts (concepts without ontological meaning) • Shortcomings mainly hinder extensibility and interoperability • Especially 1) and 2) are major shortcomings of MPEG-7 • 1-to-1 translations from MPEG-7 to OWL/RDFS (e.g. [Hunter, 2003a]) will not result in high quality ontologies! • Example:
  <StillRegion id="SR1">
    <TextAnnotation>
      <KeywordAnnotation xml:lang="en">
        <Keyword>Churchill</Keyword>
      </KeywordAnnotation>
    </TextAnnotation>
  </StillRegion>
  <StillRegion id="SR2">
    <Semantic>
      <Label>
        <Name>Roosevelt</Name>
      </Label>
    </Semantic>
  </StillRegion>
  <StillRegion id="SR3">
    <Semantic>
      <Definition>
        <StructuredAnnotation>
          <WhatObject>
            <Name xml:lang="en">Stalin</Name>
          </WhatObject>
        </StructuredAnnotation>
      </Definition>
    </Semantic>
  </StillRegion>

[Figure. Legend: Challenge, Building Block. Labels: COMM; Quality of Ontologies; Requirements on a high-quality MM ontology; MPEG-7]

How to Design a High Quality Multimedia Ontology? • Approach from [Oberle, 2005], [Oberle et al., 2006]: use a well-designed foundational ontology as a modelling basis to avoid shortcomings • Foundational ontologies provide: formal precision, domain independence, broad scope • Building upon foundational ontologies: prevents easy inclusion of modelling artefacts, reduces conceptual ambiguity, inherits rich axiomatization

[Figure. Legend: Challenge, Building Block. Labels: Methodology; COMM; Reference Ontology; MPEG-7 Compliance; Quality Measures for Ontologies; Quality of Ontologies; Requirements: High-Quality MM Ontology; MPEG-7]

Methodology for Design Pattern Definition • Identification of most important MPEG-7 functionalities [Arndt et al., 2007]: • Decomposition of multimedia content into segments • Annotation of segments with metadata (e.g. visual descriptor, media information, creation & production, …) • General: identify repetitive structures and describe them at an abstract level • Describe digital data by digital data at an arbitrary level of granularity • Additional patterns are needed for: • Complex data types of MPEG-7 • Semantic annotation by using domain ontologies • Interface between reusable multimedia core and domain-specific knowledge
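
The two core functionalities, decomposition and annotation at arbitrary granularity, map naturally onto a recursive data structure. The sketch below is only an illustration of that idea in plain Python; the class names `Segment` and `Annotation` and all example values are hypothetical and are not the classes defined by COMM.

```python
from dataclasses import dataclass, field


@dataclass
class Annotation:
    """A piece of metadata attached to a segment."""
    kind: str      # e.g. "visual descriptor", "media information"
    value: object


@dataclass
class Segment:
    """Multimedia content that can be decomposed and annotated recursively."""
    label: str
    annotations: list = field(default_factory=list)
    parts: list = field(default_factory=list)

    def decompose(self, *labels):
        """Decomposition: split this segment into named sub-segments."""
        new_parts = [Segment(label) for label in labels]
        self.parts.extend(new_parts)
        return new_parts

    def annotate(self, kind, value):
        """Annotation: attach metadata to this segment."""
        self.annotations.append(Annotation(kind, value))
        return self

    def walk(self, depth=0):
        """Yield (depth, segment) pairs in document order."""
        yield depth, self
        for part in self.parts:
            yield from part.walk(depth + 1)


# Hypothetical example: an image with two regions, one of them refined further.
image = Segment("image")
sr1, sr2 = image.decompose("SR1", "SR2")
sr1.annotate("visual descriptor", "dominant colour: bluish")
(face,) = sr1.decompose("SR1.1")
face.annotate("semantic annotation", "Churchill")

OUTLINE = [(depth, seg.label) for depth, seg in image.walk()]
```

Because every part is itself a `Segment`, the same two operations apply at any depth, which is the repetitive structure the pattern definition aims to capture.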

DOLCE Design Patterns: OIO • Foundational ontology DOLCE+DnS Ultralight • Aims at capturing the most essential aspects in the world • Defines disjoint upper classes Event, Object, Quality and Abstract • Follows a pattern-oriented approach for ontology design • 2 design patterns (extensions) that are especially important for MPEG-7: • Ontology of Information Objects (OIO): formalization of information exchange • Information object represents pure abstract information (message) • Relevance for multimedia ontology: • MPEG-7 describes digital data (multimedia information objects) with digital data (annotation) • Digital data entities are information objects

DOLCE Design Patterns: D&S • Descriptions & Situations (D&S): Formalization of Context • Relevance for multimedia ontology: • Meaning of digital data depends on context • Digital data entities are connected through computational situations (e.g. input and output data of an algorithm) • Algorithms are descriptions • Annotations and decompositions are situations that satisfy the rules of an algorithm / method

[Figure. Legend: Challenge, Building Block. Labels: Methodology; COMM; Pattern definition through specialization; Representation of Context; Identification of repetitive structures; Reference Ontology; Representation of Information; MPEG-7 Compliance; Quality Measures for Ontologies; Quality of Ontologies; Requirements: High-Quality MM Ontology; MPEG-7]

Ontology of Information Objects (OIO)

Example Information Object • Information object: “Graz Tourist Guide” • Information realization: http://cms.graztourismus.at/cms/ziel/42425/EN/, booklet • Information encoding: English, German • About: places, buildings (e.g. Clock Tower) • Agent: 1. iMedia Visitor / 2. Tourist Officer / 3. Graphics Designer • Expresses: walking path through Graz, small-size tourist guide, arrangement of illustrations
