Introduction to the Special Issue on MPEG-7 Presenter: 張富茂 Advisor: 尤信程
Outline • Overview of MPEG-7 Standard • Overview of MPEG-7 Audio • MPEG-7 Sound-Recognition Tools
Overview of MPEG-7 Standard(I) MPEG-7 focuses on the description of multimedia content. It is intended as an interoperable interface that defines the syntax and semantics of various description tools. Many groups and organizations have begun work on interoperable frameworks and representations for metadata description.
Overview of MPEG-7 Standard(II) • Descriptors (Ds) define the syntax and semantics of features of audio-visual content. • Description Schemes (DSs) allow construction of complex descriptions by specifying the structure and semantics of the relationships among their constituent Ds or DSs. • The Description Definition Language (DDL) allows flexible definition of MPEG-7 DSs and Ds, based on XML Schema.
Overview of MPEG-7 Standard(II) Figure: Scope of MPEG-7. Description production → Standard description (the normative part of the MPEG-7 standard) → Description consumption.
Overview of MPEG-7 Standard(III) • 1) ISO/IEC 15938-1: MPEG-7 Systems • 2) ISO/IEC 15938-2: MPEG-7 DDL • 3) ISO/IEC 15938-3: MPEG-7 Visual • 4) ISO/IEC 15938-4: MPEG-7 Audio • 5) ISO/IEC 15938-5: MPEG-7 Multimedia DSs • 6) ISO/IEC 15938-6: MPEG-7 Reference Software • 7) ISO/IEC 15938-7: MPEG-7 Conformance
Overview of MPEG-7 Audio(I) • Two motivations: first, audio content is not text, which makes direct textual search impossible; second, consider how one typically listens to audio content. • MPEG-7 Audio is a standard for content-based media description. • It is independent of the coding format of the media. • It is independent of the physical location of the media.
Overview of MPEG-7 Audio(II) • MPEG-7 standardizes a representation of metadata associated with media content. • The MPEG-7 audio standard is composed of Descriptors and Description Schemes. • The DDL allows new DSs to be written for specific applications.
Overview of MPEG-7 Audio(III) • Query by Humming • Query for Spoken Content • Assisted Consumer-Level Audio Editing • Extraction and Query Paradigm
Overview of MPEG-7 Audio(V) • The MPEG-7 audio standard comprises six main technologies that can be divided roughly into two classes: • MPEG-7 Audio Description Framework • Silence Segment • Sound Effects Description Tools • Musical Instrument Timbre Description Tools • Spoken Content Description Tools • Melody Contour Description Scheme • Other Parts of the Standard
MPEG-7 Sound-Recognition Tools(I) • The tools are designed for searching media by automatically indexing a soundtrack. • Sound-recognition tools provide a unified interface for automatic indexing of audio using trained sound classes in a pattern-recognition framework. • Description is divided into two types: • Text-based description by category labels • Quantitative description using probabilistic models.
MPEG-7 Sound-Recognition Tools(II) • Sound-Recognition Descriptors and Description Schemes • Qualitative Descriptors • Quantitative Descriptors • Probability Model Description Schemes • Sound-Recognition Model Description Schemes
Figure: A simple taxonomy of sound categories. "Dogs" has narrower terms (NT) "Bark" and "Howl"; "Woof" is attached to "Bark" by a UF (use-for) relation.
<SoundCategory term="1" scheme="DOGS">
  <Label>Dogs</Label>
  <TermRelation term="1.1" scheme="DOGS">
    <Label>Bark</Label>
    <TermRelation term="1.2" scheme="DOGS" type="UF">
      <Label>Woof</Label>
    </TermRelation>
  </TermRelation>
  <TermRelation term="1.3" scheme="DOGS">
    <Label>Howl</Label>
  </TermRelation>
</SoundCategory>
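As a rough illustration of how such a category description could be consumed, the sketch below parses a well-formed copy of the DOGS description with Python's standard `xml.etree.ElementTree` and walks the term relations. The nesting of "Woof" under "Bark", the `walk` helper, and the rule that a missing `type` attribute means NT are assumptions made for this example, not something the slides or the MPEG-7 schema are quoted as stating.

```python
import xml.etree.ElementTree as ET

# Well-formed copy of the DOGS category (straight quotes, every element closed).
# Assumption: "Woof" is nested under "Bark" as its use-for (UF) term.
DOGS_XML = """
<SoundCategory term="1" scheme="DOGS">
  <Label>Dogs</Label>
  <TermRelation term="1.1" scheme="DOGS">
    <Label>Bark</Label>
    <TermRelation term="1.2" scheme="DOGS" type="UF">
      <Label>Woof</Label>
    </TermRelation>
  </TermRelation>
  <TermRelation term="1.3" scheme="DOGS">
    <Label>Howl</Label>
  </TermRelation>
</SoundCategory>
"""


def walk(node, depth=0):
    """Yield (depth, term, label, relation) for a category and its related terms."""
    label = node.findtext("Label", default="").strip()
    # Assumption: a TermRelation without a type attribute is a narrower term (NT).
    relation = node.get("type", "NT") if depth else "ROOT"
    yield depth, node.get("term"), label, relation
    # findall with a bare tag name only visits direct children.
    for child in node.findall("TermRelation"):
        yield from walk(child, depth + 1)


rows = list(walk(ET.fromstring(DOGS_XML)))
for depth, term, label, relation in rows:
    print("  " * depth + f"{term} {label} ({relation})")
```

Walking the tree depth-first keeps each related term next to the term it refines, which mirrors the layout of the taxonomy figure.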
Figure: Combining categories into a larger taxonomy. 0. "Pets" has narrower terms (NT) 1. "Dogs" and 2. "Cats"; "Dogs" contains 1.1 "Bark", 1.2 "Woof" (UF) and 1.3 "Howl"; "Cats" contains 2.1 "Meow" and 2.2 "Purr".
<ClassificationScheme term="0" scheme="PETS">
  <Label>Pets</Label>
  <ClassificationSchemeRef scheme="DOGS"/>
  <ClassificationSchemeRef scheme="CATS"/>
</ClassificationScheme>
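The idea of combining schemes by reference can be sketched in a few lines of Python. The `SCHEMES` registry and the `expand` helper below are hypothetical names introduced only for illustration; the term labels and numbering come from the taxonomy figure, and the code makes no claim about how an MPEG-7 implementation actually resolves `ClassificationSchemeRef` elements.

```python
# Hypothetical in-memory registry: scheme name -> label, own terms, referenced schemes.
SCHEMES = {
    "DOGS": {"label": "Dogs", "terms": ["Bark", "Woof", "Howl"], "refs": []},
    "CATS": {"label": "Cats", "terms": ["Meow", "Purr"], "refs": []},
    "PETS": {"label": "Pets", "terms": [], "refs": ["DOGS", "CATS"]},
}


def expand(name, registry, number="0"):
    """Return [(number, label)] for a scheme, following references depth-first."""
    entry = registry[name]
    rows = [(number, entry["label"])]
    children = [("ref", r) for r in entry["refs"]]
    children += [("term", t) for t in entry["terms"]]
    for i, (kind, value) in enumerate(children, start=1):
        # Children of the root "0" are numbered 1, 2, ...; deeper ones 1.1, 1.2, ...
        child_no = str(i) if number == "0" else f"{number}.{i}"
        if kind == "ref":
            rows.extend(expand(value, registry, child_no))
        else:
            rows.append((child_no, value))
    return rows


taxonomy = expand("PETS", SCHEMES)
for number, label in taxonomy:
    print(number, label)
```

Because PETS only stores references, the DOGS and CATS schemes can be maintained separately and reused in other taxonomies.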
MPEG-7 Sound-Recognition Tools(III) • Building A Sound-Recognition Classifier • HMM Model Training • Audio Feature Extraction
Fig 8. Extraction of low-level audio features for sound-recognition classification. Blocks: Window; Audio Spectrum Envelope; Power Envelope (÷); Basis Extraction: SVD/ICA; Stored Basis Functions; Basis Projection (×).
Fig 9. Extraction of hidden Markov model and basis functions and storage in a DDL representation. Blocks: Audio WAV Files; Feature Extract; Basis Extract; HMM; HMM and Basis.
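The normalization and projection stages of this pipeline can be sketched in plain Python. This is a minimal sketch under stated assumptions: each frame is taken to be an already computed log-scaled spectrum envelope, the power envelope is taken to be the frame's L2 norm, and the basis functions are supplied rather than derived by SVD/ICA. The function names `normalize_frame`, `project`, and `extract_features` are illustrative, not part of the standard.

```python
import math


def normalize_frame(frame):
    """Return (power_envelope, unit_norm_frame) for one spectral envelope frame."""
    norm = math.sqrt(sum(x * x for x in frame))
    if norm == 0.0:
        # A silent frame has no direction; keep it as zeros.
        return 0.0, [0.0] * len(frame)
    return norm, [x / norm for x in frame]


def project(frame, basis):
    """Project a normalized frame onto each stored basis function (dot products)."""
    return [sum(x * b for x, b in zip(frame, vec)) for vec in basis]


def extract_features(spectrogram, basis):
    """One feature vector per frame: [power envelope, projection_1, projection_2, ...]."""
    features = []
    for frame in spectrogram:
        gain, unit = normalize_frame(frame)
        features.append([gain] + project(unit, basis))
    return features


# Toy data (assumed): 4 spectral bands, 2 stored basis functions.
basis = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
spectrogram = [[3.0, 4.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]]
features = extract_features(spectrogram, basis)
print(features)
```

Dividing by the norm before projecting separates overall level from spectral shape, so the reduced-dimension projections describe shape while the first feature carries the level.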
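Once one hidden Markov model has been trained per sound class, classification can be sketched as scoring an observation sequence against every model and keeping the most likely class. The example below uses the forward algorithm in the log domain. It is a simplification under stated assumptions: observations are discrete symbols (real feature vectors are continuous and would need continuous emission densities), and the two toy models and their probabilities are invented for illustration rather than trained.

```python
import math


def _log(p):
    """Logarithm that maps probability 0 to -inf instead of raising."""
    return math.log(p) if p > 0.0 else float("-inf")


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) for a list of log values."""
    m = max(values)
    if m == float("-inf"):
        return m
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_likelihood(obs, model):
    """Forward algorithm: log P(obs | model) for a discrete-observation HMM."""
    start, trans, emit = model["start"], model["trans"], model["emit"]
    n = len(start)
    alpha = [_log(start[s]) + _log(emit[s][obs[0]]) for s in range(n)]
    for o in obs[1:]:
        alpha = [
            log_sum_exp([alpha[p] + _log(trans[p][s]) for p in range(n)])
            + _log(emit[s][o])
            for s in range(n)
        ]
    return log_sum_exp(alpha)


def classify(obs, models):
    """Return the class label whose model gives the highest likelihood."""
    return max(models, key=lambda label: log_likelihood(obs, models[label]))


# Toy two-state models over two symbols (assumed values, not trained).
MODELS = {
    "Bark": {
        "start": [0.9, 0.1],
        "trans": [[0.7, 0.3], [0.4, 0.6]],
        "emit": [[0.9, 0.1], [0.2, 0.8]],
    },
    "Meow": {
        "start": [0.5, 0.5],
        "trans": [[0.5, 0.5], [0.5, 0.5]],
        "emit": [[0.1, 0.9], [0.3, 0.7]],
    },
}

print(classify([0, 0, 1, 0], MODELS))
print(classify([1, 1, 1], MODELS))
```

Working in the log domain avoids numerical underflow on long sequences, which matters when a soundtrack yields many frames per segment.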
Conclusion • MPEG-7 provides… • Multimedia content description framework for interoperable applications • Description definition language (DDL) • XML Schema (flexibility) + BiM • Description Schemes (MDSs) • Library of description tools • Covers a wide range of generic needs