Austrian Academy of Sciences
OeAW-ISF, Acoustics Research Institute
MPEG-7
Today's Multimedia Standard
Peter Balazs
http://www.kfs.oeaw.ac.at
• 1999 started as programmer at the ISF
• 2001 finished mathematics (University of Vienna)
Institut für Schallforschung der Österreichischen Akademie der Wissenschaften: A-1010 Wien; Liebiggasse 5. Tel. +43 1/4277-29500; Fax +43 1/4277-9296; email: xxl@kfs.oeaw.ac.at; http://www.kfs.oeaw.ac.at
OeAW-ISF MPEG-7
• ISO / IEC Standard
• "Multimedia Content Description Interface"
• Multimedia data / metadata description system
• Low Level – High Level; content based
    <AudioDescriptor xsi:type="SoundModelStatePathType">
      <SoundModelRef>IDDogBarks</SoundModelRef>
      <StateRef>IDState1</StateRef>
      <RelativeFrequency>0.000</RelativeFrequency>
      <StateRef>IDState2</StateRef>
      <RelativeFrequency>0.000</RelativeFrequency>
      <StateRef>IDState3</StateRef>
      <RelativeFrequency>0.045</RelativeFrequency>
      <StateRef>IDState4</StateRef>
      <RelativeFrequency>0.000</RelativeFrequency>
      <StateRef>IDState5</StateRef>
      <RelativeFrequency>0.442</RelativeFrequency>
      <StateRef>IDState6</StateRef>
      <RelativeFrequency>0.513</RelativeFrequency>
    </AudioDescriptor>
• Open system
• Inheritance
• Description of methods
• normative – informative
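Because an MPEG-7 description is ordinary XML, a generic XML parser is enough to read it. The following is a minimal sketch, not part of the standard: it pairs each StateRef in the fragment above with the RelativeFrequency that follows it. The function name `read_state_frequencies` is hypothetical, and the `xmlns:xsi` declaration is an assumption added so that the fragment parses on its own.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"

# Assumption: the xmlns:xsi declaration is added here so the fragment is
# well-formed by itself; in a full document an enclosing element would carry it.
FRAGMENT = f"""
<AudioDescriptor xmlns:xsi="{XSI}" xsi:type="SoundModelStatePathType">
  <SoundModelRef>IDDogBarks</SoundModelRef>
  <StateRef>IDState1</StateRef><RelativeFrequency>0.000</RelativeFrequency>
  <StateRef>IDState2</StateRef><RelativeFrequency>0.000</RelativeFrequency>
  <StateRef>IDState3</StateRef><RelativeFrequency>0.045</RelativeFrequency>
  <StateRef>IDState4</StateRef><RelativeFrequency>0.000</RelativeFrequency>
  <StateRef>IDState5</StateRef><RelativeFrequency>0.442</RelativeFrequency>
  <StateRef>IDState6</StateRef><RelativeFrequency>0.513</RelativeFrequency>
</AudioDescriptor>
"""


def read_state_frequencies(xml_text):
    """Return (sound model id, {state id: relative frequency})."""
    root = ET.fromstring(xml_text)
    model = root.findtext("SoundModelRef")
    # StateRef and RelativeFrequency alternate, so the i-th of each belong together.
    states = [e.text for e in root.findall("StateRef")]
    freqs = [float(e.text) for e in root.findall("RelativeFrequency")]
    return model, dict(zip(states, freqs))


if __name__ == "__main__":
    model, freqs = read_state_frequencies(FRAGMENT)
    print(model, freqs)
```

The relative frequencies in this example add up to 1, which is what one would expect from a normalized histogram over the states.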
OeAW-ISF MPEG-7
• History
  • Call for Proposals: October 1998
  • Evaluation: February 1999
  • First version of Working Draft (WD): December 1999
  • Committee Draft (CD): October 2000
  • Final Committee Draft (FCD): February 2001
  • Final Draft International Standard (FDIS): July 2001
  • International Standard (IS): September 2001
• Development
  • Amendment Audio: May 2002
  • Call for Proposals (Systems, version 2): July 2002
  • MPEG 21 international standard: April 2009
OeAW-ISF XML
XML = eXtensible Markup Language
• Metalanguage
• Hypertext
• Markup
• markup = tag: <Befehl> ... </Befehl>
• Open Standard

    <?xml version="1.0"?>
    <!-- XML-Test -->
    <!DOCTYPE document [
      <!ELEMENT ADRESSE (Vorname, Nachname, Wohnort)>
      <!ELEMENT Vorname (#PCDATA)>
      ....
    ]>
    <ADRESSE>
      <Vorname> Peter </Vorname>
      <Nachname> Balazs </Nachname>
      <Wohnort> Tulln </Wohnort>
    </ADRESSE>
    <ADRESSE>
    ........

    <Set ID="Viewer3" RunMode="Multiple">
      <Table ID="Settings">
        CursorOpts = 0 0 1 440
        SignalOpts = 1 1
      </Table>
      <Set ID="Profiles">
        <Table ID="Default">
          FrameOpts = 40 1 75 2 0 1
          GraphXY = 0 1e4 1 -80 50 1
          Method = 0 32 20 0 1 0 0 0 1 0 0
          Average = 0 0 99
        </Table>
      </Set>
    </Set>
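As an illustration of how such tagged data can be read back, here is a small sketch using Python's standard library. It assumes a wrapper element `<document>` around the ADRESSE entries (the slide elides that part of the file) and leaves out the DTD, since `xml.etree.ElementTree` does not validate against it anyway. The function name `read_addresses` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: a <document> root element wraps the entries; the slide does not show it.
DOC = """<?xml version="1.0"?>
<document>
  <ADRESSE>
    <Vorname> Peter </Vorname>
    <Nachname> Balazs </Nachname>
    <Wohnort> Tulln </Wohnort>
  </ADRESSE>
</document>
"""


def read_addresses(xml_text):
    """Return one dict per ADRESSE element, mapping tag name to trimmed text."""
    root = ET.fromstring(xml_text)
    return [
        {child.tag: (child.text or "").strip() for child in entry}
        for entry in root.iter("ADRESSE")
    ]


if __name__ == "__main__":
    for address in read_addresses(DOC):
        print(address)
```

The tag names carry the meaning of each field, so the reader needs no knowledge of column positions or separators.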
OeAW-ISF MPEG-7
• Descriptors
  • Low Level
• Description Schemes
  • High Level, container
• Description Definition Language (DDL)
  • XML Schema, STX Schema
• System Tools
  • ASCII text – binary
OeAW-ISF MPEG-7
[Figure from [1]]
OeAW-ISF MPEG-7 Audio: Low Level Descriptors
• Single Sample
• Segments
• DS, compare to STX
Source: [1]
OeAW-ISF MPEG-7 Audio: Low Level Descriptors
• Scalar
• Vector
• Single
• Series
  • series of vectors = table, matrix
• Scalable Series
Source: [2]
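The idea behind a scalable series can be sketched as follows: a long series of scalar values is replaced by a shorter one in which each entry summarizes a group of consecutive samples. This is an illustrative sketch only, not the normative definition from [2]. The function name `scale_series`, the choice of summary operations, and the handling of a shorter last group are assumptions.

```python
def scale_series(values, ratio, op="mean"):
    """Summarize each run of `ratio` consecutive values by mean, min, or max.

    Assumption of this sketch: a trailing group shorter than `ratio`
    is summarized as well rather than dropped.
    """
    if ratio < 1:
        raise ValueError("ratio must be at least 1")
    operations = {
        "mean": lambda group: sum(group) / len(group),
        "min": min,
        "max": max,
    }
    summarize = operations[op]
    return [summarize(values[i:i + ratio]) for i in range(0, len(values), ratio)]


if __name__ == "__main__":
    series = [1, 2, 3, 4, 5, 6]
    print(scale_series(series, 2))         # group means
    print(scale_series(series, 3, "max"))  # group maxima
```

Applying the same function to every column of a table gives the scaled form of a series of vectors.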
OeAW-ISF MPEG-7 Audio: Low Level Descriptors
• Basic
  • AudioWaveform, AudioPower
• Basic Spectral
  • AudioSpectrumEnvelope, AudioSpectrumCentroid, AudioSpectrumSpread, AudioSpectrumFlatness
• Signal Parameters
  • AudioHarmonicity, AudioFundamentalFrequency
• Timbral Temporal
  • LogAttackTime, TemporalCentroid
• Timbral Spectral
  • SpectralCentroid, HarmonicSpectralCentroid, HarmonicSpectralDeviation, HarmonicSpectralSpread, HarmonicSpectralVariation
• Spectral Basis
  • AudioSpectrumBasis, AudioSpectrumProjection
• Silence
Sources: [1], [2]
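To give a feeling for what such low level descriptors measure, the sketch below computes two simplified quantities in plain Python: a frame-wise mean power (in the spirit of AudioPower) and the time centroid of an envelope (in the spirit of TemporalCentroid). These are illustrations under simplifying assumptions, not the normative extraction procedures of [2]. The function names and the non-overlapping framing are hypothetical choices.

```python
def frame_power(samples, frame_len):
    """Mean of the squared samples for each complete, non-overlapping frame."""
    last_start = len(samples) - frame_len
    return [
        sum(x * x for x in samples[i:i + frame_len]) / frame_len
        for i in range(0, last_start + 1, frame_len)
    ]


def temporal_centroid(envelope, frame_rate):
    """Envelope-weighted mean time in seconds; frame n sits at n / frame_rate."""
    total = sum(envelope)
    if total == 0:
        raise ValueError("envelope has no energy")
    weighted = sum(n * value for n, value in enumerate(envelope))
    return weighted / (total * frame_rate)


if __name__ == "__main__":
    signal = [1, -1, 2, -2]
    power = frame_power(signal, 2)
    print(power)
    print(temporal_centroid(power, frame_rate=1.0))
```

A scalar value per frame, as returned by `frame_power`, is exactly the kind of data that the series types on the previous slide are meant to hold.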
OeAW-ISF MPEG-7 Audio: High Level DSs
• AudioSignature
  • AudioSpectrumFlatness
• Musical Instrument Timbre Description Tool
  • HarmonicInstrumentTimbre (LAT + timbre spectral)
  • PercussiveInstrumentTimbre (timbre temporal + SpectralCentroid)
• Melody Description Tools
  • MelodyContour DS, MelodySequence DS
• General Sound Recognition and Indexing Description Tool
  • SpectralBasis, SoundClassificationModel: SoundModels, classification scheme
  • SoundModelStatePath, SoundModelStateHistogram
• SpokenContentDescription Tools
  • SpokenContentHeader: WordLexicon, PhonLexicon
  • SpokenContentLattice: WordLinks, PhonLinks
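The relation between a state path and a state histogram can be shown with a short sketch: the path lists which model state was active in each frame, and the histogram records how often each state occurred relative to the path length. The function name `state_histogram` and the integer state labels are assumptions for illustration, not the data types defined in the standard.

```python
from collections import Counter


def state_histogram(state_path):
    """Map each state in the path to its relative frequency of occurrence."""
    if not state_path:
        raise ValueError("state path is empty")
    counts = Counter(state_path)
    length = len(state_path)
    # Sorting the states only makes the printed output easier to read.
    return {state: count / length for state, count in sorted(counts.items())}


if __name__ == "__main__":
    path = [6, 5, 6, 6, 5, 3, 6, 5]  # hypothetical sequence of state indices
    print(state_histogram(path))
```

Two sounds can then be compared through their histograms, for example by a distance between the two frequency vectors, without comparing the paths frame by frame.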
OeAW-ISF MPEG-7 Audio: Amendment
• New base types
  • optional attribute for channel
• Modification of Spoken Content Description Tools
  • "acoustics only" score possible for speech recognition; prosody, syllables
• Audio Signal Quality DS
  • BackgroundNoiseLevel, BalanceType, DCoffsetType, BandwidthType
  • TransmissionTechnologyType: shellac, vinyl, ....
• Additional Tools
  • tempo description, compact variable precision representation (BAM)
• Linguistic Description Tools
  • semantic structure of linguistic data
OeAW-ISF MPEG-7
• Literature:
  • [1] José M. Martínez, MPEG-7 Overview (version 8), ISO/IEC JTC1/SC29/WG11 N4980, Klagenfurt, July 2002, http://mpeg.telecomitalialab.com/standards/mpeg-7/mpeg-7.htm
  • [2] ISO / IEC, Information Technology – Multimedia Content Description Interface – Part 4: Audio, Geneva, July 2001
  • [3] Oliver Pott, Günter Wielange, XML Praxis und Referenz, München 2001
  • [4] J. Bitzer, J. H. Martínez, Information Technology – Multimedia Content Description Interface – Part 4: Audio – Proposed Draft Amendment, Fairfax, May 2002
• Links:
  • [5] MPEG Home Page, http://mpeg.telecomitalialab.com/
  • [6] Extensible Markup Language, http://www.w3.org/XML/
  • [7] STX, http://www.kfs.oeaw.ac.at/software.htm