Audio

Henning Schulzrinne, Dept. of Computer Science, Columbia University, Fall 2003

Common narrowband audio codecs

Common wideband audio codecs

iLBC – MOS behavior with packet loss

Recent audio codecs
• iLBC: optimized for high packet loss rates (frames encoded independently)
• AMR-NB
  • 3G wireless codec
  • 4.75 – 12.2 kb/s
  • 20 ms coding delay
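The AMR-NB bit-rate range can be turned into per-frame sizes with simple arithmetic. The sketch below assumes 20 ms frames (the slide lists 20 ms only as the coding delay) and ignores any payload or RTP header overhead; the function names are illustrative, not part of any codec API.

```python
def bits_per_frame(bit_rate_bps, frame_ms=20):
    """Number of coded bits in one frame at the given bit rate.

    frame_ms=20 is an assumption based on the 20 ms figure on the slide.
    """
    return bit_rate_bps * frame_ms // 1000


def payload_bytes(bit_rate_bps, frame_ms=20):
    """Coded bits rounded up to whole bytes (no header overhead included)."""
    return (bits_per_frame(bit_rate_bps, frame_ms) + 7) // 8


if __name__ == "__main__":
    # Lowest and highest AMR-NB rates quoted on the slide, in bits per second.
    for rate in (4750, 12200):
        print(rate, "b/s:", bits_per_frame(rate), "bits,",
              payload_bytes(rate), "bytes per frame")
```

At the two ends of the range this gives 95 and 244 bits per frame, which shows how small each packet payload is compared with typical IP/UDP/RTP header overhead.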

Speex
• Open-source, patent-free speech codec
• CELP (code-excited linear prediction) codec
• operating modes:
  • narrowband (8 kHz sampling rate)
    • 2.15 – 24.6 kb/s
    • delay of 30 ms
  • wideband (16 kHz sampling rate)
    • 4 – 44.2 kb/s
    • delay of 34 ms
  • ultra-wideband (32 kHz sampling rate)
• intensity stereo encoding
• variable bit rate (VBR) possible
• voice activity detection (VAD)
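To illustrate what voice activity detection does, here is a minimal energy-threshold sketch. It is not the algorithm Speex uses; the 20 ms frame length, the -40 dB threshold, and the function names are all assumptions chosen for illustration, with samples taken as floats in the range -1 to 1.

```python
import math


def frame_energy_db(frame):
    """Mean-square energy of one frame in dB (relative to full scale 1.0)."""
    mean_square = sum(s * s for s in frame) / len(frame)
    return 10.0 * math.log10(mean_square + 1e-12)  # small offset avoids log(0)


def simple_vad(samples, sample_rate=8000, frame_ms=20, threshold_db=-40.0):
    """Return one True/False decision per complete frame.

    True means the frame energy exceeds the (assumed) threshold,
    i.e. the frame is treated as active speech.
    """
    frame_len = sample_rate * frame_ms // 1000
    decisions = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        decisions.append(frame_energy_db(frame) > threshold_db)
    return decisions


if __name__ == "__main__":
    silence = [0.0] * 160  # one 20 ms frame at 8 kHz
    tone = [0.5 * math.sin(2 * math.pi * 440 * i / 8000) for i in range(160)]
    print(simple_vad(silence + tone))
```

A codec with VAD can send fewer or no bits for frames marked inactive, which is one way a variable bit rate arises in practice.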

Ogg Vorbis
• Similar in application to AAC, MP3, VQF, …, but claims to be free of patents
• Ogg = container file format (also for Speex, FLAC)
• Vorbis = music and speech codec
• near CD quality = 160 kb/s
• forward-adaptive modified DCT (discrete cosine transform)
• overlapping windows
• floor: carries frequency representation as piecewise linear interpolated representation on a dB amplitude scale and linear frequency scale
• residue: subtract out floor → cascaded (multi-pass) vector quantization
• entropy (Huffman) coding
• carries codec parameters in header
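The floor and residue bullets can be sketched in a few lines. The example below builds a floor curve by piecewise-linear interpolation between control points given in dB over linearly spaced frequency bins, then subtracts it from a spectrum that is also in dB. It is a simplified illustration under those assumptions, not the Vorbis bitstream procedure; the control points, bin count, and function names are made up for the example.

```python
def floor_curve_db(points, num_bins):
    """Interpolate (bin_index, level_dB) control points linearly over all bins.

    The control points must cover bin 0 through bin num_bins - 1.
    """
    points = sorted(points)
    curve = []
    for b in range(num_bins):
        for (x0, y0), (x1, y1) in zip(points, points[1:]):
            if x0 <= b <= x1:
                t = (b - x0) / (x1 - x0)
                curve.append(y0 + t * (y1 - y0))
                break
        else:
            raise ValueError("bin %d is outside the control points" % b)
    return curve


def residue_db(spectrum_db, floor_db):
    """Subtract the floor from the spectrum, bin by bin, in the dB domain."""
    return [s - f for s, f in zip(spectrum_db, floor_db)]


if __name__ == "__main__":
    # Hypothetical coarse spectral envelope: three control points over 9 bins.
    floor = floor_curve_db([(0, -20.0), (4, -40.0), (8, -60.0)], 9)
    # Hypothetical spectrum: the envelope plus some fine detail.
    detail = [1.0, -2.0, 0.5, 0.0, 3.0, -1.0, 0.0, 2.0, -0.5]
    spectrum = [f + d for f, d in zip(floor, detail)]
    print(residue_db(spectrum, floor))
```

Because the floor captures the coarse envelope with only a few control points, the residue that remains has a much smaller dynamic range, which is what makes it suitable for vector quantization.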

Sound localization
• Human ear uses 3 metrics for stereo localization:
  • intensity
  • time of arrival (TOA) – 7 µs
  • direction filtering and spectral shaping by outer ear
• For shorter wavelengths (4 – 20 kHz), the head casts an acoustical shadow, giving rise to a lower sound level at the ear farthest from the sound source
• At long wavelengths (20 Hz – 1 kHz), the head is very small compared to the wavelength
• In this case, localization is based on perceived Interaural Time Differences (ITD)
UCSC CMPE250, Fall 2002
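The wavelength argument can be checked numerically. The sketch below compares the wavelength at a few frequencies with an assumed ear spacing of 0.18 m and estimates the interaural time difference with the simple straight-path model ITD = d·sin(θ)/c. The speed of sound (343 m/s), the ear spacing, and the model itself are assumptions for illustration; real heads diffract sound, so measured ITDs differ.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature
EAR_SPACING = 0.18      # m, assumed distance between the ears


def wavelength_m(frequency_hz):
    """Wavelength in metres for a tone of the given frequency."""
    return SPEED_OF_SOUND / frequency_hz


def itd_seconds(azimuth_deg):
    """Interaural time difference for a distant source.

    Uses the simplified straight-path model d * sin(theta) / c,
    with azimuth 0 meaning straight ahead and 90 fully to one side.
    """
    return EAR_SPACING * math.sin(math.radians(azimuth_deg)) / SPEED_OF_SOUND


if __name__ == "__main__":
    for f in (20, 1000, 4000, 20000):
        ratio = wavelength_m(f) / EAR_SPACING
        print(f, "Hz: wavelength", round(wavelength_m(f), 4),
              "m =", round(ratio, 2), "x ear spacing")
    for angle in (0, 30, 90):
        print(angle, "deg: ITD", round(itd_seconds(angle) * 1e6, 1), "us")
```

Below about 1 kHz the wavelength exceeds the assumed ear spacing, so the head casts little shadow and timing differences dominate; above 4 kHz the wavelength is well under the head size and level differences become usable.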

Audio samples
• http://www.cs.columbia.edu/~hgs/audio/codecs.html
• Speex: http://www.speex.org/audio/samples/
  • both narrowband and wideband
