W.A.V.S. Compression
Alex Chen, Nader Shehad, Aamir Virani, Erik Welsh
Overview
• Approach
• Psychoacoustic Modeling
• Filter Banks
• Quantization
• Demonstration
• Results
• Further Research
Approach
Encoding:
• Input → Filter Banks → Quantization → Encoded Signal
• Psychoacoustic Model
Decoding:
• Encoded Signal → Inverse Quantization → Reconstruction Filter Banks → Output
Psychoacoustic Model
• Based on studies showing that hearing capability is affected by:
  • Environment
  • Limitations of the human auditory system
• Used to eliminate portions of the signal that the average listener won't hear
• Two key properties:
  • Absolute threshold of hearing
  • Auditory masking
Absolute Threshold of Hearing
• Experiment:
  • Plot the audible threshold of a tone across frequency
• Observations:
  • The auditory system is more sensitive to some frequencies than to others
  • Frequencies within a "critical bandwidth" are treated similarly
  • Basis for the Bark scale
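Both the threshold curve and the Bark scale have widely used closed-form approximations. The sketch below uses Terhardt's approximation of the threshold in quiet and the Zwicker-Terhardt frequency-to-Bark mapping. The slides do not say which formulas the project used, so the choice of formulas and the function names `ath_db_spl` and `hz_to_bark` are assumptions made for illustration.

```python
import math


def ath_db_spl(freq_hz):
    """Approximate absolute threshold of hearing in dB SPL (Terhardt's formula)."""
    f = freq_hz / 1000.0  # the formula is written in kHz
    return (3.64 * f ** -0.8
            - 6.5 * math.exp(-0.6 * (f - 3.3) ** 2)
            + 1e-3 * f ** 4)


def hz_to_bark(freq_hz):
    """Map a frequency in Hz to the Bark scale (Zwicker-Terhardt approximation)."""
    return (13.0 * math.atan(0.00076 * freq_hz)
            + 3.5 * math.atan((freq_hz / 7500.0) ** 2))


if __name__ == "__main__":
    # Print the threshold and Bark value at a few illustrative frequencies.
    for freq in (100, 1000, 3300, 10000):
        print(f"{freq:>6} Hz  threshold {ath_db_spl(freq):6.1f} dB SPL  "
              f"{hz_to_bark(freq):5.2f} Bark")
```

The curve dips lowest near 3 to 4 kHz, which matches the observation that the ear is more sensitive to some frequencies than others.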
Auditory Masking
• Tones and noise drown out less powerful sounds
  • Affect neighboring frequencies
  • Affect the critical bandwidth
• Effects add to produce an overall masking threshold
• The threshold is used to mask quantization noise
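One way to picture how individual masking effects "add" is to sum them as powers and convert back to decibels. This is a minimal sketch under that assumption; the slides do not state how the project combined the contributions, and the function names and sample values are hypothetical.

```python
import math


def combine_thresholds_db(thresholds_db):
    """Sum masking contributions in the power domain; return the total in dB."""
    total_power = sum(10.0 ** (t / 10.0) for t in thresholds_db)
    return 10.0 * math.log10(total_power)


def is_masked(component_db, thresholds_db):
    """True if a component lies below the combined masking threshold."""
    return component_db < combine_thresholds_db(thresholds_db)


if __name__ == "__main__":
    # Hypothetical contributions (dB) from two maskers at one frequency.
    contributions = [20.0, 20.0]
    print(round(combine_thresholds_db(contributions), 2))
    print(is_masked(22.0, contributions), is_masked(25.0, contributions))
```

Anything that falls below the combined threshold, including quantization noise, would be treated as inaudible under this model.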
Filter Banks Theory
• Array of bandpass filters
• Break up the signal into frequency subbands
• Allows for a variable coding scheme across subbands
Analysis and Synthesis Banks
1) Analysis filters divide up the signal
2) Down-sample
3) Quantize
4) Up-sample
5) Synthesis filters remove distortions
6) Reconstruct the signal
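The six steps can be sketched with the simplest perfect-reconstruction case, a two-band Haar filter bank. This is not the 32-band cosine-modulated bank the project used; it is a stand-in chosen because it fits in a few lines, and all function names and the step size are assumptions.

```python
import math

SQRT2 = math.sqrt(2.0)


def analysis(signal):
    """Steps 1-2: split into low/high subbands, one output sample per input pair."""
    pairs = list(zip(signal[0::2], signal[1::2]))  # signal length must be even
    low = [(a + b) / SQRT2 for a, b in pairs]
    high = [(a - b) / SQRT2 for a, b in pairs]
    return low, high


def quantize(band, step):
    """Step 3: uniform quantization to integer indices (round to nearest)."""
    return [int(math.floor(v / step + 0.5)) for v in band]


def dequantize(indices, step):
    """Map integer indices back to subband sample values."""
    return [i * step for i in indices]


def synthesis(low, high):
    """Steps 4-6: up-sample, filter, and interleave to rebuild the signal."""
    out = []
    for lo_s, hi_s in zip(low, high):
        out.append((lo_s + hi_s) / SQRT2)
        out.append((lo_s - hi_s) / SQRT2)
    return out


if __name__ == "__main__":
    x = [0.1, 0.4, -0.3, 0.2, 0.0, -0.5]
    low, high = analysis(x)
    # Without quantization the output matches the input up to rounding error.
    print(synthesis(low, high))
    # With quantization, the only distortion comes from the quantizer.
    step = 0.05
    y = synthesis(dequantize(quantize(low, step), step),
                  dequantize(quantize(high, step), step))
    print(max(abs(a - b) for a, b in zip(x, y)))
```

Skipping the quantizer returns the input unchanged, so any error in the second printout is introduced by step 3 alone.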
Filter Bank Design
• Phase
• Tradeoff between fine and coarse frequency resolution
  • Piccolo vs. castanets
  • Non-stationary signals
• We used a non-adaptive approach
Filter Bank Implementation
• We used cosine-modulated PR (perfect reconstruction) filter banks with 32 filters each
• Output is a delayed version of the input (linear phase)
• Distortion arises from quantization only
Quantization
Two types:
• Narrow-band
  • Depends on the current input
  • Overhead cost
• Full-range
  • Independent of the current input
  • No overhead
[Figure: Sampled Input, Quantized Version, Reconstructed Input]
Quantization
• Narrow Band
  • More accurate
  • Lower compression ratio
• Full-Range
  • Less accurate
  • Higher compression ratio
• Using 3-bit Quantization
  • Narrow band:
    Input:   -.4   -.22   .14   .4
    Levels:   1     3     6     8
    Recon.:  -.4   -.2    .1    .3
    Total Error: .16
  • Full-range:
    Input:   -.4   -.22   .14   .4
    Levels:   3     4     6     7
    Recon.:  -.5   -.25   .25   .50
    Total Error: .34
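The two worked examples can be reproduced with a uniform 3-bit quantizer. The sketch assumes the narrow-band range is the input's own minimum and maximum (-0.4 to 0.4) and the full range is -1 to 1. Those ranges, the round-to-nearest rule, and the function names are inferred from the numbers on the slide, not stated on it.

```python
import math


def quantize(samples, lo, hi, bits=3):
    """Uniformly quantize samples over [lo, hi]; return 1-based levels and the step."""
    levels = 2 ** bits
    step = (hi - lo) / levels
    out = []
    for s in samples:
        idx = int(math.floor((s - lo) / step + 0.5))  # round to nearest level
        idx = max(0, min(levels - 1, idx))            # clip to the valid range
        out.append(idx + 1)                           # slide numbers levels from 1
    return out, step


def reconstruct(level_numbers, lo, step):
    """Map 1-based level numbers back to sample values."""
    return [lo + (k - 1) * step for k in level_numbers]


def total_error(samples, recon):
    """Sum of absolute differences, the error measure the slide totals match."""
    return sum(abs(s - r) for s, r in zip(samples, recon))


if __name__ == "__main__":
    x = [-0.4, -0.22, 0.14, 0.4]
    ranges = [("narrow", min(x), max(x)), ("full", -1.0, 1.0)]
    for name, lo, hi in ranges:
        lv, step = quantize(x, lo, hi)
        rec = reconstruct(lv, lo, step)
        print(name, lv, [round(r, 2) for r in rec], round(total_error(x, rec), 2))
```

The narrow-band decoder needs `lo` and `step` for every block, which is the overhead cost noted for that scheme; the full-range decoder can hard-code both.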
Demonstration
• Sine wave: Full range, Narrow range
• Chime: 8-bit, Full range, Narrow range
• Percussion: Full range, Narrow range
• Modern: 8-bit, Full range, Narrow range
Sine Wave (time)
[Plots: Full-Range Quantization, Narrow Quantization]
Sine Wave (freq)
[Plots: Full-Range Quantization, Narrow Quantization]
Sine Wave (freq error)
[Plots: Full-Range Quantization, Narrow Quantization]
Modern (time)
[Plots: Full-Range Quantization, Narrow Quantization]
Modern (freq)
[Plots: Full-Range Quantization, Narrow Quantization]
Modern (freq error)
[Plots: Full-Range Quantization, Narrow Quantization]
Results
• Full Range: Smallest File, Worst Sound Quality
• Narrow Range: Better Sound Quality, Larger File
• MP3: Industry Standard
Further Research
• Filter Banks
  • Wavelets
  • Dynamic Frequency Ranges
• Better Psychoacoustic Model
  • Tone Designation
  • Pre- and Post-Echo
• Bit Allocation
• Writing a File