Speech Coding (Part I) Waveform Coding. 虞台文.
Content • Overview • Linear PCM (Pulse-Code Modulation) • Nonlinear PCM • Max-Lloyd Algorithm • Differential PCM (DPCM) • Adaptive PCM (ADPCM) • Delta Modulation (DM)
Classification of Coding schemes • Waveform coding • Vocoding • Hybrid coding
Waveform coding • Encodes the waveform itself in an efficient way • Signal independent • Offers good-quality speech at bit rates of 16 kbps or more. • Time-domain techniques • Linear PCM (Pulse-Code Modulation) • Nonlinear PCM: μ-law, A-law • Differential Coding: DM, DPCM, ADPCM • Frequency-domain techniques • SBC (Sub-band Coding), ATC (Adaptive Transform Coding) • Wavelet techniques
Vocoding • 'Voice' + 'coding'. • Encodes information about how the speech signal was produced by the human vocal system. • These techniques can produce intelligible communication at very low bit rates, usually below 4.8 kbps. • However, the reproduced speech signal often sounds quite synthetic and the speaker is often not recognisable. • LPC-10 Codec: 2400 bps American Military Standard.
Hybrid coding • Combines waveform and source coding methods in order to improve the speech quality and reduce the bit rate. • Typical bit rates lie between 4.8 and 16 kbps. • Technique: Analysis-by-synthesis • RELP (Residual Excited Linear Prediction) • CELP (Codebook Excited Linear Prediction) • MPLP (Multipulse Excited Linear Prediction) • RPE (Regular Pulse Excitation)
Speech Coding (Part I) Waveform Coding Linear PCM (Pulse-Code Modulation)
Pulse-Code Modulation (PCM) • A method for quantizing an analog signal for the purpose of transmitting or storing the signal in digital form.
Quantization • Maps each sample amplitude of the analog signal to one of a finite set of discrete levels, so that the signal can be transmitted or stored in digital form.
Quantization Error/Noise • [Figure: quantizer characteristic, with granular noise inside the quantizer range and overload noise beyond either end.]
Quantization Error/Noise • [Figure: unquantized sinewave, 3-bit quantization waveform, 3-bit quantization error, and 8-bit quantization error, with the quantization step size marked.]
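The 3-bit versus 8-bit comparison in the figure can be reproduced with a short sketch. It assumes a midrise uniform quantizer on [-x_max, x_max]; the function name `quantize_midrise`, the 5 Hz test tone, and the 8 kHz sampling rate are illustrative choices, not taken from the slides.

```python
import numpy as np

def quantize_midrise(x, bits, x_max=1.0):
    """Uniform midrise quantizer with 2**bits levels on [-x_max, x_max].

    Returns the quantized signal and the step size.
    """
    step = 2.0 * x_max / (2 ** bits)
    # Index of the quantization cell, clipped so that overload saturates.
    idx = np.floor(x / step)
    idx = np.clip(idx, -(2 ** (bits - 1)), 2 ** (bits - 1) - 1)
    # Reconstruct at the centre of the cell.
    return (idx + 0.5) * step, step

# Test sinewave that stays inside the quantizer range (no overload).
t = np.arange(0.0, 1.0, 1.0 / 8000.0)
x = 0.99 * np.sin(2.0 * np.pi * 5.0 * t)

for bits in (3, 8):
    xq, step = quantize_midrise(x, bits)
    err = xq - x
    print(f"{bits}-bit: step = {step:.5f}, max |error| = {np.max(np.abs(err)):.5f}")
```

As long as the input stays inside the range, the error magnitude never exceeds half a step, so each extra bit halves the worst-case granular error.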
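The Model of Quantization Noise • [Figure: the quantizer modeled as an adder that adds quantization noise to the input signal, with the quantization step size marked.]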
Signal-to-Quantization-Noise Ratio (SQNR) • A measure of the effect of the quantization errors introduced by analog-to-digital conversion.
Signal-to-Quantization-Noise Ratio (SQNR) • Assume [equation lost in extraction] • Is the assumption always appropriate?
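Signal-to-Quantization-Noise Ratio (SQNR) • $\mathrm{SQNR}\,(\mathrm{dB}) = 6.02\,b + 4.77 - 20\log_{10}(X_{\max}/\sigma_x)$ • Each code bit contributes 6 dB. • 4.77 is a constant. • The term $X_{\max}/\sigma_x$ tells how big a signal can be accurately represented.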
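The 6 dB-per-bit rule can be derived in a few lines. The sketch below assumes the usual textbook model, which may differ from the assumption on the original slide: a $b$-bit uniform quantizer spanning $[-X_{\max}, X_{\max}]$, no overload, and a quantization error uniformly distributed over one step $\Delta$. The symbols $\Delta$ and $\sigma_e^2$ are notation introduced here.

```latex
\begin{align*}
\Delta &= \frac{2X_{\max}}{2^{b}}
  && \text{step size of a $b$-bit uniform quantizer} \\
\sigma_e^2 &= \frac{\Delta^2}{12} = \frac{X_{\max}^2}{3 \cdot 2^{2b}}
  && \text{variance of an error uniform on } [-\Delta/2,\ \Delta/2] \\
\mathrm{SQNR} &= \frac{\sigma_x^2}{\sigma_e^2}
  = \frac{3 \cdot 2^{2b}}{(X_{\max}/\sigma_x)^2} \\
\mathrm{SQNR}\,(\mathrm{dB}) &= 20\,b\log_{10}2 + 10\log_{10}3
  - 20\log_{10}\frac{X_{\max}}{\sigma_x} \\
&\approx 6.02\,b + 4.77 - 20\log_{10}\frac{X_{\max}}{\sigma_x}
\end{align*}
```

The first term is the per-bit contribution, the second is the constant, and the third is the penalty for leaving headroom between the signal level $\sigma_x$ and the quantizer limit $X_{\max}$.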
Signal-to-Quantization-Noise Ratio (SQNR) • $X_{\max}$: determined by the A/D converter. • $\sigma_x$: depends on the distribution of the signal, which, in turn, depends on users and time.
Signal-to-Quantization-Noise Ratio (SQNR) • Under what conditions is the formula reasonable?
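One way to probe that question is to compare the closed-form value 6.02b + 4.77 - 20 log10(X_max/σ_x) with a measured SQNR. The sketch below is an illustration under stated assumptions, not the deck's own experiment: the input is synthetic zero-mean Gaussian noise with σ_x = 1, the quantizer is an 8-bit midrise design, and all function names are hypothetical.

```python
import numpy as np

def uniform_quantize(x, bits, x_max):
    """Midrise uniform quantizer on [-x_max, x_max]; overload saturates."""
    step = 2.0 * x_max / (2 ** bits)
    idx = np.clip(np.floor(x / step), -(2 ** (bits - 1)), 2 ** (bits - 1) - 1)
    return (idx + 0.5) * step

def empirical_sqnr_db(x, bits, x_max):
    """Measured signal power over measured quantization-error power, in dB."""
    err = uniform_quantize(x, bits, x_max) - x
    return 10.0 * np.log10(np.mean(x ** 2) / np.mean(err ** 2))

def formula_sqnr_db(bits, x_max, sigma_x):
    """Closed-form SQNR, valid only when overload is negligible."""
    return 6.02 * bits + 4.77 - 20.0 * np.log10(x_max / sigma_x)

rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, 200_000)  # synthetic stand-in for a real signal

for x_max in (4.0, 1.5):
    measured = empirical_sqnr_db(x, 8, x_max)
    predicted = formula_sqnr_db(8, x_max, 1.0)
    print(f"X_max = {x_max} sigma: measured {measured:.1f} dB, formula {predicted:.1f} dB")
```

With X_max = 4σ_x the two values agree closely, because clipping is rare. With X_max = 1.5σ_x the formula is far too optimistic, since it ignores overload noise entirely.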
Overload Distortion • [Figure: midtread and midrise quantizer characteristics.]
Probability of Distortion • Assume [equation lost in extraction] • [Figure: midtread and midrise quantizer characteristics.]
Overload and Quantization Noise with Gaussian Input pdf and b=4 • Assume [equation lost in extraction] • [Figure: midtread and midrise quantizer characteristics.]
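For a Gaussian input, the probability that a sample falls outside the quantizer range has a closed form. The sketch below assumes a zero-mean Gaussian input with standard deviation σ_x and a symmetric range [-X_max, X_max]; the function name and the loading factors tried are illustrative, not values recovered from the slides.

```python
import math

def overload_probability(x_max, sigma_x):
    """P(|x| > x_max) for a zero-mean Gaussian input with std sigma_x."""
    return math.erfc(x_max / (sigma_x * math.sqrt(2.0)))

# Try a few loading factors X_max / sigma_x.
for loading in (2.0, 3.0, 4.0):
    p = overload_probability(loading, 1.0)
    print(f"X_max = {loading} sigma_x: P(overload) = {p:.2e}")
```

A larger X_max makes overload rarer but also widens the step size for a fixed number of bits, which is the trade-off between overload noise and granular noise.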
Uniform Quantizer Performance • [Figure: two panels, one for a uniform input pdf and one for a Gaussian input pdf.]
More on Uniform Quantization • Conceptually simple and simple to implement. • Imposes no restrictions on the signal's statistics. • Maintains a constant maximum error across its total dynamic range. • $\sigma_x$ varies greatly (on the order of 40 dB) across sounds, speakers, and input conditions. • We need a quantizing system where the SQNR is independent of the signal's dynamic range, i.e., a near-constant SQNR across its dynamic range.
Speech Coding (Part I) Waveform Coding Nonlinear PCM
Probability Density Functions of Speech Signals • Counting the number of samples in each interval provides an estimate of the pdf of the signal.
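The counting procedure amounts to a normalized histogram. The sketch below uses synthetic unit-variance Laplacian samples as a stand-in for real speech, which is an assumption made only to keep the example self-contained; `estimate_pdf`, the bin count, and the amplitude limit are hypothetical choices.

```python
import numpy as np

def estimate_pdf(samples, num_bins=101, limit=5.0):
    """Estimate a pdf by counting samples in equal-width intervals."""
    edges = np.linspace(-limit, limit, num_bins + 1)
    counts, _ = np.histogram(samples, bins=edges)
    width = edges[1] - edges[0]
    # Divide by (N * width) so that the estimate integrates to about 1.
    pdf = counts / (len(samples) * width)
    centers = 0.5 * (edges[:-1] + edges[1:])
    return centers, pdf

rng = np.random.default_rng(0)
# Laplacian with scale 1/sqrt(2) has unit variance.
samples = rng.laplace(0.0, 1.0 / np.sqrt(2.0), 100_000)

centers, pdf = estimate_pdf(samples)
width = centers[1] - centers[0]
print("area under estimate:", np.sum(pdf) * width)
print("most likely amplitude:", centers[np.argmax(pdf)])
```

Samples beyond the chosen limit are not counted, so the area is slightly below 1; with real speech the samples would first be normalized to zero mean and unit variance.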
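Probability Density Functions of Speech Signals • A good approximation is a gamma distribution, of the form $p(x) = \left[\frac{\sqrt{3}}{8\pi\sigma_x |x|}\right]^{1/2} e^{-\sqrt{3}\,|x|/(2\sigma_x)}$ • A simpler approximation is a Laplacian density, of the form $p(x) = \frac{1}{\sqrt{2}\,\sigma_x}\, e^{-\sqrt{2}\,|x|/\sigma_x}$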
Probability Density Functions of Speech Signals • Distributions are normalized so that $\mu_x = 0$ and $\sigma_x = 1$. • The gamma density approximates the measured distribution of speech more closely than the Laplacian. • The Laplacian is still a good model in analytical studies. • Small amplitudes are much more likely than large amplitudes, by a 100:1 ratio.
Companding • The dynamic range of signals is compressed before transmission and is expanded to the original value at the receiver. • Allowing signals with a large dynamic range to be transmitted over facilities that have a smaller dynamic range capability. • Companding reduces the noise and crosstalk levels at the receiver.
Companding • [Figure: block diagram of Compressor, Uniform Quantizer, and Expander in cascade.] • After compression, y is nearly uniformly distributed.
The Quantization-Error Variance of Nonuniform Quantizer (Jayant and Noll) • [Figure: Compressor, Uniform Quantizer, and Expander in cascade; the equations on this slide were lost in extraction.]
The Optimal C(x) (Jayant and Noll) • If the signal's pdf is known, then the maximum SQNR (minimum quantization-error variance) is achievable by letting $C(x) = X_{\max}\, \frac{\int_0^{x} p^{1/3}(u)\,du}{\int_0^{X_{\max}} p^{1/3}(u)\,du}$ • Is the assumption realistic? • [Figure: Compressor, Uniform Quantizer, and Expander in cascade.]
PDF-Independent Nonuniform Quantization • Assuming no overload, we require that the SQNR be independent of p(x).
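A short derivation shows why a logarithmic compressor meets this requirement. It is a sketch under the high-resolution approximation of the compander's error variance (the form associated with Jayant and Noll), for a zero-mean input with no overload. The step size $\Delta$ of the uniform quantizer, the slope $C'(x)$, and the constant $k$ are notation assumed here, not recovered from the slide.

```latex
\begin{align*}
\sigma_e^2 &\approx \frac{\Delta^2}{12}
  \int_{-X_{\max}}^{X_{\max}} \frac{p(x)}{[C'(x)]^2}\,dx
  && \text{error variance of the compander} \\
C'(x) &= \frac{k}{|x|}
  && \text{choose a logarithmic compressor} \\
\sigma_e^2 &\approx \frac{\Delta^2}{12k^2}
  \int_{-X_{\max}}^{X_{\max}} x^2\,p(x)\,dx
  = \frac{\Delta^2\,\sigma_x^2}{12k^2} \\
\mathrm{SQNR} &= \frac{\sigma_x^2}{\sigma_e^2} \approx \frac{12k^2}{\Delta^2}
  && \text{independent of } p(x)
\end{align*}
```

A pure logarithm is singular at $x = 0$, so practical compressors are only approximately logarithmic: close to linear for small amplitudes and logarithmic for large ones.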
μ-Law & A-Law Companding • μ-Law • A North American PCM standard • Used by North America and Japan • μ = 255 in U.S. and Canada • A-Law • An ITU PCM standard • Used by Europe • A = 87.56 in Europe
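The two compressor curves and their inverses can be sketched in a few lines. This is the continuous form of the characteristics for inputs normalized to [-1, 1], with μ = 255 and A = 87.56 as quoted above; it is not the segmented 8-bit encoding used in deployed codecs, and the function names are hypothetical.

```python
import numpy as np

def mu_law_compress(x, mu=255.0):
    """mu-law compressor for x normalized to [-1, 1]."""
    return np.sign(x) * np.log1p(mu * np.abs(x)) / np.log1p(mu)

def mu_law_expand(y, mu=255.0):
    """Inverse of mu_law_compress."""
    return np.sign(y) * np.expm1(np.abs(y) * np.log1p(mu)) / mu

def a_law_compress(x, a=87.56):
    """A-law compressor: linear below 1/A, logarithmic above."""
    ax = np.abs(x)
    k = 1.0 + np.log(a)
    small = a * ax / k
    # np.maximum keeps the log argument >= 1 in the branch that is discarded.
    large = (1.0 + np.log(np.maximum(a * ax, 1.0))) / k
    return np.sign(x) * np.where(ax < 1.0 / a, small, large)

def a_law_expand(y, a=87.56):
    """Inverse of a_law_compress."""
    ay = np.abs(y)
    k = 1.0 + np.log(a)
    small = ay * k / a
    large = np.exp(ay * k - 1.0) / a
    return np.sign(y) * np.where(ay < 1.0 / k, small, large)

x = np.linspace(-1.0, 1.0, 2001)
print("mu-law round-trip error:", np.max(np.abs(mu_law_expand(mu_law_compress(x)) - x)))
print("A-law round-trip error:", np.max(np.abs(a_law_expand(a_law_compress(x)) - x)))
print("mu-law output for input 0.01:", mu_law_compress(np.array([0.01]))[0])
```

An input at 1% of full scale maps to more than 20% of the compressed range, which is how small amplitudes receive finer effective quantization once a uniform quantizer is placed between compressor and expander.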