
Scalable Speech Coding for IP Networks

Scalable Speech Coding for IP Networks. Koji Seto, Signal Processing Research Lab. (SPRL), Department of Electrical Engineering, Santa Clara University, CA 95053, USA, setocom@yahoo.com. IEEE Signal Processing Society Santa Clara Valley Chapter, Ph.D. Elevator Pitch to Professionals.


Presentation Transcript


  1. Scalable Speech Coding for IP Networks
     Koji Seto
     Signal Processing Research Lab. (SPRL), Department of Electrical Engineering, Santa Clara University, CA 95053, USA
     setocom@yahoo.com
     IEEE Signal Processing Society Santa Clara Valley Chapter
     Ph.D. Elevator Pitch to Professionals
     Wednesday, Dec. 9, 2015

  2. Motivation
     The transition from the PSTN to an all-IP network (Voice over IP) requires high robustness to packet loss.
     Challenge of VoIP: reasonable speech quality cannot be guaranteed, because packets may be lost.
     Most current speech codecs [CELP]: frame dependency causes error propagation when a packet is lost.
     Solution:
     • CELP + Side Information
     • Frame-independent Coding [iLBC (internet Low Bit-rate Codec)]
     However, the iLBC lacks some key features:
     • Rate Flexibility
     • Scalability
     • Wideband Support
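     The difference between frame-dependent and frame-independent coding under packet loss can be illustrated with a toy sketch. It is not CELP or iLBC: each "frame" is a single integer, the dependent coder is a plain delta coder, and all function names are hypothetical. It only shows why one lost frame corrupts every later frame when decoding depends on past state.

```python
# Toy illustration only: one integer stands in for a whole speech frame.
# The delta coder is an assumed stand-in for a frame-dependent codec.

def encode_dependent(samples):
    """Each frame stores only the change relative to the previous frame."""
    prev = 0
    deltas = []
    for s in samples:
        deltas.append(s - prev)
        prev = s
    return deltas

def decode_dependent(deltas, lost):
    """A lost frame leaves the decoder state unchanged (hold last value)."""
    state = 0
    out = []
    for i, d in enumerate(deltas):
        if i not in lost:
            state += d
        out.append(state)
    return out

def decode_independent(frames, lost):
    """Each frame carries its full value; a lost frame repeats the last output."""
    last = 0
    out = []
    for i, f in enumerate(frames):
        if i not in lost:
            last = f
        out.append(last)
    return out

signal = [3, 5, 8, 6, 4, 7, 9, 2]
lost = {2}  # frame index 2 is dropped by the network

dependent_out = decode_dependent(encode_dependent(signal), lost)
independent_out = decode_independent(signal, lost)

dependent_errors = sum(1 for a, b in zip(signal, dependent_out) if a != b)
independent_errors = sum(1 for a, b in zip(signal, independent_out) if a != b)

print("frame-dependent  :", dependent_out, "wrong frames:", dependent_errors)
print("frame-independent:", independent_out, "wrong frames:", independent_errors)
```

     In this toy case the dependent decoder stays offset from the lost frame onward, while the independent decoder is wrong only in the lost frame itself.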

  3. Proposed codec
     [Figure: Block diagram of the encoder. Labeled elements: wideband input signal; QMF analysis filter bank; lower-band signal; higher-band signal; HPF 50 Hz; LPF 3 kHz; (-1)^n; multi-rate iLBC encoder and decoder; TDBWE encoder and decoder; perceptual weighting; WPT/MDCT (0–4 kHz) and WPT/MDCT (4–8 kHz); AVQ (0–1 or 2 kHz), AVQ (1 or 2–8 kHz), and AVQ (0–8 kHz), with AVQ decoders; Layers 1 to 5.]
     • Rate Flexibility: by encoding in the frequency domain
     • Scalability: by encoding the coding error from a lower layer
     • Wideband Support: by employing bandwidth scalability
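     The split of a wideband signal into a lower band and a higher band can be sketched with the simplest two-channel filter bank, the 2-tap Haar pair. This is an assumption for illustration: the slides do not specify the QMF filters, and a practical codec would use much longer filters. The function names are hypothetical.

```python
# Minimal two-band analysis/synthesis sketch using the 2-tap Haar pair.
# Assumption: the actual QMF filters of the codec are not given in the slides.

def analysis(x):
    """Split an even-length signal into half-rate low and high bands."""
    low = [(x[i] + x[i + 1]) / 2 for i in range(0, len(x), 2)]
    high = [(x[i] - x[i + 1]) / 2 for i in range(0, len(x), 2)]
    return low, high

def synthesis(low, high):
    """Recombine the two half-rate bands into the full-rate signal."""
    x = []
    for l, h in zip(low, high):
        x.extend([l + h, l - h])
    return x

def energy(v):
    return sum(s * s for s in v)

# A slowly varying signal and a signal alternating at the highest frequency.
slow = [1.0, 1.0, 1.0, 1.0, -1.0, -1.0, -1.0, -1.0]
fast = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]

slow_low, slow_high = analysis(slow)
fast_low, fast_high = analysis(fast)

print("slow: low-band energy", energy(slow_low), "high-band energy", energy(slow_high))
print("fast: low-band energy", energy(fast_low), "high-band energy", energy(fast_high))
print("perfect reconstruction:", synthesis(slow_low, slow_high) == slow)
```

     Each band runs at half the input rate, so a narrowband core codec can handle the lower band while a separate, cheaper scheme handles the higher band.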

  4. Proposed Codec
     The proposed codec was developed by adding the following three functionalities to the iLBC:
     • Rate Flexibility: by encoding in the frequency domain
     • Scalability: by encoding the coding error from a lower layer
     • Wideband Support: by employing bandwidth scalability
     [Figure: Proposed codec using the WPT (Wavelet Transform) and the MDCT vs. G.729.1. Panels: clean channel condition; lossy channel condition (16 kbps).]
     Note: The PLC algorithm is not optimized for the proposed codec.
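     Scalability by encoding the coding error from a lower layer can be sketched as follows. Uniform scalar quantizers with hypothetical step sizes stand in for the real layers (iLBC, TDBWE, AVQ, WPT/MDCT), which are far more elaborate. The point is only the structure: each layer codes what the layers below it left over, and a decoder can stop after any layer.

```python
# Layered (embedded) coding sketch. Assumption: a uniform scalar quantizer
# replaces every real layer; the step sizes and frame values are made up.

def encode_layers(frame, steps):
    """Return one list of quantizer indices per layer, coarse to fine."""
    layers = []
    residual = list(frame)
    for step in steps:
        indices = [round(r / step) for r in residual]
        layers.append(indices)
        # The next layer codes what this layer failed to represent.
        residual = [r - i * step for r, i in zip(residual, indices)]
    return layers

def decode_layers(layers, steps):
    """Sum the contributions of however many layers were received."""
    out = [0.0] * len(layers[0])
    for indices, step in zip(layers, steps):
        out = [o + i * step for o, i in zip(out, indices)]
    return out

frame = [0.83, -0.41, 0.27, -0.95]
steps = [0.5, 0.125, 0.03125]  # core layer first, then two refinements

layers = encode_layers(frame, steps)

errors = []
for n in range(1, len(steps) + 1):
    recon = decode_layers(layers[:n], steps[:n])
    errors.append(max(abs(a - b) for a, b in zip(frame, recon)))
    print(f"layers received: {n}, max error: {errors[-1]:.5f}")
```

     Dropping the upper layers lowers the bit rate at the cost of a larger reconstruction error, and the core layer alone still decodes.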

  5. Key Contributions
     • A scalable wideband speech codec for IP networks was developed by adding Rate Flexibility, Scalability, and Wideband Support to the original iLBC.
     • This work shows that there is a convincing alternative to the current industry trend in codec design: using a frame-independent codec, such as an iLBC-based codec, as the core-layer codec.
     • This work also shows that using the wavelet transform (WT) instead of the MDCT to encode the coding error from a core codec is an effective technique, one that may be applicable to any codec.
