Other Features

hailey

Presentation Transcript


  1. Other Features

  2. Echo Cancellation

  3. Ecan Acoustic Echo

  4. Ecan Line echo [Figure: Telephone 1 and Telephone 2 connected through two hybrids]

  5. Ecan Subjective reaction to echo

  6. Ecan

  7. Ecan Subjective effect of 15 dB echo return loss.

  8. Ecan Echo suppressor [Figure: two 4w paths, each with a switch, controlled through blocks labeled comp and inv] In practice more is needed: VOX, override, reset, etc.

  9. Ecan Why not echo suppression? • Echo suppression makes conversation half duplex • Waste of full-duplex infrastructure • Conversation unnatural • Hard to break in • Dead-sounding line. It would be better to cancel the echo: subtract the echo signal while allowing the desired signal through. But that requires DSP. [Figure: subtraction on the path between near end and far end]

  10. Ecan Echo cancellation? Unfortunately, it’s not so easy: the outgoing signal is delayed, attenuated, and distorted. Two echo canceller architectures: modem type and line echo canceller (LEC). [Figure: for each architecture, near end, far end, the echo path, a subtraction, and the resulting clean signal]

  11. Ecan LEC architecture [Figure: block diagram with near end, hybrid, A/D, D/A, far end, signals X and Y, adaptive filter H with an adapt input, a subtraction, a doubletalk detector, and NLP]

  12. Ecan Adaptive Algorithms. How do we • find the echo cancelling filter? • keep it correct even if the echo path parameters change? We need an algorithm that continually changes the filter parameters. All adaptive algorithms are based on the same idea (lack of correlation between the desired signal and the interference). Let’s start with a simpler case: adaptive noise cancellation.

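  13. Ecan Noise cancellation [Figure: noise cancellation block diagrams with signals x, n, y, the unknown distortion h, the correction e, and a subtraction]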

  14. Ecan Noise cancellation (cont.). Assume that the noise is distorted only by an unknown gain h. We correct by transmitting e n, so that the audience hears $y = x + h n - e n = x + (h - e) n$. The energy of this signal is $E_y = \langle y^2 \rangle = \langle x^2 \rangle + (h - e)^2 \langle n^2 \rangle + 2 (h - e) \langle x n \rangle$. Assume that $C_{xn} = \langle x n \rangle = 0$. Then we need only set e to minimize $E_y$ (turn the knob until the energy is minimal). Even if the distortion is a complete filter h, we set the ANC filter e to minimize $E_y$.
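
The "turn the knob until minimal" idea can be sketched numerically. The following is a minimal illustration, not the presenter's code: the signals are synthetic Gaussian noise, and the names `output_energy`, `best_gain`, the gain value 0.7 and the candidate grid are all assumptions made for the example.

```python
import random

def output_energy(x, n, h, e):
    """Mean energy of y = x + (h - e) * n."""
    return sum((xi + (h - e) * ni) ** 2 for xi, ni in zip(x, n)) / len(x)

def best_gain(x, n, h, candidates):
    """Try each candidate knob setting e and keep the one with least energy."""
    return min(candidates, key=lambda e: output_energy(x, n, h, e))

random.seed(0)
N = 20000
x = [random.gauss(0, 1) for _ in range(N)]   # desired signal (synthetic)
n = [random.gauss(0, 1) for _ in range(N)]   # noise, uncorrelated with x
h = 0.7                                      # unknown gain (assumed value)

candidates = [k / 100 for k in range(0, 151)]  # knob positions 0.00 .. 1.50
e_hat = best_gain(x, n, h, candidates)
```

Because x and n are nearly uncorrelated, the cross term almost vanishes and the minimum-energy setting `e_hat` lands close to h, even though the search never looks at h directly, only at the output energy.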

  15. Ecan The LMS algorithm. Gradient descent on the energy: the correction to H is proportional to the error d times the input X: $H \leftarrow H + \lambda \, d \, X$
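
The update rule above can be sketched as a small echo canceller. This is an illustrative sketch under stated assumptions: the function name `lms_echo_canceller`, the three-tap `true_path`, the step size and the noise-free synthetic signals are all made up for the example, and there is no near-end speech.

```python
import random

def lms_echo_canceller(far, mic, taps, step):
    """Adapt filter H so that H applied to far-end samples X predicts the echo."""
    H = [0.0] * taps
    X = [0.0] * taps                      # recent far-end samples, newest first
    errors = []
    for x, y in zip(far, mic):
        X = [x] + X[:-1]
        echo_estimate = sum(h * xi for h, xi in zip(H, X))
        d = y - echo_estimate             # error after subtracting the estimate
        H = [h + step * d * xi for h, xi in zip(H, X)]   # H <- H + step * d * X
        errors.append(d)
    return H, errors

random.seed(1)
true_path = [0.5, -0.3, 0.1]              # hypothetical echo path
far = [random.gauss(0, 1) for _ in range(5000)]
mic = [sum(true_path[k] * far[i - k] for k in range(len(true_path)) if i - k >= 0)
       for i in range(len(far))]

H, errors = lms_echo_canceller(far, mic, taps=3, step=0.01)
```

With these settings the adapted `H` approaches `true_path` and the residual error shrinks toward zero. A larger `step` converges faster but can become unstable, which is one reason practical cancellers often normalize the step by the input energy.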

  16. Ecan Nonlinear processing. Because of finite numeric precision, the (linear) LEC filtering cannot completely remove the echo. A standard LEC adds center clipping to remove the residual echo. The clipping threshold needs to be properly set by adaptation.
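
Center clipping can be sketched in a few lines. This is one simple variant, assumed for illustration: samples below the threshold are zeroed and larger samples pass unchanged. The fixed threshold of 0.05 and the sample values are hypothetical; as the slide notes, a real canceller would adapt the threshold.

```python
def center_clip(sample, threshold):
    """Zero out small residual samples; pass larger ones unchanged."""
    return 0.0 if abs(sample) < threshold else sample

# Hypothetical canceller output: small residual echo plus two speech peaks.
residual = [0.02, -0.01, 0.5, -0.6, 0.03]
cleaned = [center_clip(s, threshold=0.05) for s in residual]
```

If the threshold is set too high it also clips quiet near-end speech, and if it is too low residual echo leaks through, which is why the threshold has to track the residual level.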

  17. Ecan Doubletalk detection. Adaptation of H should take place only when the far end speaks. So we freeze adaptation when there is no far-end speech or during double-talk, that is, whenever the near end speaks. The Geigel algorithm compares the absolute value of the near-end speech to half the maximum absolute value in the X buffer. If the near end exceeds the far end, we can assume that only the near end is speaking.
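
The Geigel comparison described above can be sketched as follows. The function name `near_end_active`, the buffer length of three samples and the sample values are assumptions for illustration; a real detector would use a buffer covering the echo path delay and usually a hangover time.

```python
from collections import deque

def near_end_active(near_sample, far_buffer, ratio=0.5):
    """Geigel test: near-end magnitude versus half the far-end peak in the X buffer."""
    far_peak = max((abs(x) for x in far_buffer), default=0.0)
    return abs(near_sample) > ratio * far_peak

far_buffer = deque([0.8, -0.4, 0.2], maxlen=3)   # recent far-end samples X

echo_only = near_end_active(0.3, far_buffer)      # below 0.5 * 0.8: keep adapting
double_talk = near_end_active(0.6, far_buffer)    # above 0.5 * 0.8: freeze adaptation
```

The factor of one half reflects the assumption that the echo returns at least 6 dB weaker than the far-end signal, so anything louder is attributed to the near-end talker.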

  18. Data Relays

  19. Relays The need for relays. Voice is a relatively forgiving signal (or rather, the ear is). Compression techniques are designed to pass voice but may hopelessly distort other signals. Even simple tones (or DTMF) may not be passed by coders. We could go back to 64 Kbps G.711 for non-voice signals. But isn’t that silly, using 64 Kbps for 64 bps or even 9.6 Kbps data? The solution is to use a relay.

  20. Open Channel. Reasons to use 64 Kbps G.711 (open channel) (32 Kbps ADPCM may work as well): • Inexpensive • Simple design • Robust. Even open channel is not trivial! • Need dynamic BW mechanism • Need to detect the event (fax/modem tone, DTMF, MF, CPT, etc.) • Need to return to compressed voice (end of session, time-out)

  21. Relays Tone / Fax / Modem Relay [Figure: Fax, analog line, A/D and D/A at 64 Kbps, Demodulate/Remodulate, PSN, Demodulate/Remodulate, A/D and D/A at 64 Kbps, analog line, Fax] Problems: • need highly accurate detectors • need low false alarm rate • need appropriate protocol • need accurate timing • need expensive DSP processing • delay may be too large • may need “spoofing” • can the two sides operate with different parameters?

  22. Relays VoP DSP Architecture [Figure: block diagram with PSN, Voice Packet Module, PCM Interface, Tone Detector, Tone Generator, LEC, VAD, CNG, DISC., Packet Voice Protocol, Multi Channel Codec, Speech Coders, Serial Port, Playout Unit, Real Time Operating System, Control]

  23. Relays VoP System Implementation [Figure: block diagram with PSTN, ATM / FR / IP Network, Telephony Signaling Module, Network Management Module (NM info), Voice Packet Module (DSP), Packet Protocol Module (Microprocessor), carrying Voice, Signaling, and Voice & Signaling Packets]

  24. Quality of Service

  25. QoS The meaning of QoS. For general purpose data: • Every little bit counts • Only lossless compression • Best effort delivery • Real-time not essential • Dynamic routing and packet reordering allowed. For speech: • Only subjective quality counts • Can use lossy compression • Can drop segments with little effect • Real-time essential • Predetermined route preferable (traffic engineering)

  26. QoS PSTN QoS • Virtually all calls (>95%) completed • Once connected virtually no disconnects or faults • Toll quality voice • Low delay (except satellite calls) • Full switching, optimized routing • Call Management • Fax/Modem functions • Wireline and wireless services

  27. QoS Paying for QoS • Law of Photonics: the price of transmitting a bit drops by half every 9 months • Free Internet telephony: several firms offer free long distance service over the Internet, with strong compression, significant delay and jitter. We no longer need to pay for service … but we are willing to pay for quality of service.

  28. QoS Paying for QoS [Figure: toll, wire service, mobile service]

  29. Speech Quality Measurement

  30. SQM Why does it sound the way it sounds? PSTN: • BW = 0.2-3.8 kHz, SNR > 30 dB • PCM, ADPCM (BER $10^{-3}$) • five nines reliability • line echo cancellation. Voice over packet network: • speech compression • delay, delay variation, jitter • packet loss/corruption/priority • echo cancellation

  31. SQM Subjective Voice Quality. Old measures: • 5/9 • DRT • DAM. The modern scale: • MOS • DMOS. [Figure: rhyming words meet, neat, seat, feet, Pete, beat, heat]

  32. SQM MOS according to ITU P.800, Subjective Determination of Transmission Quality. Annex B: Absolute Category Rating (ACR). Score: Listening Quality / Listening Effort. 5: excellent / relaxed; 4: good / attention needed; 3: fair / moderate effort; 2: poor / considerable effort; 1: bad / no meaning with feasible effort.

  33. SQM MOS according to ITU (cont.). Annex D: Degradation Category Rating (DCR). Annex E: Comparison Category Rating (CCR). • ACR is not good at high quality speech. DCR scale: 5 inaudible; 4 not annoying; 3 slightly annoying; 2 annoying; 1 very annoying. CCR scale: 3 much better; 2 better; 1 slightly better; 0 the same; -1 slightly worse; -2 worse; -3 much worse.

  34. SQM Some MOS numbers. Effect of speech compression (from ITU-T Study Group 15): • Quiet room, 48 kHz 16 bit linear sampling: 5.0 • PCM (A-law/μ-law) 64 Kb/s: 4.1 • G.723.1 @ 6.3 Kb/s: 3.9 • G.729 @ 8 Kb/s: 3.9 • ADPCM G.726 32 Kb/s: 3.8 (toll quality) • GSM @ 13 Kb/s: 3.6 • VSELP IS54 @ 8 Kb/s: 3.4

  35. SQM The Problem(s) with MOS. Accurate MOS tests are the only reliable benchmark, BUT • MOS tests are off-line • MOS tests are slow • MOS tests are expensive • Different labs give consistently different results • Most MOS tests only check one aspect of the system

  36. SQM The Problem(s) with SNR. Naive question: isn’t CCR the same as SNR? SNR does not correlate well with subjective criteria. Squared difference is not an accurate comparator; it is thrown off by: • Gain • Delay • Phase • Nonlinear processing
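
A small numerical sketch shows why a squared-difference SNR misleads. The test tone, sampling rate and the helper `snr_db` are assumptions chosen for the example: a pure gain change or a short delay leaves the tone sounding the same, yet the SNR reports heavy "noise".

```python
import math

def snr_db(reference, degraded):
    """SNR in dB, treating the sample-by-sample difference as noise."""
    signal = sum(r * r for r in reference)
    noise = sum((r - d) ** 2 for r, d in zip(reference, degraded))
    return math.inf if noise == 0 else 10 * math.log10(signal / noise)

# 1 kHz tone sampled at 8 kHz: exactly 8 samples per period, 100 periods.
tone = [math.sin(2 * math.pi * 1000 * t / 8000) for t in range(800)]

half_gain = [0.5 * s for s in tone]   # same tone, 6 dB quieter
delayed = tone[4:] + tone[:4]         # same tone, delayed by half a period

snr_gain = snr_db(tone, half_gain)    # about +6 dB
snr_delay = snr_db(tone, delayed)     # about -6 dB
```

The delayed copy scores a negative SNR because a half-period shift inverts the waveform, so the "noise" is twice the signal amplitude, even though a listener would hear no degradation at all.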

  37. SQM Speech distance measures. Many objective measures have been proposed: • Segmental SNR • Itakura-Saito distance • Euclidean distance in cepstrum space • Bark spectral distortion • Coherence function. None correlate well with MOS. ITU target: find a quality measure that does correlate well.

  38. SQM Return to Biology. The standard speech model (LPC), used by most speech processing/compression/recognition systems, is a model of speech production. Unfortunately, the speech production and perception systems are not matched. Speech quality measurement idea: use a model of the human auditory system (perception). ITU-T P.861: Perceptual Speech Quality Measurement (PSQM). ITU-T P.862: Perceptual Evaluation of Speech Quality (PESQ). ITU-R BS.1387: Objective Measurements of Perceived Audio Quality.

  39. SQM Some objective methods. • Perceptual Speech Quality Measurement (PSQM): ITU-T P.861 • Perceptual Analysis Measurement System (PAMS): BT proprietary technique • Perceptual Evaluation of Speech Quality (PESQ): ITU-T P.862 • Objective Measurement of Perceived Audio Quality (PAQM): ITU-R BS.1387 • E-model: ITU-T G.107, G.108, ETSI ETR-250

  40. SQM Objective Quality Strategy [Figure: speech, channel, QM, "QM to MOS" mapping, MOS estimate]

  41. SQM PSQM philosophy (from P.861) [Figure: two Perceptual model blocks, each producing an Internal Representation; an Audible Difference block; a Cognitive Model block]

  42. SQM PSQM philosophy (cont.). Perceptual modelling (internal representation): • Short time Fourier transform • Frequency warping (telephone-band filtering, Hoth noise) • Intensity warping. Cognitive modelling: • Loudness scaling • Internal cognitive noise • Asymmetry • Silent interval processing. PSQM values: • 0 (no degradation) to 6.5 (maximum degradation). Conversion to MOS: • PSQM to MOS calibration using known references • Equivalent Q values

  43. SQM Problems with PSQM. Designed for telephony grade speech codecs. Doesn’t take network effects into account: • filtering • variable time delay • localized distortions. Draft standard P.862 adds: • transfer function equalization • time alignment, delay skipping • distortion averaging

  44. SQM PESQ philosophy (from P.862) [Figure: Time Alignment block; two Perceptual model blocks, each producing an Internal Representation; an Audible Difference block; a Cognitive Model block]

  45. SQM E-model R factor: a mouth-to-ear transmission quality model. $R = R_0 - I_s - I_d - I_e + A$, where $R_0$ is the effect of SNR, $I_s$ the effect of simultaneous impairments, $I_d$ the effect of delayed impairments, $I_e$ the effect of equipment distortion, and $A$ the advantage of the method (e.g. mobility of a cellphone). Defined in ITU-T G.107, G.108 and ETSI ETR-250.
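
The R factor sum can be sketched directly. The input values below are hypothetical and chosen only to make the arithmetic easy to follow. The `r_to_mos` mapping is not on the slide; it is the R-to-MOS conversion given in ITU-T G.107 and is included here as an assumption about how the R factor would be reported on the MOS scale.

```python
def r_factor(r0, i_s, i_d, i_e, a):
    """E-model rating: R = R0 - Is - Id - Ie + A."""
    return r0 - i_s - i_d - i_e + a

def r_to_mos(r):
    """Estimated MOS for a given R (conversion formula from ITU-T G.107)."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1 + 0.035 * r + r * (r - 60) * (100 - r) * 7e-6

# Hypothetical impairment values, for illustration only.
r = r_factor(r0=95, i_s=2, i_d=5, i_e=10, a=0)
mos = r_to_mos(r)
```

With these inputs R is 78 and the estimated MOS is a little under 4. Because the impairments are additive, the effect of a codec change (Ie) or added delay (Id) can be read off directly as points of R.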

  46. SQM VQMon. PSQM and PESQ are intrusive techniques. PSQM and PESQ require on-line DSP processing. Given the speech encoder, shouldn’t there be a connection between network parameters (e.g. packet loss, jitter) and speech quality? A nonintrusive technique has been developed based on the E-model. Invented by AD Clark (Telchemy), accepted by ETSI TIPHON.
