Speaker Identification Using a Pitch Detection Algorithm

Presenters: Estefany Carrillo, Roberto M. Meléndez, Komal Syed
Montgomery College Speech Processing Center
Faculty Advisor: Dr. Uchechukwu Abanulo

Presentation Outline
• Introduction
• Speech Classification Algorithm
• Pitch Detection Algorithm
• Application and Results
• Summary

Objectives
• To estimate the pitch contour of a given speech signal using autocorrelation
• To determine the effectiveness of pitch for speaker identification

Speech Signals
To understand pitch, one must first understand some basic concepts of speech signals.

Voiced vs. Unvoiced Speech
• Voiced
  • Quasi-periodic excitation
  • Modulation by vocal tract
  • Production of mainly vowels
  • High energy
• Unvoiced
  • No periodic vibration of vocal cords
  • Noise-like nature
  • Production of most consonants
  • Low energy

[Figure: Speech Signals]

Pitch Illustration
No periodicity, no frequency
• Pitch period is the distance in time from one peak to the next
• Approximately the same for the same phoneme by the same speaker

How do we measure the pitch period automatically?
• Correlation
  • Measure of similarity between two signals
  • Two signals are compared by
    • Sliding one signal by a certain time lag
    • Multiplying the overlapping regions
    • Repeating the process and adding the products until there is no more overlap
• Cross-correlation: two different signals are compared
• Autocorrelation: a signal is correlated with itself
  • Results in a maximum peak, at which we set time = 0; the rest of the correlation signal tapers off to zero
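The slide-multiply-add procedure above can be sketched in a few lines of plain Python. This is only an illustration: the function name `autocorrelation` and the toy signal are assumptions, not part of the original presentation, and the sum is left unnormalized.

```python
def autocorrelation(signal):
    """Return r[k] = sum over n of signal[n] * signal[n + k], for k = 0 .. len(signal) - 1."""
    n = len(signal)
    return [
        # Slide the signal by `lag`, multiply the overlapping samples, add the products.
        sum(signal[i] * signal[i + lag] for i in range(n - lag))
        for lag in range(n)
    ]

# Toy signal (hypothetical) that repeats every 4 samples.
toy = [1.0, 0.0, -1.0, 0.0] * 4
r = autocorrelation(toy)

# The largest value sits at zero lag; the largest value at a nonzero lag
# falls at the period of the toy signal.
strongest_nonzero_lag = max(range(1, len(r)), key=lambda k: r[k])
```

Because the overlap shrinks as the lag grows, the values taper toward zero at large lags, which is the behavior the slide describes.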

Rationale for Autocorrelation
A periodic (or quasi-periodic) signal will be similar from one period to the next.
It is expected that, apart from the peak at zero lag, the maximum peak in the autocorrelation function will occur at the pitch period value for each speech frame.

Speech Classification Algorithm

Speech Classification
Given a normalized speech signal (amplitudes from -1 to 1).
Since speech is non-stationary (its characteristics change frequently with time), we first segment this signal into short frames (of about 10 ms).
We then compute the average energy of each frame.
Based on a pre-determined threshold, we classify each frame as voiced, unvoiced, or background.
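A minimal sketch of the framing and energy-based classification follows. The energy formula did not survive in the transcript, so the sketch assumes the usual mean-square definition, E = (1/N) Σ x[n]², over the N samples of a frame. The use of two thresholds (one separating voiced from unvoiced, one separating unvoiced from background) is also an assumption, as are the function names and the toy values.

```python
def frame_energies(signal, sample_rate, frame_ms=10.0):
    """Split the signal into non-overlapping frames and return each frame's mean-square energy."""
    frame_len = max(1, int(round(sample_rate * frame_ms / 1000.0)))
    energies = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        energies.append(sum(x * x for x in frame) / frame_len)
    return energies

def classify_frames(energies, voiced_threshold, background_threshold):
    """Label each frame by comparing its energy with two (assumed) thresholds."""
    labels = []
    for energy in energies:
        if energy >= voiced_threshold:
            labels.append("voiced")
        elif energy >= background_threshold:
            labels.append("unvoiced")
        else:
            labels.append("background")
    return labels

# Toy normalized signal (hypothetical): loud, quiet, then silent, 10 ms each at 1 kHz.
toy_signal = [0.5] * 10 + [0.05] * 10 + [0.0] * 10
energies = frame_energies(toy_signal, sample_rate=1000)
labels = classify_frames(energies, voiced_threshold=0.01, background_threshold=0.001)
```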

Pitch Detection Algorithm

Autocorrelation-Based PDA
First, we automatically assign a pitch of zero to every unvoiced or silence frame determined by the speech classification algorithm.
We then compute the autocorrelation function of each voiced frame.
A peak is searched for within the 2 ms to 16 ms lag range.
The lag of this peak is considered the pitch period for that frame, and the pitch is computed as the inverse of that lag.
[Figure labels: Pitch = 0; Pitch = 0; Zero lag]
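The per-frame steps above might look like the following sketch. It assumes an unnormalized autocorrelation and an analysis window longer than the largest lag searched (a 10 ms frame cannot contain a 16 ms lag, so the sketch clips the search to the frame length). The name `estimate_pitch_hz`, the `is_voiced` flag, and the test tone are illustrative assumptions.

```python
import math

def estimate_pitch_hz(frame, sample_rate, is_voiced=True,
                      min_lag_ms=2.0, max_lag_ms=16.0):
    """Estimate the pitch (Hz) of one frame from its autocorrelation peak."""
    if not is_voiced:
        return 0.0  # unvoiced and silence frames are assigned a pitch of zero
    min_lag = max(1, int(round(sample_rate * min_lag_ms / 1000.0)))
    max_lag = min(int(round(sample_rate * max_lag_ms / 1000.0)), len(frame) - 1)
    if max_lag < min_lag:
        return 0.0  # frame too short to search the requested lag range
    best_lag, best_value = min_lag, -math.inf
    for lag in range(min_lag, max_lag + 1):
        # Autocorrelation at this lag: products of overlapping samples, summed.
        value = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if value > best_value:
            best_lag, best_value = lag, value
    # Pitch period in seconds is best_lag / sample_rate; pitch is its inverse.
    return sample_rate / best_lag

# Hypothetical test tone: 100 Hz sine, 40 ms long, sampled at 8 kHz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 100 * n / sample_rate) for n in range(320)]
pitch = estimate_pitch_hz(tone, sample_rate)
silent_pitch = estimate_pitch_hz(tone, sample_rate, is_voiced=False)
```

For the test tone the estimate should come out close to 100 Hz. The 2 ms to 16 ms lag range corresponds to pitches between 500 Hz and 62.5 Hz.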

[Figure: Autocorrelation-Based PDA illustration]

Application and Results

Speaker Recognition
[Block diagram. Blocks: Reference Speech, Feature Extraction, Model Building; Test Speech, Feature Extraction, Comparison, Recognition Decision; System Output]

Speaker Identification using PDA
[Block diagram. Blocks: Reference Speech, Pitch Detection, Average Pitch of Signal; Test Speech (three inputs), Pitch Detection and average pitch computation; Distance Computation; Speaker = Minimum distance; System Output]
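The matching step in this diagram can be sketched as follows. The sketch assumes that the average pitch is taken over voiced frames only (frames with nonzero pitch) and that the distance is the absolute difference between average pitches; neither detail is stated in the presentation. All names and pitch values are hypothetical.

```python
def average_pitch(pitch_contour):
    """Average the nonzero (voiced) entries of a pitch contour; return 0.0 if there are none."""
    voiced = [p for p in pitch_contour if p > 0]
    return sum(voiced) / len(voiced) if voiced else 0.0

def identify_speaker(reference_contour, test_contours):
    """Return the test speaker whose average pitch is closest to the reference, plus all distances."""
    reference_avg = average_pitch(reference_contour)
    distances = {
        name: abs(average_pitch(contour) - reference_avg)
        for name, contour in test_contours.items()
    }
    return min(distances, key=distances.get), distances

# Hypothetical pitch contours in Hz (0.0 marks an unvoiced or silence frame).
reference = [0.0, 200.0, 210.0, 0.0, 190.0]
tests = {
    "speaker_a": [0.0, 120.0, 118.0, 122.0],
    "speaker_b": [205.0, 0.0, 195.0, 203.0],
    "speaker_c": [260.0, 255.0, 0.0, 0.0],
}
match, distances = identify_speaker(reference, tests)
```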

Experiment
Group I: 10 Women
Group II: 10 Men
• Record each group member twice saying the same phrase
• Record each group member saying a different phrase

Categories
Case I: Female/Same Phrase
Case II: Male/Same Phrase
Case III: Female/Different Phrase
Case IV: Male/Different Phrase
Case V: Female and Male/Same Phrase
Case VI: Female and Male/Different Phrase

Procedure
Select a range of thresholds for unvoiced segments of speech: Range = [0.001:0.0005:0.01] (i.e., from 0.001 to 0.01 in steps of 0.0005).
Construct the pitch contour for each of the reference and test speech files for all thresholds.
Using the minimum distance criterion, determine the test speaker that matches the reference speaker.

Pitch Contours: Reference Speaker
[Figure: amplitude vs. time (ms); pitch vs. time (ms)]

Pitch Contours: Matched Test Speaker
[Figure: amplitude vs. time (ms); pitch vs. time (ms)]

Best Threshold
• Select the threshold that gives the maximum number of correctly matched speakers for each category
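A sketch of the threshold sweep and selection is shown below. The grid reproduces the range 0.001 to 0.01 in steps of 0.0005 given in the Procedure slide. The scoring function `toy_count_matches` is purely a stand-in: in the experiment it would run classification, pitch detection, and matching for one category and return the number of correctly matched speakers.

```python
def threshold_grid(start=0.001, step=0.0005, stop=0.01):
    """Build the list start, start + step, ..., stop, rounded to avoid float drift."""
    count = int(round((stop - start) / step)) + 1
    return [round(start + i * step, 6) for i in range(count)]

def select_best_threshold(thresholds, count_matches):
    """Return the threshold with the most matches (the first one wins on ties)."""
    return max(thresholds, key=count_matches)

def toy_count_matches(threshold):
    # Hypothetical stand-in score that peaks at 0.004; not experimental data.
    return 10 - int(round(abs(threshold - 0.004) / 0.0005))

grid = threshold_grid()
best = select_best_threshold(grid, toy_count_matches)
```

Running the selection once per category (Case I through Case VI) would give one best threshold per category, as the slide describes.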

Noise
Add different levels of noise (5 dB to 30 dB) to:
• Both reference and test speech files
• Only the reference speech file
• Only the test speech files
Examine the number of matched speakers vs. the SNR (signal-to-noise ratio) level
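One way to add noise at a chosen SNR is sketched below. The presentation does not say what kind of noise was used; the sketch assumes white Gaussian noise, scaled so that the ratio of signal power to noise power matches the requested value in dB. The function name, the fixed random seed, and the test signal are assumptions.

```python
import math
import random

def add_noise_at_snr(signal, snr_db, rng=None):
    """Return signal plus white Gaussian noise scaled to the requested SNR in dB."""
    if rng is None:
        rng = random.Random(0)  # fixed seed so the sketch is repeatable
    signal_power = sum(x * x for x in signal) / len(signal)
    noise = [rng.gauss(0.0, 1.0) for _ in signal]
    noise_power = sum(v * v for v in noise) / len(noise)
    # SNR(dB) = 10 * log10(signal_power / noise_power), solved for the noise power.
    target_noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    scale = math.sqrt(target_noise_power / noise_power)
    return [x + scale * v for x, v in zip(signal, noise)]

# Hypothetical clean signal: a sine with a 50-sample period.
clean = [math.sin(2 * math.pi * n / 50) for n in range(1000)]
noisy = add_noise_at_snr(clean, snr_db=10.0)

# Measure the SNR that was actually achieved.
residual = [y - x for x, y in zip(clean, noisy)]
clean_power = sum(x * x for x in clean) / len(clean)
residual_power = sum(v * v for v in residual) / len(residual)
measured_snr_db = 10.0 * math.log10(clean_power / residual_power)
```

Sweeping `snr_db` from 5 to 30 and applying the function to the reference files, the test files, or both would reproduce the three noise conditions listed above.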

Female/Same Phrase
[Figure panels: Noise Added to Both Files; Noise Added to Reference File; Noise Added to Test File]

Male/Same Phrase
[Figure panels: Noise Added to Both Files; Noise Added to Reference File; Noise Added to Test File]

Female/Different Phrase
[Figure panels: Noise Added to Both Files; Noise Added to Reference File; Noise Added to Test File]

Male/Different Phrase
[Figure panels: Noise Added to Both Files; Noise Added to Reference File; Noise Added to Test File]

Male and Female/Same Phrase
[Figure panels: Noise Added to Both Files; Noise Added to Reference File; Noise Added to Test File]

Male and Female/Different Phrase
[Figure panels: Noise Added to Both Files; Noise Added to Reference File; Noise Added to Test File]

Summary

Summary
Pitch detection algorithms are heavily dependent on speech segmentation accuracy.
Pitch is somewhat effective as a simple speaker identifier.

Results
3. As the signal-to-noise ratio increases, the number of correctly identified speakers increases
4. There seems to be an optimum signal-to-noise ratio that gives the maximum number of correctly matched speakers
