
PROJECT PROPOSAL

Shamalee Deshpande. Problem statement: extracting soft biometric features (age, gender, accent). Speaker database: a speaker database from the LDC Corpus Catalog, preferably using half the speaker set for training and the latter half for verification of results.


Presentation Transcript


  1. PROJECT PROPOSAL Shamalee Deshpande

  2. Problem Statement Extracting soft biometric features • Age • Gender • Accent

  3. Speaker Database • A speaker database from the LDC Corpus Catalog* • Preferably use half the speaker set for training and the latter half for verification of results • Contains varying gender, age and accent *Linguistic Data Consortium, http://www.ldc.upenn.edu/Catalog/
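The half-and-half split described on this slide could be done per speaker, so that no speaker appears in both sets. The sketch below is only an illustration: the speaker IDs and the `split_speakers` helper are hypothetical and do not come from the LDC catalog.

```python
def split_speakers(speaker_ids):
    """Split a speaker set into a training half and a verification half.

    The IDs are de-duplicated and sorted first so the split is reproducible.
    With an odd number of speakers, the extra one goes to verification.
    """
    ordered = sorted(set(speaker_ids))
    half = len(ordered) // 2
    return ordered[:half], ordered[half:]


# Hypothetical speaker IDs, for illustration only.
speakers = ["spk04", "spk01", "spk03", "spk02", "spk06", "spk05"]
train_speakers, verify_speakers = split_speakers(speakers)
print("train:", train_speakers)
print("verify:", verify_speakers)
```

A random but seeded shuffle before the split may be preferable if the IDs are ordered by gender, age or accent, since a plain "first half / latter half" split could then leave one group out of training.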

  4. Possible Computation for Gender • Pitch • Cepstrum computation: Speech → Window → DFT → LOG → IDFT → Cepstrum • In cepstral analysis, the slowly varying formant (vocal tract) envelope is separated from the excitation, so the pitch period appears as an isolated peak in the cepstrum. • LPC can also be used to find pitch. • Pitch is used to classify speech with regard to gender: average males = 100-132 Hz, average females = 142-256 Hz
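The Window → DFT → LOG → IDFT chain on this slide can be sketched with NumPy as follows. This is a minimal illustration, not the proposal's implementation: the function names, the 60-400 Hz search range, the synthetic pulse-train input and the 137 Hz decision threshold (simply the midpoint between the two average ranges quoted on the slide) are all assumptions.

```python
import numpy as np


def cepstral_pitch(frame, fs, fmin=60.0, fmax=400.0):
    """Estimate pitch (Hz) of one frame from the peak of its real cepstrum."""
    windowed = frame * np.hamming(len(frame))          # Window
    spectrum = np.fft.rfft(windowed)                   # DFT
    log_mag = np.log(np.abs(spectrum) + 1e-10)         # LOG (small floor avoids log(0))
    cepstrum = np.fft.irfft(log_mag)                   # IDFT -> real cepstrum
    # Search only quefrencies that correspond to plausible pitch periods.
    qmin = int(fs / fmax)
    qmax = int(fs / fmin)
    peak = qmin + int(np.argmax(cepstrum[qmin:qmax]))
    return fs / peak


def guess_gender(pitch_hz, threshold_hz=137.0):
    """Illustrative rule only: threshold is an assumed midpoint, not from the slides."""
    return "male" if pitch_hz < threshold_hz else "female"


# Synthetic test input (assumption): a pulse train with a 128-sample period,
# i.e. a 125 Hz "voice" at a 16 kHz sampling rate.
fs = 16000
frame = np.zeros(2048)
frame[::128] = 1.0

pitch = cepstral_pitch(frame, fs)
print(f"estimated pitch: {pitch:.1f} Hz -> {guess_gender(pitch)}")
```

On real speech the estimate would normally be computed over many voiced frames and averaged or median-filtered before any gender decision is made.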

  5. Possible Computation for Accent • People usually have characteristic styles of pronouncing phonemes, acquired from an early age and dependent on the primary language learned. • Cepstral coefficients, presumably MFCCs, may again be used to analyze the speech spectrum and identify local/non-local speakers in a database.
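As a rough illustration of the MFCC features mentioned on this slide, the sketch below computes them for a single frame with NumPy only. The filter count (26), coefficient count (13), frame length and the common 2595·log10(1 + f/700) mel formula are assumptions chosen for the example; the slides do not specify them, and the random input stands in for a real speech frame.

```python
import numpy as np


def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filterbank(n_filters, n_fft, fs):
    """Triangular filters spaced evenly on the mel scale from 0 Hz to fs/2."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(fs / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / fs).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):        # rising edge of the triangle
            fbank[i - 1, k] = (k - left) / (center - left)
        for k in range(center, right):       # falling edge of the triangle
            fbank[i - 1, k] = (right - k) / (right - center)
    return fbank


def mfcc(frame, fs, n_filters=26, n_coeffs=13):
    """MFCCs of one frame: power spectrum -> mel energies -> log -> DCT-II."""
    n_fft = len(frame)
    power = np.abs(np.fft.rfft(frame * np.hamming(n_fft))) ** 2
    energies = mel_filterbank(n_filters, n_fft, fs) @ power
    log_energies = np.log(energies + 1e-10)
    n = np.arange(n_filters)
    k = np.arange(n_coeffs)[:, None]
    dct_basis = np.cos(np.pi * k * (2 * n + 1) / (2 * n_filters))
    return dct_basis @ log_energies


# Placeholder input (assumption): seeded noise instead of a real speech frame.
fs = 16000
rng = np.random.default_rng(0)
frame = rng.standard_normal(512)
coeffs = mfcc(frame, fs)
print(coeffs.shape)
```

Per-speaker statistics of such coefficients could then be fed to whatever classifier is chosen to separate local from non-local speakers; that step is left open here, as it is in the slides.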

  6. Possible Computation for Age • BUZZER (glottal excitation): characterized by intensity and pitch • TUBE (vocal tract): characterized by formants • Vocal tract length is said to be a good indicator of the age of a speaker • Formant frequencies derived using LPC correlate with the length of the vocal tract • Children are said to have a higher formant frequency range than adults • Elderly speakers are said to have lower formant frequencies (F1, F2, F3) than their younger counterparts, most noticeably for F1
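Formant estimation with LPC, as suggested on this slide, can be sketched with the autocorrelation method: fit an all-pole model to a frame and read the formant frequencies off the angles of its complex poles. Everything specific below is an assumption for illustration: the function names, the LPC order, and the synthetic "vocal tract" with resonances placed at 700 Hz and 1800 Hz.

```python
import numpy as np


def lpc_coefficients(frame, order):
    """Autocorrelation-method LPC. Returns [1, a1, ..., ap] of A(z)."""
    n = len(frame)
    r = np.array([np.dot(frame[:n - k], frame[k:]) for k in range(order + 1)])
    # Toeplitz normal equations: R a = -r[1:]
    idx = np.abs(np.arange(order)[:, None] - np.arange(order)[None, :])
    a = np.linalg.solve(r[idx], -r[1:])
    return np.concatenate(([1.0], a))


def formants_from_lpc(a, fs):
    """Formant candidates (Hz) from the upper-half-plane roots of A(z)."""
    roots = np.roots(a)
    roots = roots[np.imag(roots) > 0]
    return np.sort(np.angle(roots) * fs / (2.0 * np.pi))


def all_pole_impulse_response(a, n):
    """Impulse response of 1/A(z), used here only to make a test signal."""
    y = np.zeros(n)
    for i in range(n):
        acc = 1.0 if i == 0 else 0.0
        for k in range(1, len(a)):
            if i - k >= 0:
                acc -= a[k] * y[i - k]
        y[i] = acc
    return y


# Synthetic "tube" (assumption): two resonances with 100 Hz bandwidth each.
fs = 8000
radius = np.exp(-np.pi * 100.0 / fs)
sections = [
    [1.0, -2.0 * radius * np.cos(2.0 * np.pi * f / fs), radius ** 2]
    for f in (700.0, 1800.0)
]
true_a = np.convolve(sections[0], sections[1])
frame = all_pole_impulse_response(true_a, 512)

estimated_a = lpc_coefficients(frame, order=4)
formants = formants_from_lpc(estimated_a, fs)
print(np.round(formants, 1))
```

For real speech the frame would typically be pre-emphasized and windowed first, a higher order would be used, and poles with very wide bandwidths would be discarded before F1, F2 and F3 are compared across age groups.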
