
Mel-spectrum to Mel-cepstrum Computation: A Speech Recognition Presentation, October 1, 2003

Ji Gu, J.Gu@umail.LeidenUniv.nl


Presentation Transcript


  1. Mel-spectrum to Mel-cepstrum Computation: A Speech Recognition Presentation, October 1, 2003. Ji Gu, J.Gu@umail.LeidenUniv.nl

  2. What we know so far:
     • The FFT processing step converts each frame of N samples from the time domain into the frequency domain.
     • The Mel-spectrum computation then yields, for each frame, a set of real-valued filter outputs, one per mel filter.

  3. To compute the Mel-cepstrum:
     • We convert the log Mel-spectrum back to the time domain using the Discrete Cosine Transform (DCT). A cosine transform is sufficient because the Mel-spectrum coefficients, and therefore their logarithms, are real numbers.
     • The result is called the Mel-Frequency Cepstral Coefficients (MFCC).

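  4. Therefore, a DCT is applied to the natural logarithm of the Mel-spectrum to obtain the Mel-cepstrum c[n]. In a form whose cosine term matches the code on the following slides, writing $\tilde{S}[i]$ for the Mel-spectrum and $L$ for the number of mel filters (both symbols are assumed here):

     $$c[n] = \sum_{i=0}^{L-1} \ln\big(\tilde{S}[i]\big)\,\cos\!\left(\frac{\pi n}{2L}(2i+1)\right), \qquad n = 0, 1, \dots, C-1$$

     where C is the number of cepstral coefficients.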

  5. In the SPHINX III Signal Processing Front End Specification:
     • First, the cosine section of c[n] is computed:

         int32 fe_compute_melcosine(melfb_t *MEL_FB)
         {
             float period, freq;
             int32 i, j;

             /* period = 2L, where L is the number of mel filters */
             period = (float)2 * MEL_FB->num_filters;

             /* allocate a num_cepstra x num_filters table of floats */
             if ((MEL_FB->mel_cosine =
                      (float **) fe_create_2d(MEL_FB->num_cepstra,
                                              MEL_FB->num_filters,
                                              sizeof(float))) == NULL) {
                 fprintf(stderr, "memory alloc failed in fe_compute_melcosine()\n...exiting\n");
                 exit(0);  /* note: exits with status 0 even though allocation failed */
             }

  6. Continuation of fe_compute_melcosine():

             for (i = 0; i < MEL_FB->num_cepstra; i++) {
                 /* freq = 2*pi*i / (2L) = pi*i / L */
                 freq = 2 * (float)M_PI * (float)i / period;
                 /* freq*(j+0.5) = pi*i*(2j+1) / (2L), the DCT cosine argument */
                 for (j = 0; j < MEL_FB->num_filters; j++)
                     MEL_FB->mel_cosine[i][j] = (float)cos((double)(freq * (j + 0.5)));
             }
             return(0);
         }

     • Second, a cosine transform of the logarithm of the Mel-spectrum is taken:

  7. The function fe_mel_cep():

         void fe_mel_cep(fe_t *FE, double *mfspec, double *mfcep)
         {
             int32 i, j;
             /* static int first_run=1; */ /* unreferenced variable */
             int32 period;
             float beta;

             period = FE->MEL_FB->num_filters;

             /* replace each Mel-spectrum value by its natural log, in place;
                non-positive values are set to a large negative floor */
             for (i = 0; i < FE->MEL_FB->num_filters; ++i) {
                 if (mfspec[i] > 0)
                     mfspec[i] = log(mfspec[i]);
                 else
                     mfspec[i] = -1.0e+5;
             }

  8. Continuation of fe_mel_cep():

             for (i = 0; i < FE->NUM_CEPSTRA; ++i) {
                 mfcep[i] = 0;
                 for (j = 0; j < FE->MEL_FB->num_filters; j++) {
                     /* the first filter's term is weighted by 0.5 */
                     if (j == 0)
                         beta = 0.5;
                     else
                         beta = 1.0;
                     mfcep[i] += beta * mfspec[j] * FE->MEL_FB->mel_cosine[i][j];
                 }
                 /* scale by 1/L, where L is the number of mel filters */
                 mfcep[i] /= (float)period;
             }
             return;
         }

  9. By applying the procedure described above:
     • For each speech frame, a set of mel-frequency cepstral coefficients (MFCC) is computed.
     • This set of coefficients is called an acoustic vector. It represents the phonetically important characteristics of speech and is used for further analysis and processing in speech recognition.
