Mel-spectrum to Mel-cepstrum Computation. A Speech Recognition presentation, October 1, 2003

Mel-spectrum to Mel-cepstrum Computation
A Speech Recognition presentation
October 1, 2003
Ji Gu
J.Gu@umail.LeidenUniv.nl

Mel-spectrum to Mel-cepstrum Computation
So far we have seen:
• The FFT processing step converts each frame of N samples from the time domain into the frequency domain.
• The result of the Mel-spectrum computation is the Mel-spectrum: one real value per mel filter for each frame.

Mel-spectrum to Mel-cepstrum Computation
To compute the Mel-cepstrum:
• We convert the log Mel-spectrum back to the time domain using the Discrete Cosine Transform (DCT). The DCT suffices because the Mel-spectrum coefficients, and hence their logarithms, are real numbers.
• The result is called the Mel Frequency Cepstrum Coefficients (MFCC).

Mel-spectrum to Mel-cepstrum Computation
Therefore: a DCT is applied to the natural logarithm of the Mel-spectrum to obtain the Mel-cepstrum c[n], for n = 0, ..., C-1, where C is the number of cepstral coefficients.
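The transform can be written out from the cosine table built in the SPHINX code shown later in this presentation. This is a sketch under assumed notation: L for the number of mel filters and \tilde{S}[i] for the i-th Mel-spectrum value are symbols chosen here for illustration. The code additionally halves the i = 0 term and divides the sum by L; that weighting is left out of the formula below.

\[
c[n] = \sum_{i=0}^{L-1} \ln\bigl(\tilde{S}[i]\bigr)\,
       \cos\!\left(\frac{\pi n}{2L}\,(2i+1)\right),
\qquad n = 0, 1, \ldots, C-1
\]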

Mel-spectrum to Mel-cepstrum Computation
In the SPHINX III Signal Processing Front End Specification:
• First, the cosine section of c[n] is computed:

    int32 fe_compute_melcosine(melfb_t *MEL_FB)
    {
        float period, freq;
        int32 i, j;

        period = (float)2 * MEL_FB->num_filters;

        if ((MEL_FB->mel_cosine = (float **) fe_create_2d(MEL_FB->num_cepstra,
                                                          MEL_FB->num_filters,
                                                          sizeof(float))) == NULL) {
            fprintf(stderr, "memory alloc failed in fe_compute_melcosine()\n...exiting\n");
            exit(0);
        }

Mel-spectrum to Mel-cepstrum Computation

        for (i = 0; i < MEL_FB->num_cepstra; i++) {
            freq = 2 * (float)M_PI * (float)i / period;
            for (j = 0; j < MEL_FB->num_filters; j++)
                MEL_FB->mel_cosine[i][j] = (float)cos((double)(freq * (j + 0.5)));
        }

        return(0);
    }

• Second, a cosine transform of the logarithm of the Mel-spectrum:

Mel-spectrum to Mel-cepstrum Computation

    void fe_mel_cep(fe_t *FE, double *mfspec, double *mfcep)
    {
        int32 i, j;
        /* static int first_run=1; */ /* unreferenced variable */
        int32 period;
        float beta;

        period = FE->MEL_FB->num_filters;

        for (i = 0; i < FE->MEL_FB->num_filters; ++i) {
            if (mfspec[i] > 0)
                mfspec[i] = log(mfspec[i]);
            else
                mfspec[i] = -1.0e+5;
        }

Mel-spectrum to Mel-cepstrum Computation

        for (i = 0; i < FE->NUM_CEPSTRA; ++i) {
            mfcep[i] = 0;
            for (j = 0; j < FE->MEL_FB->num_filters; j++) {
                if (j == 0)
                    beta = 0.5;
                else
                    beta = 1.0;
                mfcep[i] += beta * mfspec[j] * FE->MEL_FB->mel_cosine[i][j];
            }
            mfcep[i] /= (float)period;
        }

        return;
    }

Mel-spectrum to Mel-cepstrum Computation
By applying the procedure described above:
• For each speech frame, a set of mel-frequency cepstrum coefficients (MFCC) is computed.
• This set of coefficients is called an acoustic vector. It represents the phonetically important characteristics of speech and serves as the input to further analysis and processing in speech recognition.
End
