
Incorporating Dynamic Time Warping (DTW) in the SeqRec.m File



  1. Incorporating Dynamic Time Warping (DTW) in the SeqRec.m File Presented by: Clay McCreary, MSEE

  2. Agenda • Project Scope • DTW Basics • Algorithm Implementation • Observations • Conclusion • Continuing Research

  3. Project Scope • Modify SeqRec.m to incorporate the DTW algorithm • Test the implemented DTW algorithm using a small, manually created dictionary containing words of varying lengths • Determine whether adding the DTW algorithm improves the recognition capability of SeqRec.m

  4. DTW Basics • DTW provides a method of comparing two vectors of different lengths • If the vectors are of the same length, the correlation function provides adequate comparison • DTW compares the measured vector to a template vector and provides a “likeness score” • This process is repeated for multiple templates • The template with the lowest “likeness score” is the template that most closely matches the measured vector

  5. DTW Basics (cont.) • This method is especially useful when the measured vector must be one of the templates • For example, uttered speech compared to the words in a language dictionary • DTW will compensate for “slurred” words/sounds

  6. DTW Basics (cont.) • The comparison is accomplished by organizing the measured and the template data into a matrix • The cells of the matrix are filled with the likelihood of column values matching the row value • Each cell contains the “Local” distance (maximum likelihood)

  7. DTW Basics (cont.) • The global distance is the sum of all of the local distances encountered on a “path” from the lower left cell to the upper right cell • The path cannot move down or to the left • Multiple paths are available • The DTW algorithm searches for the path with the lowest global distance • This global distance is the “Likeness Score”
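The path search on this slide can be written as a recurrence. The symbols are notation introduced here, not taken from the slides: d(i,j) is the local distance in cell (i,j), D(i,j) is the lowest global distance of any path reaching that cell, and the allowed moves are assumed to be right, up, and the up-right diagonal (consistent with "cannot move down or to the left").

```latex
D(1,1) = d(1,1), \qquad
D(i,j) = d(i,j) + \min\bigl\{\, D(i-1,j),\; D(i,j-1),\; D(i-1,j-1) \,\bigr\}
```

Terms whose index would fall below 1 are left out of the minimum. For an I-by-J matrix, the likeness score before normalization is D(I,J).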

  8. DTW Basics (cont.) • The example DTW path has a global score of 15 • This is the lowest score of any possible path • This path is accomplished in 6 steps

  9. DTW Basics (cont.) • The global distance must be normalized to allow comparison of the measured vector to templates of various lengths • This is accomplished by dividing the global score by the number of steps used on the path • The previous example would have a normalized likeness score of 2.5
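The normalization above is a single division. The Python sketch below reproduces the slide's 15 / 6 example and adds one hypothetical longer template (the values 18 and 9 are invented for illustration) to suggest why raw global scores are not comparable across template lengths. The function name is an assumption, not part of SeqRec.m.

```python
def normalized_score(global_distance, steps):
    """Divide a path's global distance by the number of steps on that path."""
    if steps <= 0:
        raise ValueError("a path needs at least one step")
    return global_distance / steps


# Example from the slides: a global score of 15 reached in 6 steps.
example = normalized_score(15, 6)

# Hypothetical longer template: larger raw score, spread over more steps.
longer = normalized_score(18, 9)

print(example, longer)
```

In this made-up comparison the longer template has the higher raw score but the lower normalized score, so the two rankings disagree.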

  10. Algorithm Implementation • Current operation • SeqRec.m compares each sound passed to the script (a vector of sounds assigned at the input) to each sound in a list of 14 (vowels) • The likelihood of each of the input sounds matching each of the sounds in the list is determined • From these likelihoods, a list of recognized sounds is generated

  11. Algorithm Implementation (cont.) • Each recognized sound is compared to the corresponding input sound • If any do not match, the recognized sound sequence is declared in error
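The pre-DTW behaviour described on the last two slides might look roughly like the Python sketch below. It is not the SeqRec.m code: the function names, the likelihood values, and the use of 4 listed sounds instead of 14 are all assumptions made to keep the example short, and indices are 0-based here whereas MATLAB is 1-based.

```python
def recognize_sequence(likelihoods):
    """For each input sound, pick the index of the most likely listed sound."""
    recognized = []
    for row in likelihoods:
        best = max(range(len(row)), key=lambda k: row[k])
        recognized.append(best)
    return recognized


def sequence_in_error(recognized, intended):
    """Declare the whole sequence in error if any position differs."""
    if len(recognized) != len(intended):
        return True
    return any(r != t for r, t in zip(recognized, intended))


# Hypothetical likelihoods: 3 input sounds scored against 4 listed sounds.
likelihoods = [
    [0.70, 0.10, 0.15, 0.05],
    [0.20, 0.25, 0.45, 0.10],  # intended sound 1, but sound 2 scores highest
    [0.05, 0.10, 0.80, 0.05],
]
intended = [0, 1, 2]

recognized = recognize_sequence(likelihoods)
in_error = sequence_in_error(recognized, intended)
print(recognized, in_error)
```

A single misrecognized sound is enough to reject the whole sequence, which is the strictness the DTW comparison is meant to relax.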

  12. Algorithm Implementation (cont.) • DTW implementation • The calculated likelihoods of each input sound being each of the 14 sounds from the list are placed in a matrix, probmatrix • A matrix of the local distances between the template and the measured vector, dtwmatrix, is created using values stored in probmatrix • This matrix is upside down compared to the example when visualized
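One way to picture how dtwmatrix could be derived from probmatrix is sketched below in Python. The slides do not say how a likelihood is converted to a local distance, so the rule used here (distance = 1 - likelihood) is purely an assumption, as are the helper name, the 3-sound list, and the values. Indices are 0-based, unlike MATLAB.

```python
def build_local_distances(probmatrix, template):
    """Rows follow the template sounds, columns follow the input sounds.

    Assumption: local distance = 1 - likelihood, so a likely match is cheap.
    """
    n_inputs = len(probmatrix)
    return [
        [1.0 - probmatrix[j][sound] for j in range(n_inputs)]
        for sound in template
    ]


# probmatrix[j][k]: likelihood that input sound j is listed sound k (made up).
probmatrix = [
    [0.8, 0.1, 0.1],
    [0.2, 0.7, 0.1],
    [0.1, 0.3, 0.6],
    [0.1, 0.2, 0.7],
]
template = [0, 1, 2]  # a 3-sound template compared against 4 input sounds

dtwmatrix = build_local_distances(probmatrix, template)
for row in dtwmatrix:
    print([round(value, 2) for value in row])
```

Because the template and the input differ in length, the result is a rectangular 3-by-4 matrix, which is the case DTW is designed for.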

  13. Algorithm Implementation (cont.) • costmatrix1 is created to determine the lowest global distance • A cell is filled by adding the local distance for that cell, obtained from dtwmatrix, to the lowest value among the previous adjacent cells (left, down, or the left-down diagonal), which were filled in the same manner • This results in the value in the cell being the shortest global distance to that cell • Starting from index (1,1), the top row and left column are filled • This avoids a 0 index, which is illegal in MATLAB • Then, the remaining cells are filled
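The fill order on this slide can be sketched in Python as follows. This is an illustration under assumptions, not the SeqRec.m code: the function name and the 3-by-3 local distances are made up, the matrix is stored with the start cell at the top left, and indices are 0-based, so the MATLAB cell (1,1) is cell [0][0] here.

```python
def fill_cost_matrix(local):
    """Accumulate the lowest global distance to every cell."""
    rows, cols = len(local), len(local[0])
    cost = [[0] * cols for _ in range(rows)]

    # Start cell, then the top row and left column, which have one predecessor.
    cost[0][0] = local[0][0]
    for j in range(1, cols):
        cost[0][j] = cost[0][j - 1] + local[0][j]
    for i in range(1, rows):
        cost[i][0] = cost[i - 1][0] + local[i][0]

    # Remaining cells: local distance plus the cheapest adjacent predecessor.
    for i in range(1, rows):
        for j in range(1, cols):
            cost[i][j] = local[i][j] + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost


# Hypothetical local distances.
local = [
    [1, 3, 4],
    [2, 1, 3],
    [5, 2, 1],
]
cost = fill_cost_matrix(local)
for row in cost:
    print(row)
```

Filling the first row and column separately keeps the inner loop free of boundary checks, which mirrors the order given on the slide.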

  14. Algorithm Implementation (cont.) • This results in the shortest global distance being recorded in the bottom-right cell • Then, the path is determined by moving repeatedly to the lowest previous adjacent cell until reaching the (1,1) index, counting the number of steps along the way • The shortest global distance is then divided by the number of steps to normalize
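A Python sketch of the backtracking and normalization described here follows. Two points are assumptions rather than facts from the slides: a "step" is counted as one cell on the path (including both end cells), and ties between predecessors are broken in favour of the diagonal. The cost matrix is a small made-up example whose start cell is at [0][0].

```python
def count_path_steps(cost):
    """Walk back from the last cell to [0][0] via the cheapest predecessor."""
    i, j = len(cost) - 1, len(cost[0]) - 1
    steps = 1  # assumption: the final cell counts as a step
    while (i, j) != (0, 0):
        candidates = []
        if i > 0 and j > 0:
            candidates.append((cost[i - 1][j - 1], i - 1, j - 1))
        if i > 0:
            candidates.append((cost[i - 1][j], i - 1, j))
        if j > 0:
            candidates.append((cost[i][j - 1], i, j - 1))
        # min() keeps the first candidate on ties, so the diagonal wins.
        _, i, j = min(candidates, key=lambda c: c[0])
        steps += 1
    return steps


# Hypothetical accumulated cost matrix.
cost = [
    [1, 4, 8],
    [3, 2, 5],
    [8, 4, 3],
]
steps = count_path_steps(cost)
normalized = cost[-1][-1] / steps
print(steps, normalized)
```

Here the walk goes straight down the diagonal, so the path has 3 cells and the global distance of 3 normalizes to 1.0.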

  15. Algorithm Implementation (cont.) • This process is repeated for all templates of each word length in the dictionary, and the global distances are stored in 2 vectors • dtwvector contains all the global distances for each length • The minimum determines the best global distance for that length • dtwwordlength contains the best global distances of each length • The minimum determines the best global distance
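The two-level minimum described on this slide can be sketched in Python as below. The names dtwvector and dtwwordlength come from the slides, but their layout here (dictionaries rather than MATLAB vectors), the function name, and all scores are assumptions for illustration only.

```python
def best_match(scores_by_length):
    """scores_by_length maps word length -> {template: normalized distance}."""
    dtwwordlength = {}
    for length, dtwvector in scores_by_length.items():
        # Best template among all templates of this length.
        best_template = min(dtwvector, key=dtwvector.get)
        dtwwordlength[length] = (dtwvector[best_template], best_template)

    # Best template across all lengths.
    best_length = min(dtwwordlength, key=lambda n: dtwwordlength[n][0])
    distance, template = dtwwordlength[best_length]
    return template, distance, dtwwordlength


# Hypothetical normalized distances for a few templates of each length.
scores_by_length = {
    3: {(1, 2, 3): 0.42, (1, 3, 2): 0.65, (2, 1, 3): 0.71},
    4: {(1, 2, 3, 4): 0.18, (1, 2, 4, 3): 0.37},
    5: {(1, 2, 3, 4, 5): 0.29, (2, 1, 3, 4, 5): 0.55},
}
template, distance, per_length = best_match(scores_by_length)
print(template, distance)
```

The comparison across lengths is only meaningful if the stored distances are normalized, as the earlier slides point out.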

  16. Algorithm Implementation (cont.) • The dictionary created to test this algorithm consisted of all permutations of: • 1 2 3 • 1 2 3 4 • 1 2 3 4 5 • This limited dictionary restricted testing the full capability of the DTW algorithm
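A test dictionary of this shape can be generated rather than typed by hand. The Python sketch below uses the standard-library itertools.permutations; the function name and the dictionary-by-length layout are assumptions, not the structure used in SeqRec.m.

```python
from itertools import permutations


def build_dictionary(word_lengths=(3, 4, 5)):
    """Map each word length n to all orderings of the sounds 1..n."""
    dictionary = {}
    for n in word_lengths:
        dictionary[n] = list(permutations(range(1, n + 1)))
    return dictionary


dictionary = build_dictionary()
total = sum(len(words) for words in dictionary.values())
print({n: len(words) for n, words in dictionary.items()}, total)
```

This gives 3! + 4! + 5! = 150 templates, none of which repeats a sound within a word.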

  17. Observations • The previous SeqRec function typically had a high error rate (>50%) • DTW used these erroneous words for comparison to the templates • Even with the DTW algorithm using these erroneous words, the error rate typically improved to <10%

  18. Observations (cont.) • Before normalization, erroneous DTW words were usually the same length as the correct word • After normalization, erroneous DTW words were of various lengths

  19. Conclusions • By comparing the sequence of input sounds to templates of the whole sequence rather than to each part of the sequence, the DTW algorithm improves recognition by ~5X • DTW allows some error, whereas the previous SeqRec function required that the recognized word be perfect

  20. Continuing Research • The basic DTW algorithm was implemented in the SeqRec function, but the limited dictionary using words without repeated sounds only allowed the function of the DTW algorithm to be tested, not the full capability

  21. Continuing Research (cont.) • Thus, the algorithm should be tested using larger dictionaries containing words in which sounds are used multiple times in the same word • The code was written in a general format to allow the easy incorporation of new dictionaries
