
Topic: Tuning the Labels

Kishore Prahallad (skishore@cs.cmu.edu), Carnegie Mellon University & International Institute of Information Technology Hyderabad. Objective of this lecture: to tune the labels (phone boundaries) to get better-quality output.


Presentation Transcript


  1. Topic: Tuning the Labels
     Kishore Prahallad (Email: skishore@cs.cmu.edu)
     Carnegie Mellon University & International Institute of Information Technology Hyderabad
     Speech Technology - Kishore Prahallad (skishore@cs.cmu.edu)

  2. Objective of this Lecture
     • To tune the labels (phone boundaries) to get better-quality output

  3. Better Labels
     • Research:
       • Automatic segmentation models such as HMMs or neural networks could be tuned to obtain better labels.
     • Practical:
       • Use an existing state-of-the-art speech segmentation algorithm.
       • Manually verify and correct the misaligned labels.
       • For small databases, manual correction is more apt.
       • Emulabel is the tool best suited for this purpose.
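To make the manual verification pass more focused, one option is to pre-select segments whose durations look implausible and check those first in Emulabel. The following is a minimal sketch, not part of the lecture: the function name `flag_suspicious`, the duration thresholds, and the use of `pau` as the silence label are all assumptions chosen for illustration.

```python
from typing import List, Tuple

# A segment is (start_time_s, end_time_s, phone_label).
Segment = Tuple[float, float, str]


def flag_suspicious(
    segments: List[Segment],
    min_dur: float = 0.02,   # hypothetical lower bound, in seconds
    max_dur: float = 0.30,   # hypothetical upper bound, in seconds
    silence: str = "pau",    # assumed label for pauses, which may be long
) -> List[Segment]:
    """Return non-silence segments whose duration falls outside the bounds."""
    flagged = []
    for start, end, phone in segments:
        duration = end - start
        if phone != silence and (duration < min_dur or duration > max_dur):
            flagged.append((start, end, phone))
    return flagged


if __name__ == "__main__":
    example = [
        (0.00, 0.25, "pau"),
        (0.25, 0.26, "k"),    # 10 ms: probably a misplaced boundary
        (0.26, 0.40, "ae"),
        (0.40, 0.90, "t"),    # 500 ms: probably a misplaced boundary
    ]
    for seg in flag_suspicious(example):
        print(seg)
```

The thresholds would need adjusting per speaker and language; the list only suggests where to look first and does not replace listening to the segments.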

  4. Install Emulabel
     Step 1: Untar the package (see the course web site for the Emulabel package)
         $ tar xvfz EMULABEL.tar.gz
         $ cd EMULABEL
         $ ls      (type ls to see the contents)
     Step 2: Untar emu-linux
         $ tar xvfz emu-linux-1.4.2.tar.gz
         $ cd emulabel
     Step 3: Log in as root to install Emulabel
         $ su ....
         # ./doinstall.sh      (install Emulabel)
     Step 4: Go back to the EMULABEL directory
         # cd ..
     Step 5: Install TCL/TK versions....
         # rpm -i tcl-8.0.5-35.i386.rpm --force
         # rpm -i tk-8.0.5-35.i386.rpm --force
         # rpm -i tclx-8.0.5-35.i386.rpm --force
     Step 6: Check whether it runs
         # exit      (come out of the root login)
     Step 7: Go to the voice directory
         $ emulabel etc/emu_lab      (this command should invoke the Emulabel GUI)

  5. Emulabel invoked…

  6. Step 1: Press return.

  7. Step 2: The list of wave files appears here. Select one to label.

  8. Step 3: The wave file and the red labels appear.

  9. Step 3: Move these red markers to move the boundaries.

  10. Step 3: Listen to the red-marked region by right-clicking the mouse.

  11. Manual Labels
     • Save the manually corrected labels.
     • Labels are stored in the lab/ directory.
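Once the corrected labels are saved, it can be handy to read them back for a quick sanity check. The sketch below assumes the label files use the xlabel-style layout commonly seen in Festvox voices: a header terminated by a line containing only `#`, followed by lines of the form `end_time color label`. That layout is an assumption, not something stated in the slides, and `parse_lab` is a hypothetical helper name.

```python
from typing import List, Tuple


def parse_lab(text: str) -> List[Tuple[float, float, str]]:
    """Parse label text into (start, end, label) tuples.

    Assumes a header ending in a lone '#' line, then lines of
    'end_time color label'. Each segment starts where the previous ended.
    """
    segments = []
    in_body = False
    prev_end = 0.0
    for raw in text.splitlines():
        line = raw.strip()
        if not in_body:
            # Skip header lines until the '#' separator is seen.
            if line == "#":
                in_body = True
            continue
        fields = line.split()
        if len(fields) < 3:
            continue  # ignore blank or malformed lines
        end_time = float(fields[0])
        label = fields[2]
        segments.append((prev_end, end_time, label))
        prev_end = end_time
    return segments


if __name__ == "__main__":
    sample = "separator ;\nnfields 1\n#\n0.2500 125 pau\n0.3100 125 k\n0.4500 125 ae\n"
    for start, end, label in parse_lab(sample):
        print(f"{label}\t{start:.4f}\t{end:.4f}")
```

In practice the text would come from a file under lab/, for example via `open(path).read()`, where `path` is whichever label file was just corrected.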

  12. Evaluation
     • Compare the voice samples synthesized before labeling vs. after labeling.
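Alongside the listening comparison, it may help to record how far each boundary was actually moved, so that audible differences can be traced back to specific phones. This is a minimal sketch under stated assumptions: both label sequences are given as lists of `(end_time, phone)` pairs with identical phone sequences, and `boundary_shifts` and the 10 ms reporting threshold are illustrative choices, not part of the lecture.

```python
from typing import List, Tuple

Boundary = Tuple[float, str]  # (end_time_s, phone_label)


def boundary_shifts(
    before: List[Boundary], after: List[Boundary]
) -> List[Tuple[str, float]]:
    """Return (phone, shift_in_seconds) for each boundary, after minus before."""
    if [p for _, p in before] != [p for _, p in after]:
        raise ValueError("phone sequences differ; cannot align boundaries")
    return [
        (phone, round(a_end - b_end, 4))
        for (b_end, phone), (a_end, _) in zip(before, after)
    ]


if __name__ == "__main__":
    automatic = [(0.25, "pau"), (0.31, "k"), (0.45, "ae")]
    corrected = [(0.25, "pau"), (0.33, "k"), (0.45, "ae")]
    shifts = boundary_shifts(automatic, corrected)
    # Report boundaries moved by at least 10 ms (hypothetical threshold).
    moved = [phone for phone, shift in shifts if abs(shift) >= 0.01]
    print(shifts)
    print("moved:", moved)
```

The numbers only describe the edits made; judging whether the corrected labels give better output still depends on listening to the synthesized samples.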

  13. Additional Reading for the Lecture
     • http://festvox.org
     • 11-752 CMU Course Lecture Notes
     • http://festvox.org/festtut/notes/festtut_toc.html
     • http://festvox.org/bsv/bsv-pitchmarks-sect.html
