Retrieval Methods for QBSH (Query By Singing/Humming)


  1. Retrieval Methods for QBSH (Query By Singing/Humming) J.-S. Roger Jang (張智星) jang@mirlab.org http://mirlab.org/jang Multimedia Information Retrieval Lab CSIE Dept, National Taiwan University

  2. Retrieval Methods for QBSH
     • Goal
       • Find the most similar melody in the database
     • Challenges
       • Robust pitch tracking for various acoustic inputs
         • Input from mobile devices
         • Input at a noisy karaoke box
       • Comparison methods should be able to deal with…
         • Key variations in users’ input (for instance, due to gender difference)
         • Tempo variations in users’ input
       • Reasonable response time, e.g., 5 seconds

  3. Evaluation of QBSH Methods
     • Two categories for evaluating QBSH methods
       • Efficiency: How fast is the system?
         • Can it deal with a music database of size 100K?
       • Effectiveness: How accurate is the system?
         • Top-10 recognition rate for n queries:
           • (1+0+0+1+1…)/n
         • Top-10 mean reciprocal rank for n queries:
           • (1/3+1/inf+1/4+1/2+1/5…)/n
         • True-positive and true-negative rates, to handle the out-of-vocabulary (OOV) problem
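The two effectiveness measures on this slide can be sketched in a few lines. This is a minimal illustration, not the evaluation code used by the author: the function names and the `ranks` list are hypothetical, and a query whose correct song falls outside the top 10 (the slide's 1/inf case) is represented here as `None`.

```python
def top_k_rate(ranks, k=10):
    """Fraction of queries whose correct song is ranked within the top k."""
    hits = [1 if r is not None and r <= k else 0 for r in ranks]
    return sum(hits) / len(ranks)


def top_k_mrr(ranks, k=10):
    """Mean reciprocal rank; a miss (None or rank > k) contributes 0."""
    rr = [1.0 / r if r is not None and r <= k else 0.0 for r in ranks]
    return sum(rr) / len(ranks)


# Hypothetical ranks of the correct song for five queries (None = not found).
ranks = [3, None, 4, 2, 5]
print(top_k_rate(ranks))  # four of the five queries are hits
print(top_k_mrr(ranks))   # (1/3 + 0 + 1/4 + 1/2 + 1/5) / 5
```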

  4. Types of QBSH Approaches
     • Categories of approaches to QBSH
       • Histogram/statistics-based
       • Note vs. note
         • Edit distance
       • Frame vs. note
         • HMM
       • Frame vs. frame
         • Linear scaling, DTW, recursive alignment

  5. Linear Scaling (LS)
     • Concept
       • Scale the query linearly to match the candidates
     • Assumption
       • Uniform tempo variation
     • Rest handling
       • Cut leading and trailing zeros (silence)
       • All the other zeros (rests) are replaced with the previous non-zero pitch
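The rest-handling rule above can be sketched as follows. This is an illustrative sketch under the assumption that the pitch vector is a list of frame-level pitch values in which 0 marks an unvoiced frame; the function name `handle_rests` is hypothetical.

```python
def handle_rests(pitch):
    """Trim leading/trailing zeros, then fill inner zeros with the previous pitch."""
    voiced = [i for i, p in enumerate(pitch) if p != 0]
    if not voiced:
        return []  # nothing but silence
    # Cut leading and trailing silence.
    trimmed = pitch[voiced[0]:voiced[-1] + 1]
    out = []
    for p in trimmed:
        # trimmed[0] is non-zero, so out[-1] always exists when p == 0.
        out.append(p if p != 0 else out[-1])
    return out


print(handle_rests([0, 0, 60, 60, 0, 62, 0, 0, 64, 0]))
```

Extending the previous pitch over a rest keeps the vector free of zeros, so a frame-based distance is not dominated by where the singer happened to pause.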

  6. Linear Scaling
     • Scale the query pitch linearly to match the candidates
     [Figure: the original input pitch, compressed by 0.5 and 0.75 and stretched by 1.25 and 1.5, compared against the target pitch in the database; the best match is marked.]
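A minimal sketch of the search this slide depicts: resample the query at several scaling factors and keep the factor whose result is closest to the candidate. Several details are assumptions made for illustration and are not taken from the slides: the query is aligned with the beginning of the candidate, the distance is the mean absolute difference, key transposition is ignored, and the names `resample` and `linear_scaling` are hypothetical.

```python
def resample(x, n):
    """Linearly interpolate x (length >= 2) to n >= 2 equally spaced points."""
    m = len(x)
    out = []
    for k in range(n):
        pos = k * (m - 1) / (n - 1)  # fractional 0-based index into x
        i = min(int(pos), m - 2)
        frac = pos - i
        out.append(x[i] * (1 - frac) + x[i + 1] * frac)
    return out


def linear_scaling(query, target, factors=(0.5, 0.75, 1.0, 1.25, 1.5)):
    """Return (distance, factor) of the best scaling, or None if none fits."""
    best = None
    for f in factors:
        n = round(f * len(query))
        if n < 2 or n > len(target):
            continue  # scaled query would be degenerate or longer than the target
        scaled = resample(query, n)
        # Compare against the first n frames of the target.
        dist = sum(abs(a - b) for a, b in zip(scaled, target)) / n
        if best is None or dist < best[0]:
            best = (dist, f)
    return best


query = [60, 62, 64, 62, 60, 59, 57, 59]
target = resample(query, 12) + [55.0, 55.0]  # same tune sung 1.5x slower
print(linear_scaling(query, target))
```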

  7. Strengths and Weaknesses of LS
     • Strengths
       • Handles key transposition in one shot
       • Efficient and effective
       • Indexing methods available
     • Weakness
       • Cannot deal with non-uniform tempo variations
     • Typical mapping path

  8. Shorten or Lengthen a Pitch Vector
     • Given a pitch vector x of length m, how to shorten or lengthen it to length n?
       • x2=interp1(1:m, x, linspace(1, m, n));
     • Examples
       • m=7, n=13
       • m=7, n=9
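For readers without MATLAB, the `interp1` call above can be mirrored in plain Python. This is a sketch that assumes linear interpolation (the `interp1` default) and requires both lengths to be at least 2; the function name `resample` and the example pitch values are hypothetical.

```python
def resample(x, n):
    """Linearly interpolate x (length m >= 2) to n >= 2 equally spaced points."""
    m = len(x)
    if m < 2 or n < 2:
        raise ValueError("both lengths must be at least 2")
    out = []
    for k in range(n):
        pos = k * (m - 1) / (n - 1)  # 0-based counterpart of linspace(1, m, n)
        i = min(int(pos), m - 2)     # left neighbour, clamped at the last segment
        frac = pos - i
        out.append(x[i] * (1 - frac) + x[i + 1] * frac)
    return out


x = [60, 62, 64, 65, 64, 62, 60]  # m = 7
print(len(resample(x, 13)))       # lengthened, as in the first slide example
print(len(resample(x, 9)))        # lengthened, as in the second slide example
```

The first and last samples are preserved, so only the tempo of the contour changes, not its endpoints.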

  9. Distance Function for LS
     • Commonly used distance function for LS
       • Normalized Lp-norm
     • Characteristics
       • Usually p=1 or 2 for LS
       • Normalization to get rid of length variations
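The slide's formula did not survive in this transcript, so the following sketch uses one common form of a length-normalized Lp-norm, the p-th root of the mean of |x_i - y_i|^p. Treat the exact normalization as an assumption; the function name `normalized_lp` is hypothetical.

```python
def normalized_lp(x, y, p=1):
    """Length-normalized Lp distance between two equal-length pitch vectors."""
    if len(x) != len(y) or not x:
        raise ValueError("vectors must be non-empty and of equal length")
    n = len(x)
    # Dividing by n before taking the root removes the dependence on length.
    return (sum(abs(a - b) ** p for a, b in zip(x, y)) / n) ** (1.0 / p)


# The same per-frame error gives the same distance regardless of length.
print(normalized_lp([0] * 4, [2] * 4, p=2))
print(normalized_lp([0] * 8, [2] * 8, p=2))
```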

  10. Key Transposition in LS
      • How to find the best transposed query that has the smallest distance from the database items:
        • Best transposition
        • In practice…
      [Figure: query, database item, and transposed query]
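The formulas for this slide are missing from the transcript. As a sketch based on a standard result rather than on the slide itself: for two equal-length vectors, the constant shift that minimizes the L2 distance is the mean of the frame-wise differences, and the shift that minimizes the L1 distance is their median. The function names below are hypothetical.

```python
from statistics import mean, median


def best_shift(query, item, p=1):
    """Constant key shift minimizing the Lp distance (p = 1 or 2 only)."""
    diffs = [b - a for a, b in zip(query, item)]
    if p == 1:
        return median(diffs)  # minimizes the sum of absolute differences
    if p == 2:
        return mean(diffs)    # minimizes the sum of squared differences
    raise ValueError("only p = 1 and p = 2 are handled in this sketch")


def transpose(query, shift):
    """Shift every frame of the query by the same number of semitones."""
    return [q + shift for q in query]


query = [60, 62, 64]
item = [65, 67, 69]
shift = best_shift(query, item, p=1)
print(shift, transpose(query, shift))
```

Because the optimal shift has a closed form, transposition can be resolved in a single step instead of by trying many candidate keys.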

  11. Example of Linear Scaling via L1 Norm • linScaling01.m

  12. Linear Scaling via L1 and L2 Norm • linScaling02.m

  13. DTW (Dynamic Time Warping) • About DTW • DTW introduction • DTW for QBSH • #1 method for task 2 in QBSH/MIREX 2006
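For orientation, here is a textbook DTW distance between two pitch vectors. It is a minimal sketch, not the MIREX 2006 submission mentioned on the slide: the local path constraints (steps from the left, lower, and diagonal neighbours), the absolute-difference frame cost, and the fixed start and end points are all assumptions.

```python
def dtw_distance(query, target):
    """Classic DTW with steps (i-1, j), (i, j-1), (i-1, j-1) and |a - b| cost."""
    inf = float("inf")
    m, n = len(query), len(target)
    # D[i][j] = cost of the best path aligning query[:i] with target[:j].
    D = [[inf] * (n + 1) for _ in range(m + 1)]
    D[0][0] = 0.0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = abs(query[i - 1] - target[j - 1])
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[m][n]


# The same three notes held for different durations align at zero cost.
print(dtw_distance([60, 62, 64], [60, 60, 62, 62, 64]))
```

Unlike linear scaling, the warping path may stretch each note by a different amount, which is what lets DTW absorb non-uniform tempo variations at the price of an O(mn) table per candidate.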

  14. RA (Recursive Alignment) • Characteristics • Combine characteristics of LS & DTW • #1 method for task 1 in QBSH/MIREX 2006 • A typical mapping path

  15. Modified Edit Distance • Note segmentation • Modified edit distance
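The slide does not say how the edit distance is modified, so the sketch below shows only the plain Levenshtein distance that such methods start from, applied to note sequences after segmentation. Converting notes to pitch intervals to remove the key is an assumption made for illustration, and the names `intervals` and `edit_distance` are hypothetical.

```python
def intervals(notes):
    """Pitch differences between consecutive notes (key-invariant)."""
    return [b - a for a, b in zip(notes, notes[1:])]


def edit_distance(a, b):
    """Plain Levenshtein distance with unit insert/delete/substitute costs."""
    m, n = len(a), len(b)
    prev = list(range(n + 1))  # distances from the empty prefix of a
    for i in range(1, m + 1):
        cur = [i] + [0] * n
        for j in range(1, n + 1):
            sub = 0 if a[i - 1] == b[j - 1] else 1
            cur[j] = min(prev[j] + 1,        # delete a[i-1]
                         cur[j - 1] + 1,     # insert b[j-1]
                         prev[j - 1] + sub)  # match or substitute
        prev = cur
    return prev[n]


sung = [60, 62, 64, 65]  # hypothetical note pitches (MIDI numbers)
song = [65, 67, 69, 70]  # the same tune five semitones higher
print(edit_distance(intervals(sung), intervals(song)))
```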
