
Robust Temporal and Spectral Modeling for Query By Melody

Shai Shalev, Yoram Singer, and Nir Friedman (Hebrew University); Shlomo Dubnov (Ben-Gurion University). Problem setting: given a melody as the query, find performances of that melody.


Presentation Transcript


  1. Robust Temporal and Spectral Modeling for Query By Melody • Shai Shalev, Hebrew University • Yoram Singer, Hebrew University • Nir Friedman, Hebrew University • Shlomo Dubnov, Ben-Gurion University

  2. Prelude

  3. Problem Setting • Query: a melody • Database: real recordings • Find: performances of the queried melody

  4. Challenge • Find performances of the queried melody independent of: • Tempo • Performing instrument • Dynamics • Expression • Accompaniment

  5. Related Work • A. Ghias et al., “Query by humming” • A. S. Durey and M. A. Clements, “Melody spotting using hidden Markov models” • C. Raphael, “Automatic segmentation of acoustic musical signals using HMMs” • B. Doval and X. Rodet, “Fundamental frequency estimation using a new harmonic matching method”

  6. Overview of Solution • Employ a statistical framework • Align a melody to a performance using explicit tempo modeling • Employ a maximum likelihood model for the spectrum of a note given the note’s pitch value • Find the best alignment of a melody to a performance using dynamic programming

  7. Statistical Framework • Input: a melody query and a database of real recordings • Query engine: for each recording, find [formula not preserved] • Output: a ranked list of recordings, ordered according to [formula not preserved]

  8. Modeling [Figure: graphical model with nodes Tempo, Melody, Aligned Melody, and Sound. Legend: hidden variable, observed variable.]

  9. Tempo Modeling • Sequence of scaling factors (one per note) • Model tempo as a first order Markov model • Use log-normal distribution to model conditional probability of tempo
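The tempo model on slide 9 can be illustrated with a minimal sketch. The slide says only that the conditional probability of tempo is log-normal and that the chain is first order. Two details below are assumptions made for illustration: centering the log of each scaling factor on the log of the previous one, and the spread parameter `sigma` with its default value. The function names are hypothetical.

```python
import math

def log_tempo_transition(m_prev, m_curr, sigma=0.2):
    """Log density of scaling factor m_curr given m_prev.

    Assumption: ln(m_curr) ~ Normal(ln(m_prev), sigma**2), i.e. a
    log-normal conditional centered on the previous tempo.
    """
    if m_prev <= 0 or m_curr <= 0:
        raise ValueError("scaling factors must be positive")
    z = (math.log(m_curr) - math.log(m_prev)) / sigma
    # Log-normal density: exp(-z^2 / 2) / (m_curr * sigma * sqrt(2 * pi))
    return -0.5 * z * z - math.log(m_curr * sigma * math.sqrt(2 * math.pi))

def log_tempo_sequence(factors, sigma=0.2):
    """Log probability of a sequence of per-note scaling factors under a
    first-order Markov chain (no prior on the first factor)."""
    return sum(
        log_tempo_transition(a, b, sigma)
        for a, b in zip(factors, factors[1:])
    )

steady = log_tempo_sequence([1.0, 1.0, 1.0])
jumpy = log_tempo_sequence([1.0, 2.0, 1.0])
print(steady > jumpy)
```

Under these assumptions a steady tempo scores higher than one that doubles and halves between consecutive notes, which is the behavior a tempo prior of this kind is meant to encode.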

  10. Spectral Modeling

  11. Spectral Modeling

  12. Spectral Modeling (cont.)

  13. Spectral Modeling (cont.) • Estimate the amplitude of each harmonic and the global variance of the noise using the maximum likelihood principle • Resulting signal-to-noise likelihood function: [formula not preserved]
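The estimation step on slide 13 can be sketched for a generic harmonic-plus-noise model. This is not the authors' HSN or HIN model, whose formulas are not preserved here. The sketch assumes a frame is a sum of sinusoids at integer multiples of a candidate pitch `f0` plus white Gaussian noise, in which case the maximum likelihood amplitudes are the least-squares solution and the maximum likelihood noise variance is the mean squared residual. The function name, the number of harmonics, and the returned signal-to-noise ratio are illustrative choices.

```python
import numpy as np

def fit_harmonics(frame, f0, sample_rate, n_harmonics=5):
    """Least-squares (Gaussian ML) fit of harmonics of f0 to one frame.

    Returns (amplitudes, noise_variance, snr). Assumes white Gaussian
    noise with a single global variance.
    """
    t = np.arange(len(frame)) / sample_rate
    cols = []
    for h in range(1, n_harmonics + 1):
        w = 2 * np.pi * h * f0 * t
        cols.append(np.cos(w))  # cosine and sine columns together
        cols.append(np.sin(w))  # capture amplitude and phase
    X = np.stack(cols, axis=1)
    coef, *_ = np.linalg.lstsq(X, frame, rcond=None)
    fitted = X @ coef
    residual = frame - fitted
    amplitudes = np.hypot(coef[0::2], coef[1::2])
    noise_var = float(np.mean(residual ** 2))
    snr = float(np.mean(fitted ** 2)) / max(noise_var, 1e-12)
    return amplitudes, noise_var, snr

# Synthetic frame: a 220 Hz tone with a weaker second harmonic plus noise.
rng = np.random.default_rng(0)
sr = 8000
t = np.arange(1024) / sr
frame = (np.cos(2 * np.pi * 220 * t)
         + 0.5 * np.sin(2 * np.pi * 440 * t)
         + 0.01 * rng.standard_normal(t.size))

amps, var, snr_right = fit_harmonics(frame, 220.0, sr)
_, _, snr_wrong = fit_harmonics(frame, 300.0, sr)
print(amps[:2], snr_right > snr_wrong)
```

The signal-to-noise ratio is far higher at the true pitch than at a wrong candidate, which is why a score of this kind can serve as the per-note evidence in an alignment.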

  14. Finding the best melody-performance alignment • Recurse over the tempo and end time of the previous note ⇒ dynamic programming procedure • Complexity: #possible tempo values × #notes × length of signal
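The recursion on slide 14 can be sketched as a dynamic program whose state is (note, end frame, tempo). Everything concrete below is an assumption for illustration: the function name `align`, the precomputed per-frame log-likelihood matrix, the discrete tempo grid, the requirement that the melody start at frame 0, and the uniform prior on the first tempo. Because this sketch maximizes over the previous tempo for every current tempo, its cost is quadratic in the number of tempo values, which may differ from the procedure on the slide.

```python
import numpy as np

def align(frame_loglik, durations, tempos, log_trans):
    """Best alignment of n notes to T frames.

    frame_loglik[i, t]: log-likelihood of frame t under note i
    durations[i]:       nominal length of note i in frames
    tempos:             candidate scaling factors
    log_trans[j, k]:    log P(tempo k | previous tempo j)
    Returns (best_score, end_frame_of_each_note).
    """
    n, T = frame_loglik.shape
    K = len(tempos)
    # cum[i, t] holds the sum of frame_loglik[i, :t] for O(1) segment scores.
    cum = np.concatenate(
        [np.zeros((n, 1)), np.cumsum(frame_loglik, axis=1)], axis=1)
    score = np.full((n, T + 1, K), -np.inf)
    back = np.zeros((n, T + 1, K), dtype=int)  # best previous tempo index
    for i in range(n):
        for k, m in enumerate(tempos):
            length = max(1, int(round(durations[i] * m)))
            for end in range(length, T + 1):
                start = end - length
                seg = cum[i, end] - cum[i, start]
                if i == 0:
                    if start == 0:  # assumption: melody starts at frame 0
                        score[i, end, k] = seg
                    continue
                prev = score[i - 1, start, :] + log_trans[:, k]
                j = int(np.argmax(prev))
                if prev[j] > -np.inf:
                    score[i, end, k] = prev[j] + seg
                    back[i, end, k] = j
    # Pick the best final state, then walk the back-pointers.
    end, k = np.unravel_index(np.argmax(score[n - 1]), score[n - 1].shape)
    best = float(score[n - 1, end, k])
    ends = []
    for i in range(n - 1, -1, -1):
        ends.append(int(end))
        length = max(1, int(round(durations[i] * tempos[k])))
        k_prev = back[i, end, k]
        end, k = end - length, k_prev
    return best, ends[::-1]

# Toy example: two notes played at half speed (scaling factor 2.0).
ll = np.full((2, 10), -5.0)
ll[0, 0:4] = 0.0   # note 0 fits frames 0..3
ll[1, 4:10] = 0.0  # note 1 fits frames 4..9
trans = np.log(np.array([[0.8, 0.2], [0.2, 0.8]]))
best, ends = align(ll, durations=[2, 3], tempos=[1.0, 2.0], log_trans=trans)
print(best, ends)
```

In the toy example the best path keeps the scaling factor at 2.0 for both notes, so the notes end at frames 4 and 10.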

  15. Experimental Results • Queries: 50 melodies from opera arias (taken from MIDI files) • Database: over 800 performances of opera arias performed by over 50 tenors with full orchestral accompaniment • Compared our variable-tempo (VT) model vs. fixed-tempo (FT) and locally-fixed-tempo (LFT) models • Compared our Harmonic with Scaled Noise (HSN) spectral model vs. the Harmonic with Independent Noise (HIN) model

  16. Evaluation Measures [Figure: five performances plotted by likelihood value against their index (1 to 5) in the ranked list, each marked + or -. Annotations: Oerr = 0, Cov = 3 - 2.]
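The three measures reported in the results (Oerr, Cov, AvgP) can be sketched on a ranked list of relevance flags. The exact definitions used in the talk are not preserved in this transcript, so the ones below are assumptions: one-error is 1 when the top-ranked performance is not relevant, coverage is the rank of the last relevant performance minus the number of relevant performances, and average precision is the usual mean of precision values at the ranks of the relevant performances. The function names are hypothetical.

```python
def one_error(relevance):
    """1 if the top-ranked item is not relevant, else 0."""
    return 0 if relevance[0] else 1

def coverage(relevance):
    """Assumed definition: number of non-relevant items that must be
    scanned before every relevant item has been retrieved."""
    last = max(rank for rank, r in enumerate(relevance, start=1) if r)
    return last - sum(1 for r in relevance if r)

def average_precision(relevance):
    """Mean of precision@rank over the ranks of relevant items."""
    hits = 0
    total = 0.0
    for rank, r in enumerate(relevance, start=1):
        if r:
            hits += 1
            total += hits / rank
    return total / hits

# Ranked list with relevant items at positions 1 and 3 (an assumed example).
ranked = [True, False, True, False, False]
print(one_error(ranked), coverage(ranked), average_precision(ranked))
```

All three functions assume the list contains at least one relevant item. Lower is better for one-error and coverage, and higher is better for average precision.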

  17. Summary of Results • One Error of VT+HSN: 8% • Average Precision of VT+HSN: 95% • Coverage of VT+HSN: 0.21

  18. Results, by spectral distribution model (HSN vs. HIN)

      Length    Tempo    HSN                      HIN
                model    AvgP    Cov     Oerr     AvgP    Cov     Oerr
      25 sec.   VT       0.95    0.21    0.08     0.92    0.40    0.10
                LFT      0.66    5.90    0.46     0.63    5.98    0.48
                FT       0.34   20.69    0.77     0.33   22.46    0.79
      15 sec.   VT       0.86    1.75    0.19     0.83    3.02    0.19
                LFT      0.66    8.10    0.44     0.66    8.15    0.42
                FT       0.38   19.83    0.71     0.36   19.08    0.73
      5 sec.    VT       0.51   10.67    0.65     0.46   11.83    0.69
                LFT      0.43   17.33    0.69     0.37   17.94    0.75
                FT       0.38   22.96    0.69     0.35   21.67    0.75

  19. Precision-Recall

  20. Illustration of Segmentation

  21. Future Work • More data • Other genres of music • Alternative spectral distribution models using supervised learning methods • Use alignment results for separating a soloist from the accompaniment
