
Song-level Multi-pitch Tracking by Heavily Constrained Clustering

Zhiyao Duan, Jinyu Han and Bryan Pardo. EECS Dept., Northwestern Univ. Interactive Audio Lab, http://music.cs.northwestern.edu. For presentation at ICASSP 2010, Dallas, Texas, USA.


Presentation Transcript


  1. Song-level Multi-pitch Tracking by Heavily Constrained Clustering • Zhiyao Duan, Jinyu Han and Bryan Pardo • EECS Dept., Northwestern Univ. Interactive Audio Lab, http://music.cs.northwestern.edu • For presentation at ICASSP 2010, Dallas, Texas, USA.

  2. Multi-pitch Estimation & Tracking Task • Given polyphonic music played by several monophonic harmonic instruments (number of instruments known) • Estimate a pitch trajectory for each instrument Northwestern University, Interactive Audio Lab. http://music.cs.northwestern.edu

  3. Potential Applications • Automatic music transcription • Harmonic source separation • Other applications: melody-based music search, chord recognition, source localization, music education, ……

  4. The 2-stage Standard Approach • Stage 1: Multi-pitch Estimation (MPE): estimate pitches in each single time frame (Z. Duan, B. Pardo and C. Zhang, “Multiple Fundamental Frequency Estimation by Modeling Spectral Peaks and Non-peak Regions”, IEEE Trans. Audio Speech Language Process., in press) • Stage 2: Multi-pitch Tracking (MPT): connect pitch estimates across frames into pitch trajectories • [Figure: Frequency vs. Time plot]

  5. State of the Art of MPT • What existing MPT methods do: form short pitch trajectories within a note (note-level), according to local time-frequency proximity of pitch estimates • Our contribution: form long pitch trajectories through multiple notes (song-level), using a new constrained clustering algorithm

  6. Try Clustering by Timbre • Each trajectory is a cluster of pitch estimates • One cluster per instrument • Clustering principle: maintain timbre consistency in each cluster

  7. Timbre Feature of Pitch Estimates • Harmonic structure: relative amplitudes of the first 50 harmonics • [Figure: Frequency vs. Time plot]
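The harmonic-structure feature on this slide can be illustrated with a minimal sketch. The slide only states that the feature holds the relative amplitudes of the first 50 harmonics; everything else here is an assumption for illustration: the input is a one-sided magnitude spectrum, each harmonic is read from the nearest FFT bin, amplitudes are in dB relative to the strongest harmonic, and a hypothetical `floor_db` clips weak or out-of-range harmonics. The function name `harmonic_structure` is also hypothetical.

```python
import numpy as np

def harmonic_structure(mag_spectrum, f0_hz, sample_rate, n_harmonics=50, floor_db=-60.0):
    """Relative harmonic amplitudes (dB) of one pitch estimate.

    Assumes mag_spectrum is a one-sided magnitude spectrum (bins 0..Nyquist)
    and that f0_hz lies below the Nyquist frequency.
    """
    mag = np.asarray(mag_spectrum, dtype=float)
    bin_hz = (sample_rate / 2.0) / (len(mag) - 1)           # width of one bin
    # Nearest bin for each harmonic h * f0, h = 1..n_harmonics
    bins = np.rint(np.arange(1, n_harmonics + 1) * f0_hz / bin_hz).astype(int)
    valid = bins < len(mag)                                  # harmonics below Nyquist
    amps_db = np.full(n_harmonics, -np.inf)
    amps_db[valid] = 20.0 * np.log10(np.maximum(mag[bins[valid]], 1e-12))
    rel_db = amps_db - amps_db.max()                         # strongest harmonic = 0 dB
    return np.maximum(rel_db, floor_db)                      # clip to the assumed floor

# Toy spectrum with 1 Hz bins: energy only at 100 Hz and 200 Hz.
spectrum = np.zeros(1025)
spectrum[100] = 1.0
spectrum[200] = 0.5
hs = harmonic_structure(spectrum, f0_hz=100.0, sample_rate=2048)
```

In this toy case `hs` is a 50-dimensional vector whose first entry is 0 dB, whose second entry is about -6 dB, and whose remaining entries sit at the floor.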

  8. Minimize This Objective Function • [Equation not preserved in the transcript; its annotated terms are listed here] • A partition into K clusters • Number of clusters • Center of the k-th cluster • Sum over all pitch estimates in the k-th cluster • The 50-d harmonic structure of the i-th pitch estimate
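The annotated terms on this slide (a partition into K clusters, a center per cluster, a sum over the pitch estimates in each cluster) are consistent with a K-means-style within-cluster scatter. The sketch below assumes squared Euclidean distance and takes each center to be the mean of its members; neither choice is stated in the transcript, and the name `clustering_objective` is hypothetical.

```python
import numpy as np

def clustering_objective(features, labels, n_clusters):
    """Sum over clusters of squared distances from members to the cluster mean.

    features: (n_points, n_dims) array, e.g. 50-d harmonic structures.
    labels:   cluster index (0..n_clusters-1) for each point.
    Squared Euclidean distance is an assumption made for this sketch.
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    total = 0.0
    for k in range(n_clusters):
        members = features[labels == k]
        if len(members) == 0:
            continue                      # an empty cluster contributes nothing
        center = members.mean(axis=0)     # center of the k-th cluster
        total += float(np.sum((members - center) ** 2))
    return total

# Four 2-d toy points forming two tight groups.
points = [[0, 0], [2, 0], [10, 0], [12, 0]]
good = clustering_objective(points, [0, 0, 1, 1], n_clusters=2)
bad = clustering_objective(points, [0, 1, 0, 1], n_clusters=2)
```

A partition that keeps similar timbres together (`good`) scores lower than one that mixes them (`bad`), which is the behaviour the clustering principle asks for.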

  9. Objective Function Is Not Enough

  10. Add Pitch-locality Constraints • Must-link: pitch estimates close in both time and frequency should be in the same cluster • Cannot-link: simultaneous pitches should not be in the same cluster (only for monophonic instruments) • [Figure: Frequency vs. Time plot]
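The two constraint types above can be sketched as follows. The slide does not say how close "close in both time and frequency" is, so the thresholds `max_frame_gap` and `max_semitone_gap`, the representation of a pitch estimate as a `(frame_index, midi_pitch)` tuple, and the function name are all assumptions made for illustration.

```python
from itertools import combinations

def build_constraints(estimates, max_frame_gap=1, max_semitone_gap=0.3):
    """Derive must-links and cannot-links from pitch estimates.

    estimates: list of (frame_index, midi_pitch) tuples (assumed format).
    Returns two sets of index pairs (i, j) with i < j.
    """
    must, cannot = set(), set()
    for i, j in combinations(range(len(estimates)), 2):
        (frame_i, pitch_i), (frame_j, pitch_j) = estimates[i], estimates[j]
        if frame_i == frame_j:
            # Simultaneous pitches: one monophonic instrument cannot play both.
            cannot.add((i, j))
        elif (abs(frame_i - frame_j) <= max_frame_gap
              and abs(pitch_i - pitch_j) <= max_semitone_gap):
            # Close in time and frequency: likely the same note.
            must.add((i, j))
    return must, cannot

estimates = [(0, 60.0), (0, 67.0), (1, 60.1), (1, 67.2), (5, 60.0)]
must, cannot = build_constraints(estimates)
```

The pairwise loop is quadratic in the number of estimates; a real implementation would probably only compare estimates in neighbouring frames.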

  11. Properties of Our Problem • Objective: timbre consistency • Constraints: pitch locality • Previous constrained clustering algorithms do not apply, due to the following properties: • Inconsistent constraints: pitch estimates are sometimes erroneous, which can make the constraints unsatisfiable • Heavily constrained: nearly every pitch estimate is involved in at least one constraint

  12. The Proposed Clustering Algorithm • $\Pi_n$: clustering in the n-th iteration; $C_n$: {all constraints satisfied by $\Pi_n$} • 1. Start from an initial clustering $\Pi_0$, which satisfies $C_0$, a subset of all constraints; set n = 1 • 2. Find a new clustering $\Pi_n$ that decreases the objective and also satisfies $C_{n-1}$ • 3. Set $C_n$ = {all constraints satisfied by $\Pi_n$}; n = n + 1 • 4. Repeat steps 2-3 until the objective (nearly) cannot be decreased
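The iteration on this slide can be sketched as a descent loop in which the set of satisfied constraints never shrinks. This is a sketch under stated assumptions, not the authors' implementation: candidate clusterings are generated by relabelling a single point (`single_point_moves`, a stand-in for the swap-set search on a later slide), the toy objective `sse_1d` works on scalar features, and all function names are hypothetical.

```python
def satisfied(labels, must, cannot):
    """Subsets of the must-links and cannot-links that this labelling satisfies."""
    sat_must = {(i, j) for i, j in must if labels[i] == labels[j]}
    sat_cannot = {(i, j) for i, j in cannot if labels[i] != labels[j]}
    return sat_must, sat_cannot

def single_point_moves(labels, n_clusters):
    """Stand-in neighbourhood: relabel one point at a time."""
    for i, current in enumerate(labels):
        for k in range(n_clusters):
            if k != current:
                candidate = list(labels)
                candidate[i] = k
                yield candidate

def constrained_descent(labels, must, cannot, objective, propose, max_iter=100, tol=1e-9):
    labels = list(labels)
    sat_must, sat_cannot = satisfied(labels, must, cannot)    # initial satisfied set
    best = objective(labels)
    for _ in range(max_iter):
        improved = False
        for candidate in propose(labels):
            cand_must, cand_cannot = satisfied(candidate, must, cannot)
            # Step 2: keep every constraint that is already satisfied ...
            if not (sat_must <= cand_must and sat_cannot <= cand_cannot):
                continue
            value = objective(candidate)
            # ... and strictly decrease the objective.
            if value < best - tol:
                labels, best = candidate, value
                sat_must, sat_cannot = cand_must, cand_cannot  # step 3: update the set
                improved = True
                break
        if not improved:                                       # step 4: stop
            break
    return labels, best, (sat_must, sat_cannot)

def sse_1d(values, labels):
    """Toy objective: within-cluster squared deviation of scalar features."""
    total = 0.0
    for k in set(labels):
        members = [v for v, l in zip(values, labels) if l == k]
        mean = sum(members) / len(members)
        total += sum((v - mean) ** 2 for v in members)
    return total

values = [0.0, 0.1, 5.0, 5.1]
final_labels, final_value, final_sat = constrained_descent(
    [0, 1, 0, 1], must={(0, 1)}, cannot={(0, 2)},
    objective=lambda lab: sse_1d(values, lab),
    propose=lambda lab: single_point_moves(lab, 2),
)
```

In this toy run the initial labelling satisfies neither constraint, and both become satisfied as the objective decreases. Because the satisfied set only grows, inconsistent constraints are tolerated: an unsatisfiable constraint simply never enters the set.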

  13. Initial Clustering • Trivial one • Initial clustering $\Pi_0$: a random partition • Initial constraint set $C_0$: the constraints satisfied by $\Pi_0$, may be empty • A more informative one for MPT • $\Pi_0$: label pitches according to pitch order in each frame: highest, second-highest, third-highest, fourth-highest, … • $C_0$: will contain all cannot-links • [Figure: two Frequency vs. Time plots]
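The more informative initialization can be sketched as below: within each frame, estimates are ranked by pitch and the rank becomes the cluster label. The `(frame_index, pitch)` tuple format and the function name `label_by_pitch_order` are assumptions for illustration. The sketch also assumes no frame holds more estimates than there are instruments, otherwise ranks would exceed the number of clusters.

```python
from collections import defaultdict

def label_by_pitch_order(estimates):
    """Label each estimate by its pitch rank within its frame.

    estimates: list of (frame_index, pitch) tuples (assumed format).
    Returns labels where 0 = highest pitch in the frame, 1 = second-highest, ...
    """
    by_frame = defaultdict(list)
    for idx, (frame, pitch) in enumerate(estimates):
        by_frame[frame].append((pitch, idx))
    labels = [None] * len(estimates)
    for members in by_frame.values():
        # Sort descending by pitch; the rank becomes the cluster label.
        for rank, (_, idx) in enumerate(sorted(members, reverse=True)):
            labels[idx] = rank
    return labels

estimates = [(0, 60.0), (0, 67.0), (1, 72.0), (1, 55.0), (1, 64.0)]
initial_labels = label_by_pitch_order(estimates)
```

Because simultaneous estimates always receive distinct ranks, every cannot-link between pitches in the same frame is satisfied from the start, which matches the slide's remark about the initial constraint set.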

  14. Find A New Clustering • Requirements: 1. satisfy the current constraints; 2. decrease the objective function • Swap set: a connected subgraph between two clusters • Traverse all swap sets until finding a new clustering that decreases the objective function • [Figure: eight numbered pitch estimates in two clusters, with a legend distinguishing satisfied and unsatisfied cannot-links and must-links]
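One way to read the swap-set idea is sketched here. The slide defines a swap set only as "a connected subgraph between two clusters"; this sketch assumes the edges of that graph are the currently satisfied constraints among points of the two clusters, and that the move exchanges the two labels for every point in the set. Under that assumption every constraint that was satisfied stays satisfied after the swap. The function names are hypothetical.

```python
from collections import deque

def swap_set(seed, labels, must, cannot, pair):
    """Points reachable from seed via satisfied constraints inside two clusters.

    pair: the two cluster labels (p, q); seed is assumed to belong to one of them.
    """
    p, q = pair
    neighbours = {}

    def connect(i, j):
        neighbours.setdefault(i, set()).add(j)
        neighbours.setdefault(j, set()).add(i)

    for i, j in must:        # satisfied must-link: same label, inside the pair
        if labels[i] == labels[j] and labels[i] in (p, q):
            connect(i, j)
    for i, j in cannot:      # satisfied cannot-link: one point in p, the other in q
        if {labels[i], labels[j]} == {p, q}:
            connect(i, j)

    members, queue = {seed}, deque([seed])
    while queue:             # breadth-first search over the constraint graph
        node = queue.popleft()
        for nxt in neighbours.get(node, ()):
            if nxt not in members:
                members.add(nxt)
                queue.append(nxt)
    return members

def apply_swap(labels, members, pair):
    """Exchange labels p and q for every point in the swap set."""
    p, q = pair
    swapped = list(labels)
    for i in members:
        swapped[i] = q if labels[i] == p else p
    return swapped

labels = [0, 0, 1, 1, 0, 2]
must, cannot = {(0, 1)}, {(1, 2), (3, 4)}
members = swap_set(0, labels, must, cannot, pair=(0, 1))
swapped = apply_swap(labels, members, pair=(0, 1))
```

Here points 0, 1 and 2 move together, while points 3 and 4 form a separate swap set and point 5 (a third cluster) is untouched. A search would evaluate the objective on `swapped` and accept the first swap that lowers it.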

  15. Algorithm Review • [Figure legend: a partition of the points into clusters; the feasible solution space under a set of constraints]

  16. Experiments • Data set: 10 J.S. Bach chorales (quartets, played by violin, clarinet, saxophone and bassoon); each instrument is recorded individually, then mixed • Ground-truth pitch trajectories: obtained by running YIN on the monophonic tracks before mixing • Input pitch estimates: from our previous work [1]; input accuracy: 70.0 ± 3.1% • [1] Zhiyao Duan, Bryan Pardo and Changshui Zhang, “Multiple Fundamental Frequency Estimation by Modeling Spectral Peaks and Non-peak Regions”, IEEE Trans. Audio Speech Language Process., in press.

  17. Overall Multi-pitch Tracking Results • [Figure: mean % of correct pitch estimates]

  18. Among Correctly Estimated Pitches

  19. An Example

  20. An Example

  21. Conclusion • Formulate the song-level Multi-pitch Tracking problem as a constrained clustering problem • Objective: timbre consistency • Constraints: pitch locality • Existing constrained clustering algorithms do not apply due to problem properties • Propose a new constrained clustering algorithm • Experimental results are promising

  22. Thank you!
