
Query by Pitch



Presentation Transcript


  1. Query by Pitch Jin Yi and Russell Brennan

  2. Introduction • Input: Sing a snippet of a song • Output: Name of the song, artist, genre etc. • Marketable: Integrate with online music shops • Useful: Provides a quick, easy solution for determining song information

  3. Methodology • Vocal delivery • Subject sings into a microphone • Filtering • Filter noise via a ~100–800 Hz bandpass filter • Pitch Detection • Calculate a difference function to determine the fundamental frequency • Segmentation • Determine discrete pitches throughout the signal

  4. Methodology (continued) • Indexing/Database Building • Calculate ratios of pitches and pitch durations to previous pitches and durations • Create database of known song ratios for comparison • Comparison • Compute second difference function, sliding vocal ratios across database ratio windows • Result • Song with lowest difference

  5. Bandpass Filter • Needed for filtering out noise • A Butterworth filter has no ripple in the passband, unlike a Chebyshev (Type I) filter

  6. Bandpass Filter (as originally intended, 4th order bandpass filter)

  7. First Order Bandpass Filter + First Order Lowpass Filter • The output signal is too low since voltage is dropped across the resistor • [Circuit diagram labels: First-order low-pass filter, First-order bandpass filter]

  8. Inverting Amplifier • Added op-amp • Gain = -R2/R1

  9. Final Circuit For Bandpass Filter • The bandpass filter cuts off the low frequencies but has a long transition band at the high cutoff, so we added two more low-pass filters. • [Block diagram labels: Microphone, Inverting op-amp, Inverting op-amp, Low-pass filter, Bandpass filter, Low-pass filter, DSP]

  10. Pitch Detection • Vocal delivery creates a periodic signal in short-time… • It should have a high correlation with itself, when shifted one period

  11. Detect the Period • A squared difference function: • $d_t(\tau) = \sum_{j=1}^{W} (s_j - s_{j-\tau})^2$ • Can detect the offset, $\tau$ = period • The period will be at a minimum of this difference function.
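
A minimal sketch of the period search described on this slide, in plain Python rather than the DSP code the authors used. The sample rate, window length, lag range, and the 160 Hz test tone are assumptions chosen for illustration (lags 10 to 80 at 8 kHz correspond roughly to the 100–800 Hz band), and the function names are hypothetical. It omits the extra normalization steps of the full YIN method and simply takes the lag with the smallest difference.

```python
import math

def difference_function(signal, window, max_lag):
    """Return d(tau) = sum over the window of (s[j] - s[j - tau])^2 for tau = 0..max_lag."""
    start = max_lag  # first index where s[j - tau] exists for every tau
    d = []
    for tau in range(max_lag + 1):
        total = 0.0
        for j in range(start, start + window):
            diff = signal[j] - signal[j - tau]
            total += diff * diff
        d.append(total)
    return d

def estimate_period(signal, window, min_lag, max_lag):
    """Pick the lag in [min_lag, max_lag] that minimizes the difference function."""
    d = difference_function(signal, window, max_lag)
    # Lags below min_lag are skipped because d(0) is trivially zero.
    return min(range(min_lag, max_lag + 1), key=lambda tau: d[tau])

# Assumed test input: a 160 Hz sine sampled at 8 kHz (period of 50 samples).
sample_rate = 8000
signal = [math.sin(2 * math.pi * 160.0 * n / sample_rate) for n in range(400)]

period = estimate_period(signal, window=200, min_lag=10, max_lag=80)
pitch_hz = sample_rate / period
print(period, pitch_hz)
```

Restricting the lag range matters: the difference function is also near zero at integer multiples of the true period, so a range that admits two periods can return the wrong octave.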

  12. Segmentation • Exponentially Weighted Moving Average (EWMA) • EWMA is often used in statistical process control to detect shifts in the mean of a process • The pitch from the DSP should be smoothed so that changes in pitch are easier to detect • EWMA weights current and past values to create a current estimate of a signal average • A(i) = r * signal(i) + (1 - r) * A(i-1)
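
The EWMA recurrence above can be sketched in a few lines of Python. Seeding A(0) with the first sample and the weight r = 0.5 are assumptions made for this example; the slides do not state the initial value or the weight actually used.

```python
def ewma(signal, r):
    """Apply A(i) = r * signal(i) + (1 - r) * A(i-1), seeded with the first sample."""
    if not signal:
        return []
    smoothed = [float(signal[0])]
    for value in signal[1:]:
        smoothed.append(r * value + (1.0 - r) * smoothed[-1])
    return smoothed

# Hypothetical pitch track with a step from 200 Hz to 240 Hz.
pitch_track = [200, 200, 200, 240, 240, 240]
smoothed = ewma(pitch_track, r=0.5)
print(smoothed)  # [200.0, 200.0, 200.0, 220.0, 230.0, 235.0]
```

A smaller r smooths more heavily but makes the average approach the new pitch more slowly, which stretches the step into a longer rising run.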

  13. Segmentation (continued) • Use EWMA as follows: • EWMA smooths the signal greatly • A shift in pitch is detected by finding a trend line • A trend of 4 in a row increasing or decreasing indicates a shift in mean • Can we trust the EWMA? • Each trend line becomes a mark
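
One possible reading of the trend rule, sketched in Python: a mark is placed when the smoothed pitch moves in the same direction for four consecutive steps. Counting steps rather than samples, marking the sample just before the run began, and the function name are all assumptions; the slides do not specify these details.

```python
def find_trend_marks(smoothed, run_length=4):
    """Return indices where a run of `run_length` same-direction steps begins."""
    marks = []
    direction = 0  # +1 rising, -1 falling, 0 flat
    run = 0
    for i in range(1, len(smoothed)):
        step = smoothed[i] - smoothed[i - 1]
        new_direction = (step > 0) - (step < 0)
        if new_direction != 0 and new_direction == direction:
            run += 1
        else:
            direction = new_direction
            run = 1 if new_direction != 0 else 0
        if run == run_length:
            # Mark the sample just before the run began (assumed convention).
            marks.append(i - run_length)
    return marks

# Hypothetical smoothed pitch: flat, then a five-step rise, then flat again.
smoothed = [200, 200, 200, 200, 200, 210, 220, 225, 228, 229, 229, 229, 229]
marks = find_trend_marks(smoothed)
print(marks)  # [4]
```

Each run produces a single mark because the check fires only when the run length first reaches the threshold.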

  14. Segmentation (conclusion) • By default, a mark is placed at the first and last samples of the pitch signal • Calculate means of the pitch signal within each mark section, i.e. 1-25:26-39:39-50 • If means are reasonably close, consider them one (this happens often) • Ratios of mean(i-1) / mean(i) are used for comparison
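
The per-section means and the merging of close means could look like the following sketch. The 5 percent tolerance, the unweighted average used when merging, the 0-based end-exclusive marks, and the sample data are assumptions; the slides only say that "reasonably close" means are treated as one.

```python
def segment_means(pitch, marks):
    """Mean pitch between consecutive marks (0-based, end-exclusive indices)."""
    means = []
    for start, end in zip(marks, marks[1:]):
        section = pitch[start:end]
        if section:  # skip empty sections, e.g. duplicate marks
            means.append(sum(section) / len(section))
    return means

def merge_close(means, tolerance=0.05):
    """Merge neighbouring means that differ by at most `tolerance` (relative)."""
    merged = []
    for m in means:
        if merged and abs(m - merged[-1]) <= tolerance * merged[-1]:
            merged[-1] = (merged[-1] + m) / 2.0  # assumed: simple average
        else:
            merged.append(m)
    return merged

# Hypothetical pitch track with marks at the first sample, two trends, and the end.
pitch = [200] * 4 + [202] * 4 + [240] * 4
marks = [0, 4, 8, len(pitch)]

means = segment_means(pitch, marks)
notes = merge_close(means)
print(means, notes)  # [200.0, 202.0, 240.0] [201.0, 240.0]
```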

  15. Block Diagram for Calculating the Ratio Index • [Block diagram labels: Mark the Pitch, EWMA, Calculate Ratio, Combine Close Pitch, Calculate Pitch]

  16. Example of calculating the ratio • Marks: 1 1 36 111 168 • Pitches: 214.4 161.4 240.0 • Ratios: 161.4/214.4, 240.0/161.4 ≈ 0.753, 1.487
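
A short sketch of the ratio step using the three pitches from this example, with each pitch divided by the previous one as the example does. The function name and the 1.5x transposition factor are illustrative assumptions. The second half shows a property of pitch ratios in general: scaling every pitch by the same factor, as happens when someone sings in a different key, leaves the ratios unchanged.

```python
def pitch_ratios(pitches):
    """Ratio of each pitch to the previous pitch."""
    return [cur / prev for prev, cur in zip(pitches, pitches[1:])]

pitches = [214.4, 161.4, 240.0]
ratios = pitch_ratios(pitches)

# Transpose the same tune by an arbitrary factor (assumed for illustration).
transposed = [p * 1.5 for p in pitches]
transposed_ratios = pitch_ratios(transposed)

same = all(abs(a - b) < 1e-12 for a, b in zip(ratios, transposed_ratios))
print([round(r, 3) for r in ratios], same)
```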

  17. Algorithm for Finding the Right Song • R1 = Ratios from the database • R2 = Ratios from the current input • Difference = sum over the window of (R1 – R2)^2

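  18. d = (2-8)^2 + (3-9)^2 + (1-0)^2 + (4-3)^2 + (5-4)^2 + (6-2)^2 = 91

  19. d = (3-8)^2 + (1-9)^2 + (4-0)^2 + (5-3)^2 + (6-4)^2 + (7-2)^2 = 138 • Differences so far: 91, 138

  20. d = (1-8)^2 + (4-9)^2 + (5-0)^2 + (6-3)^2 + (7-4)^2 + (8-2)^2 = 153 • Differences so far: 91, 138, 153

  21. d = (4-8)^2 + (5-9)^2 + (6-0)^2 + (7-3)^2 + (8-4)^2 + (9-2)^2 = 149 • Differences so far: 91, 138, 153, 149

  22. d = (5-8)^2 + (6-9)^2 + (7-0)^2 + (8-3)^2 + (9-4)^2 + (0-2)^2 = 121 • Differences so far: 91, 138, 153, 149, 121

  23. d = (6-8)^2 + (7-9)^2 + (8-0)^2 + (9-3)^2 + (0-4)^2 + (3-2)^2 = 125 • Differences so far: 91, 138, 153, 149, 121, 125

  24. d = (7-8)^2 + (8-9)^2 + (9-0)^2 + (0-3)^2 + (3-4)^2 + (4-2)^2 = 97 • Differences so far: 91, 138, 153, 149, 121, 125, 97

  25. d = (8-8)^2 + (9-9)^2 + (0-0)^2 + (3-3)^2 + (4-4)^2 + (2-2)^2 = 0 • Differences so far: 91, 138, 153, 149, 121, 125, 97, 0

  26. d = (9-8)^2 + (0-9)^2 + (0-3)^2 + (4-3)^2 + (2-4)^2 + (1-2)^2 = 97 • Differences so far: 91, 138, 153, 149, 121, 125, 97, 0, 97

  27. d = (2-8)^2 + (3-9)^2 + (4-3)^2 + (4-3)^2 + (5-4)^2 + (6-2)^2 = 91 • Differences so far: 91, 138, 153, 149, 121, 125, 97, 0, 97, 91

  28. d = (3-8)^2 + (4-9)^2 + (2-3)^2 + (1-3)^2 + (4-4)^2 + (5-2)^2 = 64 • Differences so far: 91, 138, 153, 149, 121, 125, 97, 0, 97, 91, 64

  29. Returns the minimum • Differences: 91, 138, 153, 149, 121, 125, 97, 0, 97, 91, 64 • Comparison result for this song is 0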
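
The sliding comparison walked through in slides 18 to 29 can be sketched as below. The query values 8, 9, 0, 3, 4, 2 and the start of the database sequence are read off the terms shown on those slides; the song names, the second song, and the function names are hypothetical, and this is an illustration rather than the authors' C++ implementation.

```python
def sliding_differences(database, query):
    """Sum of squared differences between the query and each database window."""
    n = len(query)
    return [
        sum((database[k + i] - query[i]) ** 2 for i in range(n))
        for k in range(len(database) - n + 1)
    ]

def best_match(songs, query):
    """Return (song name, lowest difference) over all songs long enough to compare."""
    scores = {
        name: min(sliding_differences(ratios, query))
        for name, ratios in songs.items()
        if len(ratios) >= len(query)
    }
    name = min(scores, key=scores.get)
    return name, scores[name]

query = [8, 9, 0, 3, 4, 2]
songs = {
    "song_a": [2, 3, 1, 4, 5, 6, 7, 8, 9, 0, 3, 4, 2, 1],  # values from the slides
    "song_b": [1, 1, 1, 1, 1, 1, 1],                        # made-up second entry
}

diffs = sliding_differences(songs["song_a"], query)
offset = diffs.index(min(diffs))
winner, score = best_match(songs, query)
print(offset, winner, score)
```

For this data the first window gives 91 and the exact match at offset 7 gives 0, so `song_a` is returned with the lowest difference.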

  30. In case of missing the pitch • This doesn’t work since one missing pitch will cause two incorrect ratios • Correct pitches: 4 2 5 6 7 8 3 • Ratios: 4/2, 2/5, 5/6, 6/7, 7/8, 8/3 = 2, 0.4, 0.83, 0.86, 0.88, 2.67 • One pitch missing: 4 2 5 6 8 3 • Ratios: 4/2, 2/5, 5/6, 6/8, 8/3 = 2, 0.4, 0.83, 0.75, 2.67 • The missing pitch (7) corrupts the ratio 6/8 and removes one ratio from the sequence
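
The effect of a dropped pitch can be checked numerically with the pitches from this slide. The sketch below uses previous-over-current ratios, as the slide does, and counts how many ratios still line up from the start and from the end; the helper names and the tolerance are assumptions.

```python
def ratios_prev_over_cur(pitches):
    """Ratio of each pitch's predecessor to the pitch itself, as on the slide."""
    return [prev / cur for prev, cur in zip(pitches, pitches[1:])]

def matching_run(a, b, tol=1e-9):
    """Number of leading positions where the two sequences agree."""
    count = 0
    for x, y in zip(a, b):
        if abs(x - y) > tol:
            break
        count += 1
    return count

full = ratios_prev_over_cur([4, 2, 5, 6, 7, 8, 3])
missing = ratios_prev_over_cur([4, 2, 5, 6, 8, 3])

prefix = matching_run(full, missing)
suffix = matching_run(full[::-1], missing[::-1])
print(prefix, suffix, len(full), len(missing))
```

The first three ratios and the last one still agree, but the two ratios around the missing pitch collapse into a single wrong one, so every later ratio sits one position earlier than in the database entry.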

  31. Build • Band-Pass Filter • Capacitors, inductors and op-amps, as well as resistors • Pitch Detection • TI 54x DSP dev board • Code Composer Studio version 1.2 • Serial Transmission of pitch indexes • Start/Stop signal capabilities

  32. Build (continued) • Pitch index reception/post-processing • Programmed as a standalone application in C++ • Ability to change song database on-the-fly

  33. Testing • Band-Pass Filter • Input a sinusoid and observed the result on the oscilloscope. • Measured the voltage at nodes to debug. • Output had a -0.6 V offset. Getting rid of this offset is not necessary since we are detecting only periodicity.

  34. Testing • Pitch Detector • Most testing done in Matlab environment • Sinusoid, swept sinusoid, noisy sinusoid, harmonic stack with noise, vocal singing • From DSP, serial output and memory dumps • Stepwise expectation verification

  35. Testing (continued)

  36. Testing Serial Port • Tera Term was used to test output of DSP. • Assembly serial port output function did not work for some reason. We had to use C functions written for ECE 420 • ASCII code was interpreted and found to correspond to correct pitches. Sending characters to the DSP was tested using a DSP on/off technique.

  37. Debugging the Software • In the Unix programming environment, most people use ‘printf’ to debug • In Visual C++ (API), printf cannot be used, so we debugged using popup windows • To view intermediate values or any results, we converted floating point numbers or integers to strings for use with popups.

  38. Testing Segmentation • Segmentation was first tested in Matlab to facilitate quick changes • Test clips of people singing short tunes were used • Parameters such as average weight decay and trend length were adjusted • Finally, the algorithm was integrated into our main executable

  39. Discussion (Successes/Failures) • Vocal extraction (failure) • Missing pitches • At least 5-6 pitches are needed • The program could match some songs almost 90 percent of the time

  40. Recommendations • Missing pitches (curve-fitting) • Duration of pitches • “De-esser” • Harmonic search (vocal extraction) • Double pitch output

  41. References • A. de Cheveigné and H. Kawahara. YIN, a fundamental frequency estimator for speech and music. Journal of the Acoustical Society of America, 111(4), 2002. • Mark Hasegawa-Johnson. Audio Engineering Lecture Notes for ECE 403. January 20, 2005. • Robert Morrison, Jason Laska, Douglas Jones. Digital Signal Processing Laboratory. Feb. 21, 2005. http://cnx.rice.edu/content/col10236/latest/ • Alex Spektor. Personal communication, Summer 2005.
