
Fusion of Live Audio Recordings for Blind Noise Reduction

Aaron Ballew, Aleksandar Kuzmanovic, C. C. Lee. Northwestern University, Dept. of Electrical Engineering and Computer Science. July 7th, 2011.


Presentation Transcript


  1. Fusion of Live Audio Recordings for Blind Noise Reduction
     Aaron Ballew, Aleksandar Kuzmanovic, C. C. Lee
     Northwestern University, Dept. of Electrical Engineering and Computer Science
     July 7th, 2011

  2. Observation: You Attend a Concert
     Bootleggers: at the show, you remember cell phones and cameras in the air
     • You’d like a recording of the show
     • Live albums exist, but…
     • You want the show you went to, back in San Jose, CA on Feb 22nd, 2010

  3. Observation, cont’d: Seek It Out
     Online Database
     • You find some of those recordings uploaded
     • Not just one, but three, four, or five copies of your favorite songs
     • Varying quality

  4. Opportunity
     • Each song is an unknown source signal with receiver diversity
     • There must be a way to take advantage of the diversity in these recordings to generate a new recording whose quality is better than any of the originals

  5. Opportunity, cont’d
     • All the recordings have something in common – a sameness from the music that was generated
     • They have something uncommon too – a differentness from noisy applause, screaming fans, wind, etc.

  6. Complications
     • No reference (except in your mind) that defines which part is music rather than noise
     • Studio recording won’t work in general
     • You don’t know the SNR of any signal
     • There’s no pilot signal to imply the channel
     • No opportunity to pre-code a digital waveform
     • It’s an analog source
     • No M-ary QPSK, matched filters
     • Uncountably many sources and relatively few recordings, not a good fit for ICA

  7. Assumptions
     • Recordings are mono
     • Stage speakers may be physically separated and multitrack
     • Relative to the venue’s scale and the listener’s perspective, the multitracks arrive synchronized and are recorded as mono by the mic
     • Recordings are not synchronized to each other
     • Different start/stop times and durations
     • Receivers are distributed arbitrarily among the audience
     • Noise at one receiver is not the same as the noise at another
     • Not necessarily true if two receivers are close to each other
     • Not true out of context, such as in a quiet auditorium
     [Figure panels: Sample vs. Sample; Noise vs. Noise]

  8. Strategy
     • We will never know the absolute SNR of any of the recordings
     • However, if we could be confident their signal powers were equal, then the differences in their total powers would be due to the noise
       • Assumes the noise is (close to) uncorrelated
       • Does not assume we know what the signal power actually is
     • If we could use the total power as a proxy for noise power (given bullet 2 above), we could:
       • Rank recordings by SNR
       • Apply a classic averaging technique to cancel noise
       • Measure whether noise power went up or down compared to any original recording
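The reasoning on this slide can be written out under a simple model. The notation below (x_i, s, n_i, S, N_i, P_i) is assumed here for illustration and does not come from the slides: each time-aligned recording is modeled as a zero-mean signal s of power S, already scaled to be equal across recordings, plus zero-mean noise n_i of power N_i that is uncorrelated with s and with the noise in every other recording.

```latex
\begin{align*}
x_i &= s + n_i, & P_i &= \mathbb{E}[x_i^2] = S + N_i \\
P_i - P_j &= N_i - N_j \\
\bar{x}_M &= \frac{1}{M}\sum_{i=1}^{M} x_i = s + \frac{1}{M}\sum_{i=1}^{M} n_i, &
P_{\bar{x}_M} &= S + \frac{1}{M^2}\sum_{i=1}^{M} N_i
\end{align*}
```

Under these assumptions, ranking by total power P_i is the same as ranking by noise power N_i, and the averaged composite is cleaner than recording j exactly when its total power is below P_j. Neither comparison requires knowing S.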

  9. Strategy, cont’d
     • It would look like this: [figure not reproduced in this transcript]

  10. Step 1 – Internal Reference: Similarity & Synchronization
      • Cross-correlations show:
        • Which sample is most similar to all other samples
        • The time shift (lag) between any pair of samples
      • There is no external reference, so pick an internal one from the sample set
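A minimal sketch of Step 1 in plain Python, on synthetic data. The helper names `xcorr_peak` and `pick_reference`, the "sum of correlation peaks" score for choosing the reference, and the test signal are all assumptions made for illustration; the slides do not specify how similarity is scored.

```python
import math
import random

def xcorr_peak(x, y, max_lag):
    """Return (lag, peak) maximizing the normalized cross-correlation.

    A positive lag means y is delayed relative to x by that many samples.
    """
    mx, my = sum(x) / len(x), sum(y) / len(y)
    xc = [v - mx for v in x]
    yc = [v - my for v in y]
    norm = math.sqrt(sum(v * v for v in xc) * sum(v * v for v in yc))
    best_lag, best_peak = 0, -float("inf")
    for lag in range(-max_lag, max_lag + 1):
        # Correlate x[n] with y[n + lag] over the overlapping samples only.
        start = max(0, -lag)
        stop = min(len(xc), len(yc) - lag)
        c = sum(xc[n] * yc[n + lag] for n in range(start, stop)) / norm
        if c > best_peak:
            best_lag, best_peak = lag, c
    return best_lag, best_peak

def pick_reference(recordings, max_lag):
    """Return (index of the most 'central' recording, pairwise lags)."""
    count = len(recordings)
    scores = [0.0] * count
    lags = {}
    for i in range(count):
        for j in range(i + 1, count):
            lag, peak = xcorr_peak(recordings[i], recordings[j], max_lag)
            lags[(i, j)] = lag
            # Assumed scoring rule: sum of peak correlations with all others.
            scores[i] += peak
            scores[j] += peak
    reference = max(range(count), key=lambda i: scores[i])
    return reference, lags

# Synthetic demo: one source heard with different delays and noise levels.
rng = random.Random(0)
source = [rng.gauss(0, 1) for _ in range(2040)]
delays = [0, 5, 12]            # hypothetical delays, in samples
noise_std = [0.1, 0.8, 1.2]    # hypothetical noise levels
recordings = [
    [source[n + 20 - d] + rng.gauss(0, sd) for n in range(2000)]
    for d, sd in zip(delays, noise_std)
]
reference, lags = pick_reference(recordings, max_lag=20)
print(reference, lags)
```

The brute-force lag search is quadratic and only suitable for short clips; an FFT-based correlation would be the usual choice for full-length recordings.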

  11. Step 2 – Normalize: In the Absence of SNR
      • The effect of combining samples is unclear
      • Need a way to isolate changes in signal or noise power
      • It would be helpful if signal powers were already equal
        • That would imply combining affects only the noise

  12. Step 2 – Normalize, cont’d: Use the Right Tool
      • Use covariance, not the correlation coefficient r, to normalize signal powers
      • You still don’t know the absolute signal powers
      • You only know that the differences are due to noise
      • Now, you can tell whether noise goes up or down after combining
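One way covariance could be used to equalize signal powers is sketched below. This is an assumption, not the authors' stated procedure: it models each aligned recording as a_i * s + n_i with mutually uncorrelated noise, in which case cov(i, j) * cov(i, k) / cov(j, k) estimates the signal power of recording i using only cross-covariances. The function name `equalize_signal_power` and the synthetic gains and noise levels are hypothetical.

```python
import math
import random

def cov(x, y):
    """Sample covariance of two equal-length, time-aligned sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)

def equalize_signal_power(recs):
    """Scale each recording so that its estimated signal power is 1.

    Assumed model: recs[i] = a_i * s + n_i with mutually uncorrelated noise,
    so cov(i, j) = a_i * a_j * var(s) for i != j, and therefore
    cov(i, j) * cov(i, k) / cov(j, k) = a_i**2 * var(s).
    """
    if len(recs) < 3:
        raise ValueError("this estimator needs at least three recordings")
    scaled = []
    for i, rec in enumerate(recs):
        j, k = [m for m in range(len(recs)) if m != i][:2]
        signal_power = cov(rec, recs[j]) * cov(rec, recs[k]) / cov(recs[j], recs[k])
        if signal_power <= 0:
            raise ValueError("recordings are not positively correlated")
        gain = 1.0 / math.sqrt(signal_power)
        scaled.append([gain * v for v in rec])
    return scaled

# Synthetic demo: the loudest recording is actually the cleanest.
rng = random.Random(1)
s = [rng.gauss(0, 1) for _ in range(20000)]
gains = [2.0, 1.0, 0.5]        # hypothetical channel gains
noise_std = [0.4, 0.8, 0.3]    # hypothetical noise levels
recs = [[a * v + rng.gauss(0, sd) for v in s] for a, sd in zip(gains, noise_std)]

# Ranking by raw total power versus ranking after equalizing signal power.
raw_rank = sorted(range(3), key=lambda i: cov(recs[i], recs[i]))
equalized = equalize_signal_power(recs)
noise_rank = sorted(range(3), key=lambda i: cov(equalized[i], equalized[i]))
print(raw_rank, noise_rank)
```

The correlation coefficient r would not work here because it is invariant to scaling, so it carries no information about each recording's gain; covariance keeps that scale.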

  13. Step 3 – Fusion: “Weighted” Average
      • Find the average of the first M ranked samples, with M chosen so that total power is minimized
      • Why the first M?
        • A sample’s noise power may be so large that it increases the composite’s noise
      [Figure note: not to scale]
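The selection of M can be sketched as a simple search over prefixes of the ranked list. This sketch uses a plain, equal-weight average of the first M recordings, which is one reading of the quoted "weighted" on the slide; the function names and the synthetic data are assumptions for illustration.

```python
import random

def power(x):
    """Mean-removed average power (sample variance)."""
    m = sum(x) / len(x)
    return sum((v - m) ** 2 for v in x) / len(x)

def fuse(recs):
    """Average the M lowest-power recordings, picking M to minimize power.

    Expects recordings that are already time-aligned and scaled to equal
    signal power, so that lower total power means lower noise power.
    """
    ranked = sorted(recs, key=power)
    running = [0.0] * len(ranked[0])
    best, best_power, best_m = None, float("inf"), 0
    for m, rec in enumerate(ranked, start=1):
        running = [a + b for a, b in zip(running, rec)]
        candidate = [v / m for v in running]
        p = power(candidate)
        if p < best_power:
            best, best_power, best_m = candidate, p, m
    return best, best_m

# Synthetic demo: three usable recordings and one very noisy one.
rng = random.Random(2)
s = [rng.gauss(0, 1) for _ in range(20000)]
noise_std = [0.45, 0.5, 0.55, 3.0]   # hypothetical noise levels
recs = [[v + rng.gauss(0, sd) for v in s] for sd in noise_std]
fused, m = fuse(recs)
print(m, power(fused), min(power(r) for r in recs))
```

In this demo the fourth recording is excluded: adding it to the average would raise the composite's noise power rather than lower it, which is the reason for stopping at the first M.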

  14. Benefits
      • Identify a “best” quality recording without having to manually listen to each
      • Generate a recording that exceeds the “best” in quality
      • Encourage user-generated (crowd-sourced) content sharing
      • Applicable to any context where the source signal is completely unknown

  15. Ongoing and Future
      • Ongoing: Time-variability of noise
        • Shows up as “low-frequency” noise that downselects against such a recording
        • We window in time (and frequency) to take advantage of the high-quality parts of the recordings
        • Stitching the windows back together post-fusion requires some attention due to an audible discontinuity when adjacent windows generate a different composite
      • Future: Maximal Ratio Combining
        • Well-known technique that requires channel knowledge
        • Gives optimal weighting of samples for maximal fusion gain
        • I believe we can adapt the inference technique to MRC, such that we get the “maximal” SNR gain, though I may not know exactly what the gain is!
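One common way to soften the discontinuity between adjacent windows is a short linear crossfade. The sketch below is a generic technique, not something the slides say the authors use; `crossfade_stitch` is a hypothetical helper, and it assumes each window's first `overlap` samples cover the same time span as the previous window's last `overlap` samples.

```python
def crossfade_stitch(windows, overlap):
    """Join consecutive windows, linearly crossfading `overlap` samples."""
    if overlap < 1:
        raise ValueError("overlap must be at least one sample")
    out = list(windows[0])
    for window in windows[1:]:
        if len(window) < overlap or len(out) < overlap:
            raise ValueError("window shorter than overlap")
        for n in range(overlap):
            # Weight of the incoming window rises linearly across the overlap.
            fade_in = (n + 1) / (overlap + 1)
            idx = len(out) - overlap + n
            out[idx] = out[idx] * (1.0 - fade_in) + window[n] * fade_in
        out.extend(window[overlap:])
    return out

# Demo: an abrupt 1.0 -> 0.0 step becomes a gradual ramp.
stitched = crossfade_stitch([[1.0] * 8, [0.0] * 8], overlap=4)
print(stitched)
```

A longer overlap gives a smoother transition at the cost of blending more of the two composites together.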

  16. Conclusion
      Thank You!
      http://networks.cs.northwestern.edu/~aaron/fusion.html

