
Single-Channel Audio Source Separation based on Probabilistic Latent Variable Decomposition



Presentation Transcript


  1. Single-Channel Audio Source Separation based on Probabilistic Latent Variable Decomposition
     Ph.D. student Jeongsoo Park

  2. Introduction
     [Figure labels: Spectral basis; Weight; Desired signal; Reconstruction; Spectrogram; Residuals; Background signal]
     • Signal separation from mixed single-channel recordings
     • Time-frequency components of the desired signal are selected from the mixed signal and used to reconstruct the desired signal
     • Blind Source Separation (BSS): no information about the desired source is available
     • Informed Source Separation (ISS): spectral and temporal information about the desired source is available
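The selection and reconstruction step on the Introduction slide can be illustrated with a ratio mask. This is only a minimal sketch, not the presenter's method: it assumes that magnitude estimates of the desired and background signals are already available (for example from a decomposition into spectral bases and weights), and the function and argument names are hypothetical.

```python
import numpy as np

def soft_mask_separate(mixture_mag, desired_est, background_est, eps=1e-12):
    """Split a mixture magnitude spectrogram using a ratio (soft) mask.

    All arguments are non-negative arrays of shape (n_freq, n_time).
    desired_est and background_est are assumed model estimates.
    """
    # Fraction of each time-frequency cell attributed to the desired signal.
    mask = desired_est / (desired_est + background_est + eps)
    desired = mask * mixture_mag            # selected components
    residual = (1.0 - mask) * mixture_mag   # what is left: background/residuals
    return desired, residual
```

By construction the two outputs add up to the mixture, so no energy is created or lost by the mask.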

  3. Conventional approaches to BSS (1/5)
     • Non-negative Matrix Factorization (NMF)
     • A group of algorithms in multivariate analysis and linear algebra in which a matrix V is factorized into (usually) two matrices, W and H
     • NMF minimizes the error between V and WH while restricting W and H to be entry-wise non-negative
     • Two commonly used cost functions (Lee & Seung, 2001)
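The slide's formulas did not survive in the transcript. The two cost functions given by Lee & Seung (2001) are the squared Euclidean distance and a generalized Kullback-Leibler divergence between V and WH. As a minimal sketch, the multiplicative updates for the Euclidean cost can be written as follows; the function name, the iteration count, and the small constant `eps` are illustrative assumptions.

```python
import numpy as np

def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Factorize a non-negative matrix V (n_freq x n_time) as W @ H.

    Uses multiplicative updates for the squared Euclidean cost ||V - WH||^2.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_time = V.shape
    # Random positive initialization keeps every entry non-negative,
    # because the updates below only multiply by non-negative ratios.
    W = rng.random((n_freq, rank)) + eps
    H = rng.random((rank, n_time)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)   # update weights
        W *= (V @ H.T) / (W @ H @ H.T + eps)   # update spectral bases
    return W, H
```

For a magnitude spectrogram, the columns of `W` play the role of spectral bases and the rows of `H` the role of their time-varying weights.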

  4. Conventional approaches to BSS (2/5)
     • Independent Component Analysis (ICA)
     • Separates a multivariate signal into additive subcomponents, assuming that the source signals are non-Gaussian and mutually statistically independent
     • When the independence assumption holds, blind ICA separation of a mixed signal gives very good results
     • Definitions of independence for ICA
     • Minimization of Mutual Information (MMI)
     • Maximization of non-Gaussianity
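The "maximization of non-Gaussianity" criterion can be sketched for the simplest case of two mixtures and two sources. The example below is an illustration under stated assumptions, not a production ICA: it whitens the mixtures and then searches rotation angles by brute force for the one that maximizes the absolute excess kurtosis of the outputs. Kurtosis as the non-Gaussianity measure, the grid search, and all function names are choices made for this sketch.

```python
import numpy as np

def excess_kurtosis(x):
    """Excess kurtosis of a 1-D signal; close to 0 for Gaussian data."""
    x = (x - x.mean()) / x.std()
    return np.mean(x ** 4) - 3.0

def whiten(X):
    """Decorrelate the rows of X (n_channels x n_samples) to unit variance."""
    X = X - X.mean(axis=1, keepdims=True)
    cov = X @ X.T / X.shape[1]
    eigval, eigvec = np.linalg.eigh(cov)
    # D^{-1/2} E^T X, with E the eigenvectors and D the eigenvalues of cov.
    return (eigvec / np.sqrt(eigval)).T @ X

def ica_2d(X, n_angles=180):
    """Two-channel ICA: pick the rotation with the most non-Gaussian outputs."""
    Z = whiten(X)
    best_score, best_Y = -np.inf, None
    for theta in np.linspace(0.0, np.pi / 2, n_angles, endpoint=False):
        c, s = np.cos(theta), np.sin(theta)
        Y = np.array([[c, -s], [s, c]]) @ Z
        score = sum(abs(excess_kurtosis(y)) for y in Y)
        if score > best_score:
            best_score, best_Y = score, Y
    return best_Y
```

The recovered components match the sources only up to order, sign, and scale, which is the usual ambiguity of ICA. Note that this needs as many mixtures as sources, which is why it does not apply directly to single-channel recordings.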

  5. Conventional approaches to BSS (3/5)
     [Figure: Source separation from sound mixtures.
      Training: "I don't know who you are. I don't know what you want. If you are looking for ransom, I can tell you I don't have money. But what I do have are a very particular set of skills; skills I have acquired over a very long career. Skills that make me a nightmare for people like you. If you let my daughter go now, that'll be the end of it. I will not look for you, I will not pursue you. But if you don't, I will look for you, I will find you, and I will kill you. I love my daughter."
      Result: "I love my daughter. I hate North Korea."]
     • Probabilistic Latent Component Analysis (PLCA)
     • If we have sufficient information about a source (speaker), we can extract the signal of that source from sound mixtures
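The train-then-separate idea on this slide can be sketched as follows. This is an assumption-laden illustration rather than the presenter's implementation: it uses NMF with the generalized KL divergence (a factorization closely related to PLCA) in place of PLCA itself, learns spectral bases from a training spectrogram of the known speaker, keeps them fixed while fitting the mixture with a few extra free bases, and extracts the speaker with a ratio mask. All function names and the default numbers of bases are hypothetical.

```python
import numpy as np

def kl_nmf(V, rank, W_fixed=None, n_iter=200, eps=1e-9, seed=0):
    """KL-divergence NMF. Columns of W_fixed stay constant; `rank` extra bases are learned."""
    rng = np.random.default_rng(seed)
    n_freq, n_time = V.shape
    n_fixed = 0 if W_fixed is None else W_fixed.shape[1]
    W = rng.random((n_freq, n_fixed + rank)) + eps
    if n_fixed:
        W[:, :n_fixed] = W_fixed
    H = rng.random((n_fixed + rank, n_time)) + eps
    ones = np.ones_like(V)
    for _ in range(n_iter):
        H *= (W.T @ (V / (W @ H + eps))) / (W.T @ ones + eps)
        W_new = W * ((V / (W @ H + eps)) @ H.T) / (ones @ H.T + eps)
        W[:, n_fixed:] = W_new[:, n_fixed:]   # only the free bases are updated
    return W, H

def separate_known_source(V_train, V_mix, n_source=4, n_other=4):
    """Estimate the known source's magnitude spectrogram inside the mixture V_mix."""
    W_src, _ = kl_nmf(V_train, n_source)            # training: learn speaker bases
    W, H = kl_nmf(V_mix, n_other, W_fixed=W_src)    # mixture: speaker bases fixed
    src_est = W[:, :n_source] @ H[:n_source]
    total = W @ H + 1e-12
    return V_mix * src_est / total                  # ratio mask applied to the mixture
```

How well this works depends on how distinct the speaker's bases are from the other sources; the free bases can absorb part of the speaker's energy when the spectra overlap.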

  6. Conventional approaches to BSS (4/5)
     [Figure labels: Frequency distribution; How they appear in time; Weight; Probability mass function; z1; z2]
     • Probabilistic Latent Component Analysis (PLCA)
     • Interprets the time-frequency representation of an audio signal (spectrogram) as a 2D histogram (the outcomes of a discrete random process)
     • A vector is interpreted as a weighted sum of the latent variables' distributions
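Read as a histogram, the spectrogram can be modeled as P(f, t) = Σ_z P(z) P(f|z) P(t|z), which matches the figure labels: P(f|z) is a frequency distribution, P(t|z) describes how it appears in time, and P(z) is its weight. The following is a minimal EM sketch of this symmetric two-dimensional model; the slide does not state which PLCA variant is used, so the model form, the function name, and the parameters are assumptions.

```python
import numpy as np

def plca(V, n_components, n_iter=100, eps=1e-12, seed=0):
    """Fit P(f,t) = sum_z P(z) P(f|z) P(t|z) to a non-negative spectrogram V by EM."""
    rng = np.random.default_rng(seed)
    n_freq, n_time = V.shape
    Pz = np.full(n_components, 1.0 / n_components)
    Pf_z = rng.random((n_freq, n_components))
    Pf_z /= Pf_z.sum(axis=0, keepdims=True)
    Pt_z = rng.random((n_time, n_components))
    Pt_z /= Pt_z.sum(axis=0, keepdims=True)
    for _ in range(n_iter):
        # E-step: posterior P(z | f, t), array of shape (n_freq, n_time, n_components).
        joint = Pz[None, None, :] * Pf_z[:, None, :] * Pt_z[None, :, :]
        post = joint / (joint.sum(axis=2, keepdims=True) + eps)
        # M-step: re-estimate each distribution from the posterior-weighted counts.
        weighted = V[:, :, None] * post
        Pf_z = weighted.sum(axis=1)      # unnormalized P(f|z)
        Pt_z = weighted.sum(axis=0)      # unnormalized P(t|z)
        mass = Pf_z.sum(axis=0)          # total count assigned to each z
        Pf_z = Pf_z / (mass + eps)
        Pt_z = Pt_z / (mass + eps)
        Pz = mass / (mass.sum() + eps)
    return Pz, Pf_z, Pt_z
```

The fitted model can be turned back into a spectrogram-sized array with `V.sum() * (Pf_z * Pz) @ Pt_z.T`.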

  7. Conventional approaches to BSS (5/5)
     • Performance evaluation

  8. Conventional approaches to ISS (1/2)
     [Figure labels: User-guided signal learning; Mixed signal separation]
     • Separation by Humming (Smaragdis et al., 2009)
     • A user-guided signal is supplied to indicate the desired signal

  9. Conventional approaches to ISS (2/2)
     • Performance evaluation

  10. Ideas (1/2)
     [Figure: Time axis vs. Frequency axis, from 0 Hz (labeled Dense) to 5513 Hz (labeled Sparse)]
     • Goal
     • Speeding up the EM algorithm
     • Approach
     • 1. Sparsity of high-frequency components
     • 2. Application of Zwicker's model
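The slide does not say how Zwicker's model would be applied. One plausible reading, offered here only as an assumption, is to pool linear frequency bins into critical (Bark) bands so that high frequencies are represented more coarsely than low ones, which shrinks the matrices that EM has to process. The sketch below uses the Zwicker and Terhardt approximation of the Bark scale; the function names and the linear bin spacing up to `f_max_hz` are hypothetical.

```python
import numpy as np

def hz_to_bark(f_hz):
    """Zwicker and Terhardt approximation of the Bark (critical-band) scale."""
    f_hz = np.asarray(f_hz, dtype=float)
    return 13.0 * np.arctan(0.00076 * f_hz) + 3.5 * np.arctan((f_hz / 7500.0) ** 2)

def pool_to_bark_bands(spec, f_max_hz):
    """Sum the rows of spec (n_freq x n_time) that fall in the same Bark band.

    Row i is assumed to be centered at i * f_max_hz / (n_freq - 1) Hz.
    """
    n_freq = spec.shape[0]
    freqs = np.linspace(0.0, f_max_hz, n_freq)
    band = np.floor(hz_to_bark(freqs)).astype(int)   # band index of each bin
    pooled = np.zeros((band.max() + 1, spec.shape[1]))
    np.add.at(pooled, band, spec)                    # accumulate bins per band
    return pooled, band
```

With 513 bins up to the 5513 Hz shown in the figure, this reduces the frequency dimension to 20 bands, with far more bins merged per band at the top of the range than at the bottom.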

  11. Ideas (2/2)
     [Figure labels: tap (×8); Formant extraction; Rearrange]
     • Goal
     • Tapping-based ISS
     • Approach
     • Extracting formants based on tapping information
     • Using formant and temporal information, we might be able to extract the desired source
