Single-Channel Audio Source Separation Based on Probabilistic Latent Variable Decomposition
Ph.D. Student Jeongsoo Park
Introduction
• Signal separation from mixed single-channel recordings
• Time-frequency components of the desired signal are selected from the mixed signal and used to reconstruct it
• Blind Source Separation (BSS): no information about the desired source is available
• Informed Source Separation (ISS): spectral and temporal information about the desired source is available
[Figure labels: Spectral basis, Weight, Desired signal, Reconstruction, Spectrogram, Residuals, Background signal]
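The select-and-reconstruct idea in the bullets above can be illustrated with a toy time-frequency mask. This is a minimal sketch, not the method of the slides: it assumes non-overlapping rectangular frames instead of a proper overlapping STFT, and the function name `tf_mask_separate`, the tone frequencies, and the 800 Hz cutoff are all hypothetical choices made so that the example is exact.

```python
import numpy as np

def tf_mask_separate(x, frame_len, mask):
    """Keep only the masked time-frequency components of x and resynthesize.

    Assumption: non-overlapping rectangular frames (a real system would
    normally use an overlapping, windowed STFT).
    mask: array broadcastable to (n_frames, frame_len // 2 + 1), values in [0, 1].
    """
    n_frames = len(x) // frame_len
    frames = x[: n_frames * frame_len].reshape(n_frames, frame_len)
    spec = np.fft.rfft(frames, axis=1)            # one spectrum per frame
    kept = spec * mask                            # select desired components
    return np.fft.irfft(kept, n=frame_len, axis=1).reshape(-1)

# Toy mixture: a 250 Hz "desired" tone plus a 1500 Hz "background" tone.
fs, frame_len, n = 8000, 256, 8192
t = np.arange(n) / fs
desired = np.sin(2 * np.pi * 250 * t)
background = 0.5 * np.sin(2 * np.pi * 1500 * t)
mixture = desired + background

# Hypothetical mask: keep every bin below 800 Hz in every frame.
freqs = np.fft.rfftfreq(frame_len, d=1 / fs)
mask = (freqs < 800).astype(float)
estimate = tf_mask_separate(mixture, frame_len, mask)
```

Both tones fall exactly on FFT bins here, so the mask removes the background completely. With real audio the mask has to be estimated, which is what the BSS and ISS methods on the following slides address.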
Conventional approaches on BSS (1/5)
• Non-negative Matrix Factorization (NMF)
  • A group of algorithms in multivariate analysis and linear algebra in which a matrix V is factorized into (usually) two matrices, W and H
  • NMF minimizes the error between V and WH while restricting W and H to be entry-wise non-negative
  • Two commonly used cost functions (Lee & Seung, 2001)
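The formulas for the two cost functions did not survive on this slide. In Lee & Seung (2001) they are the squared Euclidean distance, ||V − WH||², and the generalized Kullback-Leibler divergence, D(V‖WH) = Σ (V log(V / WH) − V + WH), each minimized with multiplicative updates. The following is a minimal sketch of those updates; the function name `nmf`, the random initialization, the iteration count, and the small constant `eps` are illustrative assumptions.

```python
import numpy as np

def nmf(V, rank, n_iter=200, cost="kl", seed=0, eps=1e-12):
    """Multiplicative-update NMF, V (F x T) ~= W (F x rank) @ H (rank x T).

    cost="euclidean" minimizes ||V - WH||^2,
    cost="kl" minimizes the generalized KL divergence D(V || WH).
    eps is an assumed guard against division by zero.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_time = V.shape
    W = rng.random((n_freq, rank)) + eps
    H = rng.random((rank, n_time)) + eps
    for _ in range(n_iter):
        if cost == "euclidean":
            H *= (W.T @ V) / (W.T @ W @ H + eps)
            W *= (V @ H.T) / (W @ H @ H.T + eps)
        else:
            ratio = V / (W @ H + eps)
            H *= (W.T @ ratio) / (W.sum(axis=0)[:, None] + eps)
            ratio = V / (W @ H + eps)
            W *= (ratio @ H.T) / (H.sum(axis=1)[None, :] + eps)
    return W, H
```

Because every update multiplies by a non-negative ratio, W and H stay non-negative as long as they start that way. For audio, V is typically a magnitude or power spectrogram, the columns of W act as spectral bases, and the rows of H act as their weights over time.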
Conventional approaches on BSS (2/5)
• Independent Component Analysis (ICA)
  • Separates a multivariate signal into additive subcomponents, assuming that the source signals are non-Gaussian and mutually statistically independent
  • When the independence assumption holds, blind ICA separation of a mixed signal gives very good results
• Definitions of independence for ICA
  • Minimization of Mutual Information (MMI)
  • Maximization of non-Gaussianity
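The "maximization of non-Gaussianity" criterion can be sketched with a FastICA-style fixed-point iteration. This is an illustration under stated assumptions, not the slides' implementation: it uses a tanh nonlinearity with symmetric decorrelation, it needs as many mixture channels as sources (so it is a multi-channel toy rather than a single-channel method), and the names `fast_ica` and `_inv_sqrt` as well as the toy sources and mixing matrix are hypothetical.

```python
import numpy as np

def _inv_sqrt(C):
    """Return C^(-1/2) for a symmetric positive-definite matrix C."""
    d, E = np.linalg.eigh(C)
    return (E / np.sqrt(d)) @ E.T

def fast_ica(X, n_iter=200, seed=0):
    """X: (n_channels, n_samples) mixtures. Returns estimated sources (same shape).

    Sources are recovered only up to permutation, sign, and scale.
    """
    X = X - X.mean(axis=1, keepdims=True)
    Z = _inv_sqrt(X @ X.T / X.shape[1]) @ X        # whitening: cov(Z) = I
    rng = np.random.default_rng(seed)
    n = Z.shape[0]
    W = rng.standard_normal((n, n))
    W = _inv_sqrt(W @ W.T) @ W                     # start with orthonormal rows
    for _ in range(n_iter):
        G = np.tanh(W @ Z)                         # g(y)
        g_prime_mean = (1.0 - G ** 2).mean(axis=1) # E[g'(y)] per component
        W = G @ Z.T / Z.shape[1] - np.diag(g_prime_mean) @ W
        W = _inv_sqrt(W @ W.T) @ W                 # symmetric decorrelation
    return W @ Z

# Toy example: two independent non-Gaussian sources, two mixtures.
rng = np.random.default_rng(1)
S = np.vstack([rng.uniform(-1, 1, 5000), rng.laplace(size=5000)])
A = np.array([[1.0, 0.5], [0.3, 1.0]])             # assumed mixing matrix
Y = fast_ica(A @ S)
# Absolute correlation between each true source and each estimate.
corr = np.abs(np.corrcoef(np.vstack([S, Y]))[:2, 2:])
```

Each row of `corr` should contain one entry close to 1, showing that every source is matched by one of the estimates.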
Conventional approaches on BSS (3/5)
• Probabilistic Latent Component Analysis (PLCA)
  • If we have sufficient information about a source (speaker), we can extract the signal of that source from sound mixtures
[Figure: Source separation from sound mixtures. Training panel text: "I don't know who you are. I don't know what you want. If you are looking for ransom, I can tell you I don't have money. But what I do have are a very particular set of skills; skills I have acquired over a very long career. Skills that make me a nightmare for people like you. If you let my daughter go now, that'll be the end of it. I will not look for you, I will not pursue you. But if you don't, I will look for you, I will find you, and I will kill you." Remaining panel text (one panel is labeled "Result"): "I love my daughter." and "I hate North Korea."]
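The train-then-extract idea on this slide can be sketched as follows. This is a minimal sketch under assumptions, not the slides' implementation: it uses KL-divergence NMF updates (which are closely related to PLCA's EM updates) instead of an explicit probabilistic model, it operates on magnitude spectrograms only, and the names `kl_nmf` and `separate` and the component counts are hypothetical.

```python
import numpy as np

def kl_nmf(V, W, H, n_iter, update_cols, eps=1e-12):
    """KL multiplicative updates; only the columns W[:, update_cols] are learned."""
    for _ in range(n_iter):
        ratio = V / (W @ H + eps)
        H *= (W.T @ ratio) / (W.sum(axis=0)[:, None] + eps)
        ratio = V / (W @ H + eps)
        W_new = W * (ratio @ H.T) / (H.sum(axis=1)[None, :] + eps)
        W[:, update_cols] = W_new[:, update_cols]
    return W, H

def separate(V_train, V_mix, n_target=4, n_other=4, n_iter=200, seed=0):
    """Estimate the target's magnitude spectrogram inside the mixture V_mix.

    V_train: spectrogram of the target source alone (the training material).
    """
    rng = np.random.default_rng(seed)
    n_freq = V_mix.shape[0]
    # 1) Learn spectral bases of the target from the training material.
    W_t = rng.random((n_freq, n_target))
    H_t = rng.random((n_target, V_train.shape[1]))
    W_t, _ = kl_nmf(V_train, W_t, H_t, n_iter, np.arange(n_target))
    # 2) Explain the mixture with the fixed target bases plus free bases
    #    that absorb everything else.
    W = np.hstack([W_t, rng.random((n_freq, n_other))])
    H = rng.random((n_target + n_other, V_mix.shape[1]))
    W, H = kl_nmf(V_mix, W, H, n_iter, np.arange(n_target, n_target + n_other))
    # 3) Soft mask: share of each time-frequency cell explained by the target.
    target_model = W[:, :n_target] @ H[:n_target]
    mask = target_model / (W @ H + 1e-12)
    return mask * V_mix
```

To obtain audio, the masked magnitudes would be combined with the mixture phase and inverted; that step is omitted here.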
Conventional approaches on BSS (4/5)
• Probabilistic Latent Component Analysis (PLCA)
  • Interprets the time-frequency representation of an audio signal (spectrogram) as a 2D histogram (outcomes of a discrete random process)
  • A vector is interpreted as a weighted sum of the latent variables' distributions
[Figure labels: Frequency distribution, How they appear in time, Weight, Probability mass function, z1, z2]
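Reading the figure labels as the three factors of a symmetric PLCA model, the spectrogram is modeled as P(f, t) = Σ_z P(z) P(f|z) P(t|z), with P(f|z) the frequency distribution of component z, P(t|z) how it appears in time, and P(z) its weight. That reading is an assumption, since the slide's equations are not preserved. A minimal EM sketch under that assumption (the name `plca`, the initialization, and `eps` are illustrative):

```python
import numpy as np

def plca(V, n_z, n_iter=100, seed=0, eps=1e-12):
    """Symmetric PLCA by EM on a non-negative spectrogram V (F x T).

    Returns Pz (n_z,), Pf_z (F x n_z), Pt_z (T x n_z); each distribution sums to 1.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_time = V.shape
    Pz = np.full(n_z, 1.0 / n_z)
    Pf_z = rng.random((n_freq, n_z))
    Pf_z /= Pf_z.sum(axis=0)
    Pt_z = rng.random((n_time, n_z))
    Pt_z /= Pt_z.sum(axis=0)
    for _ in range(n_iter):
        # E-step: posterior P(z | f, t), shape (F, T, n_z).
        joint = Pz[None, None, :] * Pf_z[:, None, :] * Pt_z[None, :, :]
        post = joint / (joint.sum(axis=2, keepdims=True) + eps)
        # M-step: re-estimate each factor from histogram counts weighted by the posterior.
        weighted = V[:, :, None] * post
        Pf_z = weighted.sum(axis=1)               # (F, n_z), unnormalized
        Pt_z = weighted.sum(axis=0)               # (T, n_z), unnormalized
        mass = Pf_z.sum(axis=0)                   # total count assigned to each z
        Pf_z = Pf_z / (mass + eps)
        Pt_z = Pt_z / (mass + eps)
        Pz = mass / mass.sum()
    return Pz, Pf_z, Pt_z
```

The model spectrogram is `(Pf_z * Pz) @ Pt_z.T`, which approximates `V / V.sum()`. The (F, T, n_z) posterior array makes this sketch memory-hungry for long recordings.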
Conventional approaches on BSS (5/5)
• Performance evaluation
Conventional approaches on ISS (1/2)
• Separation by Humming (Smaragdis et al., 2009)
  • A user-provided guide signal (e.g., humming) is given to inform the separation about the desired signal
[Figure labels: User-guided signal learning, Mixed signal separation]
Conventional approaches on ISS (2/2)
• Performance evaluation
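Ideas (1/2)
• Goal
  • Speeding up the EM algorithm
• Approach
  • 1. Sparsity of high-frequency components
  • 2. Application of Zwicker’s model
[Figure: time-frequency grid; horizontal axis Time, vertical axis Frequency from 0 Hz to 5513 Hz; labeled Dense near 0 Hz and Sparse near 5513 Hz]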
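One way to read this idea is to give the spectrogram fewer rows at high frequencies, so that each EM iteration touches less data. The slide does not say how Zwicker's model is applied, so the following is only a guess: it assumes the model refers to critical bands on the Bark scale, uses the Zwicker and Terhardt approximation z = 13 arctan(0.00076 f) + 3.5 arctan((f / 7500)²), and pools FFT bins that fall into the same band. The names `hz_to_bark` and `bark_pooling_matrix`, the parameter `bands_per_bark`, and the 513-bin resolution are hypothetical; 5512.5 Hz is assumed from the 5513 Hz axis label.

```python
import numpy as np

def hz_to_bark(f_hz):
    """Zwicker & Terhardt approximation of the Bark (critical-band) scale."""
    f_hz = np.asarray(f_hz, dtype=float)
    return 13.0 * np.arctan(0.00076 * f_hz) + 3.5 * np.arctan((f_hz / 7500.0) ** 2)

def bark_pooling_matrix(n_bins, f_max_hz, bands_per_bark=1.0):
    """Return P (n_bands x n_bins) that sums FFT bins within each Bark band.

    Low frequencies get many narrow bands (dense), high frequencies get
    few wide bands (sparse).
    """
    freqs = np.linspace(0.0, f_max_hz, n_bins)
    band = np.floor(hz_to_bark(freqs) * bands_per_bark).astype(int)
    P = np.zeros((band.max() + 1, n_bins))
    P[band, np.arange(n_bins)] = 1.0              # each bin belongs to one band
    return P

# Assumed setup: 513 FFT bins covering 0 Hz to 5512.5 Hz.
P = bark_pooling_matrix(513, 5512.5)
V = np.random.default_rng(0).random((513, 100))   # stand-in spectrogram
V_pooled = P @ V                                  # far fewer rows for EM to process
```

Running EM on `V_pooled` instead of `V` cuts the per-iteration cost roughly in proportion to the row count, at the price of coarser resolution at high frequencies. Raising `bands_per_bark` trades speed back for resolution.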
Ideas (2/2)
• Goal
  • Tapping-based ISS
• Approach
  • Extracting formants based on tapping information
  • Using formant and temporal information, we might be able to extract the desired source
[Figure: a sequence of eight taps ("tap tap tap tap tap tap tap tap"), followed by steps labeled Formant extraction and Rearrange]