This paper proposes a basis training algorithm for discriminative non-negative matrix factorization (NMF) with applications to single-channel audio source separation. In the NMF-based approach to supervised audio source separation, NMF is first applied to training examples to learn the basis spectra of each source; at test time, it is applied to the spectrogram of a mixture signal with the pretrained basis spectra held fixed. The source signals can then be separated out using a Wiener filter. A typical way to train the basis spectra is to minimize a dissimilarity measure between the observed spectrogram and the NMF model. However, basis spectra obtained in this way do not ensure that the separated signals will be optimal at test time, because the objective function used for training is inconsistent with the one used for separation (Wiener filtering). To address this mismatch, a framework called discriminative NMF (DNMF) has recently been proposed. While this framework is noteworthy in that it uses a common objective function for training and separation, its objective function is analytically more complex than that of regular NMF. In the original DNMF work, a multiplicative update algorithm was proposed for the basis training; however, that algorithm is not guaranteed to converge and can be very slow. To overcome this weakness, this paper proposes a convergence-guaranteed algorithm for DNMF based on the majorization-minimization principle. Experimental results show that the proposed algorithm outperforms the conventional DNMF algorithm as well as the regular NMF algorithm in terms of both the signal-to-distortion and signal-to-interference ratios.
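The conventional (non-discriminative) pipeline summarized above can be illustrated with a minimal sketch. This is not the proposed majorization-minimization algorithm, nor the original DNMF update. It assumes a Euclidean dissimilarity measure with the standard Lee-Seung multiplicative updates, a Wiener-like ratio mask applied directly to the mixture spectrogram, and tiny made-up rank-1 "spectrograms"; the function names `nmf` and `separate` and all parameter values are hypothetical.

```python
import random

def matmul(A, B):
    """Plain-Python product of two matrices stored as lists of rows."""
    Bt = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def nmf(V, n_bases=None, W=None, n_iter=200, seed=0, eps=1e-12):
    """Euclidean NMF V ~ W H with multiplicative updates (assumed cost).
    If W is given, it is held fixed and only H is estimated (test time)."""
    rng = random.Random(seed)
    F, T = len(V), len(V[0])
    train_W = W is None
    if train_W:
        W = [[rng.random() + 0.1 for _ in range(n_bases)] for _ in range(F)]
    K = len(W[0])
    H = [[rng.random() + 0.1 for _ in range(T)] for _ in range(K)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H)
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[H[k][t] * num[k][t] / (den[k][t] + eps) for t in range(T)]
             for k in range(K)]
        if train_W:
            # W <- W * (V H^T) / (W H H^T)
            Ht = transpose(H)
            num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
            W = [[W[f][k] * num[f][k] / (den[f][k] + eps) for k in range(K)]
                 for f in range(F)]
    return W, H

def separate(V_mix, W1, W2, n_iter=200, eps=1e-12):
    """Fit activations with the pretrained bases held fixed, then split the
    mixture with a Wiener-like mask S_i / (S_1 + S_2)."""
    W = [r1 + r2 for r1, r2 in zip(W1, W2)]  # concatenate bases column-wise
    _, H = nmf(V_mix, W=W, n_iter=n_iter)
    K1 = len(W1[0])
    S1, S2 = matmul(W1, H[:K1]), matmul(W2, H[K1:])
    F, T = len(V_mix), len(V_mix[0])
    est1 = [[V_mix[f][t] * S1[f][t] / (S1[f][t] + S2[f][t] + eps)
             for t in range(T)] for f in range(F)]
    est2 = [[V_mix[f][t] * S2[f][t] / (S1[f][t] + S2[f][t] + eps)
             for t in range(T)] for f in range(F)]
    return est1, est2

# Toy training data (made up): source 1 is mostly low-frequency,
# source 2 mostly high-frequency; 4 frequency bins by 3 frames each.
V1 = [[a * b for b in (1.0, 2.0, 1.0)] for a in (4.0, 3.0, 0.2, 0.1)]
V2 = [[a * b for b in (2.0, 1.0, 1.5)] for a in (0.1, 0.2, 3.0, 4.0)]
V_mix = [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(V1, V2)]

W1, _ = nmf(V1, n_bases=1)           # train bases for source 1
W2, _ = nmf(V2, n_bases=1, seed=1)   # train bases for source 2
est1, est2 = separate(V_mix, W1, W2)
```

In this sketch the bases are trained only to reconstruct each source's own spectrogram, whereas the quantity that matters at test time is the masked output `est1`, `est2`. That gap between the training and separation objectives is the mismatch the discriminative formulation is meant to remove.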
ASJC Scopus subject areas
- General Computer Science