TY - GEN
T1 - Mel-Generalized cepstral regularization for discriminative non-negative matrix factorization
AU - Li, Li
AU - Kameoka, Hirokazu
AU - Makino, Shoji
N1 - Publisher Copyright: © 2017 IEEE.
PY - 2017/12/5
Y1 - 2017/12/5
N2 - The non-negative matrix factorization (NMF) approach has been shown to work reasonably well for monaural speech enhancement tasks. This paper proposes addressing two shortcomings of the original NMF approach: (1) the objective functions for basis training and separation (Wiener filtering) are inconsistent (the basis spectra are not trained so that the separated signal becomes optimal); (2) minimizing spectral divergence measures does not necessarily lead to an enhancement in the feature domain (e.g., the cepstral domain) or in terms of perceived quality. To address the first shortcoming, we have previously proposed an algorithm for Discriminative NMF (DNMF), which optimizes the same objective for basis training and separation. To address the second shortcoming, we have previously introduced novel frameworks called the cepstral distance regularized NMF (CDRNMF) and mel-generalized cepstral distance regularized NMF (MGCRNMF), which aim to enhance speech in both the spectral domain and the feature domain. This paper proposes combining the goals of DNMF and MGCRNMF by incorporating the MGC regularizer into the DNMF objective function and proposes an algorithm for parameter estimation. The experimental results revealed that the proposed method outperformed the baseline approaches.
KW - Discriminative non-negative matrix factorization
KW - Mel-generalized cepstral representation
KW - Single-channel
KW - Speech enhancement
UR - http://www.scopus.com/inward/record.url?scp=85042320831&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85042320831&partnerID=8YFLogxK
U2 - 10.1109/MLSP.2017.8168142
DO - 10.1109/MLSP.2017.8168142
M3 - Conference contribution
AN - SCOPUS:85042320831
T3 - IEEE International Workshop on Machine Learning for Signal Processing, MLSP
SP - 1
EP - 6
BT - 2017 IEEE International Workshop on Machine Learning for Signal Processing, MLSP 2017 - Proceedings
A2 - Ueda, Naonori
A2 - Chien, Jen-Tzung
A2 - Matsui, Tomoko
A2 - Larsen, Jan
A2 - Watanabe, Shinji
PB - IEEE Computer Society
T2 - 2017 IEEE International Workshop on Machine Learning for Signal Processing, MLSP 2017
Y2 - 25 September 2017 through 28 September 2017
ER -