Coupled initialization of multi-channel non-negative matrix factorization based on spatial and spectral information

Yuuki Tachioka, Tomohiro Narita, Iori Miura, Takanobu Uramoto, Natsuki Monta, Shingo Uenohara, Ken'ichi Furuya, Shinji Watanabe, Jonathan Le Roux

Research output: Conference article, peer-reviewed

8 citations (Scopus)

Abstract

Multi-channel non-negative matrix factorization (MNMF) is a multi-channel extension of NMF and often outperforms NMF because it can deal with spatial and spectral information simultaneously. On the other hand, MNMF has a larger number of parameters, and its performance depends heavily on the initial values. MNMF factorizes an observation matrix into four matrices: spatial correlation, basis, cluster-indicator latent variable, and activation matrices. This paper proposes effective initialization methods for these matrices. First, the spatial correlation matrix, which shows the largest dependence on initial values, is initialized using the cross-spectrum method from speech enhanced by binary masking. Second, when the target is speech, constructing bases from the phonemes present in an utterance can improve performance; this paper proposes a speech basis selection method that uses automatic speech recognition (ASR). Third, we also propose an initialization method for the cluster-indicator latent variables that couples the spatial and spectral information, which enables the simultaneous optimization of the above two matrices. Experiments on a noisy ASR task show that the proposed initialization significantly improves the performance of MNMF by reducing its dependence on initial values.
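
The first initialization step in the abstract (a spatial correlation matrix estimated from binary-masked speech) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `init_spatial_correlation`, the array layout (frequency, time, channel), the unit-trace normalization, and the small diagonal loading `eps` are all assumptions made for illustration, and the abstract does not specify how the binary mask is obtained.

```python
import numpy as np

def init_spatial_correlation(stft, mask, eps=1e-10):
    """Hypothetical helper: estimate one spatial correlation matrix per frequency bin.

    stft: complex array of shape (F, T, M), a multi-channel STFT (assumed layout).
    mask: binary array of shape (F, T), 1 where the target source dominates.
    Returns an array of shape (F, M, M) of Hermitian matrices with unit trace.
    """
    # Keep only the time-frequency bins assigned to the target by the mask.
    masked = stft * mask[..., None]
    # Cross-spectrum: sum of outer products x x^H over time, per frequency bin.
    cov = np.einsum("ftm,ftn->fmn", masked, masked.conj())
    # Average over the number of retained frames (at least 1 to avoid division by zero).
    count = np.maximum(mask.sum(axis=1), 1)
    cov = cov / count[:, None, None]
    # Assumed regularization: small diagonal loading so empty bins stay well defined.
    cov = cov + eps * np.eye(stft.shape[2])
    # Assumed normalization: scale each matrix to unit trace.
    trace = np.trace(cov, axis1=1, axis2=2).real
    return cov / trace[:, None, None]
```

With this layout, a bin whose mask is entirely zero falls back to a scaled identity matrix, which is one plausible neutral starting point rather than anything stated in the source.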

Original language: English
Pages (from-to): 2461-2465
Number of pages: 5
Journal: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
2017-August
DOI
Publication status: Published - 2017
Externally published: Yes
Event: 18th Annual Conference of the International Speech Communication Association, INTERSPEECH 2017 - Stockholm, Sweden
Duration: 2017-08-20 to 2017-08-24

ASJC Scopus subject areas

  • Language and Linguistics
  • Human-Computer Interaction
  • Signal Processing
  • Software
  • Modeling and Simulation
