SepNet: A deep separation matrix prediction network for multichannel audio source separation

Shota Inoue, Hirokazu Kameoka, Li Li, Shoji Makino

Research output: Conference article, peer-reviewed

Abstract

In this paper, we propose SepNet, a deep neural network (DNN) designed to predict separation matrices from multichannel observations. One well-known approach to blind source separation (BSS) involves independent component analysis (ICA). A recently developed method called independent low-rank matrix analysis (ILRMA) is one of its powerful variants. These methods allow the estimation of separation matrices based on deterministic iterative algorithms. Specifically, ILRMA is designed to update the separation matrix according to an update rule derived based on the majorization-minimization principle. Although ILRMA performs reasonably well under some conditions, there is still room for improvement in terms of both separation accuracy and computation time, especially for large-scale microphone arrays. The existence of a deterministic iterative algorithm that can find one of the stationary points of the BSS problem implies that a DNN can also play that role if designed and trained properly. Motivated by this, we propose introducing a DNN that learns to convert a predefined input (e.g., an identity matrix) into a true separation matrix in accordance with a multichannel observation. To enable it to find one of the multiple solutions corresponding to different permutations of the source indices, we further propose adopting a permutation invariant training strategy to train the network. By using a fully convolutional architecture, we can design the network so that the forward propagation can be computed efficiently. The experimental results revealed that SepNet was able to find separation matrices faster and with better separation accuracy than ILRMA for mixtures of two sources.
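The two ideas the abstract relies on, applying a separation matrix to a multichannel observation and scoring the result with a permutation-invariant criterion, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The time-frequency layout, the mean-squared-error criterion, and the function names `apply_separation` and `pit_loss` are all assumptions made for illustration; the abstract does not state which loss or tensor layout SepNet uses.

```python
import itertools
import numpy as np


def apply_separation(W, X):
    """Apply one separation matrix per frequency bin (assumed layout).

    W: (F, N, M) complex separation matrices (N sources, M microphones)
    X: (F, T, M) complex multichannel observation
    Returns Y: (F, T, N) separated signals.
    """
    return np.einsum("fnm,ftm->ftn", W, X)


def pit_loss(Y_est, Y_ref):
    """Permutation-invariant MSE (hypothetical criterion for illustration).

    Tries every ordering of the estimated sources and keeps the one
    with the lowest error against the reference sources.
    """
    n_src = Y_ref.shape[-1]
    best_loss, best_perm = np.inf, None
    for perm in itertools.permutations(range(n_src)):
        loss = np.mean(np.abs(Y_est[..., list(perm)] - Y_ref) ** 2)
        if loss < best_loss:
            best_loss, best_perm = loss, perm
    return best_loss, best_perm


# Toy determined mixture: 2 sources, 2 microphones, 4 frequency bins, 50 frames.
rng = np.random.default_rng(0)
F, T, N = 4, 50, 2
S = rng.standard_normal((F, T, N)) + 1j * rng.standard_normal((F, T, N))
A = rng.standard_normal((F, N, N)) + 1j * rng.standard_normal((F, N, N))
X = np.einsum("fmn,ftn->ftm", A, S)  # mixing: x = A s in each bin

# A valid separation matrix whose outputs come out in swapped source order.
W_swapped = np.linalg.inv(A)[:, ::-1, :]
Y = apply_separation(W_swapped, X)

loss, perm = pit_loss(Y, S)
print(perm, loss)
```

In this toy case the row-swapped inverse of the mixing matrix is as good a solution as the unswapped one. A fixed-order loss would penalize it, whereas the permutation-invariant loss treats both orderings as correct, which is the property the abstract appeals to when it mentions multiple solutions corresponding to different permutations of the source indices.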

Original language: English
Pages (from-to): 191-195
Number of pages: 5
Journal: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: 2021-June
Publication status: Published - 2021
Event: 2021 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2021 - Virtual, Toronto, Canada
Duration: 6 June 2021 to 11 June 2021

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering
