Underdetermined convolutive blind source separation via frequency bin-wise clustering and permutation alignment

Hiroshi Sawada*, Shoko Araki, Shoji Makino

*Corresponding author for this work

Research output: Article (peer-reviewed)

283 citations (Scopus)

Abstract

This paper presents a blind source separation method for convolutive mixtures of speech/audio sources. The method can even be applied to an underdetermined case where there are fewer microphones than sources. The separation operation is performed in the frequency domain and consists of two stages. In the first stage, frequency-domain mixture samples are clustered into each source by an expectation-maximization (EM) algorithm. Since the clustering is performed in a frequency bin-wise manner, the permutation ambiguities of the bin-wise clustered samples should be aligned. This is solved in the second stage by using the probability on how likely each sample belongs to the assigned class. This two-stage structure makes it possible to attain a good separation even under reverberant conditions. Experimental results for separating four speech signals with three microphones under reverberant conditions show the superiority of the new method over existing methods. We also report separation results for a benchmark data set and live recordings of speech mixtures.
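The second stage described in the abstract can be illustrated with a small sketch. It is not the authors' procedure: it assumes the first stage has already produced, for every frequency bin, the posterior probability of each class at each time frame, and it aligns bins by maximizing the correlation between those posterior sequences and a running centroid. The correlation criterion, the exhaustive search over permutations, and the names `make_toy_posteriors` and `align_permutations` are all assumptions made for illustration.

```python
import itertools
import numpy as np


def make_toy_posteriors(F=40, N=3, T=200, seed=0):
    """Build toy bin-wise posteriors with a random class order in every bin.

    Hypothetical helper: one source dominates each time frame, and every
    frequency bin sees a noisy, shuffled copy of that activity pattern.
    """
    rng = np.random.default_rng(seed)
    dominant = rng.integers(0, N, size=T)          # dominant source per frame
    clean = np.full((N, T), 0.1)
    clean[dominant, np.arange(T)] = 0.9
    post = np.empty((F, N, T))
    shuffles = np.empty((F, N), dtype=int)
    for f in range(F):
        noisy = clean + 0.05 * rng.random((N, T))
        noisy /= noisy.sum(axis=0, keepdims=True)  # columns sum to one
        shuffles[f] = rng.permutation(N)           # unknown bin-wise ordering
        post[f] = noisy[shuffles[f]]
    return post, shuffles


def align_permutations(post, n_iter=5):
    """Align class order across bins using posterior-sequence correlation.

    post: array of shape (F, N, T). Returns (aligned, perms), where
    aligned[f] = post[f, perms[f]].
    """
    F, N, T = post.shape
    candidates = [list(p) for p in itertools.permutations(range(N))]
    perms = np.tile(np.arange(N), (F, 1))
    aligned = post.copy()
    centroid = post[0].copy()                      # first bin as initial reference
    for _ in range(n_iter):
        changed = False
        for f in range(F):
            # c[i, k]: correlation of class i in bin f with centroid k
            c = np.corrcoef(post[f], centroid)[:N, N:]
            best = max(candidates,
                       key=lambda p: sum(c[p[k], k] for k in range(N)))
            if not np.array_equal(best, perms[f]):
                changed = True
            perms[f] = best
            aligned[f] = post[f, best]
        centroid = aligned.mean(axis=0)            # update reference sequences
        if not changed:
            break
    return aligned, perms
```

Calling `align_permutations(make_toy_posteriors()[0])` returns the reordered posteriors together with the permutation chosen for each bin. The exhaustive search is only practical for a handful of sources, since the number of candidate permutations grows factorially with `N`.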

Original language: English
Article number: 5473129
Pages (from-to): 516-527
Number of pages: 12
Journal: IEEE Transactions on Audio, Speech and Language Processing
Volume: 19
Issue number: 3
DOI
Publication status: Published - 2011
Externally published: Yes

ASJC Scopus subject areas

  • Acoustics and Ultrasonics
  • Electrical and Electronic Engineering
