Blind speech separation in a meeting situation with maximum SNR beamformers

Shoko Araki*, Hiroshi Sawada, Shoji Makino

*Corresponding author for this work

Research output: Conference contribution

62 Citations (Scopus)

Abstract

We propose a speech separation method for a meeting situation, where each speaker speaks only intermittently and the number of active speakers changes from moment to moment. Many source separation methods have already been proposed; however, they consider a case where all the speakers keep speaking, which is not always true in a real meeting. In such cases, in addition to separation, speech detection and the classification of the detected speech according to speaker become important issues. For that purpose, we propose a method that employs a maximum signal-to-noise ratio (MaxSNR) beamformer combined with a voice activity detector and online clustering. We also discuss the scaling ambiguity problem of the MaxSNR beamformer and provide solutions to it. We report some encouraging results for a real meeting in a room with a reverberation time of about 350 ms.
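As an illustration of the MaxSNR beamformer named in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes that, for a single frequency bin, covariance matrices for the target-active and target-inactive periods are already available (in the proposed method these would presumably come from the voice activity detector and clustering stages, which are not modelled here). The function name `max_snr_beamformer`, the three-microphone toy data, and the unit-noise-power normalization are all assumptions made for the example. The filter that maximizes the output SNR is the principal generalized eigenvector of the two covariance matrices, and it is determined only up to a complex scale, which is the scaling ambiguity the abstract refers to.

```python
import numpy as np


def max_snr_beamformer(R_target, R_noise):
    """Return (w, snr) where w maximizes (w^H R_target w) / (w^H R_noise w).

    R_target and R_noise are Hermitian covariance matrices for one
    frequency bin; R_noise must be positive definite.
    """
    # Whiten with the Cholesky factor of the noise covariance: R_noise = L L^H.
    L = np.linalg.cholesky(R_noise)
    L_inv = np.linalg.inv(L)
    # In whitened coordinates the generalized eigenproblem becomes an
    # ordinary Hermitian eigenproblem.
    A = L_inv @ R_target @ L_inv.conj().T
    A = (A + A.conj().T) / 2  # remove round-off asymmetry
    eigvals, eigvecs = np.linalg.eigh(A)  # eigenvalues in ascending order
    v = eigvecs[:, -1]  # principal eigenvector
    # Map back to the original coordinates. This particular scale gives
    # unit noise output power; it is an arbitrary choice for the sketch,
    # not the scaling solution discussed in the paper.
    w = L_inv.conj().T @ v
    return w, float(eigvals[-1])


if __name__ == "__main__":
    # Hypothetical toy data: one target direction, one interferer,
    # plus a small amount of sensor noise on three microphones.
    a = np.array([1.0, 1j, -1.0])   # assumed target steering vector
    b = np.array([1.0, 1.0, 1.0])   # assumed interferer steering vector
    R_target = np.outer(a, a.conj())
    R_noise = np.outer(b, b.conj()) + 0.1 * np.eye(3)
    w, snr = max_snr_beamformer(R_target, R_noise)
    print("filter:", w)
    print("output SNR:", snr)
```

Because any complex multiple of `w` attains the same SNR, a practical system needs an additional rule to fix the scale of each frequency bin before resynthesis; the normalization used above is only a placeholder for that step.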

Original language: English
Title of host publication: 2007 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '07
Pages: I41-I44
DOI
Publication status: Published - 2007
Externally published: Yes
Event: 2007 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '07 - Honolulu, HI, United States
Duration: 15 Apr 2007 → 20 Apr 2007

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
1
ISSN (Print): 1520-6149

Conference

Conference: 2007 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '07
Country/Territory: United States
City: Honolulu, HI
Period: 07/4/15 → 07/4/20

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering
