Blocked Gibbs sampling based multi-scale mixture model for speaker clustering on noisy data

Naohiro Tawara, Tetsuji Ogawa, Shinji Watanabe, Atsushi Nakamura, Tetsunori Kobayashi

Research output: Conference contribution

1 Citation (Scopus)

Abstract

A novel sampling method is proposed for estimating a continuous multi-scale mixture model. The multi-scale mixture models we assume have a hierarchical structure in which each component of the mixture is represented by a Gaussian mixture model (GMM). In speaker modeling from speech, each GMM represents intra-speaker dynamics arising from differences in attributes such as phoneme context and the presence of non-stationary noise, while the mixture of GMMs (MoGMMs) represents inter-speaker dynamics arising from differences between speakers. Gibbs sampling is a powerful technique for estimating such hierarchically structured models, but depending on how it is applied it can easily become trapped in local optima, especially when the elemental GMMs have a complex structure. To solve this problem, a highly accurate and robust sampling method based on blocked Gibbs sampling and iterated conditional modes (ICM) is proposed and applied to reduce the singular solutions that arise in models with complex multi-modal distributions. In speaker clustering experiments under non-stationary noise, the proposed sampling-based model estimation improved clustering performance by 17% on average compared with conventional sampling-based methods.
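
The two-level structure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: it assumes fixed model parameters, diagonal covariances, and one common way of blocking, in which the utterance-level speaker label is drawn with the frame-level component labels summed out and the frame-level labels are then drawn given that speaker. All function and variable names here are hypothetical.

```python
import numpy as np

def log_sum_exp(a, axis):
    """Numerically stable log(sum(exp(a))) along one axis."""
    m = np.max(a, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(a - m), axis=axis))

def frame_component_log_joint(X, weights, means, variances):
    """log(w_k * N(x_t | mu_k, diag(var_k))) for every frame t and component k.

    X: (T, D) frames; weights: (K,); means, variances: (K, D). Returns (T, K).
    Diagonal covariances are an assumption of this sketch.
    """
    diff = X[:, None, :] - means[None, :, :]
    log_norm = -0.5 * np.sum(np.log(2.0 * np.pi * variances), axis=1)
    log_quad = -0.5 * np.sum(diff ** 2 / variances[None, :, :], axis=2)
    return np.log(weights)[None, :] + log_norm[None, :] + log_quad

def speaker_log_posterior(X, speaker_weights, gmms):
    """log p(z = s | X) for one utterance, with frame-level labels summed out."""
    scores = np.array([
        np.log(pi_s) + np.sum(log_sum_exp(frame_component_log_joint(X, *gmm), axis=1))
        for pi_s, gmm in zip(speaker_weights, gmms)
    ])
    return scores - log_sum_exp(scores, axis=0)

def blocked_draw(rng, X, speaker_weights, gmms):
    """Draw the speaker label z and all frame-level labels v as one block."""
    log_post = speaker_log_posterior(X, speaker_weights, gmms)
    z = rng.choice(len(gmms), p=np.exp(log_post))
    # Given the speaker, frame-level component labels are independent.
    log_joint = frame_component_log_joint(X, *gmms[z])
    resp = np.exp(log_joint - log_sum_exp(log_joint, axis=1)[:, None])
    v = np.array([rng.choice(resp.shape[1], p=r) for r in resp])
    return z, v

def icm_update(X, speaker_weights, gmms):
    """ICM-style deterministic update: pick the most probable speaker label."""
    return int(np.argmax(speaker_log_posterior(X, speaker_weights, gmms)))
```

In this sketch each entry of `gmms` is a `(weights, means, variances)` tuple for one speaker, and `rng` is a `numpy.random.Generator`. A full sampler would also resample the parameters, which is omitted here.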

Original language: English
Title of host publication: 2013 IEEE International Workshop on Machine Learning for Signal Processing - Proceedings of MLSP 2013
DOI: https://doi.org/10.1109/MLSP.2013.6661902
Publication status: Published - Dec 1, 2013
Event: 2013 16th IEEE International Workshop on Machine Learning for Signal Processing, MLSP 2013 - Southampton, United Kingdom
Duration: Sep 22, 2013 to Sep 25, 2013

Publication series

Name: IEEE International Workshop on Machine Learning for Signal Processing, MLSP
ISSN (Print): 2161-0363
ISSN (Electronic): 2161-0371

Conference

Conference: 2013 16th IEEE International Workshop on Machine Learning for Signal Processing, MLSP 2013
Country: United Kingdom
City: Southampton
Period: Sep 22, 2013 to Sep 25, 2013

ASJC Scopus subject areas

  • Human-Computer Interaction
  • Signal Processing

  • Cite this

    Tawara, N., Ogawa, T., Watanabe, S., Nakamura, A., & Kobayashi, T. (2013). Blocked Gibbs sampling based multi-scale mixture model for speaker clustering on noisy data. In 2013 IEEE International Workshop on Machine Learning for Signal Processing - Proceedings of MLSP 2013 [6661902] (IEEE International Workshop on Machine Learning for Signal Processing, MLSP). https://doi.org/10.1109/MLSP.2013.6661902