Fully Bayesian speaker clustering based on hierarchically structured utterance-oriented Dirichlet process mixture model

Naohiro Tawara, Tetsuji Ogawa, Shinji Watanabe, Atsushi Nakamura, Tetsunori Kobayashi

Research output: Conference contribution

2 Citations (Scopus)

Abstract

We have proposed a novel speaker clustering method based on a hierarchically structured utterance-oriented Dirichlet process mixture model. In the proposed method, the number of speakers can be determined from the given data in a nonparametric Bayesian manner, and intra-speaker variability is successfully handled by multi-scale mixture modeling. Experimental results showed that the proposed method is computationally efficient and effective in speaker clustering. The proposed method significantly improves the accuracy of speaker clustering compared with the conventional method, particularly when the number of utterances varies from speaker to speaker.

Original language: English
Title of host publication: 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012
Pages: 2163-2166
Number of pages: 4
Publication status: Published - Dec 1 2012
Event: 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012 - Portland, OR, United States
Duration: Sep 9 2012 to Sep 13 2012

Publication series

Name: 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012
Volume: 3

Conference

Conference: 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012
Country: United States
City: Portland, OR
Period: Sep 9 2012 to Sep 13 2012

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Communication

Cite this

Tawara, N., Ogawa, T., Watanabe, S., Nakamura, A., & Kobayashi, T. (2012). Fully Bayesian speaker clustering based on hierarchically structured utterance-oriented Dirichlet process mixture model. In 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012 (pp. 2163-2166). (13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012; Vol. 3).