A sampling-based speaker clustering using utterance-oriented Dirichlet process mixture model and its evaluation on large-scale data

Naohiro Tawara*, Tetsuji Ogawa, Shinji Watanabe, Atsushi Nakamura, Tetsunori Kobayashi

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

1 Citation (Scopus)

Abstract

An infinite mixture model is applied to model-based speaker clustering, with sampling-based optimization, so that the number of speakers can be estimated from the data. To this end, a non-parametric Bayesian framework is implemented with Markov chain Monte Carlo sampling and incorporated into the utterance-oriented speaker model. The proposed model is called the utterance-oriented Dirichlet process mixture model (UO-DPMM). This paper demonstrates that UO-DPMM can be applied to large-scale data and that it outperforms conventional hierarchical agglomerative clustering, especially when the number of utterances is large.
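The general idea of letting a sampler decide the number of clusters can be sketched with a collapsed Gibbs sampler for a Dirichlet process mixture. This is a minimal illustration, not the UO-DPMM from the paper. It assumes scalar observations, a Gaussian likelihood with known variance `sigma2`, and a conjugate Normal prior on each cluster mean (`mu0`, `tau2`). All function names, parameter names, and default values here are hypothetical choices made for the example.

```python
import math
import random


def log_predictive(x, n, s, mu0=0.0, tau2=100.0, sigma2=1.0):
    """Log posterior-predictive density of x for a cluster with n points summing to s.

    Assumes x ~ N(mu, sigma2) with prior mu ~ N(mu0, tau2); n = 0 gives the prior predictive.
    """
    prec = 1.0 / tau2 + n / sigma2            # posterior precision of the cluster mean
    mean = (mu0 / tau2 + s / sigma2) / prec   # posterior mean of the cluster mean
    var = 1.0 / prec + sigma2                 # predictive variance
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def dpmm_gibbs(data, alpha=1.0, n_iters=50, seed=0):
    """Collapsed Gibbs sampling of cluster labels under a Chinese restaurant process prior."""
    rng = random.Random(seed)
    z = [0] * len(data)                       # start with every point in one cluster
    counts = {0: len(data)}
    sums = {0: sum(data)}
    next_label = 1
    for _ in range(n_iters):
        for i, x in enumerate(data):
            # Remove point i from its current cluster; drop the cluster if it empties.
            old = z[i]
            counts[old] -= 1
            sums[old] -= x
            if counts[old] == 0:
                del counts[old]
                del sums[old]
            # Existing clusters are weighted by size; a new cluster is weighted by alpha.
            labels = list(counts)
            logw = [math.log(counts[c]) + log_predictive(x, counts[c], sums[c])
                    for c in labels]
            logw.append(math.log(alpha) + log_predictive(x, 0, 0.0))
            top = max(logw)
            weights = [math.exp(w - top) for w in logw]
            choice = rng.choices(range(len(weights)), weights=weights)[0]
            if choice == len(labels):         # open a new cluster
                new = next_label
                next_label += 1
                counts[new] = 0
                sums[new] = 0.0
            else:
                new = labels[choice]
            z[i] = new
            counts[new] += 1
            sums[new] += x
    return z
```

Calling `z = dpmm_gibbs(data)` returns one label per observation from the final sweep, and `len(set(z))` is the number of clusters in that sample. Because the count is itself sampled, it can vary between runs and sweeps, which is why such samplers are usually summarized over many iterations rather than read off a single one.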

Original language: English
Journal: APSIPA Transactions on Signal and Information Processing
Volume: 4
Publication status: Published - 2015 Oct 28

Keywords

  • Gibbs sampling
  • Non-parametric Bayesian model
  • Sampling approach
  • Speaker clustering
  • Utterance-oriented Dirichlet process mixture model

ASJC Scopus subject areas

  • Signal Processing
  • Information Systems
