A GMM sound source model for blind speech separation in under-determined conditions

Yasuharu Hirasawa, Naoki Yasuraoka, Toru Takahashi, Tetsuya Ogata, Hiroshi G. Okuno

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

2 Citations (Scopus)

Abstract

This paper focuses on blind speech separation in under-determined conditions, that is, when there are more sound sources than microphones. We introduce a sound source model based on the Gaussian mixture model (GMM) to represent a speech signal in the time-frequency domain, and derive rules for updating the model parameters using the auxiliary function method. Our GMM sound source model consists of two kinds of Gaussians: sharp ones representing harmonic parts and smooth ones representing nonharmonic parts. Experimental results reveal that our method outperforms the method based on non-negative matrix factorization (NMF) by 0.7 dB in the signal-to-distortion ratio (SDR) and by 1.7 dB in the signal-to-interference ratio (SIR). This means that our method effectively removes interference coming from other talkers.
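The two kinds of Gaussians mentioned in the abstract can be pictured with a small sketch. This is an illustration only, not the authors' formulation: the function names, the equal mixture weights, the widths, and the placement of the smooth components are all assumptions chosen for the example, and no parameter estimation (the auxiliary function updates) is attempted.

```python
import numpy as np

def gaussian(f, mean, std):
    """Normalized Gaussian density evaluated on the frequency axis f (Hz)."""
    return np.exp(-0.5 * ((f - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))

def gmm_spectrum(freqs, f0, n_harmonics=10, sharp_std=10.0,
                 n_smooth=4, smooth_std=500.0, harmonic_weight=0.7):
    """Hypothetical single-frame spectral shape built from two kinds of Gaussians.

    Sharp Gaussians sit at integer multiples of f0 (harmonic part);
    smooth Gaussians are spread evenly over the band (nonharmonic part).
    All weights and widths are illustrative assumptions.
    """
    spec = np.zeros_like(freqs, dtype=float)
    # Harmonic part: narrow components at k * f0, equal weights.
    for k in range(1, n_harmonics + 1):
        spec += (harmonic_weight / n_harmonics) * gaussian(freqs, k * f0, sharp_std)
    # Nonharmonic part: wide components at evenly spaced centers.
    for mean in np.linspace(freqs[0], freqs[-1], n_smooth):
        spec += ((1.0 - harmonic_weight) / n_smooth) * gaussian(freqs, mean, smooth_std)
    return spec

freqs = np.arange(0.0, 8000.0, 1.0)   # 1 Hz grid up to 8 kHz (assumed)
spec = gmm_spectrum(freqs, f0=200.0)  # assumed fundamental frequency of 200 Hz
```

In this sketch the resulting shape has narrow peaks at multiples of the assumed fundamental on top of a slowly varying floor, which is the qualitative structure the abstract describes.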

Original language: English
Title of host publication: Latent Variable Analysis and Signal Separation - 10th International Conference, LVA/ICA 2012, Proceedings
Pages: 446-453
Number of pages: 8
DOIs: https://doi.org/10.1007/978-3-642-28551-6_55
Publication status: Published - 2012 Feb 27
Externally published: Yes
Event: 10th International Conference on Latent Variable Analysis and Signal Separation, LVA/ICA 2012 - Tel Aviv, Israel
Duration: 2012 Mar 12 to 2012 Mar 15

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 7191 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 10th International Conference on Latent Variable Analysis and Signal Separation, LVA/ICA 2012
Country: Israel
City: Tel Aviv
Period: 12/3/12 to 12/3/15

Keywords

  • Auxiliary function method
  • Blind speech separation
  • GMM sound source model
  • Under-determined condition

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)

Cite this

    Hirasawa, Y., Yasuraoka, N., Takahashi, T., Ogata, T., & Okuno, H. G. (2012). A GMM sound source model for blind speech separation in under-determined conditions. In Latent Variable Analysis and Signal Separation - 10th International Conference, LVA/ICA 2012, Proceedings (pp. 446-453). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 7191 LNCS). https://doi.org/10.1007/978-3-642-28551-6_55