Voice activity detection using frame-wise model re-estimation method based on Gaussian pruning with weight normalization

Masakiyo Fujimoto, Shinji Watanabe, Tomohiro Nakatani

Research output: Contribution to conference (Paper)

5 Citations (Scopus)

Abstract

This paper proposes a frame-wise model re-estimation method based on Gaussian pruning with weight normalization for noise-robust voice activity detection (VAD). Our previous work, switching Kalman filter-based VAD, sequentially estimates a non-stationary noise Gaussian mixture model (GMM) and constructs GMMs of the observed noisy speech signals by composing pre-trained silence and clean GMMs with the sequentially estimated noise GMMs. However, the composed models are not optimal because they do not fully reflect the characteristics of the observed signal. Thus, to ensure the optimality of the composed models, we investigate a method for re-estimating them. Since our VAD method operates under frame-wise sequential processing, there is insufficient re-training data to re-estimate all of the model parameters. To solve this problem, we propose a model re-estimation method that extracts reliable information using Gaussian pruning with weight normalization. Specifically, the proposed method re-estimates the model by pruning the Gaussian distributions that are non-dominant in expressing the local characteristics of each frame and by normalizing the weights of the remaining distributions.
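The pruning and weight normalization steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the pruning criterion, so the sketch assumes a diagonal-covariance GMM with strictly positive weights and keeps the components whose posterior probability for the current frame reaches a hypothetical `threshold`. The function names and the toy parameters are likewise assumptions made for illustration.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def prune_and_normalize(x, weights, means, variances, threshold=0.05):
    """Return (kept_indices, renormalized_weights) for one frame x.

    Assumed criterion (not taken from the paper): a component is kept when
    its posterior probability for this frame is at least `threshold`.
    Weights are assumed to be strictly positive.
    """
    log_joint = [
        math.log(w) + log_gaussian_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    # Log-sum-exp keeps the posterior computation numerically stable.
    peak = max(log_joint)
    log_evidence = peak + math.log(sum(math.exp(l - peak) for l in log_joint))
    posteriors = [math.exp(l - log_evidence) for l in log_joint]

    kept = [k for k, p in enumerate(posteriors) if p >= threshold]
    if not kept:
        # Fall back to the single most dominant component.
        kept = [max(range(len(posteriors)), key=posteriors.__getitem__)]

    # Weight normalization: surviving weights are rescaled to sum to one.
    total = sum(weights[k] for k in kept)
    return kept, [weights[k] / total for k in kept]


# Toy one-dimensional GMM with three components (illustrative values only).
weights = [0.5, 0.3, 0.2]
means = [[0.0], [1.0], [10.0]]
variances = [[1.0], [1.0], [1.0]]
kept, new_weights = prune_and_normalize([0.5], weights, means, variances)
```

For the toy frame at 0.5, the component centered at 10.0 has a negligible posterior and is pruned, so the two remaining weights are rescaled from 0.5 and 0.3 to roughly 0.625 and 0.375. A top-k rule could replace the threshold without changing the normalization step.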

Original language: English
Pages: 3102-3105
Number of pages: 4
Publication status: Published - 2010 Dec 1
Event: 11th Annual Conference of the International Speech Communication Association: Spoken Language Processing for All, INTERSPEECH 2010 - Makuhari, Chiba, Japan
Duration: 2010 Sep 26 to 2010 Sep 30

Conference

Conference: 11th Annual Conference of the International Speech Communication Association: Spoken Language Processing for All, INTERSPEECH 2010
Country: Japan
City: Makuhari, Chiba
Period: 10/9/26 to 10/9/30

Keywords

  • Gaussian pruning
  • Gaussian weight normalization
  • Switching Kalman filter
  • Voice activity detection

ASJC Scopus subject areas

  • Language and Linguistics
  • Speech and Hearing

Cite this

Fujimoto, M., Watanabe, S., & Nakatani, T. (2010). Voice activity detection using frame-wise model re-estimation method based on Gaussian pruning with weight normalization. 3102-3105. Paper presented at 11th Annual Conference of the International Speech Communication Association: Spoken Language Processing for All, INTERSPEECH 2010, Makuhari, Chiba, Japan.