Voice activity detection using frame-wise model re-estimation method based on Gaussian pruning with weight normalization

Masakiyo Fujimoto*, Shinji Watanabe, Tomohiro Nakatani

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

6 Citations (Scopus)

Abstract

This paper proposes a frame-wise model re-estimation method based on Gaussian pruning with weight normalization for noise-robust voice activity detection (VAD). Our previous work, a switching Kalman filter-based VAD, sequentially estimates a non-stationary noise Gaussian mixture model (GMM) and constructs GMMs of the observed noisy speech signal by composing pre-trained silence and clean speech GMMs with the sequentially estimated noise GMM. However, the composed models are not optimal, because they do not fully reflect the characteristics of the observed signal. To bring the composed models closer to optimal, we investigate a method for re-estimating them. Because our VAD method processes the signal sequentially, frame by frame, there is insufficient re-training data to re-estimate all of the model parameters. To solve this problem, we propose a model re-estimation method that extracts reliable information using Gaussian pruning with weight normalization. Specifically, the proposed method re-estimates the model by pruning the Gaussian distributions that are non-dominant in expressing the local characteristics of each frame and by normalizing the weights of the remaining distributions.
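
The pruning and normalization step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state the pruning criterion, so the rule used here (keep components whose weighted likelihood for the current frame is within a factor `keep_ratio` of the best component) is an assumption, as are the diagonal covariances, the function names, and the toy values.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at frame x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def prune_and_normalize(x, weights, means, variances, keep_ratio=1e-2):
    """Prune non-dominant components for one frame and renormalize weights.

    Assumption (not from the abstract): a component is "dominant" when its
    weighted likelihood w_k * N(x; mu_k, var_k) is at least keep_ratio times
    that of the best-scoring component. All weights must be positive.
    """
    # Weighted log-likelihood of each component for this frame.
    scores = [
        math.log(w) + log_gauss_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    best = max(scores)
    threshold = math.log(keep_ratio)

    # Indices of the components that survive pruning.
    kept = [k for k, s in enumerate(scores) if s - best >= threshold]

    # Rescale the surviving weights so they sum to one; pruned ones become 0.
    total = sum(weights[k] for k in kept)
    kept_set = set(kept)
    new_weights = [
        weights[k] / total if k in kept_set else 0.0
        for k in range(len(weights))
    ]
    return kept, new_weights


# Toy one-dimensional example with three components (hypothetical values).
weights = [0.5, 0.3, 0.2]
means = [[0.0], [1.0], [10.0]]
variances = [[1.0], [1.0], [1.0]]
frame = [0.5]

kept, new_weights = prune_and_normalize(frame, weights, means, variances)
```

In this toy case the component centered at 10.0 explains the frame far worse than the other two and is pruned, and the two remaining weights are rescaled to roughly 0.625 and 0.375. A top-N rule would be an equally plausible pruning criterion under the same description.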

Original language: English
Title of host publication: Proceedings of the 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010
Publisher: International Speech Communication Association
Pages: 3102-3105
Number of pages: 4
Publication status: Published - 2010
Externally published: Yes

Publication series

Name: Proceedings of the 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010

Keywords

  • Gaussian pruning
  • Gaussian weight normalization
  • Switching Kalman filter
  • Voice activity detection

ASJC Scopus subject areas

  • Language and Linguistics
  • Speech and Hearing
  • Human-Computer Interaction
  • Signal Processing
  • Software
  • Modelling and Simulation
