Gated convolutional neural network-based voice activity detection under high-level noise environments

Li Li, Kouei Yamaoka, Yuki Koshino, Mitsuo Matsumoto, Shoji Makino

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

This paper deals with voice activity detection (VAD) in high-level noise environments where signal-to-noise ratios (SNRs) are lower than -5 dB. With the growing demand for hands-free applications, critically low SNR situations are unavoidable; the noise can be internal, self-created ego noise or external noise occurring in the environment, e.g., for rescue robots at a disaster site or for navigation in a car moving at high speed. To achieve accurate VAD results in such situations, this paper proposes a gated convolutional neural network-based approach that can capture both long- and short-term dependencies in time series as cues for detection. Experimental evaluations using high-level ego noise of a hose-shaped rescue robot showed that the proposed method achieved about 86% VAD accuracy on average in environments with SNRs in the range of -30 dB to -5 dB.
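The gating idea mentioned in the abstract can be illustrated with a minimal sketch. This record does not describe the paper's actual architecture, so the code below only assumes a common gated-convolution formulation (a gated linear unit, where a feature convolution is multiplied element-wise by a sigmoid gate convolution) applied along the time axis. The function names, layer sizes, dilation values, and random weights are all hypothetical and chosen for illustration; stacking a small and a larger dilation is one plausible way to combine short- and long-term context.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_conv1d(x, w_feat, w_gate, dilation=1):
    """Gated 1-D convolution over time with zero 'same' padding.

    x:      (T, C_in) sequence of feature frames
    w_feat: (K, C_in, C_out) feature kernel, K odd
    w_gate: (K, C_in, C_out) gate kernel, same shape as w_feat
    Returns (T, C_out): feature_conv * sigmoid(gate_conv).
    """
    K = w_feat.shape[0]
    assert K % 2 == 1, "odd kernel size assumed for symmetric padding"
    T = x.shape[0]
    pad = dilation * (K - 1) // 2
    xp = np.pad(x, ((pad, pad), (0, 0)))
    feat = np.zeros((T, w_feat.shape[2]))
    gate = np.zeros_like(feat)
    for k in range(K):
        # Frames seen by tap k, shifted by k * dilation
        seg = xp[k * dilation : k * dilation + T]
        feat += seg @ w_feat[k]
        gate += seg @ w_gate[k]
    return feat * sigmoid(gate)

def frame_speech_probability(x, layers, w_out):
    """Stack gated layers, then map each frame to a speech probability."""
    h = x
    for w_feat, w_gate, dilation in layers:
        h = gated_conv1d(h, w_feat, w_gate, dilation)
    return sigmoid(h @ w_out)

# Hypothetical sizes: 100 frames, 40 input features, 16 hidden channels
rng = np.random.default_rng(0)
T, F, H = 100, 40, 16
frames = rng.standard_normal((T, F))
layers = [
    # Short-term context (dilation 1), then longer context (dilation 4)
    (0.1 * rng.standard_normal((3, F, H)), 0.1 * rng.standard_normal((3, F, H)), 1),
    (0.1 * rng.standard_normal((3, H, H)), 0.1 * rng.standard_normal((3, H, H)), 4),
]
w_out = 0.1 * rng.standard_normal(H)
probs = frame_speech_probability(frames, layers, w_out)  # shape (T,)
```

With untrained random weights the probabilities carry no meaning; the sketch only shows how the gate modulates each convolution output frame by frame before a per-frame speech/non-speech decision is made.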

Original language: English
Title of host publication: Proceedings of the 23rd International Congress on Acoustics
Subtitle of host publication: Integrating 4th EAA Euroregio 2019
Editors: Martin Ochmann, Michael Vorländer, Janina Fels
Publisher: International Commission for Acoustics (ICA)
Pages: 2862-2869
Number of pages: 8
ISBN (Electronic): 9783939296157
Publication status: Published - 2019
Externally published: Yes
Event: 23rd International Congress on Acoustics: Integrating 4th EAA Euroregio, ICA 2019 - Aachen, Germany
Duration: 2019 Sep 9 to 2019 Sep 23

Publication series

Name: Proceedings of the International Congress on Acoustics
Volume: 2019-September
ISSN (Print): 2226-7808
ISSN (Electronic): 2415-1599

Conference

Conference: 23rd International Congress on Acoustics: Integrating 4th EAA Euroregio, ICA 2019
Country: Germany
City: Aachen
Period: 19/9/9 to 19/9/23

Keywords

  • Ego noise
  • Gated convolutional neural networks
  • Low SNR
  • Rescue robot
  • Voice activity detection (VAD)

ASJC Scopus subject areas

  • Mechanical Engineering
  • Acoustics and Ultrasonics
