Data selection by sequence summarizing neural network in mismatch condition training

Kateřina Žmolíková, Martin Karafiát, Karel Veselý, Marc Delcroix, Shinji Watanabe, Lukáš Burget, Jan Cěrnocký

Research output: Contribution to journalConference articlepeer-review

4 Citations (Scopus)

Abstract

Data augmentation is a simple and efficient technique to improve the robustness of a speech recognizer when deployed in mismatched training-test conditions. Our paper proposes a new approach for selecting data with respect to similarity of acoustic conditions. The similarity is computed based on a sequence summarizing neural network which extracts vectors containing acoustic summary (e.g. noise and reverberation characteristics) of an utterance. Several configurations of this network and different methods of selecting data using these "summary-vectors" were explored. The results are reported on a mismatched condition using AMI training set with the proposed data selection and CHiME3 test set.

Original languageEnglish
Pages (from-to)2354-2358
Number of pages5
JournalProceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Volume08-12-September-2016
DOIs
Publication statusPublished - 2016
Externally publishedYes
Event17th Annual Conference of the International Speech Communication Association, INTERSPEECH 2016 - San Francisco, United States
Duration: 2016 Sept 82016 Sept 16

Keywords

  • Automatic speech recognition
  • Data augmentation
  • Data selection
  • Mismatch training condition
  • Sequence summarization

ASJC Scopus subject areas

  • Language and Linguistics
  • Human-Computer Interaction
  • Signal Processing
  • Software
  • Modelling and Simulation

Fingerprint

Dive into the research topics of 'Data selection by sequence summarizing neural network in mismatch condition training'. Together they form a unique fingerprint.

Cite this