Data selection by sequence summarizing neural network in mismatch condition training

Kateřina Žmolíková, Martin Karafiát, Karel Veselý, Marc Delcroix, Shinji Watanabe, Lukáš Burget, Jan Černocký

Research output: Contribution to journal (Article)

3 Citations (Scopus)

Abstract

Data augmentation is a simple and efficient technique to improve the robustness of a speech recognizer when deployed in mismatched training-test conditions. Our paper proposes a new approach for selecting data with respect to similarity of acoustic conditions. The similarity is computed based on a sequence summarizing neural network which extracts vectors containing acoustic summary (e.g. noise and reverberation characteristics) of an utterance. Several configurations of this network and different methods of selecting data using these "summary-vectors" were explored. The results are reported on a mismatched condition using AMI training set with the proposed data selection and CHiME3 test set.
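The selection step described in the abstract can be illustrated with a minimal sketch. It assumes each utterance has already been reduced to a fixed-length summary vector, and it ranks candidate training utterances by cosine similarity to the mean summary vector of the target (test) condition. The similarity measure, the use of a mean target vector, and the names `mean_vector`, `cosine_similarity`, and `select_utterances` are all assumptions made for illustration; the abstract only states that several selection methods were explored, without specifying them.

```python
import math
from typing import Dict, List, Sequence


def mean_vector(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Element-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_utterances(
    candidates: Dict[str, Sequence[float]],
    target_vectors: Sequence[Sequence[float]],
    k: int,
) -> List[str]:
    """Return the k candidate utterance IDs closest to the target condition.

    Assumption: closeness is cosine similarity to the mean target summary
    vector. This is a hypothetical criterion, not the paper's stated method.
    """
    target = mean_vector(target_vectors)
    ranked = sorted(
        candidates,
        key=lambda utt: cosine_similarity(candidates[utt], target),
        reverse=True,
    )
    return ranked[:k]


# Toy two-dimensional "summary vectors" (made-up values for illustration).
candidates = {
    "utt_clean": [1.0, 0.0],
    "utt_noisy": [0.1, 1.0],
    "utt_reverb": [0.5, 0.5],
}
target_vectors = [[0.0, 1.0], [0.2, 0.8]]

selected = select_utterances(candidates, target_vectors, k=2)
print(selected)
```

In this toy example the two candidates whose vectors point in roughly the same direction as the target mean are kept, and the dissimilar one is dropped. In practice the summary vectors would come from the trained network rather than being written by hand.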

Original language: English
Pages (from-to): 2354-2358
Number of pages: 5
Journal: Proceedings of Interspeech 2016
Volume: 08-12-September-2016
DOI: https://doi.org/10.21437/Interspeech.2016-741
Publication status: Published - 2016
Externally published: Yes

Fingerprint

Acoustics
Neural networks
Reverberation
Acoustic noise
Data augmentation
Augmentation
Data selection
Test set
Robustness
Configuration
Training
Summary
Mismatch
Similarity
Education

Keywords

  • Automatic speech recognition
  • Data augmentation
  • Data selection
  • Mismatch training condition
  • Sequence summarization

ASJC Scopus subject areas

  • Language and Linguistics
  • Human-Computer Interaction
  • Signal Processing
  • Software
  • Modelling and Simulation

Cite this

Žmolíková, K., Karafiát, M., Veselý, K., Delcroix, M., Watanabe, S., Burget, L., & Černocký, J. (2016). Data selection by sequence summarizing neural network in mismatch condition training. Proceedings of Interspeech 2016, 08-12-September-2016, 2354-2358. https://doi.org/10.21437/Interspeech.2016-741

@article{6ddba5c309bb4ee1817487b6c7f62a20,
title = "Data selection by sequence summarizing neural network in mismatch condition training",
abstract = "Data augmentation is a simple and efficient technique to improve the robustness of a speech recognizer when deployed in mismatched training-test conditions. Our paper proposes a new approach for selecting data with respect to similarity of acoustic conditions. The similarity is computed based on a sequence summarizing neural network which extracts vectors containing acoustic summary (e.g. noise and reverberation characteristics) of an utterance. Several configurations of this network and different methods of selecting data using these {"}summary-vectors{"} were explored. The results are reported on a mismatched condition using AMI training set with the proposed data selection and CHiME3 test set.",
keywords = "Automatic speech recognition, Data augmentation, Data selection, Mismatch training condition, Sequence summarization",
author = "Kate{\v{r}}ina {\v{Z}}mol{\'i}kov{\'a} and Martin Karafi{\'a}t and Karel Vesel{\'y} and Marc Delcroix and Shinji Watanabe and Luk{\'a}{\v{s}} Burget and Jan {\v{C}}ernock{\'y}",
year = "2016",
doi = "10.21437/Interspeech.2016-741",
language = "English",
volume = "08-12-September-2016",
pages = "2354--2358",
journal = "Proceedings of Interspeech 2016",
}

TY - JOUR
T1 - Data selection by sequence summarizing neural network in mismatch condition training
AU - Žmolíková, Kateřina
AU - Karafiát, Martin
AU - Veselý, Karel
AU - Delcroix, Marc
AU - Watanabe, Shinji
AU - Burget, Lukáš
AU - Černocký, Jan
PY - 2016
Y1 - 2016
AB - Data augmentation is a simple and efficient technique to improve the robustness of a speech recognizer when deployed in mismatched training-test conditions. Our paper proposes a new approach for selecting data with respect to similarity of acoustic conditions. The similarity is computed based on a sequence summarizing neural network which extracts vectors containing acoustic summary (e.g. noise and reverberation characteristics) of an utterance. Several configurations of this network and different methods of selecting data using these "summary-vectors" were explored. The results are reported on a mismatched condition using AMI training set with the proposed data selection and CHiME3 test set.
KW - Automatic speech recognition
KW - Data augmentation
KW - Data selection
KW - Mismatch training condition
KW - Sequence summarization
UR - http://www.scopus.com/inward/record.url?scp=84994382229&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84994382229&partnerID=8YFLogxK
U2 - 10.21437/Interspeech.2016-741
DO - 10.21437/Interspeech.2016-741
M3 - Article
VL - 08-12-September-2016
SP - 2354
EP - 2358
JO - Proceedings of Interspeech 2016
JF - Proceedings of Interspeech 2016
ER -