Accent neutralization for speech recognition of non-native speakers

Kacper Radzikowski, Mateusz Forc, Le Wang, Osamu Yoshie, Robert Nowak

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution

Abstract

Automatic speech recognition (ASR) systems achieve increasingly high accuracy rates. However, accuracy drops significantly when an ASR system is used with a non-native speaker of the language to be recognized. The main reason is the speaker's specific pronunciation and accent features. The limited volume of labeled non-native speech datasets makes it difficult to train new ASR systems for non-native speakers. In our research, we tried to tackle this problem and its influence on the accuracy of ASR systems using the style transfer methodology. We designed a pipeline that modifies the speech of a non-native speaker so that it more closely resembles native speech. Our methodology can be used as a wrapper for any existing ASR system, which reduces the need to train new algorithms for non-native speech. The modification can thus be performed before the data is passed to the speech recognition system itself.
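The wrapper idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `AccentNeutralizingASR`, the two placeholder functions, and the representation of audio as a list of floats are all assumptions made for illustration. The sketch only shows the composition order, in which a speech-modification step runs before an unchanged, existing recognizer.

```python
from typing import Callable, List

# Assumption: audio is represented as a plain list of samples.
Waveform = List[float]


class AccentNeutralizingASR:
    """Hypothetical wrapper: modify the speech first, then call an existing ASR."""

    def __init__(
        self,
        neutralize: Callable[[Waveform], Waveform],
        recognize: Callable[[Waveform], str],
    ) -> None:
        self.neutralize = neutralize  # e.g. a style-transfer model (not shown)
        self.recognize = recognize    # any existing ASR system, left untouched

    def transcribe(self, audio: Waveform) -> str:
        # The modification happens before the data reaches the recognizer.
        return self.recognize(self.neutralize(audio))


def placeholder_neutralizer(audio: Waveform) -> Waveform:
    # Stand-in for the speech-modification step; here it only copies the input.
    return list(audio)


def placeholder_recognizer(audio: Waveform) -> str:
    # Stand-in for a real ASR system; it reports the number of samples received.
    return f"<{len(audio)} samples>"


pipeline = AccentNeutralizingASR(placeholder_neutralizer, placeholder_recognizer)
print(pipeline.transcribe([0.0, 0.1, -0.1]))
```

Because the recognizer is passed in as a plain callable, swapping in a different ASR system would not require changing the modification step, which matches the motivation of avoiding retraining for non-native speech.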

Original language: English
Title of host publication: 21st International Conference on Information Integration and Web-Based Applications and Services, iiWAS 2019 - Proceedings
Editors: Maria Indrawan-Santiago, Eric Pardede, Ivan Luiz Salvadori, Matthias Steinbauer, Ismail Khalil, Gabriele Anderst-Kotsis
Publisher: Association for Computing Machinery
ISBN (Electronic): 9781450371797
DOIs: https://doi.org/10.1145/3366030.3366083
Publication status: Published - 2019 Dec 2
Event: 21st International Conference on Information Integration and Web-Based Applications and Services, iiWAS 2019 - Munich, Germany
Duration: 2019 Dec 2 to 2019 Dec 4

Publication series

Name: ACM International Conference Proceeding Series

Conference

Conference: 21st International Conference on Information Integration and Web-Based Applications and Services, iiWAS 2019
Country: Germany
City: Munich
Period: 19/12/2 to 19/12/4

Keywords

  • Deep learning
  • Machine learning
  • Neural network
  • Non-native speaker
  • Speech recognition
  • Style transfer

ASJC Scopus subject areas

  • Software
  • Human-Computer Interaction
  • Computer Vision and Pattern Recognition
  • Computer Networks and Communications

Cite this

Radzikowski, K., Forc, M., Wang, L., Yoshie, O., & Nowak, R. (2019). Accent neutralization for speech recognition of non-native speakers. In M. Indrawan-Santiago, E. Pardede, I. L. Salvadori, M. Steinbauer, I. Khalil, & G. Anderst-Kotsis (Eds.), 21st International Conference on Information Integration and Web-Based Applications and Services, iiWAS 2019 - Proceedings (ACM International Conference Proceeding Series). Association for Computing Machinery. https://doi.org/10.1145/3366030.3366083