Acoustic Modeling for Overlapping Speech Recognition: JHU CHiME-5 Challenge System

Vimal Manohar, Szu Jui Chen, Zhiqi Wang, Y. Fujita, Shinji Watanabe, Sanjeev Khudanpur

Research output: Conference contribution

Abstract

This paper summarizes our acoustic modeling efforts in the Johns Hopkins University speech recognition system for the CHiME-5 challenge to recognize highly-overlapped dinner party speech recorded by multiple microphone arrays. We explore data augmentation approaches, neural network architectures, front-end speech dereverberation, beamforming and robust i-vector extraction with comparisons of our in-house implementations and publicly available tools. We finally achieved a word error rate of 69.4% on the development set, which is an 11.7% absolute improvement over the previous baseline of 81.1%, and release this improved baseline with refined techniques/tools as an advanced CHiME-5 recipe.
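The improvement quoted in the abstract is an absolute difference in word error rate (WER). As a minimal sketch of that arithmetic, the snippet below recomputes the absolute gain from the two reported WERs and also derives the relative reduction. The relative figure is not stated in the abstract, and the function name `wer_improvement` is a hypothetical helper used only for illustration.

```python
def wer_improvement(baseline_wer: float, new_wer: float) -> tuple[float, float]:
    """Return (absolute, relative) WER reduction, both expressed in percent."""
    # Absolute improvement: simple difference in percentage points.
    absolute = baseline_wer - new_wer
    # Relative improvement: the difference as a fraction of the baseline WER.
    relative = 100.0 * absolute / baseline_wer
    return absolute, relative


# WERs reported in the abstract: 81.1% baseline, 69.4% for the new system.
absolute, relative = wer_improvement(81.1, 69.4)
print(f"absolute: {absolute:.1f} points, relative: {relative:.1f}%")
# prints "absolute: 11.7 points, relative: 14.4%"
```

The absolute figure matches the 11.7% stated in the abstract; in relative terms the same change removes roughly one seventh of the baseline errors.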

Original language: English
Title of host publication: 2019 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 6665-6669
Number of pages: 5
ISBN (Electronic): 9781479981311
DOI: https://doi.org/10.1109/ICASSP.2019.8682556
Publication status: Published - May 1, 2019
Event: 44th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019 - Brighton, United Kingdom
Duration: May 12, 2019 to May 17, 2019

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: 2019-May
ISSN (Print): 1520-6149

Conference

Conference: 44th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019
Country/Territory: United Kingdom
City: Brighton
Period: May 12, 2019 to May 17, 2019

Fingerprint

Speech recognition
Acoustics
Microphones
Beamforming
Network architecture
Neural networks

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering

Cite this

Manohar, V., Chen, S. J., Wang, Z., Fujita, Y., Watanabe, S., & Khudanpur, S. (2019). Acoustic Modeling for Overlapping Speech Recognition: JHU CHiME-5 Challenge System. In 2019 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019 - Proceedings (pp. 6665-6669). [8682556] (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings; Vol. 2019-May). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICASSP.2019.8682556

Acoustic Modeling for Overlapping Speech Recognition: JHU CHiME-5 Challenge System. / Manohar, Vimal; Chen, Szu Jui; Wang, Zhiqi; Fujita, Y.; Watanabe, Shinji; Khudanpur, Sanjeev.

2019 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019 - Proceedings. Institute of Electrical and Electronics Engineers Inc., 2019. p. 6665-6669 8682556 (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings; Vol. 2019-May).

Manohar, V, Chen, SJ, Wang, Z, Fujita, Y, Watanabe, S & Khudanpur, S 2019, Acoustic Modeling for Overlapping Speech Recognition: JHU CHiME-5 Challenge System. in 2019 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019 - Proceedings, 8682556, ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, vol. 2019-May, Institute of Electrical and Electronics Engineers Inc., pp. 6665-6669, 44th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019, Brighton, United Kingdom, 12 May 2019. https://doi.org/10.1109/ICASSP.2019.8682556
Manohar V, Chen SJ, Wang Z, Fujita Y, Watanabe S, Khudanpur S. Acoustic Modeling for Overlapping Speech Recognition: JHU CHiME-5 Challenge System. In 2019 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019 - Proceedings. Institute of Electrical and Electronics Engineers Inc. 2019. p. 6665-6669. 8682556. (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings). https://doi.org/10.1109/ICASSP.2019.8682556
Manohar, Vimal ; Chen, Szu Jui ; Wang, Zhiqi ; Fujita, Y. ; Watanabe, Shinji ; Khudanpur, Sanjeev. / Acoustic Modeling for Overlapping Speech Recognition: JHU CHiME-5 Challenge System. 2019 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019 - Proceedings. Institute of Electrical and Electronics Engineers Inc., 2019. pp. 6665-6669 (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings).
@inproceedings{713e7ce29b94489e834379ea7d7eba95,
title = "Acoustic Modeling for Overlapping Speech Recognition: {JHU} {CHiME}-5 Challenge System",
abstract = "This paper summarizes our acoustic modeling efforts in the Johns Hopkins University speech recognition system for the CHiME-5 challenge to recognize highly-overlapped dinner party speech recorded by multiple microphone arrays. We explore data augmentation approaches, neural network architectures, front-end speech dereverberation, beamforming and robust i-vector extraction with comparisons of our in-house implementations and publicly available tools. We finally achieved a word error rate of 69.4{\%} on the development set, which is an 11.7{\%} absolute improvement over the previous baseline of 81.1{\%}, and release this improved baseline with refined techniques/tools as an advanced CHiME-5 recipe.",
keywords = "acoustic modeling, CHiME-5 challenge, Kaldi, Robust speech recognition",
author = "Vimal Manohar and Chen, {Szu Jui} and Zhiqi Wang and Y. Fujita and Shinji Watanabe and Sanjeev Khudanpur",
year = "2019",
month = "5",
day = "1",
doi = "10.1109/ICASSP.2019.8682556",
language = "English",
series = "ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
pages = "6665--6669",
booktitle = "2019 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019 - Proceedings",

}

TY - GEN
T1 - Acoustic Modeling for Overlapping Speech Recognition
T2 - JHU CHiME-5 Challenge System
AU - Manohar, Vimal
AU - Chen, Szu Jui
AU - Wang, Zhiqi
AU - Fujita, Y.
AU - Watanabe, Shinji
AU - Khudanpur, Sanjeev
PY - 2019/5/1
Y1 - 2019/5/1
N2 - This paper summarizes our acoustic modeling efforts in the Johns Hopkins University speech recognition system for the CHiME-5 challenge to recognize highly-overlapped dinner party speech recorded by multiple microphone arrays. We explore data augmentation approaches, neural network architectures, front-end speech dereverberation, beamforming and robust i-vector extraction with comparisons of our in-house implementations and publicly available tools. We finally achieved a word error rate of 69.4% on the development set, which is an 11.7% absolute improvement over the previous baseline of 81.1%, and release this improved baseline with refined techniques/tools as an advanced CHiME-5 recipe.
AB - This paper summarizes our acoustic modeling efforts in the Johns Hopkins University speech recognition system for the CHiME-5 challenge to recognize highly-overlapped dinner party speech recorded by multiple microphone arrays. We explore data augmentation approaches, neural network architectures, front-end speech dereverberation, beamforming and robust i-vector extraction with comparisons of our in-house implementations and publicly available tools. We finally achieved a word error rate of 69.4% on the development set, which is an 11.7% absolute improvement over the previous baseline of 81.1%, and release this improved baseline with refined techniques/tools as an advanced CHiME-5 recipe.
KW - acoustic modeling
KW - CHiME-5 challenge
KW - Kaldi
KW - Robust speech recognition
UR - http://www.scopus.com/inward/record.url?scp=85068973715&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85068973715&partnerID=8YFLogxK
U2 - 10.1109/ICASSP.2019.8682556
DO - 10.1109/ICASSP.2019.8682556
M3 - Conference contribution
AN - SCOPUS:85068973715
T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
SP - 6665
EP - 6669
BT - 2019 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
ER -