TY - GEN
T1 - Transfer Learning of Language-independent End-to-end ASR with Language Model Fusion
AU - Inaguma, Hirofumi
AU - Cho, Jaejin
AU - Baskar, Murali Karthick
AU - Kawahara, Tatsuya
AU - Watanabe, Shinji
N1 - Publisher Copyright: © 2019 IEEE.
PY - 2019/5
Y1 - 2019/5
N2 - This work explores better methods for adapting to low-resource languages using an external language model (LM) under the framework of transfer learning. We first build a language-independent automatic speech recognition (ASR) system in a unified sequence-to-sequence (S2S) architecture with a vocabulary shared among all languages. During adaptation, we perform LM fusion transfer, in which an external LM is integrated into the decoder network of the attention-based S2S model throughout the adaptation stage, to effectively incorporate the linguistic context of the target language. We also investigate various seed models for transfer learning. Experimental evaluations on the IARPA BABEL data set show that LM fusion transfer improves performance on all five target languages compared with simple transfer learning when external text data is available. Our final system drastically reduces the performance gap to hybrid systems.
AB - This work explores better methods for adapting to low-resource languages using an external language model (LM) under the framework of transfer learning. We first build a language-independent automatic speech recognition (ASR) system in a unified sequence-to-sequence (S2S) architecture with a vocabulary shared among all languages. During adaptation, we perform LM fusion transfer, in which an external LM is integrated into the decoder network of the attention-based S2S model throughout the adaptation stage, to effectively incorporate the linguistic context of the target language. We also investigate various seed models for transfer learning. Experimental evaluations on the IARPA BABEL data set show that LM fusion transfer improves performance on all five target languages compared with simple transfer learning when external text data is available. Our final system drastically reduces the performance gap to hybrid systems.
KW - end-to-end ASR
KW - low-resource language
KW - multilingual speech recognition
KW - transfer learning
UR - http://www.scopus.com/inward/record.url?scp=85068996878&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85068996878&partnerID=8YFLogxK
U2 - 10.1109/ICASSP.2019.8682918
DO - 10.1109/ICASSP.2019.8682918
M3 - Conference contribution
AN - SCOPUS:85068996878
T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
SP - 6096
EP - 6100
BT - 2019 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 44th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019
Y2 - 12 May 2019 through 17 May 2019
ER -