Abstract
Recent work on multichannel end-to-end automatic speech recognition (ASR) has shown that multichannel speech enhancement and speech recognition can be integrated into a deep neural network (DNN)-based system, with promising experimental results on the CHiME-4 and AMI corpora. Separately, the effectiveness of speaker adaptation is well established for DNN-hidden Markov model (DNN-HMM) hybrid architectures. Motivated by these results, we propose a multi-path adaptation scheme for end-to-end multichannel ASR that combines the unprocessed noisy speech features with a speech-enhanced pathway to improve upon previous end-to-end ASR approaches. Experimental results on CHiME-4 show that (1) the proposed multi-path adaptation scheme improves ASR performance and (2) adapting the encoder network is more effective than adapting the neural beamformer, attention mechanism, or decoder network.
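As a rough illustration of the two-pathway idea in the abstract, the sketch below builds one feature stream from a beamformed (enhanced) signal and one from an unprocessed reference channel. It is not the paper's implementation: the function names `log_power` and `multipath_features`, the fixed filter-and-sum weights standing in for the neural beamformer, the choice of channel 0 as the noisy reference, and the use of simple concatenation to combine the pathways are all assumptions made for illustration.

```python
import numpy as np


def log_power(spec, eps=1e-10):
    """Log power spectrum of a complex STFT (eps avoids log(0))."""
    return np.log(np.abs(spec) ** 2 + eps)


def multipath_features(stft, weights, ref_channel=0):
    """Hypothetical two-pathway feature builder.

    stft:    complex array, shape (channels, frames, bins)
    weights: complex array, shape (channels, bins); fixed weights used
             here as a stand-in for a learned neural beamformer
    Returns a real array of shape (frames, 2 * bins).
    """
    # Enhanced pathway: filter-and-sum beamforming,
    # y[t, f] = sum_c conj(w[c, f]) * x[c, t, f]
    enhanced = np.einsum("cf,ctf->tf", np.conj(weights), stft)
    # Noisy pathway: one unprocessed microphone channel (assumed choice)
    noisy = stft[ref_channel]
    # Assumed combination: concatenate both pathways along the feature axis
    return np.concatenate([log_power(enhanced), log_power(noisy)], axis=-1)
```

In an end-to-end system, the resulting features would be passed to the encoder network, which the abstract reports as the most effective component to adapt.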
Original language | English |
---|---|
Title of host publication | 2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018 - Proceedings |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 6707-6711 |
Number of pages | 5 |
Volume | 2018-April |
ISBN (Print) | 9781538646588 |
Publication status | Published - 2018 Sep 10 |
Externally published | Yes |
Event | 2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018 - Calgary, Canada. Duration: 2018 Apr 15 → 2018 Apr 20 |
Other
Other | 2018 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018 |
---|---|
Country | Canada |
City | Calgary |
Period | 2018 Apr 15 → 2018 Apr 20 |
Keywords
- Attention-based encoder-decoder
- Multichannel end-to-end ASR
- Neural beamformer
- Speaker adaptation
ASJC Scopus subject areas
- Software
- Signal Processing
- Electrical and Electronic Engineering