TY - GEN
T1 - Effectiveness of discriminative training and feature transformation for reverberated and noisy speech
AU - Tachioka, Yuuki
AU - Watanabe, Shinji
AU - Hershey, John R.
PY - 2013/10/18
Y1 - 2013/10/18
N2 - Automatic speech recognition in the presence of non-stationary interference and reverberation remains a challenging problem. The 2nd 'CHiME' Speech Separation and Recognition Challenge introduces a new and difficult task with time-varying reverberation and non-stationary interference including natural background speech, home noises, or music. This paper establishes baselines using state-of-the-art ASR techniques such as discriminative training and various feature transformations on the middle-vocabulary sub-task of this challenge. In addition, we propose an augmented discriminative feature transformation that introduces arbitrary features to a discriminative feature transformation. We present experimental results showing that discriminative training of model parameters and feature transforms is highly effective for this task, and that the augmented feature transformation provides some preliminary benefits. The training code will be released as an advanced ASR baseline.
AB - Automatic speech recognition in the presence of non-stationary interference and reverberation remains a challenging problem. The 2nd 'CHiME' Speech Separation and Recognition Challenge introduces a new and difficult task with time-varying reverberation and non-stationary interference including natural background speech, home noises, or music. This paper establishes baselines using state-of-the-art ASR techniques such as discriminative training and various feature transformations on the middle-vocabulary sub-task of this challenge. In addition, we propose an augmented discriminative feature transformation that introduces arbitrary features to a discriminative feature transformation. We present experimental results showing that discriminative training of model parameters and feature transforms is highly effective for this task, and that the augmented feature transformation provides some preliminary benefits. The training code will be released as an advanced ASR baseline.
KW - Augmented discriminative feature transformation
KW - CHiME challenge
KW - Discriminative training
KW - Feature transformation
KW - Kaldi
UR - http://www.scopus.com/inward/record.url?scp=84890503970&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84890503970&partnerID=8YFLogxK
U2 - 10.1109/ICASSP.2013.6639006
DO - 10.1109/ICASSP.2013.6639006
M3 - Conference contribution
AN - SCOPUS:84890503970
SN - 9781479903566
T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
SP - 6935
EP - 6939
BT - 2013 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2013 - Proceedings
T2 - 2013 38th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2013
Y2 - 26 May 2013 through 31 May 2013
ER -