Effectiveness of discriminative training and feature transformation for reverberated and noisy speech

Yuuki Tachioka, Shinji Watanabe, John R. Hershey

Research output: Conference contribution

13 Citations (Scopus)

Abstract

Automatic speech recognition in the presence of non-stationary interference and reverberation remains a challenging problem. The 2nd 'CHiME' Speech Separation and Recognition Challenge introduces a new and difficult task with time-varying reverberation and non-stationary interference including natural background speech, home noises, or music. This paper establishes baselines using state-of-the-art ASR techniques such as discriminative training and various feature transformation on the middle-vocabulary sub-task of this challenge. In addition, we propose an augmented discriminative feature transformation that introduces arbitrary features to a discriminative feature transformation. We present experimental results showing that discriminative training of model parameters and feature transforms is highly effective for this task, and that the augmented feature transformation provides some preliminary benefits. The training code will be released as an advanced ASR baseline.
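
The augmented discriminative feature transformation summarized above follows the general form of feature-space discriminative transforms such as fMPE/fMMI; the sketch below illustrates only that general form under this assumption, and the symbols x_t, h_t, a_t, and M are illustrative notation, not taken from the paper itself.

% Illustrative sketch (LaTeX), assuming an fMPE/fMMI-style feature transform:
%   x_t : original acoustic feature vector at frame t
%   h_t : high-dimensional feature vector (e.g., Gaussian posteriors) at frame t
%   a_t : arbitrary auxiliary features appended to h_t (the "augmentation")
%   M   : projection matrix estimated with a discriminative criterion (e.g., boosted MMI)
\begin{equation}
  y_t = x_t + M \begin{bmatrix} h_t \\ a_t \end{bmatrix}
\end{equation}

In this reading, the augmentation simply enlarges the high-dimensional vector with additional feature streams, so the same discriminative estimation of M applies unchanged; this is a hedged interpretation of the abstract, not the paper's exact formulation.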

Original language: English
Title of host publication: 2013 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2013 - Proceedings
Pages: 6935-6939
Number of pages: 5
DOI: 10.1109/ICASSP.2013.6639006
Publication status: Published - 18 Oct 2013
Externally published: Yes
Event: 2013 38th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2013 - Vancouver, BC
Duration: 26 May 2013 - 31 May 2013

ASJC Scopus subject areas

  • Signal Processing
  • Software
  • Electrical and Electronic Engineering

Cite this

Tachioka, Y., Watanabe, S., & Hershey, J. R. (2013). Effectiveness of discriminative training and feature transformation for reverberated and noisy speech. In 2013 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2013 - Proceedings (pp. 6935-6939). [6639006] https://doi.org/10.1109/ICASSP.2013.6639006

@inproceedings{33dddb9d89fc47b999891f102de15a6f,
title = "Effectiveness of discriminative training and feature transformation for reverberated and noisy speech",
abstract = "Automatic speech recognition in the presence of non-stationary interference and reverberation remains a challenging problem. The 2nd 'CHiME' Speech Separation and Recognition Challenge introduces a new and difficult task with time-varying reverberation and non-stationary interference including natural background speech, home noises, or music. This paper establishes baselines using state-of-the-art ASR techniques such as discriminative training and various feature transformation on the middle-vocabulary sub-task of this challenge. In addition, we propose an augmented discriminative feature transformation that introduces arbitrary features to a discriminative feature transformation. We present experimental results showing that discriminative training of model parameters and feature transforms is highly effective for this task, and that the augmented feature transformation provides some preliminary benefits. The training code will be released as an advanced ASR baseline.",
keywords = "Augmented discriminative feature transformation, CHiME challenge, Discriminative training, Feature transformation, Kaldi",
author = "Yuuki Tachioka and Shinji Watanabe and Hershey, {John R.}",
year = "2013",
month = "10",
day = "18",
doi = "10.1109/ICASSP.2013.6639006",
language = "English",
isbn = "9781479903566",
pages = "6935--6939",
booktitle = "2013 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2013 - Proceedings",

}
