A generalized discriminative training framework for system combination

Yuuki Tachioka, Shinji Watanabe, Jonathan Le Roux, John R. Hershey

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

3 Citations (Scopus)

Abstract

This paper proposes a generalized discriminative training framework for system combination, which encompasses acoustic modeling (Gaussian mixture models and deep neural networks) and discriminative feature transformation. To improve performance by combining a base system with complementary systems, the complementary systems should perform reasonably well while tending to produce outputs that differ from those of the base system. Although it is difficult to balance these two somewhat opposing targets in conventional heuristic combination approaches, our framework provides a new objective function that makes it possible to adjust the balance within a sequential discriminative training criterion. We also describe how the proposed method relates to boosting methods. Experiments on a highly noisy medium-vocabulary speech recognition task (2nd CHiME challenge track 2) and an LVCSR task (Corpus of Spontaneous Japanese) show the effectiveness of the proposed method compared with a conventional system combination approach.

Original language: English
Title of host publication: 2013 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2013 - Proceedings
Pages: 43-48
Number of pages: 6
DOIs: https://doi.org/10.1109/ASRU.2013.6707703
Publication status: Published - 2013
Externally published: Yes
Event: 2013 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2013 - Olomouc, Czech Republic
Duration: 2013 Dec 8 to 2013 Dec 13

Other

Other: 2013 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2013
Country: Czech Republic
City: Olomouc
Period: 13/12/8 to 13/12/13

Fingerprint

Neural Networks (Computer)
Vocabulary
Acoustics
Recognition (Psychology)
Heuristics

Keywords

  • boosting
  • discriminative training
  • margin training
  • system combination

ASJC Scopus subject areas

  • Speech and Hearing

Cite this

Tachioka, Y., Watanabe, S., Le Roux, J., & Hershey, J. R. (2013). A generalized discriminative training framework for system combination. In 2013 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2013 - Proceedings (pp. 43-48). [6707703] https://doi.org/10.1109/ASRU.2013.6707703

@inproceedings{5f0f4e74a2894a79bcc43dd4ebd5a679,
title = "A generalized discriminative training framework for system combination",
abstract = "This paper proposes a generalized discriminative training framework for system combination, which encompasses acoustic modeling (Gaussian mixture models and deep neural networks) and discriminative feature transformation. To improve performance by combining a base system with complementary systems, the complementary systems should perform reasonably well while tending to produce outputs that differ from those of the base system. Although it is difficult to balance these two somewhat opposing targets in conventional heuristic combination approaches, our framework provides a new objective function that makes it possible to adjust the balance within a sequential discriminative training criterion. We also describe how the proposed method relates to boosting methods. Experiments on a highly noisy medium-vocabulary speech recognition task (2nd CHiME challenge track 2) and an LVCSR task (Corpus of Spontaneous Japanese) show the effectiveness of the proposed method compared with a conventional system combination approach.",
keywords = "boosting, discriminative training, margin training, system combination",
author = "Tachioka, Yuuki and Watanabe, Shinji and {Le Roux}, Jonathan and Hershey, {John R.}",
year = "2013",
doi = "10.1109/ASRU.2013.6707703",
language = "English",
isbn = "9781479927562",
pages = "43--48",
booktitle = "2013 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2013 - Proceedings",

}

TY - GEN
T1 - A generalized discriminative training framework for system combination
AU - Tachioka, Yuuki
AU - Watanabe, Shinji
AU - Le Roux, Jonathan
AU - Hershey, John R.
PY - 2013
Y1 - 2013
AB - This paper proposes a generalized discriminative training framework for system combination, which encompasses acoustic modeling (Gaussian mixture models and deep neural networks) and discriminative feature transformation. To improve performance by combining a base system with complementary systems, the complementary systems should perform reasonably well while tending to produce outputs that differ from those of the base system. Although it is difficult to balance these two somewhat opposing targets in conventional heuristic combination approaches, our framework provides a new objective function that makes it possible to adjust the balance within a sequential discriminative training criterion. We also describe how the proposed method relates to boosting methods. Experiments on a highly noisy medium-vocabulary speech recognition task (2nd CHiME challenge track 2) and an LVCSR task (Corpus of Spontaneous Japanese) show the effectiveness of the proposed method compared with a conventional system combination approach.
KW - boosting
KW - discriminative training
KW - margin training
KW - system combination
UR - http://www.scopus.com/inward/record.url?scp=84893650888&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84893650888&partnerID=8YFLogxK
U2 - 10.1109/ASRU.2013.6707703
DO - 10.1109/ASRU.2013.6707703
M3 - Conference contribution
SN - 9781479927562
SP - 43
EP - 48
BT - 2013 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2013 - Proceedings
ER -