TY - GEN
T1 - A generalized discriminative training framework for system combination
AU - Tachioka, Yuuki
AU - Watanabe, Shinji
AU - Le Roux, Jonathan
AU - Hershey, John R.
PY - 2013
Y1 - 2013
N2 - This paper proposes a generalized discriminative training framework for system combination, which encompasses acoustic modeling (Gaussian mixture models and deep neural networks) and discriminative feature transformation. For combination with a base system to improve performance, complementary systems should perform reasonably well while tending to produce outputs that differ from those of the base system. Although it is difficult to balance these two somewhat opposing goals in conventional heuristic combination approaches, our framework provides a new objective function that makes it possible to adjust the balance within a sequential discriminative training criterion. We also describe how the proposed method relates to boosting methods. Experiments on a highly noisy medium-vocabulary speech recognition task (2nd CHiME challenge track 2) and an LVCSR task (Corpus of Spontaneous Japanese) show the effectiveness of the proposed method compared with a conventional system combination approach.
AB - This paper proposes a generalized discriminative training framework for system combination, which encompasses acoustic modeling (Gaussian mixture models and deep neural networks) and discriminative feature transformation. For combination with a base system to improve performance, complementary systems should perform reasonably well while tending to produce outputs that differ from those of the base system. Although it is difficult to balance these two somewhat opposing goals in conventional heuristic combination approaches, our framework provides a new objective function that makes it possible to adjust the balance within a sequential discriminative training criterion. We also describe how the proposed method relates to boosting methods. Experiments on a highly noisy medium-vocabulary speech recognition task (2nd CHiME challenge track 2) and an LVCSR task (Corpus of Spontaneous Japanese) show the effectiveness of the proposed method compared with a conventional system combination approach.
KW - boosting
KW - discriminative training
KW - margin training
KW - system combination
UR - http://www.scopus.com/inward/record.url?scp=84893650888&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84893650888&partnerID=8YFLogxK
U2 - 10.1109/ASRU.2013.6707703
DO - 10.1109/ASRU.2013.6707703
M3 - Conference contribution
AN - SCOPUS:84893650888
SN - 9781479927562
T3 - 2013 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2013 - Proceedings
SP - 43
EP - 48
BT - 2013 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2013 - Proceedings
T2 - 2013 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2013
Y2 - 8 December 2013 through 13 December 2013
ER -