A hybrid approach to enhance task portability of acoustic models in Chinese speech recognition

Jin Song Zhang, Shu Wu Zhang, Yoshinori Sagisaka, Satoshi Nakamura

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

3 Citations (Scopus)

Abstract

This paper presents our approach to enhancing the portability of acoustic models by mitigating the phonetic mismatch that arises when a new testing task differs substantially from the training data. The approach is a hybrid one: it combines knowledge-based context categorization, which generates a context-rich set of subword units, with data-driven acoustic model clustering at the level of the context category. Compared with the conventional approach, which relies only on phonetic decision tree based model clustering and unseen model generation, the new approach greatly improved the desired subword coverage for the new testing domain and achieved a 10.8% error rate reduction, measured on Chinese character accuracy, in the recognition experiments. Together with the effect of the newly adopted basic units of 9 glottal stops, we achieved a total error rate reduction of 23.5% in testing compared with the baseline system.
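The coverage effect described in the abstract can be illustrated with a minimal sketch. It assumes that subword units are triphone-like tuples and that each context phone maps to one broad category. The phone set, the category table, and the function names `coverage` and `to_category` are hypothetical and are not taken from the paper. The toy data is invented for illustration only and does not reproduce the paper's units or results.

```python
from typing import Dict, Iterable, Set, Tuple

# A context-dependent unit: (left context, base unit, right context).
Unit = Tuple[str, str, str]


def coverage(required: Iterable[Unit], available: Set[Unit]) -> float:
    """Fraction of required units that have a trained model available."""
    required = list(required)
    if not required:
        return 1.0
    hits = sum(1 for unit in required if unit in available)
    return hits / len(required)


def to_category(unit: Unit, category_of: Dict[str, str]) -> Unit:
    """Replace each context phone by its category; keep the base unit."""
    left, base, right = unit
    return (category_of.get(left, left), base, category_of.get(right, right))


# Hypothetical knowledge-based grouping of context phones.
CATEGORY_OF = {"b": "stop", "p": "stop", "d": "stop", "m": "nasal", "n": "nasal"}

# Units seen in training versus units needed by the new task (toy data).
trained = {("b", "a", "n"), ("m", "a", "p")}
needed = [
    ("b", "a", "n"),
    ("p", "a", "m"),
    ("d", "a", "n"),
    ("m", "a", "d"),
    ("n", "a", "m"),
]

# Coverage when contexts must match exactly.
exact_coverage = coverage(needed, trained)

# Coverage when contexts only need to match at the category level.
trained_categories = {to_category(u, CATEGORY_OF) for u in trained}
needed_categories = [to_category(u, CATEGORY_OF) for u in needed]
category_coverage = coverage(needed_categories, trained_categories)

print(f"exact-context coverage:    {exact_coverage:.2f}")
print(f"category-context coverage: {category_coverage:.2f}")
```

In this toy setting, more of the needed units find a trained counterpart once contexts are compared by category. The sketch covers only that coverage idea and leaves out the acoustic model clustering step.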

Original language: English
Title of host publication: EUROSPEECH 2001 - SCANDINAVIA - 7th European Conference on Speech Communication and Technology
Publisher: International Speech Communication Association
Pages: 1661-1664
Number of pages: 4
ISBN (Electronic): 8790834100, 9788790834104
Publication status: Published - 2001
Externally published: Yes
Event: 7th European Conference on Speech Communication and Technology - Scandinavia, EUROSPEECH 2001 - Aalborg, Denmark
Duration: 2001 Sep 3 to 2001 Sep 7

Other

Other: 7th European Conference on Speech Communication and Technology - Scandinavia, EUROSPEECH 2001
Country: Denmark
City: Aalborg
Period: 01/9/3 to 01/9/7

ASJC Scopus subject areas

  • Communication
  • Linguistics and Language
  • Computer Science Applications
  • Software

Cite this

Zhang, J. S., Zhang, S. W., Sagisaka, Y., & Nakamura, S. (2001). A hybrid approach to enhance task portability of acoustic models in Chinese speech recognition. In EUROSPEECH 2001 - SCANDINAVIA - 7th European Conference on Speech Communication and Technology (pp. 1661-1664). International Speech Communication Association.