Black box optimization for automatic speech recognition

Shinji Watanabe, Jonathan Le Roux

研究成果: Conference contribution

16 被引用数 (Scopus)

抄録

State-of-the-art automatic speech recognition (ASR) systems are very complex, combining multiple techniques and involving many types of tuning parameters (e.g., numbers of states and Gaussians in HMMs, numbers of neurons/layers and learning rates in neural networks, etc.). To reach optimal performance in such systems, deep understanding and expertise of each component is necessary, thus limiting the development of ASR systems to skilled experts. To overcome the problem, this paper studies the use of black box optimization, which automatically tunes systems without any prior knowledge. We consider an ASR system as a function with tuning parameters as input and speech recognition performance (e.g., word accuracy) as output, and we investigate two probabilistic black box optimization techniques: Covariance Mean Adaptation Evolution Strategy (CMA-ES) and Bayesian optimization using Gaussian process. Middle-vocabulary speech recognition experiments show the effectiveness of black box optimization, as performance approaching that of fine-tuned systems obtained by experts and/or outperforming that of sub-optimal systems can be automatically obtained.

本文言語English
ホスト出版物のタイトル2014 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2014
出版社Institute of Electrical and Electronics Engineers Inc.
ページ3256-3260
ページ数5
ISBN(印刷版)9781479928927
DOI
出版ステータスPublished - 2014 1 1
外部発表はい
イベント2014 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2014 - Florence, Italy
継続期間: 2014 5 42014 5 9

出版物シリーズ

名前ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
ISSN(印刷版)1520-6149

Conference

Conference2014 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2014
CountryItaly
CityFlorence
Period14/5/414/5/9

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering

フィンガープリント 「Black box optimization for automatic speech recognition」の研究トピックを掘り下げます。これらがまとまってユニークなフィンガープリントを構成します。

引用スタイル