Sequence summarizing neural network for speaker adaptation

Karel Vesely, Shinji Watanabe, Katerina Zmolikova, Martin Karafiat, Lukas Burget, Jan Honza Cernocky

Research output: Conference contribution

44 Citations (Scopus)

Abstract

In this paper, we propose a DNN adaptation technique in which the i-vector extractor is replaced by a Sequence Summarizing Neural Network (SSNN). Similarly to an i-vector extractor, the SSNN produces a "summary vector" representing an acoustic summary of an utterance. This vector is then appended to the input of the main network, and both networks are trained jointly, optimizing a single loss function. The i-vector and SSNN speaker adaptation methods are compared on AMI meeting data. The results show comparable performance of the two techniques on an FBANK system with frame-classification training. Moreover, appending both the i-vector and the summary vector to the FBANK features leads to an additional improvement, comparable to the performance of an FMLLR-adapted DNN system.
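A minimal sketch of the forward pass described in the abstract, assuming a single-layer summarizing network with mean pooling over the utterance and a one-hidden-layer main network. All layer sizes and weights here are hypothetical placeholders for illustration; the paper's actual architecture and training details may differ.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

# Hypothetical dimensions (not taken from the paper).
feat_dim, summary_dim, hidden_dim, n_classes, n_frames = 40, 100, 128, 10, 200

# Sequence-summarizing network: per-frame projection, then mean pooling
# over all frames of the utterance yields one "summary vector".
W_s = rng.standard_normal((feat_dim, summary_dim)) * 0.01

# Main network: consumes each FBANK frame concatenated with the summary
# vector and outputs frame-level class posteriors.
W_1 = rng.standard_normal((feat_dim + summary_dim, hidden_dim)) * 0.01
W_2 = rng.standard_normal((hidden_dim, n_classes)) * 0.01

def forward(utterance):
    """utterance: (n_frames, feat_dim) matrix of FBANK features."""
    summary = relu(utterance @ W_s).mean(axis=0)       # (summary_dim,)
    tiled = np.tile(summary, (utterance.shape[0], 1))  # same vector on every frame
    x = np.concatenate([utterance, tiled], axis=1)     # appended to the input
    h = relu(x @ W_1)
    logits = h @ W_2
    # Softmax over classes; in joint training, a single cross-entropy loss
    # on these posteriors would back-propagate into both networks.
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

posts = forward(rng.standard_normal((n_frames, feat_dim)))
print(posts.shape)  # (200, 10)
```

Because the summary vector is a differentiable function of the same utterance, gradients from the frame-classification loss flow through both the main and the summarizing network, which is what distinguishes this scheme from a separately trained i-vector extractor.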

Original language: English
Host publication title: 2016 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2016 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 5315-5319
Number of pages: 5
ISBN (electronic): 9781479999880
DOI
Publication status: Published - 18 May 2016
Externally published: Yes
Event: 41st IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2016 - Shanghai, China
Duration: 20 Mar 2016 - 25 Mar 2016

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: 2016-May
ISSN (print): 1520-6149

Other

Other: 41st IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2016
Country/Territory: China
City: Shanghai
Period: 16/3/20 - 16/3/25

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering

