Sequence summarizing neural network for speaker adaptation

Karel Vesely, Shinji Watanabe, Katerina Zmolikova, Martin Karafiat, Lukas Burget, Jan Honza Cernocky

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

32 Citations (Scopus)

Abstract

In this paper, we propose a DNN adaptation technique in which the i-vector extractor is replaced by a Sequence Summarizing Neural Network (SSNN). Like an i-vector extractor, the SSNN produces a "summary vector" that represents an acoustic summary of an utterance. This vector is appended to the input of the main network, and both networks are trained jointly to optimize a single loss function. The i-vector and SSNN speaker adaptation methods are compared on AMI meeting data. The results show that the two techniques perform comparably on an FBANK system with frame-classification training. Moreover, appending both the i-vector and the "summary vector" to the FBANK features yields a further improvement, reaching performance comparable to that of an FMLLR-adapted DNN system.
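The data flow described in the abstract can be illustrated with a minimal forward-pass sketch. It is not the authors' implementation. The abstract does not say how the summary is computed, so the sketch assumes a single tanh layer applied per frame followed by a mean over time. All names (`summary_vector`, `forward`, `frame_loss`) and all layer sizes are hypothetical.

```python
import math
import random

def affine(x, W, b):
    """Return W @ x + b for a vector x given as a list of floats."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases for one layer."""
    W = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return W, [0.0] * n_out

def summary_vector(frames, layer):
    """Assumed summarizer: one tanh layer per frame, then a mean over time."""
    W, b = layer
    per_frame = [[math.tanh(v) for v in affine(f, W, b)] for f in frames]
    return [sum(col) / len(per_frame) for col in zip(*per_frame)]

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    total = sum(e)
    return [v / total for v in e]

def forward(frames, ssnn_layer, main_layer):
    """Append the utterance-level summary to every frame, then classify each frame."""
    s = summary_vector(frames, ssnn_layer)
    W, b = main_layer
    return [softmax(affine(f + s, W, b)) for f in frames]

def frame_loss(posteriors, targets):
    """Mean frame-level cross-entropy: the single loss shared by both networks."""
    return -sum(math.log(p[t]) for p, t in zip(posteriors, targets)) / len(targets)

# Toy sizes and random data, chosen only for illustration.
rng = random.Random(0)
FEAT_DIM, SUMMARY_DIM, NUM_CLASSES, NUM_FRAMES = 4, 3, 5, 6
ssnn_layer = init_layer(FEAT_DIM, SUMMARY_DIM, rng)
main_layer = init_layer(FEAT_DIM + SUMMARY_DIM, NUM_CLASSES, rng)
utterance = [[rng.gauss(0, 1) for _ in range(FEAT_DIM)] for _ in range(NUM_FRAMES)]
targets = [rng.randrange(NUM_CLASSES) for _ in range(NUM_FRAMES)]

posteriors = forward(utterance, ssnn_layer, main_layer)
loss = frame_loss(posteriors, targets)
```

Because the summary vector feeds the main network's input, the gradient of `loss` would reach the weights in `ssnn_layer` as well as those in `main_layer`. That dependency is what training both networks with one loss function relies on. The sketch leaves out the backward pass, which an automatic differentiation framework would normally provide.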

Original language: English
Title of host publication: 2016 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2016 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 5315-5319
Number of pages: 5
Volume: 2016-May
ISBN (Electronic): 9781479999880
DOIs: https://doi.org/10.1109/ICASSP.2016.7472692
Publication status: Published - 2016 May 18
Externally published: Yes
Event: 41st IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2016 - Shanghai, China
Duration: 2016 Mar 20 to 2016 Mar 25

Other

Other: 41st IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2016
Country: China
City: Shanghai
Period: 16/3/20 to 16/3/25

Keywords

  • adaptation
  • DNN
  • i-vector
  • sequence summary
  • SSNN

ASJC Scopus subject areas

  • Signal Processing
  • Software
  • Electrical and Electronic Engineering

Cite this

Vesely, K., Watanabe, S., Zmolikova, K., Karafiat, M., Burget, L., & Cernocky, J. H. (2016). Sequence summarizing neural network for speaker adaptation. In 2016 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2016 - Proceedings (Vol. 2016-May, pp. 5315-5319). [7472692] Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICASSP.2016.7472692