Promising Accurate Prefix Boosting for Sequence-to-sequence ASR

Murali Karthick Baskar, Lukas Burget, Shinji Watanabe, Martin Karafiat, Takaaki Hori, Jan Honza Cernocky

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

In this paper, we present promising accurate prefix boosting (PAPB), a discriminative training technique for attention-based sequence-to-sequence (seq2seq) ASR. PAPB is devised to unify the training and testing schemes effectively. The training procedure maximizes the score of each correct partial sequence (prefix) obtained during beam search relative to the competing hypotheses. The training objective also includes minimization of the token (character) error rate. PAPB shows its efficacy by achieving 10.8% and 3.8% WER with and without an external RNNLM, respectively, on the Wall Street Journal dataset.
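
The abstract's idea of scoring a correct prefix against competing beam hypotheses, with a token error cost, can be illustrated with a softmax-margin style loss. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function names `edit_distance` and `softmax_margin_prefix_loss`, the use of raw edit distance as the cost, and the toy scores are all hypothetical choices made for illustration.

```python
import math


def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences (here: characters)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def softmax_margin_prefix_loss(ref_prefix, ref_score, beam):
    """Softmax-margin loss for one decoding step (illustrative assumption).

    ref_prefix: the correct partial sequence at this step.
    ref_score:  the model's log-score for ref_prefix.
    beam:       list of (prefix, log_score) pairs kept by beam search.

    Each competing prefix has its score raised by its edit-distance cost,
    so hypotheses with more token errors are pushed down harder.
    """
    terms = [ref_score]  # the reference itself has zero cost
    for prefix, score in beam:
        if prefix != ref_prefix:
            terms.append(score + edit_distance(ref_prefix, prefix))
    # Numerically stable log-sum-exp over the cost-augmented scores.
    m = max(terms)
    log_z = m + math.log(sum(math.exp(t - m) for t in terms))
    return log_z - ref_score


# Toy example: one competitor with a single character error.
loss = softmax_margin_prefix_loss("cat", -1.0, [("cat", -1.0), ("cut", -2.0)])
```

In this sketch the loss is zero when the beam holds no competitors and grows as erroneous prefixes score close to, or above, the correct one. A full training loop would sum such a term over decoding steps and backpropagate through the model scores, which is outside the scope of this example.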

Original language: English
Title of host publication: 2019 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 5646-5650
Number of pages: 5
ISBN (Electronic): 9781479981311
DOI: https://doi.org/10.1109/ICASSP.2019.8682782
Publication status: Published - 2019 May 1
Event: 44th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019 - Brighton, United Kingdom
Duration: 2019 May 12 to 2019 May 17

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: 2019-May
ISSN (Print): 1520-6149

Conference

Conference: 44th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019
Country: United Kingdom
City: Brighton
Period: 19/5/12 to 19/5/17

Keywords

  • Attention models
  • Beam search training
  • Discriminative training
  • Sequence learning
  • Softmax-margin

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering

Cite this

Baskar, M. K., Burget, L., Watanabe, S., Karafiat, M., Hori, T., & Cernocky, J. H. (2019). Promising Accurate Prefix Boosting for Sequence-to-sequence ASR. In 2019 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019 - Proceedings (pp. 5646-5650). [8682782] (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings; Vol. 2019-May). Institute of Electrical and Electronics Engineers Inc.. https://doi.org/10.1109/ICASSP.2019.8682782
