Low-Resource Contextual Topic Identification on Speech

Chunxi Liu, Matthew Wiesner, Shinji Watanabe, Craig Harman, Jan Trmal, Najim Dehak, Sanjeev Khudanpur

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

In topic identification (topic ID) on real-world unstructured audio, an audio instance of variable topic shifts is first broken into sequential segments, and each segment is independently classified. We first present a general purpose method for topic ID on spoken segments in low-resource languages, using a cascade of universal acoustic modeling, translation lexicons to English, and English-language topic classification. Next, instead of classifying each segment independently, we demonstrate that exploring the contextual dependencies across sequential segments can provide large improvements. In particular, we propose an attention-based contextual model which is able to leverage the contexts in a selective manner. We test both our contextual and non-contextual models on four LORELEI languages, and on all but one our attention-based contextual model significantly outperforms the context-independent models.
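The idea of selectively leveraging neighboring segments can be illustrated with a small sketch. This is not the authors' model: the abstract does not specify the architecture, so the code below swaps in plain dot-product attention over fixed feature vectors. The function names (`attend_to_context`, `contextual_features`), the `window` parameter, and the use of the target segment as the attention query are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def attend_to_context(segments, index, window=2):
    """Weight the neighbors of segment `index` by similarity to it.

    `segments` is a list of equal-length feature vectors, one per
    sequential audio segment (hypothetical features, e.g. bag-of-words
    counts over English translations). Returns (weights, context).
    """
    lo = max(0, index - window)
    hi = min(len(segments), index + window + 1)
    neighbors = [segments[j] for j in range(lo, hi) if j != index]
    query = segments[index]
    if not neighbors:
        # No context available: return an all-zero context vector.
        return [], [0.0] * len(query)
    # Assumption: the target segment itself acts as the attention query.
    weights = softmax([dot(query, n) for n in neighbors])
    context = [
        sum(w * n[k] for w, n in zip(weights, neighbors))
        for k in range(len(query))
    ]
    return weights, context


def contextual_features(segments, index, window=2):
    """Concatenate a segment's own features with its attended context."""
    _, context = attend_to_context(segments, index, window)
    return list(segments[index]) + context
```

A downstream topic classifier would then take `contextual_features(segments, i)` in place of the raw features of segment `i`, so that neighbors resembling the target contribute more than unrelated ones.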

Original language: English
Title of host publication: 2018 IEEE Spoken Language Technology Workshop, SLT 2018 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 656-663
Number of pages: 8
ISBN (Electronic): 9781538643341
DOIs: https://doi.org/10.1109/SLT.2018.8639544
Publication status: Published - 2019 Feb 11
Externally published: Yes
Event: 2018 IEEE Spoken Language Technology Workshop, SLT 2018 - Athens, Greece
Duration: 2018 Dec 18 to 2018 Dec 21

Publication series

Name: 2018 IEEE Spoken Language Technology Workshop, SLT 2018 - Proceedings

Conference

Conference: 2018 IEEE Spoken Language Technology Workshop, SLT 2018
Country: Greece
City: Athens
Period: 18/12/18 to 18/12/21

Fingerprint

resources
language
acoustics
English language

Keywords

  • attention
  • recurrent neural networks
  • Topic identification
  • universal acoustic modeling

ASJC Scopus subject areas

  • Computer Vision and Pattern Recognition
  • Human-Computer Interaction
  • Linguistics and Language

Cite this

Liu, C., Wiesner, M., Watanabe, S., Harman, C., Trmal, J., Dehak, N., & Khudanpur, S. (2019). Low-Resource Contextual Topic Identification on Speech. In 2018 IEEE Spoken Language Technology Workshop, SLT 2018 - Proceedings (pp. 656-663). [8639544] (2018 IEEE Spoken Language Technology Workshop, SLT 2018 - Proceedings). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/SLT.2018.8639544

@inproceedings{908ae5f1b72540cd8eb0d73b259f70cb,
title = "Low-Resource Contextual Topic Identification on Speech",
keywords = "attention, recurrent neural networks, Topic identification, universal acoustic modeling",
author = "Chunxi Liu and Matthew Wiesner and Shinji Watanabe and Craig Harman and Jan Trmal and Najim Dehak and Sanjeev Khudanpur",
year = "2019",
month = "2",
day = "11",
doi = "10.1109/SLT.2018.8639544",
language = "English",
series = "2018 IEEE Spoken Language Technology Workshop, SLT 2018 - Proceedings",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
pages = "656--663",
booktitle = "2018 IEEE Spoken Language Technology Workshop, SLT 2018 - Proceedings",

}

TY - GEN
T1 - Low-Resource Contextual Topic Identification on Speech
AU - Liu, Chunxi
AU - Wiesner, Matthew
AU - Watanabe, Shinji
AU - Harman, Craig
AU - Trmal, Jan
AU - Dehak, Najim
AU - Khudanpur, Sanjeev
PY - 2019/2/11
Y1 - 2019/2/11
KW - attention
KW - recurrent neural networks
KW - Topic identification
KW - universal acoustic modeling
UR - http://www.scopus.com/inward/record.url?scp=85063079080&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85063079080&partnerID=8YFLogxK
U2 - 10.1109/SLT.2018.8639544
DO - 10.1109/SLT.2018.8639544
M3 - Conference contribution
AN - SCOPUS:85063079080
T3 - 2018 IEEE Spoken Language Technology Workshop, SLT 2018 - Proceedings
SP - 656
EP - 663
BT - 2018 IEEE Spoken Language Technology Workshop, SLT 2018 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
ER -