Cross-Lingual Transfer for Speech Processing Using Acoustic Language Similarity

Peter Wu*, Jiatong Shi, Yifan Zhong, Shinji Watanabe, Alan W. Black

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Speech processing systems currently do not support the vast majority of languages, in part due to the lack of data in low-resource languages. Cross-lingual transfer offers a compelling way to help bridge this digital divide by incorporating high-resource data into low-resource systems. Current cross-lingual algorithms have shown success in text-based tasks and speech-related tasks over some low-resource languages. However, scaling up speech systems to support hundreds of low-resource languages remains unsolved. To help bridge this gap, we propose a language similarity approach that can efficiently identify acoustic cross-lingual transfer pairs across hundreds of languages. We demonstrate the effectiveness of our approach in language family classification, speech recognition, and speech synthesis tasks.

Original language: English
Title of host publication: 2021 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2021 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1050-1057
Number of pages: 8
ISBN (Electronic): 9781665437394
Publication status: Published - 2021
Externally published: Yes
Event: 2021 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2021 - Cartagena, Colombia
Duration: 2021 Dec 13 to 2021 Dec 17

Publication series

Name: 2021 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2021 - Proceedings

Conference

Conference: 2021 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2021
Country/Territory: Colombia
City: Cartagena
Period: 21/12/13 to 21/12/17

Keywords

  • ASR
  • cross-lingual
  • TTS
  • zero-shot

ASJC Scopus subject areas

  • Computer Vision and Pattern Recognition
  • Signal Processing
  • Linguistics and Language
