Improving singing aid system for laryngectomees with statistical voice conversion and VAE-SPACE

Li Li, Tomoki Toda, Kazuho Morikawa, Kazuhiro Kobayashi, Shoji Makino

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

Abstract

This paper proposes an improved singing aid system for laryngectomees that converts electrolaryngeal (EL) speech produced using an electrolarynx into a more natural-sounding singing voice. Although the previously proposed system, which employs a noise suppression process and a rule-based pitch control approach, has achieved preliminary success in converting EL speech into a singing voice, two major limitations remain. First, the converted singing voice still sounds mechanical and unnatural owing to the adverse impact of spectrograms extracted from EL speech, which also limits the effect of pitch control. Second, the capability and flexibility of the rule-based pitch control in modeling various singing styles are insufficient, causing the converted singing voices to lack variety. To address these limitations, this paper proposes an improved system that uses 1) a statistical voice conversion approach to convert spectrograms extracted from EL speech into those of natural speech and 2) a deep generative model-based approach called VAE-SPACE for pitch modification, which generates pitch patterns in a data-driven manner instead of following manually designed rules. The experimental results revealed that 1) the conversion of spectrograms was effective in improving the naturalness of singing voices, and 2) the statistical pitch control approach achieved results comparable to those of the rule-based approach, which was very carefully designed to be specialized for singing.

Original language: English
Title of host publication: Proceedings of the 20th International Society for Music Information Retrieval Conference, ISMIR 2019
Editors: Arthur Flexer, Geoffroy Peeters, Julian Urbano, Anja Volk
Publisher: International Society for Music Information Retrieval
Pages: 784-790
Number of pages: 7
ISBN (Electronic): 9781732729919
Publication status: Published - 2019
Externally published: Yes
Event: 20th International Society for Music Information Retrieval Conference, ISMIR 2019 - Delft, Netherlands
Duration: 2019 Nov 4 - 2019 Nov 8

Publication series

Name: Proceedings of the 20th International Society for Music Information Retrieval Conference, ISMIR 2019

Conference

Conference: 20th International Society for Music Information Retrieval Conference, ISMIR 2019
Country/Territory: Netherlands
City: Delft
Period: 19/11/4 - 19/11/8

ASJC Scopus subject areas

  • Music
  • Information Systems
