Cepstral smoothing of separated signals for underdetermined speech separation

Yumi Ansai, Shoko Araki, Shoji Makino, Tomohiro Nakatani, Takeshi Yamada, Atsushi Nakamura, Nobuhiko Kitawaki

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

3 Citations (Scopus)

Abstract

Musical noise is a typical problem in blind source separation based on time-frequency masking. The cepstral smoothing of spectral masks (CSM) was recently proposed to address it. Building on the idea of smoothing in the cepstral domain, this paper proposes the cepstral smoothing of separated signals (CSS), on the assumption that a cepstral representation reflects the characteristics of speech signals better than it reflects those of masks (or filter gains). We also report a comparative evaluation of CSM and CSS against other musical noise reduction methods. Our experimental results show that CSM is effective for musical noise reduction but leaves the target speech relatively distorted. In contrast, the proposed CSS produced less distorted target signals while achieving the same musical noise reduction as CSM.
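To make the idea of smoothing in the cepstral domain concrete, the following is a minimal sketch and not the authors' implementation. It assumes a one-sided magnitude spectrogram of an already separated signal and applies a first-order recursive average over time to each cepstral coefficient, with little smoothing at low quefrencies (spectral envelope) and stronger smoothing at high quefrencies. The function name `cepstral_smooth`, the parameters `n_low`, `beta_low`, `beta_high`, and their default values are all hypothetical; the abstract does not specify the smoothing rule.

```python
import numpy as np


def cepstral_smooth(mag, n_low=8, beta_low=0.0, beta_high=0.8, eps=1e-10):
    """Smooth a magnitude spectrogram over time in the cepstral domain.

    mag: array of shape (frames, bins), one-sided, bins = n_fft // 2 + 1.
    All parameter names and defaults are illustrative assumptions.
    """
    mag = np.asarray(mag, dtype=float)
    n_fft = 2 * (mag.shape[1] - 1)

    # Real cepstrum of each frame: inverse real FFT of the log magnitude.
    ceps = np.fft.irfft(np.log(np.maximum(mag, eps)), n=n_fft, axis=1)

    # Per-quefrency smoothing constants, mirrored so the cepstrum stays
    # symmetric and the reconstructed log spectrum stays real.
    beta = np.full(n_fft, beta_high)
    beta[:n_low] = beta_low
    beta[n_fft - n_low + 1:] = beta_low

    # First-order recursive averaging along the time (frame) axis.
    smoothed = np.empty_like(ceps)
    smoothed[0] = ceps[0]
    for t in range(1, ceps.shape[0]):
        smoothed[t] = beta * smoothed[t - 1] + (1.0 - beta) * ceps[t]

    # Back to the magnitude domain.
    return np.exp(np.fft.rfft(smoothed, axis=1).real)
```

Under this sketch, CSS would correspond to passing the magnitude spectrogram of the masked (separated) signal as `mag`, whereas CSM would pass the mask values themselves; the smoothed magnitude would then be recombined with the phase of the separated signal before resynthesis.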

Original language: English
Title of host publication: ISCAS 2010 - 2010 IEEE International Symposium on Circuits and Systems
Subtitle of host publication: Nano-Bio Circuit Fabrics and Systems
Pages: 2506-2509
Number of pages: 4
Publication status: Published - 2010
Externally published: Yes
Event: 2010 IEEE International Symposium on Circuits and Systems: Nano-Bio Circuit Fabrics and Systems, ISCAS 2010 - Paris, France
Duration: 2010 May 30 → 2010 Jun 2

Publication series

Name: ISCAS 2010 - 2010 IEEE International Symposium on Circuits and Systems: Nano-Bio Circuit Fabrics and Systems

Conference

Conference: 2010 IEEE International Symposium on Circuits and Systems: Nano-Bio Circuit Fabrics and Systems, ISCAS 2010
Country: France
City: Paris
Period: 10/5/30 → 10/6/2

ASJC Scopus subject areas

  • Hardware and Architecture
  • Electrical and Electronic Engineering
