Blind extraction of dominant target sources using ICA and time-frequency masking

Hiroshi Sawada*, Shoko Araki, Ryo Mukai, Shoji Makino

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

81 Citations (Scopus)

Abstract

This paper presents a method for enhancing target sources of interest and suppressing other interference sources. The target sources are assumed to be close to sensors, to have dominant powers at these sensors, and to have non-Gaussianity. The enhancement is performed blindly, i.e., without knowing the position and active time of each source. We consider a general case where the total number of sources is larger than the number of sensors, and neither the number of target sources nor the total number of sources is known. The method is based on a two-stage process where independent component analysis (ICA) is first employed in each frequency bin and then time-frequency masking is used to improve the performance further. We propose a new sophisticated method for deciding the number of target sources and then selecting their frequency components. We also propose a new criterion for specifying time-frequency masks. Experimental results for simulated cocktail party situations in a room, whose reverberation time was 130 ms, are presented to show the effectiveness and characteristics of the proposed method.
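The second stage described above (time-frequency masking applied to an ICA output) can be illustrated with a minimal sketch. The abstract does not state the masking criterion, so the rule below is an assumption made purely for illustration: within one frequency bin, each observation vector is compared with a basis vector for the target, and the ICA output is attenuated when the angle between them is large. The function names, the sigmoid shape, and the parameters `theta_t` and `g` are hypothetical and are not taken from the paper.

```python
import math


def cosine_similarity(a, x):
    """Return |a^H x| / (||a|| ||x||) for two complex vectors given as lists."""
    inner = sum(ai.conjugate() * xi for ai, xi in zip(a, x))
    norm_a = math.sqrt(sum(abs(ai) ** 2 for ai in a))
    norm_x = math.sqrt(sum(abs(xi) ** 2 for xi in x))
    if norm_a == 0.0 or norm_x == 0.0:
        return 0.0
    # Clamp to 1.0 so rounding error cannot push acos out of its domain.
    return min(abs(inner) / (norm_a * norm_x), 1.0)


def soft_mask(a, x, theta_t=math.pi / 4, g=20.0):
    """Assumed mask: a sigmoid of the angle between basis vector a and sample x.

    theta_t (transition angle) and g (steepness) are illustrative values.
    """
    theta = math.acos(cosine_similarity(a, x))
    return 1.0 / (1.0 + math.exp(g * (theta - theta_t)))


def apply_mask(y, a, xs, theta_t=math.pi / 4, g=20.0):
    """Scale each ICA output sample y[t] by the mask computed from xs[t]."""
    return [soft_mask(a, x, theta_t, g) * yt for yt, x in zip(y, xs)]


# One frequency bin, two sensors, two time frames (toy values).
a = [1 + 0j, 1j]                           # assumed basis vector of the target
xs = [[2 + 0j, 2j], [1 + 0j, -1j]]         # frame 0 aligned with a, frame 1 orthogonal
y = [1 + 1j, 1 + 1j]                       # ICA output for the target in this bin
masked = apply_mask(y, a, xs)
```

In this toy case the first frame is kept almost unchanged and the second is strongly attenuated, which is the intended effect of masking residual interference that a linear ICA stage cannot remove when there are more sources than sensors.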

Original language: English
Article number: 1709904
Pages (from-to): 2165-2173
Number of pages: 9
Journal: IEEE Transactions on Audio, Speech and Language Processing
Volume: 14
Issue number: 6
Publication status: Published - 2006 Nov
Externally published: Yes

Keywords

  • Blind source extraction
  • Blind source separation (BSS)
  • Convolutive mixture
  • Frequency domain
  • Independent component analysis
  • Permutation problem
  • Time-frequency masking

ASJC Scopus subject areas

  • Acoustics and Ultrasonics
  • Electrical and Electronic Engineering
