Unsupervised Training of Sequential Neural Beamformer Using Coarsely-separated and Non-separated Signals

Kohei Saijo, Tetsuji Ogawa

Research output: Contribution to journalConference articlepeer-review

Abstract

We present an unsupervised training method of the sequential neural beamformer (Seq-BF) using coarsely-separated and non-separated supervisory signals. The signal coarsely separated by blind source separation (BSS) has been used for training neural separators in an unsupervised manner. However, the performance is limited due to distortions in the supervision. In contrast, remix-cycle-consistent learning (RCCL) enables a separator to be trained on distortion-free observed mixtures by making the remixed mixtures obtained by repeatedly separating and remixing the two different mixtures closer to the original mixtures. Still, training with RCCL from scratch often falls into a trivial solution, i.e., not separating signals. The present study provides a novel unsupervised learning algorithm for the Seq-BF with two stacked neural separators, in which the separators are pre-trained using the BSS outputs and then fine-tuned with RCCL. Such configuration compensates for the shortcomings of both approaches: the guiding mechanism in Seq-BF accelerates separation to exceed BSS performance, thereby stabilizing RCCL. Experimental comparisons demonstrated that the proposed unsupervised learning achieved performance comparable to supervised learning (0.4 point difference in word error rate).

Original languageEnglish
Pages (from-to)251-255
Number of pages5
JournalProceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Volume2022-September
DOIs
Publication statusPublished - 2022
Event23rd Annual Conference of the International Speech Communication Association, INTERSPEECH 2022 - Incheon, Korea, Republic of
Duration: 2022 Sept 182022 Sept 22

Keywords

  • remix-cycle-consistent learning
  • sequential neural beamforming
  • teacher-student learning
  • unsupervised speech separation

ASJC Scopus subject areas

  • Language and Linguistics
  • Human-Computer Interaction
  • Signal Processing
  • Software
  • Modelling and Simulation

Fingerprint

Dive into the research topics of 'Unsupervised Training of Sequential Neural Beamformer Using Coarsely-separated and Non-separated Signals'. Together they form a unique fingerprint.

Cite this