Auditory stream segregation in auditory scene analysis with a multi-agent system

Tomohiro Nakatani, Hiroshi G. Okuno, Takeshi Kawabata

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

14 Citations (Scopus)

Abstract

We propose a novel approach to auditory stream segregation, which extracts individual sounds (auditory streams) from a mixture of sounds in auditory scene analysis. The HBSS (Harmonic-Based Stream Segregation) system is designed and implemented as a multi-agent system. HBSS uses only harmonics as a clue to segregation and extracts auditory streams incrementally. When the tracer-generator agent detects a new sound, it spawns a tracer agent, which extracts an auditory stream by tracing its harmonic structure. The tracer sends a feedforward signal so that the generator and the other tracers do not work on the stream it is already tracing. Redundant and ghost tracers can degrade the quality of segregation; HBSS copes with this problem by introducing monitor agents, which detect and eliminate such tracers. HBSS can segregate two streams from a mixture of a man's and a woman's speech. Speech or other sounds can be readily resynthesized from the corresponding streams. Additionally, HBSS can be easily extended by adding agents with new capabilities. HBSS can be considered the first step toward computational auditory scene analysis.
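
The agent roles named in the abstract can be illustrated with a toy sketch. This is not the authors' implementation. It assumes each input frame has already been reduced to a list of candidate fundamental frequencies in Hz, and every name and threshold (Tracer, monitor, segregate, TOLERANCE_HZ, MAX_MISSES, MIN_STREAM_FRAMES) is hypothetical. The feedforward signal is approximated by removing a claimed pitch from the frame before the generator sees it, and "ghost" is approximated as a tracer that loses its pitch almost immediately.

```python
from dataclasses import dataclass, field

# All thresholds below are illustrative assumptions, not values from the paper.
TOLERANCE_HZ = 5.0       # how far a pitch may drift and still match a tracer
MAX_MISSES = 2           # unmatched frames tolerated before a tracer is retired
MIN_STREAM_FRAMES = 2    # shorter streams are treated as ghosts


@dataclass
class Tracer:
    """Follows one fundamental frequency from frame to frame."""
    f0: float
    stream: list = field(default_factory=list)  # (frame_index, f0) pairs
    misses: int = 0

    def trace(self, frame_index, pitches):
        """Claim the closest pitch within tolerance; return it, or None."""
        candidates = [p for p in pitches if abs(p - self.f0) <= TOLERANCE_HZ]
        if not candidates:
            self.misses += 1
            return None
        best = min(candidates, key=lambda p: abs(p - self.f0))
        self.f0 = best
        self.misses = 0
        self.stream.append((frame_index, best))
        return best


def monitor(tracers, finished):
    """Retire lost tracers and drop redundant ones; return the survivors."""
    kept = []
    for t in tracers:
        if t.misses > MAX_MISSES:
            if len(t.stream) >= MIN_STREAM_FRAMES:
                finished.append(t)  # a real stream that has ended
            continue                # otherwise a ghost: discard it
        if any(abs(t.f0 - k.f0) <= TOLERANCE_HZ for k in kept):
            continue                # redundant: an earlier tracer covers it
        kept.append(t)
    return kept


def segregate(frames):
    """Return one list of (frame_index, f0) pairs per extracted stream."""
    tracers, finished = [], []
    for i, pitches in enumerate(frames):
        remaining = list(pitches)
        for t in tracers:
            claimed = t.trace(i, remaining)
            if claimed is not None:
                remaining.remove(claimed)  # stand-in for the feedforward signal
        for p in remaining:                # generator: unclaimed pitch, new tracer
            tracers.append(Tracer(f0=p, stream=[(i, p)]))
        tracers = monitor(tracers, finished)
    return [t.stream for t in finished + tracers]
```

For example, segregate([[120.0], [121.0, 220.0], [122.0, 222.0], [123.0, 221.0]]) yields two streams, one following the lower pitch and one following the higher pitch. Keeping the monitor separate from the tracers mirrors the abstract's point that new capabilities can be added as new agents.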

Original language: English
Title of host publication: Proceedings of the National Conference on Artificial Intelligence
Place of Publication: Menlo Park, CA, United States
Publisher: AAAI
Pages: 100-107
Number of pages: 8
Volume: 1
Publication status: Published - 1994
Externally published: Yes
Event: Proceedings of the 12th National Conference on Artificial Intelligence. Part 1 (of 2) - Seattle, WA, USA
Duration: 1994 Jul 31 to 1994 Aug 4

ASJC Scopus subject areas

  • Software
