We are surrounded by a wide variety of environmental sounds in daily life. Environmental sound recognition has recently attracted considerable attention as a means of understanding our surroundings. However, because environmental sounds usually originate from multiple sound sources, they are difficult to recognize accurately. Separating the sound sources as a pre-processing step before recognition would make recognition easier and more accurate. We assume that the recording devices are mobile devices, in which monaural microphones are widely installed. This paper therefore proposes a method for monaural sound source separation of environmental sounds. The method separates monaural sound sources by two-phase clustering based on non-negative matrix factorization (NMF), and uses the time-variant gain feature as an attribute of an environmental sound to make the separation more efficient.
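As a rough illustration of the NMF step mentioned in the abstract, the following Python sketch factorizes a toy magnitude spectrogram V into spectral bases W and time-varying gains H, so that V is approximately W H. It is only a sketch under stated assumptions: it uses the standard multiplicative updates for the Euclidean cost, and the function name nmf, the synthetic signals, and all parameter values are hypothetical choices made for illustration. The two-phase clustering is not shown, because the abstract does not specify how it is carried out.

```python
import numpy as np


def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Factorize a non-negative matrix V (freq x time) as W @ H.

    Uses multiplicative updates for the Euclidean cost (an assumed choice;
    the abstract does not state which NMF cost function is used).
    W holds one spectral basis per column, H one gain curve per row.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_time = V.shape
    W = rng.random((n_freq, rank)) + eps
    H = rng.random((rank, n_time)) + eps
    for _ in range(n_iter):
        # Update the gains with the bases fixed, then the bases with the gains fixed.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


def relative_error(V, W, H):
    """Frobenius-norm reconstruction error relative to the norm of V."""
    return np.linalg.norm(V - W @ H) / np.linalg.norm(V)


# Toy magnitude spectrogram built from two synthetic sources (illustrative only).
n_freq, n_time = 64, 100
f = np.arange(n_freq)
t = np.arange(n_time)
basis_a = np.exp(-0.5 * ((f - 10) / 2.0) ** 2)   # narrow low-frequency band
basis_b = np.exp(-0.5 * ((f - 40) / 3.0) ** 2)   # wider mid-frequency band
gain_a = (t % 20 < 5).astype(float)              # intermittent, burst-like gain
gain_b = 0.5 + 0.5 * np.sin(2 * np.pi * t / 50)  # slowly varying gain
V = np.outer(basis_a, gain_a) + np.outer(basis_b, gain_b)

W, H = nmf(V, rank=2)
err_final = relative_error(V, W, H)
err_first = relative_error(V, *nmf(V, rank=2, n_iter=1))
print(f"relative error after 1 iteration:   {err_first:.4f}")
print(f"relative error after 200 iterations: {err_final:.4f}")
```

In this sketch each row of H describes how the gain of one component changes over time. A time-variant gain feature of this kind could then serve as an attribute for grouping components that belong to the same source, although the grouping rule itself would have to follow the method described in the paper.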