Challenges in deploying a microphone array to localize and separate sound sources in real auditory scenes

Yoshiaki Bando, Takuma Otsuka, Katsutoshi Itoyama, Kazuyoshi Yoshii, Yoko Sasaki, Satoshi Kagami, Hiroshi G. Okuno

    Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

    3 Citations (Scopus)

    Abstract

    Analyzing the auditory scene of real environments is challenging, partly because an unknown number and variety of sound sources are observed at the same time, and partly because these sounds reach the microphones at significantly different sound pressure levels. These problems are difficult even for state-of-the-art sound source localization and separation methods. In this paper, we exploit two such methods that use a microphone array: (1) Bayesian nonparametric microphone array processing (BNP-MAP), which can separate and localize sound sources when the number of sources is unspecified, and (2) the robot audition software 'HARK', which can separate and localize sound sources in real time. Through experimentation, we found that BNP-MAP is more robust against differences in the sound pressure levels of the source signals and against the spatial closeness of source positions. Experiments analyzing real scenes of human conversations recorded in a large exhibition hall and bird calls recorded in a natural park demonstrate the efficacy and applicability of BNP-MAP.
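
The level-difference problem named in the abstract can be illustrated with a toy example. The sketch below is neither BNP-MAP nor HARK: it is a plain narrowband delay-and-sum beamformer, swapped in only because it is short. It assumes a noiseless, far-field, eight-microphone uniform linear array with half-wavelength spacing, and every function name (`steering_vector`, `steered_power`, `localize`, `mix`) is hypothetical.

```python
import cmath
import math


def steering_vector(angle_deg, num_mics, spacing_wavelengths=0.5):
    """Far-field narrowband steering vector of a uniform linear array (assumed geometry)."""
    phase_step = 2.0 * math.pi * spacing_wavelengths * math.sin(math.radians(angle_deg))
    return [cmath.exp(1j * phase_step * m) for m in range(num_mics)]


def steered_power(snapshot, angle_deg):
    """Delay-and-sum output power when the array is steered toward angle_deg."""
    weights = steering_vector(angle_deg, len(snapshot))
    output = sum(w.conjugate() * x for w, x in zip(weights, snapshot))
    return abs(output) ** 2


def localize(snapshot, candidates=range(-90, 91)):
    """Return the candidate angle (degrees) with the largest steered power."""
    return max(candidates, key=lambda angle: steered_power(snapshot, angle))


def mix(sources, num_mics):
    """Sum the array responses of (angle_deg, amplitude) sources into one snapshot."""
    snapshot = [0j] * num_mics
    for angle_deg, amplitude in sources:
        for m, response in enumerate(steering_vector(angle_deg, num_mics)):
            snapshot[m] += amplitude * response
    return snapshot


# One loud source, then the same source plus a source ten times weaker in amplitude.
loud_only = mix([(20, 1.0)], num_mics=8)
loud_and_quiet = mix([(20, 1.0), (-40, 0.1)], num_mics=8)

level_difference_db = 20.0 * math.log10(1.0 / 0.1)

print("peak, loud source only:", localize(loud_only))
print("peak, loud + quiet:", localize(loud_and_quiet))
print("level difference (dB):", level_difference_db)
print("power toward quiet source:", steered_power(loud_and_quiet, -40))
print("power toward loud source:", steered_power(loud_and_quiet, 20))
```

In this toy setting the strongest peak stays at the loud source, and the steered power toward the quiet source is far smaller, so a search that keeps only the dominant peak misses the quiet source. This is only meant to make the difficulty concrete; it says nothing about how BNP-MAP or HARK handle it.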

    Original language: English
    Title of host publication: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
    Publisher: Institute of Electrical and Electronics Engineers Inc.
    Pages: 723-727
    Number of pages: 5
    Volume: 2015-August
    ISBN (Print): 9781467369978
    DOIs: https://doi.org/10.1109/ICASSP.2015.7178064
    Publication status: Published - 2015 Aug 4
    Event: 40th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2015 - Brisbane, Australia
    Duration: 2014 Apr 19 to 2014 Apr 24

    Other

    Other: 40th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2015
    Country: Australia
    City: Brisbane
    Period: 14/4/19 to 14/4/24

    Fingerprint

    Microphones
    Acoustic waves
    Array processing
    Birds
    Audition
    Robots
    Experiments

    Keywords

    • Auditory scene analysis
    • Bayesian nonparametrics
    • simultaneous sound source localization and separation
    • sounds of different volume
    • unknown time-varying number of sources

    ASJC Scopus subject areas

    • Signal Processing
    • Software
    • Electrical and Electronic Engineering

    Cite this

    Bando, Y., Otsuka, T., Itoyama, K., Yoshii, K., Sasaki, Y., Kagami, S., & Okuno, H. G. (2015). Challenges in deploying a microphone array to localize and separate sound sources in real auditory scenes. In ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings (Vol. 2015-August, pp. 723-727). [7178064] Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICASSP.2015.7178064

    @inproceedings{9c254450d0304abdb9fb344c1f0b3cbd,
    title = "Challenges in deploying a microphone array to localize and separate sound sources in real auditory scenes",
    abstract = "Analyzing the auditory scene of real environments is challenging, partly because an unknown number and variety of sound sources are observed at the same time, and partly because these sounds reach the microphones at significantly different sound pressure levels. These problems are difficult even for state-of-the-art sound source localization and separation methods. In this paper, we exploit two such methods that use a microphone array: (1) Bayesian nonparametric microphone array processing (BNP-MAP), which can separate and localize sound sources when the number of sources is unspecified, and (2) the robot audition software 'HARK', which can separate and localize sound sources in real time. Through experimentation, we found that BNP-MAP is more robust against differences in the sound pressure levels of the source signals and against the spatial closeness of source positions. Experiments analyzing real scenes of human conversations recorded in a large exhibition hall and bird calls recorded in a natural park demonstrate the efficacy and applicability of BNP-MAP.",
    keywords = "Auditory scene analysis, Bayesian nonparametrics, simultaneous sound source localization and separation, sounds of different volume, unknown time-varying number of sources",
    author = "Yoshiaki Bando and Takuma Otsuka and Katsutoshi Itoyama and Kazuyoshi Yoshii and Yoko Sasaki and Satoshi Kagami and Okuno, {Hiroshi G.}",
    year = "2015",
    month = "8",
    day = "4",
    doi = "10.1109/ICASSP.2015.7178064",
    language = "English",
    isbn = "9781467369978",
    volume = "2015-August",
    pages = "723--727",
    booktitle = "ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings",
    publisher = "Institute of Electrical and Electronics Engineers Inc.",

    }

    TY - GEN

    T1 - Challenges in deploying a microphone array to localize and separate sound sources in real auditory scenes

    AU - Bando, Yoshiaki

    AU - Otsuka, Takuma

    AU - Itoyama, Katsutoshi

    AU - Yoshii, Kazuyoshi

    AU - Sasaki, Yoko

    AU - Kagami, Satoshi

    AU - Okuno, Hiroshi G.

    PY - 2015/8/4

    Y1 - 2015/8/4

    N2 - Analyzing the auditory scene of real environments is challenging, partly because an unknown number and variety of sound sources are observed at the same time, and partly because these sounds reach the microphones at significantly different sound pressure levels. These problems are difficult even for state-of-the-art sound source localization and separation methods. In this paper, we exploit two such methods that use a microphone array: (1) Bayesian nonparametric microphone array processing (BNP-MAP), which can separate and localize sound sources when the number of sources is unspecified, and (2) the robot audition software 'HARK', which can separate and localize sound sources in real time. Through experimentation, we found that BNP-MAP is more robust against differences in the sound pressure levels of the source signals and against the spatial closeness of source positions. Experiments analyzing real scenes of human conversations recorded in a large exhibition hall and bird calls recorded in a natural park demonstrate the efficacy and applicability of BNP-MAP.

    KW - Auditory scene analysis

    KW - Bayesian nonparametrics

    KW - simultaneous sound source localization and separation

    KW - sounds of different volume

    KW - unknown time-varying number of sources

    UR - http://www.scopus.com/inward/record.url?scp=84946092424&partnerID=8YFLogxK

    UR - http://www.scopus.com/inward/citedby.url?scp=84946092424&partnerID=8YFLogxK

    U2 - 10.1109/ICASSP.2015.7178064

    DO - 10.1109/ICASSP.2015.7178064

    M3 - Conference contribution

    AN - SCOPUS:84946092424

    SN - 9781467369978

    VL - 2015-August

    SP - 723

    EP - 727

    BT - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

    PB - Institute of Electrical and Electronics Engineers Inc.

    ER -