Auditory and visual integration based localization and tracking of multiple moving sounds in daily-life environments

Hyun Don Kim, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


Abstract

This paper presents techniques that enable talker tracking for effective human-robot interaction. To track moving people in daily-life environments, robots must localize multiple moving sound sources so that they can locate talkers. The conventional method, however, requires a microphone array and impulse response data. We therefore propose integrating a cross-power spectrum phase (CSP) analysis with an expectation-maximization (EM) algorithm. CSP can localize sound sources using only two microphones and needs no impulse response data, while the EM algorithm increases the system's effectiveness and allows it to cope with multiple sound sources. We confirmed that the proposed method performs better than the conventional one. In addition, we added a particle filter to the tracking process to produce reliable tracking paths; the filter also integrates audio-visual information effectively. As a result, the system can track people while coping with various noises, including loud sounds, in daily-life environments.
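
The CSP method mentioned in the abstract normalizes the cross-power spectrum of the two microphone signals by its magnitude and reads the source delay off the peak of the resulting coefficients. As a minimal, illustrative sketch (not the authors' implementation), the Python snippet below estimates a single source's azimuth from two channels in this GCC-PHAT style; the sampling rate, microphone spacing, sound speed, and the function name csp_direction are assumptions chosen for the example, and the paper's EM-based multi-source handling and particle-filter tracking are not reproduced here.

    import numpy as np

    # Minimal sketch of CSP (cross-power spectrum phase) localization with two
    # microphones. Sampling rate, microphone spacing, and sound speed are
    # illustrative assumptions, not values taken from the paper.

    def csp_direction(x_left, x_right, fs=16000, mic_distance=0.3, sound_speed=343.0):
        """Estimate the azimuth (degrees) of a single source from two signals."""
        n = len(x_left) + len(x_right)                  # zero-pad to reduce circular wrap-around
        X1 = np.fft.rfft(x_left, n)
        X2 = np.fft.rfft(x_right, n)
        cross = X1 * np.conj(X2)
        cross /= np.abs(cross) + 1e-12                  # phase transform: keep phase, drop magnitude
        csp = np.fft.irfft(cross, n)                    # CSP coefficients over candidate delays

        max_lag = int(fs * mic_distance / sound_speed)  # physically possible delays only
        lags = np.concatenate((np.arange(0, max_lag + 1), np.arange(-max_lag, 0)))
        candidates = np.concatenate((csp[:max_lag + 1], csp[-max_lag:]))
        tau = lags[np.argmax(candidates)] / fs          # delay (s) at the CSP peak

        # Convert the delay to an azimuth relative to the broadside direction.
        sin_theta = np.clip(tau * sound_speed / mic_distance, -1.0, 1.0)
        return np.degrees(np.arcsin(sin_theta))

    # Usage example: a source simulated by delaying one channel by 5 samples.
    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        src = rng.standard_normal(4096)
        left, right = src, np.roll(src, 5)
        print(f"estimated azimuth: {csp_direction(left, right):.1f} deg")

Restricting the peak search to physically plausible lags (|tau| <= mic_distance / sound_speed) keeps spurious CSP peaks from mapping to impossible directions, which matters in the noisy daily-life conditions the paper targets.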

Original language: English
Title of host publication: 16th IEEE International Conference on Robot and Human Interactive Communication, RO-MAN
Pages: 399-404
Number of pages: 6
DOIs: https://doi.org/10.1109/ROMAN.2007.4415117
Publication status: Published - 2007 Dec 1
Externally published: Yes
Event: 16th IEEE International Conference on Robot and Human Interactive Communication, RO-MAN - Jeju, Korea, Republic of
Duration: 2007 Aug 26 - 2007 Aug 29

Publication series

Name: Proceedings - IEEE International Workshop on Robot and Human Interactive Communication

Conference

Conference: 16th IEEE International Conference on Robot and Human Interactive Communication, RO-MAN
Country: Korea, Republic of
City: Jeju
Period: 07/8/26 - 07/8/29

ASJC Scopus subject areas

  • Engineering (all)


  • Cite this

    Kim, H. D., Komatani, K., Ogata, T., & Okuno, H. G. (2007). Auditory and visual integration based localization and tracking of multiple moving sounds in daily-life environments. In 16th IEEE International Conference on Robot and Human Interactive Communication, RO-MAN (pp. 399-404). [4415117] (Proceedings - IEEE International Workshop on Robot and Human Interactive Communication). https://doi.org/10.1109/ROMAN.2007.4415117