Applying Scattering Theory to Robot Audition System

Robust Sound Source Localization and Extraction

Kazuhiro Nakadai, Daisuke Matsuura, Hiroshi G. Okuno, Hiroaki Kitano

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

60 Citations (Scopus)

Abstract

For natural human-robot communication and interfaces, a robot must be able to hear with its own ears (microphones). Because the microphones are embedded in the robot's head, the head-related transfer function (HRTF) plays an important role in sound source localization and extraction. Usually, the interaural phase difference (IPD) and interaural intensity difference (IID) are calculated from binaural input, and the direction is then determined by using the IPD and IID with the HRTF. HRTF-based sound source localization has two problems: an HRTF must be measured for each robot in an anechoic chamber, because it depends on the shape of the robot's head; and the HRTF must be interpolated to handle a moving talker, because it is available only for discrete azimuths and elevations. To cope with these problems, we previously proposed auditory epipolar geometry as a continuous function of IPD and IID that dispenses with the HRTF, and developed a real-time multiple-talker tracking system. Auditory epipolar geometry, however, does not give a good approximation of IID over the full range or of IPD in peripheral areas. In this paper, scattering theory from physics is employed to take into account the diffraction of sound around the robot's head, giving a better approximation of IID and IPD. The resulting system is shown to be effective for localizing and extracting sound at higher frequencies and from side directions.
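As a rough illustration of the binaural cues named in the abstract, the sketch below computes an IPD and an IID at a single frequency bin from a two-channel signal. It is a minimal example under stated assumptions, not the authors' implementation: the function names `dft_bin` and `ipd_iid`, the left-minus-right sign convention, the decibel scale for IID, and the synthetic test signal are all hypothetical choices made here for illustration. It does not model the HRTF, auditory epipolar geometry, or scattering theory.

```python
import cmath
import math


def dft_bin(signal, k):
    """Return the k-th DFT coefficient of a real-valued sequence (naive sum)."""
    n = len(signal)
    return sum(x * cmath.exp(-2j * math.pi * k * i / n)
               for i, x in enumerate(signal))


def ipd_iid(left, right, k):
    """Return (IPD in radians, IID in dB) at DFT bin k.

    Assumed convention (not taken from the paper): both cues are
    left minus right, and the IPD is wrapped to (-pi, pi].
    """
    spec_l = dft_bin(left, k)
    spec_r = dft_bin(right, k)
    # Phase of L * conj(R) equals phase(L) - phase(R), already wrapped.
    ipd = cmath.phase(spec_l * spec_r.conjugate())
    # Intensity difference expressed as a magnitude ratio in decibels.
    iid = 20.0 * math.log10(abs(spec_l) / abs(spec_r))
    return ipd, iid


# Synthetic input: the right channel is the left channel delayed by
# 2 samples and attenuated to half amplitude.
n, k, delay = 64, 4, 2
left = [math.cos(2 * math.pi * k * i / n) for i in range(n)]
right = [0.5 * math.cos(2 * math.pi * k * (i - delay) / n) for i in range(n)]

ipd, iid = ipd_iid(left, right, k)
print(f"IPD = {ipd:.4f} rad, IID = {iid:.4f} dB")
```

With this synthetic input, the delay of 2 samples at bin 4 of a 64-point frame corresponds to a phase lag of pi/4 in the right channel, and the halved amplitude corresponds to an intensity difference of about 6 dB. A localization method then has to map such (IPD, IID) pairs to a direction, which is the step the abstract addresses with the HRTF, auditory epipolar geometry, or scattering theory.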

Original language: English
Title of host publication: IEEE International Conference on Intelligent Robots and Systems
Pages: 1147-1152
Number of pages: 6
Volume: 2
Publication status: Published - 2003
Externally published: Yes
Event: 2003 IEEE/RSJ International Conference on Intelligent Robots and Systems - Las Vegas, NV
Duration: 2003 Oct 27 to 2003 Oct 31

Other

Other: 2003 IEEE/RSJ International Conference on Intelligent Robots and Systems
City: Las Vegas, NV
Period: 03/10/27 to 03/10/31

Fingerprint

Audition
Transfer functions
Acoustic waves
Robots
Scattering
Microphones
Anechoic chambers
Geometry
Physics
Diffraction
Communication

ASJC Scopus subject areas

  • Control and Systems Engineering

Cite this

Nakadai, K., Matsuura, D., Okuno, H. G., & Kitano, H. (2003). Applying Scattering Theory to Robot Audition System: Robust Sound Source Localization and Extraction. In IEEE International Conference on Intelligent Robots and Systems (Vol. 2, pp. 1147-1152).

@inproceedings{06fc160fc015443690353cfda114b091,
title = "Applying Scattering Theory to Robot Audition System: Robust Sound Source Localization and Extraction",
author = "Kazuhiro Nakadai and Daisuke Matsuura and Okuno, {Hiroshi G.} and Hiroaki Kitano",
year = "2003",
language = "English",
volume = "2",
pages = "1147--1152",
booktitle = "IEEE International Conference on Intelligent Robots and Systems",

}

TY - GEN

T1 - Applying Scattering Theory to Robot Audition System

T2 - Robust Sound Source Localization and Extraction

AU - Nakadai, Kazuhiro

AU - Matsuura, Daisuke

AU - Okuno, Hiroshi G.

AU - Kitano, Hiroaki

PY - 2003

Y1 - 2003

UR - http://www.scopus.com/inward/record.url?scp=0346779077&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0346779077&partnerID=8YFLogxK

M3 - Conference contribution

VL - 2

SP - 1147

EP - 1152

BT - IEEE International Conference on Intelligent Robots and Systems

ER -