Exploiting auditory fovea in humanoid-human interaction

Kazuhiro Nakadai, Hiroshi G. Okuno, Hiroaki Kitano

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

14 Citations (Scopus)

Abstract

A robot's auditory perception of the real world should cope with motor and other noises caused by the robot's own movements, in addition to environmental noise and reverberation. This paper presents the active direction-pass filter (ADPF), which separates sounds originating from a specified direction detected by a pair of microphones. The ADPF is thus based on directional processing, an approach also used in visual processing. The ADPF is implemented by hierarchical integration of visual and auditory processing with hypothetical reasoning about the interaural phase difference (IPD) and interaural intensity difference (IID) of each sub-band. The ADPF's resolution in sound localization and separation depends on where the sound comes from: the resolving power is much higher for sounds coming directly from the front of the humanoid than for sounds coming from the periphery. This directional resolving property is similar to that of the eye, whereby the visual fovea at the center of the retina is capable of much higher resolution than the periphery of the retina. To exploit the corresponding "auditory fovea", the ADPF controls the direction of the head. Human tracking and sound source separation based on the ADPF are implemented on the upper torso of the humanoid and run in real time using distributed processing on five PCs networked via Gigabit Ethernet. The signal-to-noise ratio (SNR) and noise reduction ratio of each sound separated by the ADPF from a mixture of two or three speech sources of the same volume were improved by about 2.2 dB and 9 dB, respectively.
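The abstract describes per-sub-band reasoning over the interaural phase difference (IPD) to pass only the sound arriving from a hypothesized direction. The sketch below is a minimal illustration of that general idea, not the paper's implementation: for each frequency bin of a two-channel STFT it compares the observed IPD against the IPD expected for a target direction under a simple free-field two-microphone model, and keeps only matching bins. The microphone spacing, FFT size, and pass width are assumed, illustrative values.

```python
import numpy as np

def direction_pass_filter(left, right, theta_deg, fs=16000, n_fft=512,
                          mic_dist=0.15, pass_width=0.3, c=343.0):
    """Keep sub-bands whose observed IPD matches the IPD expected for a
    sound arriving from direction theta_deg (0 = straight ahead).
    Illustrative sketch only; all parameters are assumed values."""
    # Expected inter-microphone delay for the hypothesized direction
    delay = mic_dist * np.sin(np.deg2rad(theta_deg)) / c
    freqs = np.fft.rfftfreq(n_fft, 1.0 / fs)
    expected_ipd = 2.0 * np.pi * freqs * delay

    hop = n_fft // 2
    win = np.hanning(n_fft)
    n_frames = (len(left) - n_fft) // hop + 1
    out = np.zeros(len(left))
    for t in range(n_frames):
        seg_l = left[t * hop:t * hop + n_fft] * win
        seg_r = right[t * hop:t * hop + n_fft] * win
        L = np.fft.rfft(seg_l)
        R = np.fft.rfft(seg_r)
        # Observed IPD per frequency bin (sub-band)
        ipd = np.angle(L * np.conj(R))
        # Pass only bins whose wrapped IPD error is within the pass width
        err = np.angle(np.exp(1j * (ipd - expected_ipd)))
        mask = np.abs(err) < pass_width
        # Overlap-add reconstruction of the passed sub-bands
        out[t * hop:t * hop + n_fft] += np.fft.irfft(L * mask, n_fft) * win
    return out
```

For example, a tone presented with zero inter-channel delay (i.e., from the front) passes through a filter steered to 0 degrees but is largely suppressed by one steered to 60 degrees, since its observed IPD no longer matches the expected IPD in its sub-band.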

Original language: English
Title of host publication: Proceedings of the National Conference on Artificial Intelligence
Pages: 431-438
Number of pages: 8
Publication status: Published - 2002
Externally published: Yes
Event: 18th National Conference on Artificial Intelligence (AAAI-02), 14th Innovative Applications of Artificial Intelligence Conference (IAAI-02) - Edmonton, Alta.
Duration: 2002 Jul 28 - 2002 Aug 1

ASJC Scopus subject areas

  • Software

Cite this

Nakadai, K., Okuno, H. G., & Kitano, H. (2002). Exploiting auditory fovea in humanoid-human interaction. In Proceedings of the National Conference on Artificial Intelligence (pp. 431-438).