Robot audition in the real world must cope with motor noise and other noise caused by the robot's own movements, in addition to environmental noise and reverberation. This paper reports how auditory processing is improved by audio-visual integration with active movements. The key idea is the hierarchical integration of auditory and visual streams to disambiguate auditory or visual processing. The system runs in real time by using distributed processing on four PCs connected by Gigabit Ethernet. Implemented on an upper-torso humanoid, the system tracks multiple talkers and extracts speech from a mixture of sounds. The performance of sound source localization based on epipolar geometry, and of sound source separation by active and adaptive direction-pass filtering, is also reported.
|Title of host publication||Proceedings - IEEE International Conference on Robotics and Automation|
|Publication status||Published - 2002|
|Event||2002 IEEE International Conference on Robotics and Automation - Washington, DC, United States|
Duration: 11 May 2002 → 15 May 2002