Improved sound source localization and front-back disambiguation for humanoid robots with two ears

Ui Hyun Kim, Kazuhiro Nakadai, Hiroshi G. Okuno

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

6 Citations (Scopus)

Abstract

An improved sound source localization (SSL) method has been developed that is based on the generalized cross-correlation (GCC) method weighted by the phase transform (PHAT) for use with humanoid robots equipped with two microphones inside artificial pinnae. The conventional SSL method based on the GCC-PHAT method has two main problems when used on a humanoid robot platform: 1) diffraction of sound waves with multipath interference caused by the shape of the robot head and 2) front-back ambiguity. The diffraction problem was overcome by incorporating a new time delay factor into the GCC-PHAT method under the assumption of a spherical robot head. The ambiguity problem was overcome by utilizing the amplification effect of the pinnae for localization over the entire azimuth. Experiments conducted using a humanoid robot showed that localization errors were reduced by 9.9° on average with the improved method and that the success rate for front-back disambiguation was 32.2% better on average over the entire azimuth than with a conventional HRTF-based method.
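
The two ingredients named in the abstract, a GCC-PHAT delay estimate and a delay-to-azimuth mapping for a spherical head, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration rather than the authors' formulation: the abstract does not give the paper's time delay factor, so the classic Woodworth spherical-head model stands in for it, and the function names, the 343 m/s speed of sound, and the 0.09 m head radius in the usage note are hypothetical. The sketch does not address front-back ambiguity, since a delay alone cannot separate a front source from its mirror image behind the head.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed room-temperature value


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform, adequate for short frames."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out


def gcc_phat_delay(left, right, max_lag):
    """Lag in samples by which `right` trails `left` (circular GCC-PHAT)."""
    n = len(left)
    spec_l, spec_r = dft(left), dft(right)
    cross = []
    for a, b in zip(spec_l, spec_r):
        c = b * a.conjugate()
        mag = abs(c)
        # PHAT weighting: keep only the phase of the cross-spectrum.
        cross.append(c / mag if mag > 1e-12 else 0j)
    corr = dft(cross, inverse=True)
    lags = range(-max_lag, max_lag + 1)
    return max(lags, key=lambda lag: corr[lag % n].real)


def azimuth_free_field(tau, radius):
    """Azimuth (rad) from delay tau (s) for two microphones 2*radius apart
    with no head in between."""
    s = SPEED_OF_SOUND * tau / (2.0 * radius)
    return math.asin(max(-1.0, min(1.0, s)))


def woodworth_delay(theta, radius):
    """Woodworth spherical-head delay (s) for azimuth theta in [-pi/2, pi/2]."""
    return radius / SPEED_OF_SOUND * (theta + math.sin(theta))


def azimuth_spherical_head(tau, radius):
    """Invert woodworth_delay by bisection; the delay is monotonic in theta."""
    lo, hi = -math.pi / 2, math.pi / 2
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if woodworth_delay(mid, radius) < tau:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

As a usage example, a delay generated with woodworth_delay(math.radians(40), 0.09) is mapped back to 40° by azimuth_spherical_head, whereas azimuth_free_field returns a larger angle for the same delay. This is the kind of bias that ignoring the path around the head introduces.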

Original language: English
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Pages: 282-291
Number of pages: 10
Volume: 7906 LNAI
DOI: https://doi.org/10.1007/978-3-642-38577-3_29
Publication status: Published - 2013
Externally published: Yes
Event: 26th International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems, IEA/AIE 2013 - Amsterdam
Duration: 2013 Jun 17 → 2013 Jun 21

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 7906 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 26th International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems, IEA/AIE 2013
City: Amsterdam
Period: 13/6/17 → 13/6/21

Keywords

  • front-back disambiguation
  • human-robot interaction
  • Intelligent robot audition
  • sound source localization

ASJC Scopus subject areas

  • Computer Science (all)
  • Theoretical Computer Science

Cite this

Kim, U. H., Nakadai, K., & Okuno, H. G. (2013). Improved sound source localization and front-back disambiguation for humanoid robots with two ears. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7906 LNAI, pp. 282-291). https://doi.org/10.1007/978-3-642-38577-3_29

@inproceedings{df092f0a09464443abb0704bb8c395e9,
title = "Improved sound source localization and front-back disambiguation for humanoid robots with two ears",
abstract = "An improved sound source localization (SSL) method has been developed that is based on the generalized cross-correlation (GCC) method weighted by the phase transform (PHAT) for use with humanoid robots equipped with two microphones inside artificial pinnae. The conventional SSL method based on the GCC-PHAT method has two main problems when used on a humanoid robot platform: 1) diffraction of sound waves with multipath interference caused by the shape of the robot head and 2) front-back ambiguity. The diffraction problem was overcome by incorporating a new time delay factor into the GCC-PHAT method under the assumption of a spherical robot head. The ambiguity problem was overcome by utilizing the amplification effect of the pinnae for localization over the entire azimuth. Experiments conducted using a humanoid robot showed that localization errors were reduced by 9.9° on average with the improved method and that the success rate for front-back disambiguation was 32.2{\%} better on average over the entire azimuth than with a conventional HRTF-based method.",
keywords = "front-back disambiguation, human-robot interaction, Intelligent robot audition, sound source localization",
author = "Kim, {Ui Hyun} and Nakadai, Kazuhiro and Okuno, {Hiroshi G.}",
year = "2013",
doi = "10.1007/978-3-642-38577-3_29",
language = "English",
isbn = "9783642385766",
volume = "7906 LNAI",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
pages = "282--291",
booktitle = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
}

TY - GEN
T1 - Improved sound source localization and front-back disambiguation for humanoid robots with two ears
AU - Kim, Ui Hyun
AU - Nakadai, Kazuhiro
AU - Okuno, Hiroshi G.
PY - 2013
Y1 - 2013
AB - An improved sound source localization (SSL) method has been developed that is based on the generalized cross-correlation (GCC) method weighted by the phase transform (PHAT) for use with humanoid robots equipped with two microphones inside artificial pinnae. The conventional SSL method based on the GCC-PHAT method has two main problems when used on a humanoid robot platform: 1) diffraction of sound waves with multipath interference caused by the shape of the robot head and 2) front-back ambiguity. The diffraction problem was overcome by incorporating a new time delay factor into the GCC-PHAT method under the assumption of a spherical robot head. The ambiguity problem was overcome by utilizing the amplification effect of the pinnae for localization over the entire azimuth. Experiments conducted using a humanoid robot showed that localization errors were reduced by 9.9° on average with the improved method and that the success rate for front-back disambiguation was 32.2% better on average over the entire azimuth than with a conventional HRTF-based method.
KW - front-back disambiguation
KW - human-robot interaction
KW - Intelligent robot audition
KW - sound source localization
UR - http://www.scopus.com/inward/record.url?scp=84881379529&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84881379529&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-38577-3_29
DO - 10.1007/978-3-642-38577-3_29
M3 - Conference contribution
AN - SCOPUS:84881379529
SN - 9783642385766
VL - 7906 LNAI
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 282
EP - 291
BT - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
ER -