Real-time auditory and visual talker tracking through integrating EM algorithm and particle filter

Hyun Don Kim, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

4 Citations (Scopus)

Abstract

This paper presents techniques that enable talker tracking for effective human-robot interaction. We propose a new way of integrating an EM algorithm and a particle filter to select an appropriate path for tracking the talker. Our system can easily adapt to new kinds of information for tracking the talker, because it estimates the position of the desired talker through means, variances, and weights calculated from EM training, regardless of the number or kinds of information. In addition, to enhance a robot's ability to track a talker in real-world environments, we applied the particle filter to talker tracking after executing the EM algorithm. We also integrated a variety of auditory and visual information regarding sound localization, face localization, and the detection of lip movement. Moreover, we applied a sound classification function that allows our system to distinguish between voice, music, and noise, and we developed a vision module that can locate moving objects.
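The pipeline the abstract describes (fit a Gaussian mixture with EM, take the dominant component's mean, variance, and weight as a position measurement, then track with a particle filter) can be sketched minimally in one dimension. Everything below is illustrative only: the azimuth values, noise levels, and two-component mixture are assumptions for the sketch, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical scene: the talker sits at azimuth 30 deg; audio and visual
# cues yield noisy azimuth observations, plus some background clutter.
TRUE_AZIMUTH = 30.0
cues = np.concatenate([
    rng.normal(TRUE_AZIMUTH, 3.0, 40),   # sound/face/lip-derived observations
    rng.uniform(-90.0, 90.0, 10),        # clutter (noise, other moving objects)
])

def em_gmm_1d(x, k=2, iters=50):
    """Fit a 1-D Gaussian mixture with EM; return means, variances, weights."""
    mu = np.linspace(x.min(), x.max(), k)    # deterministic spread-out init
    var = np.full(k, np.var(x))
    w = np.full(k, 1.0 / k)
    for _ in range(iters):
        # E-step: responsibility of each component for each observation
        dens = w * np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) \
                 / np.sqrt(2.0 * np.pi * var)
        resp = dens / dens.sum(axis=1, keepdims=True)
        # M-step: re-estimate means, variances, and weights
        n_k = resp.sum(axis=0)
        mu = (resp * x[:, None]).sum(axis=0) / n_k
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / n_k + 1e-6
        w = n_k / len(x)
    return mu, var, w

mu, var, w = em_gmm_1d(cues)
dominant = np.argmax(w)                  # treat the heaviest component as the talker
z, z_var = mu[dominant], var[dominant]   # measurement fed to the particle filter

# Particle filter over talker azimuth, using the EM estimate as the measurement.
n = 500
particles = rng.uniform(-90.0, 90.0, n)
for _ in range(20):
    particles += rng.normal(0.0, 1.0, n)                    # motion: random walk
    weights = np.exp(-0.5 * (particles - z) ** 2 / z_var)   # measurement likelihood
    weights /= weights.sum()
    particles = particles[rng.choice(n, n, p=weights)]      # resample

estimate = particles.mean()
```

Because the tracker consumes only the mixture's means, variances, and weights, a new cue (for example a lip-movement detector) just contributes more observations to `cues`, which is the adaptability property the abstract claims.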

Original language: English
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Pages: 280-290
Number of pages: 11
Volume: 4570 LNAI
Publication status: Published - 2007
Externally published: Yes
Event: 20th International Conference on Industrial, Engineering, and Other Applications of Applied Intelligent Systems, IEA/AIE-2007 - Kyoto
Duration: 2007 Jun 26 - 2007 Jun 29

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 4570 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 20th International Conference on Industrial, Engineering, and Other Applications of Applied Intelligent Systems, IEA/AIE-2007
City: Kyoto
Period: 07/6/26 - 07/6/29


Keywords

  • EM
  • Human-robot interaction
  • Lip movement detection
  • Particle filter
  • Sound source localization
  • Talker tracking

ASJC Scopus subject areas

  • Computer Science(all)
  • Biochemistry, Genetics and Molecular Biology(all)
  • Theoretical Computer Science

Cite this

Kim, H. D., Komatani, K., Ogata, T., & Okuno, H. G. (2007). Real-time auditory and visual talker tracking through integrating EM algorithm and particle filter. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4570 LNAI, pp. 280-290). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 4570 LNAI).

@inproceedings{333b1f7f90794c2a9b4cbb54dac1e148,
title = "Real-time auditory and visual talker tracking through integrating EM algorithm and particle filter",
abstract = "This paper presents techniques that enable talker tracking for effective human-robot interaction. We propose a new way of integrating an EM algorithm and a particle filter to select an appropriate path for tracking the talker. Our system can easily adapt to new kinds of information for tracking the talker, because it estimates the position of the desired talker through means, variances, and weights calculated from EM training, regardless of the number or kinds of information. In addition, to enhance a robot's ability to track a talker in real-world environments, we applied the particle filter to talker tracking after executing the EM algorithm. We also integrated a variety of auditory and visual information regarding sound localization, face localization, and the detection of lip movement. Moreover, we applied a sound classification function that allows our system to distinguish between voice, music, and noise, and we developed a vision module that can locate moving objects.",
keywords = "EM, Human-robot interaction, Lip movement detection, Particle filter, Sound source localization, Talker tracking",
author = "Kim, {Hyun Don} and Kazunori Komatani and Tetsuya Ogata and Okuno, {Hiroshi G.}",
year = "2007",
language = "English",
isbn = "9783540733225",
volume = "4570 LNAI",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
pages = "280--290",
booktitle = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",

}
