TY - GEN
T1 - Realizing audio-visually triggered Eliza-like non-verbal behaviors
AU - Okuno, Hiroshi G.
AU - Nakadai, Kazuhiro
AU - Kitano, Hiroaki
PY - 2002
Y1 - 2002
N2 - We are studying how to create social physical agents, i.e., humanoids, that perform actions empowered by real-time audio-visual tracking of multiple talkers. Social skills require complex perceptual and motor capabilities as well as communication capabilities. It is critical to identify primary features in designing building blocks for social skills, because the performance of social interaction is usually evaluated as a whole system, not component by component. We investigate the minimum functionalities for social interaction, supposing that a humanoid is equipped with auditory and visual perception and simple motor control but not with sound output. A real-time audio-visual multiple-talker tracking system is implemented on the humanoid SIG, using sound source localization, stereo vision, face recognition, and motor control. It extracts auditory and visual streams and associates audio and visual streams by proximity in localization. Socially oriented attention control makes the best use of personality variations classified by the Interpersonal Theory of psychology. It also provides task-oriented functions with a decaying belief factor for each stream. We demonstrate that the resulting behavior of SIG invites the users’ participation in interaction and encourages the users to explore SIG’s behaviors. These demonstrations show that SIG behaves like a physical non-verbal Eliza.
UR - http://www.scopus.com/inward/record.url?scp=77954272906&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=77954272906&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:77954272906
SN - 3540440380
SN - 9783540440383
VL - 2417
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 552
EP - 562
BT - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
PB - Springer Verlag
T2 - 7th Pacific Rim International Conference on Artificial Intelligence, PRICAI 2002
Y2 - 18 August 2002 through 22 August 2002
ER -