Multimodal integration learning of robot behavior using deep neural networks

Kuniaki Noda*, Hiroaki Arie, Yuki Suga, Tetsuya Ogata


Research output: Article, peer-reviewed

106 citations (Scopus)


For humans to accurately understand the world around them, multimodal integration is essential because it enhances perceptual precision and reduces ambiguity. Computational models replicating such human ability may contribute to the practical use of robots in daily human living environments; however, primarily because of scalability problems that conventional machine learning algorithms suffer from, sensory-motor information processing in robotic applications has typically been achieved via modal-dependent processes. In this paper, we propose a novel computational framework enabling the integration of sensory-motor time-series data and the self-organization of multimodal fused representations based on a deep learning approach. To evaluate our proposed model, we conducted two behavior-learning experiments utilizing a humanoid robot; the experiments consisted of object manipulation and bell-ringing tasks. From our experimental results, we show that large amounts of sensory-motor information, including raw RGB images, sound spectrums, and joint angles, are directly fused to generate higher-level multimodal representations. Further, we demonstrated that our proposed framework realizes the following three functions: (1) cross-modal memory retrieval utilizing the information complementation capability of the deep autoencoder; (2) noise-robust behavior recognition utilizing the generalization capability of multimodal features; and (3) multimodal causality acquisition and sensory-motor prediction based on the acquired causality.
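As a rough illustration of function (1), cross-modal memory retrieval, the sketch below fuses two synthetic modalities into one shared code and then recovers the missing modality from the observed one. It is not the authors' model: it assumes a closed-form linear autoencoder (equivalent to PCA) in place of the deep autoencoder, noise-free toy data generated from a two-dimensional latent state, and hypothetical names (`fit_linear_autoencoder`, `retrieve_missing`, `joints`, `image`) chosen only for this example.

```python
import numpy as np


def fit_linear_autoencoder(x, code_dim):
    """Closed-form linear autoencoder: keep the top right singular vectors of x.

    Assumption: a linear stand-in for the deep autoencoder in the abstract.
    Returns the decoder matrix with shape (code_dim, n_features).
    """
    _, _, vt = np.linalg.svd(x, full_matrices=False)
    return vt[:code_dim]


def retrieve_missing(decoder, observed, observed_idx):
    """Infer the shared code from the observed features, then decode all features."""
    # Least-squares fit of the code using only the decoder columns we can observe.
    code, *_ = np.linalg.lstsq(decoder[:, observed_idx].T, observed.T, rcond=None)
    return code.T @ decoder


# Synthetic data (assumption): both modalities depend linearly on one latent state.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
joints = latent @ rng.normal(size=(2, 4))   # stand-in for joint angles
image = latent @ rng.normal(size=(2, 6))    # stand-in for image features
fused = np.hstack([joints, image])

# Fit on the first 150 samples, then retrieve image features from joints alone.
decoder = fit_linear_autoencoder(fused[:150], code_dim=2)
joint_idx = np.arange(4)
reconstructed = retrieve_missing(decoder, fused[150:, joint_idx], joint_idx)
retrieval_error = float(np.max(np.abs(reconstructed[:, 4:] - image[150:])))
```

Because the toy data are exactly low-rank and noise-free, the retrieved image features match the held-out ones up to floating-point error. Real sensory-motor data are nonlinear and noisy, which is the setting where a deep autoencoder such as the one described in the abstract would be used instead.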

Journal: Robotics and Autonomous Systems
Publication status: Published - Jun 2014

ASJC Scopus subject areas

  • Control and Systems Engineering
  • Software
  • Mathematics (all)
  • Computer Science Applications

