A Bi-directional Multiple Timescales LSTM Model for Grounding of Actions and Verbs

Alexandre Antunes, Alban Laflaquiere, Tetsuya Ogata, Angelo Cangelosi

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

In this paper, we present a neural architecture to learn a bi-directional mapping between actions and language. We implement a Multiple Timescale Long Short-Term Memory (MT-LSTM) network composed of 7 layers with different timescale factors to connect actions to language without explicitly learning an intermediate representation. Instead, the model self-organizes such representations at the level of a slow-varying latent layer that links the action branch and the language branch at the center. We train the model bi-directionally: it learns to produce a sentence from a given action sequence and, simultaneously, to generate an action sequence from a given sentence. Furthermore, we show that this model preserves some of the generalization behaviour of Multiple Timescale Recurrent Neural Networks (MTRNN) in generating sentences and actions that were not explicitly trained. We compare this model with a number of baseline models, confirming the importance of both the bi-directional training and the multiple timescales architecture. Finally, the network was evaluated on motor actions performed by an iCub robot and their corresponding letter-based descriptions. The results of these experiments are presented at the end of the paper.
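
As a rough illustration of what "different timescale factors" can mean, the sketch below applies a leaky-integrator state update of the kind commonly used in multiple-timescale recurrent networks. It is an assumption made for illustration only: the abstract does not give the MT-LSTM equations, and the names `leaky_update`, `run_layer`, `total_variation`, and `tau` are hypothetical. The sketch only shows that a layer with a large timescale factor changes more slowly than one with a small factor when both receive the same rapidly changing input.

```python
def leaky_update(prev, candidate, tau):
    """Blend the previous state with a new candidate state.

    tau is a hypothetical timescale factor (tau >= 1). With tau == 1 the
    state is replaced outright; larger tau makes the state change slowly.
    """
    if tau < 1:
        raise ValueError("tau must be >= 1")
    keep = 1.0 - 1.0 / tau
    return [keep * p + (1.0 / tau) * c for p, c in zip(prev, candidate)]


def run_layer(candidates, tau, size):
    """Run one layer over a sequence of candidate states, starting from zeros."""
    state = [0.0] * size
    trajectory = []
    for cand in candidates:
        state = leaky_update(state, cand, tau)
        trajectory.append(state)
    return trajectory


def total_variation(trajectory):
    """Sum of absolute step-to-step changes of the first state component."""
    return sum(abs(b[0] - a[0]) for a, b in zip(trajectory, trajectory[1:]))


# Toy input (not from the paper): the candidate flips sign at every step.
candidates = [[1.0 if t % 2 == 0 else -1.0] for t in range(20)]

fast = run_layer(candidates, tau=2.0, size=1)   # fast-varying layer
slow = run_layer(candidates, tau=50.0, size=1)  # slow-varying layer

print("fast layer variation:", total_variation(fast))
print("slow layer variation:", total_variation(slow))
```

In this toy setting the slow layer barely moves while the fast layer tracks the input, which is the intuition behind placing a slow-varying latent layer between faster action and language branches.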

Original language: English
Title of host publication: 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2019
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 2614-2621
Number of pages: 8
ISBN (Electronic): 9781728140049
DOIs: https://doi.org/10.1109/IROS40897.2019.8967799
Publication status: Published - 2019 Nov
Event: 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2019 - Macau, China
Duration: 2019 Nov 3 to 2019 Nov 8

Publication series

Name: IEEE International Conference on Intelligent Robots and Systems
ISSN (Print): 2153-0858
ISSN (Electronic): 2153-0866

Conference

Conference: 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2019
Country: China
City: Macau
Period: 19/11/3 to 19/11/8

ASJC Scopus subject areas

  • Control and Systems Engineering
  • Software
  • Computer Vision and Pattern Recognition
  • Computer Science Applications

Cite this

Antunes, A., Laflaquiere, A., Ogata, T., & Cangelosi, A. (2019). A Bi-directional Multiple Timescales LSTM Model for Grounding of Actions and Verbs. In 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2019 (pp. 2614-2621). [8967799] (IEEE International Conference on Intelligent Robots and Systems). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/IROS40897.2019.8967799