Paired recurrent autoencoders for bidirectional translation between robot actions and linguistic descriptions

Tatsuro Yamada, Hiroyuki Matsunaga, Tetsuya Ogata

    Research output: Article

    1 citation (Scopus)

    Abstract

    We propose a novel deep learning framework for bidirectional translation between robot actions and their linguistic descriptions. Our model consists of two recurrent autoencoders (RAEs). One RAE learns to encode action sequences as fixed-dimensional vectors in a way that allows the sequences to be reproduced from the vectors by its decoder. The other RAE learns to encode descriptions in a similar way. In the learning process, in addition to reproduction losses, we create another loss function whereby the representations of an action and its corresponding description approach each other in the latent vector space. Across the shared representation, the trained model can produce a linguistic description given a robot action. The model is also able to generate an appropriate action by receiving a linguistic instruction, conditioned on the current visual input. Visualization of the latent representations shows that the robot actions are embedded in a semantically compositional way in the vector space by being learned jointly with descriptions.
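
    The abstract describes a joint training objective: two sequence autoencoders, each with its own reconstruction loss, plus a binding term that pulls the latent vector of an action toward the latent vector of its paired description. As a concrete illustration, here is a minimal sketch of that objective in PyTorch. The GRU encoder/decoder modules, the teacher-forced decoding, the use of mean-squared error for both the reconstruction and binding terms, and all names and dimensions (RecurrentAutoencoder, paired_rae_loss, alpha, hidden_dim) are illustrative assumptions rather than the authors' implementation. The sketch also simplifies by treating both modalities as continuous feature sequences; real descriptions would use word embeddings and a cross-entropy loss.

    # Minimal sketch of the paired recurrent autoencoder (RAE) objective
    # described in the abstract. GRUs, MSE losses, and all dimensions are
    # illustrative assumptions, not the authors' implementation.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class RecurrentAutoencoder(nn.Module):
        def __init__(self, feat_dim: int, latent_dim: int, hidden_dim: int = 128):
            super().__init__()
            self.encoder = nn.GRU(feat_dim, hidden_dim, batch_first=True)
            self.to_latent = nn.Linear(hidden_dim, latent_dim)
            self.from_latent = nn.Linear(latent_dim, hidden_dim)
            self.decoder = nn.GRU(feat_dim, hidden_dim, batch_first=True)
            self.readout = nn.Linear(hidden_dim, feat_dim)

        def encode(self, seq: torch.Tensor) -> torch.Tensor:
            # seq: (batch, time, feat) -> fixed-dimensional latent (batch, latent)
            _, h = self.encoder(seq)
            return self.to_latent(h[-1])

        def decode(self, z: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
            # Teacher-forced reconstruction: the decoder starts from the latent
            # and predicts each frame from the previous ground-truth frame.
            start = torch.zeros_like(target[:, :1])
            shifted = torch.cat([start, target[:, :-1]], dim=1)
            h0 = torch.tanh(self.from_latent(z)).unsqueeze(0)
            out, _ = self.decoder(shifted, h0)
            return self.readout(out)

    def paired_rae_loss(action_rae, desc_rae, action_seq, desc_seq, alpha=1.0):
        """Reconstruction losses for both RAEs plus a binding term that makes
        the paired latent representations approach each other."""
        z_action = action_rae.encode(action_seq)
        z_desc = desc_rae.encode(desc_seq)
        recon_action = F.mse_loss(action_rae.decode(z_action, action_seq), action_seq)
        recon_desc = F.mse_loss(desc_rae.decode(z_desc, desc_seq), desc_seq)
        binding = F.mse_loss(z_action, z_desc)  # shared-representation loss
        return recon_action + recon_desc + alpha * binding

    At test time, translation crosses the shared space: encode an action and decode the resulting latent with the description decoder, or conversely encode an instruction and decode it with the action decoder, which the paper additionally conditions on the current visual input.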

    Original language: English
    Article number: 8403309
    Pages (from-to): 3441-3448
    Number of pages: 8
    Journal: IEEE Robotics and Automation Letters
    Volume: 3
    Issue number: 4
    DOI: 10.1109/LRA.2018.2852838
    Publication status: Published - Oct 1, 2018

    ASJC Scopus subject areas

    • Control and Systems Engineering
    • Human-Computer Interaction
    • Biomedical Engineering
    • Mechanical Engineering
    • Control and Optimization
    • Artificial Intelligence
    • Computer Science Applications
    • Computer Vision and Pattern Recognition

    Cite this

    Paired recurrent autoencoders for bidirectional translation between robot actions and linguistic descriptions. / Yamada, Tatsuro; Matsunaga, Hiroyuki; Ogata, Tetsuya.

    In: IEEE Robotics and Automation Letters, Vol. 3, No. 4, 8403309, 01.10.2018, p. 3441-3448.

    Research output: Article

    @article{08511956f82045739d5b854a211ff7a2,
    title = "Paired recurrent autoencoders for bidirectional translation between robot actions and linguistic descriptions",
    abstract = "We propose a novel deep learning framework for bidirectional translation between robot actions and their linguistic descriptions. Our model consists of two recurrent autoencoders (RAEs). One RAE learns to encode action sequences as fixed-dimensional vectors in a way that allows the sequences to be reproduced from the vectors by its decoder. The other RAE learns to encode descriptions in a similar way. In the learning process, in addition to reproduction losses, we create another loss function whereby the representations of an action and its corresponding description approach each other in the latent vector space. Across the shared representation, the trained model can produce a linguistic description given a robot action. The model is also able to generate an appropriate action by receiving a linguistic instruction, conditioned on the current visual input. Visualization of the latent representations shows that the robot actions are embedded in a semantically compositional way in the vector space by being learned jointly with descriptions.",
    keywords = "AI-based methods, Deep learning in robotics and automation, neurorobotics",
    author = "Tatsuro Yamada and Hiroyuki Matsunaga and Tetsuya Ogata",
    year = "2018",
    month = "10",
    day = "1",
    doi = "10.1109/LRA.2018.2852838",
    language = "English",
    volume = "3",
    pages = "3441--3448",
    journal = "IEEE Robotics and Automation Letters",
    issn = "2377-3766",
    publisher = "Institute of Electrical and Electronics Engineers Inc.",
    number = "4",

    }

    TY - JOUR

    T1 - Paired recurrent autoencoders for bidirectional translation between robot actions and linguistic descriptions

    AU - Yamada, Tatsuro

    AU - Matsunaga, Hiroyuki

    AU - Ogata, Tetsuya

    PY - 2018/10/1

    Y1 - 2018/10/1

    N2 - We propose a novel deep learning framework for bidirectional translation between robot actions and their linguistic descriptions. Our model consists of two recurrent autoencoders (RAEs). One RAE learns to encode action sequences as fixed-dimensional vectors in a way that allows the sequences to be reproduced from the vectors by its decoder. The other RAE learns to encode descriptions in a similar way. In the learning process, in addition to reproduction losses, we create another loss function whereby the representations of an action and its corresponding description approach each other in the latent vector space. Across the shared representation, the trained model can produce a linguistic description given a robot action. The model is also able to generate an appropriate action by receiving a linguistic instruction, conditioned on the current visual input. Visualization of the latent representations shows that the robot actions are embedded in a semantically compositional way in the vector space by being learned jointly with descriptions.

    AB - We propose a novel deep learning framework for bidirectional translation between robot actions and their linguistic descriptions. Our model consists of two recurrent autoencoders (RAEs). One RAE learns to encode action sequences as fixed-dimensional vectors in a way that allows the sequences to be reproduced from the vectors by its decoder. The other RAE learns to encode descriptions in a similar way. In the learning process, in addition to reproduction losses, we create another loss function whereby the representations of an action and its corresponding description approach each other in the latent vector space. Across the shared representation, the trained model can produce a linguistic description given a robot action. The model is also able to generate an appropriate action by receiving a linguistic instruction, conditioned on the current visual input. Visualization of the latent representations shows that the robot actions are embedded in a semantically compositional way in the vector space by being learned jointly with descriptions.

    KW - AI-based methods

    KW - Deep learning in robotics and automation

    KW - neurorobotics

    UR - http://www.scopus.com/inward/record.url?scp=85063304945&partnerID=8YFLogxK

    UR - http://www.scopus.com/inward/citedby.url?scp=85063304945&partnerID=8YFLogxK

    U2 - 10.1109/LRA.2018.2852838

    DO - 10.1109/LRA.2018.2852838

    M3 - Article

    AN - SCOPUS:85063304945

    VL - 3

    SP - 3441

    EP - 3448

    JO - IEEE Robotics and Automation Letters

    JF - IEEE Robotics and Automation Letters

    SN - 2377-3766

    IS - 4

    M1 - 8403309

    ER -