Incorporating visual features into word embeddings: A bimodal autoencoder-based approach

Research output: Paper › peer-review

9 Citations (Scopus)

Abstract

Multimodal semantic representation is an evolving area of research in natural language processing as well as computer vision. Combining perceptual information, such as visual features, with linguistic features has recently been studied actively. This paper presents a novel bimodal autoencoder model for multimodal representation learning: the autoencoder is trained to enhance linguistic feature vectors by incorporating the corresponding visual features. At run time, the trained network can produce visually enhanced multimodal representations even for words whose direct visual-linguistic correspondences were never observed during training. Empirical results on standard semantic relatedness tasks demonstrate that our approach is generally promising. We further investigate the potential efficacy of the enhanced word embeddings in discriminating antonyms and synonyms from vaguely related words.
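
The abstract does not spell out the model, but the general setup it describes can be illustrated. Below is a minimal PyTorch sketch of a bimodal autoencoder of this kind: concatenated linguistic and visual feature vectors are encoded into a shared hidden code, and both modalities are reconstructed from that code. The dimensions, layer choices, and the zero-vector handling of words without paired images are illustrative assumptions, not the authors' exact architecture.

```python
import torch
import torch.nn as nn

class BimodalAutoencoder(nn.Module):
    """Sketch: encode concatenated text + visual features into a
    shared multimodal code, then reconstruct both modalities."""

    def __init__(self, text_dim=300, vis_dim=128, hidden_dim=200):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(text_dim + vis_dim, hidden_dim),
            nn.Tanh(),
        )
        self.decoder = nn.Linear(hidden_dim, text_dim + vis_dim)

    def forward(self, text_vec, vis_vec):
        code = self.encoder(torch.cat([text_vec, vis_vec], dim=-1))
        return code, self.decoder(code)

model = BimodalAutoencoder()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

# Stand-in data for word/image feature pairs (random here).
text_batch = torch.randn(32, 300)
vis_batch = torch.randn(32, 128)
target = torch.cat([text_batch, vis_batch], dim=-1)

# Training: minimize reconstruction error over both modalities.
for _ in range(10):  # a few illustrative steps
    code, recon = model(text_batch, vis_batch)
    loss = loss_fn(recon, target)
    opt.zero_grad()
    loss.backward()
    opt.step()

# Run time: for a word with no paired image, one simple option is to
# feed a zero visual vector and keep the hidden code as the visually
# enhanced embedding (an assumption; the paper may handle this differently).
with torch.no_grad():
    code, _ = model(torch.randn(1, 300), torch.zeros(1, 128))
multimodal_embedding = code.squeeze(0)
```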

Original language: English
Publication status: Published - 2017
Event: 12th International Conference on Computational Semantics, IWCS 2017 - Montpellier, France
Duration: 19 Sep 2017 → 22 Sep 2017

Conference

Conference: 12th International Conference on Computational Semantics, IWCS 2017
Country/Territory: France
City: Montpellier
Period: 17/9/19 → 17/9/22

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Computer Science Applications
  • Information Systems
