Abstract
Multimodal semantic representation is an evolving area of research in both natural language processing and computer vision. Combining or integrating perceptual information, such as visual features, with linguistic features has recently been the subject of active study. This paper presents a novel bimodal autoencoder model for multimodal representation learning: the autoencoder learns to enhance linguistic feature vectors by incorporating the corresponding visual features. At run time, the trained network can produce visually enhanced multimodal representations even for words for which no direct visual-linguistic correspondence was learned. Empirical results on standard semantic relatedness tasks demonstrate that our approach is generally promising. We further investigate the potential efficacy of the enhanced word embeddings in discriminating antonyms and synonyms from vaguely related words.
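For illustration, below is a minimal PyTorch sketch of one common bimodal autoencoder formulation: both modalities are concatenated, encoded into a shared hidden code, and reconstructed, and a missing visual modality is zero-filled at run time. All dimensions, layer sizes, the loss, and the zero-filling strategy are assumptions for the sketch, not details taken from the paper.

```python
import torch
import torch.nn as nn

class BimodalAutoencoder(nn.Module):
    """Encodes concatenated linguistic and visual features into a
    shared hidden code and reconstructs both modalities from it."""
    def __init__(self, ling_dim=300, vis_dim=128, hidden_dim=256):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(ling_dim + vis_dim, hidden_dim), nn.Tanh()
        )
        self.decoder = nn.Linear(hidden_dim, ling_dim + vis_dim)

    def forward(self, ling, vis):
        # The shared hidden code doubles as the multimodal word representation.
        z = self.encoder(torch.cat([ling, vis], dim=-1))
        return self.decoder(z), z

model = BimodalAutoencoder()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# Training step on word/image feature pairs (random placeholders here):
ling, vis = torch.randn(32, 300), torch.randn(32, 128)
recon, _ = model(ling, vis)
loss = nn.functional.mse_loss(recon, torch.cat([ling, vis], dim=-1))
opt.zero_grad(); loss.backward(); opt.step()

# Run time: a word without any paired image gets zeros in the visual
# slot; the hidden code serves as its visually enhanced embedding.
with torch.no_grad():
    _, enhanced = model(torch.randn(1, 300), torch.zeros(1, 128))
print(enhanced.shape)  # torch.Size([1, 256])
```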
Original language | English |
---|---|
Publication status | Published - 2017 |
Event | 12th International Conference on Computational Semantics, IWCS 2017 - Montpellier, France. Duration: 19 Sep 2017 → 22 Sep 2017 |
Conference
Conference | 12th International Conference on Computational Semantics, IWCS 2017 |
---|---|
Country/Territory | France |
City | Montpellier |
Period | 19/9/17 → 22/9/17 |
ASJC Scopus subject areas
- Computer Networks and Communications
- Computer Science Applications
- Information Systems