Self-supervised learning for visual summary identification in scientific publications

Shintaro Yamamoto*, Anne Lauscher, Simone Paolo Ponzetto, Goran Glavaš, Shigeo Morishima

*Corresponding author for this work

Research output: Conference article, peer-reviewed

Abstract

Providing visual summaries of scientific publications can increase information access for readers and thereby help deal with the exponential growth in the number of scientific publications. Nonetheless, efforts to provide visual publication summaries have been few and far between, focusing primarily on the biomedical domain. This is mainly because of the limited availability of annotated gold standards, which hampers the application of robust and high-performing supervised learning techniques. To address these problems, we create a new benchmark dataset for selecting figures to serve as visual summaries of publications based on their abstracts, covering several domains in computer science. Moreover, we develop a self-supervised learning approach based on heuristic matching of inline references to figures with figure captions. Experiments in both the biomedical and computer science domains show that our model outperforms the state of the art despite being self-supervised and therefore not relying on any annotated training data.

Original language: English
Pages (from-to): 5-19
Number of pages: 15
Journal: CEUR Workshop Proceedings
Volume: 2847
Publication status: Published - 2021
Event: 11th International Workshop on Bibliometric-Enhanced Information Retrieval, BIR 2021 - Virtual, Lucca, Italy
Duration: 1 April 2021 → …

ASJC Scopus subject areas

  • General Computer Science
