Link Prediction for Wikipedia Articles based on Temporal Article Embedding

Jiaji Ma, Mizuho Iwaihara

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Wikipedia articles contain a vast number of hyperlinks (internal links) connecting subjects to other Wikipedia articles. It is useful to predict future links for newly created articles. Suggesting new links from/to existing articles can reduce editors' burdens by prompting them about necessary or missing links in their updates. In this paper, we discuss link prediction on linked and versioned articles. We propose new graph embeddings based on temporal random walks that are biased by the timestamp difference and the semantic difference between linked and versioned articles. We generate article sequences by concatenating the article titles and category names on each random walk path. A pretrained language model is then further trained to learn contextualized embeddings of these article sequences. We design our link prediction experiments to predict future links between new nodes and existing nodes. For evaluation, we compare our model's predictions with those of three random walk-based graph embedding models (DeepWalk, Node2vec, and CTDNE), using ROC AUC, PRC AUC, Precision@k, Recall@k, and F1@k as evaluation metrics. Our experimental results show that our proposed model, TLPRB, outperforms these models on all the evaluation metrics.
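
The ranking metrics named in the abstract (Precision@k, Recall@k, and F1@k) can be illustrated with a minimal sketch. It uses the standard textbook definitions and is not taken from the paper; the function name, the candidate pairs, and the article labels "A" through "F" are hypothetical placeholders.

```python
def precision_recall_f1_at_k(ranked_pairs, true_links, k):
    """Compute Precision@k, Recall@k and F1@k for ranked link candidates.

    ranked_pairs: list of (source, target) candidates, sorted by descending
                  predicted score (hypothetical input format, an assumption).
    true_links:   set of (source, target) links that actually appear later.
    k:            number of top-ranked candidates to evaluate.
    """
    if k <= 0:
        raise ValueError("k must be a positive integer")

    # Count how many of the top-k candidates are real future links.
    top_k = ranked_pairs[:k]
    hits = sum(1 for pair in top_k if pair in true_links)

    # Precision@k divides by k; Recall@k divides by the number of true links.
    precision = hits / k
    recall = hits / len(true_links) if true_links else 0.0

    # F1@k is the harmonic mean of the two, defined as 0 when both are 0.
    if precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical example: four candidate links from a new article "A".
ranked = [("A", "B"), ("A", "C"), ("A", "D"), ("A", "E")]
truth = {("A", "B"), ("A", "D"), ("A", "F")}
p, r, f = precision_recall_f1_at_k(ranked, truth, k=2)
print(f"P@2={p:.3f} R@2={r:.3f} F1@2={f:.3f}")
```

In this sketch Precision@k always divides by k, even when fewer than k candidates are supplied; some evaluations divide by the number of candidates actually returned instead, and the abstract does not say which convention the authors use.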

Original language: English
Title of host publication: 13th International Conference on Knowledge Discovery and Information Retrieval, KDIR 2021 as part of IC3K 2021 - Proceedings of the 13th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management
Editors: Rita Cucchiara, Ana Fred, Joaquim Filipe
Publisher: Science and Technology Publications, Lda
Pages: 87-94
Number of pages: 8
ISBN (Electronic): 9789897585333
Publication status: Published - 2021
Event: 13th International Conference on Knowledge Discovery and Information Retrieval, KDIR 2021 as part of 13th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management, IC3K 2021 - Virtual, Online
Duration: 2022 Oct 25 → 2022 Oct 27

Publication series

Name: International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management, IC3K - Proceedings
Volume: 1
ISSN (Electronic): 2184-3228

Conference

Conference: 13th International Conference on Knowledge Discovery and Information Retrieval, KDIR 2021 as part of 13th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management, IC3K 2021
City: Virtual, Online
Period: 22/10/25 → 22/10/27

Keywords

  • Graph Embedding
  • Link Prediction
  • Temporal Random Walk

ASJC Scopus subject areas

  • Software
  • Management of Technology and Innovation
  • Strategy and Management
