Finding co-occurring topics in wikipedia article segments

Renzhi Wang, Jianmin Wu, Mizuho Iwaihara

Research output: Chapter

2 Citations (Scopus)

Abstract

Wikipedia is the largest online encyclopedia, in which articles form knowledgeable and semantic resources. Identical topics in different articles indicate that the articles are related to each other about topics. Finding such co-occurring topics is useful to improve the accuracy of querying and clustering, and also to contrast related articles. Existing topic alignment work and topic relevance detection are based on term occurrence. In our research, we discuss incorporating latent topics existing in article segments by utilizing Latent Dirichlet Allocation (LDA), to detect topic relevance. We also study how segment proximities, arising from segment ordering and hyperlinks, shall be incorporated into topic detection and alignment. Experimental data show our method can find and distinguish three types of co-occurrence.
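
As a rough illustration of the idea in the abstract, the sketch below compares per-segment topic distributions (such as an LDA model might infer) across two articles and reports the segment pairs that share a dominant topic and have similar distributions. The topic vectors, the cosine measure, the 0.8 threshold, and the function names are all assumptions made for illustration. This record does not describe the chapter's actual relevance measure or how it uses segment proximity, so this is not the authors' method.

```python
import math
from itertools import product


def cosine(p, q):
    """Cosine similarity between two topic distributions (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0


def co_occurring_segments(article_a, article_b, threshold=0.8):
    """Return (segment_a, segment_b, topic_index, similarity) for cross-article
    segment pairs that share the same dominant topic and whose topic
    distributions have cosine similarity >= threshold (hypothetical criterion)."""
    matches = []
    for (name_a, dist_a), (name_b, dist_b) in product(article_a.items(), article_b.items()):
        sim = cosine(dist_a, dist_b)
        top_a = max(range(len(dist_a)), key=dist_a.__getitem__)
        top_b = max(range(len(dist_b)), key=dist_b.__getitem__)
        if top_a == top_b and sim >= threshold:
            matches.append((name_a, name_b, top_a, sim))
    # Most similar pairs first.
    return sorted(matches, key=lambda m: -m[3])


# Made-up per-segment distributions over three latent topics (not real data).
article_a = {"History": [0.7, 0.2, 0.1], "Economy": [0.1, 0.8, 0.1]}
article_b = {"Background": [0.6, 0.3, 0.1], "Culture": [0.1, 0.1, 0.8]}

matches = co_occurring_segments(article_a, article_b)
for seg_a, seg_b, topic, sim in matches:
    print(f"{seg_a} <-> {seg_b}: topic {topic}, similarity {sim:.2f}")
```

In practice the distributions would come from a topic model fitted on the segment texts, and the threshold would need tuning against labeled examples.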

Original language: English
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Publisher: Springer Verlag
Pages: 252-259
Number of pages: 8
Volume: 8839
ISBN (Print): 9783319128221
Publication status: Published - 2014
Event: 16th International Conference on Asia-Pacific Digital Libraries, ICADL 2014 - Chiang Mai
Duration: 2014-11-05 to 2014-11-07

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 8839
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 16th International Conference on Asia-Pacific Digital Libraries, ICADL 2014
Chiang Mai
Period: 2014-11-05 to 2014-11-07

Fingerprint

Wikipedia
Alignment
Semantics
Proximity
Dirichlet
Experimental Data
Clustering
Resources
Term
Relevance

ASJC Scopus subject areas

  • Computer Science(all)
  • Theoretical Computer Science

Cite this

Wang, R., Wu, J., & Iwaihara, M. (2014). Finding co-occurring topics in wikipedia article segments. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8839, pp. 252-259). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 8839). Springer Verlag.

Finding co-occurring topics in wikipedia article segments. / Wang, Renzhi; Wu, Jianmin; Iwaihara, Mizuho.

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). Vol. 8839 Springer Verlag, 2014. p. 252-259 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 8839).

Wang, R, Wu, J & Iwaihara, M 2014, Finding co-occurring topics in wikipedia article segments. in Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). vol. 8839, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 8839, Springer Verlag, pp. 252-259, 16th International Conference on Asia-Pacific Digital Libraries, ICADL 2014, Chiang Mai, 2014-11-05.

Wang R, Wu J, Iwaihara M. Finding co-occurring topics in wikipedia article segments. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). Vol. 8839. Springer Verlag. 2014. p. 252-259. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).

Wang, Renzhi ; Wu, Jianmin ; Iwaihara, Mizuho. / Finding co-occurring topics in wikipedia article segments. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). Vol. 8839 Springer Verlag, 2014. pp. 252-259 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).
@inbook{cea015e50ec34aab881967a5f7bc4401,
title = "Finding co-occurring topics in wikipedia article segments",
abstract = "Wikipedia is the largest online encyclopedia, in which articles form knowledgeable and semantic resources. Identical topics in different articles indicate that the articles are related to each other about topics. Finding such co-occurring topics is useful to improve the accuracy of querying and clustering, and also to contrast related articles. Existing topic alignment work and topic relevance detection are based on term occurrence. In our research, we discuss incorporating latent topics existing in article segments by utilizing Latent Dirichlet Allocation (LDA), to detect topic relevance. We also study how segment proximities, arising from segment ordering and hyperlinks, shall be incorporated into topic detection and alignment. Experimental data show our method can find and distinguish three types of co-occurrence.",
keywords = "LDA, Link, MLE, Wikipedia",
author = "Renzhi Wang and Jianmin Wu and Mizuho Iwaihara",
year = "2014",
language = "English",
isbn = "9783319128221",
volume = "8839",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer Verlag",
pages = "252--259",
booktitle = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",

}

TY - CHAP

T1 - Finding co-occurring topics in wikipedia article segments

AU - Wang, Renzhi

AU - Wu, Jianmin

AU - Iwaihara, Mizuho

PY - 2014

Y1 - 2014

N2 - Wikipedia is the largest online encyclopedia, in which articles form knowledgeable and semantic resources. Identical topics in different articles indicate that the articles are related to each other about topics. Finding such co-occurring topics is useful to improve the accuracy of querying and clustering, and also to contrast related articles. Existing topic alignment work and topic relevance detection are based on term occurrence. In our research, we discuss incorporating latent topics existing in article segments by utilizing Latent Dirichlet Allocation (LDA), to detect topic relevance. We also study how segment proximities, arising from segment ordering and hyperlinks, shall be incorporated into topic detection and alignment. Experimental data show our method can find and distinguish three types of co-occurrence.

AB - Wikipedia is the largest online encyclopedia, in which articles form knowledgeable and semantic resources. Identical topics in different articles indicate that the articles are related to each other about topics. Finding such co-occurring topics is useful to improve the accuracy of querying and clustering, and also to contrast related articles. Existing topic alignment work and topic relevance detection are based on term occurrence. In our research, we discuss incorporating latent topics existing in article segments by utilizing Latent Dirichlet Allocation (LDA), to detect topic relevance. We also study how segment proximities, arising from segment ordering and hyperlinks, shall be incorporated into topic detection and alignment. Experimental data show our method can find and distinguish three types of co-occurrence.

KW - LDA

KW - Link

KW - MLE

KW - Wikipedia

UR - http://www.scopus.com/inward/record.url?scp=84909587341&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84909587341&partnerID=8YFLogxK

M3 - Chapter

AN - SCOPUS:84909587341

SN - 9783319128221

VL - 8839

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 252

EP - 259

BT - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

PB - Springer Verlag

ER -