Weakly-Supervised Neural Categorization of Wikipedia Articles

Xingyu Chen, Mizuho Iwaihara

Research output: Conference contribution

Abstract

Deep neural models are increasingly popular for many NLP tasks because of their strong expressive power and reduced need for feature engineering. Neural models often need a large number of labeled training documents. However, a single Wikipedia category does not contain enough articles for training. Weakly-supervised neural document classification can handle situations in which only a small set of labeled documents is given. However, these RNN-based approaches often fail on long documents such as Wikipedia articles, because it is hard to retain memory of the important parts of a long document. To overcome these challenges, we propose a text summarization method called WS-Rank, which extracts key sentences from documents, weighting them by class-related keywords and by sentence position in the document. After WS-Rank is applied to training and test documents to summarize them into key sentences, weakly-supervised neural classification shows remarkable improvement in classification results.
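
The idea of selecting key sentences by class-related keywords and sentence position can be illustrated with a minimal sketch. This is not the paper's WS-Rank, whose exact formulation the abstract does not give; the scoring formula, the function names (`split_sentences`, `score_sentence`, `extract_key_sentences`), and the parameters `keyword_weight`, `position_weight`, and `top_k` are all assumptions made for illustration.

```python
import re


def split_sentences(text):
    """Naive splitter on '.', '!' or '?' followed by whitespace (an assumption)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def score_sentence(sentence, index, keywords,
                   keyword_weight=1.0, position_weight=1.0):
    """Hypothetical score: keyword hits plus a bonus that decays with position."""
    tokens = re.findall(r"[a-z0-9]+", sentence.lower())
    hits = sum(1 for t in tokens if t in keywords)
    position_bonus = 1.0 / (index + 1)  # earlier sentences get a larger bonus
    return keyword_weight * hits + position_weight * position_bonus


def extract_key_sentences(text, keywords, top_k=2):
    """Return the top_k highest-scoring sentences in original document order."""
    keywords = {k.lower() for k in keywords}
    sentences = split_sentences(text)
    ranked = sorted(
        range(len(sentences)),
        key=lambda i: score_sentence(sentences[i], i, keywords),
        reverse=True,
    )
    keep = sorted(ranked[:top_k])  # restore document order
    return [sentences[i] for i in keep]


if __name__ == "__main__":
    article = (
        "Tokyo is the capital of Japan. The weather was mild. "
        "The city hosts many football clubs. "
        "Football matches draw large crowds."
    )
    for sentence in extract_key_sentences(article, {"football", "clubs", "matches"}):
        print(sentence)
```

In this sketch the shortened document, rather than the full article, would be passed to the classifier, which is the role the abstract assigns to summarization before weakly-supervised classification.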

Original language: English
Title of host publication: Digital Libraries at the Crossroads of Digital Information for the Future - 21st International Conference on Asia-Pacific Digital Libraries, ICADL 2019, Proceedings
Editors: Adam Jatowt, Akira Maeda, Sue Yeon Syn
Publisher: Springer
Pages: 16-22
Number of pages: 7
ISBN (Print): 9783030340575
Publication status: Published - 2019
Event: 21st International Conference on Asia-Pacific Digital Libraries, ICADL 2019 - Kuala Lumpur, Malaysia
Duration: 2019-11-04 to 2019-11-07

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11853 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 21st International Conference on Asia-Pacific Digital Libraries, ICADL 2019
Country: Malaysia
City: Kuala Lumpur
Period: 2019-11-04 to 2019-11-07

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)
