TY - GEN
T1 - Integrating Semantic-Space Finetuning and Self-Training for Semi-Supervised Multi-label Text Classification
AU - Xu, Zhewei
AU - Iwaihara, Mizuho
N1 - Publisher Copyright: © 2021, Springer Nature Switzerland AG.
PY - 2021
Y1 - 2021
N2 - To address the scarcity of labeled data in document classification tasks, semi-supervised learning, in which unlabeled samples are also used for training, has been studied. Self-training is one of the iconic strategies for semi-supervised learning, in which a classifier trains itself on its own predictions. However, self-training has mostly been applied to multi-class classification and rarely to the multi-label scenario. In this paper, we propose a self-training-based approach for semi-supervised multi-label document classification, in which semantic-space finetuning is introduced and integrated into the self-training process. Newly discovered credible predictions are used not only for classifier finetuning but also for semantic-space finetuning, which in turn benefits label propagation in discovering more credible predictions. Experimental results confirm the effectiveness of the proposed approach and show a satisfactory improvement over the baseline methods.
AB - To address the scarcity of labeled data in document classification tasks, semi-supervised learning, in which unlabeled samples are also used for training, has been studied. Self-training is one of the iconic strategies for semi-supervised learning, in which a classifier trains itself on its own predictions. However, self-training has mostly been applied to multi-class classification and rarely to the multi-label scenario. In this paper, we propose a self-training-based approach for semi-supervised multi-label document classification, in which semantic-space finetuning is introduced and integrated into the self-training process. Newly discovered credible predictions are used not only for classifier finetuning but also for semantic-space finetuning, which in turn benefits label propagation in discovering more credible predictions. Experimental results confirm the effectiveness of the proposed approach and show a satisfactory improvement over the baseline methods.
KW - Label propagation
KW - Multi-label classification
KW - Self-training
KW - Semantic-space finetuning
KW - Semi-supervised learning
UR - http://www.scopus.com/inward/record.url?scp=85121907327&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85121907327&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-91669-5_20
DO - 10.1007/978-3-030-91669-5_20
M3 - Conference contribution
AN - SCOPUS:85121907327
SN - 9783030916688
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 249
EP - 263
BT - Towards Open and Trustworthy Digital Societies - 23rd International Conference on Asia-Pacific Digital Libraries, ICADL 2021, Proceedings
A2 - Ke, Hao-Ren
A2 - Lee, Chei Sian
A2 - Sugiyama, Kazunari
PB - Springer Science and Business Media Deutschland GmbH
T2 - 23rd International Conference on Asia-Pacific Digital Libraries, ICADL 2021
Y2 - 1 December 2021 through 3 December 2021
ER -