Improving Text Classification Using Knowledge in Labels

Cheng Zhang, Hayato Yamana

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Various algorithms and models have been proposed to address text classification tasks; however, they rarely consider incorporating the additional knowledge hidden in class labels. We argue that hidden information in class labels leads to better classification accuracy. In this study, instead of encoding the labels into numerical values, we incorporated the knowledge in the labels into the original model without changing the model architecture. We combined the output of an original classification model with the relatedness calculated from the embeddings of a sequence and a keyword set. A keyword set is a set of words that represents the knowledge in the labels. It is usually generated from the classes, but users can also customize it. The experimental results show that our proposed method achieved statistically significant improvements in text classification tasks. The source code and experimental details of this study can be found on GitHub (https://github.com/HeroadZ/KiL).
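The combination step described in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the abstract does not state the relatedness measure or the combination rule, so cosine similarity against the mean keyword embedding, the softmax normalization, the weighted sum, the `alpha` parameter, and all function names here are assumptions. The toy two-dimensional vectors stand in for real sequence and keyword embeddings.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def combine_with_label_knowledge(model_probs, seq_embedding, keyword_embeddings, alpha=0.5):
    """Blend classifier probabilities with label relatedness (hypothetical rule).

    keyword_embeddings[i] holds the embeddings of the keyword set for class i.
    alpha is an assumed mixing weight; it is not taken from the paper.
    """
    # Assumption: relatedness = cosine(sequence, mean of the class's keyword embeddings).
    relatedness = [cosine(seq_embedding, mean_vector(kws)) for kws in keyword_embeddings]
    rel_probs = softmax(relatedness)
    # Assumption: a convex combination, which leaves the base model architecture untouched.
    return [(1 - alpha) * p + alpha * r for p, r in zip(model_probs, rel_probs)]


# Toy example: the base model is undecided, and the keyword sets break the tie.
model_probs = [0.5, 0.5]
seq_embedding = [1.0, 0.0]
keyword_embeddings = [
    [[1.0, 0.0], [0.9, 0.1]],  # keyword embeddings for class 0
    [[0.0, 1.0], [0.1, 0.9]],  # keyword embeddings for class 1
]
scores = combine_with_label_knowledge(model_probs, seq_embedding, keyword_embeddings)
predicted = max(range(len(scores)), key=scores.__getitem__)
```

Because the relatedness term is computed outside the classifier and only mixed into its output, a sketch like this works with any model that emits class probabilities; the repository linked above is the place to check how the authors actually compute and weight the two terms.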

Original language: English
Title of host publication: 2021 IEEE 6th International Conference on Big Data Analytics, ICBDA 2021
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 193-197
Number of pages: 5
ISBN (Electronic): 9780738131672
DOIs
Publication status: Published - 2021 Mar 5
Event: 6th IEEE International Conference on Big Data Analytics, ICBDA 2021 - Xiamen, China
Duration: 2021 Mar 5 to 2021 Mar 8

Publication series

Name: 2021 IEEE 6th International Conference on Big Data Analytics, ICBDA 2021

Conference

Conference: 6th IEEE International Conference on Big Data Analytics, ICBDA 2021
Country/Territory: China
City: Xiamen
Period: 21/3/5 to 21/3/8

Keywords

  • bert
  • deep learning
  • natural language processing
  • text classification
  • text mining

ASJC Scopus subject areas

  • Information Systems
  • Information Systems and Management
  • Artificial Intelligence
  • Computer Science Applications
