Multi-valued classification of text data based on an ECOC approach using a ternary orthogonal table

Leona Suzuki*, Kenta Mikawa, Masayuki Goto


Research output: Article (peer-reviewed)

1 citation (Scopus)


Advances in information technology have led to large amounts of document data accumulating in various databases, making automatic multi-valued classification highly relevant. This paper focuses on a multi-valued classification technique based on Error-Correcting Output Codes (ECOC), which combines several binary classifiers. To predict the category of a new document, the outputs of the binary classifiers are combined to produce a predicted value. A known problem is that the prediction accuracy of a binary classifier is low when its two category sets have imbalanced amounts of training data. To address this problem, a previous study proposed employing Reed-Muller (RM) codes within an ECOC approach to resolve the imbalance in the cardinality of the training data sets. However, RM codes can equalize the amount of training data between the two category sets only for specific numbers of categories. We aim to provide a method that can be applied to multi-valued classification with an arbitrary number of categories. In this paper, we propose a new configuration method that combines binary classifiers, each of which may leave some categories unused for classification. This method reduces the amount of training data for each binary classifier while improving the balance of training data between its two category sets. As a result, the computational complexity can be decreased. We verify the effectiveness of the proposed method through a document classification experiment.
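The general idea of an ECOC scheme in which some categories are left out of individual binary classifiers can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the four category names, the hand-written ternary code matrix (which is not the orthogonal table constructed in the paper), and the decoding rule, which is a commonly used Hamming-style distance where an unused entry (0) contributes 0.5 and is not necessarily the rule used by the authors.

```python
# Hypothetical ternary code matrix (NOT the table from the paper).
# Rows are categories, columns are binary classifiers.
#   +1 / -1 : the category is in the positive / negative set of that classifier
#    0      : the category is not used to train that classifier
CODE_MATRIX = {
    "A": [+1, +1,  0, +1],
    "B": [+1,  0, +1, -1],
    "C": [-1, -1,  0, +1],
    "D": [-1,  0, -1, -1],
}


def training_sets(code_matrix):
    """For each classifier, list the categories in its positive and negative sets."""
    n_classifiers = len(next(iter(code_matrix.values())))
    sets = []
    for j in range(n_classifiers):
        positive = [c for c, word in code_matrix.items() if word[j] == +1]
        negative = [c for c, word in code_matrix.items() if word[j] == -1]
        sets.append((positive, negative))
    return sets


def decode(code_matrix, outputs):
    """Return the category whose codeword is closest to the classifier outputs.

    Assumed distance: sum over classifiers of (1 - code * output) / 2, so a
    matching entry costs 0, a mismatch costs 1, and an unused entry costs 0.5.
    """
    def distance(word):
        return sum((1 - w * o) / 2 for w, o in zip(word, outputs))

    return min(code_matrix, key=lambda c: distance(code_matrix[c]))


# Each classifier sees equally sized positive and negative sets, and the
# second and third classifiers train on only two of the four categories.
for j, (pos, neg) in enumerate(training_sets(CODE_MATRIX)):
    print(f"classifier {j}: positive={pos} negative={neg}")

# Combine the four binary outputs for one new document into a prediction.
print(decode(CODE_MATRIX, [+1, -1, +1, -1]))
```

In this toy matrix every column has the same number of +1 and -1 entries, which mirrors the balancing goal described in the abstract, while the 0 entries shrink the training set of the classifiers that contain them.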

Journal: Industrial Engineering and Management Systems
Publication status: Published - June 2017

ASJC Scopus subject areas

  • Social Sciences (General)
  • Economics, Econometrics and Finance (General)

