Multi-valued classification of text data based on an ECOC approach using a ternary orthogonal table

Leona Suzuki*, Kenta Mikawa, Masayuki Goto

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

1 Citation (Scopus)


With advances in information technology, a large amount of document data has accumulated in various databases, and automatic multi-valued classification has become highly relevant. This paper focuses on a multi-valued classification technique that is based on Error-Correcting Output Codes (ECOC) and combines several binary classifiers. When predicting the category of a new document, the outputs of the binary classifiers are combined to produce a predicted value. It is a known problem that the prediction accuracy of a binary classifier is low if its two category sets have imbalanced amounts of training data. To solve this problem, a previous study proposed employing Reed-Muller (RM) codes in the context of an ECOC approach to resolve the imbalance in the cardinality of the training data sets. However, RM codes can equalize the amount of training data between the two category sets only for specific numbers of categories. We want to provide a method that can be employed for multi-valued classification with an arbitrary number of categories. In this paper, we propose a new method of configuring the binary classifiers in which some categories are not used for classification by a given classifier. This method allows us to reduce the amount of training data for each binary classifier while improving the balance of the training data between the two category sets of each binary classifier. As a result, the computational complexity can be decreased. We verify the effectiveness of the proposed method by conducting a document classification experiment.
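The general idea of an ECOC scheme with a ternary code table can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the three-category table `CODE_TABLE` is a hypothetical one-vs-one-style table, not the ternary orthogonal table used in the paper, and the decoding rule (count disagreements, skipping zero entries) is one common choice rather than the authors' stated procedure.

```python
# Hypothetical ternary code table: rows are categories, columns are binary classifiers.
# +1 / -1 assign a category to one of a classifier's two category sets;
# 0 means the category is left out of that classifier's training data.
CODE_TABLE = {
    "A": [+1, +1, 0],
    "B": [-1, 0, +1],
    "C": [0, -1, -1],
}


def training_split(code_table, column):
    """Return (positive, negative, unused) category lists for one binary classifier."""
    positive = [c for c, row in code_table.items() if row[column] == +1]
    negative = [c for c, row in code_table.items() if row[column] == -1]
    unused = [c for c, row in code_table.items() if row[column] == 0]
    return positive, negative, unused


def decode(code_table, outputs):
    """Pick the category whose code word disagrees least with the classifier outputs.

    Positions where the code word is 0 are skipped (an assumed decoding rule).
    Ties are broken in favour of the category listed first in the table.
    """
    def distance(row):
        return sum(1 for code, out in zip(row, outputs) if code != 0 and code != out)

    return min(code_table, key=lambda c: distance(code_table[c]))


if __name__ == "__main__":
    # The first classifier is trained on A (positive) versus B (negative); C is unused.
    print(training_split(CODE_TABLE, 0))
    # The combined outputs of the three classifiers are decoded to one category.
    print(decode(CODE_TABLE, [+1, +1, -1]))
```

In this toy table every classifier sees one category per side, so the two category sets are balanced by construction, and each classifier trains on less data than it would if all categories were used.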

Original language: English
Pages (from-to): 155-164
Number of pages: 10
Journal: Industrial Engineering and Management Systems
Issue number: 2
Publication status: Published - 2017 Jun


Keywords

  • Error-correcting output codes
  • Multi-valued classification
  • Ternary code table
  • Text data

ASJC Scopus subject areas

  • Social Sciences(all)
  • Economics, Econometrics and Finance(all)

