Binary document classification based on fast flux discriminant with similarity measure on word set

Keisuke Okubo, Gendo Kumoi, Masayuki Goto

Research output: Article, peer-reviewed

Abstract

Fast Flux Discriminant (FFD) is a high-performance nonlinear binary classifier that can construct a classification model accounting for interactions between variables. To capture these interactions, FFD applies histogram-based kernel smoothing over subspaces formed from combinations of variables. However, when creating subspaces, the original FFD must cover all variables, including combinations of variables with low interaction. As a result, the computational cost grows exponentially with the number of dimensions. In this study, we calculate the similarity between variables using the Kullback-Leibler (KL) divergence and then, based on the obtained similarities, divide the variables into subspaces of similar variables. In this way, we aim to reduce the amount of computation while maintaining classification accuracy by using only combinations of variables that are likely to interact strongly. Simulation experiments with Japanese newspaper articles demonstrate the effectiveness of the proposed method.
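The grouping idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the KL divergence is computed or how subspaces are formed. The sketch assumes each word is represented by its smoothed frequency distribution over documents, that similarity is a symmetrized KL divergence, and that groups are built greedily with a fixed size. The function names, the `group_size` parameter, and the toy counts are all hypothetical.

```python
import math


def word_distribution(counts, eps=1e-9):
    """Turn a word's per-document counts into a smoothed probability distribution."""
    total = sum(counts) + eps * len(counts)
    return [(c + eps) / total for c in counts]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions with strictly positive entries."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def symmetric_kl(p, q):
    """Symmetrized KL divergence (an assumption; the paper may use another form)."""
    return kl_divergence(p, q) + kl_divergence(q, p)


def group_similar_words(doc_term_counts, group_size):
    """Greedy grouping (hypothetical): take a seed word, then attach the
    unassigned words whose distributions are closest to the seed."""
    dists = {w: word_distribution(c) for w, c in doc_term_counts.items()}
    unassigned = list(dists)
    groups = []
    while unassigned:
        seed = unassigned.pop(0)
        # Closest words (smallest divergence from the seed) come first.
        unassigned.sort(key=lambda w: symmetric_kl(dists[seed], dists[w]))
        groups.append([seed] + unassigned[:group_size - 1])
        unassigned = unassigned[group_size - 1:]
    return groups


# Toy data: counts of each word in four documents (illustrative values only).
counts = {
    "election": [5, 4, 0, 0],
    "vote": [4, 5, 0, 0],
    "goal": [0, 0, 6, 3],
    "match": [0, 1, 5, 4],
}
groups = group_similar_words(counts, group_size=2)
print(groups)
```

Under these assumptions, variable combinations would be enumerated only inside each group rather than across the full word set, which is where the reduction in computation described in the abstract would come from.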

Original language: English
Pages (from-to): 245-251
Number of pages: 7
Journal: Industrial Engineering and Management Systems
Volume: 18
Issue number: 2
Publication status: Published - January 1, 2019

ASJC Scopus subject areas

  • Social Sciences (all)
  • Economics, Econometrics and Finance (all)
