Learning misclassification costs for imbalanced datasets, application in gene expression data classification

Huijuan Lu, Yige Xu, Minchao Ye, Ke Yan, Qun Jin, Zhigang Gao

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution

Abstract

Cost-sensitive algorithms have been widely used to address imbalanced classification problems. However, the misclassification costs are usually determined empirically, which leads to uncertain performance. An effective method for automatically calculating the optimal cost weights is therefore desirable. Targeting the highest weighted classification accuracy (WCA), we propose two approaches to search for the optimal cost weights: grid searching and function fitting. In experiments, we classify imbalanced gene expression data using an extreme learning machine to test the cost weights obtained by the two approaches. Comprehensive experimental results show that function fitting is the more efficient approach and finds the optimal cost weights with acceptable WCA.
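The grid-searching approach named in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration, not the authors' implementation: WCA is approximated as the mean of per-class recalls, the classifier is a small cost-weighted extreme learning machine solved by ridge regression, and the helper names (weighted_accuracy, train_weighted_elm, predict_elm, grid_search_cost, make_imbalanced) and the synthetic data are hypothetical.

```python
import numpy as np


def weighted_accuracy(y_true, y_pred):
    """Mean of per-class recalls; an assumed stand-in for the paper's WCA."""
    classes = np.unique(y_true)
    recalls = [np.mean(y_pred[y_true == c] == c) for c in classes]
    return float(np.mean(recalls))


def train_weighted_elm(X, y, minority_cost, n_hidden=20, reg=1.0, seed=0):
    """Binary ELM with per-sample costs (label 0 = majority, 1 = minority)."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))  # random, untrained input weights
    b = rng.normal(size=n_hidden)
    H = np.tanh(X @ W + b)                       # hidden-layer outputs
    t = np.where(y == 1, 1.0, -1.0)              # targets in {-1, +1}
    c = np.where(y == 1, minority_cost, 1.0)     # misclassification cost per sample
    # Cost-weighted ridge solution: (H^T C H + reg * I) beta = H^T C t
    A = H.T @ (H * c[:, None]) + reg * np.eye(n_hidden)
    beta = np.linalg.solve(A, H.T @ (c * t))
    return W, b, beta


def predict_elm(model, X):
    W, b, beta = model
    return (np.tanh(X @ W + b) @ beta > 0).astype(int)


def grid_search_cost(X_tr, y_tr, X_val, y_val, grid):
    """Try every candidate minority cost and keep the one with the best WCA."""
    scores = []
    for w in grid:
        model = train_weighted_elm(X_tr, y_tr, w)
        scores.append(weighted_accuracy(y_val, predict_elm(model, X_val)))
    best = int(np.argmax(scores))
    return grid[best], scores[best], scores


def make_imbalanced(n_major, n_minor, n_features=5, seed=0):
    """Hypothetical synthetic data: two Gaussian classes of unequal size."""
    rng = np.random.default_rng(seed)
    X = np.vstack([rng.normal(0.0, 1.0, (n_major, n_features)),
                   rng.normal(1.0, 1.0, (n_minor, n_features))])
    y = np.concatenate([np.zeros(n_major, dtype=int),
                        np.ones(n_minor, dtype=int)])
    return X, y
```

A call such as grid_search_cost(X_tr, y_tr, X_val, y_val, [1.0, 2.0, 5.0, 10.0, 20.0]) returns the selected cost weight, its score, and the score for every grid point. The cost of grid search grows with the number of candidates, since each one requires retraining the classifier.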

Original language: English
Title of host publication: Intelligent Computing Theories and Application - 14th International Conference, ICIC 2018, Proceedings
Editors: Prashan Premaratne, Phalguni Gupta, De-Shuang Huang, Vitoantonio Bevilacqua
Publisher: Springer Verlag
Pages: 513-519
Number of pages: 7
ISBN (Print): 9783319959290
Publication status: Published - 2018
Event: 14th International Conference on Intelligent Computing, ICIC 2018 - Wuhan, China
Duration: 2018 Aug 15 to 2018 Aug 18

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 10954 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 14th International Conference on Intelligent Computing, ICIC 2018
Country: China
City: Wuhan
Period: 18/8/15 to 18/8/18

Keywords

  • Correct classification rate
  • Cost-sensitive
  • Misclassification cost
  • Parameter fitting

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)
