Learning misclassification costs for imbalanced datasets, application in gene expression data classification

Huijuan Lu, Yige Xu, Minchao Ye, Ke Yan, Qun Jin, Zhigang Gao

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

    Abstract

    Cost-sensitive algorithms have been widely used to solve imbalanced classification problems. However, the misclassification costs are usually determined empirically, which leads to uncertain performance. An effective method is therefore needed to calculate the optimal cost weights automatically. Targeting the highest weighted classification accuracy (WCA), we propose two approaches to search for the optimal cost weights: grid searching and function fitting. In the experiments, we classify imbalanced gene expression data using an extreme learning machine to test the cost weights obtained by the two approaches. Comprehensive experimental results show that function fitting is the more efficient approach and can find the optimal cost weights with acceptable WCA.
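
    The grid-searching idea in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' implementation: WCA is taken to be a weighted mean of the per-class recalls, the classifier is a small cost-weighted extreme learning machine (random hidden layer plus weighted ridge regression), the data are synthetic, and all function names and the grid values are hypothetical.

```python
import numpy as np

def wca(y_true, y_pred, w_pos=0.5):
    """Assumed WCA: weighted mean of per-class recall (definition not from the paper)."""
    pos = y_true == 1
    tpr = np.mean(y_pred[pos] == 1)    # recall on the minority (positive) class
    tnr = np.mean(y_pred[~pos] == 0)   # recall on the majority (negative) class
    return w_pos * tpr + (1 - w_pos) * tnr

def fit_weighted_elm(X, y, cost_pos, n_hidden=20, reg=1e-2, seed=0):
    """Hypothetical cost-weighted ELM: random hidden layer, weighted ridge output layer."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    H = np.tanh(X @ W + b)                  # hidden-layer activations
    t = np.where(y == 1, 1.0, -1.0)         # targets in {-1, +1}
    c = np.where(y == 1, cost_pos, 1.0)     # per-sample misclassification cost
    # Solve (H^T C H + reg * I) beta = H^T C t, with C = diag(c)
    A = H.T @ (c[:, None] * H) + reg * np.eye(n_hidden)
    beta = np.linalg.solve(A, H.T @ (c * t))
    return W, b, beta

def predict(model, X):
    W, b, beta = model
    return (np.tanh(X @ W + b) @ beta > 0).astype(int)

def grid_search_cost(X_tr, y_tr, X_val, y_val, grid):
    """Try every candidate cost weight and keep the one with the highest validation WCA."""
    scores = [wca(y_val, predict(fit_weighted_elm(X_tr, y_tr, c), X_val))
              for c in grid]
    best = int(np.argmax(scores))
    return grid[best], scores[best], scores

# Synthetic imbalanced data (10:1), split so that both sets contain both classes.
rng = np.random.default_rng(1)
X_tr = np.vstack([rng.normal(0, 1, (150, 5)), rng.normal(1, 1, (15, 5))])
y_tr = np.array([0] * 150 + [1] * 15)
X_val = np.vstack([rng.normal(0, 1, (50, 5)), rng.normal(1, 1, (5, 5))])
y_val = np.array([0] * 50 + [1] * 5)

grid = [0.5, 1.0, 2.0, 4.0, 8.0, 16.0]     # candidate minority-class cost weights
best_cost, best_score, scores = grid_search_cost(X_tr, y_tr, X_val, y_val, grid)
print(best_cost, round(float(best_score), 3))
```

    Grid searching trains one classifier per candidate weight, so its cost grows linearly with the number of grid points.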

    Original language: English
    Title of host publication: Intelligent Computing Theories and Application - 14th International Conference, ICIC 2018, Proceedings
    Editors: Prashan Premaratne, Phalguni Gupta, De-Shuang Huang, Vitoantonio Bevilacqua
    Publisher: Springer-Verlag
    Pages: 513-519
    Number of pages: 7
    ISBN (Print): 9783319959290
    DOIs: https://doi.org/10.1007/978-3-319-95930-6_47
    Publication status: Published - 2018 Jan 1
    Event: 14th International Conference on Intelligent Computing, ICIC 2018 - Wuhan, China
    Duration: 2018 Aug 15 → 2018 Aug 18

    Publication series

    Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
    Volume: 10954 LNCS
    ISSN (Print): 0302-9743
    ISSN (Electronic): 1611-3349

    Other

    Other: 14th International Conference on Intelligent Computing, ICIC 2018
    Country: China
    City: Wuhan
    Period: 18/8/15 → 18/8/18

    Keywords

    • Correct classification rate
    • Cost-sensitive
    • Misclassification cost
    • Parameter fitting

    ASJC Scopus subject areas

    • Theoretical Computer Science
    • Computer Science (all)


  • Cite this

    Lu, H., Xu, Y., Ye, M., Yan, K., Jin, Q., & Gao, Z. (2018). Learning misclassification costs for imbalanced datasets, application in gene expression data classification. In P. Premaratne, P. Gupta, D-S. Huang, & V. Bevilacqua (Eds.), Intelligent Computing Theories and Application - 14th International Conference, ICIC 2018, Proceedings (pp. 513-519). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 10954 LNCS). Springer-Verlag. https://doi.org/10.1007/978-3-319-95930-6_47