An improved SSO algorithm for cyber-enabled tumor risk analysis based on gene selection

Chaochao Ye, Julong Pan, Qun Jin

    Research output: Contribution to journal (Article)

    Abstract

    With the emergence and wide application of cyber technologies, the process of medical informatization has progressed rapidly in recent years. The collection of gene expression data and cyber-enabled tumor risk analysis has matured and is becoming more common. In the case of tumor risk analysis, identification of the distinct genes that contribute the most to the occurrence of tumors has become an increasingly important issue. In this paper, based on gene selection, an improved SSO (Simplified Swarm Optimization) algorithm is developed for data-driven tumor risk analysis that is able to obtain a higher classification accuracy with fewer selected genes. The proposed algorithm is called iSSO-HF&LSS (improved SSO with a hybrid filter and local search strategy) and utilizes information gain and the Pearson correlation coefficient as a hybrid filter method to select a small number of distinct and discriminative genes. Moreover, to select an optimal gene subset, a new local search strategy is applied. The proposed local search strategy selects informative but fewer correlated genes by considering their correlation information. To evaluate the efficiency of the algorithm, a series of experiments is conducted using ten tumor gene expression datasets, and a comparison is made between the performance of this proposed method and nine well-known benchmark classification methods as well as methods used in six referenced studies. As evaluated by several statistical analyses, the proposed method outperforms the existing methods with significant differences and efficiently simplifies the number of gene expression levels.
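The abstract names information gain and the Pearson correlation coefficient as the two ingredients of the hybrid filter, but it does not spell out how they are combined. The sketch below is one plausible reading, not the authors' procedure: rank genes by information gain, then greedily discard any gene that is strongly correlated with one already kept. The function names, the equal-width binning, and the `top_k` and `max_abs_corr` parameters are all assumptions made for illustration.

```python
import numpy as np


def entropy(labels):
    """Shannon entropy (in bits) of a 1-D array of discrete labels."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())


def information_gain(feature, labels, bins=3):
    """Information gain of one continuous feature after equal-width binning.

    The binning scheme is an assumption; the abstract does not specify one.
    """
    edges = np.histogram_bin_edges(feature, bins=bins)
    binned = np.digitize(feature, edges[1:-1])  # bin index per sample
    conditional = 0.0
    for value in np.unique(binned):
        mask = binned == value
        conditional += mask.mean() * entropy(labels[mask])
    return entropy(labels) - conditional


def hybrid_filter(X, y, top_k=50, max_abs_corr=0.8):
    """Rank genes by information gain, then greedily drop correlated ones.

    X has shape (samples, genes). Returns the column indices of kept genes.
    top_k and max_abs_corr are hypothetical tuning parameters.
    """
    gains = np.array([information_gain(X[:, j], y) for j in range(X.shape[1])])
    ranked = np.argsort(-gains, kind="stable")[:top_k]
    kept = []
    for j in ranked:
        # A gene is redundant if |Pearson r| with any kept gene is too high.
        redundant = any(
            abs(np.corrcoef(X[:, j], X[:, k])[0, 1]) > max_abs_corr for k in kept
        )
        if not redundant:
            kept.append(int(j))
    return kept
```

On a toy matrix where the second gene is a linear rescaling of the first, this filter keeps the first gene and drops the second, which matches the stated goal of selecting informative but weakly correlated genes.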

    Original language: English
    Pages (from-to): 407-418
    Number of pages: 12
    Journal: Future Generation Computer Systems
    Volume: 92
    DOI: 10.1016/j.future.2018.10.008
    Publication status: Published - 2019 Mar 1

    Fingerprint

    Risk analysis
    Tumors
    Genes
    Gene expression
    Experiments

    Keywords

    • Gene selection
    • Information gain
    • Pearson correlation coefficient
    • Simplified swarm optimization
    • Tumor

    ASJC Scopus subject areas

    • Software
    • Hardware and Architecture
    • Computer Networks and Communications

    Cite this

    An improved SSO algorithm for cyber-enabled tumor risk analysis based on gene selection. / Ye, Chaochao; Pan, Julong; Jin, Qun.

    In: Future Generation Computer Systems, Vol. 92, 01.03.2019, p. 407-418.

    @article{711e7e83e3b943e98de757b6f137be09,
    title = "An improved SSO algorithm for cyber-enabled tumor risk analysis based on gene selection",
    abstract = "With the emergence and wide application of cyber technologies, the process of medical informatization has progressed rapidly in recent years. The collection of gene expression data and cyber-enabled tumor risk analysis has matured and is becoming more common. In the case of tumor risk analysis, identification of the distinct genes that contribute the most to the occurrence of tumors has become an increasingly important issue. In this paper, based on gene selection, an improved SSO (Simplified Swarm Optimization) algorithm is developed for data-driven tumor risk analysis that is able to obtain a higher classification accuracy with fewer selected genes. The proposed algorithm is called iSSO-HF\&LSS (improved SSO with a hybrid filter and local search strategy) and utilizes information gain and the Pearson correlation coefficient as a hybrid filter method to select a small number of distinct and discriminative genes. Moreover, to select an optimal gene subset, a new local search strategy is applied. The proposed local search strategy selects informative but fewer correlated genes by considering their correlation information. To evaluate the efficiency of the algorithm, a series of experiments is conducted using ten tumor gene expression datasets, and a comparison is made between the performance of this proposed method and nine well-known benchmark classification methods as well as methods used in six referenced studies. As evaluated by several statistical analyses, the proposed method outperforms the existing methods with significant differences and efficiently simplifies the number of gene expression levels.",
    keywords = "Gene selection, Information gain, Pearson correlation coefficient, Simplified swarm optimization, Tumor",
    author = "Chaochao Ye and Julong Pan and Qun Jin",
    year = "2019",
    month = "3",
    day = "1",
    doi = "10.1016/j.future.2018.10.008",
    language = "English",
    volume = "92",
    pages = "407--418",
    journal = "Future Generation Computer Systems",
    issn = "0167-739X",
    publisher = "Elsevier",

    }

    TY  - JOUR
    T1  - An improved SSO algorithm for cyber-enabled tumor risk analysis based on gene selection
    AU  - Ye, Chaochao
    AU  - Pan, Julong
    AU  - Jin, Qun
    PY  - 2019/3/1
    Y1  - 2019/3/1
    N2  - With the emergence and wide application of cyber technologies, the process of medical informatization has progressed rapidly in recent years. The collection of gene expression data and cyber-enabled tumor risk analysis has matured and is becoming more common. In the case of tumor risk analysis, identification of the distinct genes that contribute the most to the occurrence of tumors has become an increasingly important issue. In this paper, based on gene selection, an improved SSO (Simplified Swarm Optimization) algorithm is developed for data-driven tumor risk analysis that is able to obtain a higher classification accuracy with fewer selected genes. The proposed algorithm is called iSSO-HF&LSS (improved SSO with a hybrid filter and local search strategy) and utilizes information gain and the Pearson correlation coefficient as a hybrid filter method to select a small number of distinct and discriminative genes. Moreover, to select an optimal gene subset, a new local search strategy is applied. The proposed local search strategy selects informative but fewer correlated genes by considering their correlation information. To evaluate the efficiency of the algorithm, a series of experiments is conducted using ten tumor gene expression datasets, and a comparison is made between the performance of this proposed method and nine well-known benchmark classification methods as well as methods used in six referenced studies. As evaluated by several statistical analyses, the proposed method outperforms the existing methods with significant differences and efficiently simplifies the number of gene expression levels.
    KW  - Gene selection
    KW  - Information gain
    KW  - Pearson correlation coefficient
    KW  - Simplified swarm optimization
    KW  - Tumor
    UR  - http://www.scopus.com/inward/record.url?scp=85055671515&partnerID=8YFLogxK
    UR  - http://www.scopus.com/inward/citedby.url?scp=85055671515&partnerID=8YFLogxK
    U2  - 10.1016/j.future.2018.10.008
    DO  - 10.1016/j.future.2018.10.008
    M3  - Article
    VL  - 92
    SP  - 407
    EP  - 418
    JO  - Future Generation Computer Systems
    JF  - Future Generation Computer Systems
    SN  - 0167-739X
    ER  -