An improved SSO algorithm for cyber-enabled tumor risk analysis based on gene selection

Chaochao Ye, Julong Pan, Qun Jin

Research output: Contribution to journal (Article)

Abstract

With the emergence and wide application of cyber technologies, the process of medical informatization has progressed rapidly in recent years. The collection of gene expression data and cyber-enabled tumor risk analysis has matured and is becoming more common. In the case of tumor risk analysis, identification of the distinct genes that contribute the most to the occurrence of tumors has become an increasingly important issue. In this paper, based on gene selection, an improved SSO (Simplified Swarm Optimization) algorithm is developed for data-driven tumor risk analysis that is able to obtain a higher classification accuracy with fewer selected genes. The proposed algorithm is called iSSO-HF&LSS (improved SSO with a hybrid filter and local search strategy) and utilizes information gain and the Pearson correlation coefficient as a hybrid filter method to select a small number of distinct and discriminative genes. Moreover, to select an optimal gene subset, a new local search strategy is applied. The proposed local search strategy selects informative but fewer correlated genes by considering their correlation information. To evaluate the efficiency of the algorithm, a series of experiments is conducted using ten tumor gene expression datasets, and a comparison is made between the performance of this proposed method and nine well-known benchmark classification methods as well as methods used in six referenced studies. As evaluated by several statistical analyses, the proposed method outperforms the existing methods with significant differences and efficiently simplifies the number of gene expression levels.
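The hybrid filter step described in the abstract (rank genes by information gain, then discard genes that are strongly Pearson-correlated with ones already kept) can be illustrated with a minimal sketch. This is not the authors' implementation: the median-based binarisation, the `top_k` and `max_corr` parameters, the function names, and the toy data are all assumptions made for illustration, and the SSO search and local search strategy are not modelled.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels):
    """Information gain of one gene with respect to the class labels.

    Assumption: expression values are binarised at the (upper) median.
    The abstract does not say how discretisation is done.
    """
    cut = sorted(values)[len(values) // 2]
    groups = {True: [], False: []}
    for v, y in zip(values, labels):
        groups[v >= cut].append(y)
    n = len(labels)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values() if g)
    return entropy(labels) - conditional


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def hybrid_filter(genes, labels, top_k=3, max_corr=0.9):
    """Keep the top_k genes by information gain, then drop any gene whose
    absolute correlation with an already kept gene reaches max_corr.
    Both thresholds are hypothetical tuning parameters."""
    ranked = sorted(
        genes, key=lambda g: information_gain(genes[g], labels), reverse=True
    )[:top_k]
    kept = []
    for g in ranked:
        if all(abs(pearson(genes[g], genes[h])) < max_corr for h in kept):
            kept.append(g)
    return kept


# Toy data (invented): 8 samples, 4 genes, two classes.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
genes = {
    "g1": [1.0, 1.2, 0.9, 1.1, 3.1, 3.0, 2.8, 2.9],  # separates the classes
    "g2": [2.0, 2.4, 1.8, 2.2, 6.2, 6.0, 5.6, 5.8],  # g1 scaled by 2: redundant
    "g3": [0.5, 0.7, 0.6, 2.5, 2.4, 0.8, 2.6, 2.7],  # partly informative
    "g4": [1.0, 3.0, 1.1, 3.1, 0.9, 2.9, 1.2, 3.2],  # unrelated to the class
}

selected = hybrid_filter(genes, labels, top_k=3, max_corr=0.9)
print(selected)  # ['g1', 'g3']
```

In this toy run, `g4` is cut by the information-gain ranking and `g2` is dropped as redundant with `g1`, which mirrors the stated goal of selecting informative genes that are less correlated with each other.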

Language: English
Pages: 407-418
Number of pages: 12
Journal: Future Generation Computer Systems
Volume: 92
DOI: 10.1016/j.future.2018.10.008
Publication status: Published - 2019 Mar 1

Fingerprint

Risk analysis
Tumors
Genes
Gene expression
Experiments

Keywords

  • Gene selection
  • Information gain
  • Pearson correlation coefficient
  • Simplified swarm optimization
  • Tumor

ASJC Scopus subject areas

  • Software
  • Hardware and Architecture
  • Computer Networks and Communications

Cite this

An improved SSO algorithm for cyber-enabled tumor risk analysis based on gene selection. / Ye, Chaochao; Pan, Julong; Jin, Qun.

In: Future Generation Computer Systems, Vol. 92, 01.03.2019, p. 407-418.

@article{711e7e83e3b943e98de757b6f137be09,
title = "An improved SSO algorithm for cyber-enabled tumor risk analysis based on gene selection",
abstract = "With the emergence and wide application of cyber technologies, the process of medical informatization has progressed rapidly in recent years. The collection of gene expression data and cyber-enabled tumor risk analysis has matured and is becoming more common. In the case of tumor risk analysis, identification of the distinct genes that contribute the most to the occurrence of tumors has become an increasingly important issue. In this paper, based on gene selection, an improved SSO (Simplified Swarm Optimization) algorithm is developed for data-driven tumor risk analysis that is able to obtain a higher classification accuracy with fewer selected genes. The proposed algorithm is called iSSO-HF\&LSS (improved SSO with a hybrid filter and local search strategy) and utilizes information gain and the Pearson correlation coefficient as a hybrid filter method to select a small number of distinct and discriminative genes. Moreover, to select an optimal gene subset, a new local search strategy is applied. The proposed local search strategy selects informative but fewer correlated genes by considering their correlation information. To evaluate the efficiency of the algorithm, a series of experiments is conducted using ten tumor gene expression datasets, and a comparison is made between the performance of this proposed method and nine well-known benchmark classification methods as well as methods used in six referenced studies. As evaluated by several statistical analyses, the proposed method outperforms the existing methods with significant differences and efficiently simplifies the number of gene expression levels.",
keywords = "Gene selection, Information gain, Pearson correlation coefficient, Simplified swarm optimization, Tumor",
author = "Chaochao Ye and Julong Pan and Qun Jin",
year = "2019",
month = mar,
day = "1",
doi = "10.1016/j.future.2018.10.008",
language = "English",
volume = "92",
pages = "407--418",
journal = "Future Generation Computer Systems",
issn = "0167-739X",
publisher = "Elsevier",
}

TY - JOUR
T1 - An improved SSO algorithm for cyber-enabled tumor risk analysis based on gene selection
AU - Ye, Chaochao
AU - Pan, Julong
AU - Jin, Qun
PY - 2019/3/1
Y1 - 2019/3/1
AB - With the emergence and wide application of cyber technologies, the process of medical informatization has progressed rapidly in recent years. The collection of gene expression data and cyber-enabled tumor risk analysis has matured and is becoming more common. In the case of tumor risk analysis, identification of the distinct genes that contribute the most to the occurrence of tumors has become an increasingly important issue. In this paper, based on gene selection, an improved SSO (Simplified Swarm Optimization) algorithm is developed for data-driven tumor risk analysis that is able to obtain a higher classification accuracy with fewer selected genes. The proposed algorithm is called iSSO-HF&LSS (improved SSO with a hybrid filter and local search strategy) and utilizes information gain and the Pearson correlation coefficient as a hybrid filter method to select a small number of distinct and discriminative genes. Moreover, to select an optimal gene subset, a new local search strategy is applied. The proposed local search strategy selects informative but fewer correlated genes by considering their correlation information. To evaluate the efficiency of the algorithm, a series of experiments is conducted using ten tumor gene expression datasets, and a comparison is made between the performance of this proposed method and nine well-known benchmark classification methods as well as methods used in six referenced studies. As evaluated by several statistical analyses, the proposed method outperforms the existing methods with significant differences and efficiently simplifies the number of gene expression levels.

KW - Gene selection
KW - Information gain
KW - Pearson correlation coefficient
KW - Simplified swarm optimization
KW - Tumor
UR - http://www.scopus.com/inward/record.url?scp=85055671515&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85055671515&partnerID=8YFLogxK
U2 - 10.1016/j.future.2018.10.008
DO - 10.1016/j.future.2018.10.008
M3 - Article
VL - 92
SP - 407
EP - 418
JO - Future Generation Computer Systems
T2 - Future Generation Computer Systems
JF - Future Generation Computer Systems
SN - 0167-739X
ER -