TY - JOUR
T1 - Solving the imbalanced data classification problem with the particle swarm optimization based support vector machine
AU - Xu, Zhenyuan
AU - Watada, Junzo
AU - Wu, Mingnan
AU - Ibrahim, Zuwairie
AU - Khalid, Marzuki
PY - 2014
Y1 - 2014
N2 - A database contains a wealth of hidden knowledge that can be used in decision making to support commerce, business, management, research and other activities. Classification analysis plays a pivotal role in the pattern recognition field, where it is considered a core method. Algorithms such as the support vector machine (SVM) and the artificial neural network (ANN) have been proposed to solve the binary classification problem according to data distributions. However, these traditional classification algorithms are unable to provide satisfactory results for an imbalanced dataset with special characteristics. In this paper, we propose a model based on particle swarm optimization (PSO) and the support vector machine (SVM) for use in the classification of a large, imbalanced dataset. This model is referred to as the PSO-SVM (particle swarm optimization-based support vector machine) model. PSO was recently proposed as a metaheuristic framework for large, imbalanced dataset classification. The SVM algorithm also exhibits a high level of performance in handling balanced binary classification. Therefore, the novel model proposed here is introduced to improve classification accuracy by combining support vector classification (SVC) with an imbalanced PSO. The G-mean is used to evaluate the final results. In the final section of this paper, the proposed method is compared with some conventional heuristic models. The experimental results demonstrate that the proposed method exhibits a high level of performance for imbalanced dataset classification.
AB - A database contains a wealth of hidden knowledge that can be used in decision making to support commerce, business, management, research and other activities. Classification analysis plays a pivotal role in the pattern recognition field, where it is considered a core method. Algorithms such as the support vector machine (SVM) and the artificial neural network (ANN) have been proposed to solve the binary classification problem according to data distributions. However, these traditional classification algorithms are unable to provide satisfactory results for an imbalanced dataset with special characteristics. In this paper, we propose a model based on particle swarm optimization (PSO) and the support vector machine (SVM) for use in the classification of a large, imbalanced dataset. This model is referred to as the PSO-SVM (particle swarm optimization-based support vector machine) model. PSO was recently proposed as a metaheuristic framework for large, imbalanced dataset classification. The SVM algorithm also exhibits a high level of performance in handling balanced binary classification. Therefore, the novel model proposed here is introduced to improve classification accuracy by combining support vector classification (SVC) with an imbalanced PSO. The G-mean is used to evaluate the final results. In the final section of this paper, the proposed method is compared with some conventional heuristic models. The experimental results demonstrate that the proposed method exhibits a high level of performance for imbalanced dataset classification.
KW - Imbalanced dataset classification
KW - Particle swarm optimization (PSO)
KW - Particle swarm optimization-based support vector machine (PSO-SVM)
KW - Support vector classification (SVC)
UR - http://www.scopus.com/inward/record.url?scp=84901784027&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84901784027&partnerID=8YFLogxK
U2 - 10.1541/ieejeiss.134.788
DO - 10.1541/ieejeiss.134.788
M3 - Article
AN - SCOPUS:84901784027
SN - 0385-4221
VL - 134
SP - 788
EP - 795
JO - IEEJ Transactions on Electronics, Information and Systems
JF - IEEJ Transactions on Electronics, Information and Systems
IS - 6
ER -