TY - GEN
T1 - A quasi-linear SVM combined with assembled SMOTE for imbalanced data classification
AU - Zhou, Bo
AU - Yang, Cheng
AU - Guo, Haixiang
AU - Hu, Jinglu
PY - 2013
Y1 - 2013
N2 - This paper addresses the classification of imbalanced datasets using an SVM combined with oversampling. Traditional oversampling methods increase the overlap between classes, which leads to poor generalization of the SVM classifier. To solve this problem, this paper proposes a method that combines a quasi-linear SVM with assembled SMOTE. The quasi-linear SVM is an SVM with a quasi-linear kernel function; it approximates a nonlinear separation boundary by interpolating multiple local linear boundaries. The assembled SMOTE performs oversampling that takes the data distribution into account and avoids overlap between classes. First, a partition method based on the minimal spanning tree is proposed to obtain local linear partitions, each of which can be separated by a single linear boundary. Second, using the information from these local linear partitions, the assembled SMOTE generates synthetic minority-class samples. Finally, the quasi-linear SVM classifies the oversampled dataset in the same way as a standard SVM, using a composite quasi-linear kernel function. Experimental results on artificial data and benchmark datasets show that the proposed method is effective and improves classification performance.
AB - This paper addresses the classification of imbalanced datasets using an SVM combined with oversampling. Traditional oversampling methods increase the overlap between classes, which leads to poor generalization of the SVM classifier. To solve this problem, this paper proposes a method that combines a quasi-linear SVM with assembled SMOTE. The quasi-linear SVM is an SVM with a quasi-linear kernel function; it approximates a nonlinear separation boundary by interpolating multiple local linear boundaries. The assembled SMOTE performs oversampling that takes the data distribution into account and avoids overlap between classes. First, a partition method based on the minimal spanning tree is proposed to obtain local linear partitions, each of which can be separated by a single linear boundary. Second, using the information from these local linear partitions, the assembled SMOTE generates synthetic minority-class samples. Finally, the quasi-linear SVM classifies the oversampled dataset in the same way as a standard SVM, using a composite quasi-linear kernel function. Experimental results on artificial data and benchmark datasets show that the proposed method is effective and improves classification performance.
UR - http://www.scopus.com/inward/record.url?scp=84893527782&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84893527782&partnerID=8YFLogxK
U2 - 10.1109/IJCNN.2013.6707035
DO - 10.1109/IJCNN.2013.6707035
M3 - Conference contribution
AN - SCOPUS:84893527782
SN - 9781467361293
T3 - Proceedings of the International Joint Conference on Neural Networks
BT - 2013 International Joint Conference on Neural Networks, IJCNN 2013
T2 - 2013 International Joint Conference on Neural Networks, IJCNN 2013
Y2 - 4 August 2013 through 9 August 2013
ER -