Ensemble classification algorithm based improved SMOTE for imbalanced data

Authors:

Liu Ning, Shangluo University, Shangluo, China

Abstract:

Purpose. In practical applications, classification accuracy on the minority class is very important, and learning from imbalanced data has become a popular research topic. To improve classification performance on imbalanced data, a classification algorithm based on data sampling and ensemble (integration) techniques is proposed.

Methodology. Firstly, the traditional SMOTE algorithm was improved to K-SMOTE (an over-sampling method based on SMOTE and K-means). In K-SMOTE, the dataset is first clustered, and new samples are then interpolated along the line connecting a cluster center and an original data point. Secondly, ECA-IBD (an ensemble classification algorithm based improved SMOTE for imbalanced data) was proposed. In ECA-IBD, over-sampling is performed by K-SMOTE, and random under-sampling is applied to reduce the problem scale, which together form a new dataset. A number of weak classifiers are then generated and combined by ensemble techniques into the final strong classifier.
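
The interpolation step of K-SMOTE can be illustrated with a minimal Python sketch. The abstract does not give the details, so several choices here are assumptions: only the minority samples are clustered, the clustering is a plain Lloyd's k-means with random initialization, the number of clusters `k` is a free parameter, and the interpolation factor is drawn uniformly from [0, 1). The function names `kmeans` and `k_smote` are hypothetical.

```python
import random

def kmeans(points, k, iters=20, rng=None):
    """Plain Lloyd's k-means; returns (centers, labels)."""
    rng = rng or random.Random(0)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        for i, p in enumerate(points):
            labels[i] = min(
                range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])),
            )
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return centers, labels

def k_smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic minority samples (assumed reading of K-SMOTE)."""
    rng = random.Random(seed)
    centers, labels = kmeans(minority, k, rng=rng)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        x, c = minority[i], centers[labels[i]]
        r = rng.random()  # position on the segment from x to its cluster center
        synthetic.append([a + r * (b - a) for a, b in zip(x, c)])
    return synthetic

# Usage: six minority points forming two visible groups.
minority = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
            [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]]
new_points = k_smote(minority, n_new=5, k=2)
```

Because every synthetic point lies between an original point and the mean of its own cluster, the new samples stay inside the region already occupied by that cluster.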

Findings. Experiments were carried out on imbalanced data from the UCI repository. With the F-value and the G-mean as evaluation indexes, the results showed that the proposed algorithm was effective.
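
For reference, the two evaluation indexes can be computed as in the following sketch. It assumes the common definitions: the F-value as the harmonic mean of precision and recall on the minority (positive) class, and the G-mean as the square root of recall times specificity. The abstract does not state whether a weighted F-value was used, so the unweighted form is an assumption, and the function name is hypothetical.

```python
import math

def f_value_and_g_mean(y_true, y_pred, positive=1):
    """Return (F-value, G-mean) for a binary problem; `positive` is the minority label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0        # true positive rate
    specificity = tn / (tn + fp) if tn + fp else 0.0   # true negative rate
    if precision + recall:
        f_value = 2 * precision * recall / (precision + recall)
    else:
        f_value = 0.0
    g_mean = math.sqrt(recall * specificity)
    return f_value, g_mean

# Usage: 3 minority and 7 majority samples.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
scores = f_value_and_g_mean(y_true, y_pred)   # roughly (0.667, 0.756)

# A classifier that always predicts the majority class reaches 70% accuracy
# on this data, yet both indexes are zero.
always_majority = f_value_and_g_mean(y_true, [0] * 10)
```

Both indexes drop to zero when the minority class is ignored, which is why they are preferred over plain accuracy on imbalanced data.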

Originality. In this paper, the SMOTE algorithm is improved, and over-sampling, under-sampling and boosting techniques are combined to solve the classification problem for imbalanced data.
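
How over-sampling, under-sampling and boosting can fit together is sketched below. This is not the authors' ECA-IBD implementation, and several parts are assumptions: the over-sampling step is a simplified stand-in that interpolates toward a single minority center instead of full K-SMOTE, the balanced class size is set to half of the original dataset size, and the boosting scheme is standard AdaBoost with decision stumps as weak classifiers. All function names are hypothetical.

```python
import math
import random

def rebalance(X, y, rng):
    """Over-sample class 1 (minority) and under-sample class 0 to a common size."""
    minority = [x for x, t in zip(X, y) if t == 1]
    majority = [x for x, t in zip(X, y) if t == 0]
    target = (len(minority) + len(majority)) // 2  # assumed target size
    # Stand-in over-sampling: interpolate toward the minority mean.
    center = [sum(col) / len(minority) for col in zip(*minority)]
    extra = []
    while len(minority) + len(extra) < target:
        x, r = rng.choice(minority), rng.random()
        extra.append([a + r * (c - a) for a, c in zip(x, center)])
    # Random under-sampling of the majority class.
    majority = rng.sample(majority, min(target, len(majority)))
    Xb = minority + extra + majority
    yb = [1] * (len(minority) + len(extra)) + [0] * len(majority)
    return Xb, yb

def stump_predict(stump, x):
    _, f, thr, sign = stump
    return 1 if sign * (x[f] - thr) > 0 else 0

def fit_stump(X, y, w):
    """Exhaustively pick the threshold rule with the lowest weighted error."""
    best = None
    for f in range(len(X[0])):
        for thr in sorted({x[f] for x in X}):
            for sign in (1, -1):
                cand = (0.0, f, thr, sign)
                err = sum(wi for wi, x, t in zip(w, X, y)
                          if stump_predict(cand, x) != t)
                if best is None or err < best[0]:
                    best = (err, f, thr, sign)
    return best

def adaboost(X, y, rounds=5):
    """Train weighted decision stumps; returns a list of (alpha, stump)."""
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        stump = fit_stump(X, y, w)
        err = min(max(stump[0], 1e-10), 1 - 1e-10)  # avoid log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, stump))
        # Raise the weight of misclassified samples, lower the rest.
        w = [wi * math.exp(alpha if stump_predict(stump, x) != t else -alpha)
             for wi, x, t in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def predict(model, x):
    """Weighted vote of the weak classifiers (the final strong classifier)."""
    score = sum(a * (1 if stump_predict(s, x) == 1 else -1) for a, s in model)
    return 1 if score > 0 else 0

# Usage: 20 majority samples and 4 minority samples on one feature.
X = [[float(i)] for i in range(20)] + [[30.0], [31.0], [32.0], [33.0]]
y = [0] * 20 + [1] * 4
Xb, yb = rebalance(X, y, random.Random(0))
model = adaboost(Xb, yb)
```

Rebalancing before training means the stumps see equally many samples of both classes, so the weighted vote is not dominated by the majority class.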

Practical value. The proposed algorithm is useful for imbalanced data classification. It can be applied to various imbalanced classification problems, such as fault detection and intrusion detection.


Files:
2016_02_Liu
Date 2016-06-21

Registration data

ISSN (print) 2071-2227,
ISSN (online) 2223-2362.
Journal was registered by Ministry of Justice of Ukraine.
Registration number КВ No.17742-6592PR dated April 27, 2011.
