FINANCIAL STATEMENT FRAUD DETECTION THROUGH MULTIPLE INSTANCE LEARNING
- Details
- Category: Economy and management
- Last Updated on 04 August 2016
- Published on 04 August 2016
Authors:
Lingbing Tang, Mobile E-business Collaborative Innovation Center of Hunan Province, Hunan University of Commerce, Changsha, Hunan, China, Key Laboratory of Hunan Province for Mobile Business Intelligence, Changsha, Hunan, China
Pin Peng, School of Finance and Banking, Hunan University of Commerce, Changsha, Hunan, China
Changqing Luo, Mobile E-business Collaborative Innovation Center of Hunan Province, Hunan University of Commerce, Changsha, Hunan, China, Key Laboratory of Hunan Province for Mobile Business Intelligence, Changsha, Hunan, China
Abstract:
Purpose. Financial statement fraud detection (FSFD) based on machine learning is important for avoiding financial risk and maintaining an orderly market. The purpose of this research was to develop a multiple instance learning model capable of detecting and predicting the risk of fraudulent financial reporting.
Methodology. Single-instance learning algorithms were paired with their corresponding multiple instance learning algorithms. Each pair was trained on a data set of 484 fraud companies and 902 normal companies, which together form 4158 instances drawn from Item 8 of the U.S. Securities and Exchange Commission (SEC) Form 10-K.
Findings. An empirical study shows that MIBoost, miGraph and CKNN outperform AdaBoostM1, SVM and KNN, respectively, in accuracy, F1 score and area under the receiver operating characteristic curve (AUC). This indicates that multiple instance learning algorithms fit FSFD better, especially under class imbalance and with little training data.
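The three evaluation measures named above can be sketched as follows. This is a minimal illustration in plain Python, not the authors' evaluation code; the labels and scores are made-up toy values, and the function names are hypothetical.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of companies whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Harmonic mean of precision and recall for the fraud (positive) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random fraud company outscores a random normal one.

    Ties count as half a win (the rank-based definition of ROC AUC).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy, imbalanced example (assumed values): 2 fraud and 4 normal companies.
y_true = [1, 1, 0, 0, 0, 0]
scores = [0.9, 0.4, 0.6, 0.3, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print(accuracy(y_true, y_pred), f1_score(y_true, y_pred), auc(y_true, scores))
```

On imbalanced data such as this, accuracy alone can look acceptable while F1 for the fraud class stays low, which is one reason to report all three measures together.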
Originality. When a detection label that corresponds to a temporally local financial statement is attached collectively to the group of financial statements of one company, without indicating which financial statement the label belongs to, the task is a multiple instance problem. The research presents an original multiple instance learning model for FSFD.
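The multiple instance setting described here can be sketched in a few lines. The sketch treats each company as a bag of financial statements with one label for the whole bag. It assumes the standard multiple instance rule (a bag is positive if at least one instance looks positive), which is a common textbook assumption and not necessarily the one used by the algorithms named in the abstract. The class `CompanyBag`, the scorer `toy_score`, and all feature values are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence


@dataclass
class CompanyBag:
    """One company is one bag; each instance is one financial statement."""
    name: str
    statements: List[Sequence[float]]   # instance feature vectors, unlabeled
    is_fraud: Optional[bool] = None     # label attached to the whole bag


def predict_bag(bag: CompanyBag,
                instance_score: Callable[[Sequence[float]], float],
                threshold: float = 0.5) -> bool:
    """Standard MIL assumption: flag the bag if any instance scores high."""
    return max(instance_score(x) for x in bag.statements) >= threshold


def toy_score(x: Sequence[float]) -> float:
    """Hypothetical instance scorer: the first feature clipped to [0, 1]."""
    return min(1.0, max(0.0, x[0]))


# Toy data (assumed): only the bag label is known, not which statement
# within the company triggered it.
bag_a = CompanyBag("Company A", [[0.1], [0.2], [0.9]], is_fraud=True)
bag_b = CompanyBag("Company B", [[0.1], [0.3]], is_fraud=False)

for bag in (bag_a, bag_b):
    print(bag.name, predict_bag(bag, toy_score), bag.is_fraud)
```

A single-instance learner would need a label for every statement; here the supervision stays at the company level, which matches how the detection label is attached in the data.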
Practical value. The model also addresses the dissatisfaction of some auditors with single-label learning algorithms, which arises because one company contains many instances without labels. In this setting, our model is more reasonable and accurate.