Науковий вісник НГУ

FINANCIAL STATEMENT FRAUD DETECTION THROUGH MULTIPLE INSTANCE LEARNING

Details: Category: Economy and management; Published on 04 August 2016

Authors:

Lingbing Tang, Mobile E-business Collaborative Innovation Center of Hunan Province, Hunan University of Commerce, Changsha, Hunan, China, Key Laboratory of Hunan Province for Mobile Business Intelligence, Changsha, Hunan, China

Pin Peng, School of Finance and Banking, Hunan University of Commerce, Changsha, Hunan, China

Changqing Luo, Mobile E-business Collaborative Innovation Center of Hunan Province, Hunan University of Commerce, Changsha, Hunan, China, Key Laboratory of Hunan Province for Mobile Business Intelligence, Changsha, Hunan, China

Abstract:

Purpose. Machine-learning-based financial statement fraud detection (FSFD) is important for avoiding financial risk and maintaining an orderly market. The purpose of this research was to develop a multiple instance learning model capable of detecting and predicting the risk of fraudulent financial reporting.

Methodology. Three pairs of algorithms were compared, each pair consisting of a single-instance learning algorithm and its corresponding multiple instance learning algorithm. They were trained on a data set of 484 fraud companies and 902 normal companies, forming 4158 instances drawn from Item 8 of the U.S. Securities and Exchange Commission (SEC) Form 10-K.
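The counts in the methodology imply a class balance and an average bag size that are worth making explicit. The following is a minimal sketch of that arithmetic; the function and key names are illustrative and not from the paper, and it assumes each company forms one bag.

```python
# Counts as stated in the abstract.
FRAUD_COMPANIES = 484
NORMAL_COMPANIES = 902
INSTANCES = 4158


def dataset_summary(fraud: int, normal: int, instances: int) -> dict:
    """Derive bag-level statistics, assuming one bag per company."""
    bags = fraud + normal
    return {
        "bags": bags,
        "fraud_share": fraud / bags,                 # fraction of fraud bags
        "mean_instances_per_bag": instances / bags,  # average statements per company
    }


summary = dataset_summary(FRAUD_COMPANIES, NORMAL_COMPANIES, INSTANCES)
print(summary)
```

Under these assumptions there are 1386 bags, about 35% of them fraudulent, with an average of three instances per company. The abstract gives only the totals, so whether every company contributes exactly three statements is not known.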

Findings. The empirical study shows that MIBoost, miGraph and CKNN outperform their single-instance counterparts AdaBoostM1, SVM and KNN, respectively, in accuracy, F1 score and area under the receiver operating characteristic curve (AUC). This indicates that multiple instance learning algorithms fit FSFD better, especially under class imbalance and with limited training data.
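For readers unfamiliar with the three metrics named in the findings, the sketch below computes them from scratch on a toy example. The labels and scores are made up for illustration and are not the paper's data; the 0.5 decision threshold is likewise an assumption.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class (1)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def auc(y_true, scores):
    """Probability that a random positive outranks a random negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data: 1 = fraud, 0 = normal; scores are hypothetical model outputs.
y_true = [1, 0, 0, 1, 0, 0]
scores = [0.9, 0.2, 0.6, 0.4, 0.1, 0.3]
y_pred = [int(s >= 0.5) for s in scores]  # assumed threshold of 0.5

print(accuracy(y_true, y_pred), f1_score(y_true, y_pred), auc(y_true, scores))
```

Accuracy alone can look favourable on an imbalanced data set simply by predicting the majority class, which is one reason to report F1 and AUC alongside it.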

Originality. When a detection label that corresponds to one temporally local financial statement is attached collectively to the whole group of financial statements of a company, without indicating which statement the label applies to, the task is a multiple instance problem. The research presents an original multiple instance learning model for FSFD.
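The problem setting described here can be sketched as a data structure: each company is a bag of statements, and only the bag carries a label. The example below uses the standard multiple instance assumption (a bag is positive if at least one instance is positive) with a max-over-instances rule. It is not MIBoost, miGraph or CKNN; the `Bag` class, `toy_scorer` and the single feature are hypothetical and serve only to illustrate the setting.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Bag:
    company: str
    statements: List[Sequence[float]]  # one feature vector per financial statement
    label: int                         # 1 = fraud, 0 = normal; known per company only


def bag_score(bag: Bag, instance_scorer: Callable[[Sequence[float]], float]) -> float:
    """Score a bag by its most suspicious statement (standard MIL assumption)."""
    return max(instance_scorer(x) for x in bag.statements)


def predict_bag(bag: Bag, instance_scorer, threshold: float = 0.5) -> int:
    """Flag the company if any single statement scores at or above the threshold."""
    return int(bag_score(bag, instance_scorer) >= threshold)


def toy_scorer(x: Sequence[float]) -> float:
    """Hypothetical instance scorer: clips a made-up ratio feature to [0, 1]."""
    return min(1.0, max(0.0, x[0]))


bags = [
    Bag("A", [[0.1], [0.8], [0.2]], label=1),  # one suspicious statement out of three
    Bag("B", [[0.1], [0.3], [0.2]], label=0),  # no suspicious statement
]
predictions = [predict_bag(b, toy_scorer) for b in bags]
print(predictions)
```

The point of the sketch is that training never sees which statement in company A caused the fraud label; a single-instance learner would have to assign that label to all three statements, including the two unremarkable ones.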

Practical value. The model also addresses the dissatisfaction of some auditors with single-label learning algorithms, which arises because each company contains many instances without labels. The multiple instance formulation is therefore more reasonable, and in the experiments it was more accurate.

Tags: financial statement • fraud detection • machine learning • multiple instance learning • miboosting • miGraph • CKNN

Registration data

ISSN (print) 2071-2227,
ISSN (online) 2223-2362.
Journal was registered by Ministry of Justice of Ukraine.
Registration number КВ No.17742-6592PR dated April 27, 2011.
