SSLBM: A New Fraud Detection Method Based on Semi-Supervised Learning

Document Type: Machine learning-Sadoghi


Alzahra University


The growing use of computer technology and the rapid development of the Internet and electronic business have increased the volume of financial transactions. As banking activity grows, fraudsters also adopt a wider range of methods to expand their fraudulent activities. Fraud detection is one way to limit the damage they cause. Although several methods have been proposed in this field, essential challenges remain. For example, methods are needed that detect fraud both accurately and quickly. Another challenge, particularly in banking, is the absence of labeled non-fraud data and the scarcity of labeled fraud data for learning. We therefore propose a new fraud detection method for bank accounts called SSLBM. After a preprocessing phase, the method applies a learning method called SSEV, which is based on semi-supervised learning and an evolutionary algorithm. The results indicate that SSLBM improves detection, reaching 68% accuracy at an acceptable speed.

