Fraud Detection Using Temporal Sequence Embedding and Cluster-Weighted LSTM Models in Card Transactions

Document Type : Original Article

Authors

1 Faculty of Industrial and Systems Engineering, Tarbiat Modares University, Tehran, Iran

2 Faculty of Industrial and Systems Engineering, Tarbiat Modares University, Tehran, Iran

3 Faculty of Management, Imam Sadiq University, Tehran, Iran

4 Faculty of Industrial and Systems Engineering, Tarbiat Modares University

Abstract

Traditional fraud detection models often overlook the sequential and temporal relationships between transactions, which can be crucial for identifying fraudulent activity. To address this, a new data-to-graph mapping approach is proposed that transforms user data into a transaction graph and constructs a bipartite graph with source and target nodes. The main goal is to use the temporal order of transactions to capture changes and identify distinct fraud patterns. The method begins by creating a weighted graph based on transaction amounts and their temporal sequence. For feature extraction, the Probabilistic FraudWalk method, an advanced version of the traditional FraudWalk algorithm, is used. It enhances the random walk with probability-based neighbor selection, choosing the next node dynamically according to the probability distribution of common neighbors. To balance the dataset, the Synthetic Minority Over-sampling Technique (SMOTE) is combined with the Edited Nearest Neighbors (ENN) method, forming SMOTE-ENN. To reduce information conflict, the data are then clustered using K-means. A weighted Long Short-Term Memory (LSTM) model is trained on each cluster, with weights determined by the minimum distance between samples of different classes within the same cluster. The proposed LSTM model demonstrates superior performance on benchmark datasets and effectively detects fraud in real-world card-to-card transactions. This approach enhances the security of financial information for banks and financial institutions and shows that incorporating temporal and sequential data significantly improves the accuracy and reliability of fraud detection.
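
The probability-based neighbor selection mentioned in the abstract can be illustrated with a minimal Python sketch. The abstract does not state the exact transition rule, so everything here is an assumption made for illustration: the function names (build_graph, transition_probs, probabilistic_walk), the use of a plain undirected card graph instead of the bipartite construction, and the scoring rule that weights each neighbor by one plus its number of common neighbors with the current node. It is not the authors' Probabilistic FraudWalk implementation.

```python
import random
from collections import defaultdict


def build_graph(transactions):
    """Build an undirected weighted graph from (source, target, amount)
    tuples listed in temporal order. Amounts for repeated pairs are summed."""
    graph = defaultdict(lambda: defaultdict(float))
    for source, target, amount in transactions:
        graph[source][target] += amount
        graph[target][source] += amount
    return graph


def transition_probs(graph, current):
    """Return (neighbors, probabilities) for the next step from `current`.

    Assumed rule (hypothetical, not taken from the paper): each neighbor is
    scored by 1 + the number of neighbors it shares with `current`, and the
    scores are normalized into a probability distribution. Edge amounts are
    stored in the graph but not used by this particular rule.
    """
    neighbors = list(graph[current])
    current_set = set(neighbors)
    scores = [1 + len(current_set & set(graph[n])) for n in neighbors]
    total = sum(scores)
    return neighbors, [s / total for s in scores]


def probabilistic_walk(graph, start, length, rng):
    """Generate one walk of up to `length` nodes starting at `start`."""
    walk = [start]
    while len(walk) < length:
        neighbors, probs = transition_probs(graph, walk[-1])
        if not neighbors:  # isolated node: stop early
            break
        walk.append(rng.choices(neighbors, weights=probs, k=1)[0])
    return walk


# Toy card-to-card transactions (made-up values, in temporal order).
transactions = [
    ("A", "B", 100.0),
    ("B", "C", 50.0),
    ("A", "C", 20.0),
    ("C", "D", 10.0),
]
graph = build_graph(transactions)

neighbors, probs = transition_probs(graph, "C")
print(dict(zip(neighbors, probs)))  # {'B': 0.4, 'A': 0.4, 'D': 0.2}

walk = probabilistic_walk(graph, "A", length=5, rng=random.Random(0))
print(walk)
```

Walks generated this way would typically be passed to a skip-gram style embedding model to obtain node features. That step, along with SMOTE-ENN balancing, K-means clustering, and the cluster-weighted LSTM, is outside the scope of this sketch.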

Keywords

Main Subjects

