CN112270548B

CN112270548B - Credit card fraud detection method based on deep learning

Info

Publication number: CN112270548B
Application number: CN202011283215.0A
Authority: CN
Inventors: 程光权; 黄亭飞; 黄魁华; 杜航; 成清; 胡星辰
Original assignee: National University of Defense Technology
Current assignee: National University of Defense Technology
Priority date: 2020-11-17
Filing date: 2020-11-17
Publication date: 2022-09-20
Anticipated expiration: 2040-11-17
Also published as: CN112270548A

Abstract

The invention discloses a credit card fraud detection method based on deep learning, which comprises the following steps: obtaining an original data set of a credit card fraudulent transaction, and preprocessing the data set; dividing data into two parts, wherein one part is used as a training set, and the other part is used as a test set; inputting the training set into a deep learning model for training, optimizing model parameters, and adjusting hyper-parameters to enable the performance of the model to be optimal; and inputting the test set into the trained model to obtain a classification label. The method provides a feature engineering framework comprising two neural networks to generate feature variables for a fraud detection model, then fuses the extracted features with raw data, and inputs the fused features into a classifier to obtain good fraud detection performance.

Description

Credit card fraud detection method based on deep learning

Technical Field

The invention belongs to the technical field of credit card fraud detection, and particularly relates to a credit card fraud detection method based on deep learning.

Background

In recent years, the number of credit card transactions has increased dramatically with the spread of mobile payment. As credit cards have come into large-scale use, credit card fraud has become increasingly prevalent and causes significant losses.

Some machine learning methods have been applied to fraud detection problems and have achieved excellent performance. It should be noted that these methods all belong to supervised learning and are statistical models with shallow structures. Here, a shallow structure refers to a model that contains only one layer of non-linear transformation. The function of this structure is to map the input data from the original space to a feature space for feature extraction. In contrast, a deep structure has multiple layers of non-linear transformation. These layers are connected one after another, with the output of the previous layer serving as the input to the next layer. A deep structure can extract high-level features of the data, which are recombinations of the features extracted by earlier layers and provide a high-level generalization of the properties of the original data. In recent years, deep-structure models have achieved great results in fields such as image and speech coding, image and speech recognition, and information retrieval, with results superior to those of traditional machine learning methods.

Disclosure of Invention

In view of the above, the present invention provides a credit card fraud detection method based on deep learning, which proposes a feature engineering framework including two neural networks to generate feature variables for a fraud detection model, then fuses the extracted features with raw data, and then inputs the fused features into a classifier to obtain good fraud detection performance.

The invention is realized in such a way that a credit card fraud detection method based on deep learning comprises the following steps:

step 1, obtaining an original data set of credit card fraudulent transactions, and preprocessing the data set;

step 2, dividing the data into two parts, wherein one part is used as a training set, and the other part is used as a test set;

step 3, inputting the training set into a deep learning model for training, optimizing model parameters, and adjusting hyper-parameters to enable the performance of the model to be optimal;

step 4, inputting the test set into the trained model to obtain a classification label.

Specifically, the training of the deep learning model in step 3 includes the following steps:

step 301, the original data characteristic data first passes through a fully connected neural network fc1, the neural network has 29 neurons, and the data characteristic data1 is obtained;

step 302, the data characteristic data1 passes through a fully connected neural network fc2, the neural network has 116 neurons, and the data characteristic data2 is obtained;

step 303, the data characteristic data2 passes through a fully connected neural network fc3, the neural network has 99 neurons, and the data characteristic data3 is obtained;

step 304, the data characteristic data3 is fused with the original data characteristic data and then passes through a fully connected neural network fc4, the neural network has 128 neurons, and the data characteristic data4 is obtained;

step 305, the data characteristic data4 passes through a fully connected neural network fc5, the neural network has 64 neurons, and the data characteristic data5 is obtained;

step 306, the data characteristic data5 passes through a fully connected neural network fc6, the neural network has 2 neurons, and a data tag is obtained.
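The layer sequence of steps 301 to 306 can be illustrated with a minimal forward-pass sketch in plain Python. Only the layer names and neuron counts come from the steps above. The ReLU activation, the random weight initialization, fusion by concatenation (which would give fc4 an input of 99 + 29 = 128 values), and taking the larger of the two fc6 outputs as the tag are all assumptions made for illustration; the helper names `make_layer`, `dense`, `build_model` and `forward` are hypothetical.

```python
import random

N_INPUT = 29  # number of original attribute features


def make_layer(n_in, n_out, rng):
    """Create (weights, biases) for one fully connected layer.

    Small random weights are an assumption; the source does not
    describe how the parameters are initialized.
    """
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(x, layer, relu=True):
    """Apply one fully connected layer; ReLU is an assumed activation."""
    weights, biases = layer
    out = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]
    return [max(0.0, v) for v in out] if relu else out


def build_model(seed=0):
    """Build fc1..fc6 with the neuron counts given in steps 301 to 306."""
    rng = random.Random(seed)
    return {
        "fc1": make_layer(N_INPUT, 29, rng),
        "fc2": make_layer(29, 116, rng),
        "fc3": make_layer(116, 99, rng),
        # Assumed fusion by concatenation: 99 extracted + 29 original inputs.
        "fc4": make_layer(99 + N_INPUT, 128, rng),
        "fc5": make_layer(128, 64, rng),
        "fc6": make_layer(64, 2, rng),
    }


def forward(model, data):
    """Run one transaction through the network and return (tag, outputs)."""
    data1 = dense(data, model["fc1"])
    data2 = dense(data1, model["fc2"])
    data3 = dense(data2, model["fc3"])
    fused = data3 + list(data)  # fuse extracted features with the raw features
    data4 = dense(fused, model["fc4"])
    data5 = dense(data4, model["fc5"])
    outputs = dense(data5, model["fc6"], relu=False)
    tag = 0 if outputs[0] >= outputs[1] else 1  # assumed: larger output wins
    return tag, outputs


if __name__ == "__main__":
    model = build_model()
    tag, outputs = forward(model, [0.5] * N_INPUT)
    print(tag, outputs)
```

The sketch covers inference only; training the parameters and tuning the hyper-parameters, as described in step 3, would be done with a deep learning framework.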

Preferably, the training set is two thirds of the total number of samples, and the testing set is one third of the total number of samples.
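A two-thirds / one-third split of this kind might be sketched as follows. The source does not say how samples are assigned to each part, so the random shuffle and the fixed seed are assumptions, and the function name `split_two_thirds` is hypothetical.

```python
import random


def split_two_thirds(samples, seed=0):
    """Split samples into a training set (2/3) and a test set (1/3).

    Shuffling before the split is an assumption; the source only
    states the proportions.
    """
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = (2 * len(shuffled)) // 3
    return shuffled[:n_train], shuffled[n_train:]


if __name__ == "__main__":
    train, test = split_two_thirds(range(9))
    print(len(train), len(test))
```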

Specifically, the data tag is 0 or 1, 0 indicates that the transaction is a normal transaction, and 1 indicates that the transaction is a fraudulent transaction.

Furthermore, the values of the corresponding evaluation indexes, which include accuracy and recall rate, are calculated from the classification labels.
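As an illustration of how such indexes can be computed from predicted and true tags, the sketch below uses the standard confusion-matrix definitions, with tag 1 (fraudulent) treated as the positive class. Besides accuracy and recall it also returns specificity and the F1 score, which appear in the comparison figures, and precision, which the F1 score requires. The function names are hypothetical, and returning 0.0 for an undefined ratio is an assumption.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives, with 1 = fraudulent."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def safe_div(numerator, denominator):
    """Return 0.0 when the denominator is zero (an assumed convention)."""
    return numerator / denominator if denominator else 0.0


def evaluate(y_true, y_pred):
    """Compute the evaluation indexes from classification tags."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    return {
        "accuracy": safe_div(tp + tn, tp + fp + tn + fn),
        "recall": recall,
        "precision": precision,
        "specificity": safe_div(tn, tn + fp),
        "f1": safe_div(2 * precision * recall, precision + recall),
    }


if __name__ == "__main__":
    print(evaluate([1, 1, 0, 0, 0], [1, 0, 0, 0, 1]))
```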

After training on historical data, the method can examine online transaction data to identify fraudulent transactions. It extracts deep features of the original data with a fully connected network and fuses them with the original data into new features. The resulting credit card fraud detection model can simulate transaction behaviors more effectively and achieves excellent sensitivity.

Drawings

FIG. 1 is a schematic overall flow diagram of the process of the present invention;

FIG. 2 is a flowchart illustrating a deep learning method according to an embodiment of the present invention;

FIG. 3 is a graph comparing results of accuracy in examples of the present invention;

FIG. 4 is a graph comparing recall results according to an embodiment of the present invention;

FIG. 5 is a graph comparing the results of F1 scores in examples of the present invention;

FIG. 6 is a graph showing a comparison of the results of the specificity in the examples of the present invention;

FIG. 7 is a graph comparing results of accuracy in the examples of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention clearer, the present invention will be described in further detail with reference to the accompanying drawings, and it is apparent that the described embodiments are only a part of the embodiments of the present invention, not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.

Fig. 1 shows a flow chart of an embodiment of the invention. A credit card fraud detection method based on deep learning comprises the following steps:

step 1, acquiring an original data set of credit card fraudulent transactions, and preprocessing the data set;

Specifically, as shown in fig. 2, the training of the deep learning model in step 3 includes the following steps:

step 301, the original data feature data passes through a fully-connected neural network fc1, the neural network has 29 neurons, and data feature data1 is obtained;

Furthermore, the values of the corresponding evaluation indexes, which include accuracy and recall rate, are calculated from the classification labels.

The data set comes from the well-known competition website Kaggle and contains credit card transaction data for two days in September 2012. The data contain only numerical information; for privacy reasons, most of the attributes, 27 in total, have been transformed via PCA. The only untransformed attributes are the transaction amount and the transaction time, so there are 29 attribute features after data preprocessing.

The tags in the data set are binary class tags that indicate whether a transaction is fraudulent. The data contain 284807 transactions, 492 of which are fraudulent, which makes this a typical imbalanced classification data set. The data are divided into a training set and a test set: the training set comprises 184694 transactions, including 314 fraudulent transactions, and accounts for about two thirds of the data; the test set contains 90969 transactions, including 159 fraudulent transactions, and accounts for about one third.
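The degree of imbalance can be made concrete with a short calculation on the counts given above (284807 transactions, 492 of them fraudulent). The sketch also computes the accuracy of a hypothetical baseline that tags every transaction as normal; this baseline is not part of the described method and is included only to show why accuracy alone says little on such data.

```python
TOTAL_TRANSACTIONS = 284807  # from the data set description
FRAUD_TRANSACTIONS = 492     # from the data set description


def fraud_rate(n_fraud, n_total):
    """Fraction of transactions that are fraudulent."""
    return n_fraud / n_total


def all_normal_accuracy(n_fraud, n_total):
    """Accuracy of a hypothetical classifier that always outputs tag 0."""
    return (n_total - n_fraud) / n_total


if __name__ == "__main__":
    rate = fraud_rate(FRAUD_TRANSACTIONS, TOTAL_TRANSACTIONS)
    baseline = all_normal_accuracy(FRAUD_TRANSACTIONS, TOTAL_TRANSACTIONS)
    print(f"fraud rate: {rate:.4%}")
    print(f"accuracy of the always-normal baseline: {baseline:.4%}")
```

Fewer than two transactions in a thousand are fraudulent, so the always-normal baseline is more than 99.8% accurate while its recall is zero. This is why recall and related measures matter for this task.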

In the experiment, multiple independent repeated experiments are carried out and the results are averaged. Model training runs for 70 iterations, and the model starts to converge at the 10th iteration, so the performance of the two algorithms is analyzed and compared from the 10th iteration onward.
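The averaging procedure might look like the following sketch, which assumes that each repeated experiment yields one metric value per iteration and that iterations are numbered from 1. The function names `mean_curve` and `from_iteration` are hypothetical.

```python
def mean_curve(runs):
    """Average per-iteration metric values across independent runs.

    `runs` is a list of equally long lists, one list per repeated
    experiment (an assumed data layout).
    """
    n_iter = len(runs[0])
    if any(len(run) != n_iter for run in runs):
        raise ValueError("all runs must have the same number of iterations")
    return [sum(run[i] for run in runs) / len(runs) for i in range(n_iter)]


def from_iteration(curve, start=10):
    """Keep the values from the given 1-based iteration onward."""
    return curve[start - 1:]


if __name__ == "__main__":
    # Two toy runs of four iterations each; values are illustrative only.
    averaged = mean_curve([[0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 2.0, 5.0]])
    print(from_iteration(averaged, start=2))
```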

Fig. 3 shows the accuracy of both algorithms; the solid line shows the accuracy of the proposed method and the dashed line that of the traditional deep learning method. It can be seen that, after the model has basically converged, the accuracy of the method of the present invention remains above that of the conventional method until the end of the iterations. The accuracy of both algorithms keeps a steady upward trend throughout: it increases sharply from the 10th to the 30th iteration, increases slowly over the following 10 iterations, and rises gradually from the 40th iteration to the end.

Fig. 4 shows the recall of both algorithms; the solid line shows the recall of the proposed method and the dashed line that of the conventional deep learning method. It can be seen that, after the model has substantially converged, the recall of the method of the present invention stays above that of the conventional method until about the 60th iteration, after which the conventional method performs better. From the 20th to the 50th iteration, the recall of both algorithms keeps a steady upward trend and the gap between them is large, after which the gap gradually narrows; from the 60th to the 70th iteration, the recall of the method of the present invention begins to decrease, while that of the conventional algorithm is still increasing. Thus the recall of the proposed method converges faster than that of the traditional algorithm and reaches a high value.

Fig. 5 shows the F1 scores of both algorithms; the solid line shows the F1 score of the proposed method and the dashed line that of the traditional deep learning method. It can be seen that, after the model has substantially converged, the F1 score of the method of the present invention remains above that of the conventional method until the end of the iterations. The F1 score of both algorithms keeps a steady upward trend; the gap between the two is large from about the 30th to the 50th iteration and then gradually narrows.

Fig. 6 shows the specificity of the two algorithms; the solid line shows the specificity of the proposed method and the dotted line that of the traditional deep learning method. It can be seen that, after the model has substantially converged, the specificity of the method of the present invention remains above that of the conventional method until the end of the iterations. The specificity of both algorithms keeps a steady upward trend overall: it increases sharply from the 10th to the 30th iteration, stays essentially unchanged over the following 10 iterations, and rises gradually from the 40th iteration to the end.

Fig. 7 shows the accuracy of both algorithms; the solid line shows the accuracy of the proposed method and the dashed line that of the traditional deep learning method. It can be seen that, after the model has substantially converged, the accuracy of the method of the present invention remains above that of the conventional method, and the gap does not narrow until about the 60th iteration. The reason is presumed to be model overfitting, which causes the accuracy of both algorithms to start to decline. Up to that point, the accuracy of both algorithms keeps a steady upward trend, growing faster before the 40th iteration and more slowly afterwards.

TABLE 1 Algorithm comparison results

Table 1 shows the mean values for the two algorithms, i.e. the mean over 10 independent repeated experiments from the start to the end of the iterations. The improvement in accuracy and specificity is small because the data contain far more negative samples, and the number of corrected samples is too small relative to the number of negative samples; the improvement in the other three measures is obvious, reaching about 2 percent.
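The reasoning about small accuracy gains can be illustrated with a short calculation on the test-set counts given earlier (90969 transactions, 159 of them fraudulent). The sketch assumes a hypothetical number of additional fraudulent transactions that are detected correctly, with no change in false positives; the value 3 is chosen purely for illustration and is not a result from the experiments.

```python
TEST_TRANSACTIONS = 90969  # test-set size from the data description
TEST_FRAUD = 159           # fraudulent transactions in the test set


def metric_gain(extra_detected, n_fraud, n_total):
    """Change in recall and accuracy when extra frauds are caught.

    Assumes the extra detections were previously false negatives and
    that the number of false positives stays the same.
    """
    recall_gain = extra_detected / n_fraud
    accuracy_gain = extra_detected / n_total
    return recall_gain, accuracy_gain


if __name__ == "__main__":
    extra = 3  # hypothetical count, for illustration only
    recall_gain, accuracy_gain = metric_gain(extra, TEST_FRAUD, TEST_TRANSACTIONS)
    print(f"recall gain:   {recall_gain:.4%}")
    print(f"accuracy gain: {accuracy_gain:.4%}")
```

Under these assumptions the same corrected samples move recall by a couple of percentage points while accuracy changes by only a few thousandths of a percentage point, because the denominator of accuracy is dominated by negative samples.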

As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

Claims

1. A credit card fraud detection method based on deep learning is characterized by comprising the following steps:

step 4, inputting the test set into the trained model to obtain a classification label;

the training of the deep learning model in the step 3 comprises the following steps:

2. The method of claim 1, wherein the training set is two-thirds of the total number of samples and the testing set is one-third of the total number of samples.

3. The method of claim 1 or 2, wherein the data tag is 0 or 1, 0 indicating that the transaction is a normal transaction, and 1 indicating that the transaction is a fraudulent transaction.

4. The credit card fraud detection method of claim 3, wherein the values of corresponding indicators, including accuracy and recall, are calculated from the classification labels.