CN116703568A - Credit card abnormal transaction identification method and device - Google Patents

Publication number
CN116703568A
Authority
CN
China
Prior art keywords
information
transaction
credit card
characteristic information
neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310745200.9A
Other languages
Chinese (zh)
Inventor
林得有
朱秋臻
侯杰锋
李凤
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Industrial and Commercial Bank of China Ltd ICBC
Original Assignee
Industrial and Commercial Bank of China Ltd ICBC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Industrial and Commercial Bank of China Ltd ICBC
Priority to CN202310745200.9A
Publication of CN116703568A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/03 Credit; Loans; Processing thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G06F18/2135 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods based on approximation criteria, e.g. principal component analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/2433 Single-class perspective, e.g. one-against-all classification; Novelty detection; Outlier detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/04 Trading; Exchange, e.g. stocks, commodities, derivatives or currency exchange

Abstract

The application provides a credit card abnormal transaction identification method and device, which can be used in the financial field or other technical fields. The method comprises the following steps: extracting effective information from credit card transaction information; generating transaction characteristic information of the credit card transaction information according to the credit card transaction information; inputting the effective information and the transaction characteristic information into a pre-trained convolutional neural network model to obtain an abnormal transaction prediction result output by the convolutional neural network model, wherein the convolutional neural network model is obtained by training according to historical credit card transaction information and transaction characteristic information of the historical credit card transaction information. The credit card abnormal transaction identification method and device provided by the embodiment of the application can efficiently extract the characteristic information in the transaction information, and more accurately analyze and judge the transaction information, thereby improving the accuracy of abnormal transaction identification.

Description

Credit card abnormal transaction identification method and device
Technical Field
The application relates to the technical field of computers, in particular to a credit card abnormal transaction identification method and device.
Background
In the modern financial field, traditional methods for detecting abnormal credit card transactions are mainly based on rules, statistics and data mining, and each of these approaches has shortcomings. Rule-based methods are vulnerable to attack. Statistical methods place high demands on data, and their results depend strongly on data quality and quantity. Data mining techniques can analyze large amounts of data but have difficulty modeling complex nonlinear relationships.
Disclosure of Invention
Aiming at the problems in the prior art, the embodiment of the application provides a credit card abnormal transaction identification method and device, which can at least partially solve the problems in the prior art.
In one aspect, the present application provides a credit card abnormal transaction identification method, including:
extracting valid information in credit card transaction information, the credit card transaction information including transaction amount, transaction time, transaction location, transaction type, account balance, cardholder information, and/or merchant information;
generating transaction characteristic information of the credit card transaction information according to the credit card transaction information;
inputting the effective information and the transaction characteristic information into a pre-trained convolutional neural network model to obtain an abnormal transaction prediction result output by the convolutional neural network model, wherein the convolutional neural network model is obtained by training according to historical credit card transaction information and transaction characteristic information of the historical credit card transaction information.
In some embodiments, the generating transaction characteristic information of the credit card transaction information from the credit card transaction information includes:
calculating proportion information of the transaction amount relative to the account balance according to the transaction amount and the account balance in the credit card transaction information; and/or
extracting first time characteristic information from the transaction time in the credit card transaction information according to a preset time dimension; and/or
acquiring transaction interval rule characteristic information associated with the credit card transaction information according to the credit card transaction information and the historical transaction information of the credit card; and/or
generating transaction statistics characteristic information associated with the credit card transaction information according to the credit card transaction information and the historical transaction information of the credit card; and/or
generating first combination characteristic information according to the transaction amount and the transaction type in the credit card transaction information; and/or
generating second combination characteristic information according to the transaction amount in the credit card transaction information and the account type of the credit card; and/or
determining transaction frequency characteristic information associated with the credit card transaction information according to the number of transactions of the credit card within a preset time length.
In some embodiments, the convolutional neural network model comprises:
an input layer for receiving the effective information and the transaction characteristic information;
a convolutional layer for extracting first characteristic information from the effective information and the transaction characteristic information;
a pooling layer for extracting second characteristic information from the local characteristic information output by the convolutional layer;
a fully connected layer for fusing the first characteristic information extracted by the convolutional layer and the second characteristic information extracted by the pooling layer to generate fused characteristic information;
an output layer for mapping the fused characteristic information output by the fully connected layer into the [0,1] interval by using an activation function, the mapped value representing the probability that the credit card transaction is abnormal.
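The layer structure above can be illustrated with a minimal forward pass. This is only a sketch under stated assumptions, not the applicant's implementation: it treats the input as a one-dimensional feature vector, uses ReLU in the hidden layers and a sigmoid in the output layer, and all names, sizes and random weights (`conv1d_relu`, `max_pool1d`, `forward`, `params`) are hypothetical.

```python
import numpy as np

def conv1d_relu(x, kernels, bias):
    """Valid 1-D convolution plus ReLU: x (n,), kernels (c, k) -> (c, n - k + 1)."""
    k = kernels.shape[1]
    # Stack every length-k sliding window of the input as one row.
    windows = np.stack([x[i:i + k] for i in range(len(x) - k + 1)])
    return np.maximum(kernels @ windows.T + bias[:, None], 0.0)

def max_pool1d(h, size):
    """Non-overlapping max pooling along the last axis; any remainder is dropped."""
    channels, n = h.shape
    n_out = n // size
    return h[:, :n_out * size].reshape(channels, n_out, size).max(axis=2)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, params):
    """Input -> convolution -> pooling -> fully connected fusion -> probability."""
    first = conv1d_relu(x, params["kernels"], params["conv_bias"])  # first characteristic information
    second = max_pool1d(first, size=2)                              # second characteristic information
    # The fully connected layer fuses the convolution and pooling outputs.
    fused_in = np.concatenate([first.ravel(), second.ravel()])
    fused = np.maximum(params["fc_w"] @ fused_in + params["fc_b"], 0.0)
    # The output layer maps the fused features into [0, 1].
    return float(sigmoid(params["out_w"] @ fused + params["out_b"]))

# Hypothetical sizes and untrained random weights, for illustration only.
rng = np.random.default_rng(0)
n_inputs, channels, kernel = 8, 4, 3
conv_len = n_inputs - kernel + 1
pool_len = conv_len // 2
params = {
    "kernels": rng.normal(scale=0.1, size=(channels, kernel)),
    "conv_bias": np.zeros(channels),
    "fc_w": rng.normal(scale=0.1, size=(16, channels * (conv_len + pool_len))),
    "fc_b": np.zeros(16),
    "out_w": rng.normal(scale=0.1, size=16),
    "out_b": 0.0,
}
x = rng.normal(size=n_inputs)  # stand-in for effective information plus transaction features
probability = forward(x, params)
```

With untrained weights the returned probability carries no meaning; the sketch only shows how the five layers connect.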
In some embodiments, the method further comprises:
analyzing credit card transaction information in the sample set by using a principal component analysis method, and determining effective information in the credit card transaction information;
extracting effective information in each credit card transaction information in the training set;
generating transaction characteristic information of each piece of credit card transaction information in the training set;
and training a preset convolutional neural network model by using the effective information in each piece of credit card transaction information in the training set, the transaction characteristic information of the piece of credit card transaction information and the transaction label of the piece of credit card transaction information to obtain a trained convolutional neural network model.
In some embodiments, the analyzing the credit card transaction information in the sample set using the principal component analysis method, and determining valid information in the credit card transaction information includes:
analyzing credit card transaction information in a sample set by using a principal component analysis method, and determining important characteristic information in the credit card transaction information;
and analyzing the important characteristic information by using an analysis of variance method to obtain effective characteristic information in the important characteristic information.
In some embodiments, the analyzing the important characteristic information by using an analysis of variance method, and obtaining effective characteristic information in the important characteristic information includes:
calculating a variance value of each piece of important characteristic information in the sample set, and determining the important characteristic information with the variance value meeting a preset condition as the characteristic information to be selected;
and screening effective characteristic information among the characteristic information to be selected according to the correlation among the characteristic information to be selected.
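The variance and correlation screening described in the two steps above can be sketched as follows. The text does not state the preset condition or the correlation rule, so the thresholds `min_variance` and `max_abs_corr`, the greedy keep-first strategy and the function name `select_features` are all assumptions made for illustration; the preceding principal component analysis step is not covered.

```python
import numpy as np

def select_features(X, names, min_variance=0.01, max_abs_corr=0.9):
    """Keep features with enough variance, then drop highly correlated ones."""
    # Step 1: a feature becomes a candidate if its variance meets the (assumed) threshold.
    variances = X.var(axis=0)
    candidates = [j for j in range(X.shape[1]) if variances[j] >= min_variance]
    if len(candidates) < 2:
        return [names[j] for j in candidates]
    # Step 2: greedily keep a candidate only if it is not strongly
    # correlated with any candidate that has already been kept.
    corr = np.corrcoef(X[:, candidates], rowvar=False)
    kept = []
    for pos in range(len(candidates)):
        if all(abs(corr[pos, q]) <= max_abs_corr for q in kept):
            kept.append(pos)
    return [names[candidates[pos]] for pos in kept]

# Synthetic data with hypothetical feature names.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
b = rng.normal(size=200)
X = np.column_stack([a, 2.0 * a + 0.001 * b, b, np.ones(200)])
names = ["amount", "amount_scaled", "hour", "constant"]
selected = select_features(X, names)
```

Here `constant` is removed by the variance step and `amount_scaled` by the correlation step, because it is almost a linear copy of `amount`.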
In some embodiments, training the preset convolutional neural network model by using the effective information in each piece of credit card transaction information in the training set, the transaction characteristic information of the piece of credit card transaction information and the transaction tag of the piece of credit card transaction information, and obtaining the trained convolutional neural network model includes:
S1, inputting effective information in credit card transaction information and transaction characteristic information of the credit card transaction information into a convolutional neural network model to obtain an abnormal transaction prediction result output by the convolutional neural network model;
S2, calculating a prediction error of the neural network model from the transaction label of the credit card transaction information and the prediction result by using a preset loss function;
S3, updating parameters of the neural network model according to the prediction error to obtain a neural network model with updated parameters;
S4, repeating steps S1 to S3 until the prediction error of the neural network model is smaller than a first preset value or the number of iterations of steps S1 to S3 reaches a second preset value, so as to obtain a trained convolutional neural network model.
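The control flow of steps S1 to S4 can be sketched as a simple training loop. To keep the example short, a logistic regression model stands in for the convolutional neural network, binary cross-entropy is assumed as the preset loss function, and plain gradient descent is assumed as the update rule; `error_threshold` and `max_iterations` play the roles of the first and second preset values, and every name and number here is hypothetical.

```python
import numpy as np

def predict_proba(X, w, b):
    """Stand-in model: logistic regression instead of the CNN."""
    return 1.0 / (1.0 + np.exp(-(X @ w + b)))

def train(X, y, lr=0.5, error_threshold=0.1, max_iterations=2000):
    w = np.zeros(X.shape[1])
    b = 0.0
    loss = float("inf")
    iteration = 0
    for iteration in range(1, max_iterations + 1):
        # S1: forward pass to obtain the predicted abnormal-transaction probability.
        p = predict_proba(X, w, b)
        # S2: prediction error from labels and predictions (binary cross-entropy).
        eps = 1e-12
        loss = float(-np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps)))
        # S4: stop once the error is below the first preset value;
        # the loop bound enforces the second preset value.
        if loss < error_threshold:
            break
        # S3: update the parameters along the negative gradient.
        grad = p - y
        w -= lr * (X.T @ grad) / len(y)
        b -= lr * grad.mean()
    return w, b, loss, iteration

# Tiny synthetic example: one feature, label 1 marks an abnormal transaction.
X = np.array([[-2.0], [-1.0], [1.0], [2.0]])
y = np.array([0.0, 0.0, 1.0, 1.0])
w, b, final_loss, iterations = train(X, y)
```

The same loop shape applies when the stand-in model is replaced by a real network and the manual gradient by backpropagation.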
In another aspect, the present application provides a credit card abnormal transaction recognition apparatus, comprising:
the extraction module is used for extracting effective information from credit card transaction information, wherein the credit card transaction information comprises transaction amount, transaction time, transaction position, transaction type, account balance, cardholder information and/or merchant information;
the generation module is used for generating transaction characteristic information of the credit card transaction information according to the credit card transaction information;
the prediction module is used for inputting the effective information and the transaction characteristic information into a pre-trained convolutional neural network model to obtain an abnormal transaction prediction result output by the convolutional neural network model, wherein the convolutional neural network model is obtained through training according to historical credit card transaction information and transaction characteristic information of the historical credit card transaction information.
The embodiment of the application also provides an electronic device, which comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor realizes the steps of the credit card abnormal transaction identification method in any embodiment when executing the program.
The embodiment of the present application also provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the credit card abnormal transaction identification method described in any of the above embodiments.
Compared with traditional machine learning methods and rule engine methods, the credit card abnormal transaction identification method and device provided by the embodiment of the application use a convolutional neural network to efficiently extract the characteristic information in the transaction information and to analyze and judge the transaction information more accurately, thereby improving the accuracy of abnormal transaction identification.
Drawings
In order to illustrate the embodiments of the application or the technical solutions in the prior art more clearly, the drawings required for describing the embodiments or the prior art are briefly introduced below. The drawings in the following description show only some embodiments of the application, and a person skilled in the art may obtain other drawings from them without inventive effort. In the drawings:
Fig. 1 is a flowchart of a credit card abnormal transaction identification method according to an embodiment of the application.
Fig. 2 is a schematic flow chart of a part of a credit card abnormal transaction identification method according to an embodiment of the application.
Fig. 3 is a schematic flow chart of a part of a credit card abnormal transaction identification method according to an embodiment of the application.
Fig. 4 is a schematic flow chart of a part of a credit card abnormal transaction identification method according to an embodiment of the application.
Fig. 5 is a schematic flow chart of a part of a credit card abnormal transaction identification method according to an embodiment of the application.
Fig. 6 is a schematic structural diagram of a credit card abnormal transaction recognition device according to an embodiment of the present application.
Fig. 7 is a schematic physical structure of an electronic device according to an embodiment of the application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the embodiments are described in further detail below with reference to the accompanying drawings. The exemplary embodiments of the present application and their descriptions are intended to explain the application and are not to be construed as limiting it. It should be noted that, where no conflict arises, the embodiments of the present application and the features in the embodiments may be combined with each other arbitrarily.
The terms "first," "second," and the like, as used herein, do not denote a particular order or sequence and are not intended to limit the application; they are used only to distinguish elements or operations described with the same technical term.
As used herein, the terms "comprising," "including," "having," "containing," and the like are open-ended terms, meaning including but not limited to.
As used herein, "and/or" includes any or all combinations of the listed items.
The acquisition, processing and use of data in the application all comply with applicable national laws and regulations.
For a better understanding of the present application, a detailed description of the research background of the application is provided below.
In recent years, with the development of deep learning technology, a credit card abnormal transaction identification method based on deep learning has been widely studied and applied. Convolutional neural networks are an important algorithm in deep learning, which has better feature extraction and classification capabilities. Therefore, the convolutional neural network is applied to the field of credit card abnormal transaction identification, abnormal transaction data can be better processed, and the accuracy and reliability of abnormal transaction identification are improved.
The application provides a credit card abnormal transaction identification method and device based on a convolutional neural network, which aim to overcome the shortcomings of traditional abnormal transaction identification methods, such as low accuracy, vulnerability to attack and low efficiency. By using the convolutional neural network for abnormal transaction identification, the features of credit card transaction data can be extracted automatically, which avoids the manual feature extraction required by traditional methods, and detection accuracy and speed can be improved through continued training and optimization. The method combines data preprocessing, feature engineering, and convolutional neural network model construction, training and evaluation. It can effectively distinguish abnormal transactions from normal transactions, improves the accuracy and efficiency of credit card abnormal transaction identification, and is expected to be widely applicable in financial institutions, electronic commerce and other fields.
The execution subject of the credit card abnormal transaction identification method provided by the application includes, but is not limited to, a computer.
Fig. 1 is a flow chart of a credit card abnormal transaction identification method according to an embodiment of the present application, and as shown in fig. 1, the credit card abnormal transaction identification method according to the embodiment of the present application includes:
S101, extracting effective information from credit card transaction information, wherein the credit card transaction information comprises transaction amount, transaction time, transaction position, transaction type, account balance, cardholder information and/or merchant information;
In step S101, the data of the present application is derived from actual credit card transaction records and may be obtained from banks, payment companies or other financial institutions. The collected data includes transaction amount, transaction time, transaction location, transaction type, account balance, cardholder information, merchant information, and/or the like.
Before effective information is extracted from the credit card transaction information, data cleaning can be performed on it. Data cleaning is an important step of data preprocessing and includes data deduplication, missing value filling, outlier processing and the like; it ensures data quality and improves modeling and prediction accuracy. Specifically:
The collected raw data may contain duplicate data, missing data, abnormal data and similar problems, so it needs to be cleaned and preprocessed. In the present application, various data cleaning methods may be employed, for example, interpolation to fill in missing values and an outlier detection method to handle abnormal data.
(1) Data deduplication: data deduplication refers to removing duplicate data from a dataset so that it does not negatively affect modeling and prediction. In credit card abnormal transaction identification, some abnormal transaction actions may legitimately occur multiple times, so all abnormal transaction data must be preserved when the data is deduplicated. The data may be deduplicated using the drop_duplicates() method in the pandas library.
(2) Missing value filling: during data acquisition, some data may be missing. To avoid negative effects of these missing values on modeling and prediction, they need to be filled in. They may be filled with a fixed value, the mean, the median, the mode and so on; here the mean is used. The fillna() method in the pandas library may be used, with the mean computed by mean() passed as the fill value.
(3) Outlier processing: during data acquisition, some outliers may arise, for example from data entry errors or failures of data acquisition equipment. These outliers may negatively affect modeling and prediction and therefore need to be handled. Here, the rows containing outliers are deleted directly.
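The three cleaning steps above can be sketched with pandas as follows. The column names, the sample values and the 1.5 × IQR outlier rule are assumptions made for illustration, since the text does not say how outliers are detected. The sketch also does not implement the requirement to preserve repeated abnormal transactions during deduplication, which would need a label column.

```python
import pandas as pd

def clean_transactions(df: pd.DataFrame, amount_col: str = "amount") -> pd.DataFrame:
    """Deduplicate, mean-fill missing numeric values, and drop outlier rows."""
    # (1) Data deduplication: remove exact duplicate rows.
    df = df.drop_duplicates().copy()
    # (2) Missing value filling: fill numeric gaps with the column mean.
    numeric_cols = df.select_dtypes(include="number").columns
    df[numeric_cols] = df[numeric_cols].fillna(df[numeric_cols].mean())
    # (3) Outlier processing: delete rows whose amount falls outside
    # 1.5 * IQR of the quartiles (an assumed detection rule).
    q1, q3 = df[amount_col].quantile([0.25, 0.75])
    iqr = q3 - q1
    keep = df[amount_col].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
    return df[keep].reset_index(drop=True)

# Hypothetical sample: one duplicate row, one missing balance, one extreme amount.
raw = pd.DataFrame({
    "amount": [10.0, 10.0, 12.0, 11.0, 9.0, 13.0, 5000.0],
    "balance": [100.0, 100.0, 90.0, None, 70.0, 80.0, 60.0],
})
cleaned = clean_transactions(raw)
```

In a fraud setting, deleting extreme amounts can also delete the very transactions of interest, so the outlier rule would normally be restricted to values that are clearly recording errors.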
Extracting effective information refers to selecting information which is helpful to model prediction results from transaction information, and removing useless or redundant information. In credit card abnormal transaction identification, extraction of effective information plays an important role in improving accuracy and robustness of a model.
S102, generating transaction characteristic information of the credit card transaction information according to the credit card transaction information;
In step S102, the feature representation capability of the model is enriched by combining items of credit card transaction information or generating new features from them. These combined or generated features can better capture the nonlinear relationships between the input features and provide more information to the model, thereby improving its performance and generalization ability.
Feature combination and generation derive new features from the existing ones: different features are logically combined to form a higher-level feature expression that provides richer information. Such a combination may be implemented in various ways, such as addition, multiplication, subtraction, division and logical operations. Feature combinations can have multiple levels, ranging from the simple addition or multiplication of two features to more complex combinations of several features.
S103, inputting the effective information and the transaction characteristic information into a pre-trained convolutional neural network model to obtain an abnormal transaction prediction result output by the convolutional neural network model, wherein the convolutional neural network model is obtained through training according to historical credit card transaction information and transaction characteristic information of the historical credit card transaction information.
In step S103, the convolutional neural network (Convolutional Neural Network, CNN) is a feedforward neural network characterized by local connections, weight sharing and pooling, and it can effectively process two-dimensional data such as images. When designing the network structure, parameters such as convolution kernel size, stride and pooling kernel size need to be selected according to the actual situation. The convolutional neural network model is trained using historical credit card transaction information and the transaction characteristic information generated from it. After training, the model predicts whether a credit card transaction is abnormal based on the effective information and the transaction characteristic information.
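As a small aid for choosing kernel size, stride and pooling size, the standard output-length formula for a convolution or pooling window along one axis, floor((n + 2p − k) / s) + 1, can be written as a helper. The function name and the example sizes are hypothetical and are not taken from the application.

```python
def output_length(n: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Output length of a convolution or pooling window along one axis."""
    if kernel < 1 or stride < 1:
        raise ValueError("kernel and stride must be positive")
    if n + 2 * padding < kernel:
        raise ValueError("kernel is larger than the padded input")
    return (n + 2 * padding - kernel) // stride + 1

# Example: 16 input features, a size-3 convolution, then size-2 pooling with stride 2.
after_conv = output_length(16, kernel=3)
after_pool = output_length(after_conv, kernel=2, stride=2)
```

Checking these lengths before training helps confirm that the fully connected layer receives the number of inputs it was sized for.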
Compared with traditional machine learning methods and rule engine methods, the credit card abnormal transaction identification method provided by the application uses a convolutional neural network to efficiently extract the characteristic information in the transaction information and to analyze and judge the transaction information more accurately, thereby improving the accuracy of abnormal transaction identification.
In some embodiments, the generating transaction characteristic information of the credit card transaction information from the credit card transaction information includes:
calculating proportion information of the transaction amount relative to the account balance according to the transaction amount and the account balance in the credit card transaction information; and/or
extracting first time characteristic information from the transaction time in the credit card transaction information according to a preset time dimension; and/or
acquiring transaction interval rule characteristic information associated with the credit card transaction information according to the credit card transaction information and the historical transaction information of the credit card; and/or
generating transaction statistics characteristic information associated with the credit card transaction information according to the credit card transaction information and the historical transaction information of the credit card; and/or
generating first combination characteristic information according to the transaction amount and the transaction type in the credit card transaction information; and/or
generating second combination characteristic information according to the transaction amount in the credit card transaction information and the account type of the credit card; and/or
determining transaction frequency characteristic information associated with the credit card transaction information according to the number of transactions of the credit card within a preset time length.
Specifically, the following are some specific feature examples that may be used for feature combination and generation:
(1) Transaction amount to account balance ratio: and dividing the transaction amount characteristic and the account balance characteristic to obtain a proportion characteristic representing the transaction amount relative to the account balance. This feature may reflect the relative amount of the transaction, helping to capture abnormal transactions.
(2) Transaction timestamp extraction hours and minutes: the hour and minute information is extracted from the transaction timestamp, generating two new features. These temporal features may help the model discover patterns and trends of transactions over different time periods.
(3) Combination of time features: the transaction time stamp is divided into different time units of year, month, day, hour and the like, and the time units are combined. For example, month and hour are combined to obtain a new feature that represents the time of occurrence of the transaction. This feature may help the model discover unusual patterns in transaction time.
(4) Combination of geographic location features: combining the latitude and longitude features of the transaction location may generate a new feature that represents the specific location. This feature may help identify abnormal transactions in geographic locations.
(5) Historical transaction statistics generation: based on the customer's historical transaction records, statistics are generated such as average per transaction amount, variance of amounts of recent transactions, etc. These features may provide custom transaction habit and stability information that helps identify abnormal transactions that do not conform to conventional transaction patterns.
(6) Transaction amount to transaction count ratio: and dividing the transaction amount characteristic with the transaction number characteristic to obtain the characteristic representing the average transaction amount. This feature may reflect the average size of the transaction, helping to determine if there are abnormally large transactions.
(7) Combination of transaction amount and transaction type: the transaction amount characteristic is combined with the transaction type characteristic, such as multiplying the transaction amount by the transaction type, to generate a new characteristic. This feature may help the model distinguish between monetary patterns under different transaction types.
(8) Combination of transaction amount and account type: the transaction amount feature is combined with the account type feature. For example, the transaction amount is divided by the account type to generate a proportional feature that indicates the transaction amount relative to the account type. This feature may help the model discover abnormal transactions under different account types.
(9) Generating recent transaction frequency characteristics: a characteristic representing the frequency of transactions is generated based on the transaction records of the customer over the last period of time. For example, the number of transactions in the last week of the customer or the average number of transactions per day may be counted. This feature may help the model discover unusual high frequency transaction behavior.
(10) Feature combinations for specific transaction types: for a particular transaction type, the relevant features may be combined to extract more discriminative features. For example, for credit card cash out transactions, features such as cash out amount to account balance ratio, cash out time to last cash out time difference, etc. may be combined to better distinguish cash out behavior.
(11) Combination of time interval features: the time interval features between adjacent transactions are combined. For example, the time interval between the previous transaction and the subsequent transaction may be multiplied by the amount of the current transaction to generate a new feature. This feature may capture temporal patterns and interval regularities between transactions.
In summary, feature combination and generation enriches the feature representation capability of the model by logically combining existing features or generating new features. Through reasonable selection and design of the feature combination and generation modes, more discriminative features can be extracted, and the model's ability to recognize credit card abnormal transactions is enhanced. In the present application, the feature combination and generation method can be adjusted and optimized according to the data characteristics and domain knowledge so as to improve the performance and generalization capability of the model.
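As a non-limiting illustration, items (6), (9) and (11) above can be sketched in Python as follows. The dictionary keys `amount` and `time`, the helper name `combined_features`, and the choice to measure the interval from the most recent earlier transaction are assumptions made for this example and are not specified by the application.

```python
from datetime import datetime, timedelta

def combined_features(current, history):
    """Derive combined features for `current` from the customer's earlier transactions.

    Each transaction is a dict with the hypothetical keys "amount" and "time";
    `history` is assumed to be non-empty and sorted by time.
    """
    amounts = [t["amount"] for t in history]
    # Item (6): total transaction amount divided by the transaction count.
    avg_amount = sum(amounts) / len(amounts)
    # Item (9): number of transactions in the week before the current one.
    week_ago = current["time"] - timedelta(days=7)
    recent_count = sum(1 for t in history if t["time"] >= week_ago)
    # Item (11): interval since the previous transaction (assumed reading),
    # in seconds, multiplied by the current transaction amount.
    interval_s = (current["time"] - history[-1]["time"]).total_seconds()
    return {
        "avg_amount": avg_amount,
        "recent_count": recent_count,
        "interval_x_amount": interval_s * current["amount"],
    }

history = [
    {"amount": 100.0, "time": datetime(2023, 5, 1, 9, 0)},
    {"amount": 300.0, "time": datetime(2023, 5, 30, 12, 0)},
    {"amount": 200.0, "time": datetime(2023, 6, 1, 14, 0)},
]
current = {"amount": 250.0, "time": datetime(2023, 6, 1, 14, 30)}
features = combined_features(current, history)
```

In practice such derived values would be appended to the feature vector of each transaction before it is passed to the model.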
In some embodiments, the convolutional neural network model comprises:
an input layer for receiving the effective information and the transaction characteristic information;
a convolution layer for extracting first characteristic information from the effective information and the transaction characteristic information;
a pooling layer for extracting second characteristic information from the local characteristic information output by the convolution layer;
a fully connected layer for fusing the first characteristic information extracted by the convolution layer and the second characteristic information extracted by the pooling layer to generate fused characteristic information; and
an output layer for mapping the fused characteristic information output by the fully connected layer into the interval [0, 1] using an activation function, the mapped value representing the probability that the credit card transaction is abnormal.
Specifically, model construction is a core step of the convolutional neural network approach; its purpose is to design a suitable network structure and parameters so that the network can efficiently identify abnormal credit card transactions. In the present application, the model is constructed with a multi-layer convolutional neural network structure, specifically as follows:
(1) Input layer: this layer accepts the raw data. Each piece of data is represented as a one-dimensional vector whose length is the number of features of the input data. Assuming that there are n features and m pieces of data, the input data is denoted as X, and its size is m×n.
(2) Convolution layer: the convolution layer is the core layer of the convolutional neural network; its purpose is to extract useful feature information from the input layer. In the present application, the convolution layer uses multiple convolution kernels of different sizes to perform the convolution calculation so as to capture local feature information in the input data. The specific formula is as follows:
$h_i = \mathrm{relu}\left(\sum_{j} \omega_j x_{i+j} + b\right)$
where $h_i$ represents the i-th output result of the convolution kernel, $\omega_j$ represents the j-th weight of the convolution kernel, $x_{i+j}$ represents the (i+j)-th feature value of the input layer, $b$ represents the bias term, and $k$ represents the size of the convolution kernel, so that the sum runs over the $k$ positions of the kernel. relu is an activation function, defined as follows:
$\mathrm{relu}(x) = \max(0, x)$.
(3) Pooling layer: the pooling layer downsamples and compresses the features output by the convolution layer so as to reduce the number of parameters and the amount of calculation of the model. In the present application, the pooling layer uses the maximum pooling method to extract the most salient feature information output by the convolution layer. The specific formula is as follows:
$y_i = \max\left(h_{i \times s \,:\, i \times s + f}\right)$;
where $y_i$ represents the output result of the pooling layer, $h_i$ represents the output result of the i-th convolution kernel in the convolution layer, $s$ represents the step size of the pooling layer, and $f$ represents the size of the pooling window.
(4) Fully connected layer: the fully connected layer fuses the features extracted by the convolution layer and the pooling layer and converts the feature information output by the pooling layer into a classification result. In the present application, the fully connected layer uses multiple layers of fully connected neurons to capture feature information at different levels. The specific formula is as follows:
where $y_i$ represents the output result of the fully connected layer, $\omega_j$ represents the weight parameters, $x_j$ represents the output result of the previous layer, $b$ represents the bias term, and $n$ represents the number of neurons in the fully connected layer.
(5) Output layer: the output layer maps the output of the fully connected layer into the interval [0, 1] using a Sigmoid function, representing the probability that the sample is determined to be an abnormal transaction:
$P = \dfrac{1}{1 + e^{-\left(\sum_{j=1}^{m} \omega_j x_j + b\right)}}$
where $P$ represents the probability that the output result is an abnormal transaction, $\omega_j$ represents the weight parameters, $x_j$ represents the output result of the previous layer, $b$ represents the bias term, and $m$ represents the number of neurons in the output layer.
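As a non-limiting illustration, the five layers described above can be sketched as a single forward pass using NumPy. This is a simplified sketch rather than the model of the application: it uses one convolution kernel instead of several, assumes a relu activation in the fully connected layer, and all parameter values and names (`conv1d`, `max_pool1d`, `dense`, `forward`, `params`) are hypothetical.

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def conv1d(x, w, b):
    """Convolution layer: h_i = relu(sum_j w_j * x_{i+j} + b), kernel size k = len(w)."""
    k = len(w)
    return np.array([relu(np.dot(w, x[i:i + k]) + b) for i in range(len(x) - k + 1)])

def max_pool1d(h, f, s):
    """Pooling layer: y_i = max(h[i*s : i*s + f]) over windows that fit inside h."""
    n_out = (len(h) - f) // s + 1
    return np.array([h[i * s:i * s + f].max() for i in range(n_out)])

def dense(x, W, b):
    """Fully connected layer: weighted sum plus bias, followed by relu (assumed)."""
    return relu(W @ x + b)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, params):
    """Input -> convolution -> max pooling -> fully connected -> Sigmoid output."""
    h = conv1d(x, params["conv_w"], params["conv_b"])
    y = max_pool1d(h, f=2, s=2)
    z = dense(y, params["fc_W"], params["fc_b"])
    return float(sigmoid(np.dot(params["out_w"], z) + params["out_b"]))

x = np.array([1.0, 3.0, 2.0, 0.0, 4.0, 1.0])  # one transaction with n = 6 features
params = {
    "conv_w": np.array([-1.0, 0.0, 1.0]), "conv_b": 0.5,
    "fc_W": np.array([[1.0, -1.0], [0.5, 0.5]]), "fc_b": np.zeros(2),
    "out_w": np.array([1.0, -1.0]), "out_b": 0.0,
}
h = conv1d(x, params["conv_w"], params["conv_b"])
p = forward(x, params)  # probability that the transaction is abnormal
```

The returned value always lies strictly between 0 and 1, so it can be compared against a decision threshold chosen for the application.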
As shown in fig. 2, in some embodiments, the method further comprises:
S201, analyzing credit card transaction information in a sample set by using a principal component analysis method, and determining effective information in the credit card transaction information;
In step S201, principal component analysis (Principal Component Analysis, PCA) transforms the raw data into a new set of principal components, thereby reducing the number of features and the dimensionality of the data. In credit card abnormal transaction identification, PCA may be used to extract the features of the transaction information that are important for describing the transaction data; the valid features, i.e., the effective information, are then determined among the important features to facilitate subsequent modeling and prediction.
S202, extracting effective information in each credit card transaction information in a training set;
In step S202, after the effective information in the credit card transaction information has been determined using the credit card transaction information in the sample set, the effective information in each piece of credit card transaction information in the training set is extracted for subsequent modeling.
S203, generating transaction characteristic information of each piece of credit card transaction information for each piece of credit card transaction information in the training set;
In step S203, in a manner similar to step S102, transaction characteristic information is generated for each piece of credit card transaction information in the training set.
S204, training a preset convolutional neural network model by using the effective information in each piece of credit card transaction information in the training set, the transaction characteristic information of the piece of credit card transaction information and the transaction label of the piece of credit card transaction information to obtain a trained convolutional neural network model.
In step S204, for each piece of credit card transaction information in the training set, the effective information and the transaction characteristic information of the credit card transaction information are used as the input of the model, and the preset convolutional neural network model is trained according to the transaction label of the credit card transaction information until the convolutional neural network model converges or reaches the preset iteration number.
As shown in fig. 3, in some embodiments, the analyzing the credit card transaction information in the sample set by using the principal component analysis method, and determining valid information in the credit card transaction information includes:
S2011, analyzing credit card transaction information in a sample set by using a principal component analysis method, and determining important characteristic information in the credit card transaction information;
In step S2011, the specific implementation steps may be as follows:
(1) In Python, PCA can be implemented using the PCA class in the sklearn library. The main parameters of the PCA class include n_components, whiten, and the like, where n_components indicates the number of principal components retained and whiten indicates whether whitening is performed.
(2) After the PCA is completed, the data projected onto the principal components may be presented visually. In Python, the data can be visualized using the matplotlib library to facilitate observation of the relationships between different features. In the visual map, points with different colors represent different categories, so that the correlation strength between different features can be judged.
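As a non-limiting illustration of step (1), the PCA class of scikit-learn can be used as sketched below. The feature matrix here is synthetic random data standing in for real transaction features, and the choice of three retained components is an assumption made for the example.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# Synthetic stand-in for a transaction feature matrix: 200 samples, 6 features.
X = rng.normal(size=(200, 6))
# Make one feature nearly redundant so that PCA has structure to compress.
X[:, 3] = 2.0 * X[:, 0] + 0.01 * rng.normal(size=200)

# n_components: number of principal components retained; whiten: whether to whiten.
pca = PCA(n_components=3, whiten=False)
Z = pca.fit_transform(X)               # data projected onto the principal components
ratios = pca.explained_variance_ratio_  # share of variance explained by each component
```

For step (2), the first two columns of `Z` could, for example, be drawn with `matplotlib.pyplot.scatter`, colored by the transaction label.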
S2012, analyzing the important characteristic information by using an analysis of variance method to obtain effective characteristic information in the important characteristic information.
In step S2012, analysis of variance is a commonly used feature selection method. It calculates the variance values of the different features used in abnormal transaction identification and determines the contribution of each feature to the model, so as to obtain the effective features among the important features.
As shown in fig. 4, in some embodiments, the analyzing the important feature information by using an analysis of variance method, obtaining valid information in the important feature information includes:
S20121, calculating a variance value of each piece of important characteristic information in the sample set, and determining the important characteristic information whose variance value meets a preset condition as candidate characteristic information;
In step S20121, assuming that a certain important feature is x and its variance value is Var(x), the following formula may be used for calculation:
$\mathrm{Var}(x) = \dfrac{1}{n} \sum_{i=1}^{n} \left(x_i - \bar{x}\right)^2$
where $n$ represents the number of samples of the important feature, $x_i$ represents the value of x for sample i, and $\bar{x}$ represents the average value of x.
After the variance values are calculated, the features with larger variance values can be selected as input variables of the model. A larger variance value indicates that the value of the feature changes greatly between different samples, so that the feature has stronger discriminative power.
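As a non-limiting illustration, variance-based screening can be sketched with NumPy as follows. The threshold value stands in for the "preset condition", which the application does not quantify, and the helper name `select_by_variance` and the sample matrix are hypothetical.

```python
import numpy as np

def select_by_variance(X, threshold):
    """Return the indices of columns whose variance exceeds `threshold`, plus all variances."""
    variances = X.var(axis=0)  # population variance: divides by the number of samples n
    selected = [j for j, v in enumerate(variances) if v > threshold]
    return selected, variances

# Rows are samples, columns are important features (values are made up).
X = np.array([
    [1.0, 10.0, 5.0],
    [1.0, 20.0, 5.1],
    [1.0, 30.0, 4.9],
    [1.0, 40.0, 5.0],
])
selected, variances = select_by_variance(X, threshold=0.1)
```

Because variance depends on the unit of each feature, features measured on very different scales would normally be normalized before a common threshold is applied.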
S20122, screening effective characteristic information among the candidate characteristic information according to the correlation among the candidate characteristic information.
In step S20122, after the features with larger variance values are selected, the correlation between the features needs to be analyzed. If strongly correlated features exist, the training speed and accuracy of the model can be improved by reducing the number of redundant features. In the present application, the correlation between features may be analyzed using a correlation coefficient matrix.
The correlation coefficient matrix is a matrix that shows the correlation between different features. For n features it has a size of n×n, where the element in the i-th row and the j-th column represents the correlation coefficient between the i-th feature and the j-th feature. The correlation coefficient matrix may be calculated using the corr() method in the pandas library in Python and then visualized using a heat map or the like to facilitate viewing of the correlation between different features. In Python, a heat map may be plotted using the heatmap() method of the seaborn library. In the heat map, the closer the correlation coefficient is to 1, the stronger the positive correlation between the two features; the closer the correlation coefficient is to -1, the stronger the negative correlation between the two features; the closer the correlation coefficient is to 0, the weaker the linear correlation between the two features. The correlation strength between different features can be judged according to the colors and the numerical values in the heat map, so that the features with strong correlation are selected as input variables of the model.
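As a non-limiting illustration, the correlation coefficient matrix can be computed with the `corr()` method of pandas and used to reduce redundant features as sketched below. The 0.9 threshold, the rule of dropping the later feature of each strongly correlated pair, the helper name `drop_redundant`, and the column names are assumptions made for this example.

```python
import pandas as pd

def drop_redundant(df, threshold=0.9):
    """Drop the later feature of every pair whose absolute correlation exceeds `threshold`."""
    corr = df.corr().abs()  # n x n matrix of absolute Pearson correlation coefficients
    cols = list(df.columns)
    dropped = set()
    for i, a in enumerate(cols):
        if a in dropped:
            continue
        for b in cols[i + 1:]:
            if b not in dropped and corr.loc[a, b] > threshold:
                dropped.add(b)
    return df.drop(columns=sorted(dropped)), corr

df = pd.DataFrame({
    "amount": [10.0, 20.0, 30.0, 40.0, 50.0],
    # Perfectly correlated with "amount", hence redundant.
    "amount_cents": [1000.0, 2000.0, 3000.0, 4000.0, 5000.0],
    "hour": [3.0, 15.0, 9.0, 22.0, 1.0],
})
reduced, corr = drop_redundant(df)
```

The matrix `df.corr()` could also be passed to `seaborn.heatmap` to obtain the heat map described above.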
As shown in fig. 5, in some embodiments, training the preset convolutional neural network model by using the effective information in each piece of credit card transaction information in the training set, the transaction characteristic information of the piece of credit card transaction information, and the transaction tag of the piece of credit card transaction information, to obtain a trained convolutional neural network model includes:
S1, inputting effective information in credit card transaction information and transaction characteristic information of the credit card transaction information into a convolutional neural network model to obtain an abnormal transaction prediction result output by the convolutional neural network model;
S2, calculating a prediction error of the neural network model according to the transaction label of the credit card transaction information and the prediction result by using a preset loss function;
S3, updating parameters of the neural network model according to the prediction error to obtain a neural network model with updated parameters;
S4, repeating steps S1 to S3 until the prediction error of the neural network model is smaller than a first preset value or the number of iterations of steps S1 to S3 reaches a second preset value, so as to obtain a trained convolutional neural network model.
Specifically, the training process of the convolutional neural network mainly comprises two processes, forward propagation and back propagation. Forward propagation is used to calculate the prediction result of the model, and back propagation is used to calculate the gradient of the loss function and update the parameters. The convolutional neural network training steps are as follows:
1. Initializing parameters
Before training begins, parameters in the convolutional neural network need to be initialized. The parameters such as convolution kernel weight, full connection layer weight, bias and the like are initialized by adopting a random initialization method.
2. Forward propagation
For an input sample x, the forward propagation process calculates from the input layer to the output layer according to the network structure and obtains the prediction result $\hat{y}$ of the model. The specific process is as follows:
(1) Input layer: the sample x is converted into an input tensor and input into the convolution layer.
(2) Convolution layer: feature information in x is extracted by the convolution operation, and an activation function applies nonlinear processing to the convolution result.
(3) Pooling layer: the convolution result is downsampled and compressed to reduce the number of model parameters and the amount of calculation.
(4) Fully connected layer: the pooling result is flattened and input into the fully connected layer for linear transformation and nonlinear processing.
(5) Output layer: the output result of the fully connected layer is converted into a probability value using a Softmax function, representing the prediction result $\hat{y}$.
3. Calculating a loss function
The loss function is a function in the neural network that measures the difference between the predicted value and the true value. In the present application, a cross entropy (Cross Entropy) loss function is used, which can effectively measure the difference between the predicted value and the true value in a binary classification problem. For a sample x, the loss function is used to calculate the difference between the prediction result $\hat{y}$ and the real label y. The specific formula is as follows:
$L = -\sum_{i=1}^{n} y_i \log\left(\hat{y}_i\right)$
where $n$ represents the number of neurons of the output layer, $y_i$ represents the i-th element of the real label, and $\hat{y}_i$ represents the i-th element of the prediction result.
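As a non-limiting illustration, the cross entropy between a real label and a prediction result can be computed as sketched below. The two-element one-hot label (normal, abnormal), the small constant `eps` that guards against log(0), and the example probabilities are assumptions made for this example.

```python
import math

def cross_entropy(y_true, y_pred, eps=1e-12):
    """L = -sum_i y_i * log(y_hat_i); `eps` avoids taking the logarithm of zero."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))

# Real label: the transaction is abnormal, encoded as (normal, abnormal) = (0, 1).
label = [0.0, 1.0]
loss_good = cross_entropy(label, [0.1, 0.9])  # confident and correct prediction
loss_bad = cross_entropy(label, [0.9, 0.1])   # confident and wrong prediction
```

A correct, confident prediction yields a loss near zero, while a wrong, confident prediction yields a much larger loss, which is what drives the parameter update.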
4. Back propagation
Back propagation is used to calculate the gradient of the loss function with respect to the model parameters so that the parameters can be updated using a gradient descent algorithm. The specific process is as follows:
(1) Output layer: the gradient of the output layer is calculated and passed to the fully connected layer.
(2) Fully connected layer: the gradient of the fully connected layer is calculated and passed to the pooling layer.
(3) Pooling layer: the gradient of the pooling layer is calculated and passed to the convolution layer.
(4) Convolution layer: the gradient of the convolution layer is calculated and passed to the input layer.
5. Parameter update
The gradients calculated by back propagation are used to update the parameters in the model with a gradient descent algorithm so as to minimize the loss function. The specific process is as follows:
(1) Calculating the gradient: the gradient information used to update the parameters in the model is calculated by back propagation. The stochastic gradient descent method is commonly employed.
(2) Parameter updating: the parameters in the model are updated according to the gradient information, the learning rate, and the like. The specific formula is as follows:
$W^{(t+1)} = W^{(t)} - \alpha \dfrac{\partial L}{\partial W^{(t)}}$
where $W^{(t)}$ represents the parameter value at the t-th iteration, $L$ represents the loss function, $\dfrac{\partial L}{\partial W^{(t)}}$ represents the gradient with respect to the parameter W, and $\alpha$ represents the learning rate.
6. Training model
The steps of forward propagation, back propagation, and parameter updating are repeated until the model converges or the preset number of iterations is reached.
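As a non-limiting illustration, the loop of steps S1 to S4 with its two stopping conditions can be sketched as follows. To keep the example short, a single Sigmoid unit stands in for the convolutional neural network, and the binary form of the cross entropy is used; the learning rate, the error threshold (first preset value), the iteration limit (second preset value), and the toy data are all assumptions.

```python
import numpy as np

def train(X, y, lr=0.5, max_error=0.1, max_iters=5000):
    """Repeat S1 to S3 until the loss is below `max_error` or `max_iters` is reached."""
    rng = np.random.default_rng(0)
    w = rng.normal(scale=0.01, size=X.shape[1])  # random initialization of the weights
    b = 0.0
    loss = float("inf")
    for it in range(1, max_iters + 1):
        # S1: forward propagation gives the predicted probability of an abnormal transaction.
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        # S2: prediction error from the transaction labels (binary cross entropy).
        eps = 1e-12
        loss = -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))
        # S4: stop once the error is smaller than the first preset value.
        if loss < max_error:
            break
        # S3: gradient descent update, W(t+1) = W(t) - lr * dL/dW.
        grad_z = (p - y) / len(y)
        w -= lr * (X.T @ grad_z)
        b -= lr * grad_z.sum()
    return w, b, loss, it

# Toy training set: one feature per transaction, label 1 means abnormal.
X = np.array([[-2.0], [-1.0], [1.0], [2.0]])
y = np.array([0.0, 0.0, 1.0, 1.0])
w, b, loss, iters = train(X, y)
pred = (1.0 / (1.0 + np.exp(-(X @ w + b))) > 0.5).astype(float)
```

For the network of the application, the same loop structure applies, with the forward and backward passes running through the convolution, pooling, and fully connected layers.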
In order to better understand the present application, a detailed description will be given below of a credit card abnormal transaction identification method provided by the present application through a specific embodiment.
The technical scheme of the application is a credit card abnormal transaction identification method based on a convolutional neural network, which comprises the following steps:
1. Data collection: the data of the present application is derived from actual credit card transaction records and may be obtained from various banks, payment companies, or financial institutions. The data collected should include the transaction amount, transaction time, transaction location, transaction type, cardholder information, merchant information, and the like.
2. Data cleaning: data cleaning is an important step of data preprocessing, and comprises data deduplication, missing value filling, outlier processing and the like, so as to ensure data quality and improve modeling and prediction accuracy.
3. Feature combination and generation: in this step we enrich the feature representation capabilities of the model by combining existing features or generating new features. These combined or generated features may better capture the non-linear relationship between the input features, providing more information to the model, thereby improving the performance and generalization ability of the model.
4. Feature engineering: feature engineering is a very important link in machine learning. Its purpose is to convert raw data into features that can be used to train a machine learning model, so as to improve the accuracy and generalization capability of the model. In the present application, feature engineering is critical to credit card abnormal transaction identification.
5. Building a convolutional neural network model: a convolutional neural network model structure is designed, and parameters such as the number of neurons and the activation function of each layer are determined, so that the structure is suitable for the credit card abnormal transaction identification task.
6. Model training: after the model is built, it needs to be trained using training data. The training data includes known fraudulent and non-fraudulent transaction data, and the model's ability to detect fraudulent transactions is improved by optimizing the model parameters. In the training process, a suitable loss function and optimization algorithm need to be selected to improve the training efficiency and performance of the model.
7. Model evaluation: after model training is complete, the model needs to be evaluated using test data. The test data includes fraudulent and non-fraudulent transaction data unknown to the model, and the performance of the model is evaluated by calculating indexes such as accuracy, recall, precision, and F1 value.
8. Model deployment and application: after the model evaluation is completed, the model needs to be deployed into an actual credit card transaction system for real-time abnormal transaction identification. Specifically, the model can be embedded into the flow of the transaction system to detect the transaction data in real time and give out a corresponding risk assessment result. In addition, the model can be continuously optimized and updated to improve the detection capability and performance of the model.
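As a non-limiting illustration of step 7, the evaluation indexes can be computed from the true labels and the predicted labels as sketched below. The label convention (1 for a fraudulent transaction, 0 otherwise), the helper name `classification_metrics`, and the example labels are assumptions made for this example.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 value for binary labels (1 = fraudulent)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # fraud correctly flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # normal correctly passed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # normal wrongly flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # fraud missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

y_true = [1, 0, 0, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 1, 0, 0, 0, 0]
metrics = classification_metrics(y_true, y_pred)
```

Because fraudulent transactions are usually rare, recall and precision tend to be more informative than accuracy alone for this task.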
In general, the credit card abnormal transaction identification method based on the convolutional neural network can improve the accuracy and efficiency of abnormal transaction identification. Each step is described in detail below.
2. Data cleansing
The collected raw data may have various problems such as duplicate data, missing data, abnormal data, and the like. Therefore, data cleaning and preprocessing are required, including deduplication, missing value filling, outlier processing, and the like. In the present application, various data cleaning methods may be employed, for example, interpolation is used to fill in missing values, and an abnormal value detection method is used to process abnormal data.
(1) Data deduplication: data deduplication refers to removing duplicate data from the data set to avoid the negative impact of duplicate data on modeling and prediction. In credit card abnormal transaction identification, since some fraud may occur multiple times, all fraudulent transaction data should be preserved when the data is deduplicated. The data may be deduplicated using the drop_duplicates() method in the pandas library.
(2) Missing value filling: during data acquisition, some data may be missing. To avoid the negative impact of these missing values on modeling and prediction, the missing values need to be filled. They may be filled with a fixed value, the mean, the median, the mode, etc.; here the mean is used. The fillna() function in the pandas library may be used, with the argument set to mean().
(3) Outlier handling: during data acquisition, some outliers may occur, for example due to data entry errors or data acquisition equipment failures. These outliers may negatively affect modeling and prediction and therefore need to be processed. Here, the rows containing outliers are deleted directly.
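As a non-limiting illustration, the three cleaning steps can be sketched with pandas as follows. The column names, the helper name `clean_transactions`, and in particular the outlier rule (values outside 1.5 times the interquartile range) are assumptions made for this example; the application does not specify how outliers are detected.

```python
import pandas as pd

def clean_transactions(df, label_col="is_fraud", amount_col="amount"):
    # (1) Deduplicate the non-fraudulent rows while keeping every fraudulent row.
    fraud = df[df[label_col] == 1]
    normal = df[df[label_col] == 0].drop_duplicates()
    df = pd.concat([fraud, normal]).sort_index()
    # (2) Fill missing numeric values with the mean of their column.
    df = df.fillna(df.mean(numeric_only=True))
    # (3) Delete rows whose amount lies outside the 1.5 * IQR fences (assumed rule).
    q1, q3 = df[amount_col].quantile(0.25), df[amount_col].quantile(0.75)
    iqr = q3 - q1
    keep = df[amount_col].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
    return df[keep].reset_index(drop=True)

raw = pd.DataFrame({
    "amount": [20.0, 20.0, 35.0, 45.0, 50.0, 10000.0, 40.0],
    "balance": [500.0, 500.0, None, 700.0, 900.0, 800.0, 600.0],
    "is_fraud": [0, 0, 0, 0, 1, 0, 0],
})
cleaned = clean_transactions(raw)
```

In this toy data the duplicated second row and the 10000.0 outlier are removed, and the missing balance is replaced by the column mean.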
3. Feature combination and generation
The feature combination and generation are based on the existing features to derive new features, and different features are logically combined to form higher-level feature expression, so that richer information can be provided. Such a combination may be implemented in various ways, such as addition, multiplication, subtraction, division, logical operations, etc. Feature combinations can have multiple levels, ranging from simple two feature additions, multiplications, to more complex multiple feature combinations to form new features. In the present application, the following are some specific feature examples that may be used for feature combination and generation:
(1) Transaction amount to account balance ratio: the transaction amount feature is divided by the account balance feature to obtain a proportion feature representing the transaction amount relative to the account balance. This feature may reflect the relative size of the transaction, helping to capture abnormal transactions.
(2) Transaction timestamp extraction hours and minutes: the hour and minute information is extracted from the transaction timestamp, generating two new features. These temporal features may help the model discover patterns and trends of transactions over different time periods.
(3) Combination of time features: the transaction time stamp is divided into different time units of year, month, day, hour and the like, and the time units are combined. For example, month and hour are combined to obtain a new feature that represents the time of occurrence of the transaction. This feature may help the model discover unusual patterns in transaction time.
(4) Combination of geographic location features: combining the latitude and longitude features of the transaction location may generate a new feature that represents the specific location. This feature may help identify abnormal transactions in geographic locations.
(5) Historical transaction statistics generation: based on the customer's historical transaction records, statistics such as the average amount per transaction and the variance of recent transaction amounts are generated. These features may provide information on the customer's transaction habits and stability, which helps identify abnormal transactions that do not conform to the customer's usual transaction pattern.
(6) Transaction amount to transaction count ratio: the transaction amount feature is divided by the transaction count feature to obtain a feature representing the average transaction amount. This feature may reflect the average size of the transactions, helping to determine whether there are abnormally large transactions.
(7) Combination of transaction amount and transaction type: the transaction amount characteristic is combined with the transaction type characteristic, such as multiplying the transaction amount by the transaction type, to generate a new characteristic. This feature may help the model distinguish between monetary patterns under different transaction types.
(8) Combination of transaction amount and account type: the transaction amount feature is combined with the account type feature. For example, the transaction amount is divided by the account type to generate a proportional feature that indicates the transaction amount relative to the account type. This feature may help the model discover abnormal transactions under different account types.
(9) Generating recent transaction frequency characteristics: a characteristic representing the frequency of transactions is generated based on the transaction records of the customer over the last period of time. For example, the number of transactions in the last week of the customer or the average number of transactions per day may be counted. This feature may help the model discover unusual high frequency transaction behavior.
(10) Feature combinations for specific transaction types: for a particular transaction type, the relevant features may be combined to extract more discriminative features. For example, for credit card cash out transactions, features such as cash out amount to account balance ratio, cash out time to last cash out time difference, etc. may be combined to better distinguish cash out behavior.
(11) Combination of time interval features: the time interval features between adjacent transactions are combined. For example, the time interval between the previous transaction and the subsequent transaction may be multiplied by the amount of the current transaction to generate a new feature. This feature may capture temporal patterns and interval regularities between transactions.
In summary, feature combination and generation enriches the feature representation capability of the model by logically combining existing features or generating new features. Through reasonable selection and design of the feature combination and generation modes, more discriminative features can be extracted, and the credit card fraud detection capability of the model is enhanced. In the present application, the feature combination and generation method can be adjusted and optimized according to the data characteristics and domain knowledge so as to improve the performance and generalization capability of the model.
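As a non-limiting illustration, items (1), (2), (3) and (5) above can be sketched using only the Python standard library. The dictionary keys, the helper name `basic_features`, and the use of the population variance for the historical amounts are assumptions made for this example.

```python
from datetime import datetime
from statistics import mean, pvariance

def basic_features(txn, past_amounts):
    """Items (1), (2), (3) and (5) for one transaction; all field names are hypothetical."""
    ts = txn["time"]
    return {
        # (1) Transaction amount relative to account balance (assumes a non-zero balance).
        "amount_to_balance": txn["amount"] / txn["balance"],
        # (2) Hour and minute extracted from the transaction timestamp.
        "hour": ts.hour,
        "minute": ts.minute,
        # (3) Month and hour combined into one feature.
        "month_hour": (ts.month, ts.hour),
        # (5) Statistics of the customer's historical transaction amounts.
        "hist_mean": mean(past_amounts),
        "hist_var": pvariance(past_amounts),
    }

txn = {"amount": 250.0, "balance": 1000.0, "time": datetime(2023, 6, 1, 14, 30)}
features = basic_features(txn, past_amounts=[100.0, 200.0, 300.0])
```

Categorical combinations such as `month_hour` would need to be encoded numerically before being fed to the network.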
4. Feature engineering
The feature engineering refers to performing operations such as feature extraction, feature selection, feature conversion and the like on data so as to improve the performance of a training model. In credit card abnormal transaction identification, the purpose of feature engineering is to extract features related to fraudulent transactions so that the model can better identify fraudulent transactions. Common characteristics include transaction time, transaction amount, transaction location, merchant type, etc. The goal of feature engineering is to improve the accuracy and reliability of the model by selecting the appropriate features.
(1) Feature extraction
Feature extraction is the process of extracting and converting useful information in the raw data into a representation of the feature. In credit card abnormal transaction identification, due to the high dimension and complexity of data, feature extraction is required to be performed on the original data, and important features capable of describing transaction data are extracted so as to facilitate subsequent modeling and prediction.
The feature extraction method used here is principal component analysis (Principal Component Analysis, PCA), which converts the raw data into a new set of principal components, thereby reducing the number of features and the dimensionality of the data. In credit card abnormal transaction identification, PCA may be used to extract the features that are important for describing the transaction data, for subsequent modeling and prediction. The implementation steps are as follows:
1) In Python, PCA can be implemented using the PCA class in the sklearn library. The main parameters of the PCA class include n_components, whiten, and the like, where n_components indicates the number of principal components retained and whiten indicates whether whitening is performed.
2) After the PCA is completed, the data projected onto the principal component may be visually presented. In Python, the data can be visualized using a matplotlib library to facilitate the observation of the relationships between different features. In the visual map, points with different colors represent different categories, so that the correlation strength between different features can be judged.
(2) Feature selection
Feature selection refers to selecting, from among all features, the features that contribute to the model's predictions and removing useless or redundant features. In credit card abnormal transaction identification, feature selection plays an important role in improving the accuracy and robustness of the model. The present method uses analysis of variance, a common feature selection method that determines the contribution of each feature to the model by calculating the variance values of the different features used in abnormal transaction identification. The detailed steps of the analysis of variance are as follows:
1) Calculating variance values
First, the variance values of the different features used in abnormal transaction identification need to be calculated. Assuming that a certain feature is x and its variance value is Var(x), the following formula can be used for calculation:
$\mathrm{Var}(x) = \dfrac{1}{n} \sum_{i=1}^{n} \left(x_i - \bar{x}\right)^2$
where $n$ represents the number of samples, $x_i$ represents the value of x for sample i, and $\bar{x}$ represents the average value of x.
2) Selecting features with larger variances
After calculating the variance values, the features with larger variance values can be selected as the input variables of the model. A feature with a larger variance value varies more between different samples and therefore has stronger discriminative power and distinctiveness.
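Steps 1) and 2) can be sketched as follows. This is an illustrative sketch only: it assumes the population form of the variance (division by n), and the feature names, values, and threshold are hypothetical.

```python
def variance(values):
    """Population variance: mean squared deviation from the mean."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / n


def select_by_variance(columns, threshold):
    """Return the names of features whose variance exceeds `threshold`."""
    return [name for name, values in columns.items() if variance(values) > threshold]


# Hypothetical toy features: `currency_flag` is constant and carries no information.
columns = {
    "amount": [12.0, 250.0, 8.5, 990.0],
    "hour": [9.0, 23.0, 14.0, 3.0],
    "currency_flag": [1.0, 1.0, 1.0, 1.0],
}
selected = select_by_variance(columns, threshold=0.0)
print(selected)  # ['amount', 'hour']
```

Because the variance depends on the unit of each feature, features would normally be brought to a comparable scale before their variances are compared; that preprocessing is assumed here rather than shown.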
3) Analyzing correlation between features
After selecting the features with larger variance values, the correlation between the features needs to be analyzed. If strongly correlated features exist, the training speed and accuracy of the model can be improved by reducing the number of redundant features. In the present application, the correlation between features may be analyzed by means of a correlation coefficient matrix.
The correlation coefficient matrix is a matrix that shows the correlation between different features. For n features it has a size of n×n, where the element in the i-th row and j-th column is the correlation coefficient between the i-th feature and the j-th feature. In Python, the correlation coefficient matrix may be calculated using the corr() method of the pandas library and then visualized, for example as a heat map, to make the correlation between different features easier to inspect. A heat map may be plotted using the heatmap() method of the seaborn library. In the heat map, the closer the correlation coefficient is to 1, the stronger the positive correlation between the two features; the closer it is to -1, the stronger the negative correlation; and the closer it is to 0, the weaker the linear correlation. The strength of the correlation between different features can be judged from the colors and values in the heat map, and the input variables of the model can be selected accordingly.
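The correlation analysis may be sketched with pandas as follows. The column names, the toy values, and the 0.95 cutoff are assumptions made for illustration; the rule of dropping the later feature of each strongly correlated pair is one possible way to reduce redundancy, not necessarily the one used in the application.

```python
import pandas as pd

# Hypothetical toy data: `amount_cents` duplicates `amount`, `hour` is unrelated.
df = pd.DataFrame({
    "amount": [10.0, 20.0, 30.0, 40.0, 50.0],
    "amount_cents": [1000.0, 2000.0, 3000.0, 4000.0, 5000.0],
    "hour": [3.0, 22.0, 9.0, 15.0, 9.0],
})

corr = df.corr()  # n x n matrix of Pearson correlation coefficients

# Drop the later feature of every pair whose absolute correlation exceeds the cutoff.
cutoff = 0.95  # assumed threshold
to_drop = set()
cols = list(corr.columns)
for i, a in enumerate(cols):
    for b in cols[i + 1:]:
        if a not in to_drop and abs(corr.loc[a, b]) > cutoff:
            to_drop.add(b)

kept = [c for c in cols if c not in to_drop]
print(kept)  # ['amount', 'hour']
```

For the visual check described above, the same matrix can be passed to seaborn, for example seaborn.heatmap(corr, annot=True).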
(3) Feature transformation
Feature transformation is the process of transforming raw data into features that can be used for model training. In the present application, feature transformation refers to transforming the original features in the credit card transaction data (e.g., transaction time and transaction amount) into a feature representation that can be used for convolutional neural network training. New features derived from existing features may be transformed directly, without feature extraction and feature selection; alternatively, the derived new features may undergo extraction and selection together with the existing features.
5. Convolutional neural network model construction
Model construction is a core step in building the convolutional neural network. Its purpose is to design a suitable network structure and parameters so that the network can identify abnormal credit card transactions efficiently. In the present application, the model is constructed with a multi-layer convolutional neural network structure, as follows:
A convolutional neural network (CNN) is a feedforward neural network characterized by local connections, weight sharing, and pooling, and it can effectively process two-dimensional data such as images. When designing the network structure, parameters such as the convolution kernel size, stride, and pooling kernel size need to be selected according to the actual situation.
(1) Input layer: this layer accepts the raw data, i.e., the credit card transaction data. Each transaction is represented as a one-dimensional vector whose length is the number of features of the input data. Assuming there are n features and m pieces of data, the input data is denoted as X and has a size of m×n.
(2) Convolution layer: the convolution layer is the core layer of the convolutional neural network; its purpose is to extract useful feature information from the input layer. In the present application, multiple convolution layers with convolution kernels of different sizes are used to capture local feature information in the transaction data. The specific formula is as follows:
$$h_i = \mathrm{relu}\left(\sum_{j=0}^{k-1} \omega_j x_{i+j} + b\right)$$
where $h_i$ represents the i-th output result of the convolution, $\omega_j$ represents the j-th weight of the convolution kernel, $x_{i+j}$ represents the (i+j)-th feature value of the input layer, b represents the bias term, and k represents the size of the convolution kernel. relu is the activation function, defined as follows:
$$\mathrm{relu}(x) = \max(0, x)$$
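The convolution step described above can be sketched in plain Python as follows. The sketch assumes a one-dimensional convolution with stride 1 and no padding, with the kernel index starting at 0; the feature vector, kernel weights, and bias are hypothetical values.

```python
def relu(x):
    """Rectified linear unit: max(0, x)."""
    return max(0.0, x)


def conv1d_relu(x, weights, bias):
    """Valid 1-D convolution (stride 1) followed by ReLU.

    Output position i is relu(sum_j weights[j] * x[i + j] + bias).
    """
    k = len(weights)
    return [
        relu(sum(weights[j] * x[i + j] for j in range(k)) + bias)
        for i in range(len(x) - k + 1)
    ]


# Hypothetical feature vector of one transaction and a size-3 kernel.
features = [1.0, 2.0, -1.0, 0.5, 3.0]
kernel = [0.5, -1.0, 0.25]
h = conv1d_relu(features, kernel, bias=0.1)
print(h)  # three outputs; negative pre-activations are clipped to 0.0
```

With five input values and a kernel of size 3, the layer produces three outputs, one for each position at which the kernel fits entirely inside the input.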
(3) Pooling layer: the pooling layer downsamples and compresses the features output by the convolution layer so as to reduce the number of parameters and the amount of computation of the model. In the present application, the pooling layer uses max pooling to extract the most salient feature information output by the convolution layer. The specific formula is as follows:
$$y_i = \max\left(h_{i \cdot s \,:\, i \cdot s + f}\right)$$
where $y_i$ represents the i-th output result of the pooling layer, $h_{i \cdot s \,:\, i \cdot s + f}$ represents the outputs of the convolution layer covered by the i-th pooling window, s represents the stride of the pooling layer, and f represents the size of the pooling window.
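Max pooling with stride s and window size f can be sketched as follows; the input values are hypothetical, and windows that would run past the end of the input are simply dropped, which is an assumption of this sketch.

```python
def max_pool1d(h, size, stride):
    """Max pooling: output i is the maximum of h[i*stride : i*stride + size]."""
    n_out = (len(h) - size) // stride + 1
    return [max(h[i * stride : i * stride + size]) for i in range(n_out)]


# Hypothetical convolution-layer outputs.
h = [0.0, 2.2, 0.0, 1.5, 0.3, 4.0]
y = max_pool1d(h, size=2, stride=2)
print(y)  # [2.2, 1.5, 4.0]
```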
(4) Fully connected layer: the fully connected layer fuses the features extracted by the convolution layer and the pooling layer and converts the feature information output by the pooling layer into a classification result. In the present application, multiple fully connected layers are used to capture feature information at different levels. The specific formula is as follows:
$$y_i = \sum_{j=1}^{n} \omega_j x_j + b$$
where $y_i$ represents the output result of the fully connected layer, $\omega_j$ represents the weight parameters, $x_j$ represents the output result of the previous layer, b represents the bias term, and n represents the number of neurons in the fully connected layer.
(5) Output layer: the output layer maps the output of the fully connected layer into the [0,1] interval using the Sigmoid function, representing the probability that the sample is determined to be fraudulent:
$$P = \frac{1}{1 + e^{-\left(\sum_{j=1}^{m} \omega_j x_j + b\right)}}$$
where P represents the probability that the output result is a fraudulent transaction, $\omega_j$ represents the weight parameters, $x_j$ represents the output result of the previous layer, b represents the bias term, and m represents the number of neurons in the output layer.
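The fully connected layer and the Sigmoid output can be sketched together as follows. All weights, biases, and inputs are arbitrary illustrative values rather than trained parameters, and the use of ReLU on the hidden layer is an assumption.

```python
import math


def dense(x, weights, biases):
    """Fully connected layer: one weighted sum plus bias per output neuron."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]


def sigmoid(z):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical pooled features from the previous layer.
pooled = [2.2, 1.5, 4.0]

# Hidden fully connected layer with two neurons, followed by ReLU.
hidden = [
    max(0.0, v)
    for v in dense(pooled, weights=[[0.1, -0.2, 0.05], [0.0, 0.3, -0.1]], biases=[0.0, 0.1])
]

# Output neuron: weighted sum mapped to a fraud probability by the Sigmoid function.
p_fraud = sigmoid(dense(hidden, weights=[[1.0, -1.0]], biases=[0.0])[0])
print(round(p_fraud, 4))
```

A transaction would then be flagged as abnormal when p_fraud exceeds a decision threshold; the threshold itself is not specified in the text and would have to be chosen separately.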
(6) Convolutional neural network model improvement-residual connection
In conventional convolutional neural network structures, the output of each convolutional layer passes through a nonlinear activation function, such as the ReLU function. Such an activation function may unnecessarily compress and filter the information in the convolutional layer, resulting in loss of information. Residual connections can effectively alleviate this problem.
Specifically, a residual connection directly connects adjacent convolution layers in the network and passes the input signal straight to a subsequent layer, so as to improve the flow of information between network layers. In a residual connection, the residual value produced by the convolution layers is added to their input, and the sum is passed on as the output to the next layer in the network. The residual connection is expressed as:
Output=Input+Residual;
where Input represents the input of the network, Output represents the output of the network, and Residual represents the residual value between the convolutional layer output and input. The implementation of the residual connection is described in further detail in the following steps:
1) Defining residual blocks
A residual block is a basic unit for implementing a residual connection, which consists of two or more convolutional layers and one or more residual connections that span the convolutional layers. Typically, the input and output of each residual block have the same dimensions. Specifically, the formula of one residual block is as follows:
Output=Input+Conv2D(ReLU(Conv2D(Input)));
where Input represents the input, Output represents the output, Conv2D represents a two-dimensional convolution layer, and ReLU represents the ReLU activation function.
2) Defining a residual network
The network architecture using residual connections is called residual network (ResNet). In defining the residual network, a plurality of residual blocks need to be stacked in order to construct a deep convolutional neural network.
3) Calculating residual errors
For a convolutional layer, assuming that its input is x and its output is y, its residual value R can be expressed as:
R=y-x;
where y represents the output of the convolutional layer and x represents the input of the convolutional layer. When the input and output of the convolution layer have the same dimension, x and y can be directly added to obtain the output of the residual block, namely:
Output=x+y;
otherwise, the dimension of x needs to be increased or decreased so that it has the same dimension as y. In ResNet, a 1x1 convolution operation is typically used to match the dimensions.
4) Adding the residual value to the input
The residual value R is added to the input x of the convolution layer to obtain the output y, namely:
y=x+R;
where R represents the residual value, which is produced by the convolution operation F(x). Adding the residual value R to the input x of the convolution layer preserves important information in the input, while passing the residual value on improves the flow of information between network layers.
In summary, the residual connection is an effective network structure improvement method, and training and optimization of the deep convolutional neural network can be achieved by designing a residual block structure, calculating a residual value and adding the residual value to the input. Through residual connection, the problems of gradient disappearance, gradient explosion and the like in the deep convolutional neural network training process can be relieved, and the performance and generalization capability of the model are improved.
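A residual block of the form Output = Input + Conv(ReLU(Conv(Input))) can be sketched as follows. For brevity the sketch uses one-dimensional convolutions with zero padding instead of Conv2D, which is an assumption made because each transaction is represented as a vector; the kernel weights are hypothetical and an odd kernel size is assumed so that input and output have the same length.

```python
def conv1d_same(x, weights, bias):
    """1-D convolution with zero padding so the output length equals the input length.

    Assumes an odd kernel size.
    """
    k = len(weights)
    pad = k // 2
    padded = [0.0] * pad + list(x) + [0.0] * pad
    return [sum(weights[j] * padded[i + j] for j in range(k)) + bias for i in range(len(x))]


def residual_block(x, w1, b1, w2, b2):
    """Output = Input + Conv(ReLU(Conv(Input))), using 1-D convolutions."""
    hidden = [max(0.0, v) for v in conv1d_same(x, w1, b1)]
    residual = conv1d_same(hidden, w2, b2)
    return [xi + ri for xi, ri in zip(x, residual)]


x = [1.0, -2.0, 0.5, 3.0]

# With all-zero weights the residual is zero and the block passes its input through unchanged.
identity_out = residual_block(x, [0.0, 0.0, 0.0], 0.0, [0.0, 0.0, 0.0], 0.0)

# With non-zero (hypothetical) weights the block adds a learned correction to the input.
out = residual_block(x, [0.2, 0.5, 0.2], 0.0, [0.1, -0.3, 0.1], 0.0)
print(identity_out, out)
```

The zero-weight case illustrates why the skip connection helps deep networks: a block can always fall back to the identity mapping, so stacking more blocks does not have to degrade the signal.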
6. Convolutional neural network model training
The training process of the convolutional neural network mainly comprises two processes: forward propagation and back propagation. Forward propagation is used to calculate the prediction result of the model, and back propagation is used to calculate the gradient of the loss function and update the parameters. Specifically, the training steps are as follows:
(1) Initializing parameters
Before training begins, parameters in the convolutional neural network need to be initialized. The parameters such as convolution kernel weight, full connection layer weight, bias and the like are initialized by adopting a random initialization method.
(2) Forward propagation
For an input sample x, the forward propagation process computes from the input layer to the output layer according to the network structure and obtains the prediction result $\hat{y}$ of the model. The specific process is as follows:
1) Input layer: the sample x is converted into an input tensor and fed into the convolution layer.
2) Convolution layer: feature information in x is extracted by the convolution operation, and an activation function applies nonlinear processing to the convolution result.
3) Pooling layer: the convolution result is downsampled and compressed to reduce the number of model parameters and the amount of computation.
4) Fully connected layer: the pooling result is flattened and fed into the fully connected layer for linear transformation and nonlinear processing.
5) Output layer: the output of the fully connected layer is converted into a probability value using a Softmax function, representing the prediction result $\hat{y}$.
(3) Calculating a loss function
The loss function measures the difference between the predicted value and the true value in the neural network. In the present application, a cross entropy loss function is used, which can effectively measure the difference between the predicted value and the true value in a binary classification problem. For a sample x, the loss function is used to calculate the difference between the prediction result $\hat{y}$ and the true label y. The specific formula is as follows:
$$L = -\sum_{i=1}^{n} y_i \log\left(\hat{y}_i\right)$$
where n represents the number of neurons in the output layer, $y_i$ represents the i-th element of the true label, and $\hat{y}_i$ represents the i-th element of the prediction result.
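The cross entropy loss for one sample can be sketched as follows, assuming a one-hot true label and a vector of predicted probabilities. The small constant eps is an assumption added to avoid taking the logarithm of zero.

```python
import math


def cross_entropy(y_true, y_pred, eps=1e-12):
    """Cross entropy between a one-hot label and predicted probabilities."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))


# True class is the second one (for example, "abnormal").
loss_good = cross_entropy([0.0, 1.0], [0.1, 0.9])  # confident and correct
loss_bad = cross_entropy([0.0, 1.0], [0.9, 0.1])   # confident and wrong
print(loss_good, loss_bad)
```

The loss is small when the predicted probability of the true class is close to 1 and grows quickly as that probability approaches 0.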
(4) Back propagation
Back propagation is used to calculate the gradient of the loss function with respect to the model parameters to facilitate parameter updates using a gradient descent algorithm. The specific process is as follows:
1) Output layer: the gradient of the output layer is calculated and passed to the fully connected layer.
2) Fully connected layer: the gradient of the fully connected layer is calculated and passed to the pooling layer.
3) Pooling layer: the gradient of the pooling layer is calculated and passed to the convolution layer.
4) Convolution layer: the gradient of the convolution layer is calculated and passed to the input layer.
(5) Parameter update
The gradients calculated by back propagation are used to update the parameters in the model with a gradient descent algorithm so as to minimize the loss function. The specific process is as follows:
1) Calculating the gradient: the gradient of the loss function with respect to each parameter is obtained from back propagation. The stochastic gradient descent method is commonly employed.
2) Parameter updating: the parameters in the model are updated according to the gradient information, the learning rate, and other settings. The specific formula is as follows:
$$W^{(t+1)} = W^{(t)} - \alpha \frac{\partial L}{\partial W^{(t)}}$$
where $W^{(t)}$ represents the parameter value at the t-th iteration, L represents the loss function, $\frac{\partial L}{\partial W^{(t)}}$ represents the gradient of the loss function with respect to the parameter W, and α represents the learning rate.
(6) Repeating the steps of forward propagation, backward propagation, parameter updating and the like until the model converges or reaches the preset iteration times.
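The loop of forward propagation, back propagation, and parameter updating can be sketched on a deliberately small model. The sketch below trains a single Sigmoid neuron rather than a full convolutional network, which is a simplification; the data, learning rate, tolerance, and iteration limit are all hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(samples, labels, lr=0.5, max_iters=1000, tol=0.05):
    """Gradient descent on a single-neuron (logistic) model.

    Stops when the mean loss drops below `tol` or after `max_iters` iterations.
    Returns the weights, the bias, and the loss from the last forward pass.
    """
    w = [0.0] * len(samples[0])
    b = 0.0
    loss = float("inf")
    for _ in range(max_iters):
        # Forward propagation: predictions and mean binary cross entropy loss.
        preds = [sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) for x in samples]
        loss = -sum(
            y * math.log(p) + (1 - y) * math.log(1 - p) for y, p in zip(labels, preds)
        ) / len(samples)
        if loss < tol:
            break
        # Back propagation: for this model the gradient of the loss with respect
        # to the pre-activation of each sample is (p - y).
        grad_w = [
            sum((p - y) * x[j] for x, y, p in zip(samples, labels, preds)) / len(samples)
            for j in range(len(w))
        ]
        grad_b = sum(p - y for y, p in zip(labels, preds)) / len(samples)
        # Parameter update: W <- W - lr * gradient.
        w = [wi - lr * gi for wi, gi in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b, loss


# Hypothetical one-feature data: larger values are labeled abnormal (1).
samples = [[0.1], [0.2], [0.8], [0.9]]
labels = [0, 0, 1, 1]
w, b, final_loss = train(samples, labels)
print(w, b, final_loss)
```

The two stopping conditions correspond to the model converging (loss below a tolerance) or the preset number of iterations being reached.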
7. Model evaluation
Model evaluation typically uses some metrics to measure the performance of the model, such as accuracy, recall, precision, and F1 score. In this method we use the accuracy and F1 score to evaluate the performance of the model.
(1) Accuracy
The accuracy refers to the proportion of samples that the model predicts correctly, and the calculation formula is as follows:
$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
where TP represents true positives (the number of samples the model correctly predicts as positive), TN represents true negatives (the number of samples the model correctly predicts as negative), FP represents false positives (the number of samples the model incorrectly predicts as positive), and FN represents false negatives (the number of samples the model incorrectly predicts as negative).
(2) F1 score
The F1 score is the harmonic mean of precision and recall and measures the balance between the two. The calculation formula is as follows:
$$F1 = \frac{2 \times \mathrm{precision} \times \mathrm{recall}}{\mathrm{precision} + \mathrm{recall}}$$
where precision is the proportion of true positives among the samples predicted as positive by the model, and the calculation formula is as follows:
$$\mathrm{precision} = \frac{TP}{TP + FP}$$
and recall is the proportion of actual positive samples that the model correctly predicts as positive, and the calculation formula is as follows:
$$\mathrm{recall} = \frac{TP}{TP + FN}$$
During model evaluation, we split the dataset into a training set, a validation set, and a test set. The training set is used for training the model, and the validation set is used for tuning the hyperparameters of the model and selecting the optimal model. Finally, we use the test set to evaluate the performance of the model on new data.
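The evaluation metrics can be sketched as follows, assuming the standard definitions in terms of TP, TN, FP, and FN and binary labels where 1 marks an abnormal transaction. The labels and predictions are hypothetical, and returning 0.0 when a denominator is zero is a convention chosen for this sketch.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = abnormal)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 score from the confusion counts."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical test-set labels and model predictions.
y_true = [1, 0, 0, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 1, 0, 0, 0, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Because abnormal transactions are usually rare, accuracy alone can look high even for a model that misses most of them, which is one reason the F1 score is reported alongside it.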
8. Model deployment and application
After model training and evaluation are completed, the trained model needs to be exported. In general, the model may be exported in a common model format, such as a Keras or TensorFlow format, for subsequent deployment and application. The exported model is then deployed to an environment such as a server, a cloud platform, or a mobile terminal so as to make real-time predictions on new data. Common deployment modes include Web services, Docker containers, and Kubernetes clusters. In the present application, suitable model deployment and application modes can be selected according to the actual situation, and the parameters can be tuned and optimized to improve the accuracy and robustness of the model. Attention should also be paid to the performance and security of the model to ensure its practicability and reliability.
The technical scheme of the application provides a credit card abnormal transaction identification method based on a convolutional neural network, which has the following beneficial effects compared with the traditional machine learning method and rule engine method:
(1) The convolutional neural network can efficiently extract the characteristic information in the transaction data, and more accurately analyze and judge the transaction data, so that the accuracy of abnormal transaction identification is improved.
(2) The technical scheme of the application can rapidly analyze and judge the credit card transaction, thereby greatly reducing the waiting time of the user and improving the user experience.
(3) The technical scheme of the application adopts the deep learning technology, and can carry out comprehensive feature extraction and analysis on transaction data, thereby identifying more fraudulent transactions and enhancing the security of a credit card system.
(4) The technical scheme of the application adopts an automatic machine learning method to realize the automatic processing of credit card abnormal transaction identification, thereby reducing the operation cost and risk.
In summary, the technical scheme of the application has higher practicability and economic benefit, can improve the safety and user experience of the credit card system, and provides a new abnormal transaction identification solution for the credit card industry.
Fig. 6 is a schematic structural diagram of a credit card abnormal transaction identification device according to an embodiment of the application. As shown in Fig. 6, the credit card abnormal transaction identification device according to the embodiment of the application includes:
an extraction module 21, configured to extract valid information from credit card transaction information, the credit card transaction information including transaction amount, transaction time, transaction location, transaction type, account balance, cardholder information, and/or merchant information;
a generating module 22, configured to generate transaction characteristic information of the credit card transaction information according to the credit card transaction information;
a prediction module 23, configured to input the effective information and the transaction characteristic information into a pre-trained convolutional neural network model to obtain an abnormal transaction prediction result output by the convolutional neural network model, where the convolutional neural network model is obtained by training according to historical credit card transaction information and transaction characteristic information of the historical credit card transaction information.
Compared with the traditional machine learning method and rule engine method, the credit card abnormal transaction identification device provided by the application has the advantages that the convolutional neural network can efficiently extract the characteristic information in the transaction information, and the transaction information is analyzed and judged more accurately, so that the accuracy of abnormal transaction identification is improved.
In some embodiments, the generating module is specifically configured to:
calculating the proportion of the transaction amount relative to the account balance according to the transaction amount and the account balance in the credit card transaction information; and/or
extracting first time characteristic information from transaction time in the credit card transaction information according to a preset time dimension; and/or
acquiring transaction interval rule characteristic information associated with the credit card transaction information according to the credit card transaction information and the historical transaction information of the credit card; and/or
generating transaction statistics characteristic information associated with the credit card transaction information according to the credit card transaction information and the historical transaction information of the credit card; and/or
generating first combination characteristic information according to the transaction amount and the transaction type in the credit card transaction information; and/or
generating second combination characteristic information according to the transaction amount in the credit card transaction information and the account type of the credit card; and/or
determining the transaction frequency characteristic information associated with the credit card transaction information according to the number of transactions of the credit card within a preset time length.
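Several of the transaction characteristics listed above can be sketched as follows. The dictionary keys ('amount', 'balance', 'time'), the 24-hour window, and the specific features chosen (amount-to-balance proportion, hour and weekday as time features, transaction count in the window as frequency, and time since the previous transaction as an interval feature) are hypothetical illustrations, not definitions taken from the application.

```python
from datetime import datetime, timedelta


def derive_features(txn, history, window=timedelta(hours=24)):
    """Derive a few illustrative features for one transaction.

    `txn` and each item of `history` are dicts with hypothetical keys
    'amount', 'balance' and 'time' (a datetime).
    """
    # Earlier transactions of the same card that fall inside the time window.
    recent = [h for h in history if timedelta(0) <= txn["time"] - h["time"] <= window]
    # Time of the most recent earlier transaction, if any.
    last_time = max((h["time"] for h in history if h["time"] <= txn["time"]), default=None)
    return {
        "amount_to_balance": txn["amount"] / txn["balance"] if txn["balance"] else 0.0,
        "hour": txn["time"].hour,
        "weekday": txn["time"].weekday(),  # Monday is 0
        "txn_count_in_window": len(recent),
        "seconds_since_last": (
            (txn["time"] - last_time).total_seconds() if last_time is not None else None
        ),
    }


history = [
    {"amount": 40.0, "balance": 2500.0, "time": datetime(2023, 6, 18, 9, 0)},
    {"amount": 15.0, "balance": 2460.0, "time": datetime(2023, 6, 20, 13, 30)},
]
txn = {"amount": 500.0, "balance": 2000.0, "time": datetime(2023, 6, 20, 14, 30)}
feats = derive_features(txn, history)
print(feats)
```

The resulting dictionary would be converted to a numeric vector and concatenated with the effective information before being fed to the input layer of the model.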
In some embodiments, the convolutional neural network model comprises:
an input layer, configured to receive the effective information and the transaction characteristic information;
a convolution layer, configured to extract first characteristic information from the effective information and the transaction characteristic information;
a pooling layer, configured to extract second characteristic information from the local characteristic information output by the convolution layer;
a fully connected layer, configured to fuse the first characteristic information extracted by the convolution layer and the second characteristic information extracted by the pooling layer to generate fused characteristic information;
an output layer, configured to map the fused characteristic information output by the fully connected layer into the [0,1] interval by using an activation function, representing the probability that the credit card transaction is abnormal.
In some embodiments, the apparatus further comprises:
the analysis module is used for analyzing the credit card transaction information in the sample set by utilizing a principal component analysis method and determining effective information in the credit card transaction information;
the information extraction module is used for extracting effective information in each credit card transaction information in the training set;
the generating module is further configured to: generating transaction characteristic information of each piece of credit card transaction information in the training set;
the model training module is used for training a preset convolutional neural network model by utilizing the effective information in each piece of credit card transaction information in the training set, the transaction characteristic information of the piece of credit card transaction information and the transaction label of the piece of credit card transaction information to obtain a trained convolutional neural network model.
In some embodiments, the analysis module is specifically configured to:
analyzing credit card transaction information in a sample set by using a principal component analysis method, and determining important characteristic information in the credit card transaction information;
and analyzing the important characteristic information by using an analysis of variance method to obtain effective characteristic information in the important characteristic information.
In some embodiments, the analyzing module analyzes the important feature information by using an analysis of variance method, and obtaining valid information in the important feature information includes:
calculating a variance value of each piece of important characteristic information in the sample set, and determining the important characteristic information with the variance value meeting a preset condition as the characteristic information to be selected;
and screening effective characteristic information among the characteristic information to be selected according to the correlation among the characteristic information to be selected.
In some embodiments, the model training module is specifically configured to:
S1, inputting the effective information in credit card transaction information and the transaction characteristic information of the credit card transaction information into a convolutional neural network model to obtain an abnormal transaction prediction result output by the convolutional neural network model;
S2, calculating a prediction error of the neural network model according to the transaction label of the credit card transaction information and the prediction result by using a preset loss function;
S3, updating parameters of the neural network model according to the prediction error to obtain a neural network model with updated parameters;
S4, continuing to execute steps S1 to S3 until the prediction error of the neural network model is smaller than a first preset value or the number of iterations of steps S1 to S3 reaches a second preset value, so as to obtain a trained convolutional neural network model.
The embodiment of the apparatus provided in the embodiment of the present application may be specifically used to execute the above-mentioned processing flow applied to each method embodiment, and the functions thereof are not described herein again, and may refer to the detailed description of the above-mentioned method embodiment.
It should be noted that the method and the device for identifying abnormal credit card transactions provided by the embodiments of the application can be used in the financial field and also in any technical field other than the financial field; the application field of the method and the device is not limited.
Fig. 7 is a schematic diagram of the physical structure of an electronic device according to an embodiment of the present application. As shown in Fig. 7, the electronic device may include: a processor 301, a communication interface 302, a memory 303, and a communication bus 304, wherein the processor 301, the communication interface 302, and the memory 303 communicate with one another through the communication bus 304. The processor 301 may invoke logic instructions in the memory 303 to perform the method described in any of the embodiments above.
Further, the logic instructions in the memory 303 may be implemented in the form of software functional units and stored in a computer readable storage medium when sold or used as a stand alone product. Based on this understanding, the technical solution of the present application, in essence, or the part contributing to the prior art, or a part of the technical solution, may be embodied in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present application. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or other various media capable of storing program code.
The present embodiments disclose a computer program product comprising a computer program stored on a non-transitory computer readable storage medium, the computer program comprising program instructions which, when executed by a computer, are capable of performing the methods provided by the method embodiments described above.
The present embodiment provides a computer-readable storage medium storing a computer program that causes the computer to execute the methods provided by the above-described method embodiments.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In the description of the present specification, reference to the terms "one embodiment," "one particular embodiment," "some embodiments," "for example," "an example," "a particular example," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the application. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiments or examples. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
The foregoing description of the embodiments is provided to illustrate the general principles of the application and is not intended to limit the scope of the application or to restrict the application to the particular embodiments described. Any modifications, equivalents, improvements, and the like that fall within the spirit and principles of the application are intended to be included within the scope of the application.

Claims (10)

1. A method for identifying abnormal transactions on a credit card, comprising:
extracting effective information from credit card transaction information, the credit card transaction information including transaction amount, transaction time, transaction location, transaction type, account balance, cardholder information, and/or merchant information;
generating transaction characteristic information of the credit card transaction information according to the credit card transaction information;
inputting the effective information and the transaction characteristic information into a pre-trained convolutional neural network model to obtain an abnormal transaction prediction result output by the convolutional neural network model, wherein the convolutional neural network model is obtained by training according to historical credit card transaction information and transaction characteristic information of the historical credit card transaction information.
2. The method of claim 1, wherein generating transaction characteristic information for the credit card transaction information based on the credit card transaction information comprises:
calculating the proportion of the transaction amount relative to the account balance according to the transaction amount and the account balance in the credit card transaction information; and/or
extracting first time characteristic information from the transaction time in the credit card transaction information according to a preset time dimension; and/or
acquiring transaction interval pattern characteristic information associated with the credit card transaction information according to the credit card transaction information and the historical transaction information of the credit card; and/or
generating transaction statistics characteristic information associated with the credit card transaction information according to the credit card transaction information and the historical transaction information of the credit card; and/or
generating first combination characteristic information according to the transaction amount and the transaction type in the credit card transaction information; and/or
generating second combination characteristic information according to the transaction amount in the credit card transaction information and the account type of the credit card; and/or
determining transaction frequency characteristic information associated with the credit card transaction information according to the number of transactions of the credit card within a preset duration.
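By way of illustration only, several of the characteristic types listed in claim 2 could be derived as in the following minimal sketch. The dictionary keys (`amount`, `balance`, `time`, `type`), the 24-hour window, and the 1000-unit amount threshold are assumptions made for the example and do not appear in the application.

```python
from datetime import datetime, timedelta


def build_features(txn, history, window=timedelta(hours=24), high_amount=1000.0):
    """Derive claim-2 style features for one transaction (illustrative only)."""
    # Proportion of the transaction amount relative to the account balance.
    ratio = txn["amount"] / txn["balance"] if txn["balance"] > 0 else 0.0

    # Only transactions that happened earlier on the same card count as history.
    past = sorted((h for h in history if h["time"] < txn["time"]),
                  key=lambda h: h["time"])

    # Interval feature: seconds since the previous transaction, if there is one.
    gap = (txn["time"] - past[-1]["time"]).total_seconds() if past else None

    # Statistical feature: mean amount of the earlier transactions.
    mean_amount = sum(h["amount"] for h in past) / len(past) if past else 0.0

    # Frequency feature: earlier transactions inside the preset duration.
    freq = sum(1 for h in past if txn["time"] - h["time"] <= window)

    # Combination feature: transaction type crossed with an amount bucket.
    bucket = "high" if txn["amount"] >= high_amount else "low"

    return {
        "amount_to_balance": ratio,
        "hour": txn["time"].hour,           # time dimension: hour of day
        "weekday": txn["time"].weekday(),   # time dimension: Monday == 0
        "seconds_since_last": gap,
        "mean_past_amount": mean_amount,
        "count_in_window": freq,
        "type_x_amount": f"{txn['type']}:{bucket}",
    }


now = datetime(2023, 6, 21, 14, 30)
history = [
    {"amount": 50.0, "time": now - timedelta(hours=30)},
    {"amount": 150.0, "time": now - timedelta(hours=2)},
]
txn = {"amount": 1200.0, "balance": 4800.0, "time": now, "type": "purchase"}
features = build_features(txn, history)
```

In this example the transaction amount is one quarter of the balance, and only one of the two earlier transactions falls inside the 24-hour window.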
3. The method according to claim 1 or 2, wherein the convolutional neural network model comprises:
an input layer for receiving the effective information and the transaction characteristic information;
a convolution layer for extracting first characteristic information from the effective information and the transaction characteristic information;
a pooling layer for extracting second characteristic information from the local characteristic information output by the convolution layer;
a fully connected layer for fusing the first characteristic information extracted by the convolution layer and the second characteristic information extracted by the pooling layer to generate fused characteristic information;
an output layer for mapping, using an activation function, the fused characteristic information output by the fully connected layer into the interval [0,1], the mapped value representing the probability that the credit card transaction is abnormal.
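By way of illustration only, the layer structure of claim 3 can be traced with a single forward pass. The sketch below assumes a one-dimensional input vector, one convolution kernel with ReLU, max pooling of width 2, and a sigmoid activation at the output; none of these specific choices are stated in the claim, and the weights are arbitrary example values rather than trained parameters.

```python
import math


def conv1d(x, kernel, bias=0.0):
    """Valid 1-D convolution followed by ReLU (the 'first' features)."""
    k = len(kernel)
    return [max(0.0, sum(x[i + j] * kernel[j] for j in range(k)) + bias)
            for i in range(len(x) - k + 1)]


def max_pool(x, size=2):
    """Non-overlapping max pooling over the convolution output (the 'second' features)."""
    return [max(x[i:i + size]) for i in range(0, len(x) - size + 1, size)]


def forward(x, kernel, fc_weights, fc_bias):
    """Input -> convolution -> pooling -> fully connected fusion -> sigmoid output."""
    first = conv1d(x, kernel)
    second = max_pool(first)
    fused_input = first + second  # the fully connected layer sees both feature sets
    if len(fc_weights) != len(fused_input):
        raise ValueError("fc_weights must match the fused feature length")
    z = sum(w * v for w, v in zip(fc_weights, fused_input)) + fc_bias
    return 1.0 / (1.0 + math.exp(-z))  # maps the fused score into [0, 1]


x = [0.25, 0.5, 1.0, 0.0, 0.75]   # effective information plus derived features
kernel = [1.0, -1.0]              # hypothetical kernel
probability = forward(x, kernel, fc_weights=[0.5] * 6, fc_bias=-1.0)
```

With these example weights the fused score is exactly zero, so the output probability is 0.5. A production model would typically use several kernels and learned weights.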
4. A method according to claim 3, characterized in that the method further comprises:
analyzing credit card transaction information in the sample set by using a principal component analysis method, and determining effective information in the credit card transaction information;
extracting effective information in each credit card transaction information in the training set;
generating transaction characteristic information of each piece of credit card transaction information in the training set;
training a preset convolutional neural network model using, for each piece of credit card transaction information in the training set, the effective information, the transaction characteristic information, and the transaction label of that piece of credit card transaction information, to obtain a trained convolutional neural network model.
5. The method of claim 4, wherein analyzing the credit card transaction information in the sample set using principal component analysis and determining effective information in the credit card transaction information comprises:
analyzing credit card transaction information in a sample set by using a principal component analysis method, and determining important characteristic information in the credit card transaction information;
and analyzing the important characteristic information by using an analysis of variance method to obtain effective characteristic information in the important characteristic information.
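By way of illustration only, one possible reading of the principal component analysis step in claim 5 is to rank the raw fields by the magnitude of their loading on the first principal component. The sketch below makes that assumption, approximates the component by power iteration, and uses invented sample values. The claim does not state how importance is derived from the analysis, and unscaled inputs let high-variance fields dominate.

```python
import math


def covariance_matrix(rows):
    """Sample covariance matrix of a list of equal-length feature rows."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    return [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
            for a in range(d)]


def first_component(cov, iters=200):
    """Approximate the leading eigenvector of cov by power iteration."""
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # every feature is constant, so there is nothing to rank
            break
        v = [x / norm for x in w]
    return v


def rank_by_loading(rows):
    """Feature indices ordered by absolute loading on the first component."""
    v = first_component(covariance_matrix(rows))
    return sorted(range(len(v)), key=lambda j: abs(v[j]), reverse=True)


# Hypothetical sample set: column 0 varies strongly, column 1 barely at all,
# and column 2 is a scaled copy of column 0.
rows = [
    [1.0, 0.0, 0.1],
    [2.0, 0.1, 0.2],
    [3.0, 0.0, 0.3],
    [4.0, 0.1, 0.4],
]
ranking = rank_by_loading(rows)
```

Here column 0 ranks first because it carries most of the variance. Standardizing each column before the analysis would change the ranking and is often preferable when fields use different units.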
6. The method of claim 5, wherein analyzing the important characteristic information by using an analysis of variance method to obtain effective characteristic information in the important characteristic information comprises:
calculating a variance value of each piece of important characteristic information in the sample set, and determining the important characteristic information with the variance value meeting a preset condition as the characteristic information to be selected;
screening the effective characteristic information from the characteristic information to be selected according to the correlation among the characteristic information to be selected.
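By way of illustration only, the two steps of claim 6 can be sketched as a variance filter followed by a correlation filter. The thresholds `min_var` and `max_corr`, the use of the Pearson coefficient, the rule that the first-listed field wins among highly correlated fields, and the sample columns are all assumptions for the example.

```python
import math


def variance(values):
    """Sample variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)


def pearson(a, b):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = math.sqrt(sum((x - ma) ** 2 for x in a))
    sb = math.sqrt(sum((y - mb) ** 2 for y in b))
    return cov / (sa * sb) if sa > 0 and sb > 0 else 0.0


def select_features(columns, min_var=0.01, max_corr=0.9):
    """Keep fields whose variance passes min_var, then drop near-duplicates."""
    # Step 1: the preset condition here is "variance at least min_var".
    candidates = [name for name, vals in columns.items()
                  if variance(vals) >= min_var]
    # Step 2: keep a candidate only if it is not strongly correlated
    # with any field that has already been kept.
    kept = []
    for name in candidates:
        if all(abs(pearson(columns[name], columns[k])) <= max_corr for k in kept):
            kept.append(name)
    return kept


columns = {
    "amount": [10.0, 200.0, 35.0, 80.0],
    "amount_cents": [1000.0, 20000.0, 3500.0, 8000.0],  # duplicate of amount
    "flag": [1.0, 1.0, 1.0, 1.0],                        # zero variance
    "hour": [9.0, 23.0, 14.0, 3.0],
}
selected = select_features(columns)
```

The constant field fails the variance condition, and the rescaled copy of the amount is dropped in the correlation step, leaving `amount` and `hour`.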
7. The method of claim 4, wherein training the preset convolutional neural network model using the effective information in each piece of credit card transaction information in the training set, the transaction characteristic information of the piece of credit card transaction information, and the transaction label of the piece of credit card transaction information to obtain the trained convolutional neural network model comprises:
S1, inputting the effective information in the credit card transaction information and the transaction characteristic information of the credit card transaction information into the convolutional neural network model to obtain an abnormal transaction prediction result output by the convolutional neural network model;
S2, calculating a prediction error of the convolutional neural network model according to the transaction label of the credit card transaction information and the prediction result by using a preset loss function;
S3, updating parameters of the convolutional neural network model according to the prediction error to obtain a convolutional neural network model with updated parameters;
S4, repeating steps S1 to S3 until the prediction error of the convolutional neural network model is smaller than a first preset value or the number of iterations of steps S1 to S3 reaches a second preset value, so as to obtain the trained convolutional neural network model.
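By way of illustration only, the control flow of steps S1 to S4 in claim 7 can be sketched as follows. To keep the example short, a single-layer logistic model stands in for the convolutional neural network, binary cross-entropy is assumed as the preset loss function, and plain gradient descent is assumed as the update rule; `max_error` and `max_iters` play the roles of the first and second preset values. None of these specific choices come from the application.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train(samples, labels, lr=0.5, max_error=0.3, max_iters=5000):
    """Run S1 to S3 repeatedly until one of the two S4 stopping conditions holds."""
    n, d = len(samples), len(samples[0])
    w, b = [0.0] * d, 0.0
    eps = 1e-12  # keeps log() away from zero
    for iteration in range(1, max_iters + 1):
        # S1: forward pass to obtain the abnormal-transaction predictions.
        preds = [sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
                 for x in samples]
        # S2: prediction error from labels and predictions (binary cross-entropy).
        loss = -sum(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
                    for p, y in zip(preds, labels)) / n
        # S3: update the parameters from the error gradient.
        for j in range(d):
            w[j] -= lr * sum((p - y) * x[j]
                             for p, y, x in zip(preds, labels, samples)) / n
        b -= lr * sum(p - y for p, y in zip(preds, labels)) / n
        # S4: stop once the error is below the first preset value; the loop
        # bound enforces the second preset value (maximum iterations).
        if loss < max_error:
            break
    return w, b, loss, iteration


samples = [[0.1], [0.2], [0.8], [0.9]]  # one hypothetical feature per transaction
labels = [0, 0, 1, 1]                   # 1 marks an abnormal transaction
w, b, loss, iterations = train(samples, labels)
```

The returned `loss` is the error measured in the last S2 step, so the parameters returned have received one further update after that measurement.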
8. A credit card abnormal transaction recognition apparatus, comprising:
an extraction module for extracting effective information from credit card transaction information, wherein the credit card transaction information comprises transaction amount, transaction time, transaction location, transaction type, account balance, cardholder information and/or merchant information;
a generation module for generating transaction characteristic information of the credit card transaction information according to the credit card transaction information;
a prediction module for inputting the effective information and the transaction characteristic information into a pre-trained convolutional neural network model to obtain an abnormal transaction prediction result output by the convolutional neural network model, wherein the convolutional neural network model is obtained through training according to historical credit card transaction information and transaction characteristic information of the historical credit card transaction information.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the steps of the method of any one of claims 1 to 7 when executing the computer program.
10. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method according to any one of claims 1 to 7.
CN202310745200.9A 2023-06-21 2023-06-21 Credit card abnormal transaction identification method and device Pending CN116703568A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310745200.9A CN116703568A (en) 2023-06-21 2023-06-21 Credit card abnormal transaction identification method and device

Publications (1)

Publication Number Publication Date
CN116703568A true CN116703568A (en) 2023-09-05

Family

ID=87823721

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310745200.9A Pending CN116703568A (en) 2023-06-21 2023-06-21 Credit card abnormal transaction identification method and device

Country Status (1)

Country Link
CN (1) CN116703568A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117455497A (en) * 2023-11-12 2024-01-26 北京营加品牌管理有限公司 Transaction risk detection method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination