CN114140238A - Abnormal transaction data identification method and device, computer equipment and storage medium - Google Patents

Publication number
CN114140238A
Authority
CN
China
Prior art keywords
transaction data
identification
abnormal
model
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111462122.9A
Other languages
Chinese (zh)
Inventor
欧阳春
韩锐吉
陈家隆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Industrial and Commercial Bank of China Ltd ICBC
Original Assignee
Industrial and Commercial Bank of China Ltd ICBC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Industrial and Commercial Bank of China Ltd ICBC
Priority to CN202111462122.9A
Publication of CN114140238A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/04 Trading; Exchange, e.g. stocks, commodities, derivatives or currency exchange
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Abstract

The application relates to an abnormal transaction data identification method, an abnormal transaction data identification device, computer equipment and a storage medium, which can be used in the financial field or other fields, and comprise the following steps: acquiring preprocessed initial transaction data, a first abnormal characteristic and a second abnormal characteristic; training a pre-constructed transaction data identification model according to the first abnormal characteristic to identify abnormal transaction data from the initial transaction data to obtain an updated transaction data identification model; training the updated transaction data identification model again according to the first error identification data and the second error identification data to obtain a target transaction data identification model; the first error identification data is transaction data which accords with the second abnormal characteristic and does not belong to the abnormal transaction data in the initial transaction data; and inputting the transaction data to be identified into the target transaction data identification model to obtain an identification result of the transaction data to be identified. By adopting the method, the effect of identifying abnormal transaction data is improved.

Description

Abnormal transaction data identification method and device, computer equipment and storage medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to a method, an apparatus, a computer device, a storage medium, and a computer program product for identifying abnormal transaction data.
Background
With the rapid development of the social economy, financial institutions generate hundreds of millions of basic transaction records every day. Normal transaction data without risk is generally called positive samples, and abnormal transaction data carrying risk is called negative samples.
In the related art, a model is usually trained on positive and negative samples so that it can identify abnormal transaction data. However, the proportion of positive and negative samples is seriously unbalanced, so effective negative sample features are lacking, model training suffers, and abnormal transaction data is identified poorly.
Disclosure of Invention
In view of the above, it is necessary to provide an abnormal transaction data identification method, apparatus, computer device, computer readable storage medium and computer program product for solving the above technical problems.
In a first aspect, the present application provides a method for identifying abnormal transaction data, the method comprising:
responding to the abnormal transaction data identification request, and acquiring preprocessed initial transaction data, a first abnormal feature and a second abnormal feature;
training a pre-constructed transaction data identification model to identify abnormal transaction data from the initial transaction data according to the first abnormal characteristic to obtain an updated transaction data identification model; the transaction data identification model is constructed on the basis of a difficult case mining algorithm;
training the updated transaction data recognition model again according to the first error recognition data and the second error recognition data to obtain a target transaction data recognition model; the first error identification data is transaction data which is in accordance with the second abnormal characteristic and does not belong to the abnormal transaction data in the initial transaction data, and the second error identification data is transaction data which is not in accordance with the second abnormal characteristic in the abnormal transaction data;
and inputting the transaction data to be identified into the target transaction data identification model to obtain an identification result of the transaction data to be identified.
In one embodiment, the training of the pre-constructed transaction data recognition model to recognize abnormal transaction data from the initial transaction data includes:
fusing the abnormal transaction data and the initial transaction data according to a preset fusion proportion to obtain updated initial transaction data;
inputting the updated initial transaction data into the transaction data identification model for identification processing to obtain an abnormal identification result;
and adjusting model parameters in the transaction data identification model according to the abnormal identification result, returning to execute the step of fusing the abnormal transaction data and the initial transaction data according to a preset fusion proportion until the transaction data identification model converges.
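The fusion step in this embodiment can be sketched as follows. This is only a minimal illustration: it assumes that the "preset fusion proportion" means the share of abnormal records in the fused set and that abnormal records are re-sampled with replacement. The function name `fuse` and the record fields are hypothetical and do not come from the application.

```python
import random

def fuse(abnormal, initial, ratio, seed=0):
    """Add abnormal records to the initial data so that the added
    abnormal records make up about `ratio` of the fused set.
    (Assumed interpretation of the preset fusion proportion.)"""
    if not 0 < ratio < 1:
        raise ValueError("ratio must be in (0, 1)")
    rng = random.Random(seed)
    # Number of abnormal records to add: n / (len(initial) + n) ~= ratio.
    n_abnormal = round(ratio * len(initial) / (1 - ratio))
    # Abnormal data is scarce, so sample with replacement.
    extra = [rng.choice(abnormal) for _ in range(n_abnormal)]
    fused = list(initial) + extra
    rng.shuffle(fused)
    return fused

# Hypothetical toy data: 90 normal records, 3 known abnormal records.
initial = [{"id": i, "label": 0} for i in range(90)]
abnormal = [{"id": 1000 + i, "label": 1} for i in range(3)]
fused = fuse(abnormal, initial, ratio=0.1)
```

In an iterative training loop, a call like this would be repeated before each identification pass, as the embodiment describes.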
In one embodiment, the adjusting the model parameters in the transaction data identification model according to the abnormal identification result includes:
calculating according to a preset loss function to obtain a first loss function value corresponding to the abnormal recognition result; the first loss function value is used for adjusting the weight of the transaction data identification model in the process of identifying abnormal transaction data;
adjusting model parameters in the transaction data identification model according to the first loss function value.
In one embodiment, said adjusting model parameters in said transaction data identification model according to said first loss function value comprises:
and adjusting a weight coefficient and a bias coefficient corresponding to the transaction data identification model according to the first loss function value until the first loss function value is smaller than a preset loss threshold value, and determining that the transaction data identification model reaches convergence.
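The convergence criterion in this embodiment (adjust the weight and bias coefficients until the loss falls below a preset threshold) can be sketched with a toy model. The single-feature logistic model, the plain cross-entropy loss, the learning rate and the threshold value below are all assumptions made for illustration; the application does not specify them.

```python
import math

def train_until_converged(samples, loss_threshold=0.2, lr=0.5, max_epochs=10000):
    """Gradient descent on one weight and one bias; stops as soon as the
    mean loss is smaller than the preset loss threshold."""
    w, b = 0.0, 0.0
    loss = float("inf")
    n = len(samples)
    for epoch in range(max_epochs):
        loss = grad_w = grad_b = 0.0
        for x, y in samples:  # y is 0 (normal) or 1 (abnormal)
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))
            loss += -(y * math.log(p) + (1 - y) * math.log(1 - p))
            grad_w += (p - y) * x
            grad_b += (p - y)
        loss /= n
        if loss < loss_threshold:
            return w, b, loss, epoch  # model is considered converged
        # Adjust the weight coefficient and the bias coefficient.
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b, loss, max_epochs

# Hypothetical one-dimensional samples (feature value, label).
w, b, final_loss, epochs = train_until_converged([(-2, 0), (-1, 0), (1, 1), (2, 1)])
```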
In one embodiment, the retraining the updated transaction data recognition model based on the first misrecognized data and the second misrecognized data comprises:
fusing the first error identification data, the second error identification data and the updated initial transaction data according to a preset fusion proportion to obtain target transaction data;
inputting the target transaction data into the transaction data identification model for identification processing to obtain a target identification result;
and adjusting model parameters in the updated transaction data identification model according to the target identification result, and returning to execute the step of fusing the first error identification data, the second error identification data and the updated initial transaction data according to a preset fusion proportion until the updated transaction data identification model converges.
In one embodiment, the adjusting the model parameters in the updated transaction data recognition model according to the target recognition result includes:
calculating according to a preset loss function to obtain a second loss function value corresponding to the target identification result; the second loss function value is used for adjusting the weight of the updated transaction data identification model in the process of identifying abnormal transaction data;
and adjusting model parameters in the updated transaction data identification model according to the second loss function value.
In a second aspect, the present application further provides an abnormal transaction data identification apparatus, including:
the request response module is used for responding to the abnormal transaction data identification request and acquiring the preprocessed initial transaction data, the first abnormal characteristic and the second abnormal characteristic;
the data identification module is used for training a pre-constructed transaction data identification model to identify abnormal transaction data from the initial transaction data according to the first abnormal characteristic to obtain an updated transaction data identification model; the transaction data identification model is constructed on the basis of a difficult case mining algorithm;
the model updating module is used for retraining the updated transaction data identification model according to the first error identification data and the second error identification data to obtain a target transaction data identification model; the first error identification data is transaction data which is in accordance with the second abnormal characteristic and does not belong to the abnormal transaction data in the initial transaction data, and the second error identification data is transaction data which is not in accordance with the second abnormal characteristic in the abnormal transaction data;
and the result determining module is used for inputting the transaction data to be identified into the target transaction data identification model to obtain the identification result of the transaction data to be identified.
In a third aspect, the present application also provides a computer device. The computer device comprises a memory storing a computer program and a processor implementing the following steps when executing the computer program:
responding to the abnormal transaction data identification request, and acquiring preprocessed initial transaction data, a first abnormal feature and a second abnormal feature;
training a pre-constructed transaction data identification model to identify abnormal transaction data from the initial transaction data according to the first abnormal characteristic to obtain an updated transaction data identification model; the transaction data identification model is constructed on the basis of a difficult case mining algorithm;
training the updated transaction data recognition model again according to the first error recognition data and the second error recognition data to obtain a target transaction data recognition model; the first error identification data is transaction data which is in accordance with the second abnormal characteristic and does not belong to the abnormal transaction data in the initial transaction data, and the second error identification data is transaction data which is not in accordance with the second abnormal characteristic in the abnormal transaction data;
and inputting the transaction data to be identified into the target transaction data identification model to obtain an identification result of the transaction data to be identified.
In a fourth aspect, the present application further provides a computer-readable storage medium. The computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of:
responding to the abnormal transaction data identification request, and acquiring preprocessed initial transaction data, a first abnormal feature and a second abnormal feature;
training a pre-constructed transaction data identification model to identify abnormal transaction data from the initial transaction data according to the first abnormal characteristic to obtain an updated transaction data identification model; the transaction data identification model is constructed on the basis of a difficult case mining algorithm;
training the updated transaction data recognition model again according to the first error recognition data and the second error recognition data to obtain a target transaction data recognition model; the first error identification data is transaction data which is in accordance with the second abnormal characteristic and does not belong to the abnormal transaction data in the initial transaction data, and the second error identification data is transaction data which is not in accordance with the second abnormal characteristic in the abnormal transaction data;
and inputting the transaction data to be identified into the target transaction data identification model to obtain an identification result of the transaction data to be identified.
In a fifth aspect, the present application further provides a computer program product. The computer program product comprising a computer program which when executed by a processor performs the steps of:
responding to the abnormal transaction data identification request, and acquiring preprocessed initial transaction data, a first abnormal feature and a second abnormal feature;
training a pre-constructed transaction data identification model to identify abnormal transaction data from the initial transaction data according to the first abnormal characteristic to obtain an updated transaction data identification model; the transaction data identification model is constructed on the basis of a difficult case mining algorithm;
training the updated transaction data recognition model again according to the first error recognition data and the second error recognition data to obtain a target transaction data recognition model; the first error identification data is transaction data which is in accordance with the second abnormal characteristic and does not belong to the abnormal transaction data in the initial transaction data, and the second error identification data is transaction data which is not in accordance with the second abnormal characteristic in the abnormal transaction data;
and inputting the transaction data to be identified into the target transaction data identification model to obtain an identification result of the transaction data to be identified.
The abnormal transaction data identification method, the abnormal transaction data identification device, the computer equipment, the storage medium and the computer program product comprise the following steps: responding to the abnormal transaction data identification request, and acquiring preprocessed initial transaction data, a first abnormal feature and a second abnormal feature; training a pre-constructed transaction data identification model to identify abnormal transaction data from initial transaction data according to the first abnormal characteristic to obtain an updated transaction data identification model; the transaction data identification model is constructed on the basis of a difficult case mining algorithm; training the updated transaction data identification model again according to the first error identification data and the second error identification data to obtain a target transaction data identification model; the first error identification data is transaction data which accords with the second abnormal characteristic in the initial transaction data and does not belong to the abnormal transaction data, and the second error identification data is transaction data which does not accord with the second abnormal characteristic in the abnormal transaction data; inputting the transaction data to be identified into a target transaction data identification model to obtain an identification result of the transaction data to be identified; the transaction data which accord with the abnormal characteristics are continuously screened from the transaction data, and the transaction data is used for continuously updating and training the transaction data recognition model, so that the recognition accuracy of the transaction data recognition model is improved, and the effect of recognizing the abnormal transaction data is improved.
Drawings
FIG. 1 is a diagram of an exemplary application environment for a method for anomalous transaction data identification in one embodiment;
FIG. 2 is a flow diagram illustrating a method for anomalous transaction data identification in one embodiment;
FIG. 3 is a flowchart illustrating the steps of training a pre-constructed transaction data recognition model to recognize anomalous transaction data from initial transaction data in one embodiment;
FIG. 4 is a schematic flow chart diagram illustrating a method for anomalous transaction data identification in an embodiment;
FIG. 5 is a schematic diagram of a transaction system in one embodiment;
FIG. 6 is a block diagram showing the structure of an abnormal transaction data recognition apparatus according to an embodiment;
FIG. 7 is a diagram illustrating an internal structure of a computer device according to an embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
It is noted that user information (including but not limited to user device information, user personal information, etc.) and data (including but not limited to data for presentation, analyzed data, etc.) to which the present disclosure relates are both information and data that are authorized by the user or sufficiently authorized by various parties; correspondingly, the present disclosure also provides a corresponding user authorization entry for the user to select authorization or to select denial.
In addition, the abnormal transaction data identification method, apparatus, computer device, storage medium and computer program product provided by the present application can be used in the financial field to improve the identification of abnormal transaction data, and can also be used in any field other than the financial field, such as the information security field. The application field of the abnormal transaction data identification method, apparatus, computer device, storage medium and computer program product provided by the present application is not limited.
The abnormal transaction data identification method provided by the embodiment of the application can be applied to the application environment shown in fig. 1. Wherein the terminal 102 communicates with the server 104 via a network. The data storage system may store data that the server 104 needs to process. The data storage system may be integrated on the server 104, or may be located on the cloud or other network server. The server 104 responds to the abnormal transaction data identification request sent by the terminal 102, and obtains the preprocessed initial transaction data, the first abnormal characteristic and the second abnormal characteristic; the server 104 trains a pre-constructed transaction data identification model to identify abnormal transaction data from the initial transaction data according to the first abnormal characteristic to obtain an updated transaction data identification model; the transaction data identification model is constructed on the basis of a difficult case mining algorithm; the server 104 trains the updated transaction data identification model again according to the first error identification data and the second error identification data to obtain a target transaction data identification model; the first error identification data is transaction data which accords with the second abnormal characteristic in the initial transaction data and does not belong to the abnormal transaction data, and the second error identification data is transaction data which does not accord with the second abnormal characteristic in the abnormal transaction data; the server 104 inputs the transaction data to be identified into the target transaction data identification model to obtain an identification result of the transaction data to be identified; the server 104 returns the identification result of the transaction data to be identified to the terminal 102.
The terminal 102 may be, but not limited to, various personal computers, notebook computers, smart phones, tablet computers, internet of things devices and portable wearable devices, and the internet of things devices may be smart speakers, smart televisions, smart air conditioners, smart car-mounted devices, and the like. The portable wearable device can be a smart watch, a smart bracelet, a head-mounted device, and the like. The server 104 may be implemented as a stand-alone server or as a server cluster comprised of multiple servers.
In one embodiment, as shown in fig. 2, an abnormal transaction data identification method is provided, which is described by taking the method as an example applied to the server 104 in fig. 1, and includes the following steps:
step 202, in response to the abnormal transaction data identification request, acquiring the preprocessed initial transaction data, the first abnormal characteristic and the second abnormal characteristic.
The abnormal transaction data identification request is a request to identify abnormal transaction data. It may be sent to the server by the terminal, or generated by the server itself, according to a set period, frequency or the like, before it performs the abnormal transaction data identification operation.
The initial transaction data is the raw transaction data that has not been processed, classified or screened; the preprocessed initial transaction data is the transaction data obtained after the initial transaction data has been screened, processed, classified and so on. It should be noted that the user's personal features may be removed from the initial transaction data during processing, and that the corresponding transaction data may need to be authorized by the user when it is collected.
The first abnormal feature is one type of preset condition information that can be used to identify the preprocessed initial transaction data; the second abnormal feature is another type of preset condition information that can be used to identify the preprocessed initial transaction data again. The two features amount to two conditions applied during data processing, and they can be updated as transaction processing proceeds so that abnormal features continue to be identified accurately.
Specifically, the initial transaction data may be obtained from the database of a commercial institution such as a bank according to a given time range, region, transaction type and so on. For example, a bank system can continuously receive transaction data through multiple channels as initial transaction data; data sources include, but are not limited to, personal customer transaction data, corporate customer transaction data and credit card transaction data. Preprocessing of the initial transaction data includes, but is not limited to: classifying the initial transaction data as debit or credit according to transaction type; identifying the customers of the target bank; removing transfers between accounts with the same customer number; removing duplicate transactions and whitelisted transaction data; identifying the transaction counterparty from the account number in the transaction data; and supplementing information such as the financial institution branch number, region number, MAC address, IP address and the counterparty's industry. This completes the preprocessing of the initial transaction data, so that the transaction data is screened and enriched.
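Part of the preprocessing described above (removing same-customer transfers, whitelisted transactions and duplicates) can be sketched as a simple filter. The field names `tx_id`, `payer_customer` and `payee_customer` and the whitelist format are hypothetical; the application does not define a record layout.

```python
def preprocess(transactions, whitelist):
    """Drop same-customer transfers, whitelisted parties and duplicates.
    Field names are illustrative assumptions, not from the application."""
    seen_ids = set()
    cleaned = []
    for tx in transactions:
        # Transfers between accounts of the same customer number are removed.
        if tx["payer_customer"] == tx["payee_customer"]:
            continue
        # Transactions involving a whitelisted customer are removed.
        if tx["payer_customer"] in whitelist or tx["payee_customer"] in whitelist:
            continue
        # Repeated transactions (same transaction id) are kept only once.
        if tx["tx_id"] in seen_ids:
            continue
        seen_ids.add(tx["tx_id"])
        cleaned.append(tx)
    return cleaned

raw = [
    {"tx_id": "t1", "payer_customer": "c1", "payee_customer": "c2", "amount": 100.0},
    {"tx_id": "t1", "payer_customer": "c1", "payee_customer": "c2", "amount": 100.0},
    {"tx_id": "t2", "payer_customer": "c3", "payee_customer": "c3", "amount": 50.0},
    {"tx_id": "t3", "payer_customer": "c4", "payee_customer": "c9", "amount": 75.0},
    {"tx_id": "t4", "payer_customer": "c5", "payee_customer": "c6", "amount": 20.0},
]
cleaned = preprocess(raw, whitelist={"c9"})
```

Enrichment with branch number, region, MAC, IP and industry would follow as a separate lookup step and is not shown here.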
Specifically, the first abnormal feature may be a feature of abnormal transaction data that a commercial institution such as a bank determines in advance, according to historical experience, laws and regulations and the like, to be inconsistent with normal transaction behavior. Examples: within T days, the ratio of an individual's cumulative number of debit transactions to cumulative number of credit transactions is less than or equal to 1:X or greater than or equal to X:1, and the cumulative credit amount is greater than or equal to A yuan; or the proportion of fund transactions made at night in the total number of fund transactions exceeds a threshold X. The second abnormal feature may be an abnormal clue feature obtained when a machine or a person analyzes and summarizes, one by one, the details of transactions determined to be abnormal transaction data, for example the examples shown in Table 1:
Table 1. Examples of abnormal features (table image not reproduced in the text)
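The ratio-based example of a first abnormal feature given above (debit-to-credit count ratio within T days at most 1:X or at least X:1, together with a cumulative credit amount of at least A yuan) can be written as a small rule check. The function name, parameter names and the sample thresholds are hypothetical, and the counts are assumed to have been aggregated over the T-day window beforehand.

```python
def matches_ratio_rule(debit_count, credit_count, credit_amount, x, a):
    """Return True when the counts over the window satisfy
    debit:credit <= 1:x or debit:credit >= x:1, and the cumulative
    credit amount is at least a. Uses cross-multiplication so that a
    zero count does not cause a division by zero."""
    if credit_amount < a:
        return False
    ratio_low = debit_count * x <= credit_count   # debit:credit <= 1:x
    ratio_high = debit_count >= credit_count * x  # debit:credit >= x:1
    return ratio_low or ratio_high

# Hypothetical thresholds: X = 5, A = 10000.
flag_skewed = matches_ratio_rule(1, 10, 50000.0, x=5, a=10000.0)
flag_balanced = matches_ratio_rule(3, 4, 50000.0, x=5, a=10000.0)
flag_small_amount = matches_ratio_rule(10, 1, 5000.0, x=5, a=10000.0)
```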
Step 204, training a pre-constructed transaction data identification model to identify abnormal transaction data from the initial transaction data according to the first abnormal characteristic to obtain an updated transaction data identification model; the transaction data identification model is constructed on the basis of a difficult case mining algorithm.
The abnormal transaction data is the transaction data which accords with the first abnormal characteristic in the initial transaction data.
Normal transaction data without abnormal transaction features is called a positive sample, and abnormal transaction data with abnormal transaction features is called a negative sample. In the transaction data accumulated by commercial institutions such as banks, the proportion of positive and negative samples is often seriously unbalanced. Therefore, the transaction data identification model can be constructed and run from the two aspects of algorithm and data, using the Hard Negative Mining method (a hard example mining algorithm, also called hard mining) and the Focal Loss method (a loss function, built on the cross-entropy loss function, for handling the serious imbalance between positive and negative samples in one-stage object detection). On the loss function side, Focal Loss is used to increase the weight of the negative sample loss and reduce the weight of the positive sample loss, so that the model focuses on learning the patterns of the negative samples and the sample imbalance is handled. On the sample data side, the idea of Hard Negative Mining is applied: the small proportion of negative sample data is used for training in an alternating, iterative manner. The model is updated with the sample set, then the model is fixed to select the target data that it misclassifies, and this data is added to the sample set to continue training.
The Focal Loss method adopted in this embodiment is a modification of the binary cross-entropy loss. The conventional binary cross-entropy loss is:
$$CE(p, y) = \begin{cases} -\log(p), & y = 1 \\ -\log(1 - p), & \text{otherwise} \end{cases}$$
where $y$ is the true class of the sample data ($y = -1$ means the sample is normal transaction data, and $y = 1$ means the sample is suspicious data carrying risk) and $p$ is the predicted class probability, with value range [0, 1]. Define $p_t$ as:
$$p_t = \begin{cases} p, & y = 1 \\ 1 - p, & \text{otherwise} \end{cases}$$
Then
$$CE(p, y) = CE(p_t) = -\log(p_t)$$
This loss function treats positive and negative samples equally. When the positive and negative samples are unbalanced, the total loss of the positive samples overwhelms the total loss of the few negative samples. Because the loss is summed over all samples, the loss of the positive samples dominates during training and the influence of the negative sample loss is drowned out, even though the loss of a single negative sample is in fact relatively large. As a result, the model does not end up focusing its learning on the negative samples.
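A small numeric illustration of this effect follows. The sample counts and the predicted probabilities are assumed values chosen only to show the arithmetic; they are not figures from the application.

```python
import math

def cross_entropy(p_t):
    """Cross-entropy loss of one sample, given the probability p_t that
    the model assigns to the sample's true class."""
    return -math.log(p_t)

# Assumed scenario: 10,000 normal samples that are classified well
# (p_t = 0.95) and one abnormal sample that is classified badly (p_t = 0.10).
n_normal, n_abnormal = 10_000, 1
loss_normal_each = cross_entropy(0.95)
loss_abnormal_each = cross_entropy(0.10)

total_normal = n_normal * loss_normal_each
total_abnormal = n_abnormal * loss_abnormal_each
share_abnormal = total_abnormal / (total_normal + total_abnormal)
```

Under these assumptions the single abnormal sample has a far larger individual loss than any normal sample, yet it accounts for under 1% of the summed loss, which is the imbalance the paragraph above describes.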
The loss is therefore modified by adding a weight:
$$CE(p_t) = -\alpha_t \log(p_t)$$
where $\alpha_t$ is defined in the same way as $p_t$. Increasing the weight of the negative sample loss and reducing the weight of the positive sample loss addresses the imbalance between positive and negative samples, so that the model focuses on learning the patterns of the negative samples.
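The weighted loss can be sketched as below. The sketch assumes that the weight is built from a single scalar `alpha` in the same way as the per-class probability: `alpha` for samples with y = 1 (suspicious data) and `1 - alpha` otherwise. The value 0.9 is only an example.

```python
import math

def alpha_balanced_ce(p, y, alpha):
    """Weighted cross-entropy: -alpha_t * log(p_t).
    y = 1 marks suspicious (abnormal) data, any other value marks normal data."""
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha  # assumed construction of alpha_t
    return -alpha_t * math.log(p_t)

# Two samples with the same p_t = 0.3, one abnormal and one normal.
loss_abnormal = alpha_balanced_ce(0.3, 1, alpha=0.9)
loss_normal = alpha_balanced_ce(0.7, -1, alpha=0.9)
weight_ratio = loss_abnormal / loss_normal
```

With `alpha = 0.9`, an abnormal sample weighs nine times as much as a normal sample that is classified equally badly.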
Next, the loss of easy samples is down-weighted, so that training can concentrate on the negative samples that are hard to learn.
The form of Focal Loss is as follows:
$$FL(p_t) = -(1 - p_t)^{\gamma} \log(p_t)$$
where $(1 - p_t)^{\gamma}$ is called the modulation factor. When a sample is misclassified, $p_t$ is very small, the modulation factor is close to 1, and the original loss is unaffected; as $p_t$ approaches 1, the modulation factor approaches 0, so the loss of easily classified samples is down-weighted. The parameter $\gamma$ controls the rate of this down-weighting. When $\gamma = 0$, FL is the cross-entropy loss function; for $\gamma > 0$, the loss of the positive samples is markedly suppressed, so that the final learning focus of the model is placed on the negative samples.
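The focal loss with its modulation factor can be sketched in a few lines. The exponent is written here as `gamma`, and its default of 2.0 is an assumed example value, not one stated in the application.

```python
import math

def focal_loss(p, y, gamma=2.0):
    """Focal loss of one sample: -(1 - p_t) ** gamma * log(p_t).
    y = 1 marks suspicious data, any other value marks normal data."""
    p_t = p if y == 1 else 1.0 - p
    modulation = (1.0 - p_t) ** gamma  # near 0 for easy samples, near 1 for hard ones
    return -modulation * math.log(p_t)

easy = focal_loss(0.9, 1)              # well classified, p_t = 0.9
hard = focal_loss(0.1, 1)              # misclassified, p_t = 0.1
plain = focal_loss(0.9, 1, gamma=0.0)  # gamma = 0 gives cross-entropy
```

The easy sample's loss is scaled by roughly 0.01 while the hard sample keeps most of its loss, so hard samples dominate the total.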
For example, in practice, take the screening of risky abnormal transaction data from the initial transaction data. Abnormal transaction data may amount to only one ten-thousandth of the normal transaction data, so the samples contain a large amount of normal transaction data, that is, easy examples (background samples), and the positive-to-negative ratio is seriously unbalanced. Suspicious data then contributes little to the overall loss, while normal transaction data contributes most of it and dominates the direction of the gradient update, so the model cannot obtain useful information and cannot accurately screen out the risky abnormal transaction data. The $\alpha_t$ parameter in the focal loss function can therefore be determined as follows: the initial transaction data is processed in batches by the transaction data identification model, the discrimination performance of the model is checked, the effect of each $\alpha_t$ estimate within the evaluation period is recorded statistically, and the $\alpha_t$ parameter is evaluated along the three dimensions of alarm rate, suspicious rate and recall rate.
For example, within the evaluation period, the alarm rate may be set as the ratio of the total number of samples that the transaction data identification model predicts as risky, under a single evaluation of the $\alpha_t$ parameter, to the total number of samples in the evaluation period; the suspicious rate may be set as the ratio of the total number of samples actually screened as risky, under that single evaluation, to the total number of samples that the model predicts as risky in that evaluation; the recall rate may be set as the ratio of the total number of risky samples that the model misclassifies, under that single evaluation, to the total number of samples that the model predicts as risky in that evaluation. The value of $\alpha_t$ is adjusted dynamically, and the $\alpha_t$ parameter is selected for which the recall rate is within an acceptable range and the alarm rate approaches the suspicious rate as closely as possible.
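The selection rule described above can be sketched as follows, assuming the three rates have already been recorded for each candidate value. The `AlphaEvaluation` record, the `select_alpha` function and the representation of the "acceptable range" as a closed interval are hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple

@dataclass(frozen=True)
class AlphaEvaluation:
    """Recorded outcome of one evaluation of a candidate alpha_t (hypothetical record)."""
    alpha_t: float
    alarm_rate: float
    suspicious_rate: float
    recall_rate: float

def select_alpha(
    evaluations: Iterable[AlphaEvaluation],
    recall_range: Tuple[float, float],
) -> Optional[AlphaEvaluation]:
    """Pick the candidate whose alarm rate is closest to its suspicious rate,
    among candidates whose recall rate lies inside the acceptable range."""
    low, high = recall_range
    admissible = [e for e in evaluations if low <= e.recall_rate <= high]
    if not admissible:
        return None  # no candidate satisfies the recall constraint
    return min(admissible, key=lambda e: abs(e.alarm_rate - e.suspicious_rate))
```

Returning `None` when no candidate meets the recall constraint leaves the decision to widen the range or to evaluate further values of `alpha_t` to the caller.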
Step 206, training the updated transaction data identification model again according to the first error identification data and the second error identification data to obtain a target transaction data identification model; the first error identification data is transaction data which accords with the second abnormal characteristic and does not belong to the abnormal transaction data in the initial transaction data, and the second error identification data is transaction data which does not accord with the second abnormal characteristic in the abnormal transaction data.
The first error identification data is transaction data in the initial transaction data that conforms to the second abnormal characteristic but does not belong to the abnormal transaction data, that is, abnormal transaction data that was not identified through the first abnormal characteristic; similarly, the second error identification data is transaction data in the abnormal transaction data that does not conform to the second abnormal characteristic, that is, normal transaction data that was wrongly identified as abnormal transaction data.
Specifically, the initial transaction data is screened through the second abnormal characteristic, and abnormal transaction data that was not identified by the first abnormal characteristic is screened out as the first error identification data; in addition, the abnormal transaction data is screened through the second abnormal characteristic, and transaction data that was identified as abnormal by the first abnormal characteristic but is actually normal transaction data is screened out as the second error identification data. The first error identification data, the second error identification data and the initial transaction data are then input into the transaction data identification model to train and update the model again, thereby improving the identification accuracy of the transaction data identification model.
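The two screening passes amount to simple set operations. The following is a minimal sketch in which transactions are represented by hashable identifiers and the second abnormal characteristic by a predicate; `split_misidentified` and `matches_second_feature` are hypothetical names, not part of the embodiment.

```python
from typing import Callable, Hashable, Iterable, Set, Tuple

def split_misidentified(
    initial_ids: Iterable[Hashable],
    abnormal_ids: Iterable[Hashable],
    matches_second_feature: Callable[[Hashable], bool],
) -> Tuple[Set[Hashable], Set[Hashable]]:
    """Return (first_error, second_error).

    first_error:  in the initial data, matches the second feature,
                  but was not flagged as abnormal (a missed abnormal record).
    second_error: flagged as abnormal, but does not match the second feature
                  (a normal record wrongly flagged).
    """
    abnormal = set(abnormal_ids)
    first_error = {
        t for t in initial_ids
        if t not in abnormal and matches_second_feature(t)
    }
    second_error = {t for t in abnormal if not matches_second_feature(t)}
    return first_error, second_error

# Illustrative identifiers only: records 2 and 3 match the second feature.
first, second = split_misidentified(
    initial_ids=[1, 2, 3, 4, 5],
    abnormal_ids=[1, 2],
    matches_second_feature=lambda t: t in {2, 3},
)
print(first, second)  # {3} {1}
```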
And step 208, inputting the transaction data to be identified into the target transaction data identification model to obtain an identification result of the transaction data to be identified.
Specifically, the trained target transaction data identification model has the capability of accurately identifying the transaction data as normal transaction data or abnormal transaction data, and has higher accuracy; therefore, the transaction data to be identified can be directly input into the target transaction data identification model, and whether the transaction data to be identified is abnormal transaction data or not can be determined according to the identification result output by the target transaction data identification model.
In the abnormal transaction data identification method, the initial transaction data, the first abnormal characteristic and the second abnormal characteristic which are preprocessed are obtained by responding to the abnormal transaction data identification request; training a pre-constructed transaction data identification model to identify abnormal transaction data from initial transaction data according to the first abnormal characteristic to obtain an updated transaction data identification model; the transaction data identification model is constructed on the basis of a difficult case mining algorithm; training the updated transaction data identification model again according to the first error identification data and the second error identification data to obtain a target transaction data identification model; the first error identification data is transaction data which accords with the second abnormal characteristic in the initial transaction data and does not belong to the abnormal transaction data, and the second error identification data is transaction data which does not accord with the second abnormal characteristic in the abnormal transaction data; inputting the transaction data to be identified into a target transaction data identification model to obtain an identification result of the transaction data to be identified; the transaction data which accord with the abnormal characteristics are continuously screened from the transaction data, and the transaction data is used for continuously updating and training the transaction data recognition model, so that the recognition accuracy of the transaction data recognition model is improved, and the effect of recognizing the abnormal transaction data is improved.
In one embodiment, as shown in fig. 3, training the pre-constructed transaction data recognition model to recognize abnormal transaction data from the initial transaction data includes:
step 302, fusing the abnormal transaction data and the initial transaction data according to a preset fusion proportion to obtain updated initial transaction data;
step 304, inputting the updated initial transaction data into a transaction data identification model for identification processing to obtain an abnormal identification result;
and step 306, adjusting the model parameters in the transaction data identification model according to the abnormal identification result, returning to execute the step of fusing the abnormal transaction data and the initial transaction data according to a preset fusion proportion until the transaction data identification model converges.
The preset fusion proportion is the proportion that the abnormal transaction data must occupy in the fused data: abnormal transaction data is continuously fused into the initial transaction data until it accounts for this proportion. The updated initial transaction data is the new transaction data obtained once the proportion of abnormal transaction data reaches the fusion proportion. The abnormal recognition result is the recognition result obtained after the updated initial transaction data is input into the transaction data recognition model; the current recognition accuracy of the model can be judged from this result, and the model parameters are adjusted continuously until the transaction data recognition model converges.
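One possible reading of the fusion step is sketched below: abnormal records are appended (repeating them cyclically if necessary) until the appended share reaches the preset proportion. The function name `fuse_to_ratio`, the cyclic repetition and the decision to count only appended records toward the proportion are assumptions; the embodiment does not specify how the proportion is reached.

```python
import math
from itertools import cycle, islice
from typing import List, Sequence, TypeVar

T = TypeVar("T")

def fuse_to_ratio(initial: Sequence[T], abnormal: Sequence[T], ratio: float) -> List[T]:
    """Append abnormal records to the initial data until they make up
    at least `ratio` of the fused data (assumed interpretation)."""
    if not 0.0 < ratio < 1.0:
        raise ValueError("ratio must be strictly between 0 and 1")
    if not abnormal:
        return list(initial)  # nothing to fuse
    # k appended records satisfy k / (n + k) >= ratio  <=>  k >= ratio * n / (1 - ratio)
    k = math.ceil(ratio * len(initial) / (1.0 - ratio))
    return list(initial) + list(islice(cycle(abnormal), k))
```

In the loop of steps 302 to 306, a function of this kind would be called before each training pass, with the abnormal records identified so far.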
In this embodiment, the abnormal transaction data and the initial transaction data are fused continuously, so that the proportion of positive and negative samples in the training data of the transaction data identification model is coordinated with each other, and the transaction data identification model can increase the identification capability of the negative sample, namely the abnormal transaction data, so as to improve the effect of identifying the abnormal transaction data.
In one embodiment, adjusting model parameters in the transaction data recognition model according to the abnormal recognition result comprises: calculating according to a preset loss function to obtain a first loss function value corresponding to the abnormal recognition result; the first loss function value is used for adjusting the weight of the transaction data identification model in the process of identifying abnormal transaction data; model parameters in the transaction data identification model are adjusted according to the first loss function value.
Specifically, the server acquires a preset loss function and determines a first loss function value of the transaction data identification model according to the abnormal identification result and the real label; the model parameters in the transaction data identification model are then adjusted according to the first loss function value so as to continuously reduce it, thereby improving the identification accuracy of the transaction data identification model.
In the embodiment, the model parameters of the transaction data identification model are adjusted by determining the first loss function value, so that the accuracy of the transaction data identification model in identifying abnormal transaction data is improved.
In one embodiment, adjusting model parameters in the transaction data identification model according to the first loss function value comprises: and adjusting a weight coefficient and a bias coefficient corresponding to the transaction data identification model according to the first loss function value until the first loss function value is smaller than a preset loss threshold value, and determining that the transaction data identification model reaches convergence.
Specifically, the model parameters include the weight coefficients between the individual neurons and network layers in the transaction data identification model, and the corresponding bias coefficients; that is, adjusting the model parameters means adjusting the weight coefficients and the bias coefficients. After each adjustment of the weight coefficients and bias coefficients, the first loss function value of the adjusted transaction data identification model is recalculated; when the first loss function value is smaller than a preset loss threshold, the adjusted model has reached the preset training target, that is, the transaction data identification model is judged to have converged.
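The threshold-based stopping rule can be sketched with a deliberately small stand-in model. The single-feature logistic model, the cross entropy loss, the learning rate and the step limit below are all assumptions chosen to keep the example self-contained; the embodiment does not fix the network structure or the optimizer.

```python
import math
from typing import Sequence, Tuple

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

def mean_loss(w: float, b: float, xs: Sequence[float], ys: Sequence[int]) -> float:
    """Mean cross entropy of a one-feature logistic model (stand-in for the preset loss)."""
    eps = 1e-12
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(w * x + b)
        total -= y * math.log(p + eps) + (1 - y) * math.log(1.0 - p + eps)
    return total / len(xs)

def train_until_threshold(
    xs: Sequence[float], ys: Sequence[int], loss_threshold: float,
    lr: float = 0.5, max_steps: int = 10_000,
) -> Tuple[float, float, float, bool]:
    """Adjust weight w and bias b by gradient descent until the loss drops
    below loss_threshold. Returns (w, b, final_loss, converged)."""
    w, b = 0.0, 0.0
    for _ in range(max_steps):
        loss = mean_loss(w, b, xs, ys)
        if loss < loss_threshold:
            return w, b, loss, True          # preset training target reached
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        w -= lr * sum(e * x for e, x in zip(errors, xs)) / len(xs)
        b -= lr * sum(errors) / len(xs)
    return w, b, mean_loss(w, b, xs, ys), False  # step limit hit first

w, b, loss, converged = train_until_threshold([-2, -1, 1, 2], [0, 0, 1, 1], 0.3)
print(converged, round(loss, 3))
```

The step limit is a practical safeguard added in this sketch: a loss threshold alone does not guarantee termination if the threshold is unreachable for the given data.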
In this embodiment, the weight coefficient and the bias coefficient corresponding to the transaction data identification model are adjusted according to the first loss function value, so that the efficiency of adjusting the transaction data identification model is improved, and the accuracy with which the transaction data identification model identifies abnormal transaction data is improved.
In one embodiment, retraining the updated transaction data recognition model based on the first misrecognized data and the second misrecognized data comprises: fusing the first error identification data, the second error identification data and the updated initial transaction data according to a preset fusion proportion to obtain target transaction data; inputting the target transaction data into a transaction data identification model for identification processing to obtain a target identification result; and adjusting model parameters in the updated transaction data identification model according to the target identification result, and returning to execute the step of fusing the first error identification data, the second error identification data and the updated initial transaction data according to a preset fusion proportion until the updated transaction data identification model converges.
Specifically, the first error identification data and the second error identification data are the data, determined by means of the second abnormal feature, that the transaction data identification model identified incorrectly, and are equivalent to negative samples. The first error identification data, the second error identification data and the initial transaction data are therefore fused, and when the first error identification data and the second error identification data reach the preset fusion proportion, the fused transaction data is the target transaction data. The target transaction data is input into the transaction data identification model for identification processing, and after a target identification result is obtained, the model parameters in the transaction data identification model are adjusted according to the target identification result until the updated transaction data identification model converges.
In the embodiment, the first error identification data, the second error identification data and the initial transaction data are fused continuously, so that the transaction data identification model continuously strengthens the learning of the transaction data for identifying errors in the training process, the error correction function is realized, and the effect of identifying abnormal transaction data is improved.
In one embodiment, adjusting model parameters in the updated transaction data recognition model according to the target recognition result includes: calculating according to a preset loss function to obtain a second loss function value corresponding to the target identification result; the second loss function value is used for adjusting the weight of the updated transaction data identification model in the process of identifying abnormal transaction data; and adjusting the model parameters in the updated transaction data identification model according to the second loss function value.
Specifically, the server obtains a preset loss function and determines a second loss function value of the transaction data identification model according to the target identification result and the real label; the model parameters in the transaction data identification model are then adjusted according to the second loss function value so as to continuously reduce it, thereby improving the identification accuracy of the transaction data identification model. It should be noted that the parameter adjustment according to the first loss function value and the parameter adjustment according to the second loss function value are not performed in isolation; they are performed in sequence on the same transaction data identification model.
In the embodiment, the model parameters of the transaction data identification model are adjusted by determining the second loss function value, so that the accuracy of the transaction data identification model in identifying abnormal transaction data is improved.
In one embodiment, as shown in fig. 4, there is provided another abnormal transaction data identification method, which is applied to a transaction system and is capable of identifying abnormal data in transaction data, and which includes the following specific steps:
Step 1: acquiring original transaction data and preprocessing the data.
Specifically, the system receives hundreds of millions of transaction records through multiple channels as the original input data of the model, the data sources including personal customer transaction data, corporate customer transaction data, credit card transaction data and the like. The transaction data is preliminarily processed: each transaction is matched into its debit and credit entries, customers of the bank are identified, transfers between accounts of the same customer are removed, repeated transactions and white-list transactions are removed, the transaction counterparty is identified through the account number, and information such as the financial institution outlet number, region number, MAC, IP and the counterparty's industry is supplemented.
Step 2: running the model in batch to preliminarily screen the data, and adjusting the proportion of positive and negative samples.
Specifically, with reference to risky suspicious data features (for example: within D days, the ratio of an individual's cumulative number of debit transactions to cumulative number of credit transactions is less than or equal to 1:X or greater than or equal to X:1, and the cumulative debit amount is greater than or equal to A yuan; or the proportion of night-time fund transactions in the total number of fund transactions exceeds a threshold X), the system runs the model in batch to screen the data, separating normal transaction data from suspicious transaction data, and accumulates the suspicious data to obtain the secondary input data of the model. The screened suspicious data is then added to the original data to increase the proportion of negative samples in the sample set, and the model is run again and updated.
Step 3: performing iterative training.
Specifically, step 2 is repeated, continuously increasing the proportion of negative samples and iteratively training the model.
Step 4: accurately screening suspicious data.
Specifically, based on actual business outcomes, the suspicious transaction details obtained in the above steps are analyzed one by one to find suspicious transaction clues; after summarizing, it is comprehensively judged whether each matches a preset suspicious feature sample.
Step 5: training on the sample set until the model converges.
Specifically, using the risky suspicious data accurately screened in step 4, the system runs the model again; with the model fixed, wrongly discriminated data is repeatedly selected and added to the sample set, and training continues; the weight values and bias values are continuously adjusted, and the model is trained and optimized until convergence.
Step 6: outputting the suspicious data.
Specifically, the system outputs the risky suspicious data.
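The first example feature in step 2 (debit-to-credit count ratio within D days combined with a cumulative debit amount) can be sketched as a simple rule check. The `Txn` record, its field names, the integer day index and the function `is_suspicious` are hypothetical; D, X and A remain free parameters, as in the description.

```python
from dataclasses import dataclass
from typing import Iterable

@dataclass(frozen=True)
class Txn:
    """Simplified transaction record (hypothetical fields)."""
    account: str
    day: int          # day index; a real system would use timestamps
    direction: str    # "debit" or "credit"
    amount: float

def is_suspicious(
    txns: Iterable[Txn], account: str, end_day: int,
    d_days: int, x_ratio: int, a_amount: float,
) -> bool:
    """True if, in the d_days ending at end_day, the debit:credit count ratio is
    <= 1:X or >= X:1 and the cumulative debit amount is >= A."""
    window = [
        t for t in txns
        if t.account == account and end_day - d_days < t.day <= end_day
    ]
    if not window:
        return False
    debits = [t for t in window if t.direction == "debit"]
    n_debit = len(debits)
    n_credit = len(window) - n_debit
    # Cross-multiplied comparisons avoid dividing by a zero count.
    lopsided = n_debit * x_ratio <= n_credit or n_debit >= x_ratio * n_credit
    return lopsided and sum(t.amount for t in debits) >= a_amount
```

Records flagged by a rule of this kind would be the suspicious data that step 2 adds back to the sample set; the night-time proportion feature could be checked in the same way.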
The transaction system in this embodiment may be implemented as five parts, as shown in fig. 5, including:
Part A1: an original data acquisition module, through which the system acquires basic transaction data from upstream and performs preliminary processing;
Part A2: a sample weight adjusting module, used for acquiring the basic transaction data preliminarily processed in part A1, markedly suppressing the loss of the normal transaction data through the Focal Loss algorithm, and increasing the weight of the loss of the negative samples, so that the learning emphasis of the subsequent modules of the device leans toward the negative samples;
Part A3: a data training module, used for adding, in a sample set updating submodule, the suspicious data screened each time into the next round of basic data, so as to increase the proportion of negative samples, and rerunning the model for training;
Part A4: a data screening module, used for screening the normal transaction data and the suspicious data produced by the preceding module, and picking out the data that the preceding module screened incorrectly;
Part A5: an error data training module, used for fixing the model trained in the preceding module, acquiring the incorrectly screened data from part A4, adding it separately into the model as basic data for training, and continuously adjusting the parameters until the model is optimal.
This embodiment has the following beneficial effects: (1) The model parameters are optimized: the influence of the normal transaction data, which accounts for a large proportion of the data, is reduced throughout the model operation, and the weight of the suspicious data loss is increased, so that the model focuses on learning the features of suspicious data. (2) The proportion of positive and negative samples is balanced: through repeated iterative training, the screened suspicious data is added into the sample set and the proportion of negative samples is increased, so that the model is supported by negative sample data, obtains effective data features, and discriminates transaction data effectively. (3) Cases in which the model wrongly discriminates transaction data are effectively reduced.
It should be understood that, although the steps in the flowcharts related to the above embodiments are shown in sequence as indicated by the arrows, the steps are not necessarily executed in that sequence. Unless explicitly stated otherwise herein, the execution of these steps is not strictly limited to the order shown, and the steps may be performed in other orders. Moreover, at least a part of the steps in the flowcharts related to the above embodiments may include multiple sub-steps or multiple stages, which are not necessarily performed at the same time but may be performed at different times, and which are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least a part of the sub-steps or stages of other steps.
Based on the same inventive concept, the embodiment of the application also provides an abnormal transaction data identification device for realizing the abnormal transaction data identification method. The implementation scheme for solving the problem provided by the device is similar to the implementation scheme recorded in the method, so that specific limitations in one or more embodiments of the abnormal transaction data identification device provided below can be referred to the limitations on the abnormal transaction data identification method in the above, and details are not described herein again.
In one embodiment, as shown in fig. 6, there is provided an abnormal transaction data identification apparatus including: a request response module 602, a data identification module 604, a model update module 606, and a result determination module 608, wherein:
a request response module 602, configured to, in response to the abnormal transaction data identification request, obtain the preprocessed initial transaction data, the first abnormal feature, and the second abnormal feature;
the data identification module 604 is configured to train a pre-established transaction data identification model to identify abnormal transaction data from the initial transaction data according to the first abnormal feature, so as to obtain an updated transaction data identification model; the transaction data identification model is constructed on the basis of a difficult case mining algorithm;
the model updating module 606 is configured to train the updated transaction data identification model again according to the first error identification data and the second error identification data to obtain a target transaction data identification model; the first error identification data is transaction data which accords with the second abnormal characteristic in the initial transaction data and does not belong to the abnormal transaction data, and the second error identification data is transaction data which does not accord with the second abnormal characteristic in the abnormal transaction data;
and the result determining module 608 is configured to input the transaction data to be identified into the target transaction data identification model, so as to obtain an identification result of the transaction data to be identified.
In an embodiment, the data identification module 604 is further configured to fuse the abnormal transaction data and the initial transaction data according to a preset fusion ratio to obtain updated initial transaction data; inputting the updated initial transaction data into a transaction data identification model for identification processing to obtain an abnormal identification result; and adjusting model parameters in the transaction data identification model according to the abnormal identification result, returning to execute the step of fusing the abnormal transaction data and the initial transaction data according to a preset fusion proportion until the transaction data identification model converges.
In an embodiment, the data identification module 604 is further configured to calculate a first loss function value corresponding to the abnormal identification result according to a preset loss function; the first loss function value is used for adjusting the weight of the transaction data identification model in the process of identifying abnormal transaction data; model parameters in the transaction data identification model are adjusted according to the first loss function value.
In one embodiment, the data identification module 604 is further configured to adjust the weight coefficient and the bias coefficient corresponding to the transaction data identification model according to the first loss function value, and determine that the transaction data identification model converges when the first loss function value is smaller than a preset loss threshold.
In one embodiment, the model updating module 606 is further configured to fuse the first error identification data, the second error identification data, and the updated initial transaction data according to a preset fusion ratio to obtain target transaction data; inputting the target transaction data into a transaction data identification model for identification processing to obtain a target identification result; and adjusting model parameters in the updated transaction data identification model according to the target identification result, and returning to execute the step of fusing the first error identification data, the second error identification data and the updated initial transaction data according to a preset fusion proportion until the updated transaction data identification model converges.
In an embodiment, the model updating module 606 is further configured to calculate a second loss function value corresponding to the target recognition result according to a preset loss function; the second loss function value is used for adjusting the weight of the updated transaction data identification model in the process of identifying abnormal transaction data; and adjusting the model parameters in the updated transaction data identification model according to the second loss function value.
The modules in the abnormal transaction data identification device can be wholly or partially implemented by software, hardware, or a combination thereof. The modules can be embedded in, or independent of, a processor in the computer device in hardware form, or can be stored in a memory in the computer device in software form, so that the processor can call and execute the operations corresponding to the modules.
In one embodiment, a computer device is provided, which may be a server, the internal structure of which may be as shown in fig. 7. The computer device includes a processor, a memory, and a network interface connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The database of the computer device is for storing anomalous transaction data identifying data. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement an anomalous transaction data identification method.
Those skilled in the art will appreciate that the architecture shown in fig. 7 is merely a block diagram of some of the structures associated with the disclosed aspects and does not limit the computer devices to which the disclosed aspects apply; a particular computer device may include more or fewer components than those shown, may combine certain components, or may have a different arrangement of components.
In one embodiment, a computer device is provided, comprising a memory and a processor, the memory having a computer program stored therein, the processor implementing the following steps when executing the computer program:
responding to the abnormal transaction data identification request, and acquiring preprocessed initial transaction data, a first abnormal feature and a second abnormal feature;
training a pre-constructed transaction data identification model to identify abnormal transaction data from initial transaction data according to the first abnormal characteristic to obtain an updated transaction data identification model; the transaction data identification model is constructed on the basis of a difficult case mining algorithm;
training the updated transaction data identification model again according to the first error identification data and the second error identification data to obtain a target transaction data identification model; the first error identification data is transaction data which accords with the second abnormal characteristic in the initial transaction data and does not belong to the abnormal transaction data, and the second error identification data is transaction data which does not accord with the second abnormal characteristic in the abnormal transaction data;
and inputting the transaction data to be identified into the target transaction data identification model to obtain an identification result of the transaction data to be identified.
In one embodiment, the processor, when executing the computer program, further performs the following steps: fusing the abnormal transaction data with the initial transaction data according to a preset fusion proportion to obtain updated initial transaction data; inputting the updated initial transaction data into the transaction data identification model for identification processing to obtain an abnormal identification result; and adjusting model parameters in the transaction data identification model according to the abnormal identification result, then returning to the step of fusing the abnormal transaction data with the initial transaction data according to the preset fusion proportion, until the transaction data identification model converges.
In one embodiment, the processor, when executing the computer program, further performs the following steps: calculating, according to a preset loss function, a first loss function value corresponding to the abnormal identification result, the first loss function value being used to adjust the weights of the transaction data identification model in the process of identifying abnormal transaction data; and adjusting model parameters in the transaction data identification model according to the first loss function value.
In one embodiment, the processor, when executing the computer program, further performs the following steps: adjusting a weight coefficient and a bias coefficient of the transaction data identification model according to the first loss function value until the first loss function value is smaller than a preset loss threshold, at which point the transaction data identification model is determined to have converged.
In one embodiment, the processor, when executing the computer program, further performs the following steps: fusing the first error identification data, the second error identification data and the updated initial transaction data according to a preset fusion proportion to obtain target transaction data; inputting the target transaction data into the transaction data identification model for identification processing to obtain a target identification result; and adjusting model parameters in the updated transaction data identification model according to the target identification result, then returning to the step of fusing the first error identification data, the second error identification data and the updated initial transaction data according to the preset fusion proportion, until the updated transaction data identification model converges.
In one embodiment, the processor, when executing the computer program, further performs the following steps: calculating, according to a preset loss function, a second loss function value corresponding to the target identification result, the second loss function value being used to adjust the weights of the updated transaction data identification model in the process of identifying abnormal transaction data; and adjusting model parameters in the updated transaction data identification model according to the second loss function value.
In one embodiment, a computer-readable storage medium is provided, having a computer program stored thereon, which when executed by a processor, performs the steps of:
responding to the abnormal transaction data identification request, and acquiring preprocessed initial transaction data, a first abnormal feature and a second abnormal feature;
training a pre-constructed transaction data identification model to identify abnormal transaction data from the initial transaction data according to the first abnormal feature, to obtain an updated transaction data identification model, wherein the transaction data identification model is constructed based on a hard example mining algorithm;
training the updated transaction data identification model again according to first error identification data and second error identification data to obtain a target transaction data identification model, wherein the first error identification data is transaction data in the initial transaction data that matches the second abnormal feature but does not belong to the abnormal transaction data, and the second error identification data is transaction data in the abnormal transaction data that does not match the second abnormal feature;
and inputting the transaction data to be identified into the target transaction data identification model to obtain an identification result of the transaction data to be identified.
In one embodiment, the computer program, when executed by the processor, further performs the following steps: fusing the abnormal transaction data with the initial transaction data according to a preset fusion proportion to obtain updated initial transaction data; inputting the updated initial transaction data into the transaction data identification model for identification processing to obtain an abnormal identification result; and adjusting model parameters in the transaction data identification model according to the abnormal identification result, then returning to the step of fusing the abnormal transaction data with the initial transaction data according to the preset fusion proportion, until the transaction data identification model converges.
In one embodiment, the computer program, when executed by the processor, further performs the following steps: calculating, according to a preset loss function, a first loss function value corresponding to the abnormal identification result, the first loss function value being used to adjust the weights of the transaction data identification model in the process of identifying abnormal transaction data; and adjusting model parameters in the transaction data identification model according to the first loss function value.
In one embodiment, the computer program, when executed by the processor, further performs the following steps: adjusting a weight coefficient and a bias coefficient of the transaction data identification model according to the first loss function value until the first loss function value is smaller than a preset loss threshold, at which point the transaction data identification model is determined to have converged.
In one embodiment, the computer program, when executed by the processor, further performs the following steps: fusing the first error identification data, the second error identification data and the updated initial transaction data according to a preset fusion proportion to obtain target transaction data; inputting the target transaction data into the transaction data identification model for identification processing to obtain a target identification result; and adjusting model parameters in the updated transaction data identification model according to the target identification result, then returning to the step of fusing the first error identification data, the second error identification data and the updated initial transaction data according to the preset fusion proportion, until the updated transaction data identification model converges.
In one embodiment, the computer program, when executed by the processor, further performs the following steps: calculating, according to a preset loss function, a second loss function value corresponding to the target identification result, the second loss function value being used to adjust the weights of the updated transaction data identification model in the process of identifying abnormal transaction data; and adjusting model parameters in the updated transaction data identification model according to the second loss function value.
In one embodiment, a computer program product is provided, comprising a computer program which, when executed by a processor, performs the steps of:
responding to the abnormal transaction data identification request, and acquiring preprocessed initial transaction data, a first abnormal feature and a second abnormal feature;
training a pre-constructed transaction data identification model to identify abnormal transaction data from the initial transaction data according to the first abnormal feature, to obtain an updated transaction data identification model, wherein the transaction data identification model is constructed based on a hard example mining algorithm;
training the updated transaction data identification model again according to first error identification data and second error identification data to obtain a target transaction data identification model, wherein the first error identification data is transaction data in the initial transaction data that matches the second abnormal feature but does not belong to the abnormal transaction data, and the second error identification data is transaction data in the abnormal transaction data that does not match the second abnormal feature;
and inputting the transaction data to be identified into the target transaction data identification model to obtain an identification result of the transaction data to be identified.
In one embodiment, the computer program, when executed by the processor, further performs the following steps: fusing the abnormal transaction data with the initial transaction data according to a preset fusion proportion to obtain updated initial transaction data; inputting the updated initial transaction data into the transaction data identification model for identification processing to obtain an abnormal identification result; and adjusting model parameters in the transaction data identification model according to the abnormal identification result, then returning to the step of fusing the abnormal transaction data with the initial transaction data according to the preset fusion proportion, until the transaction data identification model converges.
In one embodiment, the computer program, when executed by the processor, further performs the following steps: calculating, according to a preset loss function, a first loss function value corresponding to the abnormal identification result, the first loss function value being used to adjust the weights of the transaction data identification model in the process of identifying abnormal transaction data; and adjusting model parameters in the transaction data identification model according to the first loss function value.
In one embodiment, the computer program, when executed by the processor, further performs the following steps: adjusting a weight coefficient and a bias coefficient of the transaction data identification model according to the first loss function value until the first loss function value is smaller than a preset loss threshold, at which point the transaction data identification model is determined to have converged.
In one embodiment, the computer program, when executed by the processor, further performs the following steps: fusing the first error identification data, the second error identification data and the updated initial transaction data according to a preset fusion proportion to obtain target transaction data; inputting the target transaction data into the transaction data identification model for identification processing to obtain a target identification result; and adjusting model parameters in the updated transaction data identification model according to the target identification result, then returning to the step of fusing the first error identification data, the second error identification data and the updated initial transaction data according to the preset fusion proportion, until the updated transaction data identification model converges.
In one embodiment, the computer program, when executed by the processor, further performs the following steps: calculating, according to a preset loss function, a second loss function value corresponding to the target identification result, the second loss function value being used to adjust the weights of the updated transaction data identification model in the process of identifying abnormal transaction data; and adjusting model parameters in the updated transaction data identification model according to the second loss function value.
It should be noted that the user information (including but not limited to user device information, user personal information, etc.) and data (including but not limited to data for analysis, stored data, presented data, etc.) referred to in the present application are information and data authorized by the user or fully authorized by all parties.
It will be understood by those skilled in the art that all or part of the processes of the methods in the embodiments described above can be implemented by a computer program instructing the relevant hardware. The computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the method embodiments described above. Any reference to memory, database, or other medium used in the embodiments provided herein may include at least one of non-volatile and volatile memory. The non-volatile memory may include read-only memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, high-density embedded non-volatile memory, resistive random access memory (ReRAM), magnetoresistive random access memory (MRAM), ferroelectric random access memory (FRAM), phase change memory (PCM), graphene memory, and the like. Volatile memory can include random access memory (RAM), external cache memory, and the like. By way of illustration and not limitation, RAM can take many forms, such as static random access memory (SRAM) or dynamic random access memory (DRAM), among others. The databases referred to in the various embodiments provided herein may include at least one of relational and non-relational databases. The non-relational database may include, but is not limited to, a blockchain-based distributed database, and the like. The processors referred to in the embodiments provided herein may be, without limitation, general purpose processors, central processing units, graphics processors, digital signal processors, programmable logic devices, data processing logic devices based on quantum computing, and the like.
The technical features of the above embodiments can be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, as long as a combination of technical features contains no contradiction, it should be considered within the scope of this specification.
The above embodiments express only several implementations of the present application, and although their description is relatively specific and detailed, they should not be construed as limiting the scope of the present application. It should be noted that a person skilled in the art can make several variations and improvements without departing from the concept of the present application, all of which fall within the scope of protection of the present application. Therefore, the scope of protection of the present application shall be subject to the appended claims.

Claims (10)

1. A method for anomalous transaction data identification, said method comprising:
responding to the abnormal transaction data identification request, and acquiring preprocessed initial transaction data, a first abnormal feature and a second abnormal feature;
training a pre-constructed transaction data identification model to identify abnormal transaction data from the initial transaction data according to the first abnormal feature, to obtain an updated transaction data identification model, wherein the transaction data identification model is constructed based on a hard example mining algorithm;
training the updated transaction data identification model again according to first error identification data and second error identification data to obtain a target transaction data identification model, wherein the first error identification data is transaction data in the initial transaction data that matches the second abnormal feature but does not belong to the abnormal transaction data, and the second error identification data is transaction data in the abnormal transaction data that does not match the second abnormal feature;
and inputting the transaction data to be identified into the target transaction data identification model to obtain an identification result of the transaction data to be identified.
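By way of illustration and not limitation, the two kinds of error identification data defined in claim 1 can be sketched as a simple set selection. The record layout (a dictionary with hypothetical `id` and `amount` fields), the helper name `split_misidentified`, and the threshold rule standing in for the second abnormal feature are all assumptions made for this example and are not specified by the application.

```python
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical record layout: each transaction is a dict with an "id" and an "amount".
Transaction = Dict[str, float]


def split_misidentified(
    initial: List[Transaction],
    flagged_ids: Set[int],
    matches_second_feature: Callable[[Transaction], bool],
) -> Tuple[List[Transaction], List[Transaction]]:
    """Return (first_error, second_error) as described in claim 1.

    first_error:  matches the second abnormal feature but was NOT identified as abnormal.
    second_error: was identified as abnormal but does NOT match the second abnormal feature.
    """
    first_error = [
        t for t in initial
        if matches_second_feature(t) and t["id"] not in flagged_ids
    ]
    second_error = [
        t for t in initial
        if t["id"] in flagged_ids and not matches_second_feature(t)
    ]
    return first_error, second_error


initial = [
    {"id": 1, "amount": 50.0},
    {"id": 2, "amount": 9000.0},
    {"id": 3, "amount": 12000.0},
    {"id": 4, "amount": 20.0},
]
flagged = {3, 4}  # ids the first-stage model identified as abnormal (assumed)

# Assumed stand-in for the second abnormal feature: an amount above 5000.
first, second = split_misidentified(initial, flagged, lambda t: t["amount"] > 5000.0)
print([t["id"] for t in first], [t["id"] for t in second])
```

In this toy data, transaction 2 is a missed abnormal transaction and transaction 4 is a wrongly flagged one; both are the hard examples fed into the second training stage.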
2. The method of claim 1, wherein training the pre-constructed transaction data identification model to identify abnormal transaction data from the initial transaction data comprises:
fusing the abnormal transaction data and the initial transaction data according to a preset fusion proportion to obtain updated initial transaction data;
inputting the updated initial transaction data into the transaction data identification model for identification processing to obtain an abnormal identification result;
and adjusting model parameters in the transaction data identification model according to the abnormal identification result, then returning to the step of fusing the abnormal transaction data and the initial transaction data according to the preset fusion proportion, until the transaction data identification model converges.
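By way of illustration and not limitation, the fusing step of claim 2 can be sketched as building a training batch in which abnormal samples make up a preset share. The application does not define how the fusion proportion is applied; this sketch assumes it is the fraction of abnormal samples in the fused batch and that sampling is done with replacement (abnormal data is typically scarce). The function name `fuse` and its parameters are hypothetical.

```python
import random
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def fuse(
    abnormal: Sequence[T],
    initial: Sequence[T],
    ratio: float,
    size: int,
    seed: int = 0,
) -> List[T]:
    """Build a batch of `size` samples in which a `ratio` share comes from `abnormal`.

    Assumption: the preset fusion proportion is the share of abnormal samples,
    and samples are drawn with replacement.
    """
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be between 0 and 1")
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    n_abnormal = round(size * ratio)
    n_initial = size - n_abnormal
    batch = rng.choices(abnormal, k=n_abnormal) + rng.choices(initial, k=n_initial)
    rng.shuffle(batch)
    return batch


batch = fuse(["a1", "a2"], ["n1", "n2", "n3"], ratio=0.3, size=10)
print(batch)
```

Each training iteration would call such a function again before the next identification pass, which corresponds to "returning to the step of fusing" in the claim.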
3. The method of claim 2, wherein adjusting model parameters in the transaction data identification model according to the abnormal identification result comprises:
calculating, according to a preset loss function, a first loss function value corresponding to the abnormal identification result, the first loss function value being used to adjust the weights of the transaction data identification model in the process of identifying abnormal transaction data;
adjusting model parameters in the transaction data identification model according to the first loss function value.
4. The method of claim 3, wherein said adjusting model parameters in the transaction data identification model according to the first loss function value comprises:
adjusting a weight coefficient and a bias coefficient of the transaction data identification model according to the first loss function value until the first loss function value is smaller than a preset loss threshold, at which point the transaction data identification model is determined to have converged.
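By way of illustration and not limitation, the convergence criterion of claims 3 and 4 can be sketched with a single-feature logistic model trained by gradient descent. The application names neither the loss function nor the model form; binary cross-entropy, the one-weight-one-bias model, the learning rate and the threshold value below are all assumptions chosen to keep the example small.

```python
import math
from typing import List, Tuple


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def train_until_converged(
    xs: List[float],
    ys: List[int],
    loss_threshold: float = 0.2,
    lr: float = 0.5,
    max_steps: int = 10_000,
) -> Tuple[float, float, float]:
    """Adjust weight w and bias b until the loss drops below the preset threshold."""
    w, b = 0.0, 0.0
    loss = float("inf")
    n = len(xs)
    for _ in range(max_steps):
        preds = [sigmoid(w * x + b) for x in xs]
        # Assumed loss function: mean binary cross-entropy.
        loss = -sum(
            y * math.log(p) + (1 - y) * math.log(1 - p) for p, y in zip(preds, ys)
        ) / n
        if loss < loss_threshold:
            break  # loss below the preset threshold: treated as convergence
        grad_w = sum((p - y) * x for p, y, x in zip(preds, ys, xs)) / n
        grad_b = sum(p - y for p, y in zip(preds, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b, loss


# Toy data: label 1 marks an abnormal transaction (assumed encoding).
xs = [-2.0, -1.0, 1.0, 2.0]
ys = [0, 0, 1, 1]
final_w, final_b, final_loss = train_until_converged(xs, ys)
print(final_w, final_b, final_loss)
```

The `max_steps` cap is a safeguard added for the sketch; the claim itself stops only on the loss threshold.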
5. The method of claim 2, wherein training the updated transaction data identification model again according to the first error identification data and the second error identification data comprises:
fusing the first error identification data, the second error identification data and the updated initial transaction data according to a preset fusion proportion to obtain target transaction data;
inputting the target transaction data into the transaction data identification model for identification processing to obtain a target identification result;
and adjusting model parameters in the updated transaction data identification model according to the target identification result, then returning to the step of fusing the first error identification data, the second error identification data and the updated initial transaction data according to the preset fusion proportion, until the updated transaction data identification model converges.
6. The method of claim 5, wherein adjusting model parameters in the updated transaction data identification model according to the target identification result comprises:
calculating according to a preset loss function to obtain a second loss function value corresponding to the target identification result; the second loss function value is used for adjusting the weight of the updated transaction data identification model in the process of identifying abnormal transaction data;
and adjusting model parameters in the updated transaction data identification model according to the second loss function value.
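By way of illustration and not limitation, the construction of the target transaction data in claim 5 can be sketched as mixing the two kinds of error identification data into the updated initial transaction data. The `(feature, label)` sample layout, the labels given to the hard examples, the interpretation of the fusion proportion as the share of hard examples, and the name `build_target_data` are assumptions for this example only.

```python
from itertools import cycle, islice
from typing import List, Sequence, Tuple

# Hypothetical sample layout: (feature value, label), where label 1 means abnormal.
Sample = Tuple[float, int]


def build_target_data(
    first_error: Sequence[Sample],
    second_error: Sequence[Sample],
    updated_initial: Sequence[Sample],
    ratio: float,
    size: int,
) -> List[Sample]:
    """Fuse hard examples with the updated initial data.

    Assumption: `ratio` is the share of hard examples (first and second error
    identification data together) in the resulting target transaction data.
    Short inputs are repeated cyclically to fill their share.
    """
    hard = list(first_error) + list(second_error)
    if not hard:
        return list(islice(cycle(updated_initial), size))
    n_hard = round(size * ratio)
    return (
        list(islice(cycle(hard), n_hard))
        + list(islice(cycle(updated_initial), size - n_hard))
    )


first_error = [(9000.0, 1)]   # missed abnormal transaction, relabelled as abnormal (assumed)
second_error = [(20.0, 0)]    # wrongly flagged transaction, relabelled as normal (assumed)
updated_initial = [(50.0, 0), (12000.0, 1), (70.0, 0)]

target = build_target_data(first_error, second_error, updated_initial, ratio=0.5, size=6)
print(target)
```

The resulting target transaction data would then be passed through the same identify, compute-loss, adjust-parameters loop as in the first stage, using the second loss function value.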
7. An anomalous transaction data identification device, said device comprising:
the request response module is used for responding to the abnormal transaction data identification request and acquiring the preprocessed initial transaction data, the first abnormal characteristic and the second abnormal characteristic;
the data identification module is used for training a pre-constructed transaction data identification model to identify abnormal transaction data from the initial transaction data according to the first abnormal feature, to obtain an updated transaction data identification model, wherein the transaction data identification model is constructed based on a hard example mining algorithm;
the model updating module is used for training the updated transaction data identification model again according to first error identification data and second error identification data to obtain a target transaction data identification model, wherein the first error identification data is transaction data in the initial transaction data that matches the second abnormal feature but does not belong to the abnormal transaction data, and the second error identification data is transaction data in the abnormal transaction data that does not match the second abnormal feature;
and the result determining module is used for inputting the transaction data to be identified into the target transaction data identification model to obtain the identification result of the transaction data to be identified.
8. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor, when executing the computer program, implements the steps of the method of any of claims 1 to 6.
9. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 6.
10. A computer program product comprising a computer program, characterized in that the computer program realizes the steps of the method of any one of claims 1 to 6 when executed by a processor.
CN202111462122.9A 2021-12-02 2021-12-02 Abnormal transaction data identification method and device, computer equipment and storage medium Pending CN114140238A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111462122.9A CN114140238A (en) 2021-12-02 2021-12-02 Abnormal transaction data identification method and device, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111462122.9A CN114140238A (en) 2021-12-02 2021-12-02 Abnormal transaction data identification method and device, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
CN114140238A true CN114140238A (en) 2022-03-04

Family

ID=80387107

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111462122.9A Pending CN114140238A (en) 2021-12-02 2021-12-02 Abnormal transaction data identification method and device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114140238A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114912549A (en) * 2022-07-11 2022-08-16 支付宝(杭州)信息技术有限公司 Training method of risk transaction identification model, and risk transaction identification method and device
CN114912549B (en) * 2022-07-11 2022-12-13 支付宝(杭州)信息技术有限公司 Training method of risk transaction identification model, and risk transaction identification method and device

Similar Documents

Publication Publication Date Title
US11410187B2 (en) Feature drift hardened online application origination (OAO) service for fraud prevention systems
US11818163B2 (en) Automatic machine learning vulnerability identification and retraining
Zurada Could decision trees improve the classification accuracy and interpretability of loan granting decisions?
TW201734893A (en) Method and apparatus for acquiring score credit and outputting feature vector value
Zhou et al. Fraud detection within bankcard enrollment on mobile device based payment using machine learning
US20220351207A1 (en) System and method for optimization of fraud detection model
CN114187112A (en) Training method of account risk model and determination method of risk user group
Tsai Two‐stage hybrid learning techniques for bankruptcy prediction
Kumar et al. Credit score prediction system using deep learning and k-means algorithms
AU2021290143B2 (en) Machine learning module training using input reconstruction techniques and unlabeled transactions
CN114140238A (en) Abnormal transaction data identification method and device, computer equipment and storage medium
US11916927B2 (en) Systems and methods for accelerating a disposition of digital dispute events in a machine learning-based digital threat mitigation platform
WO2022060709A1 (en) Discriminative machine learning system for optimization of multiple objectives
Harikrishna et al. Credit scoring using support vector machine: a comparative analysis
Wong et al. Weighted random forests for evaluating financial credit risk
Zhang et al. Credit evaluation of SMEs based on GBDT-CNN-LR hybrid integrated model
Lai Default Prediction of Internet Finance Users Based on Imbalance-XGBoost
US11580426B2 (en) Systems and methods for determining relative importance of one or more variables in a nonparametric machine learning model
Zhou Loan Default Prediction Based on Machine Learning Methods
US20230195056A1 (en) Automatic Control Group Generation
US11544715B2 (en) Self learning machine learning transaction scores adjustment via normalization thereof accounting for underlying transaction score bases
Peng et al. Credit scoring model in imbalanced data based on cnn-atcn
Pisula Measuring sovereign credit risk in EU countries using an ensemble of classifiers approach
CA3108609A1 (en) System and method for machine learning based detection of fraud
Didkovskyi et al. Cross-Domain Behavioral Credit Modeling: transferability from private to central data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination