CN117196630A - Transaction risk prediction method, device, terminal equipment and storage medium - Google Patents


Publication number
CN117196630A
Authority
CN
China
Prior art keywords
data
transaction
risk prediction
transaction risk
sample
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310949724.XA
Other languages
Chinese (zh)
Inventor
裴正蒙
洪雪芬
傅杰
雷映雪
马超
王平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Merchants Bank Co Ltd
Original Assignee
China Merchants Bank Co Ltd

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a transaction risk prediction method, a device, terminal equipment and a storage medium. The method comprises: acquiring customer transaction data; performing feature processing on the customer transaction data to obtain first feature data; and inputting the first feature data into a pre-created transaction risk prediction model for prediction to obtain a transaction risk prediction result. By feeding feature-processed data into the pre-created transaction risk prediction model, the method improves the accuracy, and therefore the effectiveness, of transaction risk prediction.

Description

Transaction risk prediction method, device, terminal equipment and storage medium
Technical Field
The present invention relates to the field of data analysis technologies, and in particular, to a risk prediction method, a risk prediction device, a terminal device, and a storage medium.
Background
At present, transaction risk prediction models in the industry's existing financial risk control systems are based on expert rule verification: suspicious transactions are identified by applying pre-existing rules to different scenarios. Such models include pre-event models for corporate transaction anti-fraud. Transaction anti-fraud scenarios fall into three categories: pre-event, in-event and post-event. The pre-event scenario targets transacting customers: it evaluates their potential fraud risk, builds a risk profile of each customer, classifies the customers, and manages the resulting labels.
However, in a conventional risk prediction model the expert rules for each risk situation must be formulated and developed separately, which takes too long, produces too much duplicated work, and yields limited recognition performance. In addition, the conventional transaction risk prediction model evaluates transaction risk using customer transaction data alone, so information utilization is low and the subjective rules provide insufficient coverage. In summary, the industry's current transaction risk prediction capability is limited, and suspicious transactions cannot be accurately located and intercepted in real time; as a result, transaction risk prediction under existing financial risk control systems has low accuracy and poor effectiveness.
Disclosure of Invention
The invention mainly aims to provide a transaction risk prediction method, a device, terminal equipment and a storage medium, aiming at improving the accuracy of transaction risk prediction and further improving the effect of transaction risk prediction.
To achieve the above object, the present invention provides a transaction risk prediction method, which is applied to a transaction risk prediction system, the transaction risk prediction method comprising the steps of:
acquiring customer transaction data;
performing feature processing on the client transaction data to obtain first feature data;
and inputting the first feature data into a pre-created transaction risk prediction model for prediction to obtain a transaction risk prediction result.
Optionally, before the step of inputting the first feature data into a pre-created transaction risk prediction model for prediction to obtain a transaction risk prediction result, the method further comprises:
creating the transaction risk prediction model, which specifically comprises:
acquiring first historical customer data;
obtaining training set sample data based on the first historical customer data;
performing feature processing on the training set sample data to obtain second feature data;
training based on the second characteristic data to obtain the transaction risk prediction model.
Optionally, the training set sample data includes a first type of training sample and a second type of training sample, and the step of obtaining training set sample data based on the first historical client data includes:
carrying out transaction risk prediction based on a preset first time span interval and the first historical customer data to obtain a first customer sample;
classifying the first customer sample based on a transaction tag in the first customer sample to obtain a first type training sample and a second type training sample;
and obtaining the training set sample data based on the first type training samples and the second type training samples.
Optionally, the step of performing feature processing on the customer transaction data to obtain first feature data includes:
data grouping is carried out on the client transaction data to obtain risk association data, internet banking data, log data and transaction data;
and running the feature processing scripts corresponding to the risk association data, the internet banking data, the log data and the transaction data based on big data cluster technology to obtain the first feature data.
Optionally, the training based on the second feature data, the obtaining the transaction risk prediction model includes:
performing machine learning training based on the second characteristic data and a machine learning algorithm to obtain a trained sub-model;
and fusing and weighting the trained sub-models to obtain the transaction risk prediction model.
Optionally, the step of inputting the first feature data into a pre-created transaction risk prediction model to predict, and obtaining a transaction risk prediction result includes:
obtaining verification set sample data based on second historical client data obtained in advance;
and verifying the transaction risk prediction result based on the verification set sample data to obtain the prediction accuracy of the transaction risk prediction model.
Optionally, the step of obtaining the sample data of the verification set based on the second historical client data obtained in advance includes:
carrying out transaction risk prediction based on a preset second time span interval and the second historical client data obtained in advance to obtain a second client sample;
classifying the second customer sample based on the transaction tag in the second customer sample to obtain a first type verification sample and a second type verification sample;
the verification set sample data is obtained based on the first type of verification sample and the second type of verification sample.
In addition, to achieve the above object, the present invention also provides a transaction risk prediction apparatus, including:
the data acquisition module is used for acquiring the client transaction data;
the feature processing module is used for carrying out feature processing on the client transaction data to obtain first feature data;
and the model prediction module is used for inputting the first characteristic data into a pre-established transaction risk prediction model for prediction to obtain a transaction risk prediction result.
Optionally, the feature processing module is further configured to:
data grouping is carried out on the client transaction data to obtain risk association data, internet banking data, log data and transaction data;
and running the feature processing scripts corresponding to the risk association data, the internet banking data, the log data and the transaction data based on big data cluster technology to obtain the first feature data.
Optionally, the model prediction module is further configured to:
the transaction risk prediction model is created, and specifically comprises the following steps:
acquiring first historical customer data;
obtaining training set sample data based on the first historical customer data;
performing feature processing on the training set sample data to obtain second feature data;
training based on the second characteristic data to obtain the transaction risk prediction model.
Optionally, the model prediction module is further configured to:
carrying out transaction risk prediction based on a preset first time span interval and the first historical customer data to obtain a first customer sample;
classifying the first customer sample based on a transaction tag in the first customer sample to obtain a first type training sample and a second type training sample;
and obtaining the training set sample data based on the first type training samples and the second type training samples.
Optionally, the model prediction module is further configured to:
performing machine learning training based on the second characteristic data and a machine learning algorithm to obtain a trained sub-model;
and fusing and weighting the trained sub-models to obtain the transaction risk prediction model.
Optionally, the model prediction module is further configured to:
obtaining verification set sample data based on second historical client data obtained in advance;
and verifying the transaction risk prediction result based on the verification set sample data to obtain the prediction accuracy of the transaction risk prediction model.
Optionally, the model prediction module is further configured to:
carrying out transaction risk prediction based on a preset second time span interval and the second historical client data obtained in advance to obtain a second client sample;
classifying the second customer sample based on the transaction tag in the second customer sample to obtain a first type verification sample and a second type verification sample;
the verification set sample data is obtained based on the first type of verification sample and the second type of verification sample.
In addition, to achieve the above object, the present invention also provides a terminal device including a memory, a processor, and a transaction risk prediction program stored on the memory and executable on the processor, the transaction risk prediction program implementing the transaction risk prediction method as described above when executed by the processor.
In addition, in order to achieve the above object, the present invention also provides a computer-readable storage medium having stored thereon a transaction risk prediction program which, when executed by a processor, implements the transaction risk prediction method as described above.
The embodiment of the invention provides a transaction risk prediction method, a device, terminal equipment and a storage medium, which are used for acquiring customer transaction data; performing feature processing on the client transaction data to obtain first feature data; and inputting the first characteristic data into a pre-established transaction risk prediction model for prediction to obtain a transaction risk prediction result. According to the embodiment, the obtained customer transaction data is subjected to feature processing, and the processed first feature data is input into the pre-created transaction risk prediction model for prediction, so that the accuracy of transaction risk prediction is improved, and further the transaction risk prediction effect is improved.
Drawings
FIG. 1 is a schematic diagram of functional modules of a terminal device to which a transaction risk prediction device of the present invention belongs;
FIG. 2 is a flowchart of a transaction risk prediction method according to a first exemplary embodiment of the present invention;
FIG. 3 is a flow chart of feature processing and model training in a first exemplary embodiment of a transaction risk prediction method according to the present invention;
FIG. 4 is a flowchart of a transaction risk prediction method according to a second exemplary embodiment of the present invention;
FIG. 5 is a flowchart of a transaction risk prediction method according to a third exemplary embodiment of the present invention;
FIG. 6 is a flowchart of a fourth exemplary embodiment of a transaction risk prediction method according to the present invention.
The achievement of the objects, functional features and advantages of the present invention will be further described with reference to the accompanying drawings, in conjunction with the embodiments.
Detailed Description
It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
The main solutions of the embodiments of the present invention are: acquiring customer transaction data; performing feature processing on the client transaction data to obtain first feature data; and inputting the first characteristic data into a pre-established transaction risk prediction model for prediction to obtain a transaction risk prediction result.
The embodiments of the present application recognize that current transaction risk prediction methods based on expert rule verification have the following shortcomings: the expert rules for each risk situation must be formulated and developed separately for customer profiling, which takes too long, produces too much duplicated work, and yields limited recognition performance. In addition, the conventional transaction risk prediction model evaluates transaction risk using customer transaction data alone, so information utilization is low and the subjective rules provide insufficient coverage. In summary, the industry's current transaction risk prediction capability is limited, and suspicious transactions cannot be accurately located and intercepted in real time; as a result, transaction risk prediction under existing financial risk control systems has low accuracy and poor effectiveness.
Based on the above, the embodiment of the application provides a solution, and the obtained customer transaction data is subjected to feature processing, and the processed first feature data is input into a pre-created transaction risk prediction model for prediction, so that the accuracy of transaction risk prediction is improved, and the transaction risk prediction effect is improved.
Specifically, referring to fig. 1, fig. 1 is a schematic diagram of functional modules of a terminal device to which the transaction risk prediction device of the present application belongs. The transaction risk prediction device may be a device independent of the terminal device, capable of performing transaction risk prediction and recommendation, and may be carried on the terminal device in a form of hardware or software. The terminal device may be an intelligent mobile terminal with a data processing function, or may be a fixed terminal device or a server with a data processing function, and in addition, the transaction risk prediction device may also be carried in a transaction risk prediction system.
In this embodiment, the terminal device to which the transaction risk prediction apparatus belongs at least includes an output module 110, a processor 120, a memory 130, and a communication module 140.
The memory 130 stores an operating system and a transaction risk prediction program; the output module 110 may be a display screen or the like. The communication module 140 may include a WIFI module, a mobile communication module, a bluetooth module, and the like, and communicates with an external device or a server through the communication module 140.
Wherein the transaction risk prediction program in the memory 130 when executed by the processor performs the steps of:
acquiring customer transaction data;
performing feature processing on the client transaction data to obtain first feature data;
and inputting the first characteristic data into a pre-established transaction risk prediction model for prediction to obtain a transaction risk prediction result.
Further, the transaction risk prediction program in the memory 130, when executed by the processor, further performs the steps of:
the transaction risk prediction model is created, and specifically comprises the following steps:
acquiring first historical customer data;
obtaining training set sample data based on the first historical customer data;
performing feature processing on the training set sample data to obtain second feature data;
training based on the second feature data to obtain the transaction risk prediction model.
Further, the transaction risk prediction program in the memory 130, when executed by the processor, further performs the steps of:
carrying out transaction risk prediction based on a preset first time span interval and the first historical customer data to obtain a first customer sample;
classifying the first customer sample based on a transaction tag in the first customer sample to obtain a first type training sample and a second type training sample;
and obtaining the training set sample data based on the first type training samples and the second type training samples.
Further, the transaction risk prediction program in the memory 130, when executed by the processor, further performs the steps of:
data grouping is carried out on the client transaction data to obtain risk association data, internet banking data, log data and transaction data;
and running the feature processing scripts corresponding to the risk association data, the internet banking data, the log data and the transaction data based on big data cluster technology to obtain the first feature data.
Further, the transaction risk prediction program in the memory 130, when executed by the processor, further performs the steps of:
performing machine learning training based on the second feature data and a machine learning algorithm to obtain trained sub-models;
and fusing and weighting the trained sub-models to obtain the transaction risk prediction model.
Further, the transaction risk prediction program in the memory 130, when executed by the processor, further performs the steps of:
obtaining verification set sample data based on second historical client data obtained in advance;
and verifying the transaction risk prediction result based on the verification set sample data to obtain the prediction accuracy of the transaction risk prediction model.
Further, the transaction risk prediction program in the memory 130, when executed by the processor, further performs the steps of:
carrying out transaction risk prediction based on a preset second time span interval and the second historical client data obtained in advance to obtain a second client sample;
classifying the second customer sample based on the transaction tag in the second customer sample to obtain a first type verification sample and a second type verification sample;
the verification set sample data is obtained based on the first type of verification sample and the second type of verification sample.
According to the embodiment, through the scheme, the client transaction data are obtained; performing feature processing on the client transaction data to obtain first feature data; and inputting the first characteristic data into a pre-established transaction risk prediction model for prediction to obtain a transaction risk prediction result. According to the embodiment, the obtained customer transaction data is subjected to feature processing, and the processed first feature data is input into the pre-created transaction risk prediction model for prediction, so that the accuracy of transaction risk prediction is improved, and further the transaction risk prediction effect is improved.
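The sub-model fusion and verification steps listed above can be illustrated with a small sketch. The application does not say how the sub-models are fused and weighted or how accuracy is computed, so the weighted average, the weights, the 0.5 decision threshold and every name below are assumptions made purely for illustration.

```python
def fuse_scores(sub_model_scores, weights):
    """Weighted average of the fraud scores that several sub-models give one customer."""
    total_weight = sum(weights)
    return sum(w * s for w, s in zip(weights, sub_model_scores)) / total_weight


def accuracy(fused_scores, labels, threshold=0.5):
    """Share of verification samples whose thresholded score matches the label."""
    predictions = [1 if score >= threshold else 0 for score in fused_scores]
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return correct / len(labels)


# Hypothetical scores from three sub-models for four verification-set customers.
scores_per_customer = [
    [0.9, 0.8, 0.7],
    [0.2, 0.1, 0.4],
    [0.6, 0.3, 0.2],
    [0.1, 0.2, 0.1],
]
weights = [0.5, 0.3, 0.2]   # assumed fusion weights
labels = [1, 0, 1, 0]       # 1 = transaction tagged as risky, 0 = normal

fused = [fuse_scores(scores, weights) for scores in scores_per_customer]
verification_accuracy = accuracy(fused, labels)
```

In practice the weights could themselves be tuned on held-out data; here they are fixed constants to keep the sketch short.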
The method embodiment of the application is proposed based on the above-mentioned terminal equipment architecture but not limited to the above-mentioned architecture.
Referring to fig. 2, fig. 2 is a flowchart illustrating a transaction risk prediction method according to a first exemplary embodiment of the present application. The transaction risk prediction method comprises the following steps:
step S10, obtaining customer transaction data;
Specifically, this embodiment applies to transaction anti-fraud scenarios under a risk control system. It provides a transaction risk prediction method that uses a transaction risk prediction model to predict and evaluate, in advance, the risks that may arise in a customer's transactions; the model is therefore also a pre-event model for corporate transaction anti-fraud. Transaction anti-fraud scenarios can be divided into three categories: pre-event, in-event and post-event. The pre-event scenario targets transacting customers: it evaluates their potential fraud risk, builds a risk profile of each customer, classifies the customers, and manages the resulting labels. In this embodiment the banking system acquires the customer transaction data, which the bank records and processes in its own systems. When a customer performs a banking operation (e.g., a deposit, withdrawal or transfer), the banking system captures and records the transaction data. When a customer uses a bank card for a purchase or other transaction, the transaction data is recorded in the bank's payment system, from which the bank can retrieve it. In some cases the bank may also establish data-sharing relationships with third-party partners, such as other financial institutions, payment service providers or data providers, to obtain more comprehensive customer transaction data. Taking day T as an example, each day the banking system selects the relevant customer transaction data from day T-1 for feature processing and risk prediction.
The customer transaction data may include internet banking operation data, operation log data, and IP (Internet Protocol; an address that uniquely identifies a device on a network) and MAC (Media Access Control; the physical address of a network interface card) analysis data, and may further include risk relationship data, transaction information data and transaction account data for the customer's transactions.
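The daily T-1 selection described for step S10 can be sketched as a simple date filter. This is a minimal illustration only; the record layout, the field names (`customer_id`, `txn_date`, `amount`) and the function name are assumptions, not part of the application.

```python
from datetime import date, timedelta


def select_previous_day(records, run_date):
    """Return the records whose transaction date is run_date minus one day (T-1)."""
    target = run_date - timedelta(days=1)
    return [record for record in records if record["txn_date"] == target]


# Hypothetical records; a real system would read these from the bank's data stores.
records = [
    {"customer_id": "C001", "txn_date": date(2023, 7, 30), "amount": 1200.0},
    {"customer_id": "C002", "txn_date": date(2023, 7, 29), "amount": 80.0},
    {"customer_id": "C003", "txn_date": date(2023, 7, 30), "amount": 560.0},
]

# On day T = 2023-07-31, only the two records dated 2023-07-30 are selected.
batch = select_previous_day(records, run_date=date(2023, 7, 31))
```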
Step S20, carrying out feature processing on the customer transaction data to obtain first feature data;
Specifically, in this embodiment, feature processing is performed on the customer transaction data to extract more valuable features, which improves the performance of the machine learning model, reduces the dimensionality of the feature space, and improves the efficiency of risk prediction. Feature processing here means processing and transforming the customer transaction data.
Step S30, inputting the first feature data into a pre-created transaction risk prediction model for prediction to obtain a transaction risk prediction result.
Specifically, this embodiment performs risk prediction by inputting the first feature data into a pre-created transaction risk prediction model built with a machine learning algorithm. The risk prediction result is a comprehensive fraud risk score for the customer. The model is trained on historical customer data, which can be divided into good sample data and bad sample data: good samples serve as positive learning examples and bad samples as negative ones. Samples can be classified as good or bad according to the transaction tags in the historical customer data. The machine learning algorithm used in this embodiment is LGBM (Light Gradient Boosting Machine), which is based on gradient boosting decision trees. LGBM is known for its efficiency and low memory consumption and performs well on large data sets. It trains multiple decision trees iteratively, using gradient boosting to improve the model's predictive power step by step, and is widely used for classification, regression and ranking tasks. In this embodiment, a machine learning model built with this algorithm predicts customer risk and yields the transaction risk prediction result. The result is also reported to the relevant departments for handling, so that customers can be warned or further transaction security controls applied.
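To make the idea of iteratively trained trees concrete, the following is a minimal pure-Python sketch of gradient boosting with one-split decision stumps and log-loss. It is not LightGBM and omits that library's histogram binning and leaf-wise tree growth; the feature columns, the toy data, the label convention (1 = bad sample, 0 = good sample) and all function names are assumptions for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_stump(rows, residuals):
    """Pick the one-feature threshold split whose leaf means best fit the residuals."""
    overall = sum(residuals) / len(residuals)
    # Fallback when no feature has two distinct values: a single constant leaf.
    best = (float("inf"), 0, float("inf"), overall, overall)
    for j in range(len(rows[0])):
        values = sorted({row[j] for row in rows})
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2.0
            left = [r for row, r in zip(rows, residuals) if row[j] <= threshold]
            right = [r for row, r in zip(rows, residuals) if row[j] > threshold]
            left_mean = sum(left) / len(left)
            right_mean = sum(right) / len(right)
            sse = (sum((r - left_mean) ** 2 for r in left)
                   + sum((r - right_mean) ** 2 for r in right))
            if sse < best[0]:
                best = (sse, j, threshold, left_mean, right_mean)
    return best[1:]


def stump_output(stump, row):
    j, threshold, left_value, right_value = stump
    return left_value if row[j] <= threshold else right_value


def train_boosted_stumps(rows, labels, rounds=20, learning_rate=0.5):
    """Each round fits a stump to the negative log-loss gradient (label - probability)."""
    stumps, raw_scores = [], [0.0] * len(rows)
    for _ in range(rounds):
        residuals = [y - sigmoid(s) for y, s in zip(labels, raw_scores)]
        stump = fit_stump(rows, residuals)
        stumps.append(stump)
        raw_scores = [s + learning_rate * stump_output(stump, row)
                      for s, row in zip(raw_scores, rows)]
    return stumps, learning_rate


def fraud_score(model, row):
    """Probability-like risk score in (0, 1) for one feature row."""
    stumps, learning_rate = model
    return sigmoid(sum(learning_rate * stump_output(s, row) for s in stumps))


# Hypothetical features: [night-time logins, transfer amount in thousands].
rows = [[0, 1.0], [1, 2.0], [0, 1.5], [5, 9.0], [6, 8.0], [4, 9.5]]
labels = [0, 0, 0, 1, 1, 1]  # assumed convention: 1 = bad sample, 0 = good sample
model = train_boosted_stumps(rows, labels)
```

A production system would more likely call an existing gradient boosting library than hand-roll the loop; the sketch only shows how each new tree corrects the errors left by the previous ones.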
Further, this embodiment refines step S20, performing feature processing on the customer transaction data to obtain first feature data.
In this embodiment, step S20, performing feature processing on the customer transaction data to obtain first feature data includes:
step A, data grouping is carried out on the client transaction data to obtain risk association data, internet banking data, log data and transaction data;
Specifically, this embodiment first groups the customer transaction data. The customer transaction data specifically includes internet banking operation data, operation log data, and IP and MAC analysis data, and may also include risk relationship data, transaction information data and transaction account data for the customer's transactions. The internet banking operation data is classified as internet banking data; the operation log data is classified as log data; the IP and MAC analysis data and the risk relationship data are classified as risk association data; and the transaction information data and the transaction account data are classified as transaction data.
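The grouping in step A amounts to a fixed mapping from data source to category, which can be sketched as follows. The mapping itself follows the paragraph above; the key strings, the `source` field and the function name are hypothetical.

```python
from collections import defaultdict

# Source type -> group, as described in step A. The key names are assumptions.
GROUP_OF_SOURCE = {
    "internet_banking_operation": "internet_banking",
    "operation_log": "log",
    "ip_mac_analysis": "risk_association",
    "risk_relationship": "risk_association",
    "transaction_info": "transaction",
    "transaction_account": "transaction",
}


def group_customer_data(records):
    """Bucket raw records into the four groups used for feature processing."""
    groups = defaultdict(list)
    for record in records:
        groups[GROUP_OF_SOURCE[record["source"]]].append(record)
    return dict(groups)


# Hypothetical raw records for one customer.
raw = [
    {"source": "ip_mac_analysis", "ip": "203.0.113.7"},
    {"source": "operation_log", "event": "login"},
    {"source": "risk_relationship", "linked_account": "A-17"},
    {"source": "transaction_info", "amount": 300.0},
]
grouped = group_customer_data(raw)
```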
Step B, running the feature processing scripts corresponding to the risk association data, the internet banking data, the log data and the transaction data based on big data cluster technology to obtain the first feature data.
Specifically, referring to FIG. 3, FIG. 3 is a schematic flow chart of feature processing and model training in the method of the present application. As shown in FIG. 3, this embodiment runs the feature processing scripts on the risk association data, the internet banking data, the log data and the transaction data. Feature processing of the risk association data (the IP and MAC analysis data and the risk relationship data) yields risk association features; feature processing of the internet banking data yields internet banking operation features and internet banking derived features; feature processing of the log data yields log operation features; and feature processing of the transaction data yields transaction features and static transaction account features. Taking the IP and MAC analysis data as an example: by analyzing IP and MAC addresses, the banking system can obtain the login address information of a customer account and determine whether the account has been logged into from an unusual location. When processing the IP and MAC analysis data and the risk relationship data, feature scaling (for example normalization and standardization) can bring feature values that span different ranges into similar ranges; feature encoding can convert non-numerical features into numerical ones so that machine learning algorithms can use them; and feature selection can also be applied, with filter, wrapper and embedded methods all being applicable.
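The scaling and encoding operations mentioned above can be sketched in a few lines of plain Python. The application does not fix particular formulas, so min-max normalization, z-score standardization and one-hot encoding are used here as common choices; the sample values and function names are assumptions.

```python
from statistics import mean, pstdev


def min_max_scale(values):
    """Normalization: map values linearly onto the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Standardization: shift to zero mean and scale to unit variance."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def one_hot(values):
    """Feature encoding: turn a categorical column into 0/1 indicator columns."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values]


amounts = [100.0, 300.0, 500.0]                     # hypothetical numeric feature
login_cities = ["Shenzhen", "Beijing", "Shenzhen"]  # hypothetical categorical feature

scaled = min_max_scale(amounts)
z_scores = standardize(amounts)
encoded = one_hot(login_cities)  # columns ordered as ["Beijing", "Shenzhen"]
```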
More specifically, this embodiment processes data using big data cluster technology based on Hadoop, a distributed big data processing framework commonly used to build big data clusters. A Hadoop cluster consists of multiple computer nodes, each with storage and computing capacity, for handling large-scale data sets. In addition, this embodiment uses PySpark to develop the feature processing scripts for the risk association data, the internet banking data, the log data and the transaction data. PySpark is the Python programming interface to Spark, an open-source distributed computing framework, and is used for big data processing and analysis on Spark clusters. It brings Python's flexibility and ease of use to large-scale data processing and provides a rich set of functions and APIs (application programming interfaces) for loading, transforming, analyzing, and visualizing data. The steps for feature processing with PySpark in this embodiment may include: first, importing the necessary libraries and modules; second, creating an object; then, loading the data set; and finally, performing feature selection and transformation.
Through the above scheme, this embodiment acquires customer transaction data; performs feature processing on the customer transaction data to obtain first feature data; and inputs the first feature data into a pre-created transaction risk prediction model for prediction to obtain a transaction risk prediction result. Specifically, the embodiment is based on Hadoop big data cluster technology and uses PySpark for feature processing; it introduces the customers' internet banking features and log features, which increases the information dimensions of the training data and thereby improves the accuracy and robustness of the model. The embodiment also trains the machine learning model with the LGBM algorithm, which processes large-scale data sets quickly, has low memory consumption, and supports efficient training and prediction. By performing feature processing on the acquired customer transaction data and inputting the processed first feature data into the pre-created transaction risk prediction model, the embodiment improves the accuracy, and therefore the effectiveness, of transaction risk prediction.
Referring to fig. 4, fig. 4 is a flowchart illustrating a transaction risk prediction method according to a second exemplary embodiment of the present application.
Based on the first embodiment, a second embodiment of the present application is proposed, which differs from the first embodiment in that:
In this embodiment, before step S30 of inputting the first feature data into a pre-created transaction risk prediction model for prediction to obtain the transaction risk prediction result, the method further includes:
step S25, creating the transaction risk prediction model;
Specifically, the present embodiment creates a transaction risk prediction model in advance, and the machine learning model it employs uses the LGBM algorithm. LGBM trains multiple decision trees iteratively and uses gradient boosting to continuously improve the predictive power of the model. It is widely applied in machine learning tasks including classification, regression, and ranking. In this embodiment, a machine learning model is built with this algorithm to predict the risk of the client and obtain a transaction risk prediction result.
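To make the idea of iteratively training decision trees with gradient boosting concrete, the following toy sketch may help. It is not LightGBM and not the embodiment's model: it fits depth-1 trees (stumps) on a single feature to the residuals of a squared-error loss, and the function names, the number of rounds and the learning rate are illustrative assumptions.

```python
def fit_stump(xs, residuals):
    """Find the single threshold split that minimizes squared error on the residuals."""
    candidates = sorted(set(xs))[:-1]  # drop the largest value so both sides stay non-empty
    if not candidates:
        raise ValueError("need at least two distinct feature values")
    best = None
    for threshold in candidates:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = sum((r - left_mean) ** 2 for r in left) + sum(
            (r - right_mean) ** 2 for r in right
        )
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]


def fit_boosted_stumps(xs, ys, n_rounds=20, learning_rate=0.5):
    """Train stumps one after another, each on the errors left by the previous ones."""
    base = sum(ys) / len(ys)  # initial prediction: the mean target
    predictions = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        # For squared error, the negative gradient is simply the residual.
        residuals = [y - p for y, p in zip(ys, predictions)]
        threshold, left_value, right_value = fit_stump(xs, residuals)
        stumps.append((threshold, left_value, right_value))
        predictions = [
            p + learning_rate * (left_value if x <= threshold else right_value)
            for x, p in zip(xs, predictions)
        ]
    return base, learning_rate, stumps


def predict(model, x):
    """Sum the base value and the scaled contribution of every stump."""
    base, learning_rate, stumps = model
    return base + sum(
        learning_rate * (left if x <= threshold else right)
        for threshold, left, right in stumps
    )
```

Each round only corrects what the earlier trees got wrong, which is the property the paragraph above attributes to gradient boosting. A production library adds multi-feature trees, regularization and histogram-based split finding on top of this basic loop.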
Further, in step S25, the present embodiment further refines creating the transaction risk prediction model.
In this embodiment, in step S25, creating the transaction risk prediction model includes:
step S251, obtaining historical customer data;
Specifically, the embodiment can be applied to a bank risk control system, and the historical customer data can be obtained from the banking system. The time span from which samples are drawn from the historical customer data can be chosen according to actual requirements, and the transaction tags in the historical customer data are used for classification.
Step S252, training set sample data is obtained based on the historical client data;
specifically, the present embodiment may obtain training set sample data based on the historical client data obtained in the above step S251 to complete training of the machine learning model.
Step S253, performing feature processing on the training set sample data to obtain second feature data;
Specifically, in this embodiment, the training set sample data may be processed by feature scripts developed with the PySpark tool; for the specific steps, refer to the first embodiment, which are not repeated here.
And step S254, training based on the second characteristic data to obtain the transaction risk prediction model.
Specifically, the present embodiment trains the machine learning model based on the second feature data and a decision tree algorithm (i.e., LGBM) to obtain sub-models. By weighting the sub-models according to their AUC (Area Under the ROC Curve), a machine learning performance index, a comprehensive fraud risk score for the customer can be calculated; that is, the risk prediction model is obtained.
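The area under the ROC curve used above as the sub-model performance index can be computed directly from labels and scores. The sketch below uses the pairwise definition (the fraction of fraud/non-fraud pairs in which the fraud sample receives the higher score, with ties counted as one half); the function name and the convention that label 1 marks a fraudulent sample are assumptions for illustration.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0  # positive ranked above negative
            elif p == n:
                wins += 0.5  # a tie counts as half a win
    return wins / (len(positives) * len(negatives))


# Labels: 1 = fraudulent sample, 0 = normal sample.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

The double loop is quadratic in the number of samples, so it is only suitable for small checks; library implementations use a sort-based computation instead.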
Further, in step S252, obtaining the training set sample data based on the historical client data (hereinafter, the first historical client data) is further refined.
In this embodiment, step S252, obtaining training set sample data based on the first historical client data may include:
step S2521, carrying out transaction risk prediction based on a preset first time span interval and the first historical customer data to obtain a first customer sample;
The method of this embodiment is particularly applicable to a bank risk control system. The first time span interval is the time span from which the first customer sample is selected out of the first historical customer data. In this embodiment, customer data corresponding to a certain time node in the first historical customer data is selected, and the first time span interval covers the year preceding that time node.
Step S2522, classifying the first customer sample based on the transaction tag in the first customer sample to obtain a first class training sample and a second class training sample;
Specifically, the present embodiment classifies the clients based on the first time span interval and the transaction tag into first type training samples and second type training samples. The first type training sample is a good sample, namely a set of non-fraudulent transaction behaviors of clients; the second type training sample is a bad sample, namely a set of fraudulent transaction behaviors of clients. The transaction tag may include information such as the category, time, amount, merchant information and payment method of the customer transaction. In this embodiment, a time node is chosen and the historical client data is used to obtain the corresponding target clients. If the transaction tags of a target client show no fraudulent transaction behavior within the time span interval (the year up to the time node), one piece of transaction behavior information within the interval is randomly selected and, together with its transaction time, is used as a first type training sample; if the transaction tags show fraudulent transaction behavior within the interval, one piece of transaction behavior information within the interval is randomly selected and, together with its transaction time, is used as a second type training sample.
Step S2523 is to obtain the training set sample data based on the first type training sample and the second type training sample.
Specifically, in this embodiment, after the first type training sample and the second type training sample are obtained in step S2522, the first type training sample and the second type training sample are integrated, and finally the training set sample data is obtained, so that the training set sample data is input into a machine learning model and trained, and thus scoring of comprehensive fraud risk of a customer is completed, and a transaction risk prediction result is obtained.
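Steps S2521 to S2523 can be sketched in plain Python as follows. This is a minimal sketch under stated assumptions: the input layout (a dictionary mapping a customer ID to `(timestamp, is_fraud)` pairs), the 365-day approximation of the one-year window, and all names are hypothetical. Customers with no fraudulent transaction in the window yield a first type sample (label 0), and customers with at least one yield a second type sample (label 1).

```python
import random
from datetime import datetime, timedelta


def build_training_samples(transactions_by_customer, time_node, rng=None, window_days=365):
    """Draw one labelled sample per customer from the year preceding `time_node`."""
    rng = rng or random.Random()
    window_start = time_node - timedelta(days=window_days)
    first_type, second_type = [], []
    for customer_id, transactions in transactions_by_customer.items():
        # S2521: keep only the transactions inside the first time span interval.
        in_window = [t for t in transactions if window_start <= t[0] <= time_node]
        if not in_window:
            continue  # no activity in the window, so no sample for this customer
        # S2522: classify the customer by the fraud flag carried in the transaction tag.
        has_fraud = any(is_fraud for _, is_fraud in in_window)
        sample_time, _ = rng.choice(in_window)  # randomly selected transaction
        sample = {
            "customer_id": customer_id,
            "sample_time": sample_time,
            "label": int(has_fraud),
        }
        (second_type if has_fraud else first_type).append(sample)
    # S2523: integrate both classes into the training set sample data.
    return first_type + second_type


history = {
    "c1": [(datetime(2023, 1, 10), False), (datetime(2023, 3, 5), False)],
    "c2": [(datetime(2023, 2, 1), False), (datetime(2023, 4, 20), True)],
}
training_set = build_training_samples(history, datetime(2023, 6, 30), rng=random.Random(0))
```

Passing a seeded `random.Random` keeps the sampling reproducible, which helps when a training set has to be rebuilt for comparison.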
According to the embodiment, through the scheme, the client transaction data are obtained; performing feature processing on the client transaction data to obtain first feature data; and inputting the first characteristic data into a pre-established transaction risk prediction model for prediction to obtain a transaction risk prediction result. According to the embodiment, the obtained customer transaction data is subjected to feature processing, and the processed first feature data is input into the pre-created transaction risk prediction model for prediction, so that the accuracy of transaction risk prediction is improved, and further the transaction risk prediction effect is improved.
Referring to fig. 5, fig. 5 is a flowchart illustrating a transaction risk prediction method according to a third exemplary embodiment of the present application.
Based on the second embodiment, a third embodiment of the present application is proposed, which differs from the second embodiment in that:
In this embodiment, step S254 of training based on the second feature data to obtain the transaction risk prediction model is further refined.
In this embodiment, step S254 of training based on the second feature data may include:
step S2541, performing machine learning training based on the second characteristic data and a machine learning algorithm to obtain a trained sub-model;
Specifically, the present embodiment trains the machine learning model based on the second feature data and a decision tree algorithm (i.e., LGBM) to obtain a sub-model. In this embodiment, data processing is performed on the training set sample data to obtain the second feature data, the model is then trained with the second feature data, and the performance of the model is optimized by adjusting its hyperparameters. The training process generally involves an iterative optimization algorithm, such as gradient descent.
And step S2542, fusing and weighting the trained sub-models to obtain the transaction risk prediction model.
Specifically, as shown in fig. 3, the present embodiment first selects a plurality of sub-models that perform well; these sub-models may be trained using different algorithms or different second feature data. Model fusion is then performed by running predictions on the test set and combining the prediction results of the sub-models. Common fusion methods include voting, averaging and stacking. The choice of fusion method depends on the specific situation and the type of problem; this embodiment does not limit it, and it can be selected according to the actual situation during implementation. For the result of model fusion, each sub-model may be assigned a weight. Weights may be assigned based on the performance, accuracy, or other evaluation criteria of the sub-models on the validation set. A better-performing sub-model receives a higher weight and therefore plays a larger role in the final risk prediction model. The transaction risk prediction model is stored in a model version management system.
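One possible reading of the fusion and weighting in step S2542 is a weighted average in which each sub-model's weight is proportional to its validation metric. The sketch below assumes that reading; the function name, the normalization rule and the sample numbers are illustrative, and voting or stacking would be equally valid choices under the text.

```python
def fuse_predictions(sub_model_scores, validation_metrics):
    """Weighted average of sub-model scores, with weights proportional to each metric.

    sub_model_scores: one list of risk scores per sub-model, all of equal length.
    validation_metrics: one validation metric (for example AUC) per sub-model.
    """
    if len(sub_model_scores) != len(validation_metrics):
        raise ValueError("one validation metric is required per sub-model")
    total = sum(validation_metrics)
    if total <= 0:
        raise ValueError("validation metrics must sum to a positive value")
    weights = [metric / total for metric in validation_metrics]  # weights sum to 1
    n_samples = len(sub_model_scores[0])
    fused = [
        sum(weight * scores[i] for weight, scores in zip(weights, sub_model_scores))
        for i in range(n_samples)
    ]
    return weights, fused


# Two sub-models scoring two customers; the first sub-model validated better.
weights, fused = fuse_predictions([[0.2, 0.8], [0.4, 0.6]], [0.9, 0.6])
```

With these inputs the weights come out as roughly 0.6 and 0.4, so the better-validated sub-model dominates the fused score, as described above.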
According to the embodiment, through the scheme, the client transaction data are obtained; performing feature processing on the client transaction data to obtain first feature data; and inputting the first characteristic data into a pre-established transaction risk prediction model for prediction to obtain a transaction risk prediction result. The embodiment particularly carries out machine learning model training based on the LGBM algorithm, has high speed in processing large-scale data sets, low memory consumption and can carry out training and prediction efficiently. According to the embodiment, the obtained customer transaction data is subjected to feature processing, and the processed first feature data is input into the pre-created transaction risk prediction model for prediction, so that the accuracy of transaction risk prediction is improved, and further the transaction risk prediction effect is improved.
Referring to fig. 6, fig. 6 is a flowchart illustrating a transaction risk prediction method according to a fourth exemplary embodiment of the present application.
Based on the first embodiment, a fourth embodiment of the present application is proposed, which differs from the first embodiment in that:
In this embodiment, after step S30 of inputting the first feature data into a pre-created risk prediction model for prediction to obtain a risk prediction result, the method may further include:
step S40, obtaining sample data of a verification set based on second historical client data acquired in advance;
Specifically, the present embodiment first acquires second historical client data. The clients stored in the second historical client data should be the same clients as those in the first historical client data, but the timestamps of their stored transaction behavior information are different. Because the verification set sample data is used to verify the effect of the risk prediction model, the time span selected when obtaining the verification set sample data from the second historical client data must be more than 3 months later than the transaction times of the client transaction behaviors in the training set sample data. It should be noted that the lag of more than 3 months is chosen in this embodiment for the reliability of the verification result; in an actual implementation, the lag of the verification set data relative to the training set data can be set as required.
And step S50, verifying the transaction risk prediction result based on the verification set sample data to obtain the prediction accuracy of the transaction risk prediction model.
Specifically, in this embodiment, after the verification set data is acquired, a verification task is set up and verification is performed based on the transaction risk prediction model and the verification set data. In addition, during this process, the hyperparameters of the transaction risk prediction model can be further tuned based on the verification set data, so that model performance is further improved. In this way, the prediction accuracy of the transaction risk prediction model is obtained, and the actual effect of the risk prediction model is judged based on that accuracy.
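One simple way to adjust hyperparameters against verification data is an exhaustive grid search. The sketch below is an assumption about how such tuning could be organized, not the embodiment's procedure: `evaluate` is a hypothetical callable that would train a model with the given hyperparameters and return its score on the verification set, and here it is replaced by a toy function so the example runs on its own. `learning_rate` and `num_leaves` are used only as example parameter names.

```python
from itertools import product


def grid_search(param_grid, evaluate):
    """Try every hyperparameter combination and keep the best-scoring one."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = evaluate(params)  # higher is better
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_evaluate(params):
    """Stand-in for 'train, then score on the verification set' (hypothetical)."""
    return -((params["learning_rate"] - 0.1) ** 2) - (params["num_leaves"] - 31) ** 2


grid = {"learning_rate": [0.01, 0.1, 0.3], "num_leaves": [15, 31, 63]}
best_params, best_score = grid_search(grid, toy_evaluate)
```

Tuning against the verification set means that set no longer gives an unbiased estimate of accuracy, so a separate hold-out period is usually kept for the final check.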
Further, in step S40, the verification set sample data is obtained based on the second history client data obtained in advance, and is refined.
In this embodiment, step S40, based on the second historical client data acquired in advance, the obtaining verification set sample data includes:
step S401, carrying out transaction risk prediction based on a preset second time span interval and the second historical client data acquired in advance to obtain a second client sample;
Specifically, the preset second time span interval in this embodiment is the interval from which the transaction times of the client transaction behaviors in the verification set sample data are taken. The clients stored in the second historical client data should be the same clients as those in the first historical client data, but the timestamps of their stored transaction behavior information are different. Because the verification set sample data is used to verify the effect of the risk prediction model, the time span selected when obtaining the verification set sample data from the second historical client data must be more than 3 months later than the transaction times of the client transaction behaviors in the training set sample data. It should be noted that the lag of more than 3 months is chosen in this embodiment for the reliability of the verification result; in an actual implementation, the lag of the verification set data relative to the training set data can be set as required. The time span interval from which the verification set samples are drawn is the current month.
Step S402, classifying the second customer sample based on the transaction label in the second customer sample to obtain a first type verification sample and a second type verification sample;
Specifically, the present embodiment selects the client data at a certain time node as the verification set, and that time node is more than 3 months later than the time node of the training set. The time span interval from which samples are drawn is the current month. Selection range of the first type verification samples: customer data within that month is taken, whether a customer is a first type verification sample is determined based on the transaction tag, and the sample transaction times are taken to form the first type verification sample set. Selection range of the second type verification samples: for all other clients in that month who are not first type verification samples, one transaction time within the month is randomly taken for each to form the second type verification sample set.
Step S403, obtaining the verification set sample data based on the first type verification sample and the second type verification sample.
Specifically, in this embodiment, after the first type verification sample and the second type verification sample are obtained in step S402, the two are integrated to obtain the verification set sample data, which is then input into the machine learning model (LGBM model), thereby completing the effect verification of the risk prediction model and obtaining its prediction accuracy.
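The time lag between training and verification data, and the resulting prediction accuracy, can be sketched as follows. This is a minimal sketch under stated assumptions: samples are dictionaries with a `sample_time` field, the lag is measured from the latest training transaction time, a fixed threshold of 0.5 turns risk scores into fraud or non-fraud decisions, and all names are hypothetical.

```python
import calendar
from datetime import datetime


def add_months(moment, months):
    """Shift a datetime by whole calendar months, clamping the day to the month length."""
    month_index = moment.month - 1 + months
    year = moment.year + month_index // 12
    month = month_index % 12 + 1
    day = min(moment.day, calendar.monthrange(year, month)[1])
    return moment.replace(year=year, month=month, day=day)


def select_out_of_time(samples, latest_training_time, lag_months=3):
    """Keep only samples dated more than `lag_months` after the training data."""
    cutoff = add_months(latest_training_time, lag_months)
    return [s for s in samples if s["sample_time"] > cutoff]


def prediction_accuracy(labels, risk_scores, threshold=0.5):
    """Share of samples whose thresholded risk score matches the true label."""
    predicted = [1 if score >= threshold else 0 for score in risk_scores]
    correct = sum(1 for y, p in zip(labels, predicted) if y == p)
    return correct / len(labels)


candidates = [
    {"customer_id": "c1", "sample_time": datetime(2023, 3, 15)},
    {"customer_id": "c2", "sample_time": datetime(2023, 5, 1)},
]
verification_set = select_out_of_time(candidates, datetime(2023, 1, 31))
```

Fraud labels are usually heavily imbalanced, so in practice accuracy is often read together with a ranking metric such as the area under the ROC curve.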
According to the embodiment, through the scheme, the client transaction data are obtained; performing feature processing on the client transaction data to obtain first feature data; and inputting the first characteristic data into a pre-established transaction risk prediction model for prediction to obtain a transaction risk prediction result. According to the embodiment, the acquired customer transaction data is subjected to feature processing, and the processed first feature data is input into the pre-created transaction risk prediction model for prediction, so that the accuracy of transaction risk prediction is improved, and further the transaction risk prediction effect is improved.
It should be noted that, the foregoing embodiments may be implemented in a reasonable combination according to actual situations, which is not described in detail in this embodiment.
In addition, an embodiment of the present application further provides a transaction risk prediction apparatus, where the transaction risk prediction apparatus includes:
the data acquisition module is used for acquiring the client transaction data;
the feature processing module is used for carrying out feature processing on the client transaction data to obtain first feature data;
and the model prediction module is used for inputting the first characteristic data into a pre-established transaction risk prediction model for prediction to obtain a transaction risk prediction result.
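The three modules of the apparatus map naturally onto a small pipeline object. The sketch below only illustrates that structure: the class name, the callable-based design and the stand-in functions in the usage are assumptions, not the apparatus described in the application.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class TransactionRiskPredictor:
    """Wires the three modules together: acquisition, feature processing, prediction."""

    acquire_data: Callable[[str], Any]      # data acquisition module
    process_features: Callable[[Any], Any]  # feature processing module
    predict_risk: Callable[[Any], float]    # model prediction module

    def run(self, customer_id: str) -> float:
        raw = self.acquire_data(customer_id)    # client transaction data
        features = self.process_features(raw)   # first feature data
        return self.predict_risk(features)      # transaction risk prediction result


# Stand-in modules, for illustration only.
predictor = TransactionRiskPredictor(
    acquire_data=lambda customer_id: [100.0, 250.0],
    process_features=lambda amounts: {
        "txn_count": len(amounts),
        "txn_amount_sum": sum(amounts),
    },
    predict_risk=lambda features: min(1.0, features["txn_amount_sum"] / 1000.0),
)
risk = predictor.run("c1")
```

Keeping each module behind a plain callable lets the feature scripts or the trained model be swapped without touching the other two parts.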
For the principle and implementation process of transaction risk prediction in this embodiment, please refer to the above embodiments; they are not described again here.
In addition, the embodiment of the application also provides a terminal device, which comprises a memory, a processor and a transaction risk prediction program stored in the memory and capable of running on the processor, wherein the transaction risk prediction program realizes the steps of the transaction risk prediction method when being executed by the processor.
Because the transaction risk prediction program is executed by the processor and adopts all the technical schemes of all the embodiments, the transaction risk prediction program at least has all the beneficial effects brought by all the technical schemes of all the embodiments, and is not described in detail herein.
In addition, the embodiment of the application also provides a computer readable storage medium, wherein the computer readable storage medium stores a transaction risk prediction program, and the transaction risk prediction program realizes the steps of the transaction risk prediction method when being executed by a processor.
Because the transaction risk prediction program is executed by the processor and adopts all the technical schemes of all the embodiments, the transaction risk prediction program at least has all the beneficial effects brought by all the technical schemes of all the embodiments, and is not described in detail herein.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or system that comprises the element.
The above ordering of embodiments of the invention is merely for illustration, and does not represent the advantages or disadvantages of the embodiments.
From the description of the above embodiments, it will be apparent to those skilled in the art that the above embodiment methods may be implemented by means of software plus necessary general hardware platforms. Based on such understanding, the technical solution of the present invention may be embodied essentially or in a part contributing to the prior art in the form of a software product stored in a storage medium (e.g. ROM/RAM, magnetic disk, optical disk) as described above, comprising instructions for causing a terminal device (which may be a mobile phone, a computer, a server, or a network device, etc.) to perform the method according to the embodiments of the present invention.
The foregoing is only a description of the preferred embodiments of the present invention and is not intended to limit its scope. Any equivalent structure or equivalent process derived from the disclosure herein, whether used directly or applied indirectly in other related technical fields, likewise falls within the scope of protection of the present invention.

Claims (10)

1. A transaction risk prediction method, characterized in that the transaction risk prediction method comprises the following steps:
acquiring customer transaction data;
performing feature processing on the client transaction data to obtain first feature data;
and inputting the first characteristic data into a pre-established transaction risk prediction model for prediction to obtain a transaction risk prediction result.
2. The transaction risk prediction method according to claim 1, wherein, before the step of inputting the first feature data into a pre-created transaction risk prediction model for prediction to obtain a transaction risk prediction result, the method further comprises:
the transaction risk prediction model is created, and specifically comprises the following steps:
acquiring first historical customer data;
obtaining training set sample data based on the first historical customer data;
performing feature processing on the training set sample data to obtain second feature data;
Training based on the second characteristic data to obtain the transaction risk prediction model.
3. The transaction risk prediction method of claim 2, wherein the training set sample data includes a first type of training sample and a second type of training sample, and the step of deriving training set sample data based on the first historical customer data includes:
carrying out transaction risk prediction based on a preset first time span interval and the first historical customer data to obtain a first customer sample;
classifying the first customer sample based on a transaction tag in the first customer sample to obtain a first type training sample and a second type training sample;
and obtaining the training set sample data based on the first type training samples and the second type training samples.
4. The transaction risk prediction method of claim 1, wherein the step of performing feature processing on the customer transaction data to obtain first feature data includes:
data grouping is carried out on the client transaction data to obtain risk association data, internet banking data, log data and transaction data;
and running the feature processing scripts corresponding to the risk association data, the internet banking data, the log data and the transaction data based on a big data clustering technology to obtain the first feature data.
5. The transaction risk prediction method according to claim 2, wherein the training based on the second feature data to obtain the transaction risk prediction model includes:
performing machine learning training based on the second characteristic data and a machine learning algorithm to obtain a trained sub-model;
and fusing and weighting the trained sub-models to obtain the transaction risk prediction model.
6. The transaction risk prediction method according to claim 3, wherein, after the step of inputting the first feature data into a pre-created transaction risk prediction model for prediction to obtain a transaction risk prediction result, the method further comprises:
obtaining verification set sample data based on second historical client data obtained in advance;
and verifying the transaction risk prediction result based on the verification set sample data to obtain the prediction accuracy of the transaction risk prediction model.
7. The transaction risk prediction method of claim 6, wherein the step of deriving verification set sample data based on pre-acquired second historical customer data includes:
carrying out transaction risk prediction based on a preset second time span interval and the second historical client data obtained in advance to obtain a second client sample;
Classifying the second customer sample based on the transaction tag in the second customer sample to obtain a first type verification sample and a second type verification sample;
the verification set sample data is obtained based on the first type of verification sample and the second type of verification sample.
8. A transaction risk prediction device, characterized in that the transaction risk prediction device comprises:
the data acquisition module is used for acquiring the client transaction data;
the feature processing module is used for carrying out feature processing on the client transaction data to obtain first feature data;
and the model prediction module is used for inputting the first characteristic data into a pre-established transaction risk prediction model for prediction to obtain a transaction risk prediction result.
9. A terminal device, characterized in that it comprises a memory, a processor and a transaction risk prediction program stored on the memory and executable on the processor, which transaction risk prediction program, when executed by the processor, implements the transaction risk prediction method according to any of claims 1-7.
10. A computer readable storage medium, characterized in that the computer readable storage medium has stored thereon a transaction risk prediction program, which when executed by a processor, implements the transaction risk prediction method according to any of claims 1-7.
CN202310949724.XA 2023-07-31 2023-07-31 Transaction risk prediction method, device, terminal equipment and storage medium Pending CN117196630A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310949724.XA CN117196630A (en) 2023-07-31 2023-07-31 Transaction risk prediction method, device, terminal equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310949724.XA CN117196630A (en) 2023-07-31 2023-07-31 Transaction risk prediction method, device, terminal equipment and storage medium

Publications (1)

Publication Number Publication Date
CN117196630A true CN117196630A (en) 2023-12-08

Family

ID=88996870

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310949724.XA Pending CN117196630A (en) 2023-07-31 2023-07-31 Transaction risk prediction method, device, terminal equipment and storage medium

Country Status (1)

Country Link
CN (1) CN117196630A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117876119A (en) * 2024-03-11 2024-04-12 药融云数字科技(成都)有限公司 Distributed-type-based wind control model construction method and system

Similar Documents

Publication Publication Date Title
CN108876600A (en) Warning information method for pushing, device, computer equipment and medium
CN110070391B (en) Data processing method and device, computer readable medium and electronic equipment
CN109165840A (en) Risk profile processing method, device, computer equipment and medium
CN109816483B (en) Information recommendation method and device and readable storage medium
WO2022252363A1 (en) Data processing method, computer device and readable storage medium
CN112199510A (en) Fraud probability determination method and device, electronic equipment and storage medium
CN110930038A (en) Loan demand identification method, loan demand identification device, loan demand identification terminal and loan demand identification storage medium
CN111476653A (en) Risk information identification, determination and model training method and device
CN114186626A (en) Abnormity detection method and device, electronic equipment and computer readable medium
CN117196630A (en) Transaction risk prediction method, device, terminal equipment and storage medium
CN112907356A (en) Overdue collection method, device and system and computer readable storage medium
CN113486983A (en) Big data office information analysis method and system for anti-fraud processing
CN115545886A (en) Overdue risk identification method, overdue risk identification device, overdue risk identification equipment and storage medium
CN109146667B (en) Method for constructing external interface comprehensive application model based on quantitative statistics
CN111061948A (en) User label recommendation method and device, computer equipment and storage medium
CN111582757B (en) Method, device, equipment and computer readable storage medium for analyzing fraud risk
CN117196808A (en) Mobility risk prediction method and related device for peer business
US20100042446A1 (en) Systems and methods for providing core property review
CN116821759A (en) Identification prediction method and device for category labels, processor and electronic equipment
CN111951008A (en) Risk prediction method and device, electronic equipment and readable storage medium
CN117132383A (en) Credit data processing method, device, equipment and readable storage medium
CN111091460A (en) Data processing method and device
CN115238815A (en) Abnormal transaction data acquisition method, device, equipment, medium and program product
CN113744054A (en) Anti-fraud method, device and equipment
CN112818235A (en) Violation user identification method and device based on associated features and computer equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination