CN115409636A - Product risk prediction method, device, equipment and medium - Google Patents

Product risk prediction method, device, equipment and medium

Info

Publication number
CN115409636A
Authority
CN
China
Prior art keywords
operation data
target
risk
basic operation
risk prediction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211068818.8A
Other languages
Chinese (zh)
Inventor
王松松
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Industrial and Commercial Bank of China Ltd ICBC
Original Assignee
Industrial and Commercial Bank of China Ltd ICBC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Industrial and Commercial Bank of China Ltd ICBC
Priority to CN202211068818.8A
Publication of CN115409636A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/08 Insurance

Abstract

The present disclosure provides a product risk prediction method, apparatus, device, medium, and program product, which can be applied to the fields of artificial intelligence, big data, and financial technology. The method comprises: querying, from a basic operation data table by using a target customer field and a basic operation data field, a basic operation data set corresponding to a target customer and the respective risk correlation information of the basic operation data in the basic operation data set; screening out target operation data from the basic operation data set corresponding to the target customer according to the respective risk correlation information of the basic operation data; inputting the target operation data into a target risk prediction model and outputting a risk prediction result; inputting the target operation data into an operation data scoring model and outputting an operation scoring result; and determining a product risk prediction result of a target product associated with the target customer according to the risk prediction result and the operation scoring result.

Description

Product risk prediction method, device, equipment and medium
Technical Field
The present disclosure relates to the field of artificial intelligence, the field of big data, and the field of financial technology, and in particular, to a method, an apparatus, a device, a medium, and a program product for product risk prediction.
Background
With rapid economic development, institutions such as enterprises may, in compliance with relevant regulations, sell financial products such as bonds and trusts in order to obtain the financing needed to meet the business requirements of rapid development. Accordingly, users who purchase financial products such as bonds and trusts can obtain returns according to the yield rate specified in those products.
In actual financial product transaction scenarios, transactions need to be supported by predicting the risk of the financial product. In the related art, however, the accuracy of financial product risk prediction is generally low, making it difficult to meet actual demand.
Disclosure of Invention
In view of the foregoing, the present disclosure provides product risk prediction methods, apparatuses, devices, media and program products.
According to a first aspect of the present disclosure, there is provided a product risk prediction method, comprising:
querying, from a basic operation data table by using a target customer field and a basic operation data field, a basic operation data set corresponding to a target customer and the respective risk correlation information of the basic operation data in the basic operation data set;
screening out target operation data from the basic operation data set corresponding to the target client according to the respective risk correlation information of the basic operation data;
inputting the target operation data into a target risk prediction model, and outputting a risk prediction result;
inputting the target operation data into an operation data scoring model, and outputting an operation scoring result; and
determining a product risk prediction result of a target product associated with the target customer according to the risk prediction result and the operation scoring result.
According to the embodiment of the present disclosure, the basic operation data has an operation data identifier;
the product risk prediction method further comprises the following steps:
performing risk correlation analysis on sample basic operation data according to a time series model to obtain risk correlation information of the sample basic operation data, wherein the sample basic operation data has a sample operation data identifier corresponding to the basic operation data; and
determining the respective risk correlation information of the basic operation data according to the risk correlation information of the sample basic operation data and the correspondence between the sample operation data identifier and the operation data identifier.
According to the embodiment of the disclosure, performing risk correlation analysis on the sample basic operation data according to the time series model to obtain the risk correlation information of the sample basic operation data includes:
inputting the sample basic operation data into the time series model to obtain a sample risk prediction result; and
processing the sample risk prediction result and a sample label corresponding to the sample basic operation data based on a preset algorithm, so as to iteratively adjust the parameters of the time series model until the difference between the sample risk prediction result and the sample label converges, thereby obtaining target parameters of the time series model, wherein the target parameters of the time series model are the respective risk correlation information of the sample basic operation data.
According to an embodiment of the present disclosure, the basic operation data set includes L basic operation data, where L is a positive integer greater than 1;
screening the target operation data from the basic operation data set corresponding to the target customer according to the respective risk correlation information of the basic operation data comprises:
determining M candidate basic operation data in the basic operation data set according to a preset risk threshold, wherein the risk correlation information of each candidate basic operation data is greater than or equal to the preset risk threshold, and L > M ≥ 1; and
selecting the M candidate basic operation data from the basic operation data set to obtain M target operation data.
According to an embodiment of the present disclosure, the product risk prediction method further comprises:
respectively training each initial risk prediction model in the N initial risk prediction models by using a training sample to obtain N trained candidate risk prediction models, wherein the training sample comprises sample target operation data and a sample label corresponding to the sample target operation data, and N is a positive integer greater than 2;
determining respective prediction effect information of the N candidate risk prediction models according to the sample labels; and
determining the target risk prediction model from the N candidate risk prediction models according to the respective prediction effect information of the N candidate risk prediction models.
According to an embodiment of the present disclosure, the prediction effect information includes at least one of:
predicted coverage rate information, predicted accuracy rate information and predicted recall rate information.
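As a minimal sketch of the selection step above, the following Python code evaluates several already-trained candidate models against the sample labels and keeps the one with the best recall, using accuracy as a tie-breaker. The candidate models, the sample values, and the ranking rule are hypothetical assumptions made for illustration; the disclosure does not specify how the prediction effect information is weighed, and coverage is omitted here because the text does not define it.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Sample = Sequence[float]
Model = Callable[[Sample], int]  # returns 1 (risk) or 0 (no risk)


def evaluate(model: Model, samples: List[Sample], labels: List[int]) -> Dict[str, float]:
    """Compute accuracy and recall of a binary risk classifier."""
    preds = [model(s) for s in samples]
    true_pos = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 1)
    false_neg = sum(1 for p, y in zip(preds, labels) if p == 0 and y == 1)
    correct = sum(1 for p, y in zip(preds, labels) if p == y)
    positives = true_pos + false_neg
    return {
        "accuracy": correct / len(labels),
        "recall": true_pos / positives if positives else 0.0,
    }


def select_target_model(
    candidates: Dict[str, Model], samples: List[Sample], labels: List[int]
) -> Tuple[str, Dict[str, Dict[str, float]]]:
    """Pick the candidate with the highest recall, then highest accuracy.

    The ranking rule is an assumption made for this sketch.
    """
    scored = {name: evaluate(m, samples, labels) for name, m in candidates.items()}
    best = max(scored, key=lambda n: (scored[n]["recall"], scored[n]["accuracy"]))
    return best, scored


# Hypothetical stand-ins for N = 3 trained candidate models.
CANDIDATES: Dict[str, Model] = {
    "model_a": lambda x: int(x[0] > 0.85),
    "model_b": lambda x: int(x[0] > 0.30),
    "model_c": lambda x: int(x[0] > 0.05),
}
```

In practice the candidates would be the N trained risk prediction models rather than threshold rules, and the evaluation would normally use held-out samples.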
According to an embodiment of the present disclosure,
the target risk prediction model comprises a decision tree prediction model constructed based on a decision tree algorithm; or
the target risk prediction model comprises a neural network prediction model constructed based on a neural network algorithm.
According to an embodiment of the present disclosure, the decision tree prediction model includes any one of the following items:
an extreme gradient boosting model, a random forest model, and a light gradient boosting model.
According to an embodiment of the present disclosure, the risk prediction result includes a first product risk probability, and the product risk prediction result includes a product risk probability;
wherein determining a product risk prediction result of a target product associated with the target customer according to the risk prediction result and the operation scoring result comprises:
processing the operation scoring result by using a preset mapping function to obtain a second product risk probability corresponding to the operation scoring result; and
determining the product risk probability of the target product according to the first product risk probability and the second product risk probability.
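The two steps above can be sketched as follows. The disclosure only refers to "a preset mapping function" and does not say how the two probabilities are combined, so the logistic mapping, its `midpoint` and `scale` parameters, and the weighted average are all assumptions chosen for illustration.

```python
import math


def score_to_probability(score: float, midpoint: float = 60.0, scale: float = 10.0) -> float:
    """Map an operation score to a second product risk probability.

    Assumed mapping: a decreasing logistic curve, so a higher operation
    score gives a lower risk probability. `midpoint` is the score that
    maps to 0.5; `scale` controls how steep the curve is.
    """
    return 1.0 / (1.0 + math.exp((score - midpoint) / scale))


def combine_probabilities(p_model: float, p_score: float, weight: float = 0.5) -> float:
    """Blend the first (model) and second (score-derived) probabilities.

    A weighted average is one simple option; `weight` is hypothetical.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * p_model + (1.0 - weight) * p_score
```

For example, `combine_probabilities(0.8, score_to_probability(60.0))` blends a model probability of 0.8 with a score-derived probability of 0.5. Other monotone mappings or combination rules would fit the description equally well.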
A second aspect of the present disclosure provides a product risk prediction apparatus, including:
a query module, configured to query, from a basic operation data table by using a target customer field and a basic operation data field, a basic operation data set corresponding to a target customer and the respective risk correlation information of the basic operation data in the basic operation data set;
the first screening module is used for screening target operation data from the basic operation data set corresponding to the target customer according to the respective risk correlation information of the basic operation data;
the risk prediction module is used for inputting the target operation data into a target risk prediction model and outputting a risk prediction result;
the operation scoring module is used for inputting the target operation data into an operation data scoring model and outputting an operation scoring result; and
the product risk determining module is used for determining a product risk prediction result of the target product associated with the target customer according to the risk prediction result and the operation scoring result.
A third aspect of the present disclosure provides an electronic device, comprising: one or more processors; a memory for storing one or more programs, wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the above product risk prediction method.
A fourth aspect of the present disclosure also provides a computer-readable storage medium having stored thereon executable instructions that, when executed by a processor, cause the processor to perform the above product risk prediction method.
A fifth aspect of the present disclosure also provides a computer program product comprising a computer program which, when executed by a processor, implements the product risk prediction method described above.
Drawings
The foregoing and other objects, features and advantages of the disclosure will be apparent from the following description of embodiments of the disclosure, which proceeds with reference to the accompanying drawings, in which:
FIG. 1 schematically illustrates an application scenario diagram of a product risk prediction method and apparatus according to an embodiment of the present disclosure;
FIG. 2 schematically illustrates a flow diagram of a product risk prediction method according to an embodiment of the present disclosure;
FIG. 3 schematically illustrates a flow diagram of a product risk prediction method according to another embodiment of the present disclosure;
FIG. 4 schematically illustrates the prediction effect information of candidate risk prediction models according to an embodiment of the present disclosure;
FIG. 5 schematically shows an application scenario of the product risk prediction method according to an embodiment of the disclosure;
FIG. 6 schematically shows a block diagram of a product risk prediction device according to an embodiment of the present disclosure; and
FIG. 7 schematically shows a block diagram of an electronic device adapted to implement a product risk prediction method according to an embodiment of the present disclosure.
Detailed Description
Hereinafter, embodiments of the present disclosure will be described with reference to the accompanying drawings. It should be understood that these descriptions are illustrative only and are not intended to limit the scope of the present disclosure. In the following detailed description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the embodiments of the disclosure. It may be evident, however, that one or more embodiments may be practiced without these specific details. Moreover, in the following description, descriptions of well-known structures and techniques are omitted so as to not unnecessarily obscure the concepts of the present disclosure.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. The terms "comprises," "comprising," and the like, as used herein, specify the presence of stated features, steps, operations, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, or components.
All terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art unless otherwise defined. It is noted that the terms used herein should be interpreted as having a meaning that is consistent with the context of this specification and should not be interpreted in an idealized or overly formal sense.
Where a convention analogous to "at least one of A, B, and C, etc." is used, in general such a construction is intended in the sense one having skill in the art would understand the convention (e.g., "a system having at least one of A, B, and C" would include but not be limited to systems that have A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B, and C together, etc.).
In the technical solution of the present disclosure, the collection, storage, use, processing, transmission, provision, disclosure, and application of the personal information of the users involved all comply with the relevant laws and regulations, necessary confidentiality measures are taken, and public order and good customs are not violated.
In the technical scheme of the disclosure, before the personal information of the user is obtained or collected, the authorization or the consent of the user is obtained.
In risk assessment systems for financial products in the related art, product risk can be assessed based on a scoring model. For example, a financial institution can assess the default risk of a customer's bond products based on professional knowledge and relevant rules, and thereby predict the default probability of the bond products. Such risk prediction methods are difficult to iterate quickly and have low prediction accuracy, so they struggle to meet actual requirements.
Embodiments of the present disclosure provide a product risk prediction method, apparatus, device, medium, and program product. The product risk prediction method comprises the following steps:
querying, from a basic operation data table by using a target customer field and a basic operation data field, a basic operation data set corresponding to a target customer and the respective risk correlation information of the basic operation data in the basic operation data set; screening out target operation data from the basic operation data set corresponding to the target customer according to the respective risk correlation information of the basic operation data; inputting the target operation data into a target risk prediction model, and outputting a risk prediction result; inputting the target operation data into an operation data scoring model, and outputting an operation scoring result; and determining a product risk prediction result of a target product associated with the target customer according to the risk prediction result and the operation scoring result.
According to the embodiments of the present disclosure, determining the target operation data from the basic operation data set according to the risk correlation information filters out basic operation data that has low correlation with product risk, which eliminates interference from noisy data and improves the computational efficiency of the subsequent target risk prediction model. Inputting the target operation data into the target risk prediction model allows the risk of the target product to be predicted from the algorithmic dimension of the prediction model. Inputting the same target operation data into the operation data scoring model yields an operation scoring result that evaluates the risk of the target product from the rule-based scoring dimension. Determining the product risk prediction result from both the risk prediction result and the operation scoring result therefore predicts the likelihood of risk in the target product from multiple dimensions, achieving the technical effect of improving the accuracy of product risk prediction for the target product.
Fig. 1 schematically shows an application scenario diagram of a product risk prediction method and apparatus according to an embodiment of the present disclosure.
As shown in FIG. 1, the application scenario 100 according to this embodiment may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 is the medium used to provide communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired or wireless communication links or fiber optic cables.
A user may use the terminal devices 101, 102, 103 to interact with the server 105 via the network 104 to receive or send messages and the like. Various communication client applications may be installed on the terminal devices 101, 102, 103, such as shopping applications, web browser applications, search applications, instant messaging tools, mailbox clients, and social platform software (by way of example only).
The terminal devices 101, 102, 103 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smart phones, tablet computers, laptop portable computers, desktop computers, and the like.
The server 105 may be a server providing various services, such as a background management server (for example only) providing support for websites browsed by users using the terminal devices 101, 102, 103. The background management server may analyze and perform other processing on the received data such as the user request, and feed back a processing result (e.g., a webpage, information, or data obtained or generated according to the user request) to the terminal device.
It should be noted that the product risk prediction method provided by the embodiment of the present disclosure may be generally executed by the server 105. Accordingly, the product risk prediction device provided by the embodiments of the present disclosure may be generally disposed in the server 105. The product risk prediction method provided by the embodiment of the present disclosure may also be executed by a server or a server cluster different from the server 105 and capable of communicating with the terminal devices 101, 102, 103 and/or the server 105. Correspondingly, the product risk prediction device provided by the embodiment of the present disclosure may also be disposed in a server or a server cluster that is different from the server 105 and is capable of communicating with the terminal devices 101, 102, 103 and/or the server 105.
It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
The product risk prediction method of the disclosed embodiment will be described in detail below with reference to fig. 2 to 5 based on the scenario described in fig. 1.
Fig. 2 schematically shows a flow chart of a product risk prediction method according to an embodiment of the present disclosure.
As shown in fig. 2, the product risk prediction method of this embodiment includes operations S210 to S250.
In operation S210, a basic operation data set corresponding to the target customer and the respective risk correlation information of the basic operation data in the basic operation data set are queried from the basic operation data table by using the target customer field and the basic operation data field.
According to an embodiment of the present disclosure, the basic operation data table may include basic operation data of the target customer acquired from a public database, and may further include basic operation data of the target customer acquired from a relevant operation database within the organization. The basic operation data may include financial data, credit rating data, business attribute data, and the like generated by the target customer in the course of operation.
According to an embodiment of the present disclosure, the risk correlation information may include information for characterizing a degree of risk correlation of the basic operation data with the target product issued by the target customer, such as a risk correlation parameter and the like.
It should be noted that the risk correlation information of each piece of basic operation data may be determined based on a risk evaluation method in the related art, for example, the risk correlation information of each piece of basic operation data may be set based on expert experience rules, or a contribution degree of each piece of basic operation data to a risk evaluation result may also be determined based on a risk evaluation model (for example, a fully-connected neural network model) constructed by a neural network, and the risk correlation information of each piece of basic operation data may be determined based on the contribution degree. The method for determining the risk correlation information is not limited in the embodiments of the present disclosure, and those skilled in the art may design according to actual requirements as long as the risk correlation information can represent the correlation between the basic operation data and the product risk of the target product.
According to the embodiment of the disclosure, the basic operation data set corresponding to the target customer and the risk correlation information of the basic operation data can be queried by writing a query statement that contains the target customer field and the basic operation data field.
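A query of this kind might look like the following sketch, which uses Python's built-in `sqlite3` module and an in-memory table. The table name `basic_operation_data`, its column names, and the demo rows are hypothetical; the disclosure does not specify a schema or a database system.

```python
import sqlite3
from typing import List, Tuple

# Hypothetical schema: one row per (customer, data item).
SCHEMA = """
CREATE TABLE basic_operation_data (
    customer_id      TEXT,  -- target customer field
    data_item        TEXT,  -- basic operation data field (identifier)
    value            REAL,  -- the basic operation data itself
    risk_correlation REAL   -- risk correlation information
)
"""


def build_demo_table() -> sqlite3.Connection:
    """Create an in-memory table with made-up rows for illustration."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    conn.executemany(
        "INSERT INTO basic_operation_data VALUES (?, ?, ?, ?)",
        [
            ("C001", "debt_ratio", 0.62, 0.9),
            ("C001", "credit_rating", 3.0, 0.7),
            ("C002", "debt_ratio", 0.35, 0.9),
        ],
    )
    return conn


def query_basic_operation_data(
    conn: sqlite3.Connection, customer_id: str
) -> List[Tuple[str, float, float]]:
    """Return (data_item, value, risk_correlation) rows for one customer."""
    sql = (
        "SELECT data_item, value, risk_correlation "
        "FROM basic_operation_data "
        "WHERE customer_id = ? "
        "ORDER BY data_item"
    )
    # A parameterized query avoids building SQL from raw strings.
    return conn.execute(sql, (customer_id,)).fetchall()
```

Calling `query_basic_operation_data(build_demo_table(), "C001")` returns the two rows stored for the hypothetical customer `C001`.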
In operation S220, target operation data is screened from the basic operation data set corresponding to the target customer according to the risk correlation information of each of the basic operation data.
According to the embodiment of the disclosure, selecting the target operation data from the basic operation data set according to the risk correlation information filters out basic operation data that has a low degree of risk correlation with the target product, which reduces the amount of computation required by the subsequent target risk prediction model and/or operation data scoring model and improves computational efficiency.
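A minimal sketch of this screening step, assuming the preset risk threshold described in the summary (items whose risk correlation is greater than or equal to the threshold are kept). The data item names, values, and the threshold of 0.5 in the usage note are hypothetical.

```python
from typing import Dict, Tuple


def screen_target_data(
    basic_data: Dict[str, Tuple[float, float]], risk_threshold: float
) -> Dict[str, float]:
    """Select target operation data from a basic operation data set.

    `basic_data` maps a data item identifier to (value, risk_correlation).
    Items whose risk correlation is below `risk_threshold` are dropped.
    """
    return {
        item: value
        for item, (value, correlation) in basic_data.items()
        if correlation >= risk_threshold
    }
```

With `{"debt_ratio": (0.62, 0.9), "credit_rating": (3.0, 0.7), "office_count": (4.0, 0.1)}` and a threshold of 0.5, only `debt_ratio` and `credit_rating` are kept as target operation data.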
In operation S230, the target operation data is input to the target risk prediction model, and a risk prediction result is output.
According to the embodiment of the present disclosure, the target risk prediction model may include a model constructed based on a machine learning algorithm. For example, it may be constructed based on a recurrent neural network algorithm, but is not limited thereto; it may also be constructed based on a machine learning model such as a decision tree model.
In operation S240, the target operation data is input to the operation data scoring model, and an operation scoring result is output.
According to the embodiment of the disclosure, the operation data scoring model may be constructed based on a rule scoring model in the related art; for example, the rule scoring model may be constructed based on relevant expert experience. When the target operation data is processed by the operation data scoring model, the data information of the target operation data is converted into scoring information based on the scoring rules in the model, and the operation scoring result of the target customer is then determined from the scoring information. The operation scoring result can thus represent the operating capability of the target customer, further improving the accuracy of the subsequent risk prediction for the target product.
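One way to picture such a rule scoring model is a list of expert rules, each awarding points when its condition holds. The sketch below is an assumption-laden illustration: the data item names, conditions, and point values are invented, and the disclosure does not describe the actual scoring rules.

```python
from typing import Callable, Dict, List, Tuple

# (data item identifier, condition on its value, points awarded)
Rule = Tuple[str, Callable[[float], bool], float]

# Hypothetical expert rules; none of these come from the disclosure.
RULES: List[Rule] = [
    ("debt_ratio", lambda v: v < 0.5, 40.0),
    ("debt_ratio", lambda v: 0.5 <= v < 0.7, 20.0),
    ("credit_rating", lambda v: v >= 3.0, 30.0),
    ("revenue_growth", lambda v: v > 0.0, 30.0),
]


def operation_score(target_data: Dict[str, float], rules: List[Rule] = RULES) -> float:
    """Convert target operation data into an operation scoring result.

    Each rule whose data item is present and whose condition holds adds
    its points to the total; missing items contribute nothing.
    """
    total = 0.0
    for item, condition, points in rules:
        if item in target_data and condition(target_data[item]):
            total += points
    return total
```

Because the rules are plain data, domain experts could revise them without retraining any model, which matches the rule-based character the text attributes to this scoring dimension.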
In operation S250, a product risk prediction result of the target product associated with the target customer is determined according to the risk prediction result and the operation scoring result.
According to an embodiment of the present disclosure, the target product may include a financial product sold by the target customer, such as a bond product or a trust product issued through a suitably qualified financial institution entrusted by the enterprise. The product risk prediction result may be a predicted risk level of the target product, a default probability, and the like.
According to the embodiments of the present disclosure, determining the target operation data from the basic operation data set according to the risk correlation information filters out basic operation data that has low correlation with product risk, which eliminates interference from noisy data and improves the computational efficiency of the subsequent target risk prediction model. The target risk prediction model predicts the risk of the target product from the algorithmic dimension, while the operation scoring result output by the operation data scoring model evaluates that risk from the rule-based scoring dimension. Determining the product risk prediction result from both results therefore predicts the likelihood of risk in the target product from multiple dimensions, achieving the technical effect of improving the accuracy of product risk prediction for the target product.
According to an embodiment of the present disclosure, the base operation data may have an operation data identification.
Fig. 3 schematically illustrates a flow diagram of a product risk prediction method according to another embodiment of the present disclosure.
As shown in fig. 3, the product risk prediction method may further include operations S310 to S320.
In operation S310, a risk correlation analysis is performed on the sample basic operation data according to the time series model to obtain risk correlation information of the sample basic operation data, where the sample basic operation data has a sample operation data identifier corresponding to the basic operation data.
In operation S320, risk correlation information of the basic operation data is determined according to the risk correlation information of the sample basic operation data and the corresponding relationship between the sample operation data identifier and the operation data identifier.
According to an embodiment of the present disclosure, the time series model may include a model constructed based on a time series algorithm, and the time series model may include, for example, an autoregressive model, a moving average model, and the like.
According to an embodiment of the present disclosure, the sample basic operation data may include historical operation data having the same data item type as the basic operation data, and the sample operation data identifier and the operation data identifier may correspond to each other through the data item type. Each sample target customer may correspond to a sample basic operation data set containing that customer's sample basic operation data. By processing the sample basic operation data sets of one or more sample target customers with the time series model, the degree of correlation between sample basic operation data of a given data item type and the historical risk outcome of the product can be obtained, which yields the risk correlation information of the sample basic operation data.
According to the embodiment of the disclosure, the risk correlation information of the basic operation data with the same data item type is determined through the risk correlation information of the sample basic operation data, so that a foundation can be laid for subsequently screening the target operation data, and the timeliness of a product risk prediction result for a target product is improved.
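The correspondence between identifiers can be sketched as a simple lookup. In the snippet below, the identifier strings and correlation values are hypothetical; the only idea taken from the text is that risk correlation learned on sample data is transferred to basic operation data of the same data item type.

```python
from typing import Dict


def transfer_risk_correlation(
    sample_correlation: Dict[str, float], id_mapping: Dict[str, str]
) -> Dict[str, float]:
    """Assign risk correlation information to basic operation data.

    `sample_correlation` maps a sample operation data identifier to its
    learned risk correlation. `id_mapping` maps an operation data
    identifier to the sample identifier of the same data item type.
    Identifiers without a learned correlation are skipped.
    """
    return {
        op_id: sample_correlation[sample_id]
        for op_id, sample_id in id_mapping.items()
        if sample_id in sample_correlation
    }
```

For instance, if the sample identifier `s_debt_ratio` has a learned correlation of 0.9 and the operation data identifier `debt_ratio` maps to it, `debt_ratio` receives 0.9.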
According to the embodiment of the present disclosure, it should be noted that before the correlation analysis is performed on the sample basic operation data, the abnormal operation data and the missing operation data in the sample basic operation data may be preprocessed, so as to meet the requirement of performing the correlation analysis on the sample basic operation data.
In one embodiment of the present disclosure, the sample basic operation data may be obtained from a related data platform. For example, initial sample basic operation data such as a bond default table, a bond basic information table, a customer basic information table, a legal-person customer loan balance table, and the like may be obtained. The data in the initial sample basic operation data set is then analyzed in terms of business logic, and data information related to bond default is removed, such as the corresponding basic attribute information in the customer information table, the corresponding large deposit list in the bond basic information table, and the like. Abnormal operation data and missing operation data in the initial sample basic operation data can then be preprocessed.
For missing or abnormal continuous values in the initial sample basic operation data, the value can be filled with the mean of the corresponding field over the initial sample basic operation data of the same client. For missing or abnormal discrete values in the initial sample basic operation data, the value can be filled with the mode of the corresponding field for the same client. Finally, the data tables obtained after preprocessing are joined to form a wide table, which may contain fields representing the sample operation data identifiers and the data items corresponding to those identifiers, namely the sample basic operation data.
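The per-client filling rule described above can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the function name `fill_missing`, the field names, and the convention that missing or abnormal values have already been marked as `None` are all assumptions made for the example.

```python
from statistics import mean, mode

def fill_missing(rows, continuous_fields, discrete_fields):
    """Fill None values per client: mean for continuous fields, mode for discrete ones."""
    by_client = {}
    for row in rows:
        # Copy each row so the caller's data is left untouched.
        by_client.setdefault(row["client_id"], []).append(dict(row))
    filled = []
    for client_rows in by_client.values():
        for field in list(continuous_fields) + list(discrete_fields):
            observed = [r[field] for r in client_rows if r[field] is not None]
            if not observed:
                continue  # nothing to infer from for this client; leave the gap
            fill = mean(observed) if field in continuous_fields else mode(observed)
            for r in client_rows:
                if r[field] is None:
                    r[field] = fill
        filled.extend(client_rows)
    return filled

# Hypothetical rows; None marks a missing or abnormal value.
rows = [
    {"client_id": "A", "loan_balance": 100.0, "listed": "Y"},
    {"client_id": "A", "loan_balance": None, "listed": "Y"},
    {"client_id": "A", "loan_balance": 300.0, "listed": None},
    {"client_id": "B", "loan_balance": 50.0, "listed": "N"},
]
clean = fill_missing(rows, ["loan_balance"], ["listed"])
```

Grouping by client before filling keeps one client's values from leaking into another client's gaps, which matches the "same client" wording of the embodiment.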
According to an embodiment of the present disclosure, the operation S310 of performing risk correlation analysis on the sample basic operation data according to the time series model to obtain risk correlation information of the sample basic operation data may include the following operations:
inputting the basic operation data of the sample into a time series model to obtain a sample risk prediction result; and processing the sample risk prediction result and a sample label corresponding to the sample basic operation data based on a preset algorithm so as to iteratively adjust parameters of the time series model until a difference value between the sample risk prediction result and the sample label is converged to obtain target parameters of the time series model, wherein the target parameters of the time series model are respective risk correlation information of the sample basic operation data.
According to an embodiment of the present disclosure, the time series model may include a linear time series model, for example, equation (1) may be employed to identify the time series model.
$$X_t = k_1 x_1 + k_2 x_2 + \dots + k_{t-1} x_{t-1} + \mu \qquad (1)$$
In formula (1), $x_i$ represents the sample basic operation data, $k_i$ represents the parameter corresponding to the sample basic operation data in the time series model, $\mu$ represents a random parameter, and $X_t$ may represent the sample label corresponding to the sample basic operation data $x_i$.
In addition, $k_i$ in formula (1) may represent the target parameter of each sample basic operation data obtained when the difference between the sample risk prediction result and the sample label is zero. By using a linear time series model to adjust the parameters for the sample basic operation data of the sample target customer, the target parameters can represent the degree of correlation between the sample basic operation data and the bond default risk. That is, the target parameters reflect the weight relationship between the sample basic operation data and the bond default risk, which provides an effective basis for subsequently screening out basic operation data with a low degree of correlation with the bond default risk.
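The iterative parameter adjustment for formula (1) can be illustrated with a small sketch. The disclosure does not specify the "preset algorithm"; gradient descent on the mean squared error is assumed here as a stand-in, the random parameter is treated as a learned offset `mu`, and the function name, learning rate, tolerance, and toy data are hypothetical.

```python
def fit_linear_weights(samples, labels, lr=0.05, tol=1e-12, max_iter=20000):
    """Adjust k (and the offset mu) by gradient descent until the loss converges."""
    n_features = len(samples[0])
    k = [0.0] * n_features
    mu = 0.0
    prev_loss = float("inf")
    for _ in range(max_iter):
        preds = [sum(ki * xi for ki, xi in zip(k, x)) + mu for x in samples]
        errors = [p - y for p, y in zip(preds, labels)]
        loss = sum(e * e for e in errors) / len(errors)
        if abs(prev_loss - loss) < tol:
            break  # difference between prediction and label has converged
        prev_loss = loss
        for j in range(n_features):
            grad = 2 * sum(e * x[j] for e, x in zip(errors, samples)) / len(errors)
            k[j] -= lr * grad
        mu -= lr * 2 * sum(errors) / len(errors)
    return k, mu

# Toy data generated from weights 0.8 and 0.1 with zero offset.
samples = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
labels = [0.8, 0.1, 0.9, 0.0]
k, mu = fit_linear_weights(samples, labels)
```

In this toy case the fitted `k` approaches the generating weights, so the first data item would be read as more strongly correlated with the risk label than the second.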
According to the embodiment of the present disclosure, the basic operation data set includes L basic operation data, where L is a positive integer greater than 1.
Operation S220, according to the risk correlation information of each basic operation data, screening out the target operation data from the basic operation data set corresponding to the target customer includes the following operations:
determining M candidate basic operation data in the basic operation data set according to a preset risk threshold, wherein the risk correlation information of each candidate basic operation data is greater than or equal to the preset risk threshold, and L > M ≥ 1; and screening out the M candidate basic operation data from the basic operation data set to obtain M target operation data.
According to the embodiment of the disclosure, the risk correlation information can represent the degree of correlation between the corresponding basic operation data and the product risk. By determining, from the basic operation data set, the M candidate basic operation data whose risk correlation information is greater than or equal to the preset risk threshold, the operation data with a high degree of correlation with the product risk can be identified. This reduces the amount of data to be processed by the subsequent target risk prediction model and/or operation data scoring model, at least partially avoids technical problems such as the curse of dimensionality in the target risk prediction model, and improves the accuracy and timeliness of the product risk prediction result.
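The threshold screening of operation S220 can be sketched as below. The function name, the data item names, the correlation values, and the threshold of 0.4 are hypothetical; the only constraints taken from the text are the "greater than or equal to" comparison and L > M ≥ 1.

```python
def screen_target_fields(risk_correlation, threshold):
    """Keep the data items whose risk correlation is >= threshold (the M candidates)."""
    selected = [name for name, score in risk_correlation.items() if score >= threshold]
    # Enforce L > M >= 1: at least one item kept, at least one item dropped.
    if not 1 <= len(selected) < len(risk_correlation):
        raise ValueError("threshold must keep at least one item and drop at least one")
    return selected

# Hypothetical risk correlation information for L = 4 basic operation data items.
correlation = {
    "loan_balance": 0.62,
    "listed_flag": 0.08,
    "overdue_count": 0.45,
    "region": 0.02,
}
targets = screen_target_fields(correlation, threshold=0.4)
```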
It should be noted that after the M candidate basic operation data are obtained, the candidate basic operation data having different data item types (or different operation data identifiers) but the same meaning in the candidate basic operation data may be merged to further reduce the data amount of the subsequent target operation data.
For example, candidate base operation data having operation data identifications respectively of "listed company identification" and "listed or not listed company" may be merged to reduce the data amount of the subsequent target operation data.
According to an embodiment of the present disclosure, the preset risk threshold may be a fixed threshold, or may also be a dynamic preset risk threshold set according to different target products and/or different target customers. Accordingly, the preset risk threshold may be designed according to actual requirements, and the embodiment of the present disclosure does not limit the specific numerical value and the design manner of the preset risk threshold.
In one embodiment of the present disclosure, the target operation data of the target customer may be represented by the field of operation data identification and the description of the target operation data in table 1.
It should be noted that the target operation data in table 1 is only an exemplary target operation data that can be screened, and does not indicate a specific data value of each target operation data. Those skilled in the art can screen out specific target operation data according to actual needs and according to the product risk prediction method provided in the foregoing embodiment, and the embodiment of the present disclosure does not limit the specific data item type of the target operation data.
TABLE 1
Figure BDA0003827824900000131
Figure BDA0003827824900000141
According to an embodiment of the present disclosure, the product risk prediction method may further include the operations of:
respectively training each initial risk prediction model in the N initial risk prediction models by using a training sample to obtain N trained candidate risk prediction models, wherein the training sample comprises sample target operation data and a sample label corresponding to the sample target operation data, and N is a positive integer greater than 2; determining respective prediction effect information of the N candidate risk prediction models according to the sample labels; and determining a target risk prediction model from the N candidate risk prediction models according to the respective prediction effect information of the N candidate risk prediction models.
According to the embodiment of the disclosure, the N initial risk prediction models may be models having different model structures, and each initial risk prediction model may be trained according to a training method in the related art, thereby obtaining N candidate risk prediction models that may be used for predicting a product risk prediction result. And then determining the candidate risk prediction model with the best prediction effect as the target risk prediction model according to the respective prediction effect information of the candidate risk prediction models, thereby improving the prediction accuracy of the target risk prediction model.
According to an embodiment of the present disclosure, the predicted effect information includes at least one of:
predicted coverage information, predicted accuracy information, and predicted recall information.
According to the embodiment of the disclosure, the prediction effect information can be constructed from any one or more of the prediction coverage rate information, the prediction accuracy rate information and the prediction recall rate information, so that the prediction effect of each candidate risk prediction model is evaluated comprehensively in light of actual demands. This improves the robustness and prediction accuracy of the target risk prediction model and, in turn, the accuracy of the subsequent product risk prediction result.
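Selecting the target risk prediction model from the trained candidates can be sketched as follows. The candidates here are hypothetical rule-based stand-ins for trained models, recall is assumed as the single piece of prediction effect information, and all names are illustrative.

```python
def recall(preds, labels):
    """Share of actual positives (label 1) that were predicted as 1."""
    tp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 1)
    fn = sum(1 for p, y in zip(preds, labels) if p == 0 and y == 1)
    return tp / (tp + fn) if tp + fn else 0.0

def select_target_model(candidates, samples, labels, metric=recall):
    """Return (name, score) of the candidate with the best metric value."""
    scores = {
        name: metric([predict(x) for x in samples], labels)
        for name, predict in candidates.items()
    }
    best = max(scores, key=scores.get)
    return best, scores[best]

# Hypothetical stand-ins for two trained candidate risk prediction models.
candidates = {
    "model_a": lambda x: 1 if x["overdue_count"] >= 2 else 0,
    "model_b": lambda x: 1 if x["overdue_count"] >= 4 else 0,
}
samples = [{"overdue_count": c} for c in (0, 1, 2, 3, 5)]
labels = [0, 0, 1, 1, 1]
best_name, best_score = select_target_model(candidates, samples, labels)
```

Passing the metric as a parameter leaves room to swap in coverage or accuracy, or a combination of them, as the evaluation criterion.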
According to an embodiment of the present disclosure, the target risk prediction model includes a neural network prediction model constructed based on a neural network algorithm.
According to an embodiment of the present disclosure, the neural network prediction model may include, for example, a network model constructed based on neural network algorithms such as a fully-connected neural network, a recurrent neural network, a long-short term memory network, and the like.
It should be noted that, the embodiment of the present disclosure does not limit the specific network structure of the neural network prediction model, and a person skilled in the art may construct an initial risk prediction model based on a neural network algorithm in the related art according to actual needs, and determine a target risk prediction model based on the product risk prediction method in the above embodiment.
According to an embodiment of the present disclosure, the target risk prediction model includes a decision tree prediction model constructed based on a decision tree algorithm.
According to an embodiment of the present disclosure, the decision tree prediction model may determine a risk classification result of the target product, i.e., a first risk prediction result, according to a processing result of the target operation data. Owing to the relatively accurate prediction capability of the decision tree model, the accuracy of the first risk prediction result can be effectively improved, which further improves the prediction accuracy of the subsequent product risk prediction result.
According to an embodiment of the present disclosure, the decision tree prediction model includes any one of:
eXtreme Gradient Boosting (XGBoost), Random Forest (RF), and Light Gradient Boosting Machine (LightGBM).
It should be noted that the target risk prediction model may be determined from the trained candidate risk prediction model, and accordingly, the method for constructing the candidate risk prediction model and the network structure are not limited in the embodiment of the present disclosure, and those skilled in the art may design according to actual requirements.
According to the embodiment of the disclosure, the XGBoost model and the LightGBM model may be selected as initial risk prediction models, the initial risk prediction models are respectively trained by using target basic operation data to obtain trained candidate risk prediction models, and the target risk prediction models are determined according to respective prediction coverage rate information, prediction accuracy rate information and prediction recall rate information of the candidate risk prediction models (i.e., the trained XGBoost model and the LightGBM model).
The XGBoost model is a machine learning algorithm model and also an ensemble learning algorithm model based on CART regression trees. New trees are generated through continuous iteration to learn the residual between the real value (namely the sample label) and the predicted value of all current trees, and the results of all trees are accumulated as the final result, so that a sample risk prediction result with relatively high classification accuracy can be obtained.
The XGBoost model comprises t trees; each tree scores the sample target operation data, and all scoring results are summed to serve as the predicted value of the sample target operation data. The initial predicted value can be set to 0 by equation (2).
Figure BDA0003827824900000161
The prediction function for determining the t-th regression tree by equation (3) can be expressed as:
Figure BDA0003827824900000162
In equations (2) and (3), $x_i$ represents the i-th sample target operation data, $f_k$ denotes the k-th tree, $f_k(x_i)$ denotes the value of $x_i$ in the k-th tree, and $y_i^{(t)}$ denotes the predicted result for the sample target operation data $x_i$ after t iterations.
The loss function of the XGBoost model may be represented by equation (4).
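The images for equations (2) and (3) are not reproduced in this text. As a hedged sketch only, the standard additive formulation of boosted trees, which agrees with the symbol definitions given for these equations, can be written as follows, with the predicted result after t iterations written $\hat{y}_i^{(t)}$. The exact form shown in the original figures is an assumption.

```latex
\hat{y}_i^{(0)} = 0 \qquad (2)

\hat{y}_i^{(t)} = \sum_{k=1}^{t} f_k(x_i) = \hat{y}_i^{(t-1)} + f_t(x_i) \qquad (3)
```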
Figure BDA0003827824900000163
In formula (4),
Figure BDA0003827824900000164
is the loss function that directly measures the error between the predicted value and the actual value, and
Figure BDA0003827824900000165
is a regularization term used to prevent overfitting. $y_i$ represents the sample label corresponding to the i-th sample target operation data, and
Figure BDA0003827824900000166
represents the predicted value of the XGBoost model for the i-th sample target operation data after t rounds of iteration. The XGBoost model can adopt an additive model, that is, the predicted value of round t for the sample target operation data is the sum of the scores assigned to the sample target operation data in the first t iterations. The loss function of the XGBoost model is further expressed by equation (5).
Figure BDA0003827824900000167
In formula (5),
Figure BDA0003827824900000168
denotes the predicted value of round t-1, and $f_t(x_i)$ denotes the score assigned to the sample target operation data in round t. The regularization term of round t is the sum of the regularization terms of the t trees, namely the sum of the complexity of each tree; its purpose is to control the complexity of the XGBoost model and prevent overfitting. The regularization term is represented by equation (6):
Figure BDA0003827824900000171
In formula (6), T represents the number of leaf nodes, $\omega_j$ represents the value of the j-th leaf node, and $\tau$ and $\alpha$ both represent penalty coefficients. In practical applications, these parameters can be tuned so that the XGBoost model achieves its best effect.
The LightGBM algorithm, a decision tree algorithm for processing massive data, is an evolution of the GBDT (Gradient Boosting Decision Tree) algorithm. Unlike the conventional GBDT and XGBoost algorithms, the LightGBM algorithm improves the efficiency of model training by using a histogram algorithm, the GOSS algorithm (Gradient-based One-Side Sampling) and the EFB algorithm (Exclusive Feature Bundling), while reducing the time complexity of training and the occupation of memory resources.
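Equations (4) to (6) appear only as images in this text. For orientation, the standard XGBoost objective, rewritten with the penalty coefficients named $\tau$ and $\alpha$ as in the definitions for formula (6), would read as below. This is a sketch under assumptions: the assignment of $\tau$ to the leaf-count penalty and $\alpha$ to the leaf-value penalty, the factor $\tfrac{1}{2}$, and the summation index $n$ (the number of samples) are not confirmed by the source.

```latex
Obj^{(t)} = \sum_{i=1}^{n} l\left(y_i, \hat{y}_i^{(t)}\right) + \sum_{k=1}^{t} \Omega(f_k) \qquad (4)

Obj^{(t)} = \sum_{i=1}^{n} l\left(y_i, \hat{y}_i^{(t-1)} + f_t(x_i)\right) + \sum_{k=1}^{t} \Omega(f_k) \qquad (5)

\Omega(f) = \tau T + \frac{1}{2} \alpha \sum_{j=1}^{T} \omega_j^{2} \qquad (6)
```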
The LightGBM model adopts a leaf-wise strategy to split leaf nodes, calculates the information gains of all current leaf nodes, finds the leaf node with the maximum information gain to split, and limits the depth of the LightGBM model to prevent overfitting and shorten the time for finding the optimal depth tree.
The LightGBM model has advantages over conventional models in model training, mainly in two respects: direct use of the original data item types of the target operation data, and parallelism of model training. Regarding the original data item types, a conventional algorithm model needs to one-hot encode the target operation data before model training, which increases the storage space relative to the original target operation data and reduces model training efficiency, whereas the LightGBM model can use the original data item types directly, which reduces the model training time to a certain extent. The parallelism of model training includes data item type parallelism and data parallelism. Massive data needs to be trained during model training, which carries a high overhead; the LightGBM model can place different target operation data sets on different machines for training, so that the model can be optimally divided among the machines and the training efficiency is greatly improved.
According to the embodiment of the disclosure, the default risk of the bond product can be predicted based on the LightGBM model and the XGboost model obtained after training, and since the bond default is a small probability event, the Coverage Rate (CR), the accuracy rate (p) and the recall rate (AVG) can be used as the prediction effect information to evaluate the prediction effect of the model.
For example, the prediction effect information of each of the LightGBM model and the XGBoost model may be represented based on the confusion matrix in table 2.
TABLE 2
Figure BDA0003827824900000181
In table 2, TP means that the prediction result is 1 and the actual result (sample label) is 1, i.e., the prediction is correct; FP means that the prediction result is 1 and the actual result is 0, i.e., the prediction is wrong; FN means that the prediction result is 0 and the actual result is 1, i.e., the prediction is wrong; TN means that the prediction result is 0 and the actual result is 0, i.e., the prediction is correct. Type I error represents the error rate when the prediction result is 1 and the actual result is 0, and Type II error represents the error rate when the prediction result is 0 and the actual result is 1.
Coverage (CR) refers to the extent to which the prediction list of at-risk entities output by the candidate risk prediction model covers the target customer entities that actually default on bond products in the market. The coverage (CR) can be expressed by, for example, formula (7).
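The four cells of the confusion matrix can be counted as in the sketch below. Table 2 is available only as an image, so the denominators used for the two error rates (false-positive rate and false-negative rate) are the conventional ones and are an assumption here; the function names and sample data are hypothetical.

```python
def confusion_counts(preds, labels):
    """Count TP, FP, FN, TN for binary results (1 = default, 0 = no default)."""
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for p, y in zip(preds, labels):
        if p == 1 and y == 1:
            counts["TP"] += 1
        elif p == 1 and y == 0:
            counts["FP"] += 1
        elif p == 0 and y == 1:
            counts["FN"] += 1
        else:
            counts["TN"] += 1
    return counts

def error_rates(counts):
    """Assumed definitions: Type I = FP / (FP + TN), Type II = FN / (FN + TP)."""
    negatives = counts["FP"] + counts["TN"]
    positives = counts["FN"] + counts["TP"]
    type_1 = counts["FP"] / negatives if negatives else 0.0
    type_2 = counts["FN"] / positives if positives else 0.0
    return type_1, type_2

preds = [1, 1, 0, 0, 1, 0]
labels = [1, 0, 0, 1, 1, 0]
counts = confusion_counts(preds, labels)
type_1, type_2 = error_rates(counts)
```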
Figure BDA0003827824900000182
In formula (7), CR represents the coverage, c represents the number of target customer entities in the prediction list that actually default, D represents the number of target customer entities that default in the market, n represents the number of target customer entities in the prediction list, p represents the prediction accuracy, N represents the number of target customer entities that issue bond products in the market, and d represents the bond default rate in the market.
Accuracy (p) may refer to the proportion of target customer entities in the prediction list of the candidate risk prediction model that actually default, and can represent the accuracy of the candidate risk prediction model. The accuracy (p) can be expressed by, for example, equation (8).
Figure BDA0003827824900000191
In formula (8), p represents the accuracy, c represents the number of target customer entities in the prediction list that actually default, and n represents the number of target customer entities in the prediction list.
The recall rate (AVG) may be the ratio of the target customers whose default result predicted by the candidate risk prediction model matches the actual default result represented by the sample label to the target customers that actually default. The recall rate (AVG) can represent the completeness of the candidate risk prediction model in predicting defaulting bonds, and can be expressed by, for example, equation (9).
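Formulas (7) and (8) are available only as images. Under the reading that c is the number of actual defaulters in the prediction list, n the size of the prediction list, N the number of bond issuers in the market, and d the market default rate, the relations p = c / n, D = N * d and CR = c / D = (n * p) / (N * d) follow from the verbal definitions. The sketch below assumes that reading; the function name and the figures are hypothetical.

```python
def coverage_and_accuracy(c, n, N, d):
    """Return (CR, p) under the assumed reading CR = c / (N * d) and p = c / n."""
    p = c / n            # share of the prediction list that actually defaults
    D = N * d            # number of defaulting entities in the market
    cr = c / D           # share of market defaulters captured by the list
    return cr, p

# Hypothetical figures: 100 listed entities, 30 of which default,
# in a market of 2000 issuers with a 2% default rate.
cr, p = coverage_and_accuracy(c=30, n=100, N=2000, d=0.02)
```

With these figures, the same coverage is obtained from (n * p) / (N * d), which shows how enlarging n raises coverage when p is held fixed.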
Figure BDA0003827824900000192
In formula (9), AVG represents the recall rate, c represents the number of target customer entities in the prediction list that actually default, D represents the number of target customer entities that default in the market, n represents the number of target customer entities in the prediction list, N represents the number of target customer entities that issue bond products in the market, and d represents the bond default rate in the market.
According to the embodiment of the present disclosure, the accuracy (p) can generally be improved only by improving the feature selection of the model, i.e., improving the selection of the target basic operation data, or by improving the data quality of the target basic operation data. An exact prediction accuracy is difficult to obtain, and the stability of the target risk prediction model needs to be verified. Therefore, when the accuracy (p) is unknown, one deterministic way to ensure the coverage (CR) is to increase the number (n) of target customer entities in the prediction list. However, blindly increasing n can cause an excessively high Type I error, that is, the candidate risk prediction model predicts a default that does not actually occur (false positive). Conversely, when the coverage (CR) is insufficient, the Type II error (1-p) is too high, that is, entities that actually default are not reflected in the prediction list in their actual proportion. Therefore, in constructing the candidate risk prediction model, the Type II error may be limited while keeping n as small as possible, according to the actual risk control requirement.
Fig. 4 is a schematic diagram illustrating the prediction effect information of each candidate risk prediction model according to an embodiment of the disclosure.
As shown in fig. 4, the sample target basic operation data may be bond default data at monthly granularity; for example, the bond default data for the whole year from January 2020 to December 2020 may be used for training and testing. Finally, the prediction results of the candidate risk prediction models for bond default are evaluated using the recall rate.
Graph (a) in fig. 4 may show the prediction recall rates of the candidate risk prediction models obtained by training the initial LightGBM model and the initial XGBoost model when the sample target basic operation data is divided into a training set and a test set not proportionally.
Graph (b) in fig. 4 may show the prediction recall rates of the candidate LightGBM model and the candidate XGBoost model under cross-validation when the sample target basic operation data is divided proportionally into a training set and a test set.
It should be noted that the top 1%, top 2.5%, and top 5% in graphs (a) and (b) of fig. 4 may represent the top 1%, top 2.5%, and top 5% of the sample risk prediction results output by the trained XGBoost model and the trained LightGBM model, respectively. Accordingly, the respective prediction recall rates of the trained XGBoost model and the trained LightGBM model may be determined from the top 1%, top 2.5%, and top 5% of the sample risk prediction results.
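Evaluating recall on the highest-ranked portion of the sample risk prediction results can be sketched as follows. The ranking by descending predicted risk, the rounding of the cut-off with `math.ceil`, the function name, and the toy scores are assumptions for illustration; the disclosure does not state how the top fractions are cut.

```python
import math

def recall_at_top(scores, labels, fraction):
    """Recall achieved within the top `fraction` of samples ranked by predicted risk."""
    k = max(1, math.ceil(len(scores) * fraction))
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = sum(label for _, label in ranked[:k])   # actual defaults inside the top k
    total = sum(labels)                            # all actual defaults
    return hits / total if total else 0.0

# Hypothetical predicted risks and actual default labels for ten entities.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
labels = [1, 0, 1, 0, 0, 0, 1, 0, 0, 0]
top_20 = recall_at_top(scores, labels, 0.2)
top_50 = recall_at_top(scores, labels, 0.5)
```

Comparing such values across candidate models at the same cut-off is one way to read the bars in a chart like fig. 4.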
By combining the graph (a) and the graph (b) in fig. 4, the XGBoost model obtained after training can be determined as the target risk prediction model.
According to an embodiment of the present disclosure, the risk prediction result includes a first product risk probability, and the product risk prediction result includes a product risk probability;
operation S250, determining a product risk prediction result of the target product associated with the target customer according to the risk prediction result and the operation scoring result, may include the following operations:
processing the operation scoring result by using a preset mapping function to obtain a second product risk probability corresponding to the operation scoring result; and determining the product risk probability of the target product according to the first product risk probability and the second product risk probability.
According to the embodiment of the disclosure, the operation data scoring model can be constructed based on a rule scoring model (also called a scoring card model) in the related art, so that the target operation data can be evaluated numerically by using the operation data scoring model.
The operation data scoring model can evaluate the target operation data from two aspects; that is, it can be constructed using an empirical model and a data model. In the empirical model, experienced business personnel grade the business scenario, and the grades are then converted into numerical values, so that the empirical model can compute a score for the target operation data. The data model scores the target operation data using a probability calculation method; it has stronger computational logic, is more operable from the viewpoint of calculation, and is more instructive from the viewpoint of the result.
In one embodiment of the present disclosure, the operation data scoring model may be constructed based on an empirical model, for example, the operation data scoring model may be constructed through table 3, and the operation scoring result is further obtained according to the target operation data.
TABLE 3
Figure BDA0003827824900000211
Figure BDA0003827824900000221
As can be seen from table 3, the Shenwan rule and the rules in the institution can be constructed based on the result of the expert experience summary, and further, the operation data scoring model can be constructed by methods such as a programming language in the related art. Target operation data are input into the operation data scoring model, and default risk scoring results, namely operation scoring results, aiming at the target products can be comprehensively obtained.
Further, an arc tangent function can be selected as a preset mapping function, so that the operation scoring result can be standardized by using the arc tangent function, and a second product risk probability which is uniform with the first product risk probability is obtained.
A characteristic of the arctan function is that it is continuous, and in the process of data standardization it maps the data to the interval [0,1]. The operation scoring result can therefore be mapped to the interval [0,1], consistent with the first product risk probability. The second product risk probability may be generated, for example, by equation (10).
scoreModel=atan(-q)*2/π; (10)
In formula (10), q represents the operation scoring result, atan () represents the arctan function, and scoreModel represents the second product risk probability.
According to the embodiment of the disclosure, the product risk prediction result of the target product associated with the target customer can be determined based on the average value of the first product risk probability and the second product risk probability, so that the risk generation probability of the target product is quantitatively represented, and the product risk of the target product is accurately predicted and evaluated.
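Formula (10) and the averaging step can be sketched as below. The function names and input values are hypothetical. Note that atan(-q) * 2 / π lies in [0, 1) only when q ≤ 0, so the sketch assumes the operation scoring result is non-positive (for example, a deduction-style score); that sign convention is an assumption, not something stated in the text.

```python
import math

def second_risk_probability(q):
    """Formula (10): scoreModel = atan(-q) * 2 / pi."""
    return math.atan(-q) * 2 / math.pi

def product_risk_probability(first_prob, q):
    """Average of the first (model) and second (score-based) risk probabilities."""
    return (first_prob + second_risk_probability(q)) / 2

# Hypothetical inputs: model probability 0.3 and an operation score of -1.
second = second_risk_probability(-1.0)
combined = product_risk_probability(0.3, -1.0)
```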
Fig. 5 schematically shows an application scenario of the product risk prediction method according to the embodiment of the disclosure.
As shown in fig. 5, in this application scenario, the target operation data 510 may be operation data screened from a basic operation data set corresponding to the target customer. Target operational data 510 may include intra-agency rated credit data 511 and public financial data.
By the product risk prediction method provided by the above embodiment, the target risk prediction model 521 can be determined from candidate risk prediction models obtained after training.
In this embodiment, the target risk prediction model 521 may be determined as a trained XGBoost model. Accordingly, the operational data scoring model 522 may also be constructed based on the expert experience described above.
The target operation data 510 is respectively input to the target risk prediction model 521 and the operation data scoring model 522, so that the first product risk probability 531 output by the target risk prediction model 521 and the operation scoring result 532 output by the operation data scoring model 522 can be obtained.
Processing the operation scoring results 532 using a predetermined mapping function, such as an arctan function, may map the operation scoring results 532 to a range of [0,1] values, thereby obtaining a second product risk probability 533.
According to the average value of the first product risk probability 531 and the second product risk probability 533, a product risk prediction result of the target product associated with the target customer can be obtained, so that the product risk of the target product can be accurately predicted.
Further, when there are a plurality of target customers, the target customers can be ranked by product risk based on the product risk prediction result corresponding to each target customer, and target customers with lower product risk can then be selected as the target customers for prospective transactions, so as to reduce the product risk in subsequent product transactions.
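Ranking several target customers by their product risk prediction results can be sketched in a few lines. The customer identifiers, probabilities, and the choice to keep the two lowest-risk customers are hypothetical.

```python
def rank_customers_by_risk(risk_by_customer):
    """Return (customer, risk) pairs ordered from lowest to highest product risk."""
    return sorted(risk_by_customer.items(), key=lambda item: item[1])

# Hypothetical product risk probabilities per target customer.
risks = {"customer_a": 0.42, "customer_b": 0.07, "customer_c": 0.19}
ranking = rank_customers_by_risk(risks)
lowest_risk = [name for name, _ in ranking[:2]]
```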
Based on the product risk prediction method, the disclosure also provides a product risk prediction device. The apparatus will be described in detail below with reference to fig. 6.
Fig. 6 schematically shows a block diagram of a product risk prediction device according to an embodiment of the present disclosure.
As shown in fig. 6, the product risk prediction apparatus 600 of this embodiment includes a query module 610, a first screening module 620, a risk prediction module 630, an operation scoring module 640, and a product risk determination module 650.
The query module 610 is configured to query, from the basic operation data table, a basic operation data set corresponding to the target customer and risk correlation information of the basic operation data in the basic operation data set by using the target customer field and the basic operation data field.
The first screening module 620 is configured to screen out target operation data from the basic operation data set corresponding to the target customer according to the risk correlation information of each of the basic operation data.
The risk prediction module 630 is configured to input the target operation data into the target risk prediction model and output a risk prediction result.
The operation scoring module 640 is configured to input the target operation data into the operation data scoring model, and output an operation scoring result.
The product risk determination module 650 is configured to determine a product risk prediction result of the target product associated with the target customer according to the risk prediction result and the operation scoring result.
According to an embodiment of the present disclosure, the base operation data has an operation data identification.
The product risk prediction device further includes: a risk correlation analysis module and a first determination module.
The risk correlation analysis module is configured to perform risk correlation analysis on the sample basic operation data according to the time series model to obtain risk correlation information of the sample basic operation data, wherein the sample basic operation data has a sample operation data identifier corresponding to the basic operation data.
The first determining module is used for determining the respective risk correlation information of the basic operation data according to the risk correlation information of the basic operation data of the sample and the corresponding relation between the sample operation data identification and the operation data identification.
According to an embodiment of the present disclosure, a risk correlation analysis module includes: a risk correlation analysis unit and a risk correlation information generation unit.
And the risk correlation analysis unit is used for inputting the sample basic operation data into the time series model to obtain a sample risk prediction result.
The risk correlation information generation unit processes the sample risk prediction result and the sample label corresponding to the sample basic operation data based on a preset algorithm so as to iteratively adjust parameters of the time series model until a difference value between the sample risk prediction result and the sample label is converged to obtain target parameters of the time series model, wherein the target parameters of the time series model are the respective risk correlation information of the sample basic operation data.
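The idea of reading risk correlation information off the fitted parameters can be illustrated with a minimal sketch. The disclosure does not specify the time series model or the preset algorithm, so the example below substitutes a plain logistic model trained by gradient descent; the function name, learning rate, and convergence tolerance are assumptions made for illustration only.

```python
import math
from typing import List, Sequence


def fit_relevance(samples: Sequence[Sequence[float]], labels: Sequence[int],
                  lr: float = 0.5, tol: float = 1e-6,
                  max_iter: int = 10_000) -> List[float]:
    """Fit a logistic model by gradient descent and return one weight per feature.

    Stand-in for the time series model: each feature is one kind of sample basic
    operation data, and its fitted weight plays the role of risk correlation.
    """
    n = len(samples)
    n_features = len(samples[0])
    weights = [0.0] * n_features
    bias = 0.0
    prev_loss = float("inf")
    for _ in range(max_iter):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        loss = 0.0
        for x, y in zip(samples, labels):
            z = bias + sum(w * xi for w, xi in zip(weights, x))
            p = 1.0 / (1.0 + math.exp(-z))          # sample risk prediction result
            p = min(max(p, 1e-12), 1.0 - 1e-12)     # keep log() finite
            loss -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
            for j, xi in enumerate(x):
                grad_w[j] += (p - y) * xi
            grad_b += p - y
        # Adjust parameters using the averaged gradient of the loss.
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
        loss /= n
        if abs(prev_loss - loss) < tol:             # difference has converged
            break
        prev_loss = loss
    return weights
```

In this sketch a feature that separates risky from non-risky samples ends up with a large weight, while an uninformative feature stays close to zero, which is what makes the weights usable as a screening criterion.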
According to an embodiment of the present disclosure, the basic operation data set includes L basic operation data, where L is a positive integer greater than 1.
The first screening module includes: a first determining unit and a first screening unit.
The first determining unit is configured to determine M candidate basic operation data in the basic operation data set according to a preset risk threshold, wherein the risk correlation information of each candidate basic operation data is greater than or equal to the preset risk threshold, and L > M ≥ 1.
The first screening unit is configured to screen the M candidate basic operation data out of the basic operation data set to obtain M target operation data.
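A minimal sketch of the threshold screening, including the constraint L > M ≥ 1, is given below. Representing the risk correlation information as a dictionary keyed by operation data identifier, and raising an error when the constraint is violated, are assumptions for illustration; the disclosure does not say how a violation should be handled.

```python
from typing import Dict, List


def screen_targets(risk_correlation: Dict[str, float], threshold: float) -> List[str]:
    """Return identifiers whose risk correlation is >= the preset risk threshold."""
    candidates = [op_id for op_id, corr in risk_correlation.items()
                  if corr >= threshold]
    total = len(risk_correlation)   # L
    selected = len(candidates)      # M
    if not 1 <= selected < total:
        # Assumed handling: the threshold must keep at least one item and drop at least one.
        raise ValueError(f"expected L > M >= 1, got L={total}, M={selected}")
    return candidates
```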
According to an embodiment of the present disclosure, the product risk prediction device further includes: a model training module, a second determining module, and a third determining module.
The model training module is configured to train each of N initial risk prediction models with training samples to obtain N trained candidate risk prediction models, wherein the training samples include sample target operation data and a sample label corresponding to the sample target operation data, and N is a positive integer greater than 2.
The second determining module is configured to determine the respective prediction effect information of the N candidate risk prediction models according to the sample labels.
The third determining module is configured to determine the target risk prediction model from the N candidate risk prediction models according to the respective prediction effect information of the N candidate risk prediction models.
According to an embodiment of the present disclosure, the predicted effect information includes at least one of:
predicted coverage rate information, predicted accuracy rate information and predicted recall rate information.
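Selecting the target model from the candidates can be sketched as follows. The example computes only accuracy and recall, because the disclosure does not define how coverage is measured, and it ranks candidates by accuracy with recall as the tie-breaker; both the metric definitions and the ranking rule are assumptions, and the candidates are represented as plain callables returning 0 or 1.

```python
from typing import Callable, Dict, Sequence, Tuple

Model = Callable[[Sequence[float]], int]


def evaluate(model: Model, samples: Sequence[Sequence[float]],
             labels: Sequence[int]) -> Tuple[float, float]:
    """Return (accuracy, recall) of a candidate model against the sample labels."""
    preds = [model(x) for x in samples]
    correct = sum(p == y for p, y in zip(preds, labels))
    true_pos = sum(p == 1 and y == 1 for p, y in zip(preds, labels))
    positives = sum(y == 1 for y in labels)
    accuracy = correct / len(labels)
    recall = true_pos / positives if positives else 0.0
    return accuracy, recall


def select_target_model(candidates: Dict[str, Model],
                        samples: Sequence[Sequence[float]],
                        labels: Sequence[int]) -> str:
    """Pick the candidate with the best (accuracy, recall) pair."""
    scores = {name: evaluate(m, samples, labels) for name, m in candidates.items()}
    return max(scores, key=lambda name: scores[name])
```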
According to an embodiment of the present disclosure, the target risk prediction model includes a decision tree prediction model constructed based on a decision tree algorithm; or
the target risk prediction model includes a neural network prediction model constructed based on a neural network algorithm.
According to an embodiment of the present disclosure, the decision tree prediction model includes any one of:
an extreme gradient boosting model, a random forest model, and a light gradient boosting model.
According to an embodiment of the present disclosure, the risk prediction result includes a first product risk probability, and the product risk prediction result includes a product risk probability;
the product risk determination module includes: a risk probability generating unit and a product risk probability determining unit.
The risk probability generating unit is configured to process the operation scoring result by using a preset mapping function to obtain a second product risk probability corresponding to the operation scoring result.
The product risk probability determining unit is used for determining the product risk probability of the target product according to the first product risk probability and the second product risk probability.
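One way the two probabilities could be combined is sketched below. The disclosure leaves both the preset mapping function and the combination rule open, so the linear mapping from a score in the range 0 to 100 (higher score meaning lower risk) and the weighted average are assumptions chosen for illustration.

```python
def score_to_probability(score: float, max_score: float = 100.0) -> float:
    """Hypothetical mapping function: a higher operation score means lower risk."""
    clamped = min(max(score, 0.0), max_score)
    return (max_score - clamped) / max_score


def combine_probabilities(first: float, second: float, weight: float = 0.5) -> float:
    """Weighted average of the first and second product risk probabilities."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * first + (1.0 - weight) * second
```

With `weight` set to 1.0 the result depends only on the model output, and with 0.0 only on the operation score, so the parameter expresses how much each source is trusted.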
According to an embodiment of the present disclosure, any number of the query module 610, the first screening module 620, the risk prediction module 630, the operation scoring module 640, and the product risk determination module 650 may be combined and implemented in one module, or any one of them may be split into a plurality of modules. Alternatively, at least part of the functionality of one or more of these modules may be combined with at least part of the functionality of the other modules and implemented in one module. According to an embodiment of the present disclosure, at least one of the query module 610, the first screening module 620, the risk prediction module 630, the operation scoring module 640, and the product risk determination module 650 may be implemented at least in part as a hardware circuit, such as a Field Programmable Gate Array (FPGA), a Programmable Logic Array (PLA), a system on a chip, a system on a substrate, a system on a package, or an Application Specific Integrated Circuit (ASIC), or may be implemented in hardware or firmware by any other reasonable manner of integrating or packaging a circuit, or in any one of software, hardware, and firmware, or in any suitable combination of the three. Alternatively, at least one of the query module 610, the first screening module 620, the risk prediction module 630, the operation scoring module 640, and the product risk determination module 650 may be implemented at least in part as a computer program module that, when executed, performs the corresponding functions.
Fig. 7 schematically shows a block diagram of an electronic device adapted to implement a product risk prediction method according to an embodiment of the present disclosure.
As shown in fig. 7, an electronic device 700 according to an embodiment of the present disclosure includes a processor 701, which can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 702 or a program loaded from a storage section 708 into a Random Access Memory (RAM) 703. The processor 701 may include, for example, a general purpose microprocessor (e.g., a CPU), an instruction set processor and/or associated chipset, and/or a special purpose microprocessor (e.g., an Application Specific Integrated Circuit (ASIC)), among others. The processor 701 may also include on-board memory for caching purposes. The processor 701 may comprise a single processing unit or a plurality of processing units for performing the different actions of the method flows according to embodiments of the present disclosure.
In the RAM 703, various programs and data necessary for the operation of the electronic apparatus 700 are stored. The processor 701, the ROM 702, and the RAM 703 are connected to each other by a bus 704. The processor 701 performs various operations of the method flows according to the embodiments of the present disclosure by executing programs in the ROM 702 and/or the RAM 703. It is noted that the programs may also be stored in one or more memories other than the ROM 702 and RAM 703. The processor 701 may also perform various operations of method flows according to embodiments of the present disclosure by executing programs stored in the one or more memories.
According to an embodiment of the present disclosure, the electronic device 700 may also include an input/output (I/O) interface 705, which is also connected to the bus 704. The electronic device 700 may also include one or more of the following components connected to the I/O interface 705: an input section 706 including a keyboard, a mouse, and the like; an output section 707 including a display such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), and a speaker; a storage section 708 including a hard disk and the like; and a communication section 709 including a network interface card such as a LAN card, a modem, or the like. The communication section 709 performs communication processing via a network such as the internet. A drive 710 is also connected to the I/O interface 705 as needed. A removable medium 711 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 710 as necessary, so that a computer program read out therefrom is installed in the storage section 708 as necessary.
The present disclosure also provides a computer-readable storage medium, which may be embodied in the device/apparatus/system described in the above embodiments; or may exist alone without being assembled into the device/apparatus/system. The computer-readable storage medium carries one or more programs which, when executed, implement the method according to an embodiment of the disclosure.
According to embodiments of the present disclosure, the computer-readable storage medium may be a non-volatile computer-readable storage medium, which may include, for example but is not limited to: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. For example, according to embodiments of the present disclosure, a computer-readable storage medium may include the ROM 702 and/or the RAM 703 and/or one or more memories other than the ROM 702 and the RAM 703 described above.
Embodiments of the present disclosure also include a computer program product comprising a computer program containing program code for performing the method illustrated by the flow chart. When the computer program product runs in a computer system, the program code is used for causing the computer system to realize the method provided by the embodiment of the disclosure.
The computer program performs the above-described functions defined in the system/apparatus of the embodiments of the present disclosure when executed by the processor 701. The systems, apparatuses, modules, units, etc. described above may be implemented by computer program modules according to embodiments of the present disclosure.
In one embodiment, the computer program may be hosted on a tangible storage medium such as an optical storage device, a magnetic storage device, and the like. In another embodiment, the computer program may also be transmitted in the form of a signal on a network medium, distributed, downloaded and installed via the communication section 709, and/or installed from the removable medium 711. The computer program containing program code may be transmitted using any suitable network medium, including but not limited to: wireless, wired, etc., or any suitable combination of the foregoing.
In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 709, and/or installed from the removable medium 711. The computer program, when executed by the processor 701, performs the above-described functions defined in the system of the embodiment of the present disclosure. The systems, devices, apparatuses, modules, units, etc. described above may be implemented by computer program modules according to embodiments of the present disclosure.
In accordance with embodiments of the present disclosure, program code for executing computer programs provided by embodiments of the present disclosure may be written in any combination of one or more programming languages, and in particular, these computer programs may be implemented using high level procedural and/or object oriented programming languages, and/or assembly/machine languages. The programming languages include, but are not limited to, Java, C++, Python, the "C" language, or the like. The program code may execute entirely on the user computing device, partly on the user device, partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., through the internet using an internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
It will be appreciated by a person skilled in the art that the features recited in the various embodiments of the disclosure and/or in the claims may be combined and/or incorporated in various ways, even if such combinations or incorporations are not explicitly recited in the disclosure. In particular, the features recited in the various embodiments and/or claims of the present disclosure may be combined and/or incorporated in various ways without departing from the spirit or teaching of the present disclosure. All such combinations and/or incorporations are within the scope of the present disclosure.
The embodiments of the present disclosure are described above. However, these embodiments are for illustrative purposes only and are not intended to limit the scope of the present disclosure. Although the embodiments are described separately above, this does not mean that the measures in the embodiments cannot be used in advantageous combination. The scope of the disclosure is defined by the appended claims and equivalents thereof. Various alternatives and modifications can be devised by those skilled in the art without departing from the scope of the present disclosure, and such alternatives and modifications are intended to be within the scope of the present disclosure.

Claims (13)

1. A product risk prediction method, comprising:
querying, from a basic operation data table, a basic operation data set corresponding to a target customer and the respective risk correlation information of basic operation data in the basic operation data set by using a target customer field and a basic operation data field;
screening out target operation data from the basic operation data set corresponding to the target customer according to the respective risk correlation information of the basic operation data;
inputting the target operation data into a target risk prediction model and outputting a risk prediction result;
inputting the target operation data into an operation data scoring model, and outputting an operation scoring result; and
determining a product risk prediction result of a target product associated with the target customer according to the risk prediction result and the operation scoring result.
2. The method of claim 1, wherein the basic operation data has an operation data identifier;
the product risk prediction method further comprises:
performing risk correlation analysis on sample basic operation data according to a time series model to obtain risk correlation information of the sample basic operation data, wherein the sample basic operation data has a sample operation data identifier corresponding to the basic operation data; and
determining the respective risk correlation information of the basic operation data according to the risk correlation information of the sample basic operation data and the correspondence between the sample operation data identifier and the operation data identifier.
3. The method of claim 2, wherein performing risk correlation analysis on the sample basic operation data according to the time series model to obtain risk correlation information of the sample basic operation data comprises:
inputting the sample basic operation data into the time series model to obtain a sample risk prediction result; and
processing the sample risk prediction result and a sample label corresponding to the sample basic operation data based on a preset algorithm so as to iteratively adjust parameters of the time series model until a difference value between the sample risk prediction result and the sample label is converged to obtain target parameters of the time series model, wherein the target parameters of the time series model are respective risk correlation information of the sample basic operation data.
4. The method of any of claims 1 to 3, wherein the basic operation data set comprises L basic operation data, L being a positive integer greater than 1;
screening the target operation data from the basic operation data set corresponding to the target customer according to the respective risk correlation information of the basic operation data comprises:
determining M candidate basic operation data in the basic operation data set according to a preset risk threshold, wherein the risk correlation information of each candidate basic operation data is greater than or equal to the preset risk threshold, and L > M ≥ 1; and
screening the M candidate basic operation data out of the basic operation data set to obtain M target operation data.
5. The method of claim 1, further comprising:
respectively training each initial risk prediction model in the N initial risk prediction models by using a training sample to obtain N trained candidate risk prediction models, wherein the training sample comprises sample target operation data and a sample label corresponding to the sample target operation data, and N is a positive integer greater than 2;
determining respective prediction effect information of the N candidate risk prediction models according to the sample labels; and
determining the target risk prediction model from the N candidate risk prediction models according to the respective prediction effect information of the N candidate risk prediction models.
6. The method of claim 5, wherein the prediction effect information comprises at least one of:
predicted coverage rate information, predicted accuracy rate information, and predicted recall rate information.
7. The method of claim 1, wherein,
the target risk prediction model comprises a decision tree prediction model constructed based on a decision tree algorithm; or
the target risk prediction model comprises a neural network prediction model constructed based on a neural network algorithm.
8. The method of claim 7, wherein the decision tree prediction model comprises any one of:
an extreme gradient boosting model, a random forest model, and a light gradient boosting model.
9. The method of claim 1, wherein the risk prediction result includes a first product risk probability, and the product risk prediction result includes a product risk probability;
wherein determining a product risk prediction result for a target product associated with the target customer based on the risk prediction result and the operation scoring result comprises:
processing the operation scoring result by using a preset mapping function to obtain a second product risk probability corresponding to the operation scoring result; and
determining the product risk probability of the target product according to the first product risk probability and the second product risk probability.
10. A product risk prediction device, comprising:
a query module configured to query, from a basic operation data table, a basic operation data set corresponding to a target customer and the respective risk correlation information of basic operation data in the basic operation data set by using a target customer field and a basic operation data field;
a first screening module configured to screen out target operation data from the basic operation data set corresponding to the target customer according to the respective risk correlation information of the basic operation data;
a risk prediction module configured to input the target operation data into a target risk prediction model and output a risk prediction result;
an operation scoring module configured to input the target operation data into an operation data scoring model and output an operation scoring result; and
a product risk determination module configured to determine a product risk prediction result of a target product associated with the target customer according to the risk prediction result and the operation scoring result.
11. An electronic device, comprising:
one or more processors;
a storage device to store one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the method of any of claims 1-9.
12. A computer readable storage medium having stored thereon executable instructions which, when executed by a processor, cause the processor to perform the method according to any one of claims 1 to 9.
13. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any one of claims 1 to 9.
CN202211068818.8A 2022-09-01 2022-09-01 Product risk prediction method, device, equipment and medium Pending CN115409636A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211068818.8A CN115409636A (en) 2022-09-01 2022-09-01 Product risk prediction method, device, equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211068818.8A CN115409636A (en) 2022-09-01 2022-09-01 Product risk prediction method, device, equipment and medium

Publications (1)

Publication Number Publication Date
CN115409636A true CN115409636A (en) 2022-11-29

Family

ID=84163218

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211068818.8A Pending CN115409636A (en) 2022-09-01 2022-09-01 Product risk prediction method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN115409636A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination