CN115293889A - Credit risk prediction model training method, electronic device and readable storage medium - Google Patents
- Publication number
- CN115293889A (application CN202210995711.1A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
Abstract
The application discloses a credit risk prediction model training method, an electronic device and a readable storage medium, applied to the technical field of financial technology. The credit risk prediction model training method comprises: acquiring a training sample set and the sample weight of each training sample in the training sample set; iteratively training, according to the training sample set and the sample weights, to obtain a credit risk prediction total model, wherein the credit risk prediction total model consists of a plurality of risk prediction submodels; clustering the risk prediction submodels according to the submodel prediction results of the risk prediction submodels to obtain submodel groups; amplifying the training sample set according to the submodel prediction results and the submodel groups; and returning to the step of iteratively training, according to the training sample set and the sample weights, to obtain the credit risk prediction total model, until the credit risk prediction total model meets a preset iteration update end condition, thereby obtaining a target credit risk prediction total model. The application addresses the technical problem that the prediction accuracy of user credit risk is low.
Description
Technical Field
The application relates to the technical field of artificial intelligence of financial technology (Fintech), in particular to a credit risk prediction model training method, electronic equipment and a readable storage medium.
Background
With the continuous development of financial technology, especially internet finance, more and more technologies (such as distributed computing and artificial intelligence) are applied in the financial field. The financial industry in turn places higher requirements on these technologies, for example on evaluating the credit level of users.
At present, to evaluate a user's credit level, a risk total model to be trained is trained on the credit risks that a plurality of trained risk submodels predict for user behavior data, together with the real credit risks corresponding to that data, to obtain a risk total model; the user's credit risk is then predicted with the risk total model. In a risk total model trained this way, however, a small number of risk submodels may carry high model weights while the large number of remaining risk submodels carry low model weights. The risk total model then depends too heavily on the few risk submodels, which lowers its prediction accuracy, so the prediction accuracy of user credit risk is currently low.
Disclosure of Invention
The application mainly aims to provide a credit risk prediction model training method, electronic equipment and a readable storage medium, and aims to solve the technical problem that in the prior art, the prediction accuracy of user credit risk is low.
In order to achieve the above object, the present application provides a training method for a credit risk prediction model, which is applied to a credit risk prediction device, and the training method for the credit risk prediction model includes:
acquiring historical behavior data of a user as a training sample set and sample weights of training samples in the training sample set;
according to the training sample set and the sample weights, carrying out iterative training to obtain a credit risk prediction total model, wherein the credit risk prediction total model consists of a plurality of risk prediction submodels;
obtaining the sub-model prediction result of each risk prediction sub-model for the training sample set, and clustering the risk prediction sub-models according to the sub-model prediction results to obtain at least one sub-model group;
optimizing each sample weight according to each submodel prediction result and each submodel group to amplify the training sample set;
and returning to the step of iteratively training, according to the training sample set and the sample weights, to obtain the credit risk prediction total model, until the credit risk prediction total model meets a preset iteration update end condition, thereby obtaining a target credit risk prediction total model.
To achieve the above object, the present application further provides a credit risk prediction apparatus applied to a credit risk prediction device, including:
the acquisition module is used for acquiring historical behavior data of a user as a training sample set and sample weights of training samples in the training sample set;
the training module is used for carrying out iterative training to obtain a credit risk prediction total model according to the training sample set and the sample weights, wherein the credit risk prediction total model consists of a plurality of risk prediction submodels;
the clustering module is used for obtaining the sub-model prediction results of each risk prediction sub-model for the training sample set, and clustering the risk prediction sub-models according to the sub-model prediction results to obtain at least one sub-model group;
the amplification module is used for optimizing the weight of each sample according to the prediction result of each submodel and each submodel group so as to amplify the training sample set;
the optimization module is used for returning to the step of iteratively training, according to the training sample set and the sample weights, to obtain the credit risk prediction total model, until the credit risk prediction total model meets a preset iteration update end condition, thereby obtaining a target credit risk prediction total model.
The present application further provides an electronic device, the electronic device including: a memory, a processor, and a program of the credit risk prediction model training method stored on the memory and executable on the processor, the program of the credit risk prediction model training method when executed by the processor may implement the steps of the credit risk prediction model training method as described above.
The present application also provides a computer-readable storage medium having stored thereon a program implementing a credit risk prediction model training method, which when executed by a processor implements the steps of the credit risk prediction model training method as described above.
The present application also provides a computer program product comprising a computer program which, when executed by a processor, performs the steps of the above training method for a credit risk prediction model.
Compared with obtaining a risk total model by training a risk total model to be trained on the credit risks that a plurality of trained risk submodels predict for user behavior data and the real credit risks corresponding to that data, the present application proceeds as follows. Historical behavior data of a user is acquired as a training sample set, together with the sample weight of each training sample in the training sample set. A credit risk prediction total model, consisting of a plurality of risk prediction submodels, is obtained by iterative training according to the training sample set and the sample weights. The submodel prediction result of each risk prediction submodel for the training sample set is obtained, and the risk prediction submodels are clustered according to these results into at least one submodel group; this cluster analysis identifies submodel groups that can provide supplementary information to the credit risk prediction total model. Each sample weight is then optimized according to the submodel prediction results and the submodel groups to amplify the training sample set, so that the credit risk prediction total model can learn incrementally from the amplified training samples. Execution then returns to the step of iteratively training, according to the training sample set and the sample weights, to obtain the credit risk prediction total model, until the credit risk prediction total model meets a preset iteration update end condition, yielding a target credit risk prediction total model.
Suitable sample weights are thus found iteratively for the credit risk prediction total model, and the model weights of the risk prediction submodels, trained on the training sample set amplified by the iteratively optimized sample weights, become uniformly distributed. This improves the data-supplement effect of differentiated risk prediction submodels and avoids the technical defect that a risk total model trained in the conventional way relies excessively on a small number of risk submodels and therefore predicts with low accuracy, so the prediction accuracy of user credit risk is improved.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present application and together with the description, serve to explain the principles of the application.
In order to illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly described below. It is obvious that those skilled in the art can obtain other drawings from these drawings without inventive effort.
FIG. 1 is a schematic flowchart illustrating a first embodiment of a method for training a credit risk prediction model according to the present application;
FIG. 2 is a schematic flowchart illustrating a second embodiment of a training method for a credit risk prediction model according to the present application;
FIG. 3 is a schematic diagram of an apparatus involved in the training method of the credit risk prediction model of the present application;
FIG. 4 is a schematic structural diagram of the device of the hardware operating environment involved in the credit risk prediction model training method in an embodiment of the present application.
The objectives, features, and advantages of the present application will be further described with reference to the accompanying drawings.
Detailed Description
In order to make the above objects, features and advantages of the present application more comprehensible, embodiments of the present application are described in detail below with reference to the accompanying drawings. It is to be understood that the described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present application without creative effort fall within the protection scope of the present application.
Example one
In a first embodiment of the credit risk prediction model training method, referring to fig. 1, the credit risk prediction model training method includes:
Step S10, acquiring historical behavior data of a user as a training sample set and sample weights of training samples in the training sample set;
In this embodiment, it should be noted that there may be one user or multiple users. The historical behavior data is the behavior data of the user toward the bank at the last time step, where the time step may be set as a periodically sampled time period, and there are multiple items of historical behavior data.
Exemplarily, step S10 includes: acquiring the behavior data of the user at the last time step to obtain the historical behavior data, and taking the historical behavior data as the training sample set. The time step may be one month, one year or three years; it may be an empirical value of the best time window for judging a user's credit risk behavior, or it may be set by the party judging the user's credit risk. The historical behavior data may be behavior data of the same user over multiple time periods, of different users over multiple time periods, or of different users over the same time period. The sample weight of each training sample in the training sample set is then acquired; it may be calculated from the submodel prediction results of the risk prediction submodels for the training sample, may be an empirical weight value that gives the best credit risk prediction effect, or may be set by the party evaluating the user's credit risk.
Step S20, iteratively training to obtain a credit risk prediction total model according to the training sample set and the sample weights, wherein the credit risk prediction total model consists of a plurality of risk prediction submodels;
Step S30, acquiring the submodel prediction result of each risk prediction submodel for the training sample set, and clustering the risk prediction submodels according to the submodel prediction results to obtain at least one submodel group;
Step S40, optimizing each sample weight according to the submodel prediction results and the submodel groups so as to amplify the training sample set;
Step S50, returning to the step of iteratively training, according to the training sample set and the sample weights, to obtain the credit risk prediction total model, until the credit risk prediction total model meets a preset iteration update end condition, thereby obtaining a target credit risk prediction total model.
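The control flow of steps S20 to S50 can be read as the loop sketched below. This is only an illustration under stated assumptions: the function name outer_training_loop, all five callables, and the max_rounds safety cap are hypothetical and do not come from the application, which leaves the concrete training, clustering, re-weighting and stopping rules open.

```python
from typing import Any, Callable, List, Sequence, Tuple


def outer_training_loop(
    samples: Sequence[Any],
    weights: List[float],
    train_total_model: Callable[[Sequence[Any], List[float]], Any],         # S20
    predict_per_submodel: Callable[[Any, Sequence[Any]], List[List[int]]],  # S30, predictions
    cluster_submodels: Callable[[List[List[int]]], List[List[int]]],        # S30, groups
    reweight_samples: Callable[
        [List[float], List[List[int]], List[List[int]]], List[float]
    ],                                                                       # S40
    should_stop: Callable[[Any], bool],                                      # S50 end condition
    max_rounds: int = 20,                                                    # hypothetical cap
) -> Tuple[Any, List[float]]:
    """Repeat S20 -> S30 -> S40 until the end condition holds (S50)."""
    model = None
    for _ in range(max_rounds):
        # S20: train the total model on the current sample weights.
        model = train_total_model(samples, weights)
        # S50: stop as soon as the freshly trained total model is good enough.
        if should_stop(model):
            break
        # S30: per-submodel predictions, then group similar submodels.
        predictions = predict_per_submodel(model, samples)
        groups = cluster_submodels(predictions)
        # S40: adjust sample weights, which "amplifies" the training set.
        weights = reweight_samples(weights, predictions, groups)
    return model, weights
```

The end condition is checked right after each training pass, matching the description that the total model is taken as the target model once it is detected to meet the preset iteration update end condition.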
In this embodiment, it should be noted that the risk prediction submodel may be a scorecard model, a tree model, or a deep learning model. A submodel group is a cluster of at least one risk prediction submodel.
Exemplarily, steps S20 to S50 include: iteratively training the credit risk prediction total model to be trained according to the training sample set and the sample weights to obtain the credit risk prediction total model; obtaining the submodel prediction result of each risk prediction submodel for the training sample set, performing cluster analysis or similarity analysis on the submodel prediction results to obtain an analysis result, and clustering the risk prediction submodels according to the analysis result to obtain at least one submodel group; optimizing each sample weight according to the submodel prediction results and the submodel groups, and amplifying the training sample set according to the sample weights; and returning to the step of iteratively training, according to the training sample set and the sample weights, to obtain the credit risk prediction total model. If the credit risk prediction total model is detected to meet the preset iteration update end condition, it is taken as the target credit risk prediction total model. Because the training sample set is amplified by optimizing the sample weights, the credit risk prediction total model is trained on varied training sample sets over the successive iterations, which strengthens the generalization of the target credit risk prediction total model finally obtained.
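One possible reading of the similarity analysis in step S30 is sketched below. The application does not name a clustering algorithm, so the greedy grouping rule, the agreement-rate similarity measure, the function names and the min_agreement value of 0.8 are all assumptions made for illustration only.

```python
from typing import List, Sequence


def agreement(a: Sequence[int], b: Sequence[int]) -> float:
    """Fraction of training samples on which two submodels predict the same label."""
    if len(a) != len(b) or not a:
        raise ValueError("prediction lists must be non-empty and equally long")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def group_submodels(
    predictions: Sequence[Sequence[int]], min_agreement: float = 0.8
) -> List[List[int]]:
    """Greedily group submodels whose predictions agree with a group's first member.

    predictions[i] holds the predictions of submodel i for every training sample.
    Returns groups of submodel indices; every submodel lands in exactly one group.
    """
    groups: List[List[int]] = []
    for idx, preds in enumerate(predictions):
        for group in groups:
            representative = predictions[group[0]]
            if agreement(preds, representative) >= min_agreement:
                group.append(idx)
                break
        else:
            # No existing group is similar enough, so start a new one.
            groups.append([idx])
    return groups
```

With three submodels whose predictions are [1, 0, 1, 1], [1, 0, 1, 1] and [0, 1, 0, 0], the first two fall into one group and the third forms its own group. A production system would more likely use an established clustering method on the prediction vectors.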
In step S20, the step of iteratively training, according to the training sample set and the sample weights, to obtain a credit risk prediction total model, wherein the credit risk prediction total model consists of a plurality of risk prediction submodels, includes:
Step S21, selecting training samples from the training sample set, and determining samples to be predicted according to the training samples and the sample weights corresponding to the training samples;
Step S22, inputting the sample to be predicted into each risk prediction submodel to obtain the submodel output prediction result of each risk prediction submodel;
Step S23, performing weighted aggregation of the submodel output prediction results according to the model weight corresponding to each risk prediction submodel to obtain a total model output prediction result;
Step S24, optimizing each model weight and each risk prediction submodel according to the total model output prediction result;
Step S25, returning to the step of selecting training samples from the training sample set and determining samples to be predicted according to the training samples and the sample weights corresponding to the training samples, until each model weight satisfies a preset weight condition and each submodel parameter satisfies a preset model parameter condition, thereby obtaining the credit risk prediction total model.
In this embodiment, it should be noted that the sample to be predicted includes a training sample and a sample weight corresponding to the training sample, and the sample to be predicted is used for performing iterative training on the credit risk prediction total model.
Exemplarily, steps S21 to S25 include: selecting training samples from the training sample set either at random or in a preset order, where the preset order may be generated from the sample weights, and generating the samples to be predicted from the training samples and their corresponding sample weights; inputting the sample to be predicted into each risk prediction submodel to obtain the submodel output prediction result of each risk prediction submodel for the sample to be predicted; performing weighted aggregation of the submodel output prediction results, according to the model weight corresponding to each risk prediction submodel and using a preset aggregation method, to obtain submodel output features, and mapping the submodel output features to the total model output prediction result through the credit risk prediction total model, where the preset aggregation method may be mean aggregation, linear aggregation, overlap aggregation or another aggregation method; acquiring the real label corresponding to the training sample, and optimizing the model weight corresponding to each risk prediction submodel, and each risk prediction submodel itself, according to the degree of difference between the real label and the total model output prediction result; and returning to the step of selecting training samples from the training sample set and determining the samples to be predicted according to the training samples and their corresponding sample weights, until each model weight satisfies the preset weight condition and each submodel parameter satisfies the preset model parameter condition, thereby obtaining the credit risk prediction total model.
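The weighted aggregation of step S23 and the label comparison of step S24 might look like the sketch below for a single sample. It assumes linear aggregation, a sigmoid as the mapping from aggregated features to a default probability, and a sample-weighted log loss as the degree of difference; the application names none of these specifically, and all function names and the bias parameter are hypothetical.

```python
import math
from typing import Sequence


def aggregate_outputs(sub_outputs: Sequence[float], model_weights: Sequence[float]) -> float:
    """S23 (assumed linear aggregation): weighted sum of submodel outputs."""
    if len(sub_outputs) != len(model_weights):
        raise ValueError("one model weight is needed per submodel output")
    return sum(w * o for w, o in zip(model_weights, sub_outputs))


def total_model_prediction(
    sub_outputs: Sequence[float], model_weights: Sequence[float], bias: float = 0.0
) -> float:
    """Map the aggregated feature to a probability (assumed sigmoid mapping)."""
    z = aggregate_outputs(sub_outputs, model_weights) + bias
    return 1.0 / (1.0 + math.exp(-z))


def weighted_log_loss(
    y_true: int, y_pred: float, sample_weight: float, eps: float = 1e-12
) -> float:
    """S24 (assumed loss): log loss between real label and prediction, scaled by sample weight."""
    p = min(max(y_pred, eps), 1.0 - eps)  # clamp to keep log() finite
    return -sample_weight * (y_true * math.log(p) + (1 - y_true) * math.log(1.0 - p))
```

In this sketch the sample weight scales the loss, so a training sample with a larger weight pulls the model weights and submodel parameters harder during optimization. That is one common way to let sample weights influence training, not necessarily the one intended by the application.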
As an example, whether the model weights satisfy the preset weight condition may be determined as follows. If every model weight falls within a preset weight range, where the preset weight range is a preset range used to judge that the model weights are uniformly distributed, the model weights satisfy the preset weight condition; otherwise they do not. Alternatively, it is checked whether any model weight is greater than a first preset weight threshold or less than a second preset weight threshold: if such a model weight exists, the model weights do not satisfy the preset weight condition; if none exists, they do.
As an example, the step of obtaining the credit risk prediction total model once each model weight satisfies the preset weight condition and each submodel parameter satisfies the preset model parameter condition may be: constructing the model loss corresponding to the credit risk prediction total model according to the degree of difference; judging whether the model loss has converged; if so, taking the credit risk prediction total model with the current model weights and the current risk prediction submodels as the credit risk prediction total model; if not, updating the credit risk prediction total model according to the gradient calculated from the model loss, and returning to the step of selecting a training sample from the training sample set and determining a sample to be predicted according to the training sample and its corresponding sample weight, until the calculated model loss converges.
The risk prediction submodel is obtained by iteratively training a risk prediction submodel to be trained on the training samples. Because the risk prediction submodels are iteratively trained multiple times on the training samples and the correspondingly re-optimized sample weights, the model weights of the risk prediction submodels become uniformly distributed.
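The two variants of the preset weight condition described above can be sketched as follows. The function and parameter names are hypothetical, and the threshold values in the usage note are made up for illustration.

```python
from typing import Sequence


def within_preset_range(model_weights: Sequence[float], lower: float, upper: float) -> bool:
    """Variant 1: every model weight lies inside the preset weight range."""
    return all(lower <= w <= upper for w in model_weights)


def no_weight_beyond_thresholds(
    model_weights: Sequence[float], first_threshold: float, second_threshold: float
) -> bool:
    """Variant 2: no model weight exceeds the first threshold or falls below the second."""
    return not any(w > first_threshold or w < second_threshold for w in model_weights)
```

For example, the weights [0.3, 0.35, 0.35] pass both checks with bounds 0.2 and 0.5, while [0.9, 0.05, 0.05] fail both, which is the situation where the total model leans on a single submodel.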
In step S40, the step of optimizing each sample weight according to each sub-model prediction result and each sub-model group to amplify the training sample set includes:
Step S41, selecting a target submodel group satisfying a preset condition from the submodel groups, wherein the preset condition includes at least one of a correlation condition and an importance condition;
Step S42, adjusting each sample weight according to the target submodel group and the submodel prediction results so as to amplify the training sample set.
In this embodiment, it should be noted that the preset condition is a preset condition on submodel groups, used to screen out the target submodel group for which sample weight optimization is needed.
Exemplarily, steps S41 to S42 include: selecting a target submodel group meeting preset conditions from each submodel group; and adjusting the sample weight of each training sample in the training sample set according to the target sub-model group and the training sample set to obtain an adjusted weight so as to amplify the training sample set.
In step S41, the step of selecting a target submodel group satisfying a preset condition from the submodel groups, wherein the preset condition includes at least one of a correlation condition and an importance condition, includes:
Step A10, obtaining a total model prediction result of the credit risk prediction total model for the training sample set;
Step A20, generating the submodel group correlation between each submodel group and the credit risk prediction total model according to the submodel prediction results and the total model prediction result;
Step A30, selecting, from the submodel groups, a target submodel group whose submodel group correlation is smaller than a preset correlation threshold.
In this embodiment, it should be noted that the preset correlation threshold is a preset critical value of the submodel group correlation between a submodel group and the credit risk prediction total model, used to decide which submodel groups need sample weight optimization.
Exemplarily, steps A10 to A30 include: inputting each training sample in the training sample set into the credit risk prediction total model to obtain the total model prediction result; obtaining the result correlation between each submodel prediction result and the total model prediction result, and taking the result correlation as the submodel correlation between the risk prediction submodel and the credit risk prediction total model; generating the submodel group correlation between each submodel group and the credit risk prediction total model according to the submodel correlations; and selecting, from the submodel groups, a target submodel group whose submodel group correlation is smaller than the preset correlation threshold.
As an example, the step of generating the submodel group correlation between each submodel group and the credit risk prediction total model according to the submodel correlations may be: taking the sum or the average of the submodel correlations of the risk prediction submodels in the submodel group as the submodel group correlation between that submodel group and the credit risk prediction total model.
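Steps A10 to A30 can be sketched as below, using the average variant of the group correlation. The application does not say how the result correlation is computed; Pearson correlation is assumed here, and the function names are hypothetical.

```python
import math
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("inputs must be non-empty and equally long")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def low_correlation_groups(
    groups: Sequence[Sequence[int]],
    sub_predictions: Sequence[Sequence[float]],
    total_predictions: Sequence[float],
    threshold: float,
) -> List[List[int]]:
    """A20 + A30: average each group's submodel correlations, keep groups below the threshold."""
    selected: List[List[int]] = []
    for group in groups:
        correlations = [pearson(sub_predictions[i], total_predictions) for i in group]
        group_correlation = sum(correlations) / len(correlations)
        if group_correlation < threshold:
            selected.append(list(group))
    return selected
```

Groups selected this way are the ones whose predictions move least in step with the total model, that is, the groups the total model currently draws little information from.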
In step S41, the step of selecting a target submodel group satisfying a preset condition from the submodel groups, wherein the preset condition includes at least one of a correlation condition and an importance condition, further includes:
Step B10, obtaining the model weight of each risk prediction submodel;
Step B20, determining the submodel group importance of each submodel group to the credit risk prediction total model according to the model weights;
Step B30, selecting, from the submodel groups, a target submodel group whose submodel group importance is smaller than a preset importance threshold.
In this embodiment, it should be noted that the preset importance threshold is a preset critical value of the submodel group importance of a submodel group to the credit risk prediction total model, used to decide which submodel groups need sample weight optimization.
Exemplarily, steps B10 to B30 include: obtaining the model weight of each risk prediction submodel; taking each model weight as the submodel importance of the corresponding risk prediction submodel to the credit risk prediction total model, and generating the submodel group importance of each submodel group to the credit risk prediction total model according to the submodel importances; and selecting, from the submodel groups, a target submodel group whose submodel group importance is smaller than the preset importance threshold.
As an example, the step of generating the submodel group importance of each submodel group to the credit risk prediction total model according to the submodel importances may be: taking the sum or the average of the submodel importances of the risk prediction submodels in the submodel group as the submodel group importance of that submodel group to the credit risk prediction total model.
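Steps B10 to B30 reduce to a small computation, sketched below with both the sum and the average variant. The function names, the use_mean flag and the example numbers are hypothetical.

```python
from typing import List, Sequence


def group_importance(
    group: Sequence[int], model_weights: Sequence[float], use_mean: bool = True
) -> float:
    """B20: sum or average of the model weights of the submodels in one group."""
    if not group:
        raise ValueError("a submodel group holds at least one submodel")
    total = sum(model_weights[i] for i in group)
    return total / len(group) if use_mean else total


def low_importance_groups(
    groups: Sequence[Sequence[int]],
    model_weights: Sequence[float],
    threshold: float,
    use_mean: bool = True,
) -> List[List[int]]:
    """B30: keep the groups whose importance is below the preset importance threshold."""
    return [
        list(g) for g in groups if group_importance(g, model_weights, use_mean) < threshold
    ]
```

With model weights [0.6, 0.3, 0.05, 0.05], groups [[0], [1], [2, 3]] and a threshold of 0.2, only the group [2, 3] is selected under either variant. Step C10 would then intersect this selection with the low-correlation selection.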
In step S41, the step of selecting a target submodel group satisfying a preset condition from the submodel groups, wherein the preset condition includes at least one of a correlation condition and an importance condition, further includes:
Step C10, selecting, from the submodel groups, a target submodel group whose submodel group correlation is smaller than a preset correlation threshold and whose submodel group importance is smaller than a preset importance threshold.
Exemplarily, step C10 includes: and selecting a target sub-model group of which the correlation of the sub-model group is smaller than a preset correlation threshold value and the importance of the sub-model group is smaller than a preset importance threshold value from each sub-model group according to the correlation of each sub-model group and the importance of each sub-model group.
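As a non-limiting illustration, steps B10 to B30 and step C10 can be sketched in Python as follows. The function names `group_importance` and `select_target_groups`, the dictionary layout and the example threshold values are assumptions made for this sketch and are not specified by the present application; the sub-model group correlation is treated as an already computed input.

```python
from statistics import mean


def group_importance(groups, model_weights, reduce="sum"):
    """Sub-model group importance as the sum or mean of member model weights.

    groups: dict mapping a group id to a list of sub-model ids (assumed layout).
    model_weights: dict mapping a sub-model id to its model weighting weight.
    """
    agg = sum if reduce == "sum" else mean
    return {g: agg([model_weights[m] for m in members])
            for g, members in groups.items()}


def select_target_groups(importance, importance_threshold,
                         correlation=None, correlation_threshold=None):
    """Keep groups whose importance is below the threshold (step B30).

    If a correlation dict is supplied, a group must also have a correlation
    below the correlation threshold (step C10).
    """
    targets = []
    for g, imp in importance.items():
        if imp >= importance_threshold:
            continue
        if correlation is not None and correlation[g] >= correlation_threshold:
            continue
        targets.append(g)
    return targets


# Hypothetical example values.
groups = {"g0": ["a", "b"], "g1": ["c"]}
weights = {"a": 0.5, "b": 0.3, "c": 0.2}
importance = group_importance(groups, weights)
targets = select_target_groups(importance, importance_threshold=0.5)
```

In this sketch only group `g1` falls below the importance threshold; passing a correlation dictionary and threshold applies the combined condition of step C10.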
In step S42, the step of adjusting the weight of each sample according to the target sub-model group and the prediction result of each sub-model includes:
step D10, determining the prediction result distribution information of the target sub-model group according to the target sub-model group and the sub-model prediction results;
step D20, adjusting each sample weight according to the prediction result distribution information.
In this embodiment, it should be noted that the prediction result distribution information includes distribution information of the sub-model prediction results of each risk prediction sub-model in the target sub-model group. The prediction result distribution information may be data distribution information of the sub-model prediction results, vector distribution information of the sub-model prediction results, or ratio information of the sub-model prediction results.
Exemplarily, the steps D10 to D20 include: determining the distribution information of the sub-model prediction results of each risk prediction sub-model in the target sub-model group according to the sub-model prediction results to obtain the prediction result distribution information; and adjusting the weight of each sample according to the distribution information of the prediction result.
In step D20, the step of adjusting the weights of the samples according to the distribution information of the prediction results includes:
step D21, obtaining, from the prediction result distribution information, the correct prediction ratio with which each training sample is correctly predicted by the risk prediction sub-models, wherein the prediction result distribution information comprises the sub-model prediction results of each risk prediction sub-model for the training samples;
step D22, judging whether the correct prediction ratio is greater than a preset ratio threshold;
step D23, if yes, reducing the sample weight corresponding to each training sample;
step D24, if not, increasing the sample weight corresponding to each training sample.
In this embodiment, it should be noted that the preset ratio threshold is a preset critical value of the correct prediction ratio, used to determine whether each training sample has been successfully predicted by the target sub-model group.
As an example, steps D21 to D24 include: the prediction result distribution information comprises prediction ratio information of the sub-model prediction results, the prediction ratio information comprising a correct prediction ratio and an incorrect prediction ratio; judging whether the correct prediction ratio is greater than the preset ratio threshold; if the correct prediction ratio is greater than the preset ratio threshold, reducing the sample weight corresponding to each training sample; and if the correct prediction ratio is not greater than the preset ratio threshold, increasing the sample weight corresponding to each training sample.
As an example, steps D21 to D24 include: the prediction result distribution information comprises data distribution information of the sub-model prediction results; if the data distribution information falls within a preset data range, wherein the preset data range is a preset data distribution range for determining that the training samples have been successfully predicted by the target sub-model group, reducing the sample weight corresponding to each training sample; and if the data distribution information does not fall within the preset data range, increasing the sample weight corresponding to each training sample. Alternatively, the prediction result distribution information comprises vector distribution information of the sub-model prediction results; if the vector distribution information falls within a preset vector range, wherein the preset vector range is a preset vector distribution range for determining that the training samples have been successfully predicted by the target sub-model group, reducing the sample weight corresponding to each training sample; and if the vector distribution information does not fall within the preset vector range, increasing the sample weight corresponding to each training sample.
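As a non-limiting illustration, the ratio-based variant of steps D21 to D24 can be sketched in Python as follows. The sketch assumes that the correct prediction ratio is computed per training sample as the fraction of sub-models in the target sub-model group that predict that sample correctly, and that weights are reduced or increased by a multiplicative `factor`; the present application does not specify the size of the adjustment, so `factor` and the function names are hypothetical.

```python
def correct_prediction_ratios(labels, preds_by_model, target_group):
    """Per-sample fraction of target-group sub-models that predict correctly.

    labels: list of real labels, one per training sample.
    preds_by_model: dict mapping a sub-model id to its list of predictions.
    target_group: list of sub-model ids in the target sub-model group.
    """
    ratios = []
    for i, label in enumerate(labels):
        hits = sum(preds_by_model[m][i] == label for m in target_group)
        ratios.append(hits / len(target_group))
    return ratios


def adjust_sample_weights(weights, ratios, ratio_threshold=0.5, factor=1.1):
    """Reduce the weight when ratio > threshold (D23), else increase it (D24).

    The multiplicative factor is an assumption of this sketch.
    """
    return [w / factor if r > ratio_threshold else w * factor
            for w, r in zip(weights, ratios)]


# Hypothetical example: three samples, two sub-models in the target group.
labels = [1, 0, 1]
preds = {"a": [1, 0, 0], "b": [1, 1, 0]}
ratios = correct_prediction_ratios(labels, preds, ["a", "b"])
new_weights = adjust_sample_weights([1.0, 1.0, 1.0], ratios,
                                    ratio_threshold=0.5, factor=2.0)
```

Samples that the target sub-model group already predicts well lose weight, while samples it predicts poorly gain weight and therefore contribute more in the next training round.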
The embodiment of the present application provides a training method for a credit risk prediction model. In the compared approach, a total risk model to be trained is trained according to a plurality of trained risk sub-models, which predict credit risk from user behavior data, and the real credit risk corresponding to the user behavior data. In contrast, the embodiment of the present application obtains historical behavior data of a user and determines, according to the historical behavior data, a training sample set of the credit risk prediction total model and a sample weight of each training sample in the training sample set, wherein the credit risk prediction total model is obtained from a plurality of risk prediction sub-models through model weighting. The sub-model prediction results of each risk prediction sub-model for the training sample set are then collected, and the risk prediction sub-models are clustered according to the sub-model prediction results to obtain at least one sub-model group, which realizes cluster analysis of the risk prediction sub-models and identifies the sub-model groups capable of providing supplementary information for the credit risk prediction total model. Each sample weight is optimized according to the sub-model prediction results and the sub-model groups, thereby amplifying the training sample set so that the credit risk prediction total model can perform incremental learning on the amplified training sample set. The method then returns to the step of iteratively training the credit risk prediction total model according to the training sample set and the sample weights, until the credit risk prediction total model meets a preset iterative update end condition, and the target credit risk prediction total model is obtained. Because sample weights suited to the iteration of the credit risk prediction total model are applied, the model weighting weights of the risk prediction sub-models obtained by training on the amplified training sample set with the iteratively optimized sample weights are evenly distributed, which improves the data supplement effect of the differentiated risk prediction sub-models. This avoids the technical defect of the compared approach, in which the total risk model tends to depend on a small number of risk sub-models and therefore has low prediction accuracy, and thus improves the prediction accuracy of the user's credit risk.
Example two
Further, referring to fig. 2, based on the first embodiment of the present application, in another embodiment of the present application, content that is the same as or similar to the first embodiment may be understood by reference to the above description and is not repeated below. On this basis, in step S30, the step of clustering the risk prediction sub-models according to the sub-model prediction results to obtain at least one sub-model group includes:
step S31, acquiring the sub-model prediction result similarity between the sub-model prediction results, and taking the sub-model prediction result similarity as the sub-model similarity between the risk prediction sub-models;
step S32, clustering the risk prediction sub-models according to the sub-model similarities to obtain at least one sub-model group.
Exemplarily, steps S31 to S32 include: calculating the sub-model prediction result similarity between the sub-model prediction results through a preset similarity algorithm, wherein the preset similarity algorithm may be a Euclidean distance algorithm, a Pearson correlation coefficient algorithm or a cosine similarity algorithm. It can be understood that, because each sub-model prediction result is a predicted credit risk of a user, the differences between sub-model prediction results are expressed numerically. The Euclidean distance algorithm is simple but requires all indexes to be on the same scale; the Pearson correlation coefficient algorithm cannot be computed for data whose variance is 0; and the cosine similarity algorithm focuses on differences in the direction of the vectors. The Euclidean distance algorithm is therefore preferred, to balance the efficiency and the accuracy of obtaining the sub-model similarity. Then, among the sub-model similarities, the risk prediction sub-models corresponding to each target sub-model similarity greater than a model similarity threshold are clustered together to obtain at least one sub-model group, wherein the model similarity threshold is a critical value for determining that the similarity between risk prediction sub-models is high.
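As a non-limiting illustration, steps S31 to S32 can be sketched in Python as follows. The present application does not state how a Euclidean distance is converted into a similarity or how sub-models above the threshold are merged into groups; this sketch assumes the mapping similarity = 1 / (1 + distance) and merges any pair of sub-models whose similarity exceeds the threshold into the same group. The function names are hypothetical.

```python
import math


def euclidean_similarity(p, q):
    """Similarity derived from Euclidean distance (assumed mapping)."""
    return 1.0 / (1.0 + math.dist(p, q))


def group_by_similarity(preds, threshold):
    """Group sub-models whose pairwise similarity exceeds the threshold.

    preds: dict mapping a sub-model id to its prediction vector on the
    training sample set. Groups are the connected components of the
    "similarity > threshold" relation (a choice made for this sketch).
    """
    names = list(preds)
    parent = {n: n for n in names}

    def find(x):
        # Follow parent links to the representative of x's group.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if euclidean_similarity(preds[a], preds[b]) > threshold:
                parent[find(a)] = find(b)

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return list(groups.values())


# Hypothetical predicted risk scores of three sub-models on two samples.
preds = {"a": [0.1, 0.9], "b": [0.1, 0.8], "c": [0.9, 0.1]}
sub_model_groups = group_by_similarity(preds, threshold=0.8)
```

Here sub-models `a` and `b` produce nearly identical predictions and end up in one group, while `c` forms a group of its own.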
As an example, step S30 includes: selecting a first center model for each sub-model group from the risk prediction sub-models, wherein the first center model may be selected according to experience or set manually; obtaining the model distance between each risk prediction sub-model other than the first center models and each first center model; assigning, according to the model distances, each risk prediction sub-model to the sub-model group corresponding to the first center model with the smallest model distance; determining a second center model of each sub-model group according to the risk prediction sub-models in the sub-model group, and judging whether the second center model is consistent with the first center model; if so, taking the sub-model group as a target sub-model group; and if not, returning to the step of selecting a first center model for each sub-model group from the risk prediction sub-models, until the second center model is consistent with the first center model, thereby obtaining at least one sub-model group.
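As a non-limiting illustration, the center-model clustering described above can be sketched in Python as follows. Two points are assumptions of this sketch rather than statements of the present application: the second center model of a group is taken to be the member with the smallest total Euclidean distance to the other members, and when the centers change, the second center models are used as the first center models of the next round. The function name and the `max_iter` guard are hypothetical.

```python
import math


def cluster_by_center_models(preds, initial_centers, max_iter=100):
    """Assign sub-models to the nearest center model until centers are stable.

    preds: dict mapping a sub-model id to its prediction vector.
    initial_centers: distinct sub-model ids chosen as first center models.
    Returns a dict mapping each center id to the list of its group members.
    """
    centers = list(initial_centers)
    groups = {}
    for _ in range(max_iter):
        # Each center model starts its own group.
        groups = {c: [c] for c in centers}
        for name, vec in preds.items():
            if name in groups:
                continue
            nearest = min(centers, key=lambda c: math.dist(vec, preds[c]))
            groups[nearest].append(name)
        # Second center model: member closest to all others (assumption).
        new_centers = [
            min(members,
                key=lambda m: sum(math.dist(preds[m], preds[o])
                                  for o in members))
            for members in groups.values()
        ]
        if set(new_centers) == set(centers):
            break
        centers = new_centers
    return groups


# Hypothetical one-dimensional prediction vectors for five sub-models.
preds = {"a": [0.0], "b": [0.1], "c": [0.2], "d": [1.0], "e": [1.1]}
groups = cluster_by_center_models(preds, initial_centers=["a", "d"])
```

In this example the first round moves the center of the first group from `a` to `b`, and the second round confirms that the centers no longer change.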
In step S10, before the step of obtaining the historical behavior data of the user as a training sample set and a sample weight of each training sample in the training sample set, the method further includes:
step S11, acquiring a real label corresponding to each training sample;
step S12, generating a sub-model prediction result of each risk prediction sub-model for the training sample set according to the training sample set and each risk prediction sub-model;
step S13, determining, according to the real labels and the sub-model prediction results, the number of sub-models that correctly predict each training sample;
step S14, generating the sample weight of each training sample according to the number of sub-models, a preset parameter and a weight smoothing coefficient.
In this embodiment, it should be noted that the real label is the real credit risk of the user in each training sample.
Exemplarily, steps S11 to S14 include: acquiring the real label corresponding to each training sample; mapping each training sample to a credit risk of the user through each risk prediction sub-model to obtain the sub-model prediction result of each risk prediction sub-model for the training sample set; for each sub-model prediction result, judging whether the real label is consistent with the sub-model prediction result; if so, determining that the corresponding risk prediction sub-model predicted correctly and incrementing the number of sub-models for that training sample; if not, determining that the corresponding risk prediction sub-model predicted incorrectly; repeating this judgment until all sub-model prediction results have been judged, thereby obtaining, for each training sample, the number of sub-models that predict it correctly; and generating the sample weight of each training sample according to the number of sub-models, the preset parameter and the weight smoothing coefficient.
Optionally, the step of generating the sample weight of each training sample according to the number of the sub-models, the preset parameter and the weight smoothing coefficient may specifically be:
wherein w is the sample weight of each training sample; m_i is the number of sub-models, among the risk prediction sub-models, that correctly predict each training sample; alpha is the weight smoothing coefficient; and beta is the preset parameter.
As an example, the step of generating the sample weight of each training sample may alternatively be: acquiring the number of training samples, generating that number of normally distributed random numbers, and assigning the random numbers to the training samples as their sample weights.
It can be understood that the sample weight of each training sample is determined according to the number of sub-models that correctly predict the training sample, and that the number of sub-models is negatively correlated with the sample weight: the fewer sub-models predict a sample correctly, the more that sample constitutes supplementary sample data that differs from the other samples. Giving such supplementary sample data a higher initial sample weight reduces, to a certain extent, the number of subsequent iterative optimization steps for adjusting the sample weights, and thus improves the training efficiency of the credit risk prediction total model.
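As a non-limiting illustration, step S13 and the alternative random initialization can be sketched in Python as follows. The sketch only counts, for each training sample, the number of sub-models that predict it correctly; the mapping from that count to a sample weight is deliberately left out because it depends on the weight formula of the present application. The function names and the `mean`, `std` and `seed` parameters of the random initialization are assumptions of this sketch.

```python
import random


def count_correct_sub_models(labels, preds_by_model):
    """For each training sample, count the sub-models that predict it correctly.

    labels: list of real labels, one per training sample.
    preds_by_model: dict mapping a sub-model id to its list of predictions.
    """
    return [
        sum(preds[i] == label for preds in preds_by_model.values())
        for i, label in enumerate(labels)
    ]


def normal_random_weights(n_samples, mean=1.0, std=0.1, seed=None):
    """Alternative initialization: one normally distributed number per sample.

    The mean and standard deviation are hypothetical defaults.
    """
    rng = random.Random(seed)
    return [rng.gauss(mean, std) for _ in range(n_samples)]


# Hypothetical example: three samples, three sub-models.
labels = [1, 0, 1]
preds = {"a": [1, 0, 0], "b": [1, 1, 0], "c": [1, 0, 1]}
counts = count_correct_sub_models(labels, preds)
init_weights = normal_random_weights(len(labels), seed=0)
```

The third sample is predicted correctly by only one sub-model, so under the negative correlation described above it would receive the highest initial sample weight of the three.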
Example three
An embodiment of the present application further provides a credit risk prediction apparatus, where the credit risk prediction apparatus is applied to a credit risk prediction device, and referring to fig. 3, the credit risk prediction apparatus includes:
the acquisition module is used for acquiring historical behavior data of a user as a training sample set and sample weights of training samples in the training sample set;
the training module is used for carrying out iterative training to obtain a credit risk prediction total model according to the training sample set and the sample weights, wherein the credit risk prediction total model consists of a plurality of risk prediction submodels;
the clustering module is used for acquiring the submodel prediction results of each risk prediction submodel on the training sample set, and clustering the risk prediction submodels according to the submodel prediction results to obtain at least one submodel group;
the amplification module is used for optimizing the weight of each sample according to the prediction result of each submodel and each submodel group so as to amplify the training sample set;
the optimization module is used for returning to the step of iteratively training the credit risk prediction total model according to the training sample set and the sample weights, until the credit risk prediction total model meets a preset iterative update end condition, so as to obtain a target credit risk prediction total model.
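As a non-limiting illustration, the cooperation of the above modules can be sketched in Python as a control loop. The callables `train`, `cluster`, `reweight` and `finished` stand in for the training, clustering, amplification and end-condition logic; their names and signatures, and the `max_rounds` guard, are assumptions of this sketch, and the end condition is assumed to be checked immediately after each training round.

```python
def train_until_finished(samples, weights, train, cluster, reweight, finished,
                         max_rounds=50):
    """Alternate between training the total model and optimizing sample weights.

    train(samples, weights) -> total model
    cluster(model, samples) -> sub-model groups
    reweight(model, groups, samples, weights) -> new sample weights
    finished(model) -> True when the iterative update end condition is met
    """
    model = None
    for _ in range(max_rounds):
        model = train(samples, weights)            # training module
        if finished(model):                        # end condition
            break
        groups = cluster(model, samples)           # clustering module
        weights = reweight(model, groups, samples, weights)  # amplification
    return model, weights


# Toy stand-ins, only to show the control flow.
final_model, final_weights = train_until_finished(
    samples=["s0", "s1"],
    weights=[1, 1],
    train=lambda s, w: sum(w),
    cluster=lambda m, s: [],
    reweight=lambda m, g, s, w: [x + 1 for x in w],
    finished=lambda m: m >= 4,
)
```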
Optionally, the training module is further configured to:
selecting a training sample from the training sample set, and determining a sample to be predicted according to the training sample and a sample weight corresponding to the training sample;
respectively inputting the sample to be predicted into each risk prediction submodel to obtain an output prediction result of each submodel;
according to the model weighting weight corresponding to each risk prediction submodel, performing weighting aggregation on the output prediction result of each submodel to obtain a total model output prediction result;
optimizing each model weighting weight and each risk prediction sub-model according to the total model output prediction result;
and returning to the step of selecting a training sample from the training sample set and determining a sample to be predicted according to the training sample and the sample weight corresponding to the training sample, until each model weighting weight meets a preset weight condition and the parameters of each sub-model meet a preset model parameter condition, so as to obtain the credit risk prediction total model.
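As a non-limiting illustration, the weighted aggregation performed by the training module can be sketched in Python as follows. The sketch assumes that each sub-model outputs a single numeric risk score and that the total model output is the weighted sum of those scores; the function name and the example values are hypothetical, and the optimization of the weights and sub-models is not shown.

```python
def aggregate_outputs(sub_model_outputs, model_weights):
    """Weighted aggregation of sub-model outputs into the total model output.

    sub_model_outputs: dict mapping a sub-model id to its predicted risk score.
    model_weights: dict mapping a sub-model id to its model weighting weight.
    """
    return sum(model_weights[m] * out for m, out in sub_model_outputs.items())


# Hypothetical outputs of two sub-models for one sample to be predicted.
outputs = {"a": 0.2, "b": 0.6}
weights = {"a": 0.25, "b": 0.75}
total_output = aggregate_outputs(outputs, weights)
```

With these values the total model output is approximately 0.5, dominated by sub-model `b` because of its larger model weighting weight.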
Optionally, the clustering module is further configured to:
acquiring sub-model prediction result similarity among the sub-model prediction results, and taking the sub-model prediction result similarity as the sub-model similarity among the risk prediction sub-models;
and clustering each risk prediction submodel according to the similarity of the submodels to obtain at least one submodel group.
Optionally, the amplification module is further configured to:
selecting a target sub-model group meeting a preset condition from each sub-model group, wherein the preset condition comprises at least one of a correlation condition and an importance condition;
and adjusting the weight of each sample according to the target submodel group and the prediction result of each submodel so as to amplify the training sample set.
Optionally, the amplification module is further configured to:
acquiring a total model prediction result of the credit risk prediction total model for the training sample set;
generating sub-model group correlation between each sub-model group and the credit risk prediction total model according to the prediction result of each sub-model and the prediction result of the total model;
and selecting a target submodel group of which the correlation is smaller than a preset correlation threshold value from each submodel group.
Optionally, the amplification module is further configured to:
obtaining the model weighting weight of each risk prediction sub-model;
determining the sub-model group importance of each sub-model group to the credit risk prediction total model according to the model weighting weights;
and selecting a target submodel group of which the importance degree of the submodel group is smaller than a preset importance degree threshold value from each submodel group.
Optionally, the amplification module is further configured to:
and selecting a target sub-model group of which the correlation of the sub-model group is smaller than a preset correlation threshold and the importance of the sub-model group is smaller than a preset importance threshold from each sub-model group.
Optionally, the amplification module is further configured to:
determining the distribution information of the prediction result of the target submodel group according to the target submodel group and the prediction result of each submodel;
and adjusting the weight of each sample according to the distribution information of the prediction result.
Optionally, the amplification module is further configured to:
acquiring, from the prediction result distribution information, the correct prediction ratio with which each training sample is correctly predicted by the risk prediction sub-models, wherein the prediction result distribution information comprises the sub-model prediction results of each risk prediction sub-model for the training samples;
judging whether the correct prediction ratio is greater than a preset ratio threshold;
if yes, reducing the sample weight corresponding to each training sample;
if not, increasing the sample weight corresponding to each training sample.
Optionally, before the step of obtaining the historical behavior data of the user as a training sample set and a sample weight of each training sample in the training sample set, the credit risk prediction apparatus is further configured to:
acquiring a real label corresponding to each training sample;
generating a sub-model prediction result of each risk prediction sub-model for the training sample set according to each training sample and each risk prediction sub-model;
determining the number of submodels of each training sample which are predicted correctly by each risk prediction submodel according to the real label and the submodel prediction result;
and generating the sample weight of each training sample according to the number of the sub-models, the preset parameters and the weight smoothing coefficient.
The credit risk prediction device provided by the application adopts the training method of the credit risk prediction model in the embodiment, so that the technical problem of low accuracy of prediction of the credit risk of the user is solved. Compared with the prior art, the beneficial effects of the credit risk prediction device provided by the embodiment of the application are the same as those of the training method of the credit risk prediction model provided by the embodiment, and other technical features of the credit risk prediction device are the same as those disclosed by the embodiment method, which are not repeated herein.
Example four
An embodiment of the present application provides an electronic device, which includes: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method for training a credit risk prediction model in the above embodiments.
Referring now to FIG. 4, shown is a schematic diagram of an electronic device suitable for use in implementing embodiments of the present disclosure. The electronic devices in the embodiments of the present disclosure may include, but are not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players), in-vehicle terminals (e.g., car navigation terminals), and the like, and fixed terminals such as digital TVs, desktop computers, and the like. The electronic device shown in fig. 4 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present disclosure.
As shown in fig. 4, the electronic device may include a processing means (e.g., a central processing unit, a graphic processor, etc.) that can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) or a program loaded from a storage means into a Random Access Memory (RAM). In the RAM, various programs and data necessary for the operation of the electronic apparatus are also stored. The processing device, the ROM, and the RAM are connected to each other by a bus. An input/output (I/O) interface is also connected to the bus.
Generally, the following systems may be connected to the I/O interface: input devices including, for example, touch screens, touch pads, keyboards, mice, image sensors, microphones, accelerometers, gyroscopes, and the like; output devices including, for example, Liquid Crystal Displays (LCDs), speakers, vibrators, and the like; storage devices including, for example, magnetic tape, hard disk, etc.; and a communication device. The communication device may allow the electronic device to communicate wirelessly or by wire with other devices to exchange data. While the figures illustrate an electronic device with various systems, it is to be understood that not all illustrated systems are required to be implemented or provided. More or fewer systems may alternatively be implemented or provided.
In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication means, or installed from a storage means, or installed from a ROM. The computer program, when executed by a processing device, performs the above-described functions defined in the methods of the embodiments of the present disclosure.
The electronic equipment provided by the application adopts the credit risk prediction model training method in the embodiment, so that the technical problem of low accuracy of prediction of the credit risk of the user is solved. Compared with the prior art, the beneficial effects of the electronic device provided by the embodiment of the present application are the same as the beneficial effects of the training method for the credit risk prediction model provided by the above embodiment, and other technical features of the electronic device are the same as those disclosed in the method of the above embodiment, which are not described herein again.
It should be understood that portions of the present disclosure may be implemented in hardware, software, firmware, or a combination thereof. In the foregoing description of embodiments, the particular features, structures, materials, or characteristics may be combined in any suitable manner in any one or more embodiments or examples.
The above description is only for the specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily think of the changes or substitutions within the technical scope of the present application, and shall be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
Example five
The present embodiment provides a computer-readable storage medium having computer-readable program instructions stored thereon for performing the training method of the credit risk prediction model in the above-described embodiments.
The computer-readable storage medium provided by the embodiments of the present application may be, for example, a USB flash drive, and may also be, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present embodiment, a computer-readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer-readable storage medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (radio frequency), etc., or any suitable combination of the foregoing.
The computer-readable storage medium may be embodied in an electronic device; or may be present alone without being incorporated into the electronic device.
The computer-readable storage medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: acquire historical behavior data of a user as a training sample set, together with sample weights of the training samples in the training sample set; iteratively train a credit risk prediction total model according to the training sample set and the sample weights, wherein the credit risk prediction total model consists of a plurality of risk prediction sub-models; obtain the sub-model prediction result of each risk prediction sub-model for the training sample set, and cluster the risk prediction sub-models according to the sub-model prediction results to obtain at least one sub-model group; optimize each sample weight according to the sub-model prediction results and the sub-model groups so as to amplify the training sample set; and return to the step of iteratively training the credit risk prediction total model according to the training sample set and the sample weights, until the credit risk prediction total model meets a preset iterative update end condition, so as to obtain a target credit risk prediction total model.
Computer program code for carrying out operations for aspects of the present disclosure may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The modules described in the embodiments of the present disclosure may be implemented by software or hardware. The names of the modules do not, in some cases, constitute a limitation on the modules themselves.
The computer-readable storage medium provided by the application stores computer-readable program instructions for executing the above-mentioned training method of the credit risk prediction model, and solves the technical problem that the prediction accuracy of the user credit risk is low. Compared with the prior art, the beneficial effects of the computer-readable storage medium provided by the embodiment of the application are the same as the beneficial effects of the training method for the credit risk prediction model provided by the above embodiment, and are not repeated herein.
Example six
The present application also provides a computer program product comprising a computer program which, when executed by a processor, performs the steps of the above training method for a credit risk prediction model.
The computer program product solves the technical problem that the prediction accuracy of the credit risk of the user is low. Compared with the prior art, the beneficial effects of the computer program product provided by the embodiment of the present application are the same as the beneficial effects of the training method of the credit risk prediction model provided by the above embodiment, and are not repeated herein.
The above description is only a preferred embodiment of the present application and is not intended to limit the scope of the present application. All equivalent structures or equivalent processes derived from the contents of the specification and the drawings, whether applied directly or indirectly in other related technical fields, are included in the scope of the present application.
Claims (12)
1. A training method of a credit risk prediction model is characterized by comprising the following steps:
acquiring historical behavior data of a user as a training sample set and sample weights of training samples in the training sample set;
according to the training sample set and the sample weights, carrying out iterative training to obtain a credit risk prediction total model, wherein the credit risk prediction total model consists of a plurality of risk prediction submodels;
obtaining the sub-model prediction result of each risk prediction sub-model for the training sample set, and clustering the risk prediction sub-models according to the sub-model prediction results to obtain at least one sub-model group;
optimizing the weight of each sample according to the prediction result of each submodel and each submodel group so as to amplify the training sample set;
and returning to the step of carrying out iterative training according to the training sample set and the sample weights to obtain a credit risk prediction total model, until the credit risk prediction total model meets a preset iterative update end condition, thereby obtaining a target credit risk prediction total model.
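For illustration only, the outer loop recited in claim 1 can be sketched as below. This is a minimal sketch under stated assumptions, not the applicant's implementation: the names `iterate_training`, `train_total_model`, `reweight`, `is_done`, and `max_rounds` are hypothetical, and the clustering and weight-optimization steps are folded into the caller-supplied `reweight` callable.

```python
from typing import Any, Callable, List, Sequence, Tuple


def iterate_training(
    samples: Sequence[Any],
    sample_weights: Sequence[float],
    train_total_model: Callable[[Sequence[Any], List[float]], Any],
    reweight: Callable[[Any, Sequence[Any], List[float]], List[float]],
    is_done: Callable[[Any, int], bool],
    max_rounds: int = 20,  # hypothetical safety cap, not taken from the claims
) -> Tuple[Any, List[float]]:
    """Alternate between training the total model and re-weighting samples."""
    weights = list(sample_weights)
    model = None
    for round_idx in range(max_rounds):
        # Train the total model (an ensemble of sub-models) on weighted samples.
        model = train_total_model(samples, weights)
        # Stop once the preset end condition is met.
        if is_done(model, round_idx):
            break
        # Otherwise cluster sub-models and optimize the sample weights
        # (both steps are delegated to the caller-supplied function).
        weights = reweight(model, samples, weights)
    return model, weights
```

Passing the training, re-weighting, and stopping logic as callables keeps the sketch independent of any particular model family or end condition, neither of which is fixed by the claim.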
2. The training method of the credit risk prediction model according to claim 1, wherein the step of carrying out iterative training according to the training sample set and the sample weights to obtain a credit risk prediction total model, the credit risk prediction total model consisting of a plurality of risk prediction submodels, comprises:
selecting a training sample from the training sample set, and determining a sample to be predicted according to the training sample and a sample weight corresponding to the training sample;
respectively inputting the sample to be predicted into each risk prediction submodel to obtain an output prediction result of each submodel;
according to the model weighting weight corresponding to each risk prediction submodel, performing weighting aggregation on the output prediction result of each submodel to obtain a total model output prediction result;
optimizing each model weighting weight and each risk prediction submodel according to the total model output prediction result;
and returning to the step of selecting a training sample from the training sample set and determining a sample to be predicted according to the training sample and the sample weight corresponding to the training sample, until each model weighting weight meets a preset weight condition and each sub-model parameter meets a preset model parameter condition, thereby obtaining the credit risk prediction total model.
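The weighted aggregation step of claim 2 can be illustrated with a short sketch. It assumes that each sub-model outputs a single numeric score and that the aggregate is a weighted average normalized by the sum of the model weights; the claim does not specify the normalization, and the name `aggregate_predictions` is hypothetical.

```python
from typing import Sequence


def aggregate_predictions(
    sub_outputs: Sequence[float], model_weights: Sequence[float]
) -> float:
    """Combine sub-model outputs into one total-model output.

    Assumption: a weighted average, normalized by the total model weight.
    """
    if len(sub_outputs) != len(model_weights):
        raise ValueError("one model weight is required per sub-model output")
    total_weight = sum(model_weights)
    if total_weight <= 0:
        raise ValueError("model weights must sum to a positive value")
    weighted_sum = sum(w * p for w, p in zip(model_weights, sub_outputs))
    return weighted_sum / total_weight
```

For example, sub-model outputs `[0.2, 0.6, 0.9]` with model weights `[0.5, 0.3, 0.2]` aggregate to approximately 0.46 under this assumption.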
3. The method of claim 1, wherein the step of clustering the risk prediction submodels according to the prediction results of each of the submodels to obtain at least one submodel group comprises:
acquiring sub-model prediction result similarity among the sub-model prediction results, and taking the sub-model prediction result similarity as the sub-model similarity among the risk prediction sub-models;
and clustering each risk prediction submodel according to the similarity of the submodels to obtain at least one submodel group.
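One possible reading of claim 3 is sketched below. Two assumptions are made that the claim does not state: similarity is measured as the fraction of training samples on which two sub-models output the same label, and clustering merges any two sub-models whose similarity reaches a threshold (single-link grouping). The names `agreement`, `cluster_submodels`, and `threshold` are hypothetical.

```python
from typing import Dict, List, Sequence


def agreement(a: Sequence[int], b: Sequence[int]) -> float:
    """Fraction of samples on which two prediction vectors agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cluster_submodels(
    preds: Sequence[Sequence[int]], threshold: float = 0.8
) -> List[List[int]]:
    """Group sub-model indices whose pairwise agreement is >= threshold."""
    n = len(preds)
    parent = list(range(n))  # union-find forest over sub-model indices

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if agreement(preds[i], preds[j]) >= threshold:
                parent[find(j)] = find(i)  # merge the two groups

    groups: Dict[int, List[int]] = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())
```

Any other similarity measure or clustering algorithm could be substituted; the sketch only shows how prediction-result similarity can stand in for sub-model similarity.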
4. The method of claim 1, wherein the step of optimizing the sample weights to augment the training sample set based on the prediction results of each of the sub-models and each of the sub-model groups comprises:
selecting a target submodel group meeting preset conditions from each submodel group, wherein the preset conditions comprise at least one of correlation conditions and importance conditions;
and adjusting the weight of each sample according to the target submodel group and the prediction result of each submodel so as to amplify the training sample set.
5. The training method of the credit risk prediction model according to claim 4, wherein the step of selecting a target sub-model group satisfying a predetermined condition from among the sub-model groups, wherein the predetermined condition includes at least one of a relevance condition and an importance condition includes:
acquiring a total model prediction result of the credit risk prediction total model for the training sample set;
generating sub-model group correlation between each sub-model group and the credit risk prediction total model according to the prediction result of each sub-model and the prediction result of the total model;
and selecting a target sub-model group with the correlation smaller than a preset correlation threshold value from each sub-model group.
6. The method for training a credit risk prediction model according to claim 4, wherein the step of selecting a target sub-model group satisfying a predetermined condition from each of the sub-model groups, wherein the predetermined condition includes at least one of a relevance condition and an importance condition, further comprises:
obtaining the model weighting weight of each risk prediction submodel;
determining, according to the model weighting weights, the sub-model group importance of each sub-model group to the credit risk prediction total model;
and selecting a target sub-model group with the importance degree of the sub-model group smaller than a preset importance degree threshold value from each sub-model group.
7. The training method of the credit risk prediction model according to any one of claims 4 to 6, wherein the selecting a target sub-model group satisfying a predetermined condition among the sub-model groups, wherein the predetermined condition includes at least one of a relevance condition and an importance condition, further comprises:
and selecting a target sub-model group of which the correlation of the sub-model group is smaller than a preset correlation threshold and the importance of the sub-model group is smaller than a preset importance threshold from each sub-model group.
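The selection described in claims 5 to 7 can be sketched as follows. The claims do not define how group correlation or group importance is computed, so the sketch assumes that correlation is the mean agreement between each member's predictions and the total model's predictions, and that importance is the sum of the members' model weights. All function and parameter names are hypothetical.

```python
from typing import List, Sequence


def agreement(a: Sequence[int], b: Sequence[int]) -> float:
    """Fraction of samples on which two prediction vectors agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def select_target_groups(
    groups: Sequence[Sequence[int]],
    sub_preds: Sequence[Sequence[int]],
    total_preds: Sequence[int],
    model_weights: Sequence[float],
    corr_threshold: float,
    importance_threshold: float,
) -> List[List[int]]:
    """Keep groups that are weakly correlated with the total model
    and carry little model weight (the combined condition of claim 7)."""
    targets: List[List[int]] = []
    for group in groups:
        # Assumed correlation: mean agreement with the total model.
        corr = sum(agreement(sub_preds[m], total_preds) for m in group) / len(group)
        # Assumed importance: total model weight held by the group.
        importance = sum(model_weights[m] for m in group)
        if corr < corr_threshold and importance < importance_threshold:
            targets.append(list(group))
    return targets
```

Dropping either comparison from the final condition gives the single-criterion variants of claims 5 and 6.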
8. The method of claim 4, wherein the step of adjusting the weight of each sample according to the target sub-model group and the prediction result of each sub-model comprises:
determining the distribution information of the prediction result of the target submodel group according to the target submodel group and the prediction result of each submodel;
and adjusting the weight of each sample according to the distribution information of the prediction result.
9. The method for training the credit risk prediction model of claim 8, wherein the step of adjusting the weights of the samples according to the distribution information of the prediction results comprises:
acquiring, from the prediction result distribution information, the correct prediction ratio of each training sample, namely the proportion of the risk prediction submodels that predict the training sample correctly, wherein the prediction result distribution information comprises the submodel prediction results of each risk prediction submodel for the training samples;
judging whether the correct prediction ratio is larger than a preset ratio threshold;
if yes, reducing the sample weight corresponding to the training sample;
if not, increasing the sample weight corresponding to the training sample.
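The weight adjustment of claim 9 can be sketched as below, under the assumption that weights are scaled multiplicatively by a fixed step. The claim only states the direction of the change, so the `step` parameter and the function name `adjust_sample_weights` are hypothetical.

```python
from typing import List, Sequence


def adjust_sample_weights(
    weights: Sequence[float],
    labels: Sequence[int],
    group_preds: Sequence[Sequence[int]],
    ratio_threshold: float = 0.5,
    step: float = 0.1,  # hypothetical scaling step, not taken from the claims
) -> List[float]:
    """Lower the weight of samples that most target-group sub-models
    already predict correctly, and raise the weight of the rest."""
    n_models = len(group_preds)
    new_weights: List[float] = []
    for i, (weight, label) in enumerate(zip(weights, labels)):
        # Share of sub-models in the target group that predict sample i correctly.
        correct = sum(1 for preds in group_preds if preds[i] == label)
        ratio = correct / n_models
        if ratio > ratio_threshold:
            new_weights.append(weight * (1 - step))
        else:
            new_weights.append(weight * (1 + step))
    return new_weights
```

With two sub-models and a threshold of 0.5, a sample predicted correctly by both has its weight reduced, while a sample predicted correctly by only one or by neither has its weight increased.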
10. The training method of the credit risk prediction model according to claim 1, further comprising, before the step of obtaining the historical behavior data of the user as a training sample set and a sample weight of each training sample in the training sample set:
acquiring a real label corresponding to each training sample;
generating a sub-model prediction result of each risk prediction sub-model for the training sample set according to each training sample and each risk prediction sub-model;
determining the number of submodels of each training sample which are predicted correctly by each risk prediction submodel according to the real label and the submodel prediction result;
and generating the sample weight of each training sample according to the number of the sub-models, the preset parameters and the weight smoothing coefficient.
11. An electronic device, characterized in that the electronic device comprises:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the steps of the credit risk prediction model training method of any one of claims 1 to 10.
12. A computer-readable storage medium, wherein the computer-readable storage medium has stored thereon a program implementing a training method for a credit risk prediction model, the program being executed by a processor to implement the steps of the training method for a credit risk prediction model according to any one of claims 1 to 10.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210995711.1A CN115293889A (en) | 2022-08-18 | 2022-08-18 | Credit risk prediction model training method, electronic device and readable storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210995711.1A CN115293889A (en) | 2022-08-18 | 2022-08-18 | Credit risk prediction model training method, electronic device and readable storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115293889A (en) | 2022-11-04 |
Family
ID=83830578
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210995711.1A Pending CN115293889A (en) | 2022-08-18 | 2022-08-18 | Credit risk prediction model training method, electronic device and readable storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115293889A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116319076A (en) * | 2023-05-15 | 2023-06-23 | 鹏城实验室 | Malicious traffic detection method, device, equipment and computer readable storage medium |
CN116319076B (en) * | 2023-05-15 | 2023-08-25 | 鹏城实验室 | Malicious traffic detection method, device, equipment and computer readable storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110852438B (en) | Model generation method and device | |
CN111340221B (en) | Neural network structure sampling method and device | |
CN110766142A (en) | Model generation method and device | |
CN111523640B (en) | Training method and device for neural network model | |
WO2022110640A1 (en) | Model optimization method and apparatus, computer device and storage medium | |
CN112149699B (en) | Method and device for generating model and method and device for identifying image | |
CN112508118A (en) | Target object behavior prediction method aiming at data migration and related equipment thereof | |
CN112668238B (en) | Rainfall processing method, rainfall processing device, rainfall processing equipment and storage medium | |
US20210335500A1 (en) | Method and device for predicting a number of confirmed cases of an infectious disease, apparatus, and storage medium | |
CN116883154A (en) | Credit risk identification method, credit risk identification device, electronic equipment and readable storage medium | |
CN115293889A (en) | Credit risk prediction model training method, electronic device and readable storage medium | |
CN114049162B (en) | Model training method, demand prediction method, apparatus, device, and storage medium | |
CN115187393A (en) | Loan risk detection method and device, electronic equipment and readable storage medium | |
CN114972113A (en) | Image processing method and device, electronic equipment and readable storage medium | |
CN111291715A (en) | Vehicle type identification method based on multi-scale convolutional neural network, electronic device and storage medium | |
CN114638411A (en) | Carbon dioxide concentration prediction method, device, equipment and medium | |
CN113869599A (en) | Fish epidemic disease development prediction method, system, equipment and medium | |
CN115543638B (en) | Uncertainty-based edge calculation data collection and analysis method, system and equipment | |
CN116151961A (en) | Credit risk prediction method, electronic device and readable storage medium | |
CN113723712B (en) | Wind power prediction method, system, equipment and medium | |
CN114647721A (en) | Educational intelligent robot control method, device and medium | |
CN113255819B (en) | Method and device for identifying information | |
CN113361701A (en) | Quantification method and device of neural network model | |
CN118608279A (en) | Method, device, equipment, storage medium and program product for predicting flow time consumption | |
CN115550259B (en) | Flow distribution method based on white list and related equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||