CN117291707A - Loan application processing method, device, electronic equipment and storage medium - Google Patents

Loan application processing method, device, electronic equipment and storage medium

Info

Publication number
CN117291707A
Authority
CN
China
Prior art keywords
probability
user
approval
sample
prediction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311176331.6A
Other languages
Chinese (zh)
Inventor
伏峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Construction Bank Corp
CCB Finetech Co Ltd
Original Assignee
China Construction Bank Corp
CCB Finetech Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Construction Bank Corp, CCB Finetech Co Ltd filed Critical China Construction Bank Corp
Priority to CN202311176331.6A priority Critical patent/CN117291707A/en
Publication of CN117291707A publication Critical patent/CN117291707A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/03Credit; Loans; Processing thereof

Landscapes

  • Business, Economics & Management (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Engineering & Computer Science (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • Technology Law (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Financial Or Insurance-Related Operations Such As Payment And Settlement (AREA)

Abstract

The disclosure provides a loan application processing method, a loan application processing device, an electronic device and a storage medium. The method comprises the following steps: carrying out credit prediction on a user applying for a loan based on a credit evaluation model to obtain a credit score corresponding to the user; if the credit score is within a preset range, determining, through a first probability prediction model, a first prediction probability of the user belonging to each first label; determining a target cluster to which the user belongs according to the first prediction probabilities corresponding to the user; determining a second prediction probability of the user belonging to a second label by using a second probability prediction model corresponding to the target cluster; and determining the approval result of the user's loan application according to the second prediction probability and the approval threshold corresponding to the second probability prediction model. Therefore, for users for whom the credit evaluation model cannot determine an approval result, the first probability prediction model and the second probability prediction model can assist the credit evaluation model in processing the loan application, so that the objectivity of the approval result of the loan application is ensured.

Description

Loan application processing method, device, electronic equipment and storage medium
Technical Field
The disclosure relates to the technical field of artificial intelligence and risk control, and in particular to a loan application processing method, a loan application processing device, electronic equipment and a storage medium.
Background
When a client applies for a loan, a credit evaluation model can be used to screen the client and judge default risk, but for some clients the credit evaluation model still cannot decide the loan application. In the related art, such applications are usually approved according to rules derived from expert experience, but expert experience is often subjective, so the approval results of these loan applications are also subjective.
Disclosure of Invention
The disclosure provides a loan application processing method, a loan application processing device, electronic equipment and a storage medium. The technical scheme of the present disclosure is as follows:
according to a first aspect of an embodiment of the present disclosure, there is provided a loan application processing method, including:
carrying out credit prediction on the user applying for loan based on the credit evaluation model so as to obtain credit scores corresponding to the user;
under the condition that the credit score is in a preset range, determining a first prediction probability of the user belonging to each first label through a first probability prediction model, wherein the first labels are used for indicating an approval mode and a corresponding approval result;
determining a target cluster to which the user belongs according to the first prediction probabilities corresponding to the user;
determining a second prediction probability of the user belonging to a second label by using a second probability prediction model corresponding to the target grouping, wherein the second label is used for indicating default after loan or non-default after loan;
and determining the approval result of the loan application of the user according to the second prediction probability and the approval threshold value corresponding to the second probability prediction model.
According to a second aspect of the embodiments of the present disclosure, there is provided a loan application processing apparatus, including:
the first acquisition module is used for carrying out credit prediction on the user applying for the loan based on the credit evaluation model so as to acquire a credit score corresponding to the user;
the first determining module is used for determining a first prediction probability of each first label of the user through a first probability prediction model under the condition that the credit score is in a preset range, wherein the first labels are used for indicating an approval mode and a corresponding approval result;
the second determining module is used for determining target clusters to which the user belongs according to the first prediction probabilities corresponding to the user;
the third determining module is used for determining a second prediction probability of the user belonging to a second label by using a second probability prediction model corresponding to the target grouping, wherein the second label is used for indicating the default after loan or the non-default after loan;
and the fourth determining module is used for determining the approval result of the loan application of the user according to the second prediction probability and the approval threshold value corresponding to the second probability prediction model.
According to a third aspect of embodiments of the present disclosure, there is provided an electronic device, comprising: a processor; a memory for storing the processor-executable instructions; wherein the processor is configured to execute the instructions to implement the method according to the above-described embodiments of the present disclosure.
According to a fourth aspect of embodiments of the present disclosure, there is provided a computer readable storage medium storing instructions which, when executed by a processor of an electronic device, cause the electronic device to perform a method as described in the above embodiments of the present disclosure.
According to a fifth aspect of embodiments of the present disclosure, there is provided a computer program product comprising: a computer program which, when executed by a processor, implements a method as described in the above embodiments of the present disclosure.
The technical scheme provided by the embodiment of the disclosure at least brings the following beneficial effects: for users for whom the credit evaluation model fails to determine an approval result, the first probability prediction model and the second probability prediction model can assist the credit evaluation model in processing the loan application, so that the objectivity of the approval result of the loan application is ensured. In addition, the target cluster to which the user belongs is determined from the first prediction probabilities, and the probability of the user belonging to the second label is determined by the second probability prediction model corresponding to that target cluster, so that the accuracy of the prediction result can be improved, and the accuracy of the approval result can be further improved.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the disclosure and together with the description, serve to explain the principles of the disclosure and do not constitute an undue limitation on the disclosure.
FIG. 1 is a flow chart of a loan application processing method according to a first embodiment of the present disclosure;
FIG. 2 is a flow chart of a loan application processing method according to a second embodiment of the present disclosure;
FIG. 3 is a flow chart of a loan application processing method according to a third embodiment of the present disclosure;
FIG. 4 is a schematic structural diagram of a loan application processing apparatus according to a fourth embodiment of the present disclosure;
FIG. 5 is a schematic structural diagram of an electronic device according to an exemplary embodiment of the present disclosure.
Detailed Description
In order to enable those skilled in the art to better understand the technical solutions of the present disclosure, the technical solutions of the embodiments of the present disclosure will be clearly and completely described below with reference to the accompanying drawings.
It should be noted that the terms "first," "second," and the like in the description and claims of the present disclosure and in the foregoing figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the disclosure described herein may be capable of operation in sequences other than those illustrated or described herein. The implementations described in the following exemplary examples are not representative of all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with some aspects of the present disclosure as detailed in the accompanying claims.
It should be noted that, in the technical scheme of the present disclosure, the acquisition, storage, use, processing, etc. of data all conform to the relevant national laws and regulations and do not violate public order and good customs.
When a client applies for a loan, a credit evaluation model can be used to screen the client and judge default risk, but for some clients the credit evaluation model still cannot decide the loan application. In the related art, such applications are usually approved according to rules derived from expert experience, but expert experience is often subjective, so the approval results of these loan applications are also subjective.
Accordingly, in response to at least one of the above-mentioned problems, the present disclosure proposes a loan application processing method, apparatus, electronic device, and storage medium.
The loan application processing method, apparatus, electronic device, and storage medium of the embodiments of the present disclosure are described below with reference to the accompanying drawings. Fig. 1 is a flow chart illustrating a loan application processing method, according to a first embodiment of the present disclosure.
The embodiments of the present disclosure are exemplified by the loan application processing method being configured in a loan application processing apparatus, which may be applied to any electronic device, so that the electronic device may perform a loan application processing function.
The electronic device may be any device with computing capability, for example, may be a personal computer, a mobile terminal, a server, etc., and the mobile terminal may be, for example, a vehicle-mounted device, a mobile phone, a tablet computer, a personal digital assistant, a wearable device, etc., which have various operating systems, touch screens, and/or display screens.
As shown in fig. 1, the loan application processing method may include:
Step 101, carrying out credit prediction on the user applying for the loan based on the credit evaluation model so as to acquire a credit score corresponding to the user.
In the disclosure, for a user applying for a loan, the credit evaluation model can be used to perform credit prediction on the user's loan-application-related data to obtain the user's credit score. The loan-application-related data may include the amount of the loan applied for, the user's age, the user's income status, whether the user has a loan record, and the like.
The credit score may be used to indicate the user's credit status. For example, a relatively low credit score indicates that the user's credit is low, that is, the user may carry a higher risk of default (e.g., low repayment capacity).
Step 102, determining a first prediction probability of the user belonging to each first label through a first probability prediction model under the condition that the credit score is in a preset range.
In the present disclosure, if the credit score is within a preset range, the credit evaluation model may be considered as incapable of giving an approval result, and then the first probability prediction model may be used to determine the first prediction probability that the user belongs to each first tag.
The first labels may be used to indicate an approval mode and the corresponding approval result. For example, there may be 4 first labels: automatic approval passing (the credit evaluation model automatically approves the application), automatic approval failing (the credit evaluation model automatically rejects the application), manual approval passing (the credit evaluation model cannot give an approval result and the application passes manual approval), and manual approval failing (the credit evaluation model cannot give an approval result and the application fails manual approval).
In the present disclosure, the first probability prediction model may be obtained by training a sample user who has applied for a loan and a first tag to which the sample user belongs.
In the disclosure, if the credit score of the user is greater than the upper limit of the preset range, the credit of the user can be considered to be good, and the loan application approval of the user passes; if the credit score is less than the lower limit of the preset range, the credit of the user is considered to be lower, and the loan application of the user is not passed.
For example, the preset range is [A1, A2], with lower limit A1 and upper limit A2. If the credit score of user A1 is greater than the lower limit A1 and less than the upper limit A2, the approval result cannot be given automatically by the credit evaluation model, and the first probability prediction model may be used to predict the first prediction probability that user A1 belongs to each first label. If the credit score of user A2 is greater than the upper limit A2, the loan application of user A2 is approved. If the credit score of user A3 is less than the lower limit A1, the loan application of user A3 is not approved.
Step 103, determining the target cluster to which the user belongs according to the first prediction probabilities corresponding to the user.
In the present disclosure, the grouping may be obtained by clustering sample users applying for loans, and sample users in the same grouping are similar.
In the disclosure, a sample user may be randomly selected from each cluster. For each first label, the absolute value of the difference between the probability that the sample user belongs to that label and the first prediction probability that the user belongs to the same label is calculated, and the cluster of the sample user with the smallest mean absolute difference may be used as the target cluster.
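The selection rule above can be sketched as follows. This is a minimal illustration only: the function names, the dictionary layout of `clusters`, and the optional `rng` argument are assumptions introduced here, not part of the disclosure.

```python
import random
from typing import Dict, List, Optional, Sequence


def mean_abs_diff(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean of the per-label absolute differences between two probability lists."""
    if len(a) != len(b):
        raise ValueError("both inputs need one probability per first label")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def pick_target_cluster(
    user_probs: Sequence[float],
    clusters: Dict[str, List[Sequence[float]]],
    rng: Optional[random.Random] = None,
) -> str:
    """Pick the cluster whose randomly drawn sample user is closest to the user.

    `clusters` maps a hypothetical cluster id to the per-label probabilities
    of the sample users in that cluster.
    """
    if not clusters:
        raise ValueError("at least one cluster is required")
    if rng is None:
        rng = random.Random()
    best_id, best_gap = None, float("inf")
    for cluster_id, members in clusters.items():
        representative = rng.choice(members)  # one random sample user per cluster
        gap = mean_abs_diff(user_probs, representative)
        if best_id is None or gap < best_gap:
            best_id, best_gap = cluster_id, gap
    return best_id
```

Because one sample user is drawn at random from each cluster, repeated calls may return different clusters for the same user unless a seeded `random.Random` is passed in.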
Step 104, determining a second prediction probability of the user belonging to the second label by using a second probability prediction model corresponding to the target cluster.
The second label may be used to indicate default after the loan or non-default after the loan.
In the present disclosure, each cluster has a corresponding second probability prediction model, which may be trained using the feature data of the sample users in the cluster. The feature data may include a plurality of feature dimensions, and the feature dimensions may differ for different types of users or different types of loan applications; for example, the feature dimensions used for individual users and for enterprise users may be different.
In the disclosure, the feature data of the user can be input into a second probability prediction model corresponding to the target grouping to predict, so as to obtain a second prediction probability of the user belonging to the second label.
The second probability prediction model may output both the probability that the user defaults and the probability that the user does not default; alternatively, it may output only the probability that the user defaults, from which the probability that the user does not default is obtained by calculation. The disclosure does not limit this.
Step 105, determining the approval result of the loan application of the user according to the second prediction probability and the approval threshold value corresponding to the second probability prediction model.
In the present disclosure, each cluster has a corresponding second probabilistic predictive model with a corresponding approval threshold.
In the disclosure, the second prediction probability may be compared with an approval threshold corresponding to the second probability prediction model corresponding to the target grouping, and according to the comparison result, an approval result of the loan application of the user is determined.
As one implementation manner, the second label is used for indicating non-default after the loan. If the second prediction probability is greater than or equal to the approval threshold, the approval result of the user's loan application can be determined to be approval passing, and if the second prediction probability is less than the approval threshold, the approval result can be determined to be approval failing.
As another implementation manner, the second tag is used for indicating a post-loan default, if the second prediction probability is less than or equal to the approval threshold, it may be determined that the approval result of the user loan application is approval passing, and if the second prediction probability is greater than the approval threshold, it may be determined that the approval result of the user loan application is approval failing.
It should be noted that, when the second prediction probability is equal to the approval threshold, the approval result may be that the approval passes or the approval fails, and may be set according to actual needs, which is not limited in the disclosure.
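The two comparison rules and the configurable tie case can be sketched as a single function. The parameter names `label_is_default` and `approve_on_tie` are assumptions introduced for this illustration.

```python
def approval_result(
    second_prob: float,
    threshold: float,
    label_is_default: bool,
    approve_on_tie: bool = True,
) -> bool:
    """Return True for approval passing, False for approval failing.

    label_is_default: True if the second label means "default after loan",
        False if it means "non-default after loan".
    approve_on_tie: outcome when the probability equals the threshold; the
        text leaves this choice open, so it is exposed as a setting.
    """
    if not 0.0 <= second_prob <= 1.0:
        raise ValueError("second_prob must be a probability in [0, 1]")
    if second_prob == threshold:
        return approve_on_tie
    if label_is_default:
        # Low default probability is good: approve below the threshold.
        return second_prob < threshold
    # High non-default probability is good: approve above the threshold.
    return second_prob > threshold
```

For example, with a non-default label and a threshold of 0.6, a second prediction probability of 0.8 passes; with a default label and the same threshold, the same probability fails.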
In the embodiment of the disclosure, for a user for whom the credit evaluation model fails to determine an approval result, a first prediction probability that the user belongs to each first label may be determined through the first probability prediction model; the target cluster to which the user belongs is then determined according to the first prediction probabilities; a second prediction probability that the user belongs to the second label is determined using the second probability prediction model corresponding to the target cluster; and the approval result of the user's loan application is determined according to the second prediction probability and the approval threshold corresponding to the second probability prediction model. Therefore, for users for whom the credit evaluation model fails to determine an approval result, the first probability prediction model and the second probability prediction model can assist the credit evaluation model in processing the loan application, so that the objectivity of the approval result of the loan application is ensured. In addition, the target cluster to which the user belongs is determined from the first prediction probabilities, and the probability of the user belonging to the second label is determined by the second probability prediction model corresponding to that target cluster, so that the accuracy of the prediction result can be improved, and the accuracy of the approval result can be further improved.
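Steps 101 to 105 can be tied together in a short orchestration sketch. Everything here is hypothetical scaffolding: the models are passed in as plain callables, the second label is assumed to mean default after the loan, boundary scores are treated as inside the preset range, and a tie with the threshold is treated as approval.

```python
from typing import Callable, Dict, Sequence, Tuple

Features = Dict[str, float]
DefaultModel = Callable[[Features], float]  # returns P(default after loan)


def process_application(
    features: Features,
    credit_model: Callable[[Features], float],
    preset_range: Tuple[float, float],
    first_model: Callable[[Features], Sequence[float]],
    assign_cluster: Callable[[Sequence[float]], str],
    second_models: Dict[str, Tuple[DefaultModel, float]],
) -> bool:
    """Return True if the loan application is approved, False otherwise."""
    lower, upper = preset_range
    score = credit_model(features)              # step 101: credit score
    if score > upper:
        return True                             # automatic approval passing
    if score < lower:
        return False                            # automatic approval failing
    first_probs = first_model(features)         # step 102: first prediction probabilities
    cluster_id = assign_cluster(first_probs)    # step 103: target cluster
    second_model, threshold = second_models[cluster_id]
    default_prob = second_model(features)       # step 104: second prediction probability
    return default_prob <= threshold            # step 105: compare with approval threshold
```

Keeping each model behind a callable means the per-cluster second models and their thresholds can be swapped without touching the control flow.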
Fig. 2 is a flow chart illustrating a loan application processing method, according to a second embodiment of the disclosure.
As shown in fig. 2, the loan application processing method includes:
Step 201, carrying out credit prediction on the user applying for the loan based on the credit evaluation model so as to acquire a credit score corresponding to the user.
Step 202, determining a first prediction probability of the user belonging to each first tag through a first probability prediction model under the condition that the credit score is within a preset range.
In the present disclosure, any implementation manner of the embodiments of the present disclosure may be adopted in the steps 201 to 202, which is not limited and not repeated in the present disclosure.
Step 203, vectorizing the user according to the first prediction probabilities corresponding to the user to obtain a first vector corresponding to the user.
In the present disclosure, each first prediction probability may be formed into a vector according to a certain label order, so as to obtain a first vector corresponding to a user.
For example, there are 4 first labels, namely label 1, label 2, label 3 and label 4, and the first prediction probabilities of a user belonging to the 4 labels are p1, p2, p3 and p4, respectively, so the first vector may be (p1, p2, p3, p4).
In step 204, a target cluster is determined from the plurality of clusters according to the distances between the first vector and the cluster centers of the plurality of clusters, respectively.
The grouping center may be determined according to the probability that each sample user in the grouping belongs to each first label, the grouping center may be represented by a vector, and the number of elements contained in the vector corresponding to the grouping center is the same as the number of the first labels.
In the present disclosure, a distance between the first vector and a cluster center of each cluster may be calculated, and a cluster having the smallest distance may be determined as a target cluster. For example, the euclidean distance between the first vector and the center of each cluster may be calculated, and the cluster corresponding to the minimum value of the euclidean distance may be determined as the target cluster.
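A minimal sketch of the nearest-center rule with Euclidean distance follows. The function name and the dictionary of cluster centers are assumptions; `math.dist` is from the Python standard library (3.8 or later).

```python
import math
from typing import Dict, Sequence


def nearest_cluster(
    first_vector: Sequence[float],
    centers: Dict[str, Sequence[float]],
) -> str:
    """Return the id of the cluster whose center is closest to the first vector.

    `centers` maps a hypothetical cluster id to its center vector, which has
    one element per first label, like the first vector itself.
    """
    if not centers:
        raise ValueError("at least one cluster center is required")
    # math.dist computes the Euclidean distance between two equal-length points.
    return min(centers, key=lambda cid: math.dist(first_vector, centers[cid]))
```

Other distance measures could be substituted by replacing `math.dist`; the text names Euclidean distance only as an example.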
Step 205, determining a second prediction probability of the user belonging to the second label by using a second probability prediction model corresponding to the target cluster.
Step 206, determining the approval result of the loan application of the user according to the second prediction probability and the approval threshold value corresponding to the second probability prediction model.
In the present disclosure, any implementation manner of the embodiments of the present disclosure may be adopted in step 205 to step 206, which is not limited by the present disclosure and is not repeated herein.
In the embodiment of the disclosure, the first prediction probability of the user belonging to each first label can be vectorized to obtain the first vector, and the target cluster of the user is determined according to the distance between the first vector and the cluster center of each cluster, so that the accuracy is improved, and the method is simple and convenient.
Fig. 3 is a flowchart illustrating a loan application processing method, according to a third embodiment of the disclosure.
As shown in fig. 3, the loan application processing method includes:
Step 301, determining the first label to which each sample user belongs based on the credit prediction results of the credit evaluation model for a plurality of sample users who have applied for loans.
In the disclosure, credit prediction may be performed on each sample user by using the credit evaluation model to obtain a credit score of each sample user. If the credit score is greater than the upper limit of a preset range, it may be determined that the loan application of the sample user is automatically approved; if the credit score is less than the lower limit of the preset range, the loan application of the sample user is automatically rejected; and if the credit score is within the preset range, the application may be manually reviewed, with the result being either approval or rejection. Therefore, the first label to which a sample user belongs can be determined according to the loan approval mode and the approval result of the sample user.
For example, there are 4 first tags, namely, tag 1, tag 2, tag 3 and tag 4, wherein tag 1 is used for indicating that automatic approval passes, tag 2 is used for indicating that automatic approval fails, tag 3 is used for indicating that manual approval passes, tag 4 is used for indicating that manual approval fails, and if the loan application of a sample user is approved by the credit evaluation model, it can be determined that the first tag to which the sample user belongs is tag 1.
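The labelling rule for sample users can be sketched as below. The constant names and the `manual_approved` argument are assumptions made for illustration; scores equal to a limit are treated here as inside the preset range.

```python
from typing import Optional

# Label values follow the example: 1 = automatic approval passing,
# 2 = automatic approval failing, 3 = manual approval passing,
# 4 = manual approval failing.
AUTO_PASS, AUTO_FAIL, MANUAL_PASS, MANUAL_FAIL = 1, 2, 3, 4


def first_label(
    score: float,
    lower: float,
    upper: float,
    manual_approved: Optional[bool] = None,
) -> int:
    """Derive a sample user's first label from the score and the review outcome."""
    if lower > upper:
        raise ValueError("lower limit must not exceed upper limit")
    if score > upper:
        return AUTO_PASS
    if score < lower:
        return AUTO_FAIL
    # Inside the preset range the credit model cannot decide, so the label
    # depends on the recorded outcome of the manual review.
    if manual_approved is None:
        raise ValueError("a score inside the preset range needs a manual review outcome")
    return MANUAL_PASS if manual_approved else MANUAL_FAIL
```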
Step 302, training an initial first probability prediction model according to the first label to which each sample user belongs and the feature data corresponding to each sample user to obtain a first probability prediction model.
In the disclosure, feature data of a sample user can be input into an initial first probability prediction model to obtain a prediction label of the sample user, model loss is determined according to the difference between the prediction label and the first label, parameters of the initial first probability prediction model are adjusted according to the model loss, and the model after the parameters are adjusted is continuously trained until the condition of finishing training is met, so that the first probability prediction model is obtained.
The training ending condition may be that the model loss is smaller than a preset threshold, or that the training frequency reaches a preset frequency, and the training ending condition may be determined according to actual needs, which is not limited in the present disclosure.
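The training loop and its two ending conditions can be illustrated with a small, dependency-free example. The choice of multinomial logistic (softmax) regression with cross-entropy loss and batch gradient descent is an assumption made here for concreteness; the disclosure does not fix the model type or the loss. All function and parameter names are hypothetical.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_probs(weights: List[List[float]], x: Sequence[float]) -> List[float]:
    """Probability of each first label for one feature vector."""
    xb = list(x) + [1.0]  # trailing 1.0 multiplies the bias weight
    return softmax([sum(w * v for w, v in zip(row, xb)) for row in weights])


def train_first_model(
    features: Sequence[Sequence[float]],
    labels: Sequence[int],          # first labels, numbered from 1
    num_labels: int = 4,
    lr: float = 0.5,
    loss_threshold: float = 0.05,   # ending condition 1: loss small enough
    max_rounds: int = 2000,         # ending condition 2: round limit reached
) -> List[List[float]]:
    dim = len(features[0])
    weights = [[0.0] * (dim + 1) for _ in range(num_labels)]
    n = len(features)
    for _ in range(max_rounds):
        grads = [[0.0] * (dim + 1) for _ in range(num_labels)]
        loss = 0.0
        for x, y in zip(features, labels):
            probs = predict_probs(weights, x)
            loss -= math.log(max(probs[y - 1], 1e-12))  # cross-entropy term
            xb = list(x) + [1.0]
            for k in range(num_labels):
                err = probs[k] - (1.0 if k == y - 1 else 0.0)
                for j, v in enumerate(xb):
                    grads[k][j] += err * v
        if loss / n < loss_threshold:
            break  # stop before updating: current weights already meet the target
        for k in range(num_labels):
            for j in range(dim + 1):
                weights[k][j] -= lr * grads[k][j] / n
    return weights
```

The returned weights can be passed to `predict_probs` to obtain one probability per first label for a new applicant's feature vector.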
For example, customers applying for loans are classified and labeled based on a credit rating model, as shown in Table 1 below.
Table 1 Customer categories and labels
The set {AutoAccept} represents the group of clients that are automatically approved based on the credit evaluation model online_model. The judgment logic is that the credit score calculated for the client by online_model is greater than the automatic pass score threshold baseScoreUp, and the first label value of clients in the set {AutoAccept} is assigned 1.
The set {AutoRefuse} represents the group of clients that are automatically rejected based on the credit evaluation model online_model. The judgment logic is that the credit score calculated for the client by online_model is less than the automatic reject score threshold baseScoreDown, and the first label value of clients in the set {AutoRefuse} is assigned 2.
The set {ManulAccept} represents the group of clients that cannot be automatically approved or rejected by the credit evaluation model online_model and that are judged to pass through manual intervention. The judgment logic is that the credit score calculated for the client by online_model is greater than or equal to the automatic reject score threshold baseScoreDown and less than or equal to the automatic pass score threshold baseScoreUp, and the client is approved after manual review; the first label value of clients in the set {ManulAccept} is assigned 3.
The set {ManulRefuse} represents the group of clients that cannot be automatically approved or rejected by the credit evaluation model online_model and that are judged to fail through manual intervention. The judgment logic is that the credit score calculated for the client by online_model is greater than or equal to the automatic reject score threshold baseScoreDown and less than or equal to the automatic pass score threshold baseScoreUp, and the client is rejected after manual review; the first label value of clients in the set {ManulRefuse} is assigned 4.
Combining the client sets {AutoAccept}, {AutoRefuse}, {ManulAccept} and {ManulRefuse} yields the client set {MergeSample}. Combining the features and labels of these clients, the following Table 2 can be formed:
Table 2 Customer sample representation
Here $S_i$ denotes a client in the client set {MergeSample}, and feature j denotes a feature dimension of the client. With the features as candidate independent variables and the label as the dependent variable, a first probability prediction model accept_model is constructed using a common supervised classification algorithm (e.g., logistic regression, random forest, or XGBoost, an optimized distributed gradient boosting library). The model can be used to determine the probabilities that a client belongs to label 1, label 2, label 3 and label 4, respectively.
In the embodiment of the disclosure, the first label of each sample user can be determined based on the credit evaluation model, and then the first probability prediction model is obtained by training based on the characteristic data of the sample user and the first label of the sample user, so that the probability that the loan application user belongs to each first label can be predicted through the model.
In one embodiment of the present disclosure, the first probability prediction model obtained by training may be used to predict each sample user, yielding a third prediction probability that each sample user belongs to each first label. Each sample user is then vectorized according to these third prediction probabilities to obtain a second vector corresponding to that sample user, the sample users are clustered according to feature data of the sample users to obtain a plurality of clusters, and a cluster center of each cluster is determined according to the second vectors of the sample users in that cluster.
In this disclosure, the method for obtaining the second vector is similar to that of the first vector, and therefore will not be described herein.
When determining the cluster center of each cluster, the average value of the third prediction probabilities with which the sample users in the cluster belong to the same first label can be calculated for each first label, and the cluster center is obtained from the average values corresponding to all the first labels.
For example, a certain cluster includes N sample users, where N is an integer greater than 1, and the second vector of the i-th sample user is $(p_{i1}, p_{i2}, p_{i3}, p_{i4})$, where i is a positive integer less than or equal to N. The cluster center of the cluster can be
$$\left(\frac{1}{N}\sum_{i=1}^{N} p_{i1},\ \frac{1}{N}\sum_{i=1}^{N} p_{i2},\ \frac{1}{N}\sum_{i=1}^{N} p_{i3},\ \frac{1}{N}\sum_{i=1}^{N} p_{i4}\right)$$
Taking the above-mentioned client set { MergeSample } as an example, the probability that all clients in the above-mentioned set { MergeSample } belong to the labels 1, 2, 3, and 4 is predicted by using the above-mentioned first probability prediction model accept_model, and the result is as follows in table 3:
Table 3 Prediction results of the first probability prediction model

| Client | Label 1 | Label 2 | Label 3 | Label 4 |
| --- | --- | --- | --- | --- |
| $S_1$ | $P_{1,1}$ | $P_{1,2}$ | $P_{1,3}$ | $P_{1,4}$ |
| $S_2$ | $P_{2,1}$ | $P_{2,2}$ | $P_{2,3}$ | $P_{2,4}$ |
| $S_3$ | $P_{3,1}$ | $P_{3,2}$ | $P_{3,3}$ | $P_{3,4}$ |
| $S_i$ | $P_{i,1}$ | $P_{i,2}$ | $P_{i,3}$ | $P_{i,4}$ |
| $S_m$ | $P_{m,1}$ | $P_{m,2}$ | $P_{m,3}$ | $P_{m,4}$ |
Wherein $P_{i,1}$, $P_{i,2}$, $P_{i,3}$, and $P_{i,4}$ represent the probabilities that client $S_i$ belongs to label 1, label 2, label 3, and label 4, respectively. Each client is vectorized using these predicted probabilities to obtain $S_i = \langle P_{i,1}, P_{i,2}, P_{i,3}, P_{i,4} \rangle$.
All clients in the client set {MergeSample} are vectorized, and a client clustering model is constructed from the vectorization results with an unsupervised clustering algorithm, such as K-means or density-based spatial clustering of applications with noise (DBSCAN). The clustering result finally obtained is shown in Table 4 below:
Table 4 Clustering results
Wherein $\langle AVG\_P_{i,1}, AVG\_P_{i,2}, AVG\_P_{i,3}, AVG\_P_{i,4} \rangle$ represents the center point of cluster {Cluster_i}, i.e., the average of the vectors $\langle P_{\cdot,1}, P_{\cdot,2}, P_{\cdot,3}, P_{\cdot,4} \rangle$ of all clients in cluster {Cluster_i}: $AVG\_P_{i,1} = AVG(P_{\cdot,1})$, $AVG\_P_{i,2} = AVG(P_{\cdot,2})$, $AVG\_P_{i,3} = AVG(P_{\cdot,3})$, and $AVG\_P_{i,4} = AVG(P_{\cdot,4})$, where each average is taken over the clients in {Cluster_i}.
Wherein the union of the clusters is the set {MergeSample}, i.e., {MergeSample} = {Cluster_1} ∪ {Cluster_2} ∪ … ∪ {Cluster_i} ∪ … ∪ {Cluster_n}, and the intersection of any two clusters is empty, i.e., {Cluster_i} ∩ {Cluster_j} = ∅ for any i ≠ j.
In the embodiment of the disclosure, the first probability prediction model can be used to determine the third prediction probability that each sample user belongs to each first label, and each sample user is vectorized according to these third prediction probabilities. The sample users are clustered to obtain a plurality of clusters, and the second vectors of the sample users in each cluster are used to determine the cluster center of that cluster. The sample users are thus grouped by clustering, and the cluster centers are determined from the prediction results of the first probability prediction model, which improves accuracy.
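As an illustration of clustering the probability vectors and computing each cluster center as the mean of its members, the following is a minimal K-means sketch in plain Python. It is not the disclosed implementation: the function name, the naive initialization from the first k vectors, the Euclidean distance, and the iteration cap are all assumptions made for the example.

```python
import math
from typing import List, Sequence, Tuple


def kmeans(vectors: Sequence[Sequence[float]], k: int,
           iterations: int = 50) -> Tuple[List[int], List[List[float]]]:
    """Cluster probability vectors; return (assignment, cluster centers)."""
    if not 1 <= k <= len(vectors):
        raise ValueError("k must be between 1 and the number of vectors")
    # Naive deterministic initialization (assumption): first k vectors.
    centers = [list(v) for v in vectors[:k]]
    assignment = [0] * len(vectors)
    for _ in range(iterations):
        # Assign every vector to its nearest center (Euclidean distance).
        assignment = [
            min(range(k), key=lambda c: math.dist(v, centers[c]))
            for v in vectors
        ]
        # Recompute each center as the per-dimension mean of its members.
        new_centers = []
        for c in range(k):
            members = [v for v, a in zip(vectors, assignment) if a == c]
            if members:
                new_centers.append(
                    [sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(centers[c])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return assignment, centers
```

Each returned center is the per-label average of the member vectors, which corresponds to the center point described for Table 4.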
In one embodiment of the disclosure, for each cluster, the initial second probability prediction model may be trained according to the feature data of the first target sample users in the cluster and the second labels to which the first target sample users belong, to obtain a third probability prediction model. The first target sample users are the sample users in the cluster whose first label is a first target label, and the first target labels may include automatic approval passed and manual approval passed. That is, the first target sample users are the sample users in the cluster who passed automatic approval or manual approval.
After the third probability prediction model is obtained, the feature data of the second target sample users in the cluster may be input into the third probability prediction model to determine a fourth prediction probability that each second target sample user belongs to the second label. The second target sample users are the sample users in the cluster whose first label is a second target label, and the second target labels may include automatic approval failed and manual approval failed. That is, the second target sample users are the sample users in the cluster who failed automatic approval or manual approval.
For example, when the second label indicates post-loan non-default, the third probability prediction model corresponding to the cluster can be used to predict the probability of post-loan non-default for the sample users who failed automatic approval and for the sample users who failed manual approval.
Then, each second target sample user may be marked with the second label, with the fourth prediction probability as its weight. The prediction probability that the second target sample user belongs to the other second label is also determined, and the user is additionally marked with that other second label, with this prediction probability as its weight. The second target sample users marked with the second label, the second target sample users marked with the other second label, and the other sample users in the cluster are merged to obtain the training sample set corresponding to the cluster. The initial second probability prediction model is then trained using the feature data of the training samples in the training sample set and the second labels to which they belong, to obtain the second probability prediction model corresponding to the cluster. The training method of the second probability prediction model is similar to that of the first probability prediction model, and therefore will not be described herein.
Taking Table 4 as an example, within each cluster, the second label of each automatically approved client and each manually approved client may be determined according to whether the client defaulted after the loan. For ease of understanding, the second label may be marked "good" to indicate post-loan non-default and "bad" to indicate post-loan default, as shown in Table 5 below:
Table 5 Wide table of automatically approved and manually approved clients
Wherein $S_i$ represents a client in cluster {Cluster_i} who passed automatic approval or manual approval, and feature j represents a feature dimension of the client. Using the features as candidate independent variables and the good/bad label as the dependent variable, a third probability prediction model good_model_A is constructed with a supervised classification algorithm (e.g., logistic regression, random forest, or XGBoost).
The third probability prediction model good_model_A is used to perform probability prediction on the clients in cluster {Cluster_i} who failed automatic approval and the clients who failed manual approval, calculating for each such client the probability P(Good) of post-loan non-default and the probability P(Bad) of post-loan default, where P(Good) = 1 - P(Bad).
The clients in cluster {Cluster_i} who failed automatic approval or manual approval can be duplicated into two copies: in one copy the second label is marked Good with weight P(Good), and in the other copy the second label is marked Bad with weight P(Bad).
The weighted copies of the clients who failed automatic or manual approval are then combined with the clients who passed automatic or manual approval. Specifically, the weighted good clients among the failed clients are combined with the good clients among the approved clients, and the weighted bad clients among the failed clients are combined with the bad clients among the approved clients. A second probability prediction model good_model_B is then constructed with a supervised classification algorithm (e.g., logistic regression, random forest, or XGBoost). A second probability prediction model good_model_B can be obtained for each cluster, as shown in Table 6 below:
Table 6 Client default probability prediction models for different clusters

| Cluster number | Cluster client set | Second probability prediction model |
| --- | --- | --- |
| Cluster 1 | {Cluster_1} | good_model_B_1 |
| Cluster 2 | {Cluster_2} | good_model_B_2 |
| Cluster i | {Cluster_i} | good_model_B_i |
| Cluster n | {Cluster_n} | good_model_B_n |
In the embodiment of the disclosure, the third probability prediction model can be obtained through training according to the second labels of the sample users passing through automatic approval and passing through manual approval in the clusters, and the probability of the sample users not passing through automatic approval and not passing through manual approval belonging to the second labels is predicted by using the third probability prediction model, so that a training sample set is constructed, and the accuracy of the second probability prediction model is improved through the second probability prediction model obtained through training of the training sample set.
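The construction of the weighted training sample set can be sketched as below. This is a hedged illustration rather than the disclosed implementation: `predict_good` is a hypothetical callable standing in for good_model_A, the tuple layout is invented for the example, and giving approved clients a weight of 1.0 is an assumption, since the text does not state their weight.

```python
from typing import Callable, List, Sequence, Tuple

# (features, second label, sample weight)
Sample = Tuple[Sequence[float], str, float]


def build_training_set(
    approved: Sequence[Tuple[Sequence[float], str]],
    rejected: Sequence[Sequence[float]],
    predict_good: Callable[[Sequence[float]], float],
) -> List[Sample]:
    """Merge approved clients with weighted copies of rejected clients.

    approved     -- (features, "good" | "bad") for clients who passed approval
    rejected     -- features of clients who failed approval (no observed label)
    predict_good -- stand-in for good_model_A, returns P(Good)
    """
    # Approved clients keep their observed label; weight 1.0 is an assumption.
    samples: List[Sample] = [(f, label, 1.0) for f, label in approved]
    for features in rejected:
        p_good = predict_good(features)
        # Duplicate each rejected client: one "good" copy weighted P(Good),
        # one "bad" copy weighted P(Bad) = 1 - P(Good).
        samples.append((features, "good", p_good))
        samples.append((features, "bad", 1.0 - p_good))
    return samples
```

The weights could then be passed as per-sample weights to whichever supervised classifier is used to fit good_model_B for the cluster.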
In this disclosure, the third probability prediction model may be directly determined as the second probability prediction model. Alternatively, the weighted good clients among the clients in a cluster who failed automatic or manual approval may be combined with the clients who passed automatic or manual approval to obtain a training sample set, and the second probability prediction model corresponding to the cluster is obtained by training on this set. Alternatively, the weighted bad clients among the clients in a cluster who failed automatic or manual approval may be combined with the clients who passed automatic or manual approval to obtain a training sample set, and the second probability prediction model corresponding to the cluster is obtained by training on this set.
In one embodiment of the disclosure, the second probability prediction model corresponding to the grouping may also be used to determine an approval threshold corresponding to the second probability prediction model.
In the disclosure, for each cluster, a second probability prediction model corresponding to the cluster may be used to determine a fifth prediction probability that each sample user in the cluster belongs to a second label, and according to a sequence from high to low of the fifth prediction probability, the sample users in the cluster are subjected to equal-frequency division to obtain a plurality of bins.
As an example, the sample users may be sorted by the fifth prediction probability from high to low or from low to high and divided into groups of a preset number. The smallest fifth prediction probability in each group is taken as the lower limit probability of the bin corresponding to the group, and the largest fifth prediction probability in each group is taken as the upper limit probability of the bin corresponding to the group.
For example, 100 sample users in a cluster are divided at equal frequency into 5 bins, that is, every 20 sample users form a group. The sample users may be sorted in descending order of the fifth prediction probability, with the 1st to 20th sample users as the first group, the 21st to 40th as the second group, the 41st to 60th as the third group, the 61st to 80th as the fourth group, and the 81st to 100th as the fifth group. The bin corresponding to each group is determined according to the minimum and maximum fifth prediction probabilities in the group. For the first group, positive infinity may be taken as the upper limit probability of the bin and the minimum fifth prediction probability in the group as the lower limit probability. For the fifth group, negative infinity may be taken as the lower limit probability of the bin. For the other groups, the minimum fifth prediction probability in the group may be taken as the lower limit probability of the bin and the maximum fifth prediction probability in the group as the upper limit probability.
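The equal-frequency division in this example can be sketched as follows. The function name is hypothetical, and two details are assumptions made for the sketch: any remainder when the number of users is not divisible by the number of bins goes to the last bin, and the last bin uses the maximum probability of its group as its upper limit.

```python
import math
from typing import List, Sequence, Tuple


def equal_frequency_bins(probabilities: Sequence[float],
                         m: int) -> List[Tuple[float, float]]:
    """Return (lower limit, upper limit) for m bins, highest bin first."""
    ordered = sorted(probabilities, reverse=True)
    n = len(ordered)
    if m < 1 or n < m:
        raise ValueError("need at least one sample per bin")
    size = n // m  # samples per bin; the remainder (if any) joins the last bin
    bins = []
    for b in range(m):
        start = b * size
        end = n if b == m - 1 else start + size
        group = ordered[start:end]
        # First bin is open above, last bin is open below.
        lower = -math.inf if b == m - 1 else group[-1]
        upper = math.inf if b == 0 else group[0]
        bins.append((lower, upper))
    return bins
```

With 100 users and m = 5, each bin covers 20 users, matching the example above.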
After the bins are obtained, for each bin, the sample users in the cluster whose fifth prediction probability is greater than the lower limit probability of the bin are determined as third target sample users. The ratio of the number of third target sample users who defaulted after the loan to the number of third target sample users is determined as the cumulative bad rate corresponding to the bin. The ratio of the number of third target sample users whose loan applications were approved to the number of sample users in the cluster is determined as the cumulative pass rate corresponding to the bin. The approval threshold is then determined according to the cumulative bad rate and the cumulative pass rate corresponding to each bin.
When the approval threshold is determined, the absolute value of the difference between the cumulative bad rate and the actual bad rate can be determined as the bad rate difference, the absolute value of the difference between the cumulative pass rate and the actual pass rate can be determined as the pass rate difference, and the lower limit probability of the bin with the smallest sum of the bad rate difference and the pass rate difference is determined as the approval threshold.
The actual pass rate and the actual bad rate can be obtained by counting the sample users of previous loan applications processed with the credit evaluation model. The actual pass rate can be the ratio of the number of sample users who passed automatic approval or manual approval to the total number of sample users. The actual bad rate can be the ratio of the number of sample users who defaulted after the loan, among those who passed automatic approval or manual approval, to the number of sample users who passed automatic approval or manual approval.
The process of determining the approval threshold is described below in connection with Table 5 above, taking as an example the case where the second label indicates post-loan non-default.
Based on the credit evaluation model online_model, two indexes of the current online clients are calculated, the pass rate online_accept_rate and the bad rate online_bad_rate, with the following formulas:
$$\text{online\_accept\_rate} = \frac{s1 + s2}{s1 + s2 + s3 + s4}$$
$$\text{online\_bad\_rate} = \frac{s11 + s21}{s1 + s2}$$
wherein s1 represents the number of clients who passed automatic approval, s2 represents the number of clients who passed manual approval, s3 represents the number of clients who failed automatic approval, s4 represents the number of clients who failed manual approval, s11 represents the number of clients who defaulted after the loan among those who passed automatic approval, and s21 represents the number of clients who defaulted after the loan among those who passed manual approval.
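As a small worked sketch of these two indexes, assuming the pass rate is the share of approved clients among all clients and the bad rate is the share of post-loan defaulters among approved clients (as described for the actual pass rate and actual bad rate), with a hypothetical function name:

```python
from typing import Tuple


def online_rates(s1: int, s2: int, s3: int, s4: int,
                 s11: int, s21: int) -> Tuple[float, float]:
    """Return (online_accept_rate, online_bad_rate).

    s1, s2   -- clients who passed automatic / manual approval
    s3, s4   -- clients who failed automatic / manual approval
    s11, s21 -- post-loan defaulters among s1 / s2
    """
    total = s1 + s2 + s3 + s4
    approved = s1 + s2
    if total == 0 or approved == 0:
        raise ValueError("rates are undefined without approved clients")
    online_accept_rate = approved / total
    online_bad_rate = (s11 + s21) / approved
    return online_accept_rate, online_bad_rate
```

For instance, with 600 automatic passes, 100 manual passes, 200 automatic rejections, 100 manual rejections, and 30 plus 10 defaulters, the pass rate is 700/1000 and the bad rate is 40/700.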
And predicting the probability of all clients in the Cluster { Cluster_i } by using a second probability prediction model good_model_B_i corresponding to the Cluster { Cluster_i }, sequencing the prediction results from high to low, and dividing the sequencing results into m boxes according to equal frequency, wherein m is an adjustable parameter, and the specific parameters are shown in the following table 7:
Table 7 Bad rate and pass rate statistics
Wherein $score_i$ represents the prediction result of the second probability prediction model good_model_B_i for a client in Cluster_i, and $sum\_bad\_rate_i$ and $sum\_accept\_rate_i$ are the cumulative bad rate and cumulative pass rate corresponding to the bin $[score_i, score_{i-1})$, calculated as follows:
$$sum\_bad\_rate_i = \frac{h_{i1}}{h_{i2}}$$
$$sum\_accept\_rate_i = \frac{h_{i3}}{h_i}$$
Wherein $h_{i1}$ represents the number of clients who actually defaulted among the clients whose good_model_B_i prediction is greater than $score_i$, $h_{i2}$ represents the number of clients whose good_model_B_i prediction is greater than $score_i$, $h_{i3}$ represents the number of clients who were actually approved among the clients whose good_model_B_i prediction is greater than $score_i$, and $h_i$ represents the number of clients in Cluster_i.
The absolute difference between the cumulative bad rate and online_bad_rate and the absolute difference between the cumulative pass rate and online_accept_rate are calculated for each bin in Table 7, as shown in Table 8 below:
Table 8 Bad rate difference and pass rate difference statistics
Where
$$d\_bad_i = \operatorname{abs}(sum\_bad\_rate_i - online\_bad\_rate)$$
$$d\_accept_i = \operatorname{abs}(sum\_accept\_rate_i - online\_accept\_rate)$$
$$sum_i = d\_bad_i + d\_accept_i$$
The difference sums $sum_i$ are then sorted from small to large, and the bin corresponding to the smallest value $\min(sum_i)$ is taken. Assuming that this bin is $[score_i, score_{i-1})$, $score_i$ is taken as the automatic pass approval threshold of the second probability prediction model good_model_B_i corresponding to cluster {Cluster_i}.
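The threshold selection can be sketched as follows. This is an illustrative sketch under stated assumptions: the `Client` record and function name are hypothetical, the cumulative rates follow the definitions of h_i1, h_i2, h_i3, and h_i given above, and bins whose lower limit leaves no client above it are skipped because the cumulative bad rate would be undefined.

```python
from typing import NamedTuple, Optional, Sequence


class Client(NamedTuple):
    score: float      # prediction of good_model_B_i for this client
    approved: bool    # loan application actually approved
    defaulted: bool   # client actually defaulted after the loan


def approval_threshold(clients: Sequence[Client],
                       lower_bounds: Sequence[float],
                       online_bad_rate: float,
                       online_accept_rate: float) -> float:
    """Return the bin lower limit score_i that minimizes d_bad_i + d_accept_i."""
    best_score: Optional[float] = None
    best_sum = float("inf")
    for score_i in lower_bounds:
        selected = [c for c in clients if c.score > score_i]
        if not selected:
            continue  # assumption: skip bins with an undefined bad rate
        # h_i1 / h_i2: defaulters among clients scoring above score_i
        sum_bad_rate = sum(c.defaulted for c in selected) / len(selected)
        # h_i3 / h_i: approved clients above score_i over all cluster clients
        sum_accept_rate = sum(c.approved for c in selected) / len(clients)
        total = (abs(sum_bad_rate - online_bad_rate)
                 + abs(sum_accept_rate - online_accept_rate))
        if total < best_sum:
            best_score, best_sum = score_i, total
    if best_score is None:
        raise ValueError("no bin has clients above its lower limit")
    return best_score
```

The returned value plays the role of the automatic pass approval threshold for the cluster's model.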
When the credit evaluation model fails to give the approval result of a user's loan application, the probability of post-loan non-default of the user can be determined using the good_model_B_i corresponding to the target cluster to which the user belongs. If the probability is greater than the automatic pass approval threshold of good_model_B_i, the approval result of the user's loan application can be determined as approved; if the probability is less than or equal to the automatic pass approval threshold of good_model_B_i, the approval result can be determined as not approved.
It should be noted that, if the second label is used to indicate post-loan default, the second probability prediction model and the corresponding approval threshold of each cluster may be determined in a similar manner.
Corresponding to the loan application processing method provided by the above embodiment, the present disclosure further provides a loan application processing device, and since the loan application processing device provided by the embodiment of the present disclosure corresponds to the loan application processing method provided by the above embodiment, the implementation of the loan application processing method is also applicable to the loan application processing device provided by the embodiment of the present disclosure, which is not described in detail in the embodiment of the present disclosure.
Fig. 4 is a schematic structural view of a loan application processing apparatus shown in a fourth embodiment of the disclosure.
Referring to fig. 4, the loan application processing apparatus 100 may include:
a first obtaining module 410, configured to predict credit for a user applying for a loan based on a credit evaluation model, so as to obtain a credit score corresponding to the user;
the first determining module 420 is configured to determine, according to a first probability prediction model, a first prediction probability of a user belonging to each first label when the credit score is within a preset range, where the first label is used to indicate an approval mode and a corresponding approval result;
A second determining module 430, configured to determine, according to each first prediction probability corresponding to the user, a target cluster to which the user belongs;
a third determining module 440, configured to determine a second predicted probability that the user belongs to a second label by using a second probability prediction model corresponding to the target group, where the second label is used to indicate a post-loan default or a post-loan non-default;
and a fourth determining module 450, configured to determine an approval result of the loan application of the user according to the second prediction probability and the approval threshold corresponding to the second probability prediction model.
Optionally, the second determining module 430 is configured to:
vectorizing the user according to each first prediction probability corresponding to the user so as to obtain a first vector corresponding to the user;
and determining the target cluster from the clusters according to the distances between the first vector and the cluster centers of the clusters.
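A minimal sketch of the target cluster lookup follows. The function name is hypothetical, and the use of Euclidean distance with the nearest center being chosen is an assumption for illustration; the text only states that the target cluster is determined from the distances between the first vector and the cluster centers.

```python
import math
from typing import Sequence


def assign_target_cluster(first_vector: Sequence[float],
                          cluster_centers: Sequence[Sequence[float]]) -> int:
    """Return the index of the cluster center nearest to the user's first vector."""
    if not cluster_centers:
        raise ValueError("at least one cluster center is required")
    return min(range(len(cluster_centers)),
               key=lambda i: math.dist(first_vector, cluster_centers[i]))
```

Here the first vector is the user's four first prediction probabilities, in the same label order as the cluster centers.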
Optionally, the second tag is used to indicate that there is no default after loan, and the fourth determining module 450 is used to:
under the condition that the second prediction probability is greater than or equal to an approval threshold value, determining that the approval result is approval passing;
and under the condition that the second prediction probability is smaller than the approval threshold value, determining that the approval result is that the approval fails.
Optionally, the second tag is used to indicate a post-loan default, and the fourth determination module 450 is used to:
under the condition that the second prediction probability is smaller than or equal to an approval threshold value, determining that the approval result is approval passing;
and under the condition that the second prediction probability is larger than the approval threshold value, determining that the approval result is that the approval fails.
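The two threshold comparisons of the fourth determining module can be summarized in a short sketch. The function and parameter names are hypothetical; the comparison directions follow the two optional configurations listed above.

```python
def approval_result(second_probability: float,
                    approval_threshold: float,
                    label_means_default: bool = False) -> bool:
    """Return True if the loan application is approved.

    label_means_default -- False when the second label indicates post-loan
    non-default, True when it indicates post-loan default.
    """
    if label_means_default:
        # Low predicted default probability means approval.
        return second_probability <= approval_threshold
    # High predicted non-default probability means approval.
    return second_probability >= approval_threshold
```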
Optionally, the apparatus may further include:
a fifth determining module, configured to determine the first label to which each sample user belongs according to the credit prediction results obtained by the credit evaluation model for a plurality of sample users applying for loans;
the first training module is used for training the initial first probability prediction model according to the first label to which each sample user belongs and the characteristic data corresponding to each sample user to obtain a first probability prediction model.
Optionally, the apparatus may further include:
the second obtaining module is used for predicting each sample user by using the first probability prediction model so as to obtain a third prediction probability of each sample user belonging to each first label;
the third obtaining module is used for vectorizing each sample user according to the third prediction probability that each sample user belongs to each first label so as to obtain a second vector corresponding to each sample user;
The clustering module is used for clustering a plurality of sample users to obtain a plurality of clusters;
and a sixth determining module, configured to determine a group center of each group according to a second vector corresponding to the sample user in each group.
Optionally, the apparatus may further include:
the second training module is used for training, for each cluster, the initial second probability prediction model according to the feature data of the first target sample users in the cluster and the second labels to which the first target sample users belong, to obtain a third probability prediction model, wherein the first target sample users are sample users whose first label is a first target label;
a seventh determining module, configured to determine, using the third probability prediction model, a fourth prediction probability that a second target sample user in the cluster belongs to the second label, where the second target sample user is a sample user whose first label is a second target label;
the construction module is used for constructing a training sample set corresponding to the grouping according to the second target sample user, the fourth prediction probability that the second target sample user belongs to the second label, other sample users except the second target sample user in the grouping and the second label to which the other sample users belong;
And the third training module is used for training the initial second probability prediction model by utilizing the characteristic data of the training samples in the training sample set and the second label to which the training samples belong so as to obtain a second probability prediction model corresponding to the grouping.
Optionally, the apparatus may further include:
an eighth determining module, configured to determine, for each group, a fifth prediction probability that each sample user in the group belongs to a second label by using a second probability prediction model corresponding to the group;
the dividing module is used for carrying out equal-frequency division on sample users in the group according to the high-low sequence of the fifth prediction probability to obtain a plurality of sub-boxes;
a ninth determining module, configured to determine, for each bin, a cumulative bad rate corresponding to the bin according to the ratio of the number of third target sample users who defaulted after the loan to the number of third target sample users, where the third target sample users are the sample users in the cluster whose fifth prediction probability is greater than the lower limit probability of the bin;
a tenth determining module, configured to determine a cumulative pass rate corresponding to the bin according to the ratio of the number of third target sample users whose loan applications were approved to the number of sample users in the cluster;
And the eleventh determining module is used for determining the approval threshold according to the cumulative bad rate and the cumulative pass rate corresponding to each bin.
In the embodiment of the disclosure, for a user whose credit evaluation model fails to determine an approval result, a first prediction probability that the user belongs to each first label may be determined through a first probability prediction model, then a target group to which the user belongs is determined according to the first prediction probability, a second prediction probability that the user belongs to the second label is determined according to a second probability prediction model corresponding to the target group, and then an approval result of the user loan application is determined according to an approval threshold value corresponding to the second prediction probability and the second probability prediction model. Therefore, for users whose credit evaluation model fails to determine the approval result, the credit evaluation model can be assisted to process the loan application of the users by utilizing the first probability prediction model and the second probability prediction model, so that the objectivity of the approval result of the loan application is ensured. In addition, the target cluster to which the user belongs is determined by using the prediction result of the first prediction probability, and the probability of the user belonging to the second label is determined by using the second probability prediction model corresponding to the target cluster, so that the accuracy of the prediction result can be improved, and the accuracy of the approval result can be further improved.
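The overall flow summarized above can be sketched end to end as follows. This is only an orchestration sketch: the models are passed in as hypothetical callables, the nearest cluster is chosen by Euclidean distance as an assumption, and the second label is assumed to indicate post-loan non-default, so a probability at or above the threshold means approval.

```python
import math
from typing import Callable, Optional, Sequence, Tuple

Features = Sequence[float]


def process_application(
    features: Features,
    credit_score: float,
    score_range: Tuple[float, float],
    first_model: Callable[[Features], Sequence[float]],
    cluster_centers: Sequence[Sequence[float]],
    second_models: Sequence[Callable[[Features], float]],
    thresholds: Sequence[float],
) -> Optional[bool]:
    """Return True/False for approve/reject, or None when the credit score
    lies outside the preset range and the credit model decides by itself."""
    low, high = score_range
    if not low <= credit_score <= high:
        return None
    # First prediction probabilities, one per first label.
    first_vector = first_model(features)
    # Target cluster: nearest cluster center (distance metric is an assumption).
    target = min(range(len(cluster_centers)),
                 key=lambda i: math.dist(first_vector, cluster_centers[i]))
    # Second prediction probability from the model of the target cluster.
    second_probability = second_models[target](features)
    return second_probability >= thresholds[target]
```

Keeping the models as injected callables leaves the choice of classifier (logistic regression, random forest, XGBoost, and so on) outside the sketch.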
In an exemplary embodiment, an electronic device is also presented.
Wherein, electronic equipment includes:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to execute instructions to implement a loan application processing method as set forth in any of the foregoing embodiments.
As an example, fig. 5 is a schematic structural diagram of an electronic device 500 according to an exemplary embodiment of the present disclosure, where, as shown in fig. 5, the electronic device 500 may further include:
the memory 510 and the processor 520, the bus 530 connecting the different components (including the memory 510 and the processor 520), the memory 510 stores a computer program, and the processor 520 executes the program to implement the loan application processing method according to the embodiments of the disclosure.
Bus 530 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, and not limitation, such architectures include the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the Enhanced ISA bus, the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnect (PCI) bus.
Electronic device 500 typically includes many types of electronic device readable media. Such media can be any available media that is accessible by electronic device 500 and includes both volatile and nonvolatile media, removable and non-removable media.
Memory 510 may also include computer system readable media in the form of volatile memory, such as Random Access Memory (RAM) 540 and/or cache memory 550. Electronic device 500 may further include other removable/non-removable, volatile/nonvolatile computer system storage media. By way of example only, storage system 560 may be used to read from or write to a non-removable, non-volatile magnetic media (not shown in FIG. 5, commonly referred to as a "hard disk drive"). Although not shown in fig. 5, a magnetic disk drive for reading from and writing to a removable non-volatile magnetic disk (e.g., a "floppy disk"), and an optical disk drive for reading from or writing to a removable non-volatile optical disk (e.g., a CD-ROM, DVD-ROM, or other optical media) may be provided. In such cases, each drive may be coupled to bus 530 through one or more data media interfaces. Memory 510 may include at least one program product having a set (e.g., at least one) of program modules configured to carry out the functions of the various embodiments of the disclosure.
A program/utility 580 having a set (at least one) of program modules 570 may be stored in, for example, memory 510, such program modules 570 including, but not limited to, an operating system, one or more application programs, other program modules, and program data, each or some combination of which may include an implementation of a network environment. Program modules 570 generally perform the functions and/or methods in the embodiments described in this disclosure.
Electronic device 500 may also communicate with one or more external devices 590 (e.g., keyboard, pointing device, display 591, etc.), one or more devices that enable a user to interact with electronic device 500, and/or any devices (e.g., network card, modem, etc.) that enable electronic device 500 to communicate with one or more other computing devices. Such communication may occur through an input/output (I/O) interface 592. Also, electronic device 500 may communicate with one or more networks such as a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network, such as the Internet, through network adapter 593. As shown, network adapter 593 communicates with other modules of electronic device 500 via bus 530. It should be appreciated that although not shown, other hardware and/or software modules may be used in connection with electronic device 500, including, but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, data backup storage systems, and the like.
The processor 520 executes various functional applications and data processing by running programs stored in the memory 510.
It should be noted that, for the implementation process and technical principle of the electronic device in this embodiment, reference may be made to the foregoing explanation of the loan application processing method in the embodiments of the disclosure; they are not repeated here.
In an exemplary embodiment, a computer readable storage medium, e.g. a memory, is also provided, comprising instructions executable by a processor of an electronic device to perform the method set forth in any of the embodiments described above. Optionally, the computer readable storage medium may be a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, etc.
In an exemplary embodiment, a computer program product is also provided, comprising a computer program/instruction, characterized in that the computer program/instruction, when executed by a processor, implements the method as set forth in any of the above embodiments.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This disclosure is intended to cover any variations, uses, or adaptations of the disclosure following the general principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It is to be understood that the present disclosure is not limited to the precise arrangements and instrumentalities shown in the drawings, and that various modifications and changes may be effected without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims (19)

1. A loan application processing method, characterized by comprising:
carrying out credit prediction on a user applying for loan based on a credit evaluation model so as to obtain a credit score corresponding to the user;
determining a first prediction probability of the user belonging to each first label through a first probability prediction model under the condition that the credit score is in a preset range, wherein the first labels are used for indicating an approval mode and a corresponding approval result;
determining a target cluster to which the user belongs according to the first prediction probabilities corresponding to the user;
determining a second prediction probability of the user belonging to a second label by using a second probability prediction model corresponding to the target cluster, wherein the second label is used for indicating default after the loan or no default after the loan;
and determining an approval result of the loan application of the user according to the second prediction probability and the approval threshold corresponding to the second probability prediction model.
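The flow of claim 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the callables `credit_model`, `first_model` and `assign_cluster`, the `ClusterModel` container, the inclusive score range and the toy numbers are all hypothetical, and the second label is assumed to mean no default after the loan.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

Features = Dict[str, float]


@dataclass
class ClusterModel:
    """Hypothetical pairing of a cluster's second model with its approval threshold."""
    predict_no_default: Callable[[Features], float]  # assumed P(no default after the loan)
    approval_threshold: float


def process_application(
    features: Features,
    credit_model: Callable[[Features], float],
    score_range: Tuple[float, float],
    first_model: Callable[[Features], List[float]],
    assign_cluster: Callable[[List[float]], str],
    cluster_models: Dict[str, ClusterModel],
) -> Optional[bool]:
    """Return True/False for approve/reject, or None if the score is out of range."""
    score = credit_model(features)
    low, high = score_range
    if not (low <= score <= high):  # assumption: the preset range is inclusive
        return None                 # handling outside the range is not part of this sketch
    first_probs = first_model(features)       # one probability per first label
    cluster_id = assign_cluster(first_probs)  # target cluster of the user
    model = cluster_models[cluster_id]
    second_prob = model.predict_no_default(features)
    return second_prob >= model.approval_threshold


# Toy stand-ins for trained models (illustrative values only).
toy_models = {
    "manual": ClusterModel(lambda f: 0.9, 0.8),
    "auto": ClusterModel(lambda f: 0.6, 0.7),
}
decision = process_application(
    {"income": 1.0},
    credit_model=lambda f: 650.0,
    score_range=(600.0, 700.0),
    first_model=lambda f: [0.7, 0.2, 0.1],
    assign_cluster=lambda probs: "manual" if probs[0] > 0.5 else "auto",
    cluster_models=toy_models,
)
```

Keeping the threshold next to the model of each cluster reflects the claim wording that the approval threshold corresponds to the second probability prediction model.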
2. The method of claim 1, wherein determining the target cluster to which the user belongs according to the first prediction probabilities corresponding to the user comprises:
vectorizing the user according to each first prediction probability corresponding to the user so as to obtain a first vector corresponding to the user;
and determining the target cluster from the clusters according to the distances between the first vector and the cluster centers of the clusters.
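The assignment step of claim 2 amounts to a nearest-center lookup. The sketch below assumes Euclidean distance, which the claim does not specify, and uses hypothetical cluster identifiers and center values.

```python
import math
from typing import Dict, Sequence


def nearest_cluster(first_vector: Sequence[float],
                    centers: Dict[str, Sequence[float]]) -> str:
    """Return the id of the cluster whose center is closest to the user's first vector."""
    # Assumption: Euclidean distance; another metric could be substituted here.
    return min(centers, key=lambda cid: math.dist(first_vector, centers[cid]))


# The first vector is the list of first prediction probabilities of the user.
example_centers = {"a": [0.8, 0.1, 0.1], "b": [0.1, 0.8, 0.1]}
target = nearest_cluster([0.7, 0.2, 0.1], example_centers)
```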
3. The method of claim 1, wherein the second label is used to indicate no default after the loan, and the determining the approval result of the loan application of the user according to the second prediction probability and the approval threshold corresponding to the second probability prediction model comprises:
determining that the approval result is approval passing under the condition that the second prediction probability is greater than or equal to the approval threshold value;
and under the condition that the second prediction probability is smaller than the approval threshold value, determining that the approval result is that the approval fails.
4. The method of claim 1, wherein the second label is used to indicate default after the loan, and the determining the approval result of the loan application of the user according to the second prediction probability and the approval threshold corresponding to the second probability prediction model comprises:
under the condition that the second prediction probability is smaller than or equal to the approval threshold value, determining that the approval result is approval passing;
and under the condition that the second prediction probability is larger than the approval threshold value, determining that the approval result is that the approval fails.
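Claims 3 and 4 differ only in the direction of the comparison, which depends on what the second label stands for. A compact sketch, with the function name and the boolean flag chosen purely for illustration:

```python
def is_approved(second_probability: float,
                approval_threshold: float,
                label_means_default: bool) -> bool:
    """Compare the second prediction probability with the approval threshold."""
    if label_means_default:
        # Claim 4: probability of default, so lower is better.
        return second_probability <= approval_threshold
    # Claim 3: probability of no default, so higher is better.
    return second_probability >= approval_threshold
```

In both cases a probability exactly equal to the threshold leads to approval.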
5. The method of any one of claims 1-4, further comprising:
determining a first label to which each sample user belongs based on credit prediction results of the credit evaluation model on a plurality of sample users applying for loans;
training an initial first probability prediction model according to a first label to which each sample user belongs and characteristic data corresponding to each sample user to obtain the first probability prediction model.
6. The method as recited in claim 5, further comprising:
predicting each sample user by using the first probability prediction model to obtain a third prediction probability of each sample user belonging to each first label;
vectorizing each sample user according to a third prediction probability that each sample user belongs to each first label so as to obtain a second vector corresponding to each sample user;
clustering the plurality of sample users to obtain a plurality of clusters;
and determining the cluster center of each cluster according to the second vectors corresponding to the sample users in that cluster.
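The clustering of claim 6 can be illustrated with a small k-means routine over the second vectors. The claim does not name a clustering algorithm, so k-means, the Euclidean metric, the naive initialisation and the sample vectors are all assumptions of this sketch.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def mean_vector(vectors: List[Vector]) -> List[float]:
    """Component-wise mean, used as the cluster center."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def kmeans(vectors: List[Vector], k: int,
           iterations: int = 20) -> Tuple[List[int], List[List[float]]]:
    """Plain Lloyd iterations; returns (cluster index per user, cluster centers)."""
    centers = [list(v) for v in vectors[:k]]  # naive deterministic initialisation
    assignment = [0] * len(vectors)
    for _ in range(iterations):
        # Assign every second vector to its nearest current center.
        assignment = [
            min(range(k), key=lambda c: math.dist(v, centers[c])) for v in vectors
        ]
        # Recompute each center from the second vectors of its members.
        for c in range(k):
            members = [v for v, a in zip(vectors, assignment) if a == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = mean_vector(members)
    return assignment, centers


# Each row holds the third prediction probabilities of one sample user.
second_vectors = [[0.9, 0.1], [0.1, 0.9], [0.8, 0.2], [0.2, 0.8]]
assignment, centers = kmeans(second_vectors, k=2)
```

The returned centers are the ones later compared against the first vector of a new applicant.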
7. The method as recited in claim 6, further comprising:
for each cluster, training an initial second probability prediction model according to the characteristic data of first target sample users in the cluster and the second labels to which the first target sample users belong to obtain a third probability prediction model, wherein a first target sample user refers to a sample user whose first label is a first target label;
determining a fourth prediction probability of a second target sample user in the cluster belonging to a second label by using the third probability prediction model, wherein the second target sample user refers to a sample user whose first label is a second target label;
constructing a training sample set corresponding to the cluster according to the second target sample user, the fourth prediction probability that the second target sample user belongs to the second label, other sample users in the cluster except the second target sample user, and the second labels to which the other sample users belong;
and training the initial second probability prediction model by using the characteristic data of the training samples in the training sample set and the second labels to which the training samples belong, so as to obtain the second probability prediction model corresponding to the cluster.
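The two-stage training of claim 7 resembles reject inference. The sketch below is one possible reading, not the claimed procedure: the claim does not say how the fourth prediction probability enters the training sample set, so the sketch assumes fuzzy augmentation, in which each second target sample user appears twice, once per label value, weighted by the predicted probability. The `fit` callable and the toy `fit_base_rate` learner are hypothetical placeholders for a real model.

```python
from typing import Callable, List, Sequence, Tuple

Row = Sequence[float]
Predict = Callable[[Row], float]
Fit = Callable[[List[Row], List[int], List[float]], Predict]


def train_cluster_model(fit: Fit,
                        first_target: List[Tuple[Row, int]],
                        second_target: List[Row],
                        others: List[Tuple[Row, int]]) -> Predict:
    """Train the third model, infer soft labels, then train the second model."""
    xs = [x for x, _ in first_target]
    ys = [y for _, y in first_target]
    third_model = fit(xs, ys, [1.0] * len(xs))

    # Sample users with an observed second label enter the set with weight 1.
    train_x = [x for x, _ in others]
    train_y = [y for _, y in others]
    train_w = [1.0] * len(others)

    # Assumption: fuzzy augmentation for users without an observed label.
    for x in second_target:
        p = third_model(x)  # fourth prediction probability
        train_x += [x, x]
        train_y += [1, 0]
        train_w += [p, 1.0 - p]
    return fit(train_x, train_y, train_w)


def fit_base_rate(xs: List[Row], ys: List[int], ws: List[float]) -> Predict:
    """Toy learner that ignores features and predicts the weighted label rate."""
    rate = sum(w * y for y, w in zip(ys, ws)) / sum(ws)
    return lambda x: rate


labelled = [([1.0], 1), ([2.0], 1), ([3.0], 0), ([4.0], 1)]
second_model = train_cluster_model(fit_base_rate, labelled, [[5.0], [6.0]], labelled)
```

Any learner that accepts sample weights could replace `fit_base_rate`; the toy learner only keeps the example self-contained.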
8. The method as recited in claim 7, further comprising:
for each cluster, determining a fifth prediction probability of each sample user in the cluster belonging to a second label by using the second probability prediction model corresponding to the cluster;
performing equal-frequency division on the sample users in the cluster in descending order of the fifth prediction probability to obtain a plurality of bins;
for each bin, determining the accumulated reject ratio corresponding to the bin according to the ratio of the number of sample users defaulting after the loan among third target sample users to the number of the third target sample users, wherein the third target sample users are the sample users in the cluster whose fifth prediction probability is greater than the lower limit probability of the bin;
determining the accumulated passing rate corresponding to the bin according to the ratio of the number of sample users whose loan application was approved among the third target sample users to the number of sample users in the cluster;
and determining the approval threshold according to the accumulated reject ratio and the accumulated passing rate corresponding to each bin.
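The binning statistics of claim 8 can be worked through on a toy cluster. Several points are assumptions of this sketch rather than statements of the claim: the fifth prediction probability is taken as the probability of no default, the lower limit of a bin is treated as inclusive (the claim text says "greater than"), and the final selection rule with its hypothetical `max_reject_ratio` parameter is invented for illustration because the claim leaves the rule open.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Sample:
    prob: float       # fifth prediction probability, assumed P(no default)
    defaulted: bool   # defaulted after the loan
    approved: bool    # loan application was approved


@dataclass
class BinStats:
    lower_limit: float
    accumulated_reject_ratio: float
    accumulated_passing_rate: float


def bin_statistics(samples: List[Sample], n_bins: int) -> List[BinStats]:
    """Equal-frequency bins in descending probability order, with cumulative rates."""
    ordered = sorted(samples, key=lambda s: s.prob, reverse=True)
    total = len(ordered)
    stats = []
    for b in range(n_bins):
        start, end = b * total // n_bins, (b + 1) * total // n_bins
        if start == end:
            continue  # more bins than samples
        lower = ordered[end - 1].prob
        # Third target sample users: everyone at or above this bin's lower limit.
        selected = [s for s in ordered if s.prob >= lower]
        bad = sum(s.defaulted for s in selected)
        passed = sum(s.approved for s in selected)
        stats.append(BinStats(lower, bad / len(selected), passed / total))
    return stats


def pick_threshold(stats: List[BinStats], max_reject_ratio: float) -> Optional[float]:
    """Hypothetical rule: lowest lower limit whose reject ratio stays acceptable."""
    ok = [s.lower_limit for s in stats if s.accumulated_reject_ratio <= max_reject_ratio]
    return min(ok) if ok else None


probs = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
toy_cluster = [Sample(p, defaulted=(p < 0.45), approved=True) for p in probs]
toy_stats = bin_statistics(toy_cluster, n_bins=4)
threshold = pick_threshold(toy_stats, max_reject_ratio=0.2)
```

Lowering the threshold raises the accumulated passing rate but also the accumulated reject ratio, which is why both quantities are needed to choose it.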
9. A loan application processing apparatus, comprising:
the first acquisition module is used for carrying out credit prediction on a user applying for loan based on a credit evaluation model so as to acquire a credit score corresponding to the user;
the first determining module is used for determining a first prediction probability of the user belonging to each first label through a first probability prediction model under the condition that the credit score is in a preset range, wherein the first labels are used for indicating an approval mode and a corresponding approval result;
the second determining module is used for determining a target cluster to which the user belongs according to the first prediction probabilities corresponding to the user;
the third determining module is used for determining a second prediction probability of the user belonging to a second label by using a second probability prediction model corresponding to the target cluster, wherein the second label is used for indicating default after the loan or no default after the loan;
and the fourth determining module is used for determining the approval result of the loan application of the user according to the second prediction probability and the approval threshold corresponding to the second probability prediction model.
10. The apparatus of claim 9, wherein the second determination module is to:
vectorizing the user according to each first prediction probability corresponding to the user so as to obtain a first vector corresponding to the user;
and determining the target cluster from the clusters according to the distances between the first vector and the cluster centers of the clusters.
11. The apparatus of claim 9, wherein the second label is used to indicate no default after the loan, and the fourth determination module is configured to:
determining that the approval result is approval passing under the condition that the second prediction probability is greater than or equal to the approval threshold value;
and under the condition that the second prediction probability is smaller than the approval threshold value, determining that the approval result is that the approval fails.
12. The apparatus of claim 9, wherein the second label is used to indicate default after the loan, and the fourth determination module is configured to:
under the condition that the second prediction probability is smaller than or equal to the approval threshold value, determining that the approval result is approval passing;
and under the condition that the second prediction probability is larger than the approval threshold value, determining that the approval result is that the approval fails.
13. The apparatus of any one of claims 9-12, further comprising:
a fifth determining module, configured to determine, based on credit prediction results of the credit evaluation model on a plurality of sample users applying for loans, a first tag to which each of the sample users belongs;
and the first training module is used for training the initial first probability prediction model according to the first label to which each sample user belongs and the characteristic data corresponding to each sample user to obtain the first probability prediction model.
14. The apparatus as recited in claim 13, further comprising:
the second obtaining module is used for predicting each sample user by using the first probability prediction model so as to obtain a third prediction probability of each sample user belonging to each first label;
the third obtaining module is used for vectorizing each sample user according to a third prediction probability that each sample user belongs to each first label so as to obtain a second vector corresponding to each sample user;
the clustering module is used for clustering a plurality of sample users to obtain a plurality of clusters;
and a sixth determining module, configured to determine the cluster center of each cluster according to the second vectors corresponding to the sample users in that cluster.
15. The apparatus as recited in claim 14, further comprising:
the second training module is used for training, for each cluster, an initial second probability prediction model according to the characteristic data of first target sample users in the cluster and the second labels to which the first target sample users belong, so as to obtain a third probability prediction model, wherein a first target sample user refers to a sample user whose first label is a first target label;
a seventh determining module, configured to determine, using the third probability prediction model, a fourth prediction probability that a second target sample user in the cluster belongs to a second label, where the second target sample user is a sample user whose first label is a second target label;
the construction module is used for constructing a training sample set corresponding to the cluster according to the second target sample user, the fourth prediction probability that the second target sample user belongs to the second label, other sample users in the cluster except the second target sample user, and the second labels to which the other sample users belong;
and the third training module is used for training the initial second probability prediction model by using the characteristic data of the training samples in the training sample set and the second labels to which the training samples belong, so as to obtain the second probability prediction model corresponding to the cluster.
16. The apparatus as recited in claim 15, further comprising:
an eighth determining module, configured to determine, for each cluster, a fifth prediction probability that each sample user in the cluster belongs to a second label by using the second probability prediction model corresponding to the cluster;
the dividing module is used for performing equal-frequency division on the sample users in the cluster in descending order of the fifth prediction probability to obtain a plurality of bins;
a ninth determining module, configured to determine, for each bin, the accumulated reject ratio corresponding to the bin according to the ratio of the number of sample users defaulting after the loan among third target sample users to the number of the third target sample users, where the third target sample users are the sample users in the cluster whose fifth prediction probability is greater than the lower limit probability of the bin;
a tenth determining module, configured to determine the accumulated passing rate corresponding to the bin according to the ratio of the number of sample users whose loan application was approved among the third target sample users to the number of sample users in the cluster;
and the eleventh determining module is used for determining the approval threshold according to the accumulated reject ratio and the accumulated passing rate corresponding to each bin.
17. An electronic device, comprising:
a processor;
a memory for storing the processor-executable instructions;
wherein the processor is configured to execute the instructions to implement the loan application processing method of any one of claims 1 to 8.
18. A computer readable storage medium storing instructions which, when executed by a processor of an electronic device, cause the electronic device to perform the loan application processing method of any one of claims 1-8.
19. A computer program product comprising a computer program which, when executed by a processor, implements a loan application processing method as claimed in any one of claims 1-8.
CN202311176331.6A 2023-09-13 2023-09-13 Loan application processing method, device, electronic equipment and storage medium Pending CN117291707A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311176331.6A CN117291707A (en) 2023-09-13 2023-09-13 Loan application processing method, device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN117291707A true CN117291707A (en) 2023-12-26

Family

ID=89251003

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311176331.6A Pending CN117291707A (en) 2023-09-13 2023-09-13 Loan application processing method, device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN117291707A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination