CN112102073A

CN112102073A - Credit risk control method and system, electronic device and readable storage medium

Info

Publication number: CN112102073A
Application number: CN202011035747.2A
Authority: CN
Inventors: 李少帅; 张博; 张胜庆; 曹家楷; 黄慕宇; 张帆
Original assignee: Changan Automobile Finance Co ltd
Current assignee: Changan Automobile Finance Co ltd
Priority date: 2020-09-27
Filing date: 2020-09-27
Publication date: 2020-12-18

Abstract

The embodiment of the invention provides a credit risk control method and system, an electronic device and a readable storage medium. The credit risk control method comprises the following steps: acquiring credit information of a user based on basic identity information and authorization information; extracting original features from the credit information, eliminating invalid features, and taking the resulting important features as feature data; extracting an initial application amount from the basic credit application information of the user; inputting the feature data of the user into a preset credit risk scoring model to obtain a credit scoring result of the user; obtaining a risk adjustment factor based on an established and trained risk adjustment factor model; and correcting the initial application amount of the user by using the risk adjustment factor. The embodiment adjusts the credit granting level of the user according to the user's risk level so as to strike a balance between risk and income, and can efficiently, automatically and reasonably maximize the lender's benefit while keeping risk controllable.

Description

Credit risk control method and system, electronic device and readable storage medium

Technical Field

The invention relates to a risk pricing system applied to credit scenarios, and belongs to the technical field of intelligent risk control.

Background

Traditionally, credit limits and credit interest rates are determined by manual review. After a user submits a loan application, lenders such as banks spend a large amount of manpower and material resources on document review, background investigation, visits and the like, and the user's loan amount is corrected on the basis of industry experience. As a result, automatic, personalized and reasonable risk pricing according to the user's credit qualification is difficult to achieve, and so is maximizing the lender's benefit while keeping risk controllable. On the other hand, borrowers such as individual users or legal-entity users often wait weeks between submitting a credit application and receiving the approval result, which makes for a very poor experience.

With the continuous accumulation of multi-dimensional user data and the development of artificial intelligence technology, intelligent risk control technology is receiving more and more attention and application. At present, it mainly focuses on comprehensively measuring a user's credit risk by aggregating data of different dimensions (such as basic attribute data, historical credit behavior data, social behavior data and the like) so as to replace manual review. In other words, the output of intelligent risk control is the user's risk rating result, which by itself cannot maximize the lender's benefit. For example, customer A applies for a loan of 100,000 yuan, and the intelligent risk control rating shows that the probability of the customer becoming overdue at that amount is low. In reality, however, the probability of customer A becoming overdue would still be low with a loan of 150,000 yuan, so the lender could pursue additional benefit.

The pain point and difficulty of existing intelligent risk control systems is how to use a good algorithm to effectively improve the ability to distinguish between users: how to raise the identification rate of high-credit-risk users while avoiding wrongly rejecting low-risk users, that is, how to improve the recall rate and the accuracy of the algorithm at the same time, so that the lender's benefit is maximized while risk remains controllable and individualized pricing for every user ("a thousand people, a thousand prices") is realized efficiently, automatically and reasonably.

Disclosure of Invention

The embodiment of the invention provides a credit risk control method and system, an electronic device and a readable storage medium, which can improve the recall rate and the accuracy of the algorithm and efficiently, automatically and reasonably maximize the lender's benefit.

The embodiment of the invention provides a credit risk control method, which comprises the following steps:

S1, obtaining credit information of the user based on the basic identity information and the authorization information submitted by the user, wherein the credit information comprises basic credit application information, behavioral performance information and financial product related information;

S2, extracting N original features from the credit information, and processing the N original features with a binning algorithm with the maximum K-S value to obtain binning result features, wherein N is a positive integer; processing the binning result features with a cross feature derivation algorithm to obtain derived cross features; combining the derived cross features, the binning result features and the N original features, and eliminating invalid features to obtain important features serving as feature data; and extracting an initial application amount from the basic credit application information of the user;

S3, inputting the feature data of the user into a preset credit risk scoring model to obtain a credit scoring result of the user; and evaluating the credit risk level of the user according to the credit scoring result;

S4, based on an established and trained risk adjustment factor model, when the credit risk level of the user is not the first level, inputting the credit scoring result of the user into the risk adjustment factor model to obtain a risk adjustment factor; and correcting the initial application amount of the user by using the risk adjustment factor to obtain a corrected pricing level.

According to the credit risk control method provided by the embodiment of the invention, extracting N original features from the credit information and processing the N original features with a binning algorithm with the maximum K-S value to obtain binning result features, wherein N is a positive integer, specifically comprises:

extracting all N items of attribute information in the credit information to form N original features;

and processing the N original features with the binning algorithm with the maximum K-S value based on the following formula:

{f₁^cut, f₂^cut, …, f_i^cut, …, f_N^cut}＝F_{cut_bin}({f₁, f₂, f₃, …, f_i, …, f_N})

wherein {f₁, f₂, f₃, …, f_i, …, f_N} is the set of N original features, f_i is the i-th of the N original features, 0 < i ≤ N, {f₁^cut, f₂^cut, …, f_N^cut} is the set of binning result features, f_i^cut is the binning result corresponding to the original feature f_i, and F_{cut_bin} is the binning algorithm with the maximum K-S value.
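The patent does not spell out how F_{cut_bin} is computed. The following is a minimal sketch of one common reading of maximum K-S binning, assuming binary labels (1 for an overdue user, 0 for a good user) and a recursive split on the cut value that maximizes the gap between the cumulative bad and good distributions. The function names, `max_depth` and `min_size` are illustrative assumptions, not taken from the source.

```python
from bisect import bisect_left
from typing import List, Optional, Sequence, Tuple


def best_ks_split(values: Sequence[float],
                  labels: Sequence[int]) -> Optional[Tuple[float, float]]:
    """Return (cut, ks) for the cut value with the largest K-S statistic.

    labels: 1 = bad (overdue) user, 0 = good user. None if no cut exists.
    """
    total_bad = sum(labels)
    total_good = len(labels) - total_bad
    if total_bad == 0 or total_good == 0:
        return None
    pairs = sorted(zip(values, labels))
    best = None
    cum_bad = cum_good = 0
    for idx in range(len(pairs) - 1):
        value, label = pairs[idx]
        cum_bad += label
        cum_good += 1 - label
        if value == pairs[idx + 1][0]:
            continue  # a cut is only possible between two distinct values
        ks = abs(cum_bad / total_bad - cum_good / total_good)
        if best is None or ks > best[1]:
            best = (value, ks)
    return best


def ks_cut_points(values: Sequence[float], labels: Sequence[int],
                  max_depth: int = 2, min_size: int = 2) -> List[float]:
    """Recursively split on the max-K-S cut; returns sorted cut points."""
    found = best_ks_split(values, labels)
    if max_depth == 0 or found is None:
        return []
    cut = found[0]
    left = [(v, y) for v, y in zip(values, labels) if v <= cut]
    right = [(v, y) for v, y in zip(values, labels) if v > cut]
    if len(left) < min_size or len(right) < min_size:
        return []
    left_v, left_y = zip(*left)
    right_v, right_y = zip(*right)
    return (ks_cut_points(left_v, left_y, max_depth - 1, min_size)
            + [cut]
            + ks_cut_points(right_v, right_y, max_depth - 1, min_size))


def assign_bin(value: float, cuts: Sequence[float]) -> int:
    """Bin index of a value; values equal to a cut fall in the lower bin."""
    return bisect_left(cuts, value)
```

With `max_depth = d` the sketch yields at most 2^d bins per feature; a production binning step would typically also enforce a minimum bad-rate difference between neighboring bins.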

According to the credit risk control method provided by the embodiment of the invention, processing the binning result features with the cross feature derivation algorithm to obtain derived cross features specifically comprises:

processing the binning result features with the cross feature derivation algorithm based on the following formula:

{f₁^gen, f₂^gen, …, f_T^gen}＝P_gen({f₁^cut, f₂^cut, …, f_N^cut})

wherein {f₁^cut, f₂^cut, …, f_N^cut} is the set of binning result features, {f₁^gen, f₂^gen, …, f_T^gen} is the set of derived cross features, T is a positive integer, and P_gen is the cross feature derivation algorithm.
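The source leaves P_gen unspecified. As one possible illustration, the sketch below assumes the simplest form of crossing: every pair of binned features is combined into a single categorical feature whose value is the pair of bin indices. Under that assumption T equals N(N-1)/2. The function name and the `_x_` naming scheme are hypothetical.

```python
from itertools import combinations
from typing import Dict


def derive_cross_features(binned: Dict[str, int]) -> Dict[str, str]:
    """Cross every pair of binned features into one categorical feature.

    binned maps a feature name to the bin index of one user's value.
    """
    crossed = {}
    for (name_a, bin_a), (name_b, bin_b) in combinations(sorted(binned.items()), 2):
        crossed[f"{name_a}_x_{name_b}"] = f"{bin_a}_{bin_b}"
    return crossed


user_bins = {"age": 1, "income": 3, "queries": 0}
print(derive_cross_features(user_bins))
```

Pairwise crossing grows quadratically with N, which is one reason a subsequent step that eliminates invalid features is needed.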

According to the credit risk control method of the embodiment of the invention, combining the derived cross features, the binning result features and the N original features and eliminating invalid features to obtain important features specifically comprises:

combining the derived cross features, the binning result features and the N original features to obtain combined features;

evaluating the importance of the combined features by using any one algorithm, or any combination of algorithms, among a chi-square test algorithm, an information gain algorithm, an IV value algorithm, a gradient boosting tree algorithm, a feature PSI index algorithm, a feature variance algorithm, a Pearson correlation coefficient algorithm and a maximal information coefficient algorithm;

and, based on the evaluation result, eliminating invalid features and keeping important features.
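As an illustration of one of the listed options, the sketch below computes the IV (information value) of a binned feature and keeps the features whose IV reaches a threshold. The zero-count floor of 0.5 and the threshold of 0.02 are assumptions chosen for the example (0.02 is a commonly quoted rule of thumb); the patent does not state any values.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence


def information_value(bins: Sequence[int], labels: Sequence[int],
                      floor: float = 0.5) -> float:
    """IV = sum over bins of (p_good - p_bad) * ln(p_good / p_bad).

    labels: 1 = bad user, 0 = good user. Empty cells are floored to
    `floor` observations so that the logarithm stays finite.
    """
    good = Counter(b for b, y in zip(bins, labels) if y == 0)
    bad = Counter(b for b, y in zip(bins, labels) if y == 1)
    total_good, total_bad = sum(good.values()), sum(bad.values())
    if total_good == 0 or total_bad == 0:
        raise ValueError("both good and bad samples are required")
    iv = 0.0
    for b in set(bins):
        p_good = max(good[b], floor) / total_good
        p_bad = max(bad[b], floor) / total_bad
        iv += (p_good - p_bad) * math.log(p_good / p_bad)
    return iv


def select_important(features: Dict[str, Sequence[int]],
                     labels: Sequence[int],
                     threshold: float = 0.02) -> List[str]:
    """Keep the names of features whose IV reaches the threshold."""
    return [name for name, bins in features.items()
            if information_value(bins, labels) >= threshold]
```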

According to the credit risk control method provided by the embodiment of the invention, the implementation process of inputting the characteristic data of the user into a preset credit risk scoring model to obtain the credit scoring result of the user comprises the following steps:

S_score＝F_score[P_{feature-engineer}(f₁,f₂,f₃…)]

wherein f₁, f₂, f₃, … are several items of feature data of a user, P_{feature-engineer} is a feature engineering algorithm comprising data preprocessing, feature derivation and feature selection, S_score is the credit scoring result of the user, and F_score is a preset credit risk scoring model.

According to the credit risk control method provided by the embodiment of the invention, the credit risk level of the user is evaluated according to the credit scoring result as follows:

L_credit＝F_credit(S_score)

wherein S_score is the credit scoring result of the user, L_credit is the credit risk level of the user, L_credit belongs to the set {L_reject, L_careful, L_common, L_low, L_bypass}, in which L_reject, L_careful, L_common, L_low and L_bypass respectively represent different credit risk levels, and F_credit represents a preset credit risk level model.
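The form of F_credit is not given in the source. A minimal sketch, assuming that a higher score means lower risk and that the five levels are separated by four fixed cutoffs, could look as follows. The cutoff values are purely hypothetical.

```python
from bisect import bisect_right

# Ordered from highest risk to lowest risk.
LEVELS = ["L_reject", "L_careful", "L_common", "L_low", "L_bypass"]
# Hypothetical score cutoffs; the patent does not state any values.
CUTOFFS = [450, 550, 650, 750]


def credit_level(score: float) -> str:
    """Map a credit score to one of the five credit risk levels."""
    return LEVELS[bisect_right(CUTOFFS, score)]


for s in (400, 450, 700, 800):
    print(s, credit_level(s))
```

In this sketch a score exactly on a cutoff is assigned to the higher (lower-risk) level.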

According to the credit risk control method provided by the embodiment of the invention, the specific implementation process of inputting the credit scoring result of the user into the risk adjustment factor model to obtain the risk adjustment factor is as follows:

R＝G_factor(S_score,α,β，…)

wherein S_score is the credit scoring result of the user, α and β are parameters or hyper-parameters of the risk adjustment factor model, R is the output risk adjustment factor of the user, and G_factor is the preset user risk adjustment factor model.
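The patent only names G_factor and its parameters; it does not give a functional form. The sketch below assumes, purely for illustration, a logistic curve in the score in which α sets the steepness and β the score at which the factor sits midway between a lower bound `r_min` and an upper bound `r_max`. The bounds and the example parameter values are hypothetical.

```python
import math


def risk_adjustment_factor(score: float, alpha: float, beta: float,
                           r_min: float = 0.5, r_max: float = 1.5) -> float:
    """Hypothetical G_factor: a logistic curve in the credit score.

    alpha controls the steepness and beta is the score at which R lies
    midway between r_min and r_max. None of this is from the patent.
    """
    return r_min + (r_max - r_min) / (1.0 + math.exp(-alpha * (score - beta)))


print(risk_adjustment_factor(600, alpha=0.01, beta=600))
```

A bounded, monotonically increasing curve matches the stated intent that better credit qualification raises the credit granting level and poorer qualification lowers it, but any trained model with that behavior would serve equally well.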

The embodiment of the invention provides a credit risk control system, which comprises:

a credit information acquisition module, used for acquiring credit information of the user based on basic identity information and authorization information submitted by the user, wherein the credit information comprises basic credit application information, behavioral performance information and financial product related information;

a feature information extraction module, connected with the credit information acquisition module and used for extracting N original features from the credit information and processing the N original features with a binning algorithm with the maximum K-S value to obtain binning result features, wherein N is a positive integer; processing the binning result features with a cross feature derivation algorithm to obtain derived cross features; combining the derived cross features, the binning result features and the N original features, and eliminating invalid features to obtain important features serving as feature data; and extracting an initial application amount from the basic credit application information of the user;

a risk level evaluation module, connected with the feature information extraction module and used for inputting the feature data of the user into a preset credit risk scoring model to obtain a credit scoring result of the user, and evaluating the credit risk level of the user according to the credit scoring result;

a risk adjustment control module, connected with the feature information extraction module and the risk level evaluation module respectively and used for, based on an established and trained risk adjustment factor model, inputting the credit scoring result of the user into the risk adjustment factor model to obtain a risk adjustment factor when the credit risk level of the user is not the first level, that is, when the credit granting decision is not a rejection; and correcting the initial application amount of the user by using the risk adjustment factor to obtain a corrected pricing level.

An embodiment of the present invention provides an electronic device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of the credit risk control method when executing the program.

Embodiments of the present invention provide a non-transitory computer readable storage medium having stored thereon a computer program that, when executed by a processor, performs the steps of the credit risk control method described herein.

The credit risk control method and system, the electronic device and the readable storage medium provided by the embodiment of the invention extract feature data from the credit information of the user, apply the binning algorithm with the maximum K-S value to the feature data to obtain binning result features, apply the cross feature derivation algorithm to the binning result features to obtain derived cross features, and finally combine the feature data, the binning result features and the derived cross features and eliminate invalid features to obtain important features, which are input into the preset credit scoring and level model to obtain the credit level of the user. The embodiment of the invention combines the feature data, the binning result features and the derived cross features, and uses the binning algorithm with the maximum K-S value, thereby improving the ability of the risk assessment algorithm to distinguish good users from bad users. The method provided by the invention therefore improves the recall rate and the accuracy of the algorithm of the intelligent risk control system.

According to the embodiment of the invention, the credit scoring result of the user is input into the risk adjustment factor model to obtain the risk adjustment factor, and the credit granting level of the user is corrected by using the risk adjustment factor. After correction by the risk adjustment factor, the credit risk level of the user is essentially unchanged, but differentiated risk pricing is applied: if the credit qualification of the user is good, the credit granting level is raised further, thereby increasing income; if the credit qualification of the user is poor, the credit granting level is lowered further.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.

FIG. 1 is a schematic diagram of a credit risk control method provided by an embodiment of the invention;

FIG. 2 is a schematic diagram of a credit risk control system provided by an embodiment of the invention;

FIG. 3 is a flowchart of the operation of an automated credit risk assessment system provided by an embodiment of the invention;

FIG. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

FIG. 1 is a schematic flow chart of a credit risk control method according to an embodiment of the present invention. As shown in FIG. 1, the method includes:

S1, obtaining credit information of the user based on the basic identity information and the authorization information submitted by the user, wherein the credit information comprises basic credit application information, behavioral performance information and financial product related information.

Specifically, the feature data of the user is extracted from the credit information of the user, which is obtained in S1 based on the basic identity information and the authorization information submitted by the user. For risk assessment, the user first submits basic identity information, which generally includes the user's name, identity card number and mobile phone number. Once the user has granted authorization, the user's gender and age are parsed from the identity card number, and the user's credit investigation behavioral performance data and third-party platform behavioral performance data are then retrieved based on the identity card number and the mobile phone number. Together these data make up the credit information of the user. The credit information is divided into three categories: basic credit application information, behavioral performance information and financial product related information. The basic credit application information includes the user's age, gender, income level, education level, etc.; the behavioral performance information includes multi-lender loan intent data, historical overdue performance data, bank card data, credit card data, guarantor data, asset disposal data, and the like; the financial product related information includes the financial product interest rate, the loan amount, the loan term, and the like.
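Parsing gender and age from the identity card number can be sketched as follows, assuming the 18-character resident identity number format, in which characters 7 to 14 hold the birth date as YYYYMMDD and the 17th character is odd for male and even for female. The function name is hypothetical, the number used below is a made-up example, and the check character is not validated in this sketch.

```python
from datetime import date
from typing import Tuple


def parse_id_number(id_number: str, today: date) -> Tuple[str, int]:
    """Return (gender, age in full years) parsed from an 18-character ID."""
    if len(id_number) != 18 or not id_number[:17].isdigit():
        raise ValueError("expected an 18-character identity number")
    birth = date(int(id_number[6:10]), int(id_number[10:12]), int(id_number[12:14]))
    gender = "male" if int(id_number[16]) % 2 == 1 else "female"
    # Subtract one year if this year's birthday has not been reached yet.
    before_birthday = (today.month, today.day) < (birth.month, birth.day)
    age = today.year - birth.year - int(before_birthday)
    return gender, age


# Made-up number for illustration only; the final check character is not verified.
print(parse_id_number("11010119900307123X", date(2020, 9, 27)))
```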

S2, extracting N original features from the credit information, and processing the N original features with a binning algorithm with the maximum K-S value to obtain binning result features, wherein N is a positive integer; processing the binning result features with a cross feature derivation algorithm to obtain derived cross features; combining the derived cross features, the binning result features and the N original features, and eliminating invalid features to obtain important features serving as feature data; and extracting the initial application amount from the basic credit application information of the user.

Specifically, feature data is extracted from the credit information, with each attribute in the credit information corresponding to one item of feature data. The attributes are extracted from the three types of information, namely basic credit application information, behavioral performance information and financial product related information.

The attributes in the basic credit application information include: name; identity card number; mobile phone number; marital status; mailing address; registered household address; the province, city and district where the user applies for the loan; gender; age; educational background; highest degree; whether the user holds local household registration; industry of the employer; years of service (with the employer); monthly personal income; and monthly household expenses.

The attributes in the behavioral performance information include: average credit limit utilization of credit card products in the last 6 months; maximum overdue number of credit products in the last 12 months and in the last 24 months; maximum account age of all credit products; cumulative overdue number of credit products in the last 6, 12 and 24 months; cumulative number of credit approval inquiries in the last 3, 6 and 12 months; whether the user (credit report) has credit; current overdue number of the user's loans; the user's (credit report) loan status; current overdue number of the user's credit cards; the user's (credit report) credit card status; maximum overdue number of the user's loans in the last 24 months; cumulative overdue amount of the user's loans in the last 24 months; highest overdue amount of the user's credit cards in the last 24 months; cumulative overdue amount of the user's credit cards in the last 24 months; number of credit cards with a utilization rate above 80 percent; total amount of the user's account information; total amount of the user's asset disposal information; whether the user has a compulsory enforcement record; whether the user has an administrative penalty record; historical overdue proportion of loans; historical overdue proportion of a single credit card; historical overdue proportion of multiple credit cards; the user's loan debt; maximum account age of credit card products (credit card products comprise credit cards and quasi-credit cards); number of housing loans; average utilization rate of credit card products in the last 6 months (general credit line; credit card products comprise credit cards and quasi-credit cards); maximum amount of credit card products in the last 24 months (bad-debt accounts not considered); cumulative overdue number of a single loan product in the last 24 months (bad-debt accounts not considered); five-level classification of guaranteed loans; number of loan inquiries in the last 90 days and in the last 180 days; number of loan inquiry platforms in the last 90 days and in the last 180 days; total number of credit card accounts; total amount of credit card debt; credit cards most recently overdue by 4 or more; number of credit approval inquiries in the last 9 months; number of credit approval inquiry institutions in the last 3, 6, 9 and 12 months; number of credit approval inquiries in the last 3 to 12 months; number of comprehensive credit approval inquiries in the last 12 months; number of all consumer loans; total amount of all consumer loans; number of outstanding consumer loans; total amount of outstanding consumer loans; credit card account status of payment stopped; credit card account status of frozen; five-level loan classification; guarantor compensation; five-level guarantee classification; maximum overdue period within the last 12 months; number of outstanding automobile loans; mobile phone time on network; mobile phone online status; mobile phone card type; mobile phone three-element verification; province and city of the mobile phone number; whether the user is a court-listed dishonest debtor; whether the user is a person subject to court enforcement; number of platforms on which the user applied for loans in the last 7 days; number of times within 3 months that the user's mobile phone number appears as a contact's mobile phone number; number of applications associated with the identity card within 7 days; and the number of times of …

The attributes in the financial product related information include: the loan product yield rate, the loan product interest rate, the loan amount, the loan term, the down payment ratio and the down payment amount. Each attribute is taken as one item of feature data of the user, and all the attributes together form the feature data set of the user.

It should be noted that the feature data contains discrete text features, such as gender, marital status, educational background, degree, the customer's province, city and district, local household registration, industry of the employer, whether the user (credit report) has credit, whether the user has administrative penalty records, whether the user has compulsory enforcement records, whether the user is a court-listed dishonest debtor, whether the user is a person subject to court enforcement, and the like. These features are converted into numeric discrete features by label encoding and one-hot encoding. The N items of feature data are then processed by the binning algorithm with the maximum K-S value to obtain the binning result features, where N is the total number of items of feature data and is a positive integer. Binning can be done in many ways; the binning algorithm with the maximum K-S value is adopted here so that, in the user risk assessment result, users who are prone to overdue default and users who are not can be distinguished more accurately.
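The conversion of discrete text features into numeric features can be illustrated with a minimal sketch of label encoding followed by one-hot encoding. The helper names and the sample marital-status values are hypothetical; a real pipeline would normally fix the category mapping on training data and reuse it at scoring time.

```python
from typing import Dict, List, Sequence, Tuple


def label_encode(values: Sequence[str]) -> Tuple[List[int], Dict[str, int]]:
    """Map each distinct text value to an integer code (sorted order)."""
    mapping = {v: i for i, v in enumerate(sorted(set(values)))}
    return [mapping[v] for v in values], mapping


def one_hot_encode(codes: Sequence[int], n_classes: int) -> List[List[int]]:
    """Turn integer codes into 0/1 indicator vectors of length n_classes."""
    return [[1 if c == k else 0 for k in range(n_classes)] for c in codes]


marital = ["married", "single", "divorced", "single"]
codes, mapping = label_encode(marital)
print(codes, mapping)
print(one_hot_encode(codes, len(mapping)))
```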

The binning result features are processed by the cross feature derivation algorithm to obtain derived cross features.

Specifically, the binning result features are input into a cross feature derivation model for processing by the cross feature derivation algorithm, and the cross features it outputs are the derived cross features.

The derived cross features, the binning result features and the N items of feature data are combined, and invalid features are removed to obtain important features.

Specifically, the derived cross features, the binning result features and the N items of feature data are combined, and invalid features are then removed. A common way to remove invalid features is to evaluate feature importance with a chi-square test algorithm, an information gain algorithm, an IV value algorithm, a gradient boosting tree algorithm, a feature PSI index algorithm, a feature variance algorithm, a Pearson correlation coefficient algorithm, a maximal information coefficient algorithm and the like. One algorithm may be used for the evaluation, or any combination of algorithms. That is, the strength of each feature's predictive power is calculated; then, according to a preset importance threshold, features whose calculated predictive power exceeds the threshold are retained as important features, and features whose calculated predictive power does not exceed the threshold are removed as invalid features. The choice of importance evaluation algorithm is not specifically limited here.
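The threshold rule described above can be sketched with the Pearson correlation coefficient as the measure of predictive power, assuming the absolute correlation between a feature column and the binary overdue label is used as the importance value. The threshold of 0.3 and the sample columns are illustrative assumptions only.

```python
import math
from typing import Dict, List, Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient; 0.0 if either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if norm_x == 0 or norm_y == 0:
        return 0.0
    return cov / (norm_x * norm_y)


def keep_important(features: Dict[str, Sequence[float]],
                   labels: Sequence[int],
                   threshold: float = 0.3) -> List[str]:
    """Retain features whose |correlation with the label| exceeds the threshold."""
    return [name for name, column in features.items()
            if abs(pearson(column, labels)) > threshold]


columns = {"overdue_count": [0, 1, 2, 3], "constant": [5, 5, 5, 5], "noise": [1, 0, 0, 1]}
print(keep_important(columns, [0, 0, 1, 1]))
```

Pearson correlation only captures linear association, which is one reason the text allows several importance measures to be combined.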

S3, inputting the feature data of the user into a preset credit risk scoring model to obtain a credit scoring result of the user; evaluating the credit risk level of the user according to the credit scoring result; and obtaining a credit granting decision according to the credit risk level of the user.

Specifically, the feature data of the user is input into a pre-constructed credit scoring and level model to obtain the credit level of the user. The parameters of the pre-constructed credit scoring and level model are set on the basis of a large number of experiments. The output credit level of the user falls into one of five categories: a first, second, third, fourth or fifth risk level. The first risk level triggers an automatic rejection response: the user is considered very likely to default by becoming overdue after the loan is granted, so the user's credit application is rejected directly. The second risk level triggers a recommendation for cautious manual review: the user is considered to have a high probability of default after the loan, so a cautious manual review is prompted and the user's risk points are passed along as a reference for the reviewer. The third risk level triggers a recommendation for routine manual review: the user is considered to have a certain probability of default after the loan, so a routine manual check is prompted and the user's risk points are passed along as a reference. The fourth risk level triggers a recommendation for quick manual approval: the user is considered to have a small probability of default after the loan, so the reviewer is prompted to pass the user's review quickly, and the user's risk points are passed along as a reference. The fifth risk level triggers an automatic pass response: the user's credit qualification is considered excellent and overdue default after the loan is very unlikely, so the user's credit application is approved directly.

For example, suppose the initial application amount the user wants to apply for is 200,000 and, after evaluation by the risk pricing model, the user's risk adjustment factor is output as 0.78. After risk correction, the loan amount the user can apply for is 200,000 × 0.78 = 156,000. It should be noted that the purpose of introducing risk pricing is to strike a balance between risk and income: if the user's credit qualification is good, the credit granting level can be raised further, thereby increasing income; if the user's credit qualification is poor, the credit granting level can be lowered further, thereby reducing the overdue risk. Table 1 shows the risk pricing results of 10 clients of an automotive finance company:

TABLE 1
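The correction arithmetic in the example above (200,000 × 0.78 = 156,000) can be reproduced with a short sketch. It assumes that the correction is a plain multiplication and, following step S4, that no correction is computed when the user is at the first (rejection) level. The function name and the use of decimal strings are illustrative choices.

```python
from decimal import Decimal
from typing import Optional


def corrected_amount(initial_amount: str, factor: str, level: str) -> Optional[Decimal]:
    """Apply the risk adjustment factor to the initial application amount.

    Returns None for the first (rejection) level, where no pricing is done.
    Decimal strings are used so that the product is exact.
    """
    if level == "L_reject":
        return None
    return Decimal(initial_amount) * Decimal(factor)


print(corrected_amount("200000", "0.78", "L_common"))
```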

The method provided by the embodiment of the invention extracts feature data from the credit information of the user, applies the binning algorithm with the maximum K-S value to the feature data to obtain binning result features, applies the cross feature derivation algorithm to the binning result features to obtain derived cross features, and finally combines the feature data, the binning result features and the derived cross features and eliminates invalid features to obtain important features, which are input into the preset credit scoring and level model to obtain the credit level of the user. The embodiment of the invention combines the feature data, the binning result features and the derived cross features, and uses the binning algorithm with the maximum K-S value, thereby improving the ability of the risk assessment algorithm to distinguish good users from bad users. The method provided by the invention therefore improves the recall rate and the accuracy of the algorithm of the intelligent risk control system. In addition, the embodiment performs personalized risk pricing for the user on the basis of the credit risk scoring result; that is, the credit granting level of the user is adjusted according to the user's risk level so as to strike a balance between risk and income. According to the embodiment of the invention, the credit scoring result of the user is input into the risk adjustment factor model to obtain the risk adjustment factor, and the credit granting level of the user is corrected by using the risk adjustment factor. After correction by the risk adjustment factor, the credit risk level of the user is essentially unchanged, but differentiated risk pricing is applied: if the credit qualification of the user is good, the credit granting level is raised further, thereby increasing income; if the credit qualification of the user is poor, the credit granting level is lowered further.

According to the credit risk control method provided by the embodiment of the invention, the implementation process of converting the credit risk of the user into the credit scoring result based on the characteristic data of the user is as follows:

S_score＝F_score[P_{feature-engineer}(f₁,f₂,f₃…)]

The implementation process of evaluating the credit risk level of the user according to the credit scoring result is as follows:

wherein

Specifically, the credit scoring result is input into a pre-constructed credit risk level model to obtain the credit level of the user. The parameters of the pre-constructed credit risk level model are likewise set on the basis of a large number of experiments. Here, the output credit levels of the user fall into five categories: a first risk level, a second risk level, a third risk level, a fourth risk level, and a fifth risk level.

When the credit risk level of the user is the first risk level, the credit risk level is L_reject, and the credit granting decision is to reject the credit application of the user;

when the credit risk level of the user is the fifth risk level, the credit risk level is L_bypass, and the credit granting decision is to pass the credit application of the user;

when the credit risk level of the user is the second, third or fourth risk level, the credit risk level is L_careful, L_common or L_low respectively, and the credit granting decision is to transfer the credit application of the user to manual review, with the credit risk level of the user sent to the reviewer as a reference.

Specifically, the first risk level triggers an automatic rejection response: the user is considered very likely to default after the loan, so the credit application is rejected directly. The second risk level triggers a suggestion of careful manual review: the user is considered highly likely to default after the loan, so the reviewer is reminded to review carefully, and the risk points of the user are transmitted at the same time as a reference. The third risk level triggers a suggestion of routine manual review: the user is considered to have a certain probability of defaulting after the loan, so routine manual verification is requested, and the risk points of the user are transmitted at the same time as a reference. The fourth risk level triggers a suggestion of quick manual pass: the user is considered to have only a small probability of defaulting after the loan, so the reviewer is reminded to pass the application quickly, and the risk points of the user are transmitted at the same time as a reference. The fifth risk level triggers an automatic pass response: the credit qualification of the user is considered excellent and post-loan default very unlikely, so the credit application is passed directly.
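The mapping from the five risk levels to the triggered responses can be sketched in Python as follows. This is only an illustrative sketch: the enum member names follow the subscripts L_reject, L_careful, L_common, L_low and L_bypass used above, while the response strings and the function name `credit_decision` are hypothetical and do not come from the embodiment.

```python
from enum import Enum


class RiskLevel(Enum):
    REJECT = 1    # first risk level, L_reject
    CAREFUL = 2   # second risk level, L_careful
    COMMON = 3    # third risk level, L_common
    LOW = 4       # fourth risk level, L_low
    BYPASS = 5    # fifth risk level, L_bypass


# Response triggered by each level (labels are hypothetical).
RESPONSES = {
    RiskLevel.REJECT: "auto_reject",
    RiskLevel.CAREFUL: "manual_careful_review",
    RiskLevel.COMMON: "manual_routine_review",
    RiskLevel.LOW: "manual_quick_pass",
    RiskLevel.BYPASS: "auto_pass",
}


def credit_decision(level):
    """Return the triggered response and whether a reviewer is involved."""
    response = RESPONSES[level]
    manual = response.startswith("manual")
    # Risk points are forwarded only when a human reviewer takes over.
    return {"response": response, "manual_review": manual,
            "send_risk_points": manual}
```

Only the second, third and fourth levels hand the case to a reviewer; the first and fifth levels are decided automatically.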

Further, the specific implementation process of the risk adjustment factor is as follows:

R＝G_factor(S_score,α,β，…)

where S_score is the credit scoring result of the user, α and β are parameters or hyperparameters of the risk adjustment factor model, R is the output user risk adjustment factor, and G_factor is the preset user risk adjustment model. The user risk adjustment model works as follows. L_credit is obtained from F_credit, which gives the credit score intervals corresponding to the five risk levels; for example, the first risk level corresponds to [0,430), the second risk level to [430,630), the third risk level to [630,675), the fourth risk level to [675,700), and the fifth risk level to [700,+∞). A hyperparameter is then set for each risk level, namely the interval of the risk adjustment factor corresponding to that level. Take the fifth risk level as an example. Its credit score interval is [700,+∞), and an upper score limit of 1000 is imposed: if the score obtained by a client exceeds 1000, for example 1200, it is reset to 1000, so the credit score interval of the fifth risk level becomes [700,1000]. From the scores of all fifth-risk-level clients in the training sample, the 50% quantile of the credit score is found to be 732. The risk adjustment factor interval is then set to 1.0-1.3 (this hyperparameter interval is determined through model training and continuous optimization), and its 50% point is 1.15. The risk adjustment factor of a client within the fifth risk level is then calculated by interpolation between these values, that is:

Here, factor is the risk adjustment factor of the client. Once the risk adjustment factor is obtained, the loan amount of the client can be corrected, and the loan interest rate of the client is then corrected using financial functions based on the corrected loan amount, the yield rate, the loan term and the interest amount. Specifically, the payment per period is first calculated from the yield rate, the number of loan periods and the corrected loan amount, using the standard PMT financial function. The payment per period is then corrected according to the interest amount: for example, if the payment per period is 3000 and the interest amount is 1200, the corrected payment per period is 3000 - 1200/12 = 2900. Finally, the loan interest rate of the client is calculated from the corrected payment per period using the standard RATE financial function. The calculation logic for the other risk levels is the same; the hyperparameter of each risk level, namely the risk adjustment factor interval, is as follows:

The first risk level corresponds to directly rejected clients, so no correction of the loan amount or loan interest rate is performed.

The risk adjustment factor interval of the second risk level is 0.6-1.1.

The risk adjustment factor interval of the third risk level is 0.8-1.1.

The risk adjustment factor interval of the fourth risk level is 1.0-1.2.

The risk adjustment factor interval of the fifth risk level is 1.0-1.3.
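The pricing steps described above can be sketched in Python as follows. The interpolation formula itself is not reproduced in the text, so the sketch assumes a piecewise-linear mapping through three anchor points (lower score bound to lower factor bound, median score to interval midpoint, upper score bound to upper factor bound); this is one plausible reading, not necessarily the formula of the embodiment. The functions `pmt` and `solve_rate` are hand-written stand-ins for the PMT and RATE financial functions (they return positive payments, unlike the spreadsheet sign convention), and the 1% per-period yield rate is a made-up value for illustration.

```python
def risk_factor(score, lo_score, median_score, hi_score, lo_factor, hi_factor):
    """Piecewise-linear score-to-factor map (assumed form of the interpolation)."""
    score = min(max(score, lo_score), hi_score)   # clamp, e.g. 1200 -> 1000
    mid_factor = (lo_factor + hi_factor) / 2      # 50% point of the interval
    if score <= median_score:
        share = (score - lo_score) / (median_score - lo_score)
        return lo_factor + (mid_factor - lo_factor) * share
    share = (score - median_score) / (hi_score - median_score)
    return mid_factor + (hi_factor - mid_factor) * share


def pmt(rate, nper, principal):
    """Level payment per period for a fully amortising loan."""
    if rate == 0:
        return principal / nper
    return principal * rate / (1 - (1 + rate) ** -nper)


def solve_rate(nper, payment, principal, tol=1e-10):
    """Per-period rate whose level payment equals `payment` (bisection)."""
    if payment * nper <= principal:
        raise ValueError("payments do not cover the principal")
    lo, hi = 0.0, 1.0
    while pmt(hi, nper, principal) < payment:     # widen the bracket if needed
        hi *= 2
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if pmt(mid, nper, principal) < payment:   # payment grows with the rate
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Fifth-level example from the text: scores in [700, 1000], median 732,
# factor interval 1.0-1.3.
factor = risk_factor(732, 700, 732, 1000, 1.0, 1.3)   # about 1.15
amount = 200_000 * factor                             # corrected loan amount
payment = pmt(0.01, 12, amount)                       # hypothetical 1% yield
corrected_payment = payment - 1200 / 12               # interest amount 1200
loan_rate = solve_rate(12, corrected_payment, amount)
```

Because the corrected payment is lower than the original payment, the solved loan rate comes out below the yield rate used to compute the payment.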

Generally, correcting the credit granting level of the user does not change the credit risk level L_credit of the user. For example, if the credit risk level L_credit of the user before risk pricing is L_bypass, then after the credit granting level of the user is corrected, L_credit is still L_bypass. Otherwise, from the point of view of the credit decision process, if the credit risk level changed after the credit granting level was corrected, the credit risk of the user would have to be re-evaluated by the credit risk scoring module 10, followed by risk level classification, then another credit risk evaluation, then risk pricing again, and so on in an endless loop, so that risk pricing would lose its meaning. After risk pricing is finished, the decision engine module makes the final credit granting decision according to the risk level classification result of the credit scoring module.

The trust decision specific operation is shown in table 2 below:

TABLE 2

As is well known, the ability of a credit scoring and level model to distinguish good users from bad users depends to a great extent on the effectiveness of the features fed into the model. After various attempts, the embodiment of the invention found that binning the original features with a binning algorithm based on the maximum K-S value, deriving cross features from the binning results, and then feeding the model the important features screened from the combination of original features, binning result features and derived features significantly raises the K-S value of the credit risk scoring model (the K-S value reaches more than 65), and thereby improves its ability to distinguish good users from bad users. Further, before S1, the method also includes verifying, by means of the K-S value, the ability of the credit risk scoring model to distinguish whether a user defaults, which specifically comprises:

after the risk control model has scored all samples, the samples are divided into two groups according to whether they defaulted, and the K-S value is used to test whether the risk control scores of the two groups differ significantly. The K-S value is calculated as follows:

KS_value＝Max[abs(∑P_good-∑P_bad)]

where ∑P_good and ∑P_bad are the cumulative proportion of good samples and the cumulative proportion of bad samples respectively, abs denotes the absolute value, and Max denotes the maximum value.
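The K-S calculation above can be sketched in plain Python as follows. The function name `ks_value` and the convention that `is_bad` holds 1 for a defaulted sample and 0 otherwise are assumptions made for the example.

```python
def ks_value(scores, is_bad):
    """Max |cumulative share of good - cumulative share of bad| over scores."""
    n_bad = sum(is_bad)
    n_good = len(is_bad) - n_bad
    if n_bad == 0 or n_good == 0:
        raise ValueError("need at least one good and one bad sample")
    pairs = sorted(zip(scores, is_bad))
    cum_good = cum_bad = 0
    best = 0.0
    for i, (score, bad) in enumerate(pairs):
        if bad:
            cum_bad += 1
        else:
            cum_good += 1
        # Evaluate only after all samples tied at this score are counted.
        if i + 1 < len(pairs) and pairs[i + 1][0] == score:
            continue
        best = max(best, abs(cum_good / n_good - cum_bad / n_bad))
    return best
```

A model that separates the two groups perfectly yields a value of 1, while identical score distributions yield 0.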

Table 3 gives the evaluation criteria for K-S values:

TABLE 3

K-S value	[0,0.2)	[0.2,0.4)	[0.4,0.6)	[0.6,0.75)	[0.75,1)
Evaluation result	Poor	Medium	Good	Excellent	Abnormal

Based on any of the above embodiments, an embodiment of the present invention further provides an automatic credit risk assessment system; Fig. 3 is a flowchart of this system. As shown in Fig. 3, the system first receives a credit application submitted by a user and determines whether the application satisfies the admission conditions, deciding whether to perform risk assessment according to whether the personal information submitted by the user is correct and whether the user has authorized the system. The credit feature derivation module then extracts original features from the credit information of the user, bins them with the binning algorithm based on the maximum K-S value, derives cross features from the binning results, and screens the combination of original features, binning result features and derived features to obtain important features. The important features are input into the credit risk rating module, which outputs one of the five levels described in the embodiment of the invention, and the system then takes one of the following actions: automatic rejection, automatic pass, suggesting a careful manual review, suggesting a routine manual review, or suggesting a quick manual pass. When a case is handed over to manual processing, a human reviewer decides whether it finally passes. The end result is that the system either approves or rejects the credit application of the user.

In existing risk control systems, credit approval depends heavily on manual review. On the one hand, the lender spends a large amount of manpower and material resources on document review, telephone investigations, follow-up visits and the like, at very high cost; on the other hand, a borrower often waits several weeks from submitting a credit application to receiving the approval result, which makes for a very poor experience. To verify the effectiveness of the automatic credit risk assessment system, tests were carried out on 290,000 personal auto finance credit applications of a certain company. The results show that the K-S value of the system exceeds 65, indicating excellent ability to distinguish good users from bad users.

Based on the above embodiment, in the method, extracting N feature data from the credit information and processing the N feature data with the maximum K-S value binning algorithm to obtain binning result features, where N is a positive integer, specifically includes:

extracting all N items of attribute information in the credit information to form N feature data;

and processing the N feature data with the maximum K-S value binning algorithm based on the following formula:

{f₁^cut, f₂^cut, …, f_i^cut, …, f_N^cut}＝F_{cut_bin}({f₁, f₂, f₃, …, f_i, …, f_N})

where {f₁, f₂, f₃, …, f_i, …, f_N} is the set of N feature data, f_i is the i-th feature data among the N feature data, 0 < i ≤ N, {f₁^cut, f₂^cut, …, f_i^cut, …, f_N^cut} is the set of binning result features, f_i^cut is the binning result corresponding to the feature data f_i, and F_{cut_bin} is the binning algorithm with the maximum K-S value.

Specifically, all attribute information in the credit information is extracted; there are N items of attribute information, forming N feature data: f₁, f₂, f₃, …, f_N. The N feature data are then input into the maximum K-S value binning model, processed by the maximum K-S value binning algorithm, and N binning result features are output:

{f₁^cut, f₂^cut, …, f_N^cut}

F_{cut_bin} in the above formula is the binning algorithm with the maximum K-S value, and each output binning result feature f_i^cut is the result of processing the corresponding feature data f_i with that algorithm.
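The text does not spell out the internals of the maximum K-S value binning algorithm. The following Python sketch shows one common variant as an assumption: the feature is split at the threshold with the largest K-S value, and the segment offering the best remaining split is split again until a bin limit is reached. The names `best_ks_split`, `max_ks_cuts` and `assign_bin`, and the parameters `max_bins` and `min_size`, are hypothetical; `is_bad` is expected to hold 0 or 1 per sample.

```python
from bisect import bisect_left


def best_ks_split(pairs, min_size):
    """Return (ks, left_size) of the best split of sorted (value, is_bad) pairs."""
    n_bad = sum(bad for _, bad in pairs)
    n_good = len(pairs) - n_bad
    if n_bad == 0 or n_good == 0:
        return 0.0, None
    best_ks, best_left = 0.0, None
    cum_bad = cum_good = 0
    for i in range(len(pairs) - 1):
        value, bad = pairs[i]
        cum_bad += bad
        cum_good += 1 - bad
        left = i + 1
        if pairs[i + 1][0] == value:          # never split between tied values
            continue
        if left < min_size or len(pairs) - left < min_size:
            continue
        ks = abs(cum_bad / n_bad - cum_good / n_good)
        if ks > best_ks:
            best_ks, best_left = ks, left
    return best_ks, best_left


def max_ks_cuts(values, is_bad, max_bins=4, min_size=2):
    """Repeatedly split the segment with the highest K-S; return cut points."""
    segments = [sorted(zip(values, is_bad))]
    cuts = []
    while len(segments) < max_bins:
        candidates = []
        for seg_no, seg in enumerate(segments):
            ks, left = best_ks_split(seg, min_size)
            if left is not None:
                candidates.append((ks, seg_no, left))
        if not candidates:
            break
        _, seg_no, left = max(candidates)
        seg = segments[seg_no]
        cuts.append(seg[left - 1][0])         # values <= cut fall in the left bin
        segments[seg_no:seg_no + 1] = [seg[:left], seg[left:]]
    return sorted(cuts)


def assign_bin(value, cuts):
    """Bin index of `value`; bin k holds cuts[k-1] < value <= cuts[k]."""
    return bisect_left(cuts, value)
```

In this variant the K-S value inside a segment is computed over that segment's own samples, so a segment containing only good or only bad samples is never split further.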

Based on any one of the above embodiments, in the method, the processing the binning result features by using a cross feature derivation algorithm to obtain derived cross features specifically includes:

wherein,

is a collection of binned result features,

Specifically, the binning result feature set is input into the cross feature derivation model, processed by the cross feature derivation algorithm, and the derived cross features are output. Cross feature derivation preferably uses a greedy algorithm, that is, the greedy algorithm computes the Cartesian product of the feature binning results.
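A minimal Python sketch of cross feature derivation is given below. It assumes that every pair of binned features is crossed exhaustively; the text does not state which pairs the greedy algorithm selects, so this choice, the function name `derive_cross_features` and the `_x_` naming scheme are all assumptions.

```python
from itertools import combinations


def derive_cross_features(binned):
    """Cross every pair of binned features.

    `binned` maps a feature name to its list of bin indices (one per user).
    Each crossed feature pairs up the two bin indices of a user, so its
    possible values are the Cartesian product of the two features' bins.
    """
    crossed = {}
    for a, b in combinations(sorted(binned), 2):
        crossed[f"{a}_x_{b}"] = list(zip(binned[a], binned[b]))
    return crossed
```

For N binned features this produces N(N-1)/2 crossed features, which is why invalid features are screened out afterwards.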

Based on any one of the above embodiments, in the method, the step of combining the derived cross features, the binning result features, and the N feature data and removing invalid features to obtain important features specifically includes:

combining the derived cross features, the box separation result features and the N pieces of feature data to obtain combined features;

Specifically, the derived cross feature, the binning result feature and the N feature data are combined to obtain a combined feature

where f₁, f₂, f₃, …, f_N are the N feature data, and the combined features further include the T derived cross features and the N binning result features; N is the total number of feature data, T is the total number of derived cross features, and both N and T are positive integers. Invalid features in the combined features are then eliminated. The importance of the combined features is evaluated with any one, or any combination, of the following algorithms: the chi-square test, information gain, the IV value, gradient boosting trees, the feature PSI index, the feature variance, the Pearson correlation coefficient, and the maximal information coefficient. Other algorithms for judging feature importance that are not listed here may also be used, alone or in combination with the listed ones. Evaluating importance means calculating the strength of the predictive ability of each feature; according to a preset importance threshold, features whose calculated predictive ability exceeds the threshold are retained as important features, and features whose calculated predictive ability does not exceed the threshold are eliminated as invalid features.
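As an illustration of threshold-based elimination, the following Python sketch uses the IV value, one of the listed algorithms. The smoothing constant `eps` and the threshold 0.02 are hypothetical values chosen for the example, not values given in the embodiment.

```python
import math


def information_value(bins, is_bad, eps=0.5):
    """IV of one binned feature; `eps` smooths empty cells (an assumption)."""
    n_bad = sum(is_bad)
    n_good = len(is_bad) - n_bad
    counts = {}                                # bin -> [good count, bad count]
    for b, bad in zip(bins, is_bad):
        counts.setdefault(b, [0, 0])[1 if bad else 0] += 1
    iv = 0.0
    for good, bad in counts.values():
        p_good = (good + eps) / (n_good + eps * len(counts))
        p_bad = (bad + eps) / (n_bad + eps * len(counts))
        iv += (p_good - p_bad) * math.log(p_good / p_bad)
    return iv


def select_important(features, is_bad, threshold=0.02):
    """Keep features whose IV exceeds `threshold`; drop the rest as invalid."""
    return {name: col for name, col in features.items()
            if information_value(col, is_bad) > threshold}
```

Bin labels only need to be hashable, so the same function applies to original binned features and to crossed features whose values are tuples of bin indices.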

Based on any one of the above embodiments, in the method, inputting the important features into a preset credit risk scoring model and a preset credit risk level model to obtain the credit level of the user specifically includes:

inputting the important features into an analysis model based on the LightGBM algorithm, and outputting the probability p that the user is likely to default after the loan and the probability 1-p that the user is unlikely to default after the loan;

the LightGBM-based analysis model is trained with 10-fold cross validation, the 10 base analysis models obtained from training together form the LightGBM-based analysis model, and p is the average of the 10 post-loan default probabilities output by the 10 base analysis models;

determining the post-loan overdue default ratio index Odds based on the formula Odds＝p/(1-p);

determining the credit score of the user (i.e., the credit scoring result S_score) based on the formula Score＝A-Blog(Odds), where A and B are constants determined by a preset specific post-loan overdue default ratio index θ₀, the credit score P₀ corresponding to θ₀, and the credit score reduction P_d corresponding to a doubling of θ₀;

and determining the credit level of the user based on the credit score of the user and the score intervals corresponding to the credit levels.

Specifically, the credit scoring and level model provided by the embodiment of the invention is built on the LightGBM ensemble learning algorithm. LightGBM is an improved implementation within the GBDT algorithm framework: a fast, distributed, high-performance GBDT framework based on decision tree algorithms that significantly improves the efficiency and scalability of GBDT when facing high-dimensional big data. After the features of a user (i.e., the important features remaining after invalid features are eliminated) are input into the LightGBM-based analysis model, the model outputs the probability p (ranging from 0 to 1) that the user belongs to the positive class (i.e., the user is considered very likely to default after the loan); the probability that the user belongs to the negative class (a normal user, very unlikely to default after the loan) is 1-p. The LightGBM-based analysis model is trained with 10-fold cross validation; the 10 base analysis models obtained from training together form the LightGBM-based analysis model, and p is the average of the 10 post-loan default probabilities output by the 10 base analysis models. Because it consists of 10 base analysis models trained by 10-fold cross validation, the LightGBM-based analysis model is more stable and robust than a single analysis model while maintaining very high accuracy and recall. The post-loan overdue default ratio index Odds is then obtained from the following formula:

Odds＝p/(1-p). The credit score of the user can then be determined by the formula Score＝A-Blog(Odds), where A is a compensation constant and B is a scale constant; A and B are determined by a preset specific post-loan overdue default ratio index θ₀, the credit score P₀ corresponding to θ₀, and the credit score reduction P_d corresponding to a doubling of θ₀. Once the credit score of the user has been calculated, the credit level of the user can be determined from the score intervals corresponding to the credit levels.
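The 10-fold averaging scheme can be sketched in Python independently of any particular learner. In the sketch below, `train_fn` is a hypothetical callback that would wrap a LightGBM training call in practice; here a toy learner that always predicts the training bad rate stands in for it so that the example is self-contained.

```python
import random


def kfold_ensemble(rows, labels, train_fn, k=10, seed=0):
    """Train k base models, each on the data with one fold held out."""
    order = list(range(len(rows)))
    random.Random(seed).shuffle(order)
    folds = [order[i::k] for i in range(k)]    # k disjoint folds
    models = []
    for held_out in folds:
        skip = set(held_out)
        train_idx = [i for i in order if i not in skip]
        models.append(train_fn([rows[i] for i in train_idx],
                               [labels[i] for i in train_idx]))
    return models


def predict_default_probability(models, row):
    """p is the mean of the base models' predicted default probabilities."""
    return sum(model(row) for model in models) / len(models)


def train_base_rate(rows, labels):
    """Toy stand-in for a real learner: always predicts the training bad rate."""
    rate = sum(labels) / len(labels)
    return lambda row: rate
```

In a real setup the held-out fold would typically also be used to validate each base model; the sketch omits that step.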

Based on any one of the above embodiments, in the method, A and B being constants determined by the preset specific post-loan overdue default ratio index θ₀, the credit score P₀ corresponding to θ₀, and the credit score reduction P_d corresponding to a doubling of θ₀ specifically includes:

setting θ₀＝20, P₀＝600 and P_d＝50, with A and B determined based on the following formulas:

B＝P_d/log(2)

A＝P₀+Blog(θ₀)

Correspondingly, the score intervals corresponding to the credit levels are as follows:

the credit score interval corresponding to the first risk level is [0,430);

the credit score interval corresponding to the second risk level is [430,630);

the credit score interval corresponding to the third risk level is [630,690);

the credit score interval corresponding to the fourth risk level is [690,710);

the credit score interval corresponding to the fifth risk level is [710,+∞).
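The lookup from credit score to risk level using the interval boundaries listed above can be sketched in Python as follows. The function name `risk_level` is hypothetical, and the intervals are treated as closed on the left and open on the right.

```python
from bisect import bisect_right

LEVEL_LOWER_BOUNDS = [430, 630, 690, 710]   # boundaries listed above


def risk_level(score):
    """Return 1..5 for the first..fifth risk level; intervals are [lo, hi)."""
    if score < 0:
        raise ValueError("score must be non-negative")
    return bisect_right(LEVEL_LOWER_BOUNDS, score) + 1
```

A score exactly on a boundary, such as 630, falls into the higher of the two adjacent levels.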

Specifically, a large number of experimental samples show that the post-loan overdue default ratio is about 20, so the embodiment of the invention sets the preset specific post-loan overdue default ratio index θ₀ to 20, sets the credit score P₀ corresponding to θ₀ to 600 points, and sets the credit score reduction P_d corresponding to a doubling of θ₀ to 50 points. Substituting these three values into the formulas Odds＝p/(1-p) and Score＝A-Blog(Odds) yields the following two equations:

P₀＝A-Blog(θ₀)

P₀-P_d＝A-Blog(2θ₀)

Subtracting the second equation from the first gives P_d＝Blog(2), so solving the two equations yields:

B＝P_d/log(2)

A＝P₀+Blog(θ₀)
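The score conversion can be checked numerically with the short Python sketch below. It uses the natural logarithm; since B scales with 1/log(2), the choice of logarithm base does not change the resulting scores. The function name `credit_score` is hypothetical.

```python
import math

THETA_0 = 20    # preset overdue default ratio index
P_0 = 600       # credit score at Odds = THETA_0
P_D = 50        # points lost each time Odds doubles

B = P_D / math.log(2)
A = P_0 + B * math.log(THETA_0)


def credit_score(p):
    """Score = A - B*log(Odds), with Odds = p / (1 - p) and 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    odds = p / (1.0 - p)
    return A - B * math.log(odds)
```

With these constants, Odds of 20 maps to 600 points and Odds of 40 maps to 550 points, matching the two equations above.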

Before the present invention, risk pricing relied mainly on manual experience. To verify the effectiveness of the automatic credit risk assessment system of this patent, tests were carried out on 360,000 personal auto finance credit applications of a certain company; the test results are shown in Table 4 below:

TABLE 4

That is, after correction by the risk adjustment factor, the credit risk level of the user remains essentially unchanged, but through differentiated risk pricing (raising the credit granting level to increase income if the credit qualification of the user is good, and lowering it if the credit qualification of the user is poor), the benefit of the lender can be maximized automatically and reasonably while risk remains controllable.

As shown in fig. 2, an embodiment of the present invention provides a credit risk control system, including:

the credit information acquisition module 10 is used for acquiring credit information of the user based on basic identity information and authorization information submitted by the user, wherein the credit information comprises basic credit application information, behavioral expression information and financial product related information;

the feature information extraction module 20 is connected to the credit information acquisition module 10 and is configured to: extract N original features from the credit information and process the N original features with the maximum K-S value binning algorithm to obtain binning result features, where N is a positive integer; process the binning result features with a cross feature derivation algorithm to obtain derived cross features; combine the derived cross features, the binning result features and the N original features and eliminate invalid features to obtain important features serving as feature data; and extract an initial application amount from the basic credit application information of the user;

the risk level evaluation module 30 is connected to the feature information extraction module 20, and is configured to input feature data of the user into a preset credit risk scoring model to obtain a credit scoring result of the user; evaluating the credit risk level of the user according to the credit scoring result;

a risk adjustment control module 40, respectively connected to the characteristic information extraction module 20 and the risk level evaluation module 30, for inputting the credit scoring result of the user into the risk adjustment factor model to obtain a risk adjustment factor based on the established and trained risk adjustment factor model when the credit risk level of the user is not the first level; and correcting the initial application amount of the user by using the risk adjustment factor to obtain a corrected pricing level.

The working principle of the credit risk control system of the embodiment of the present application is corresponding to that of the credit risk control method of the embodiment, and details are not repeated here.

Fig. 4 illustrates a physical structure diagram of an electronic device, which may include: a processor (processor)810, a communication Interface 820, a memory 830 and a communication bus 840, wherein the processor 810, the communication Interface 820 and the memory 830 communicate with each other via the communication bus 840. The processor 810 may invoke logic instructions in the memory 830 to perform a credit risk control method comprising:

s1, converting the credit risk of the user into a credit scoring result based on the characteristic data of the user;

s2, evaluating the credit risk level of the user according to the credit scoring result;

and S3, obtaining a credit granting decision according to the credit risk level of the user.

In addition, the logic instructions in the memory 830 may be implemented in software functional units and stored in a computer readable storage medium when the logic instructions are sold or used as independent products. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.

In another aspect, embodiments of the present invention also provide a computer program product including a computer program stored on a non-transitory computer readable storage medium, the computer program including program instructions which, when executed by a computer, the computer is capable of performing a credit risk control method, the method including:

In yet another aspect, an embodiment of the present invention also provides a non-transitory computer readable storage medium having stored thereon a computer program that, when executed by a processor, is implemented to perform a credit risk control method, the method including:

The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.

Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.

Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims

1. A credit risk control method, comprising:

2. The credit risk control method according to claim 1, wherein the extracting N original features in the credit information, and processing the N original features by using a binning algorithm with a maximum K-S value to obtain binning result features, wherein N is a positive integer specifically includes:

3. The credit risk control method according to claim 1, wherein the processing the binned result features with a cross feature derivation algorithm to obtain derived cross features comprises:

wherein,

is a collection of binned result features,

4. The credit risk control method according to claim 1, wherein the step of combining the derived cross features, the binning result features and the N original features and removing invalid features to obtain important features comprises:

5. The credit risk control method according to claim 1, wherein the inputting the characteristic data of the user into a preset credit risk scoring model to obtain the credit scoring result of the user is realized by:

S_score＝F_score[P_{feature-engineer}(f₁,f₂,f₃…)]

6. The credit risk control method of claim 1, wherein the assessing the credit risk level of the user based on the credit scoring results is performed by:

wherein

7. The credit risk control method according to claim 1, wherein the credit scoring result of the user is input into a risk adjustment factor model, and the risk adjustment factor is obtained by:

R＝G_factor(S_score,α,β，…)

wherein S_score is the credit scoring result of the user, α and β are risk adjustment factor model parameters or risk adjustment factor model hyperparameters, R is the output user risk adjustment factor, and G_factor is the preset user risk adjustment model.

8. A credit risk control system, comprising:

the risk adjustment control module is respectively connected with the characteristic information extraction module and the risk grade evaluation module and is used for inputting a credit scoring result of the user into the risk adjustment factor model to obtain a risk adjustment factor based on the established and trained risk adjustment factor model when the credit risk grade of the user is not a first grade; and correcting the initial application amount of the user by using the risk adjustment factor to obtain a corrected pricing level.

9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the steps of the credit risk control method according to any one of claims 1-7 are implemented when the program is executed by the processor.

10. A non-transitory computer readable storage medium having stored thereon a computer program, wherein the computer program when executed by a processor implements the steps of the credit risk control method of any of claims 1-7.