CN108898476A - Loan customer credit scoring method and device - Google Patents

Loan customer credit scoring method and device

Info

Publication number
CN108898476A
Authority
CN
China
Prior art keywords
information
classification
client
customer
customer information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810614063.4A
Other languages
Chinese (zh)
Inventor
张静
狄潇然
田林
张亚泽
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Bank of China Ltd
Original Assignee
Bank of China Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Bank of China Ltd
Priority to CN201810614063.4A
Publication of CN108898476A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/03 Credit; Loans; Processing thereof

Abstract

An embodiment of the present invention discloses a loan customer credit scoring method and device, relating to the field of data processing, which can increase the data dimensions analyzed by a credit scoring model and improve the accuracy of credit scoring results for loan customers. The method includes: obtaining customer information of at least one category for sample customers, the customer information of the at least one category being associated with the unique identifier of the sample customer; preprocessing the customer information of each category to obtain sampled data; modeling on the sampled data corresponding to the customer information of each category and the default information of the sample customers, to generate a scoring model for the customer information of each category; fusing the scoring models of all categories to obtain a credit scoring model; and inputting the customer information of a customer to be evaluated into the credit scoring model to calculate that customer's default information.

Description

Loan customer credit scoring method and device
Technical field
Embodiments of the present invention relate to the field of data processing, and in particular to a loan customer credit scoring method and device.
Background art
Internet credit products such as P2P lending have emerged rapidly. Their fast, convenient application process gives borrowers more choices, and their efficient approval and high-quality financial service have sharply increased customer favor. By contrast, the loan approval process at traditional commercial banks is complex, time-consuming, and costly in labor; service efficiency is low, subjective factors weigh heavily, risk is high, and the amount actually lent matches the customer's true creditworthiness poorly. These factors have exposed banks' credit business to a severe impact in the Internet era.
Traditional commercial banks usually rely on information such as the user's credit report and a complicated review process to evaluate the user's credit and approve credit business. Internet finance, in contrast, mainly uses historical consumption behavior and personal credit information, and mainly builds credit scoring models with statistical methods such as discriminant analysis, linear regression, and logistic regression. The credit scoring models built in the prior art analyze relatively few data dimensions, which leads to poor model accuracy and ultimately affects the credit scoring results for loan customers. In addition, most modeling methods in existing schemes are linear, whereas in practice the relationship between the factors that affect a user's personal credit and the personal credit score is usually nonlinear rather than simply linear. Such methods therefore cannot accurately reproduce the relationship between user information and credit score, which again leads to poor model accuracy.
Summary of the invention
Embodiments of the present invention provide a loan customer credit scoring method and device that can increase the data dimensions analyzed by a credit scoring model and improve the accuracy of credit scoring results for loan customers.
In a first aspect, a loan customer credit scoring method is provided, including:
obtaining customer information of at least one category for sample customers, where the categories of customer information include: basic customer information, customer asset and liability information, customer credit reference information, customer income and expenditure information, customer social information, customer historical behavior information, customer consumption information, and other supplementary customer information; and where the customer information of the at least one category is associated with the unique identifier of the sample customer;
preprocessing the customer information of each category to obtain sampled data;
modeling on the sampled data corresponding to the customer information of each category and the default information of the sample customers, to generate a scoring model corresponding to the customer information of each category;
fusing the scoring models corresponding to the customer information of each category to obtain a credit scoring model;
inputting the customer information of a customer to be evaluated into the credit scoring model, and calculating the default information of the customer to be evaluated.
In a second aspect, a loan customer credit scoring device is provided, including:
an input module, configured to obtain customer information of at least one category for sample customers, where the categories of customer information include: basic customer information, customer asset and liability information, customer credit reference information, customer income and expenditure information, customer social information, customer historical behavior information, customer consumption information, and other supplementary customer information; and where the customer information of the at least one category is associated with the unique identifier of the sample customer;
a preprocessing module, configured to preprocess the customer information of each category obtained by the input module to obtain sampled data;
a modeling module, configured to model on the sampled data corresponding to the customer information of each category obtained by the preprocessing module and the default information of the sample customers, to generate a scoring model corresponding to the customer information of each category;
the modeling module being further configured to fuse the scoring models corresponding to the customer information of each category to obtain a credit scoring model;
a scoring module, configured to input the customer information of a customer to be evaluated into the credit scoring model obtained by the modeling module, and to calculate the default information of the customer to be evaluated.
In a third aspect, a computer-readable storage medium storing one or more programs is provided. The one or more programs include instructions which, when executed by a computer, cause the computer to perform the loan customer credit scoring method of the first aspect.
In the above solution, the loan customer credit scoring device first obtains customer information of at least one category for sample customers, where the categories of customer information include: basic customer information, customer asset and liability information, customer credit reference information, customer income and expenditure information, customer social information, customer historical behavior information, customer consumption information, and other supplementary customer information, and where the customer information of the at least one category is associated with the unique identifier of the sample customer. It then preprocesses the customer information of each category to obtain sampled data; models on the sampled data corresponding to the customer information of each category and the default information of the sample customers to generate a scoring model for the customer information of each category; fuses the scoring models of all categories to obtain a credit scoring model; and inputs the customer information of a customer to be evaluated into the credit scoring model to calculate that customer's default information. Because the method first models the customer information of each category separately, in that category's own dimension, and then fuses the scoring models generated across the multiple dimensions into the final credit scoring model, it increases the data dimensions analyzed by the credit scoring model and improves the accuracy of credit scoring results for loan customers.
Brief description of the drawings
To describe the technical solutions of the embodiments of the present invention more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are only some embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of a loan customer credit scoring method provided by an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of a loan customer credit scoring device provided by an embodiment of the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art from the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
The embodiments of the present invention use the following technical terms:
t% distribution: Sort the samples in ascending order of attribute value and compute their distribution. Based on the distribution, find the upper t1% quantile and the lower t2% quantile. Samples between the two quantiles are normal samples; samples outside the two quantiles are abnormal samples. The two percentages may be equal or different, that is, t1% = t2% or t1% ≠ t2%.
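The quantile-based filter described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names, the linear-interpolation quantile, and the income figures are hypothetical and do not come from the patent.

```python
def quantile(sorted_values, q):
    """Linear-interpolation quantile of an ascending list, with q in [0, 1]."""
    if not sorted_values:
        raise ValueError("empty sample")
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac


def trim_outliers(values, t1=0.05, t2=0.05):
    """Keep values between the lower t2 quantile and the upper t1 quantile."""
    ordered = sorted(values)
    low = quantile(ordered, t2)
    high = quantile(ordered, 1 - t1)
    return [v for v in values if low <= v <= high]


# Hypothetical monthly incomes; the last value is an obvious outlier.
incomes = [3200, 3500, 4100, 4300, 4800, 5200, 5500, 6100, 6400, 250000]
print(trim_outliers(incomes, t1=0.1, t2=0.0))
```

Choosing t1 and t2 separately allows a long right tail, which is common for income or asset attributes, to be trimmed without discarding legitimate low values.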
Mean or median missing-value filling: For numeric attributes such as age or assets, the mean or median of the variable can be used to fill the missing values of that dimension. The mean can be used for normally (symmetrically) distributed data, while the median should be used for skewed data. For character-type attributes such as occupation or education, the share of each attribute value can be counted, and the value with the highest share is used as the default to fill the missing values of that dimension.
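As a minimal sketch of this filling rule, assuming missing values are represented by `None` and that the caller decides whether a numeric attribute is skewed (both are illustrative assumptions, as are the function names):

```python
from collections import Counter
from statistics import mean, median


def fill_numeric(values, skewed=False):
    """Replace None with the median (skewed data) or the mean (symmetric data)."""
    present = [v for v in values if v is not None]
    fill = median(present) if skewed else mean(present)
    return [fill if v is None else v for v in values]


def fill_categorical(values):
    """Replace None with the most frequent observed value."""
    present = [v for v in values if v is not None]
    most_common = Counter(present).most_common(1)[0][0]
    return [most_common if v is None else v for v in values]


ages = [25, 31, None, 40, 36]                  # roughly symmetric
assets = [10, 12, None, 15, 900]               # heavily right-skewed
jobs = ["clerk", "engineer", None, "clerk"]    # character-type attribute

print(fill_numeric(ages))
print(fill_numeric(assets, skewed=True))
print(fill_categorical(jobs))
```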
Feature interval ratio method: Map each feature to pre-divided intervals according to its value, then use the proportion of positive samples among the samples in each interval as the weight of the corresponding feature value, which completes data vectorization. For discrete features, the feature values divide naturally. For example, the gender feature has two discrete values, male and female, and the education level feature has a limited set of values such as illiterate, semi-literate, primary school, junior middle school, senior middle school, training, undergraduate, and postgraduate. Discrete features therefore need no interval division based on the value distribution: the proportion of positive samples among the samples with each feature value is counted directly and used as the vectorization weight of that feature value. For continuous features, the intervals must first be determined from the range of feature values. The usual approach is to divide at equal quantile points. Suppose the values of a continuous feature follow a normal distribution and 10 intervals are required: find the equal-percentile points that split the distribution of the feature into 10 parts, then divide the feature into 10 intervals at the feature values corresponding to those points. After the intervals are divided, compute the proportion of positive samples in each interval and merge adjacent intervals whose proportions differ by less than a specified threshold ε. A χ² test is generally used to verify that the division is reasonable. Suppose the values of a continuous feature are finally divided into K intervals. Let $g_i$ denote the number of positive samples in the i-th interval, $b_i$ the number of negative samples in the i-th interval, $g$ the number of positive samples in the full data set, and $b$ the number of negative samples in the full data set. Let:
$$E_{g,i} = \frac{(g_i + b_i)\,g}{g + b}, \qquad E_{b,i} = \frac{(g_i + b_i)\,b}{g + b}$$
Then the statistic
$$S^2 = \sum_{i=1}^{K}\left[\frac{(g_i - E_{g,i})^2}{E_{g,i}} + \frac{(b_i - E_{b,i})^2}{E_{b,i}}\right]$$
follows a χ² distribution with K-1 degrees of freedom. Comparing $S^2$ with the critical value of the χ² distribution with K-1 degrees of freedom tests whether the numbers of positive and negative samples differ significantly between intervals. If the χ² test is satisfied, the division is reasonable; if not, the division is adjusted until the χ² test is satisfied. After a continuous feature has been divided into intervals according to its value range, the proportion of positive samples is counted in each interval, and that proportion is used as the vectorization weight of the feature values falling in the interval.
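The interval mapping and the χ² check can be illustrated with a short sketch. It assumes labels coded as 1 (positive) and 0 (negative), interval cut points supplied by the caller, and the standard Pearson χ² statistic for a 2 × K table of counts; the function names and the sample data are hypothetical.

```python
def positive_ratio_by_bin(values, labels, edges):
    """Map each value to an interval and return (interval index per sample, positive ratio per interval).

    edges are the interior cut points in ascending order, so there are len(edges) + 1 intervals.
    A value equal to a cut point falls into the lower interval.
    """
    k = len(edges) + 1
    pos = [0] * k
    tot = [0] * k
    bins = []
    for v, y in zip(values, labels):
        b = sum(1 for e in edges if v > e)   # index of the interval containing v
        bins.append(b)
        tot[b] += 1
        pos[b] += y
    ratios = [p / t if t else 0.0 for p, t in zip(pos, tot)]
    return bins, ratios


def chi_square(pos_counts, neg_counts):
    """Pearson chi-square statistic for a 2 x K table; assumes both classes occur."""
    g, b = sum(pos_counts), sum(neg_counts)
    n = g + b
    stat = 0.0
    for gi, bi in zip(pos_counts, neg_counts):
        if gi + bi == 0:
            continue                          # an empty interval contributes nothing
        exp_g = (gi + bi) * g / n             # expected positives in this interval
        exp_b = (gi + bi) * b / n             # expected negatives in this interval
        stat += (gi - exp_g) ** 2 / exp_g + (bi - exp_b) ** 2 / exp_b
    return stat


ages = [22, 25, 31, 38, 45, 52, 58, 63]
labels = [1, 1, 1, 0, 1, 0, 0, 0]
bins, ratios = positive_ratio_by_bin(ages, labels, edges=[30, 50])
weights = [ratios[b] for b in bins]           # vectorized feature: one weight per sample
print(ratios, chi_square([2, 2, 0], [0, 1, 3]))
```

The resulting statistic would then be compared with the critical value of the χ² distribution with K-1 degrees of freedom at the chosen significance level.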
Sampling: Sampling balances the classes by duplicating minority-class samples or by reducing majority-class samples. It includes undersampling and oversampling. Undersampling balances the two classes by reducing the number of majority-class samples. Oversampling duplicates minority-class samples until their number balances that of the majority class.
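A minimal sketch of both strategies, assuming random selection with a seeded generator and exact balancing to equal class sizes (the patent does not fix either choice; the names are illustrative):

```python
import random


def undersample(majority, minority, rng):
    """Randomly drop majority-class samples until both classes have equal size."""
    return rng.sample(majority, len(minority)), list(minority)


def oversample(majority, minority, rng):
    """Duplicate randomly chosen minority-class samples until both classes have equal size."""
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return list(majority), list(minority) + extra


rng = random.Random(0)
good = [f"good_{i}" for i in range(8)]   # majority class: customers who repaid
bad = ["bad_0", "bad_1"]                 # minority class: customers who defaulted

maj_u, min_u = undersample(good, bad, rng)
maj_o, min_o = oversample(good, bad, rng)
print(len(maj_u), len(min_u))
print(len(maj_o), len(min_o))
```

Undersampling discards information from the majority class, while oversampling repeats minority samples and can encourage overfitting, so the choice depends on how much data is available.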
Information gain (IG): Information gain measures how much information a feature attribute provides by comparing the cases in which the attribute does and does not occur in the samples. It reflects the contribution of the feature attribute to classification. Feature attribute reduction is achieved by discarding feature attributes with small information gain and retaining those with large information gain. The information gain of feature $t_k$ is calculated as:
$$IG(t_k) = -\sum_{i} P(c_i)\log P(c_i) + P(t_k)\sum_{i} P(c_i \mid t_k)\log P(c_i \mid t_k) + P(\bar{t}_k)\sum_{i} P(c_i \mid \bar{t}_k)\log P(c_i \mid \bar{t}_k)$$
In the formula, $P(c_i)$ is the prior probability of class $c_i$; $P(t_k)$ is the probability that feature attribute $t_k$ occurs in the entire training set; $P(\bar{t}_k)$ is the probability that feature $t_k$ does not occur; $P(c_i \mid t_k)$ is the probability that a sample belongs to class $c_i$ given that feature attribute $t_k$ occurs; and $P(c_i \mid \bar{t}_k)$ is the probability that a sample belongs to class $c_i$ given that feature attribute $t_k$ does not occur.
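For a feature that is either present or absent, the information gain can be computed directly from counts. The sketch below assumes base-2 logarithms and a boolean presence flag per sample; the function names are illustrative.

```python
from math import log2


def entropy_term(probs):
    """Sum of p * log2(p), skipping zero probabilities."""
    return sum(p * log2(p) for p in probs if p > 0)


def information_gain(present, labels):
    """Information gain of a binary feature (present[i] is True or False) for the class labels."""
    n = len(labels)
    classes = sorted(set(labels))
    with_t = [y for f, y in zip(present, labels) if f]
    without_t = [y for f, y in zip(present, labels) if not f]

    def class_probs(subset):
        return [subset.count(c) / len(subset) for c in classes] if subset else []

    ig = -entropy_term(class_probs(list(labels)))                       # prior entropy
    ig += len(with_t) / n * entropy_term(class_probs(with_t))           # feature occurs
    ig += len(without_t) / n * entropy_term(class_probs(without_t))     # feature does not occur
    return ig


labels = [1, 1, 0, 0]
print(information_gain([True, True, False, False], labels))   # feature separates the classes
print(information_gain([True, False, True, False], labels))   # feature carries no information
```

Features would then be ranked by this value, and those below a chosen cutoff dropped.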
Principal component analysis (PCA): PCA is a statistical method that uses an orthogonal transformation to convert a set of possibly correlated variables into a set of linearly uncorrelated variables; the converted variables are called principal components. Theorem: for any m × m real symmetric matrix $X_{m\times m}$ of rank r, there exists an orthogonal matrix $Q_{m\times r}$ such that $X_{m\times m} = Q_{m\times r}\,\Lambda_{r\times r}\,Q_{m\times r}^{T}$. Here $Q_{m\times r}$ is an m × r orthogonal matrix whose column vectors are eigenvectors of $X_{m\times m}$, and $\Lambda_{r\times r}$ is an r × r diagonal matrix whose diagonal elements are eigenvalues of $X_{m\times m}$. Therefore, to apply PCA dimensionality reduction to any matrix A, first convert it into a symmetric matrix X, generally by computing the covariance matrix of A; then compute the eigenvalues and eigenvectors of X to form the eigenvalue matrix Λ and the orthogonal matrix Q; finally, apply the matrix transformation $A' = A\,Q_{m\times r}$, which maps the vectorized matrix from the high-dimensional space to a low-dimensional space and thereby reduces the dimension of the matrix.
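The covariance, eigenvector, and projection steps can be illustrated without any numerical library. Note the substitutions: this sketch uses power iteration to obtain only the leading eigenvector (r = 1) in place of a full eigendecomposition, and it centers the data before projecting. Both are assumptions made for illustration, as are the function names.

```python
def covariance_matrix(rows):
    """Sample covariance matrix (m x m) and centered rows of an n x m data matrix."""
    n, m = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    centered = [[r[j] - means[j] for j in range(m)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(m)]
           for i in range(m)]
    return cov, centered


def leading_eigenvector(matrix, iterations=200):
    """Power iteration: unit eigenvector for the largest eigenvalue of a symmetric matrix."""
    m = len(matrix)
    v = [1.0] * m
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(m)) for i in range(m)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v


def project_to_1d(rows):
    """Project centered data onto the first principal component."""
    cov, centered = covariance_matrix(rows)
    q = leading_eigenvector(cov)
    return [sum(c[j] * q[j] for j in range(len(q))) for c in centered], q


# Two strongly correlated hypothetical attributes; the second is roughly twice the first.
data = [[1.0, 2.0], [2.0, 4.1], [3.0, 5.9], [4.0, 8.0]]
scores, component = project_to_1d(data)
print(component, scores)
```

Keeping more than one component would require the remaining eigenvectors, for example by deflating the covariance matrix or by using a linear algebra library.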
Generalized additive model (GAM): The basic form of the GAM is as follows:
$$g\big(E(Y)\big) = \alpha_0 + \sum_{i} s_i(x_i)$$
In the formula, $E(Y)$ is the expectation of the dependent variable; $g(\cdot)$ is the link function and is twice differentiable; $\alpha_0$ is the intercept, or constant term; $x_i$ are the independent variables; and $s_i(\cdot)$ are smooth functions. In the GAM, the dependent variable Y follows any distribution in the exponential family, which includes the normal, Poisson, binomial, and Gamma distributions, among others.
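To make the additive form concrete, the sketch below evaluates a GAM with a logit link for a binary default outcome. The two smooth terms, the intercept, and all names are hypothetical and fixed by hand; in practice each smooth function would be estimated from data, for example with splines.

```python
from math import exp


def s_age(age):
    """Hypothetical smooth term: risk grows as age moves away from 40."""
    return 0.002 * (age - 40) ** 2


def s_debt_ratio(ratio):
    """Hypothetical smooth term: risk grows quadratically with the debt ratio."""
    return 3.0 * ratio ** 2


def gam_default_probability(x, intercept, smooth_terms):
    """E(Y) for a logit-link additive model: g(E(Y)) = intercept + sum_i s_i(x_i)."""
    eta = intercept + sum(s(xi) for s, xi in zip(smooth_terms, x))
    return 1.0 / (1.0 + exp(-eta))    # inverse of the logit link


terms = [s_age, s_debt_ratio]
low_risk = gam_default_probability([40, 0.1], -2.0, terms)
high_risk = gam_default_probability([23, 0.9], -2.0, terms)
print(low_risk, high_risk)
```

Because each term is an arbitrary smooth function of a single variable, the model can capture nonlinear effects while the contribution of each variable remains easy to inspect.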
Gradient boosting decision tree (GBDT): GBDT is an iterative decision tree algorithm composed of multiple decision trees, in which the outputs of all trees are summed to give the final result. It combines decision trees with the Boosting method, generalizes well, and has an inherent advantage in discovering discriminative features and feature combinations.
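The summing of trees can be seen in a deliberately small sketch: gradient boosting with one-split regression trees (stumps) on a single feature, using squared loss so that each tree fits the current residuals. The loss, tree depth, learning rate, and names are assumptions for illustration; production GBDT implementations use deeper trees and other losses.

```python
def fit_stump(xs, residuals):
    """Best one-split regression stump (threshold, left value, right value) by squared error.

    Assumes xs contains at least two distinct values.
    """
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        lv, rv = sum(left) / len(left), sum(right) / len(right)
        err = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, threshold, lv, rv)
    return best[1:]


def fit_gbdt(xs, ys, n_trees=20, learning_rate=0.5):
    """Gradient boosting for squared loss: each stump fits the current residuals."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    trees = []
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, preds)]   # negative gradient of squared loss
        t, lv, rv = fit_stump(xs, residuals)
        trees.append((t, lv, rv))
        preds = [p + learning_rate * (lv if x <= t else rv) for x, p in zip(xs, preds)]
    return base, trees, learning_rate


def predict_gbdt(model, x):
    """Sum the base value and the scaled outputs of all stumps."""
    base, trees, lr = model
    return base + sum(lr * (lv if x <= t else rv) for t, lv, rv in trees)


xs = [1, 2, 3, 4, 5, 6]          # hypothetical feature, e.g. number of overdue payments
ys = [0, 0, 0, 1, 1, 1]          # 1 = defaulted
model = fit_gbdt(xs, ys)
print(predict_gbdt(model, 2), predict_gbdt(model, 5))
```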
Referring to Fig. 1, an embodiment of the present invention provides a loan customer credit scoring method, including the following steps:
101. Obtain customer information of at least one category for sample customers, where the categories of customer information include: basic customer information, customer asset and liability information, customer credit reference information, customer income and expenditure information, customer social information, customer historical behavior information, customer consumption information, and other supplementary customer information; and where the customer information of the at least one category is associated with the unique identifier of the sample customer.
In step 101, the customer information of each category can be extracted directly from the bank's internal data systems or databases. The unique identifier of a sample customer may be a user ID, for example a user account or the user's identity card information. After the customer information of the 8 dimensions is extracted in step 101, it is joined using the sample customer's unique identifier as the primary key and stored in a Hive file.
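The primary-key join can be illustrated as follows. The sketch uses in-memory dictionaries in place of Hive tables, and the category names, field names, and customer IDs are hypothetical.

```python
def join_by_customer_id(categories):
    """Merge per-category records into one wide record per customer ID.

    categories maps a category name to {customer_id: {field: value}}.
    """
    merged = {}
    for category, records in categories.items():
        for customer_id, fields in records.items():
            row = merged.setdefault(customer_id, {"customer_id": customer_id})
            for field, value in fields.items():
                row[f"{category}.{field}"] = value   # prefix keeps field names unique
    return merged


categories = {
    "basic": {"U001": {"age": 34}, "U002": {"age": 51}},
    "income_expenditure": {"U001": {"monthly_income": 8200}},
}
table = join_by_customer_id(categories)
print(table["U001"])
print(table["U002"])
```

A customer missing from one category simply lacks those fields after the join, which is one source of the missing values handled during preprocessing.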
102. Preprocess the customer information of each category to obtain sampled data.
The preprocessing in step 102 specifically includes processing the customer information of each category as follows:
S1. Perform data cleaning on the customer information of each category.
For example, the t% distribution can be used to clean outliers from the customer information of each category: samples between the upper and lower quantiles of the sample distribution computed under the t% distribution rule are retained, and samples outside the upper and lower quantiles are removed.
S2. Fill missing values in the customer information of each category after data cleaning.
In step S2, mean or median missing-value filling can be used to fill missing values in the customer information of each category.
S3. Perform data vectorization on the customer information of each category after missing-value filling.
In step S3, the feature interval ratio method can be used to vectorize the customer information of each category.
S4. Sample the customer information of each category after data vectorization to obtain the sampled data of the customer information of each category.
In step S4, undersampling or oversampling can be used to sample the customer information of each category.
S1-S4 in step 102 can be implemented on the Spark platform: the customer information of each category is read from Hive using the Scala language, and the customer information of each category is processed according to the rules of S1-S4.
103. Model on the sampled data corresponding to the customer information of each category and the default information of the sample customers, to generate a scoring model corresponding to the customer information of each category.
The modeling in step 103 specifically includes processing the customer information of each category as follows:
S1. Perform feature selection on the sampled data corresponding to the customer information of each category to select feature data.
For example, the information gain (IG) algorithm can be used to reduce the features of the sampled data corresponding to the customer information of each category preprocessed in step 102, and to select the feature data.
S2. Perform dimensionality reduction on the feature data.
In step S2, principal component analysis (PCA) can be used to reduce the dimension of the feature data. Steps S1 and S2 eliminate redundant information.
S3. Model on the feature data corresponding to the customer information of each category and the default information of the sample customers using a predetermined model self-learning algorithm, to generate a scoring model corresponding to the customer information of each category.
After dimensionality reduction of the feature data, step S3 can use the generalized additive model (GAM) or gradient boosting decision tree (GBDT) to model the feature data corresponding to the customer information of each category against the default information of the sample customers, training eight scoring models. S1-S3 in step 103 can be implemented on the Spark platform: the information gain (IG) algorithm and principal component analysis (PCA) in Scala, the generalized additive model (GAM) in R, and the gradient boosting decision tree (GBDT) in Scala.
104. Fuse the scoring models corresponding to the customer information of each category to obtain a credit scoring model.
In step 104, a genetic algorithm (GA) can be used to fuse the eight scoring models generated in step 103 into the credit scoring model. This process can be implemented in Scala on the Spark platform.
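The source does not specify how the genetic algorithm combines the models. One plausible reading, sketched below purely as an assumption, is a weighted average of the per-category scores whose weights are searched by a small GA with selection, one-point crossover, and mutation. The fitness function (negative mean squared error against observed defaults), the three-model example, and all names are hypothetical.

```python
import random


def fused_score(weights, scores):
    """Weighted average of the per-category model scores for one customer."""
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)


def fitness(weights, score_rows, defaults):
    """Negative mean squared error between fused scores and observed default labels."""
    errors = [(fused_score(weights, row) - y) ** 2 for row, y in zip(score_rows, defaults)]
    return -sum(errors) / len(errors)


def evolve_weights(score_rows, defaults, generations=40, population=20, seed=0):
    """Tiny genetic algorithm: keep the fitter half, then apply crossover and mutation."""
    rng = random.Random(seed)
    n = len(score_rows[0])
    pop = [[rng.uniform(0.01, 1.0) for _ in range(n)] for _ in range(population)]
    pop[0] = [1.0] * n                        # plain averaging as a baseline individual

    def fit(w):
        return fitness(w, score_rows, defaults)

    for _ in range(generations):
        pop.sort(key=fit, reverse=True)
        parents = pop[: population // 2]      # elitism: parents survive unchanged
        children = []
        while len(parents) + len(children) < population:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n) if n > 1 else 0
            child = a[:cut] + b[cut:]         # one-point crossover
            i = rng.randrange(n)
            child[i] = min(1.0, max(0.01, child[i] + rng.gauss(0, 0.1)))   # mutation
            children.append(child)
        pop = parents + children
    return max(pop, key=fit)


# Scores from three hypothetical category models for four sample customers.
score_rows = [[0.9, 0.6, 0.2], [0.1, 0.4, 0.7], [0.8, 0.5, 0.9], [0.2, 0.5, 0.1]]
defaults = [1, 0, 1, 0]
best = evolve_weights(score_rows, defaults)
print(best, fused_score(best, [0.85, 0.55, 0.4]))
```

Because the fitter half of each generation is carried over unchanged, the best fused model found is never worse on the training data than plain averaging of the category models.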
105. Input the customer information of the customer to be evaluated into the credit scoring model, and calculate the default information of the customer to be evaluated.
In the above solution, the loan customer credit scoring device first obtains customer information of at least one category for sample customers, where the categories of customer information include: basic customer information, customer asset and liability information, customer credit reference information, customer income and expenditure information, customer social information, customer historical behavior information, customer consumption information, and other supplementary customer information, and where the customer information of the at least one category is associated with the unique identifier of the sample customer. It then preprocesses the customer information of each category to obtain sampled data; models on the sampled data corresponding to the customer information of each category and the default information of the sample customers to generate a scoring model for the customer information of each category; fuses the scoring models of all categories to obtain a credit scoring model; and inputs the customer information of a customer to be evaluated into the credit scoring model to calculate that customer's default information. Because the method first models the customer information of each category separately, in that category's own dimension, and then fuses the scoring models generated across the multiple dimensions into the final credit scoring model, it increases the data dimensions analyzed by the credit scoring model and improves the accuracy of credit scoring results for loan customers. In addition, training the scoring models with the generalized additive model (GAM) or gradient boosting decision tree (GBDT) makes it possible to handle the nonlinear relationship between the factors that affect a user's personal credit and the personal credit score. Furthermore, preprocessing the customer information of each category improves data quality, and feature selection and feature dimensionality reduction eliminate redundant information, reduce computational complexity, and improve model accuracy.
Referring to Fig. 2, a loan customer credit scoring device is provided, including:
an input module 21, configured to obtain customer information of at least one category for sample customers, where the categories of customer information include: basic customer information, customer asset and liability information, customer credit reference information, customer income and expenditure information, customer social information, customer historical behavior information, customer consumption information, and other supplementary customer information; and where the customer information of the at least one category is associated with the unique identifier of the sample customer;
a preprocessing module 22, configured to preprocess the customer information of each category obtained by the input module 21 to obtain sampled data;
a modeling module 23, configured to model on the sampled data corresponding to the customer information of each category obtained by the preprocessing module 22 and the default information of the sample customers, to generate a scoring model corresponding to the customer information of each category;
the modeling module 23 being further configured to fuse the scoring models corresponding to the customer information of each category to obtain a credit scoring model;
a scoring module 24, configured to input the customer information of a customer to be evaluated into the credit scoring model obtained by the modeling module, and to calculate the default information of the customer to be evaluated.
In one exemplary solution, the preprocessing module 22 is specifically configured to: perform data cleaning on the customer information of each category; fill missing values in the customer information of each category after data cleaning; perform data vectorization on the customer information of each category after missing-value filling; and sample the customer information of each category after data vectorization to obtain the sampled data of the customer information of each category.
In one exemplary solution, the modeling module 23 is specifically configured to: perform feature selection on the sampled data corresponding to the customer information of each category to select feature data; and model on the feature data corresponding to the customer information of each category and the default information of the sample customers using a predetermined model self-learning algorithm, to generate a scoring model corresponding to the customer information of each category.
In one exemplary solution, the modeling module 23 is further configured to perform dimensionality reduction on the feature data.
In one exemplary solution, the predetermined model self-learning algorithm includes at least one of the following: the generalized additive model (GAM) and the gradient boosting decision tree (GBDT).
It should be noted that input module 21, preprocessing module 22, modeling module 23 and grading module 24 can be single The processor solely set up also can integrate and realize in some processor of controller, in addition it is also possible to program code Form is stored in the memory of controller, is called by some processor of controller and is executed the function of the above each unit. Processor described here can be central processing unit (Central Processing Unit, CPU) or specific Integrated circuit (Application Specific Integrated Circuit, ASIC), or be arranged to implement this Shen Please embodiment one or more integrated circuits.
It should be understood that magnitude of the sequence numbers of the above procedures are not meant to execute suitable in the various embodiments of the application Sequence it is successive, the execution of each process sequence should be determined by its function and internal logic, the implementation without coping with the embodiment of the present application Process constitutes any restriction.
In addition, a kind of calculating readable media (or medium) is also provided, including carrying out in above-described embodiment when executed The computer-readable instruction of the operation of method.
In addition, also providing a kind of computer program product, including above-mentioned computer-readable media (or medium).
It should be understood that in various embodiments of the present invention, magnitude of the sequence numbers of the above procedures are not meant to execute suitable Sequence it is successive, the execution of each process sequence should be determined by its function and internal logic, the implementation without coping with the embodiment of the present invention Process constitutes any restriction.
Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, or in a combination of computer software and electronic hardware. Whether these functions are performed in hardware or software depends on the specific application and the design constraints of the technical solution. Skilled practitioners may use different methods to implement the described functions for each specific application, but such implementations should not be considered to go beyond the scope of the present invention.
Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the systems, devices and units described above may refer to the corresponding processes in the foregoing method embodiments, and are not repeated here.
In the several embodiments provided in the present application, it should be understood that the disclosed systems, devices and methods may be implemented in other ways. For example, the device embodiments described above are merely illustrative. The division into units is only a division by logical function, and other divisions are possible in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices or units, and may be electrical, mechanical or in other forms.
The units described as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the various embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit.
If the functions are implemented in the form of software functional units and sold or used as an independent product, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the various embodiments of the present invention. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The above is merely a specific embodiment of the present invention, but the protection scope of the present invention is not limited thereto. Any change or replacement that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (11)

1. A loan customer credit scoring method, characterized by comprising:
Obtaining customer information of at least one category for a sample customer, wherein the categories of the customer information include: basic customer information, customer asset and liability information, customer credit report information, customer income and expenditure information, customer social information, customer historical behavior information, customer consumption information, and other supplementary customer information; and wherein the customer information of the at least one category is associated with a unique identification code of the sample customer;
Preprocessing the customer information of each category to obtain sampled data;
Modeling according to the sampled data corresponding to the customer information of each category and the default information of the sample customer, to generate a rating model corresponding to the customer information of each category;
Performing model fusion on the rating models corresponding to the customer information of each category to obtain a credit scoring model;
Inputting the customer information of a customer to be evaluated into the credit scoring model, and calculating the default information of the customer to be evaluated.
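The fusion and scoring steps of claim 1 can be sketched as follows. This is only an illustration under stated assumptions: the claim does not say how the per-category models are fused, so the weighted average used here is an assumption, as are the logistic form of each category model, the category names, and all weights. Every identifier is hypothetical.

```python
import math
from typing import Callable, Dict, List

# A per-category rating model maps that category's feature vector to a default probability.
CategoryModel = Callable[[List[float]], float]


def logistic_model(weights: List[float], bias: float) -> CategoryModel:
    """Build a toy per-category rating model (assumed logistic form)."""
    def score(features: List[float]) -> float:
        z = bias + sum(w * x for w, x in zip(weights, features))
        return 1.0 / (1.0 + math.exp(-z))
    return score


def fuse(models: Dict[str, CategoryModel],
         fusion_weights: Dict[str, float]) -> Callable[[Dict[str, List[float]]], float]:
    """Fuse per-category models into one credit scoring model by weighted averaging."""
    def credit_scoring_model(customer_info: Dict[str, List[float]]) -> float:
        # Only categories actually available for this customer contribute.
        present = [name for name in models if name in customer_info]
        if not present:
            raise ValueError("customer has no information in any modeled category")
        total = sum(fusion_weights[name] for name in present)
        return sum(fusion_weights[name] * models[name](customer_info[name])
                   for name in present) / total
    return credit_scoring_model


models = {
    "basic": logistic_model([0.0], 0.0),
    "credit_report": logistic_model([2.0], -1.0),
}
credit_model = fuse(models, {"basic": 1.0, "credit_report": 3.0})

# Customer information keyed by category; in a real system it would be joined on the unique ID.
p_full = credit_model({"basic": [1.0], "credit_report": [0.5]})
p_partial = credit_model({"basic": [1.0]})
p_clean_report = credit_model({"basic": [1.0], "credit_report": [0.0]})
```

Averaging over only the categories present lets the fused model score a customer for whom some categories are missing, which matches the "at least one category" wording of the claim.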
2. The loan customer credit scoring method according to claim 1, characterized in that preprocessing the customer information of each category to obtain sampled data comprises:
Performing data cleaning on the customer information of each category;
Filling missing values in the customer information of each category after data cleaning;
Performing data vectorization on the customer information of each category after missing-value filling;
Sampling the customer information of each category after data vectorization, to obtain the sampled data of the customer information of each category.
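The four preprocessing steps of claim 2 can be sketched as a small pipeline. This is a minimal sketch under assumptions, not the method of the application: cleaning is taken to mean dropping records without an ID and duplicate IDs, missing numeric values are filled with the mean, categorical values are one-hot encoded, and sampling is simple random sampling. The field names and all function names are hypothetical.

```python
import random


def clean(records):
    """Data cleaning: drop records with no customer ID and repeated customer IDs."""
    seen, cleaned = set(), []
    for record in records:
        customer_id = record.get("customer_id")
        if customer_id is None or customer_id in seen:
            continue
        seen.add(customer_id)
        cleaned.append(dict(record))
    return cleaned


def fill_missing(records, numeric_field):
    """Missing-value filling: replace None with the mean of the known values."""
    known = [r[numeric_field] for r in records if r[numeric_field] is not None]
    fill_value = sum(known) / len(known) if known else 0.0
    return [{**r, numeric_field: fill_value if r[numeric_field] is None else r[numeric_field]}
            for r in records]


def vectorize(records, numeric_field, categorical_field):
    """Data vectorization: numeric value followed by a one-hot encoding of the category."""
    levels = sorted({r[categorical_field] for r in records})
    vectors = {}
    for r in records:
        one_hot = [1.0 if r[categorical_field] == level else 0.0 for level in levels]
        vectors[r["customer_id"]] = [float(r[numeric_field])] + one_hot
    return vectors


def sample(vectors, k, seed=0):
    """Sampling: draw k customers at random (seeded for repeatability)."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(vectors), k)
    return {customer_id: vectors[customer_id] for customer_id in chosen}


records = [
    {"customer_id": "c1", "income": 5000.0, "education": "bachelor"},
    {"customer_id": "c2", "income": None, "education": "master"},
    {"customer_id": "c1", "income": 5000.0, "education": "bachelor"},  # duplicate ID
    {"customer_id": None, "income": 3000.0, "education": "bachelor"},  # no ID
    {"customer_id": "c3", "income": 7000.0, "education": "bachelor"},
]
cleaned = clean(records)
filled = fill_missing(cleaned, "income")
vectors = vectorize(filled, "income", "education")
sampled = sample(vectors, k=2)
```

The order matters: filling after cleaning keeps discarded records out of the mean, and vectorizing after filling guarantees that every vector is fully numeric before sampling.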
3. The loan customer credit scoring method according to claim 1, characterized in that modeling according to the sampled data corresponding to the customer information of each category and the default information of the sample customer, to generate a rating model corresponding to the customer information of each category, comprises:
Performing feature selection on the sampled data corresponding to the customer information of each category, to select feature data;
Modeling, using a predetermined model self-learning algorithm, according to the feature data corresponding to the customer information of each category and the default information of the sample customer, to generate the rating model corresponding to the customer information of each category.
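One simple way to realize the feature selection step of claim 3 is to rank features by their correlation with the default label. The claim does not name a selection criterion, so the absolute Pearson correlation used below is an assumption, and the function names and toy data are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either series is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def select_features(X, y, k):
    """Return the indices of the k features most correlated with the default label."""
    n_features = len(X[0])
    scores = [abs(pearson([row[j] for row in X], y)) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


# Toy sampled data: column 0 tracks the label, column 1 is constant, column 2 is unrelated.
X = [[1, 7, 0.1], [2, 7, 0.9], [3, 7, 0.2], [4, 7, 0.8]]
y = [0, 0, 1, 1]
selected = select_features(X, y, k=1)
feature_data = [[row[j] for j in selected] for row in X]
```

The resulting `feature_data` would then be passed, together with the default labels, to whichever self-learning algorithm is used for that category.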
4. The loan customer credit scoring method according to claim 1, characterized in that, before modeling with the predetermined model self-learning algorithm according to the feature data corresponding to the customer information of each category and the default information of the sample customer, the method further comprises:
Performing dimension reduction on the feature data.
5. The loan customer credit scoring method according to claim 3, characterized in that the predetermined model self-learning algorithm includes at least one of the following: a generalized additive model (GAM) and a gradient boosting decision tree (GBDT).
6. A loan customer credit scoring device, characterized by comprising:
An input module, configured to obtain customer information of at least one category for a sample customer, wherein the categories of the customer information include: basic customer information, customer asset and liability information, customer credit report information, customer income and expenditure information, customer social information, customer historical behavior information, customer consumption information, and other supplementary customer information; and wherein the customer information of the at least one category is associated with a unique identification code of the sample customer;
A preprocessing module, configured to preprocess the customer information of each category obtained by the input module to obtain sampled data;
A modeling module, configured to model according to the sampled data, obtained by the preprocessing module, corresponding to the customer information of each category and the default information of the sample customer, to generate a rating model corresponding to the customer information of each category;
The modeling module is further configured to perform model fusion on the rating models corresponding to the customer information of each category to obtain a credit scoring model;
A grading module, configured to input the customer information of a customer to be evaluated into the credit scoring model obtained by the modeling module, and calculate the default information of the customer to be evaluated.
7. The loan customer credit scoring device according to claim 6, characterized in that the preprocessing module is specifically configured to: perform data cleaning on the customer information of each category; fill missing values in the customer information of each category after data cleaning; perform data vectorization on the customer information of each category after missing-value filling; and sample the customer information of each category after data vectorization, to obtain the sampled data of the customer information of each category.
8. The loan customer credit scoring device according to claim 6, characterized in that the modeling module is specifically configured to: perform feature selection on the sampled data corresponding to the customer information of each category, to select feature data; and model, using a predetermined model self-learning algorithm, according to the feature data corresponding to the customer information of each category and the default information of the sample customer, to generate the rating model corresponding to the customer information of each category.
9. The loan customer credit scoring device according to claim 6, characterized in that the modeling module is further configured to perform dimension reduction on the feature data.
10. The loan customer credit scoring device according to claim 6, characterized in that the predetermined model self-learning algorithm includes at least one of the following: a generalized additive model (GAM) and a gradient boosting decision tree (GBDT).
11. A computer-readable storage medium storing one or more programs, characterized in that the one or more programs include instructions which, when executed by a computer, cause the computer to perform the loan customer credit scoring method according to any one of claims 1 to 5.
CN201810614063.4A 2018-06-14 2018-06-14 A kind of loan customer credit-graded approach and device Pending CN108898476A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810614063.4A CN108898476A (en) 2018-06-14 2018-06-14 A kind of loan customer credit-graded approach and device

Publications (1)

Publication Number Publication Date
CN108898476A true CN108898476A (en) 2018-11-27

Family

ID=64345939

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810614063.4A Pending CN108898476A (en) 2018-06-14 2018-06-14 A kind of loan customer credit-graded approach and device

Country Status (1)

Country Link
CN (1) CN108898476A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103632168A (en) * 2013-12-09 2014-03-12 天津工业大学 Classifier integration method for machine learning
CN107194795A (en) * 2016-03-15 2017-09-22 腾讯科技(深圳)有限公司 Credit score model training method, credit score computational methods and device
CN107292424A (en) * 2017-06-01 2017-10-24 四川新网银行股份有限公司 A kind of anti-fraud and credit risk forecast method based on complicated social networks
CN107481132A (en) * 2017-08-02 2017-12-15 上海前隆信息科技有限公司 A kind of credit estimation method and system, storage medium and terminal device

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110060144B (en) * 2019-03-18 2024-01-30 平安科技(深圳)有限公司 Method for training credit model, method, device, equipment and medium for evaluating credit
CN110060144A (en) * 2019-03-18 2019-07-26 平安科技(深圳)有限公司 Amount model training method, amount appraisal procedure, device, equipment and medium
CN111768285A (en) * 2019-04-01 2020-10-13 杭州金智塔科技有限公司 Credit wind control model construction system and method, wind control system and storage medium
CN110135973A (en) * 2019-04-23 2019-08-16 北京淇瑀信息科技有限公司 A kind of intelligent credit method based on IM and intelligent credit device
CN111161080A (en) * 2019-12-10 2020-05-15 中国建设银行股份有限公司 Information processing method and device
CN111695084A (en) * 2020-04-26 2020-09-22 北京奇艺世纪科技有限公司 Model generation method, credit score generation method, device, equipment and storage medium
CN112017062A (en) * 2020-07-15 2020-12-01 北京淇瑀信息科技有限公司 Resource limit distribution method and device based on guest group subdivision and electronic equipment
CN112132260A (en) * 2020-09-03 2020-12-25 深圳索信达数据技术有限公司 Training method, calling method, device and storage medium of neural network model
CN112529701A (en) * 2021-01-26 2021-03-19 四川享宇金信金融科技有限公司 Credit customer level evaluation method, device and equipment in wind control system
CN112907358A (en) * 2021-03-17 2021-06-04 平安消费金融有限公司 Loan user credit scoring method, loan user credit scoring device, computer equipment and storage medium
CN113538132A (en) * 2021-07-26 2021-10-22 天元大数据信用管理有限公司 Credit scoring method, device and medium based on regression tree algorithm
CN113538132B (en) * 2021-07-26 2024-04-23 天元大数据信用管理有限公司 Credit scoring method, equipment and medium based on regression tree algorithm
CN115456801A (en) * 2022-09-16 2022-12-09 北京曲速科技发展有限公司 Artificial intelligence big data wind control system, method and storage medium for personal credit
CN115545912A (en) * 2022-11-30 2022-12-30 联合赤道环境评价股份有限公司 Credit risk prediction method and device based on green identification information
CN115545912B (en) * 2022-11-30 2023-04-25 联合赤道环境评价股份有限公司 Credit risk prediction method and device based on green identification information

Similar Documents

Publication Publication Date Title
CN108898476A (en) A kind of loan customer credit-graded approach and device
TWI712981B (en) Risk identification model training method, device and server
CN108564286B (en) Artificial intelligent financial wind-control credit assessment method and system based on big data credit investigation
CN108846520B (en) Loan overdue prediction method, loan overdue prediction device and computer-readable storage medium
CN107025596B (en) Risk assessment method and system
CN108665159A (en) A kind of methods of risk assessment, device, terminal device and storage medium
CN107066616A (en) Method, device and electronic equipment for account processing
CN110956273A (en) Credit scoring method and system integrating multiple machine learning models
CN108133418A (en) Real-time credit risk management system
WO2015135321A1 (en) Method and device for mining social relationship based on financial data
US10521748B2 (en) Retention risk determiner
CN108665366A (en) Determine method, terminal device and the computer readable storage medium of consumer's risk grade
CN109816509A (en) Generation method, terminal device and the medium of scorecard model
CN112559900B (en) Product recommendation method and device, computer equipment and storage medium
CN110147389A (en) Account number treating method and apparatus, storage medium and electronic device
CN108564255A (en) Matching Model construction method, orphan's list distribution method, device, medium and terminal
CN110930038A (en) Loan demand identification method, loan demand identification device, loan demand identification terminal and loan demand identification storage medium
CN110751355A (en) Scientific and technological achievement assessment method and device
CN110119980A (en) A kind of anti-fraud method, apparatus, system and recording medium for credit
CN114612251A (en) Risk assessment method, device, equipment and storage medium
CN111090833A (en) Data processing method, system and related equipment
CN114638704A (en) Illegal fund transfer identification method and device, electronic equipment and storage medium
CN111667307B (en) Method and device for predicting financial product sales volume
CN109635289A (en) Entry classification method and audit information abstracting method
WO2022143431A1 (en) Method and apparatus for training anti-money laundering model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20181127