CN108898476A - Loan customer credit scoring method and device - Google Patents
- Publication number: CN108898476A
- Application number: CN201810614063.4A
- Authority: CN (China)
- Prior art keywords: information, classification, client, customer, customer information
- Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q40/00—Finance; Insurance; Tax strategies; Processing of corporate or income taxes
- G06Q40/03—Credit; Loans; Processing thereof
Abstract
An embodiment of the present invention discloses a loan customer credit scoring method and device, relating to the field of data processing, which can increase the number of data dimensions analyzed by the credit scoring model and improve the accuracy of credit scoring results for loan customers. The method includes: obtaining customer information of at least one category for sample customers, where the customer information of the at least one category is associated with the unique identifier of the sample customer; preprocessing the customer information of each category to obtain sampled data; modeling on the sampled data corresponding to the customer information of each category together with the default information of the sample customers, to generate a scoring model corresponding to the customer information of each category; performing model fusion on the scoring models of all categories to obtain a credit scoring model; and inputting the customer information of a customer to be evaluated into the credit scoring model to calculate the default information of that customer.
Description
Technical field
The embodiments of the present invention relate to the field of data processing, and in particular to a loan customer credit scoring method and device.
Background technique
Internet credit products such as P2P lending have emerged rapidly. Their fast and convenient application process gives borrowers more choices, while efficient approval and high-quality financial service have sharply raised customer satisfaction. Loan approval at traditional commercial banks is procedurally complex, time-consuming, and labor-intensive, and the resulting financial service is inefficient. Approval also depends heavily on subjective judgment and carries high risk, and the amount actually lent matches the customer's true creditworthiness poorly. In the Internet era, these factors have put the credit business of banks under heavy pressure.
Traditional commercial banks usually evaluate a user's credit from information such as the user's credit report, using a complicated review process, in order to approve credit business. Internet finance relies mainly on historical consumption behavior and personal credit information, and builds credit scoring models mainly with statistical methods such as discriminant analysis, linear regression, and logistic regression. The credit scoring models built in the prior art analyze few data dimensions, which leads to poor model accuracy and ultimately degrades the credit scoring results for loan customers. In addition, existing schemes mostly use linear modeling methods, whereas in practice the relationship between the factors that affect a user's personal credit and the personal credit score is usually nonlinear rather than simply linear. Linear models therefore cannot accurately reproduce the relationship between user information and the credit score, which again leads to poor model accuracy.
Summary of the invention
The embodiments of the present invention provide a loan customer credit scoring method and device that can increase the number of data dimensions analyzed by the credit scoring model and improve the accuracy of credit scoring results for loan customers.
In a first aspect, a loan customer credit scoring method is provided, including:
Obtaining customer information of at least one category for sample customers, where the categories of customer information include: basic customer information, customer asset and liability information, customer credit-report information, customer income and expenditure information, customer social information, customer historical behavior information, customer consumption information, and other supplementary customer information; the customer information of the at least one category is associated with the unique identifier of the sample customer;
Preprocessing the customer information of each category to obtain sampled data;
Modeling on the sampled data corresponding to the customer information of each category and the default information of the sample customers, to generate a scoring model corresponding to the customer information of each category;
Performing model fusion on the scoring models corresponding to the customer information of each category to obtain a credit scoring model;
Inputting the customer information of a customer to be evaluated into the credit scoring model and calculating the default information of the customer to be evaluated.
In a second aspect, a loan customer credit scoring device is provided, including:
An input module, configured to obtain customer information of at least one category for sample customers, where the categories of customer information include: basic customer information, customer asset and liability information, customer credit-report information, customer income and expenditure information, customer social information, customer historical behavior information, customer consumption information, and other supplementary customer information; the customer information of the at least one category is associated with the unique identifier of the sample customer;
A preprocessing module, configured to preprocess the customer information of each category obtained by the input module to obtain sampled data;
A modeling module, configured to model on the sampled data corresponding to the customer information of each category obtained by the preprocessing module and the default information of the sample customers, to generate a scoring model corresponding to the customer information of each category;
The modeling module is further configured to perform model fusion on the scoring models corresponding to the customer information of each category to obtain a credit scoring model;
A scoring module, configured to input the customer information of a customer to be evaluated into the credit scoring model obtained by the modeling module and calculate the default information of the customer to be evaluated.
In a third aspect, a computer-readable storage medium storing one or more programs is provided. The one or more programs include instructions which, when executed by a computer, cause the computer to perform the loan customer credit scoring method described in the first aspect.
In the above scheme, the loan customer credit scoring device first obtains customer information of at least one category for sample customers, where the categories of customer information include: basic customer information, customer asset and liability information, customer credit-report information, customer income and expenditure information, customer social information, customer historical behavior information, customer consumption information, and other supplementary customer information; the customer information of the at least one category is associated with the unique identifier of the sample customer. The device preprocesses the customer information of each category to obtain sampled data; models on the sampled data corresponding to the customer information of each category and the default information of the sample customers, to generate a scoring model corresponding to the customer information of each category; performs model fusion on the scoring models corresponding to the customer information of each category to obtain a credit scoring model; and inputs the customer information of a customer to be evaluated into the credit scoring model to calculate the default information of that customer. Because the method first models each category of customer information in its own dimension and then fuses the scoring models generated across these dimensions into the final credit scoring model, it increases the number of data dimensions analyzed by the credit scoring model and improves the accuracy of credit scoring results for loan customers.
Brief description of the drawings
To explain the technical solutions of the embodiments of the present invention more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are obviously only some embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of a loan customer credit scoring method provided by an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of a loan customer credit scoring device provided by an embodiment of the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments of the present invention. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
The embodiments of the present invention use the following technical terms:
T% distribution: Sort the samples in ascending order of attribute value and compute their distribution. From that distribution, find the upper t1% quantile and the lower t2% quantile. Samples between the two quantiles are normal samples; samples outside them are abnormal samples. The two percentages may be equal or different, that is, t1% = t2% or t1% ≠ t2%.
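The T% distribution rule can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names, the linear-interpolation quantile, and the reading of t1% and t2% as the share trimmed from the top and bottom of the distribution are all assumptions.

```python
import math

def quantile(sorted_values, q):
    """Return the q-quantile (0 <= q <= 1) using linear interpolation."""
    if not sorted_values:
        raise ValueError("no values")
    pos = q * (len(sorted_values) - 1)
    lo = math.floor(pos)
    hi = math.ceil(pos)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac

def split_by_quantiles(values, lower_pct, upper_pct):
    """Split values into (normal, abnormal) using lower/upper percent cut-offs.

    lower_pct and upper_pct are hypothetical parameter names for t2% and t1%.
    """
    ordered = sorted(values)
    low_cut = quantile(ordered, lower_pct / 100.0)
    high_cut = quantile(ordered, 1.0 - upper_pct / 100.0)
    normal = [v for v in values if low_cut <= v <= high_cut]
    abnormal = [v for v in values if v < low_cut or v > high_cut]
    return normal, abnormal

# Illustrative ages with one obviously wrong entry (300).
ages = [23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 300]
normal, abnormal = split_by_quantiles(ages, lower_pct=5, upper_pct=5)
```

Note that trimming by quantiles always flags the most extreme values on both sides, so the smallest age is marked abnormal here even though it is plausible; choosing t1% and t2% separately is one way to control this.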
Mean or median missing-value filling: For numeric attributes such as age or assets, missing values in that dimension can be filled with the mean or the median of the variable. The mean can be used for normally (symmetrically) distributed data, while the median should be used for skewed data. For character-type attributes such as occupation or education level, the share of each attribute value can be counted and the most frequent value used as the default to fill the missing values in that dimension.
Feature interval ratio method: Each feature value is mapped to a pre-divided interval, and the proportion of positive samples among the samples in that interval is used as the weight of the feature value; this completes data vectorization. Discrete features have a natural division. For example, the gender feature has two discrete values, male and female, and the education-level feature has a limited set of values such as illiterate, semi-literate, primary school, junior middle school, senior middle school, vocational training, undergraduate, and postgraduate. Discrete features therefore need no interval division based on the value distribution: the proportion of positive samples among the samples with each feature value is counted directly and used as the vectorization weight of that value. For a continuous feature, the intervals must first be determined from the range of feature values. The usual approach is to divide by quantiles. Suppose a continuous feature is normally distributed and must be divided into 10 intervals: find the 10 equal-percentage quantiles of its value distribution, then divide the feature into 10 intervals at the feature values corresponding to those quantiles. After the intervals are divided, compute the positive-sample proportion of each interval and merge adjacent intervals whose proportions differ by less than a specified threshold ε. A χ² test is generally used to verify that the division is reasonable. Suppose the values of a continuous feature are finally divided into K intervals. Let $g_i$ be the number of positive samples in the i-th interval, $b_i$ the number of negative samples in the i-th interval, $g$ the number of positive samples in the full data set, and $b$ the number of negative samples in the full data set. Let:
$$\hat{g}_i = (g_i + b_i)\,\frac{g}{g + b}, \qquad \hat{b}_i = (g_i + b_i)\,\frac{b}{g + b}$$
Then the statistic
$$S^2 = \sum_{i=1}^{K}\left[\frac{(g_i - \hat{g}_i)^2}{\hat{g}_i} + \frac{(b_i - \hat{b}_i)^2}{\hat{b}_i}\right]$$
follows a χ² distribution with K-1 degrees of freedom. Comparing $S^2$ with the critical value of the χ² distribution with K-1 degrees of freedom tests whether the numbers of positive and negative samples differ significantly between intervals. If the χ² test is satisfied, the division is reasonable; if not, the division is adjusted until the χ² test is satisfied. For a continuous feature, once the intervals have been divided according to the range of feature values, the positive-sample proportion is counted in each interval and used as the vectorization weight of the feature values that fall in that interval.
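The interval encoding and its χ² check can be sketched as below. This is an illustration under assumptions: labels are 1 for positive and 0 for negative samples, the statistic is written in the standard Pearson form, the interval-merging step with threshold ε is omitted, and all function names are hypothetical.

```python
def quantile_edges(values, n_bins):
    """Cut points that split the sorted values into n_bins roughly equal groups."""
    ordered = sorted(values)
    return [ordered[len(ordered) * k // n_bins] for k in range(1, n_bins)]

def bin_index(value, edges):
    """Index of the interval that value falls into."""
    return sum(1 for e in edges if value >= e)

def positive_rates(values, labels, edges):
    """Positive-sample share per interval; labels are 1 (positive) or 0."""
    pos = [0] * (len(edges) + 1)
    tot = [0] * (len(edges) + 1)
    for v, y in zip(values, labels):
        i = bin_index(v, edges)
        pos[i] += y
        tot[i] += 1
    return [p / t if t else 0.0 for p, t in zip(pos, tot)], pos, tot

def chi_square(pos, tot):
    """Pearson statistic comparing per-interval counts with the overall mix.

    Assumes the data set contains both positive and negative samples.
    """
    g, n = sum(pos), sum(tot)
    b = n - g
    stat = 0.0
    for g_i, n_i in zip(pos, tot):
        b_i = n_i - g_i
        exp_g, exp_b = n_i * g / n, n_i * b / n
        stat += (g_i - exp_g) ** 2 / exp_g + (b_i - exp_b) ** 2 / exp_b
    return stat

incomes = [1, 2, 3, 4, 5, 6, 7, 8]      # illustrative continuous feature
labels = [1, 1, 1, 0, 1, 0, 0, 0]
edges = quantile_edges(incomes, 2)      # two intervals for brevity
rates, pos, tot = positive_rates(incomes, labels, edges)
encoded = [rates[bin_index(v, edges)] for v in incomes]
stat = chi_square(pos, tot)
```

The resulting `stat` would then be compared against the χ² critical value with K-1 degrees of freedom (here K = 2) at a chosen significance level.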
Sampling: Sampling balances the classes by duplicating minority-class samples or reducing majority-class samples. It includes undersampling and oversampling. Undersampling balances the two classes by reducing the number of majority-class samples. Oversampling duplicates minority-class samples until their number matches that of the majority class.
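Both balancing strategies can be sketched in a few lines. The sketch assumes random selection with a fixed seed and that the majority class is at least as large as the minority class; the names are illustrative.

```python
import random

def undersample(majority, minority, rng):
    """Randomly drop majority samples until both classes have equal size."""
    return rng.sample(majority, len(minority)), list(minority)

def oversample(majority, minority, rng):
    """Duplicate randomly chosen minority samples until sizes match."""
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return list(majority), list(minority) + extra

rng = random.Random(0)  # fixed seed so the sketch is reproducible
non_default = [("good", i) for i in range(10)]
default = [("bad", i) for i in range(3)]

maj_u, min_u = undersample(non_default, default, rng)
maj_o, min_o = oversample(non_default, default, rng)
```

Undersampling discards information from the majority class, while oversampling by duplication adds no new information and can encourage overfitting to the repeated samples; which trade-off is acceptable depends on the data volume.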
Information gain (IG): Information gain measures the amount of information a feature attribute provides according to whether the attribute appears or does not appear in a sample; it reflects the attribute's contribution to classification. Feature attribute reduction is achieved by discarding attributes with small information gain and keeping those with large information gain. The information gain of feature $t_k$ is calculated as:
$$IG(t_k) = -\sum_{i} P(c_i)\log P(c_i) + P(t_k)\sum_{i} P(c_i \mid t_k)\log P(c_i \mid t_k) + P(\bar{t}_k)\sum_{i} P(c_i \mid \bar{t}_k)\log P(c_i \mid \bar{t}_k)$$
where $P(c_i)$ is the prior probability of class $c_i$, $P(t_k)$ is the probability that feature attribute $t_k$ appears in the whole training set, $P(\bar{t}_k)$ is the probability that feature $t_k$ does not appear, $P(c_i \mid t_k)$ is the probability that a sample belongs to class $c_i$ given that feature attribute $t_k$ is present, and $P(c_i \mid \bar{t}_k)$ is the probability that a sample belongs to class $c_i$ given that feature attribute $t_k$ is absent.
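The information gain of a binary (present/absent) feature can be computed as the class entropy minus the weighted entropies of the two subsets, which is equivalent to the definition above. This sketch assumes base-2 logarithms and uses hypothetical names and toy labels.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(feature_present, labels):
    """IG of a binary feature: H(C) - P(t) H(C|t) - P(not t) H(C|not t)."""
    with_t = [y for f, y in zip(feature_present, labels) if f]
    without_t = [y for f, y in zip(feature_present, labels) if not f]
    n = len(labels)
    gain = entropy(labels)
    for subset in (with_t, without_t):
        if subset:  # skip empty subsets, whose weight is zero
            gain -= len(subset) / n * entropy(subset)
    return gain

labels = ["default", "default", "ok", "ok"]
perfect = [True, True, False, False]   # feature separates the classes exactly
useless = [True, False, True, False]   # feature is independent of the class
ig_perfect = information_gain(perfect, labels)
ig_useless = information_gain(useless, labels)
```

A feature-reduction step would rank attributes by this value and keep those above a chosen cut-off; the cut-off itself is not specified in the text.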
Principal component analysis (PCA): PCA is a statistical method that uses an orthogonal transformation to convert a set of possibly correlated variables into a set of linearly uncorrelated variables; the converted variables are called principal components. Theorem: for any m × m real symmetric matrix $X_{m \times m}$ of rank r, there exists an orthogonal matrix $Q_{m \times r}$ such that:
$$X_{m \times m} = Q_{m \times r}\,\Lambda_{r \times r}\,Q_{m \times r}^{T}$$
where $Q_{m \times r}$ is an m × r orthogonal matrix whose column vectors are eigenvectors of $X_{m \times m}$, and $\Lambda_{r \times r}$ is an r × r diagonal matrix whose diagonal elements are eigenvalues of $X_{m \times m}$. Therefore, to reduce the feature dimension of any matrix A with PCA, first convert it into a symmetric matrix X, generally by computing the covariance matrix of A; then compute the eigenvalues and eigenvectors of X to form the eigenvalue matrix Λ and the orthogonal matrix Q; finally apply the matrix transformation
$$Y_{n \times r} = A_{n \times m}\,Q_{m \times r}$$
which maps the vectorized matrix from the high-dimensional space to a lower-dimensional space, thereby reducing the matrix dimension.
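The PCA procedure (covariance matrix, eigenvectors, projection) can be illustrated with the standard library alone. As a deliberate simplification, power iteration is swapped in for a full eigendecomposition, so the sketch recovers only the first principal component; it also projects mean-centered data. Function names and data are illustrative assumptions.

```python
import math

def covariance_matrix(rows):
    """Sample covariance matrix of the columns of rows, plus the centered rows."""
    n, m = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    centered = [[r[j] - means[j] for j in range(m)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(m)]
           for i in range(m)]
    return cov, centered

def top_eigenvector(matrix, iterations=200):
    """Dominant eigenvector of a symmetric matrix by power iteration.

    Assumes the uniform start vector is not orthogonal to that eigenvector.
    """
    m = len(matrix)
    vec = [1.0 / math.sqrt(m)] * m
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(m)) for i in range(m)]
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    return vec

# Two perfectly correlated features: one dimension carries all the variance.
data = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
cov, centered = covariance_matrix(data)
q1 = top_eigenvector(cov)
projected = [sum(c[j] * q1[j] for j in range(len(q1))) for c in centered]
```

In practice a numerical library would compute all eigenvectors at once and keep the r with the largest eigenvalues; the single-component version is only meant to make the three stages visible.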
Generalized additive model (GAM): The basic form of the GAM is:
$$g\big(E(Y)\big) = \alpha_0 + \sum_{i} s_i(x_i)$$
where $E(Y)$ is the expectation of the dependent variable; $g(\cdot)$ is the link function, which is twice differentiable; $\alpha_0$ is the intercept (constant term); $x_i$ are the independent variables; and $s_i(\cdot)$ are smooth functions. In the GAM, the dependent variable Y may follow any distribution in the exponential family; the normal, Poisson, binomial, and Gamma distributions, among others, belong to the exponential family.
Gradient boosting decision tree (GBDT): GBDT is an iterative decision tree algorithm composed of multiple decision trees, whose outputs are summed to give the final result. It combines decision trees with the Boosting method, generalizes well, and has a natural advantage in discovering discriminative features and feature combinations.
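The idea that each tree corrects the summed output of the previous trees can be shown with a deliberately tiny sketch: depth-1 trees (stumps) on a single feature, squared loss (whose negative gradient is the residual), and a fixed learning rate. These choices, the names, and the data are assumptions for illustration; a credit model would typically use deeper trees and a classification loss.

```python
def fit_stump(xs, residuals):
    """Best single-split regression stump on one feature (squared error)."""
    best = None
    for threshold in sorted(set(xs))[1:]:
        left = [r for x, r in zip(xs, residuals) if x < threshold]
        right = [r for x, r in zip(xs, residuals) if x >= threshold]
        lv, rv = sum(left) / len(left), sum(right) / len(right)
        err = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, threshold, lv, rv)
    return best[1:]

def predict(model, x):
    """Base value plus the learning-rate-scaled sum of all stump outputs."""
    base, rate, stumps = model
    return base + rate * sum(lv if x < t else rv for t, lv, rv in stumps)

def fit_gbdt(xs, ys, n_trees=50, rate=0.5):
    """Gradient boosting for squared loss: each stump fits the residuals."""
    base = sum(ys) / len(ys)
    model = (base, rate, [])
    for _ in range(n_trees):
        residuals = [y - predict(model, x) for x, y in zip(xs, ys)]
        model[2].append(fit_stump(xs, residuals))
    return model

# A non-monotonic target that no single split (and no linear model) can fit.
xs = [1, 2, 3, 4, 5, 6]
ys = [0.0, 0.0, 1.0, 1.0, 0.0, 0.0]
model = fit_gbdt(xs, ys)
mse = sum((y - predict(model, x)) ** 2 for x, y in zip(xs, ys)) / len(xs)
baseline_mse = sum((y - sum(ys) / len(ys)) ** 2 for y in ys) / len(ys)
```

Comparing `mse` with `baseline_mse` shows the boosted ensemble improving on the constant prediction, which is the nonlinear-fitting behavior the text attributes to GBDT.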
Referring to Fig. 1, an embodiment of the present invention provides a loan customer credit scoring method, including the following steps:
101. Obtain customer information of at least one category for sample customers, where the categories of customer information include: basic customer information, customer asset and liability information, customer credit-report information, customer income and expenditure information, customer social information, customer historical behavior information, customer consumption information, and other supplementary customer information; the customer information of the at least one category is associated with the unique identifier of the sample customer.
In step 101, the customer information of each category for the sample customers can be extracted directly from the bank's internal data systems or databases. The unique identifier of a sample customer can be a user ID, for example a user account or the user's ID-card information. After the customer information of the 8 dimensions is extracted in step 101, it is joined together using the sample customer's unique identifier as the primary key and stored in a Hive file.
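Joining the per-category records on the customer's unique identifier can be sketched with plain dictionaries. The inner-join behavior (dropping customers missing from any category), the field names, and the `category.field` column naming are assumptions; the text only states that the identifier is used as the primary key.

```python
def join_by_customer_id(tables):
    """Merge per-category records into one wide record per customer ID.

    tables maps a category name to {customer_id: {field: value}}.
    Only customers present in every category are kept (inner join).
    """
    common_ids = set.intersection(*(set(t) for t in tables.values()))
    joined = {}
    for cid in sorted(common_ids):
        row = {}
        for category, table in tables.items():
            for field, value in table[cid].items():
                row[f"{category}.{field}"] = value
        joined[cid] = row
    return joined

# Two hypothetical categories; only customer u1 appears in both.
tables = {
    "basic": {"u1": {"age": 30}, "u2": {"age": 45}},
    "credit": {"u1": {"overdue_count": 0}, "u3": {"overdue_count": 2}},
}
wide = join_by_customer_id(tables)
```

An outer join followed by missing-value filling would be an equally plausible reading, since the preprocessing step that follows includes a filling stage.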
102. Preprocess the customer information of each category to obtain sampled data.
The preprocessing in step 102 specifically applies the following processing to the customer information of each category:
S1. Perform data cleaning on the customer information of each category.
For example, the T% distribution can be used to clean the customer information of each category and remove abnormal values: according to the sample distribution obtained under the T% distribution rule, samples between the upper and lower quantiles are retained and samples outside them are removed.
S2. Fill missing values in the customer information of each category after data cleaning.
In step S2, mean or median missing-value filling can be used to fill the missing values in the customer information of each category.
S3. Vectorize the customer information of each category after missing-value filling.
In step S3, the feature interval ratio method can be used to vectorize the customer information of each category.
S4. Sample the vectorized customer information of each category to obtain the sampled data of the customer information of each category.
In step S4, undersampling or oversampling can be used to sample the customer information of each category.
Steps S1-S4 of step 102 can run on the Spark platform: the customer information of each category is read from Hive using the Scala language, and the data mining operations of S1-S4 are applied to the customer information of each category.
103. Model on the sampled data corresponding to the customer information of each category and the default information of the sample customers, to generate a scoring model corresponding to the customer information of each category.
The modeling in step 103 specifically applies the following processing to the customer information of each category:
S1. Perform feature selection on the sampled data corresponding to the customer information of each category to select feature data.
For example, the information gain (IG) algorithm can be used to reduce the features of the sampled data corresponding to the customer information of each category preprocessed in step 102 and select feature data.
S2. Perform dimensionality reduction on the feature data.
In step S2, principal component analysis (PCA) can be used to reduce the dimension of the feature data. Steps S1 and S2 eliminate redundant information.
S3. Model on the feature data corresponding to the customer information of each category and the default information of the sample customers using a predetermined model self-learning algorithm, to generate a scoring model corresponding to the customer information of each category.
After dimensionality reduction of the feature data, in step S3 the generalized additive model (GAM) or the gradient boosting decision tree (GBDT) can be used to model on the feature data corresponding to the customer information of each category and the default information of the sample customers, training eight scoring models. Steps S1-S3 of step 103 can run on the Spark platform, with the information gain (IG) algorithm and PCA implemented in Scala, GAM implemented in R, and GBDT implemented in Scala.
104. Perform model fusion on the scoring models corresponding to the customer information of each category to obtain the credit scoring model.
In step 104, a genetic algorithm (GA) can be used to fuse the eight scoring models generated in step 103 into the credit scoring model. This process can be implemented in Scala on the Spark platform.
105. Input the customer information of the customer to be evaluated into the credit scoring model and calculate the default information of the customer to be evaluated.
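Steps 104 and 105 can be sketched together. The text does not say how the genetic algorithm fuses the models, so this sketch assumes the simplest reading: the fused score is a weighted average of the per-category model scores, and the GA searches for the weights that minimize squared error against observed defaults. The function names, population settings, and toy scores are all hypothetical.

```python
import random

def fused_score(weights, scores):
    """Weighted average of the per-category model scores for one customer."""
    return sum(w * s for w, s in zip(weights, scores))

def loss(weights, score_rows, defaults):
    """Mean squared error between fused scores and observed default labels."""
    return sum((fused_score(weights, row) - y) ** 2
               for row, y in zip(score_rows, defaults)) / len(defaults)

def normalize(weights):
    """Rescale positive weights so they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]

def evolve_weights(score_rows, defaults, generations=60, pop_size=20, seed=0):
    """Tiny genetic algorithm: selection, crossover, mutation, elitism."""
    rng = random.Random(seed)
    k = len(score_rows[0])
    population = [[1.0 / k] * k]  # start with the equal-weight baseline
    population += [normalize([rng.random() + 1e-9 for _ in range(k)])
                   for _ in range(pop_size - 1)]
    for _ in range(generations):
        population.sort(key=lambda w: loss(w, score_rows, defaults))
        parents = population[: pop_size // 2]      # selection (elitism)
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = [(x + y) / 2 for x, y in zip(a, b)]         # crossover
            i = rng.randrange(k)
            child[i] = max(1e-9, child[i] + rng.gauss(0, 0.1))  # mutation
            children.append(normalize(child))
        population = parents + children
    return min(population, key=lambda w: loss(w, score_rows, defaults))

# Each row: predicted default probabilities from three category models.
score_rows = [[0.9, 0.4, 0.6], [0.2, 0.5, 0.4], [0.8, 0.6, 0.5], [0.1, 0.5, 0.3]]
defaults = [1, 0, 1, 0]
weights = evolve_weights(score_rows, defaults)

# Step 105: score a customer to be evaluated with the fused model.
new_customer_score = fused_score(weights, [0.7, 0.5, 0.6])
```

Because the best half of each generation is carried over unchanged, the fused loss never gets worse than the equal-weight starting point; with eight category models the same code applies with eight scores per row.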
In the above scheme, the loan customer credit scoring device first obtains customer information of at least one category for sample customers, where the categories of customer information include: basic customer information, customer asset and liability information, customer credit-report information, customer income and expenditure information, customer social information, customer historical behavior information, customer consumption information, and other supplementary customer information; the customer information of the at least one category is associated with the unique identifier of the sample customer. The device preprocesses the customer information of each category to obtain sampled data; models on the sampled data corresponding to the customer information of each category and the default information of the sample customers, to generate a scoring model corresponding to the customer information of each category; performs model fusion on the scoring models corresponding to the customer information of each category to obtain a credit scoring model; and inputs the customer information of a customer to be evaluated into the credit scoring model to calculate the default information of that customer. Because the method first models each category of customer information in its own dimension and then fuses the scoring models generated across these dimensions into the final credit scoring model, it increases the number of data dimensions analyzed by the credit scoring model and improves the accuracy of credit scoring results for loan customers. In addition, training the scoring models with the generalized additive model (GAM) or the gradient boosting decision tree (GBDT) makes it possible to capture the nonlinear relationship between the factors that affect a user's personal credit and the personal credit score. Furthermore, preprocessing the customer information of each category improves data quality, and feature selection and feature dimensionality reduction eliminate redundant information, reduce computational complexity, and improve model accuracy.
Referring to Fig. 2, a loan customer credit scoring device is provided, including:
An input module 21, configured to obtain customer information of at least one category for sample customers, where the categories of customer information include: basic customer information, customer asset and liability information, customer credit-report information, customer income and expenditure information, customer social information, customer historical behavior information, customer consumption information, and other supplementary customer information; the customer information of the at least one category is associated with the unique identifier of the sample customer;
A preprocessing module 22, configured to preprocess the customer information of each category obtained by the input module 21 to obtain sampled data;
A modeling module 23, configured to model on the sampled data corresponding to the customer information of each category obtained by the preprocessing module 22 and the default information of the sample customers, to generate a scoring model corresponding to the customer information of each category;
The modeling module 23 is further configured to perform model fusion on the scoring models corresponding to the customer information of each category to obtain a credit scoring model;
A scoring module 24, configured to input the customer information of a customer to be evaluated into the credit scoring model obtained by the modeling module and calculate the default information of the customer to be evaluated.
In an exemplary scheme, the preprocessing module 22 is specifically configured to: perform data cleaning on the customer information of each category; fill missing values in the customer information of each category after data cleaning; vectorize the customer information of each category after missing-value filling; and sample the vectorized customer information of each category to obtain the sampled data of the customer information of each category.
In an exemplary scheme, the modeling module 23 is specifically configured to: perform feature selection on the sampled data corresponding to the customer information of each category to select feature data; and model on the feature data corresponding to the customer information of each category and the default information of the sample customers using a predetermined model self-learning algorithm, to generate the scoring model corresponding to the customer information of each category.
In an exemplary scheme, the modeling module 23 is further configured to perform dimensionality reduction on the feature data.
In an exemplary scheme, the predetermined model self-learning algorithm includes at least one of the following: the generalized additive model (GAM) and the gradient boosting decision tree (GBDT).
It should be noted that the input module 21, the preprocessing module 22, the modeling module 23, and the scoring module 24 can each be a separately provided processor, or can be integrated into a processor of a controller. They can also be stored as program code in the memory of the controller, with a processor of the controller calling and executing the functions of the above units. The processor described here can be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present application.
It should be understood that, in the various embodiments of the present application, the sequence numbers of the above processes do not imply an order of execution. The execution order of the processes should be determined by their functions and internal logic, and should not limit the implementation of the embodiments of the present application in any way.
In addition, a computer-readable medium is provided, including computer-readable instructions that, when executed, perform the operations of the method in the above embodiments.
In addition, a computer program product is provided, including the above computer-readable medium.
A person of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in the embodiments disclosed herein can be implemented in electronic hardware, or in a combination of computer software and electronic hardware. Whether these functions are executed in hardware or software depends on the specific application and design constraints of the technical solution. Skilled practitioners may use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of the present invention.
Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the systems, devices, and units described above can be found in the corresponding processes of the foregoing method embodiments and are not repeated here.
In the several embodiments provided in the present application, it should be understood that the disclosed systems, devices and methods may be implemented in other ways. For example, the device embodiments described above are merely illustrative. The division into units is only a division by logical function, and other divisions are possible in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices or units, and may be electrical, mechanical or in another form.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the various embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit.
If the functions are implemented in the form of software functional units and sold or used as an independent product, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device or the like) to perform all or part of the steps of the methods described in the various embodiments of the present invention. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disc.
The above are merely specific embodiments of the present invention, but the scope of protection of the present invention is not limited thereto. Any change or replacement that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the scope of protection of the present invention. Therefore, the scope of protection of the present invention shall be subject to the scope of protection of the claims.
Claims (11)
1. A loan customer credit scoring method, characterized by comprising:
obtaining customer information of at least one category of a sample customer, the categories of the customer information including: basic customer information, customer asset and liability information, customer credit-reference information, customer income and expenditure information, customer social information, customer historical behavior information, customer consumption information, and other supplementary customer information; wherein the customer information of the at least one category is associated with a unique identification code of the sample customer;
preprocessing the customer information of each category to obtain sampled data;
modeling according to the sampled data corresponding to the customer information of each category and the default information of the sample customer, to generate a scoring model corresponding to the customer information of each category;
performing model fusion on the scoring models corresponding to the customer information of each category to obtain a credit scoring model;
inputting customer information of a customer to be evaluated into the credit scoring model, and calculating default information of the customer to be evaluated.
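By way of non-limiting illustration, the fusion step of claim 1 can be sketched as follows. The claim does not fix a fusion rule, so the weighted average used here is an assumption, and `fuse_scores`, the category names, the weights and the stand-in models are all hypothetical.

```python
from typing import Callable, Dict, Mapping

# A per-category scoring model maps that category's features to a default probability.
CategoryModel = Callable[[Mapping[str, float]], float]


def fuse_scores(
    models: Dict[str, CategoryModel],
    weights: Dict[str, float],
    customer_info: Dict[str, Mapping[str, float]],
) -> float:
    """Fuse per-category default probabilities by weighted average (assumed rule).

    Categories that are missing for this customer are skipped, and the
    remaining weights are renormalised.
    """
    total_weight = 0.0
    weighted_sum = 0.0
    for category, model in models.items():
        if category not in customer_info:
            continue
        w = weights[category]
        weighted_sum += w * model(customer_info[category])
        total_weight += w
    if total_weight == 0.0:
        raise ValueError("no customer information for any modelled category")
    return weighted_sum / total_weight


# Hypothetical stand-in models; real ones would be trained on sample-customer data.
models = {
    "basic": lambda x: 0.10 if x["age"] >= 30 else 0.20,
    "credit": lambda x: min(1.0, 0.05 * x["overdue_count"]),
}
weights = {"basic": 1.0, "credit": 3.0}

# Customer information grouped by category for one customer to be evaluated.
customer = {"basic": {"age": 35}, "credit": {"overdue_count": 4}}
default_probability = fuse_scores(models, weights, customer)
```

In this sketch a customer for whom only some categories are available is still scored, because the weights of the available categories are renormalised.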
2. The loan customer credit scoring method according to claim 1, characterized in that preprocessing the customer information of each category to obtain sampled data comprises:
performing data cleaning on the customer information of each category;
filling missing values in the customer information of each category after data cleaning;
performing data vectorization on the customer information of each category after missing-value filling;
sampling the customer information of each category after data vectorization to obtain the sampled data of the customer information of each category.
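The four preprocessing steps of claim 2 can be sketched, under stated assumptions, as below. The claim does not specify how cleaning, filling, vectorization or sampling are performed; de-duplication by customer ID, mean filling, one-hot encoding and simple random sampling are assumed choices, and all function and field names are hypothetical.

```python
import random
from statistics import mean


def clean(records):
    """Data cleaning: drop records without an ID and keep the first record per customer ID."""
    seen = set()
    cleaned = []
    for r in records:
        cid = r.get("customer_id")
        if cid is None or cid in seen:
            continue
        seen.add(cid)
        cleaned.append(dict(r))
    return cleaned


def fill_missing(records, numeric_fields):
    """Missing-value filling: replace None with the mean of the observed values."""
    for f in numeric_fields:
        observed = [r[f] for r in records if r.get(f) is not None]
        fill = mean(observed) if observed else 0.0
        for r in records:
            if r.get(f) is None:
                r[f] = fill
    return records


def vectorize(records, numeric_fields, categorical_field, categories):
    """Vectorization: numeric fields followed by a one-hot encoding of one categorical field."""
    vectors = {}
    for r in records:
        one_hot = [1.0 if r.get(categorical_field) == c else 0.0 for c in categories]
        vectors[r["customer_id"]] = [float(r[f]) for f in numeric_fields] + one_hot
    return vectors


def sample(vectors, k, seed=0):
    """Sampling: draw up to k customers without replacement."""
    rng = random.Random(seed)
    ids = rng.sample(sorted(vectors), min(k, len(vectors)))
    return {cid: vectors[cid] for cid in ids}


raw = [
    {"customer_id": "A1", "income": 5000, "debt": 1000, "job": "salaried"},
    {"customer_id": "A2", "income": None, "debt": 3000, "job": "self_employed"},
    {"customer_id": "A1", "income": 5000, "debt": 1000, "job": "salaried"},  # duplicate
    {"customer_id": "A3", "income": 7000, "debt": None, "job": "salaried"},
]
records = fill_missing(clean(raw), ["income", "debt"])
vectors = vectorize(records, ["income", "debt"], "job", ["salaried", "self_employed"])
sampled = sample(vectors, k=2)
```

Keying every vector by the customer ID keeps each category's data associated with the unique identification code, so that the categories can later be joined per customer.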
3. The loan customer credit scoring method according to claim 1, characterized in that modeling according to the sampled data corresponding to the customer information of each category and the default information of the sample customer, to generate the scoring model corresponding to the customer information of each category, comprises:
performing feature selection on the sampled data corresponding to the customer information of each category to select feature data;
modeling, using a predetermined model self-learning algorithm, according to the feature data corresponding to the customer information of each category and the default information of the sample customer, to generate the scoring model corresponding to the customer information of each category.
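As one possible reading of the feature selection step in claim 3, the sketch below ranks features by the absolute Pearson correlation between each feature column and the default label and keeps the top `k`. The claim does not name a selection criterion, so the correlation filter is an assumption, and `pearson` and `select_features` are hypothetical helper names.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 when either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def select_features(rows, defaults, k):
    """Return the indices of the k features most correlated with the default label."""
    n_features = len(rows[0])
    scores = []
    for j in range(n_features):
        column = [row[j] for row in rows]
        scores.append((abs(pearson(column, defaults)), j))
    # Highest correlation first; ties are broken by the lower feature index.
    scores.sort(key=lambda t: (-t[0], t[1]))
    return sorted(j for _, j in scores[:k])


# Toy sampled data: feature 0 tracks default closely, feature 1 is constant,
# feature 2 is only weakly related.
rows = [[1, 5, 3], [2, 5, 1], [3, 5, 4], [4, 5, 2]]
defaults = [0, 0, 1, 1]  # 1 = the sample customer defaulted
selected = select_features(rows, defaults, k=2)
```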
4. The loan customer credit scoring method according to claim 1, characterized in that, before modeling using the predetermined model self-learning algorithm according to the feature data corresponding to the customer information of each category and the default information of the sample customer, the method further comprises:
performing dimensionality reduction on the feature data.
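The dimensionality reduction of claim 4 is not tied to a particular technique. As an assumed example only, the sketch below projects the feature data onto its first principal component, found by power iteration on the sample covariance matrix; `first_principal_component` and `reduce_to_1d` are hypothetical names.

```python
from math import sqrt


def first_principal_component(rows, iterations=100):
    """Return (feature means, unit vector of the first principal component)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the centered data.
    cov = [
        [sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
        for a in range(d)
    ]
    # Power iteration; the all-ones start is assumed not to be orthogonal
    # to the leading eigenvector.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return means, v


def reduce_to_1d(rows, means, component):
    """Project each row onto the principal component (one value per customer)."""
    return [
        sum((r[j] - means[j]) * component[j] for j in range(len(component)))
        for r in rows
    ]


# Toy feature data lying on the line y = 2x, so one dimension captures all variance.
features = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
means, component = first_principal_component(features)
reduced = reduce_to_1d(features, means, component)
```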
5. The loan customer credit scoring method according to claim 3, characterized in that the predetermined model self-learning algorithm includes at least any one of the following: a generalized additive model (GAM) and a gradient boosting decision tree (GBDT).
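To illustrate the GBDT option named in claim 5, the following is a deliberately small sketch of gradient boosting with depth-1 regression trees (stumps) under squared-error loss. It is a simplification for exposition, not the model of the disclosure: a production scoring model would more typically use a logistic loss, deeper trees and an established library, and the helper names are hypothetical.

```python
def fit_stump(rows, residuals):
    """Find the single (feature, threshold) split that minimises squared error."""
    best = None
    best_err = float("inf")
    for j in range(len(rows[0])):
        for threshold in sorted({r[j] for r in rows}):
            left = [res for r, res in zip(rows, residuals) if r[j] <= threshold]
            right = [res for r, res in zip(rows, residuals) if r[j] > threshold]
            if not left or not right:
                continue
            lv, rv = sum(left) / len(left), sum(right) / len(right)
            err = sum((x - lv) ** 2 for x in left) + sum((x - rv) ** 2 for x in right)
            if err < best_err:
                best_err, best = err, (j, threshold, lv, rv)
    if best is None:  # no valid split: fall back to a constant prediction
        m = sum(residuals) / len(residuals)
        best = (0, float("inf"), m, m)
    return best


def stump_predict(stump, row):
    j, threshold, lv, rv = stump
    return lv if row[j] <= threshold else rv


def fit_gbdt(rows, labels, n_rounds=20, learning_rate=0.5):
    """Each round fits a stump to the residuals (negative gradient of squared loss)."""
    base = sum(labels) / len(labels)
    preds = [base] * len(rows)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(labels, preds)]
        stump = fit_stump(rows, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump_predict(stump, r) for p, r in zip(preds, rows)]
    return base, stumps, learning_rate


def predict(model, row):
    """Predicted default score, clipped to [0, 1]."""
    base, stumps, lr = model
    score = base + lr * sum(stump_predict(s, row) for s in stumps)
    return min(1.0, max(0.0, score))


# Toy feature data for one category: a single feature, default above 2.
rows = [[1.0], [2.0], [3.0], [4.0]]
labels = [0, 0, 1, 1]
model = fit_gbdt(rows, labels)
low_risk = predict(model, [1.5])
high_risk = predict(model, [3.5])
```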
6. A loan customer credit scoring device, characterized by comprising:
an input module, configured to obtain customer information of at least one category of a sample customer, the categories of the customer information including: basic customer information, customer asset and liability information, customer credit-reference information, customer income and expenditure information, customer social information, customer historical behavior information, customer consumption information, and other supplementary customer information; wherein the customer information of the at least one category is associated with a unique identification code of the sample customer;
a preprocessing module, configured to preprocess the customer information of each category obtained by the input module to obtain sampled data;
a modeling module, configured to model according to the sampled data corresponding to the customer information of each category obtained by the preprocessing module and the default information of the sample customer, to generate a scoring model corresponding to the customer information of each category;
the modeling module being further configured to perform model fusion on the scoring models corresponding to the customer information of each category to obtain a credit scoring model;
a scoring module, configured to input customer information of a customer to be evaluated into the credit scoring model obtained by the modeling module, and to calculate default information of the customer to be evaluated.
7. The loan customer credit scoring device according to claim 6, characterized in that the preprocessing module is specifically configured to: perform data cleaning on the customer information of each category; fill missing values in the customer information of each category after data cleaning; perform data vectorization on the customer information of each category after missing-value filling; and sample the customer information of each category after data vectorization to obtain the sampled data of the customer information of each category.
8. The loan customer credit scoring device according to claim 6, characterized in that the modeling module is specifically configured to: perform feature selection on the sampled data corresponding to the customer information of each category to select feature data; and model, using a predetermined model self-learning algorithm, according to the feature data corresponding to the customer information of each category and the default information of the sample customer, to generate the scoring model corresponding to the customer information of each category.
9. The loan customer credit scoring device according to claim 6, characterized in that the modeling module is further configured to perform dimensionality reduction on the feature data.
10. The loan customer credit scoring device according to claim 6, characterized in that the predetermined model self-learning algorithm includes at least any one of the following: a generalized additive model (GAM) and a gradient boosting decision tree (GBDT).
11. A computer-readable storage medium storing one or more programs, characterized in that the one or more programs include instructions which, when executed by a computer, cause the computer to perform the loan customer credit scoring method according to any one of claims 1 to 5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810614063.4A CN108898476A (en) | 2018-06-14 | 2018-06-14 | A kind of loan customer credit-graded approach and device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810614063.4A CN108898476A (en) | 2018-06-14 | 2018-06-14 | A kind of loan customer credit-graded approach and device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108898476A true CN108898476A (en) | 2018-11-27 |
Family
ID=64345939
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810614063.4A Pending CN108898476A (en) | 2018-06-14 | 2018-06-14 | A kind of loan customer credit-graded approach and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108898476A (en) |
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110060144A (en) * | 2019-03-18 | 2019-07-26 | 平安科技(深圳)有限公司 | Amount model training method, amount appraisal procedure, device, equipment and medium |
CN110135973A (en) * | 2019-04-23 | 2019-08-16 | 北京淇瑀信息科技有限公司 | A kind of intelligent credit method based on IM and intelligent credit device |
CN111161080A (en) * | 2019-12-10 | 2020-05-15 | 中国建设银行股份有限公司 | Information processing method and device |
CN111695084A (en) * | 2020-04-26 | 2020-09-22 | 北京奇艺世纪科技有限公司 | Model generation method, credit score generation method, device, equipment and storage medium |
CN111768285A (en) * | 2019-04-01 | 2020-10-13 | 杭州金智塔科技有限公司 | Credit wind control model construction system and method, wind control system and storage medium |
CN112017062A (en) * | 2020-07-15 | 2020-12-01 | 北京淇瑀信息科技有限公司 | Resource limit distribution method and device based on guest group subdivision and electronic equipment |
CN112132260A (en) * | 2020-09-03 | 2020-12-25 | 深圳索信达数据技术有限公司 | Training method, calling method, device and storage medium of neural network model |
CN112529701A (en) * | 2021-01-26 | 2021-03-19 | 四川享宇金信金融科技有限公司 | Credit customer level evaluation method, device and equipment in wind control system |
CN112907358A (en) * | 2021-03-17 | 2021-06-04 | 平安消费金融有限公司 | Loan user credit scoring method, loan user credit scoring device, computer equipment and storage medium |
CN113538132A (en) * | 2021-07-26 | 2021-10-22 | 天元大数据信用管理有限公司 | Credit scoring method, device and medium based on regression tree algorithm |
CN115456801A (en) * | 2022-09-16 | 2022-12-09 | 北京曲速科技发展有限公司 | Artificial intelligence big data wind control system, method and storage medium for personal credit |
CN115545912A (en) * | 2022-11-30 | 2022-12-30 | 联合赤道环境评价股份有限公司 | Credit risk prediction method and device based on green identification information |
CN113538132B (en) * | 2021-07-26 | 2024-04-23 | 天元大数据信用管理有限公司 | Credit scoring method, equipment and medium based on regression tree algorithm |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103632168A (en) * | 2013-12-09 | 2014-03-12 | 天津工业大学 | Classifier integration method for machine learning |
CN107194795A (en) * | 2016-03-15 | 2017-09-22 | 腾讯科技(深圳)有限公司 | Credit score model training method, credit score computational methods and device |
CN107292424A (en) * | 2017-06-01 | 2017-10-24 | 四川新网银行股份有限公司 | A kind of anti-fraud and credit risk forecast method based on complicated social networks |
CN107481132A (en) * | 2017-08-02 | 2017-12-15 | 上海前隆信息科技有限公司 | A kind of credit estimation method and system, storage medium and terminal device |
Cited By (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110060144B (en) * | 2019-03-18 | 2024-01-30 | 平安科技(深圳)有限公司 | Method for training credit model, method, device, equipment and medium for evaluating credit |
CN110060144A (en) * | 2019-03-18 | 2019-07-26 | 平安科技(深圳)有限公司 | Amount model training method, amount appraisal procedure, device, equipment and medium |
CN111768285A (en) * | 2019-04-01 | 2020-10-13 | 杭州金智塔科技有限公司 | Credit wind control model construction system and method, wind control system and storage medium |
CN110135973A (en) * | 2019-04-23 | 2019-08-16 | 北京淇瑀信息科技有限公司 | A kind of intelligent credit method based on IM and intelligent credit device |
CN111161080A (en) * | 2019-12-10 | 2020-05-15 | 中国建设银行股份有限公司 | Information processing method and device |
CN111695084A (en) * | 2020-04-26 | 2020-09-22 | 北京奇艺世纪科技有限公司 | Model generation method, credit score generation method, device, equipment and storage medium |
CN112017062A (en) * | 2020-07-15 | 2020-12-01 | 北京淇瑀信息科技有限公司 | Resource limit distribution method and device based on guest group subdivision and electronic equipment |
CN112132260A (en) * | 2020-09-03 | 2020-12-25 | 深圳索信达数据技术有限公司 | Training method, calling method, device and storage medium of neural network model |
CN112529701A (en) * | 2021-01-26 | 2021-03-19 | 四川享宇金信金融科技有限公司 | Credit customer level evaluation method, device and equipment in wind control system |
CN112907358A (en) * | 2021-03-17 | 2021-06-04 | 平安消费金融有限公司 | Loan user credit scoring method, loan user credit scoring device, computer equipment and storage medium |
CN113538132A (en) * | 2021-07-26 | 2021-10-22 | 天元大数据信用管理有限公司 | Credit scoring method, device and medium based on regression tree algorithm |
CN113538132B (en) * | 2021-07-26 | 2024-04-23 | 天元大数据信用管理有限公司 | Credit scoring method, equipment and medium based on regression tree algorithm |
CN115456801A (en) * | 2022-09-16 | 2022-12-09 | 北京曲速科技发展有限公司 | Artificial intelligence big data wind control system, method and storage medium for personal credit |
CN115545912A (en) * | 2022-11-30 | 2022-12-30 | 联合赤道环境评价股份有限公司 | Credit risk prediction method and device based on green identification information |
CN115545912B (en) * | 2022-11-30 | 2023-04-25 | 联合赤道环境评价股份有限公司 | Credit risk prediction method and device based on green identification information |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20181127 |