CN106204246A - BP neural network credit evaluation method based on PCA - Google Patents

Info

Publication number
CN106204246A
Authority
CN
China
Prior art keywords
prime
data
sample
sample data
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610686766.9A
Other languages
Chinese (zh)
Inventor
詹进林
庄国强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
YLZ INFORMATION TECHNOLOGY Co Ltd
Original Assignee
YLZ INFORMATION TECHNOLOGY Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by YLZ INFORMATION TECHNOLOGY Co Ltd filed Critical YLZ INFORMATION TECHNOLOGY Co Ltd
Priority to CN201610686766.9A priority Critical patent/CN106204246A/en
Publication of CN106204246A publication Critical patent/CN106204246A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/06Buying, selling or leasing transactions
    • G06Q30/0601Electronic shopping [e-shopping]
    • G06Q30/0609Buyer or seller confidence or verification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/03Credit; Loans; Processing thereof

Abstract

The invention discloses a BP (back-propagation) neural network credit evaluation method based on PCA (principal component analysis). Government data relating to an individual is extracted from bank data and combined with the bank's credit evaluation result for that individual to form sample data. Normalizing the sample data improves predictive performance, and reducing its dimensionality with PCA handles complex, multidimensional indicators and better meets the demands of big-data processing. The bank's credit evaluation result for the individual serves as the reference for training the BP neural network model, yielding a credit evaluation model based on government big data. The model overcomes the subjectivity of expert scoring, provides credit inquiry to enterprises and individuals, supplements the credit systems of financial institutions, and offers higher classification accuracy, practicality, and better evaluation performance.

Description

BP neural network credit evaluation method based on PCA
Technical field
The present invention relates to a BP neural network credit evaluation method based on principal component analysis (PCA).
Background
At present, credit evaluation is carried out mainly by financial institutions, which obtain credit reports on enterprises and individuals by having professionals analyze and assess the business data the institutions themselves collect. Relying solely on a financial institution's own business data tends to make the conclusions one-sided. When a financial institution faces a client about whom little information is available, it often cannot produce a useful credit evaluation.
With the arrival of big data, fused analysis of data from multiple sources has become mainstream, and government big data in particular plays a very important role as a reference. The Chinese government holds more than 80% of the data, but because of various restrictions and departmental interests the data cannot be shared and mostly sits isolated and dormant. An evaluation model built on government data benefits from the completeness, breadth, and privacy of that data, so its results are more convincing and have greater social value.
The data relating to individuals in government big data mainly includes human resources and social security data, education data, healthcare data, and employment data. Such data is high-dimensional, which increases computational complexity.
Summary of the invention
The object of the invention is to provide a BP neural network credit evaluation method based on PCA that overcomes the subjectivity of expert scoring, offers higher classification accuracy, practicality, and better evaluation performance, handles complex, multidimensional indicators, and better meets the demands of big-data processing.
The BP neural network credit evaluation method based on PCA of the invention comprises the following steps:
Step 1: extract the government data relating to an individual from bank data and combine it with the bank's credit evaluation result for that individual to form sample data; normalize the sample data to obtain the processed sample data matrix $X'$:
$$
X' = \begin{pmatrix}
x'_{11} & x'_{12} & \cdots & x'_{1p} \\
x'_{21} & x'_{22} & \cdots & x'_{2p} \\
\vdots & \vdots & \ddots & \vdots \\
x'_{n1} & x'_{n2} & \cdots & x'_{np}
\end{pmatrix}
$$
where $x'_{ij}$ denotes the j-th indicator of the i-th sample.
Step 11: for numeric indicators of the sample data, normalize using the following formula:
where $i = 1, 2, 3, \ldots, n$, with n the total number of samples; $j = 1, 2, 3, \ldots, p$, with p the total number of numeric indicators; $x_{ij}$ denotes the j-th numeric indicator of the i-th sample; $\max\{x_{ij}\}$ and $\min\{x_{ij}\}$ denote the largest and smallest indicator values in the i-th sample data; and $x'_{ij}$ denotes the j-th indicator of the i-th sample after normalization.
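The formula for step 11 is missing from this extracted text. A min-max form consistent with the symbols defined above would be the following; it is a reconstruction under that assumption, not the verified formula of the patent:

```latex
x'_{ij} = \frac{x_{ij} - \min\{x_{ij}\}}{\max\{x_{ij}\} - \min\{x_{ij}\}}
```

Under this form every normalized numeric indicator lies in the interval [0, 1].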
Step 12: for non-numeric indicators of the sample data, normalize using the following formula:
where $i = 1, 2, 3, \ldots, n$, with n the total number of samples; $j = 1, 2, 3, \ldots, q$, with q the total number of non-numeric indicators; $x_{ij}$ denotes the j-th non-numeric indicator of the i-th sample; m indexes the final classification grade; K is the number of classification grades; $w_m$ is the weight assigned to classification grade m; $N_m(x_{ij})$ is the number of samples that share the attribute value of $x_{ij}$ and fall in classification grade m; and $N(x_{ij})$ is the total number of samples that share the attribute value of $x_{ij}$.
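The formula for step 12 is also missing from this extracted text. One form that uses every symbol defined above is a grade-weighted proportion; this is an assumed reconstruction and may differ from the patent's actual formula:

```latex
x'_{ij} = \sum_{m=1}^{K} w_m \, \frac{N_m(x_{ij})}{N(x_{ij})}
```

Read this way, each attribute value is replaced by the weighted share of its samples in each classification grade.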
Step 2: apply principal component analysis (PCA) to reduce the dimensionality of the sample data matrix normalized in step 1, and determine the principal factors affecting credit evaluation and the ranking of each factor, specifically as follows:
Step 21: from the sample data matrix $X'$ normalized in step 1:
$$
X' = \begin{pmatrix}
x'_{11} & x'_{12} & \cdots & x'_{1p} \\
x'_{21} & x'_{22} & \cdots & x'_{2p} \\
\vdots & \vdots & \ddots & \vdots \\
x'_{n1} & x'_{n2} & \cdots & x'_{np}
\end{pmatrix}
$$
build the covariance matrix R, which reflects how closely the normalized sample data are correlated:
$$
R = \begin{pmatrix}
\dfrac{\sum_{i=1}^{n}(x'_{i1}-\bar{u}_1)(x'_{i1}-\bar{u}_1)}{n-1} & \dfrac{\sum_{i=1}^{n}(x'_{i2}-\bar{u}_2)(x'_{i1}-\bar{u}_1)}{n-1} & \cdots & \dfrac{\sum_{i=1}^{n}(x'_{ip}-\bar{u}_p)(x'_{i1}-\bar{u}_1)}{n-1} \\
\dfrac{\sum_{i=1}^{n}(x'_{i1}-\bar{u}_1)(x'_{i2}-\bar{u}_2)}{n-1} & \dfrac{\sum_{i=1}^{n}(x'_{i2}-\bar{u}_2)(x'_{i2}-\bar{u}_2)}{n-1} & \cdots & \dfrac{\sum_{i=1}^{n}(x'_{ip}-\bar{u}_p)(x'_{i2}-\bar{u}_2)}{n-1} \\
\vdots & \vdots & \ddots & \vdots \\
\dfrac{\sum_{i=1}^{n}(x'_{i1}-\bar{u}_1)(x'_{ip}-\bar{u}_p)}{n-1} & \dfrac{\sum_{i=1}^{n}(x'_{i2}-\bar{u}_2)(x'_{ip}-\bar{u}_p)}{n-1} & \cdots & \dfrac{\sum_{i=1}^{n}(x'_{ip}-\bar{u}_p)(x'_{ip}-\bar{u}_p)}{n-1}
\end{pmatrix}
$$
where $\bar{u}_p$ is the mean of the p-th column of $X'$. The covariance matrix R is a real symmetric matrix, i.e. $R_{ij} = R_{ji}$.
Step 22: compute the eigenvalues and eigenvectors of the covariance matrix R:
Solve the characteristic equation $|\lambda I - R| = 0$, where I is the identity matrix, to obtain the eigenvalues $\lambda_i$, $i = 1, 2, \ldots, p$, and arrange them in order of magnitude.
For each eigenvalue $\lambda_i$ obtain the corresponding eigenvector $e_i$, requiring $\|e_i\| = 1$, i.e. $\sum_{j=1}^{p} e_{ij}^2 = 1$, where $e_{ij}$ denotes the j-th component of $e_i$.
Select m eigenvectors $e_i$ to form a matrix and multiply it by the sample data $X_i$ to obtain m principal components $F_i$, where $m < p$, by the following formula:
where $F_{i1}$ is called the first principal component of the i-th sample $x_i$.
Step 23: compute the contribution rate and cumulative contribution rate of the principal components:
Eigenvalues and principal components correspond one to one: the i-th principal component $F_i$ is obtained from the eigenvector $e_i$ corresponding to the i-th eigenvalue $\lambda_i$, and its contribution rate is computed from $\lambda_i$. The contribution rate of the i-th principal component $F_i$ is given by:
The cumulative contribution rate is given by:
Take the m eigenvalues whose cumulative contribution rate reaches 85% to 95%, together with their m corresponding principal components, where $m < p$, to obtain the new training sample F:
where the values in the matrix F are computed by formula (1-3).
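The formulas referenced in steps 22 and 23 are missing from this extracted text. The standard PCA expressions below are consistent with the surrounding definitions; the symbol $\eta_i$ for the contribution rate is introduced here only for illustration, and the exact layout of the original formulas is an assumption:

```latex
% Projection of the i-th normalized sample onto the k-th eigenvector
F_{ik} = \sum_{j=1}^{p} e_{kj}\, x'_{ij}, \qquad k = 1, 2, \ldots, m

% Contribution rate of the i-th principal component
\eta_i = \frac{\lambda_i}{\sum_{k=1}^{p} \lambda_k}

% Cumulative contribution rate of the first i principal components
\sum_{k=1}^{i} \eta_k = \frac{\sum_{k=1}^{i} \lambda_k}{\sum_{k=1}^{p} \lambda_k}

% New training sample: n samples by m principal components
F = \begin{pmatrix}
F_{11} & F_{12} & \cdots & F_{1m} \\
F_{21} & F_{22} & \cdots & F_{2m} \\
\vdots & \vdots & \ddots & \vdots \\
F_{n1} & F_{n2} & \cdots & F_{nm}
\end{pmatrix}
```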
Step 3: build the personal credit evaluation model with a BP neural network:
Step 31: design the BP neural network topology:
Following the Kolmogorov theorem, build a three-layer BP neural network consisting of an input layer, a hidden layer, and an output layer. The number of input-layer nodes equals m, the number of principal components of the new training sample F produced in step 2; the output layer has 1 node; and the number of hidden-layer nodes is determined by the Lippmann empirical formula. The training function is TRAINLM, the adaptive learning function is LEARNGDM, the performance function is MSE, the hidden-layer transfer function is TANSIG, and the output-layer transfer function is PURELIN. The connection weights and thresholds between the nodes of each layer are adjusted according to the Delta learning rule. The network global error E is computed as:
where p is the number of training samples, $E_t$ is the network training error of the t-th training sample, $z_t$ is the actual network output for the t-th training sample, and $c_t$ is the bank's known credit evaluation result for the individual in the t-th training sample.
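The formula for E is missing from this extracted text. A sum-of-squares form commonly paired with the Delta learning rule, written with the symbols defined above, would be the following; the factor 1/2 and the absence of a 1/p normalization are assumptions:

```latex
E = \sum_{t=1}^{p} E_t, \qquad E_t = \frac{1}{2}\,(z_t - c_t)^2
```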
Step 32: train the BP neural network model:
Train on the dimension-reduced sample data F from step 2, taking part of F as training data and the remainder as test data. Set the learning rate and the momentum factor, and initialize the connection weights and output thresholds of each layer with random numbers in (-1, 1). Training uses the Delta learning rule. Preset an accuracy value for the network global error E and a number of training iterations, then compute E; if E falls below the preset accuracy value or the number of training iterations reaches the set value, stop training and obtain the BP neural network model.
Step 33: evaluate the personal credit model:
Feed the test data into the BP neural network model trained in step 32. If the classification accuracy reaches the threshold, the model is considered to classify well and passes evaluation; otherwise return to step 31 and readjust the BP neural network topology until the trained model reaches the threshold classification accuracy when evaluating the credit of the test data.
Step 4: input the data to be evaluated into the personal credit evaluation model built in step 3 and output the evaluation result.
The data relating to individuals in the government data mainly includes human resources and social security data, education data, healthcare data, and employment data.
The invention extracts government data relating to an individual from bank data and combines it with the bank's credit evaluation result for that individual to form sample data. Normalizing the sample data improves predictive performance, and reducing its dimensionality with PCA handles complex, multidimensional indicators and better meets the demands of big-data processing. Using the bank's credit evaluation result for the individual as the reference for training the BP neural network model yields a credit evaluation model based on government big data that overcomes the subjectivity of expert scoring, provides credit inquiry to enterprises and individuals, supplements the credit systems of financial institutions, and offers higher classification accuracy, practicality, and better evaluation performance.
Detailed description
The BP neural network credit evaluation method based on PCA of the invention proceeds as follows:
Step 1: extract the government data relating to an individual from bank data and combine it with the bank's credit evaluation result for that individual to form sample data. The data relating to individuals in the government data mainly includes human resources and social security data, education data, healthcare data, and employment data. For example, basic personal information includes sex, age, education level, and marital status; health status data includes healthcare spending and whether the person has a major disease; employment status data includes the employer, the type of employer, and out-of-work status; social security status data includes social security contribution records. Normalize the sample data to obtain the processed sample data matrix $X'$:
$$
X' = \begin{pmatrix}
x'_{11} & x'_{12} & \cdots & x'_{1p} \\
x'_{21} & x'_{22} & \cdots & x'_{2p} \\
\vdots & \vdots & \ddots & \vdots \\
x'_{n1} & x'_{n2} & \cdots & x'_{np}
\end{pmatrix}
$$
where $x'_{ij}$ denotes the j-th indicator of the i-th sample.
Step 11: for numeric indicators of the sample data, normalize using the following formula:
where $i = 1, 2, 3, \ldots, n$ and $j = 1, 2, 3, \ldots, p$, with n the total number of samples and p the total number of numeric indicators; $x_{ij}$ denotes the j-th numeric indicator of the i-th sample; $\max\{x_{ij}\}$ and $\min\{x_{ij}\}$ denote the largest and smallest indicator values in the i-th sample data; and $x'_{ij}$ denotes the j-th indicator of the i-th sample after preprocessing.
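As a minimal sketch of the numeric normalization in step 11, the snippet below applies min-max scaling with NumPy. It assumes the maximum and minimum are taken per indicator column across all samples, which is the common convention; the extracted text is ambiguous on this point. The function name `min_max_normalize` and the sample values are hypothetical.

```python
import numpy as np

def min_max_normalize(x):
    """Scale each column of x to [0, 1]; constant columns map to 0."""
    x = np.asarray(x, dtype=float)
    col_min = x.min(axis=0)
    col_range = x.max(axis=0) - col_min
    # Assumption: guard against division by zero when a column is constant.
    safe_range = np.where(col_range == 0, 1.0, col_range)
    return (x - col_min) / safe_range

# Hypothetical numeric indicators: age and annual healthcare spending.
samples = np.array([[25, 1200.0],
                    [40, 300.0],
                    [55, 2100.0]])
normalized = min_max_normalize(samples)
```

Each column of `normalized` then spans 0 to 1, so indicators measured in different units become comparable before PCA.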
Step 12: for non-numeric indicators of the sample data, normalize using the following formula:
where $i = 1, 2, 3, \ldots, n$ and $j = 1, 2, 3, \ldots, q$, with n the total number of samples and q the total number of non-numeric indicators; $x_{ij}$ denotes the j-th non-numeric indicator of the i-th sample; m indexes the final classification grade; K is the number of classification grades (for example, if the grades are excellent, good, medium, and poor, then K is 4); $w_m$ is the weight assigned to classification grade m; $N_m(x_{ij})$ is the number of samples that share the attribute value of $x_{ij}$ and fall in classification grade m; and $N(x_{ij})$ is the total number of samples that share the attribute value of $x_{ij}$.
For example, the indicator sex has two attributes, male and female. Suppose that of 100 records, 60 are male and 40 are female; the classification grades are excellent, good, medium, and poor, with weights 0.4, 0.3, 0.2, and 0.1 assigned by grade; and the numbers of males and females in each classification grade are as shown in the table below:
Sex      Excellent  Good  Medium  Poor
Male     30         10    10      10
Female   10         6     2       2
Then:
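The computed values for this example are missing from the extracted text. The sketch below works the example under two assumptions: the normalized value is the grade-weighted proportion of samples, and the denominator is the row total in the table (60 for male, 20 for female; the female row sums to 20 rather than the 40 stated in the text). The function name `categorical_score` is hypothetical.

```python
def categorical_score(grade_counts, weights):
    """Weighted share of samples per grade for one attribute value."""
    total = sum(grade_counts)  # assumed to play the role of N(x_ij)
    return sum(w * c for w, c in zip(weights, grade_counts)) / total

# Grade weights in the order excellent, good, medium, poor (from the text).
weights = [0.4, 0.3, 0.2, 0.1]
# Counts per grade taken from the table.
counts = {"male": [30, 10, 10, 10], "female": [10, 6, 2, 2]}
scores = {sex: categorical_score(c, weights) for sex, c in counts.items()}
```

Under these assumptions the male attribute maps to 18/60 = 0.30 and the female attribute to 6.4/20 = 0.32.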
Step 2: apply principal component analysis (PCA) to reduce the dimensionality of the sample data matrix normalized in step 1, and determine the principal factors affecting credit evaluation and the ranking of each factor. PCA is a mathematical transformation that, following the idea of dimensionality reduction, converts many indicators into a few composite indicators: it turns a given set of correlated variables into another set of uncorrelated variables by a linear transformation, and the new variables are arranged in order of decreasing variance. The specific steps are as follows:
Step 21: from the sample data matrix $X'$ normalized in step 1:
$$
X' = \begin{pmatrix}
x'_{11} & x'_{12} & \cdots & x'_{1p} \\
x'_{21} & x'_{22} & \cdots & x'_{2p} \\
\vdots & \vdots & \ddots & \vdots \\
x'_{n1} & x'_{n2} & \cdots & x'_{np}
\end{pmatrix}
$$
build the covariance matrix R, which reflects how closely the normalized sample data are correlated:
$$
R = \begin{pmatrix}
\dfrac{\sum_{i=1}^{n}(x'_{i1}-\bar{u}_1)(x'_{i1}-\bar{u}_1)}{n-1} & \dfrac{\sum_{i=1}^{n}(x'_{i2}-\bar{u}_2)(x'_{i1}-\bar{u}_1)}{n-1} & \cdots & \dfrac{\sum_{i=1}^{n}(x'_{ip}-\bar{u}_p)(x'_{i1}-\bar{u}_1)}{n-1} \\
\dfrac{\sum_{i=1}^{n}(x'_{i1}-\bar{u}_1)(x'_{i2}-\bar{u}_2)}{n-1} & \dfrac{\sum_{i=1}^{n}(x'_{i2}-\bar{u}_2)(x'_{i2}-\bar{u}_2)}{n-1} & \cdots & \dfrac{\sum_{i=1}^{n}(x'_{ip}-\bar{u}_p)(x'_{i2}-\bar{u}_2)}{n-1} \\
\vdots & \vdots & \ddots & \vdots \\
\dfrac{\sum_{i=1}^{n}(x'_{i1}-\bar{u}_1)(x'_{ip}-\bar{u}_p)}{n-1} & \dfrac{\sum_{i=1}^{n}(x'_{i2}-\bar{u}_2)(x'_{ip}-\bar{u}_p)}{n-1} & \cdots & \dfrac{\sum_{i=1}^{n}(x'_{ip}-\bar{u}_p)(x'_{ip}-\bar{u}_p)}{n-1}
\end{pmatrix}
$$
where $\bar{u}_p$ is the mean of the p-th column of $X'$. The covariance matrix R is a real symmetric matrix, i.e. $R_{ij} = R_{ji}$.
Step 22: compute the eigenvalues and eigenvectors of the covariance matrix R:
Solve the characteristic equation $|\lambda I - R| = 0$, where I is the identity matrix, to obtain the eigenvalues $\lambda_i$, $i = 1, 2, \ldots, p$, and arrange them in descending order, i.e. $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_p \ge 0$.
For each eigenvalue $\lambda_i$ obtain the corresponding eigenvector $e_i$, requiring $\|e_i\| = 1$, i.e. $\sum_{j=1}^{p} e_{ij}^2 = 1$, where $e_{ij}$ denotes the j-th component of $e_i$.
Select m eigenvectors $e_i$ to form a matrix and multiply it by the sample data $X_i$ to obtain m principal components $F_i$, where $m < p$, by the following formula:
where $F_{i1}$ is called the first principal component of the i-th sample $x_i$.
Step 23: compute the contribution rate and cumulative contribution rate of the principal components:
Eigenvalues and principal components correspond one to one: the i-th principal component $F_i$ is obtained from the eigenvector $e_i$ corresponding to the i-th eigenvalue $\lambda_i$, and its contribution rate is computed from $\lambda_i$. The contribution rate of the i-th principal component $F_i$ is given by:
The cumulative contribution rate is given by:
Take the eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_m$ whose cumulative contribution rate reaches 85% to 95%, together with the corresponding 1st, 2nd, ..., m-th principal components, where $m < p$, to obtain the new training sample F:
where the values in the matrix F are obtained by formula (1-3).
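Steps 21 to 23 can be sketched with NumPy as follows. This is an illustration under stated assumptions, not the patent's implementation: the contribution rate is taken as each eigenvalue divided by the sum of all eigenvalues, the normalized data is projected without centering (following the text's "multiply by the sample data"), and the names `pca_reduce` and `x_demo` are hypothetical.

```python
import numpy as np

def pca_reduce(x_norm, target=0.85):
    """Return projected samples, contribution rates, and component count."""
    x = np.asarray(x_norm, dtype=float)
    centered = x - x.mean(axis=0)
    # Covariance matrix R with the n - 1 denominator used in step 21.
    cov = centered.T @ centered / (x.shape[0] - 1)
    # eigh handles real symmetric matrices; eigenvectors have unit norm.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]          # descending eigenvalues
    eigvals = np.clip(eigvals[order], 0.0, None)
    eigvecs = eigvecs[:, order]
    contribution = eigvals / eigvals.sum()
    cumulative = np.cumsum(contribution)
    # Smallest count whose cumulative contribution reaches the target.
    n_components = min(int(np.searchsorted(cumulative, target)) + 1,
                       len(eigvals))
    scores = x @ eigvecs[:, :n_components]     # new training sample F
    return scores, contribution, n_components

# Hypothetical data: two strongly correlated indicators plus small noise.
rng = np.random.default_rng(0)
base = rng.normal(size=(50, 1))
x_demo = np.hstack([base,
                    2 * base + 0.01 * rng.normal(size=(50, 1)),
                    0.01 * rng.normal(size=(50, 1))])
scores, contribution, n_components = pca_reduce(x_demo, target=0.85)
```

Because the first two columns of `x_demo` are almost perfectly correlated, a single principal component carries nearly all of the variance and the sketch keeps one column.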
Step 3: build the personal credit evaluation model with a BP neural network:
Step 31: design the BP neural network topology:
Following the Kolmogorov theorem, build a three-layer BP neural network consisting of an input layer, a hidden layer, and an output layer. The number of input-layer nodes equals m, the number of principal components of the new training sample F produced in step 2; the output layer has 1 node; and the number of hidden-layer nodes is determined by the Lippmann empirical formula. The training function is TRAINLM, the adaptive learning function is LEARNGDM, the performance function is MSE, the hidden-layer transfer function is TANSIG, and the output-layer transfer function is PURELIN. The connection weights and thresholds between the nodes of each layer are adjusted according to the Delta learning rule. The network global error E is computed as:
where p is the number of training samples, $E_t$ is the network training error of the t-th training sample, $z_t$ is the actual network output for the t-th training sample, and $c_t$ is the bank's known credit evaluation result for the individual in the t-th training sample.
Step 32: train the BP neural network:
Train on the dimension-reduced sample data F from step 2, taking 70% of F as training data and the remaining 30% as test data. Set the learning rate to 0.6 and the momentum factor to 0.5, and initialize the connection weights and output thresholds of each layer with random numbers in (-1, 1). Training uses the Delta learning rule. Preset the accuracy value of the network global error E to 0.5 and the maximum number of training iterations to 5000, then compute E; if E falls below the preset accuracy value or the number of training iterations reaches the set value, stop training and obtain the BP neural network model.
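The training loop of step 32 can be sketched in NumPy as below. Several points are assumptions rather than the patent's method: the sketch uses plain batch gradient descent with momentum instead of the Levenberg-Marquardt routine that TRAINLM names, it takes E as half the summed squared error, the hidden size is a free parameter because the Lippmann formula is not given in the text, and the demo call uses a smaller learning rate than the stated 0.6 to keep the toy run stable. The name `train_bp` and the demo data are hypothetical.

```python
import numpy as np

def train_bp(features, targets, hidden=5, lr=0.6, momentum=0.5,
             target_error=0.5, max_epochs=5000, seed=0):
    """Train a three-layer network: tanh hidden layer, linear output."""
    rng = np.random.default_rng(seed)
    n, m = features.shape
    # Weights and thresholds drawn uniformly from (-1, 1), as in step 32.
    w1 = rng.uniform(-1, 1, (m, hidden))
    b1 = rng.uniform(-1, 1, hidden)
    w2 = rng.uniform(-1, 1, (hidden, 1))
    b2 = rng.uniform(-1, 1, 1)
    params = [w1, b1, w2, b2]
    velocity = [np.zeros_like(p) for p in params]
    y = np.asarray(targets, dtype=float).reshape(-1, 1)
    history = []
    for _ in range(max_epochs):
        h = np.tanh(features @ w1 + b1)        # hidden layer
        z = h @ w2 + b2                        # linear output layer
        err = z - y
        error = 0.5 * float(np.sum(err ** 2))  # assumed form of E
        history.append(error)
        if error < target_error:
            break
        d_out = err / n                        # gradient averaged over samples
        d_hid = (d_out @ w2.T) * (1.0 - h ** 2)
        grads = [features.T @ d_hid, d_hid.sum(axis=0),
                 h.T @ d_out, d_out.sum(axis=0)]
        for p, v, g in zip(params, velocity, grads):
            v *= momentum                      # momentum term
            v -= lr * g
            p += v                             # in-place parameter update
    return params, history

# Hypothetical toy data standing in for the reduced samples F.
rng = np.random.default_rng(1)
x_toy = rng.normal(size=(200, 3))
y_toy = (x_toy[:, 0] + x_toy[:, 1] > 0).astype(float)
params, history = train_bp(x_toy, y_toy, lr=0.05, max_epochs=500)
```

The `history` list records E at every iteration, so the stopping rule (error below the preset value or iteration limit reached) can be checked after training.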
Step 33: evaluate the personal credit model:
Feed the test data into the BP neural network model trained in step 32. If the classification accuracy reaches the threshold (70%), the model is considered to classify well and passes evaluation; otherwise return to step 31 and readjust the BP neural network topology until the trained model reaches the threshold classification accuracy when evaluating the credit of the test data.
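The acceptance check in step 33 amounts to comparing test-set accuracy with the 70% threshold. The sketch below assumes the network outputs have already been mapped to discrete credit grades; the text does not say how that mapping is done. The function name `passes_evaluation` and the grade lists are hypothetical.

```python
import numpy as np

def passes_evaluation(predicted, actual, threshold=0.70):
    """Return (accuracy, passed) for predicted versus known grades."""
    predicted = np.asarray(predicted)
    actual = np.asarray(actual)
    accuracy = float(np.mean(predicted == actual))
    return accuracy, accuracy >= threshold

# Hypothetical grades for ten test samples (0 = poor ... 3 = excellent).
actual_grades = [3, 2, 2, 1, 0, 3, 1, 2, 0, 3]
predicted_grades = [3, 2, 1, 1, 0, 3, 1, 2, 1, 3]
accuracy, passed = passes_evaluation(predicted_grades, actual_grades)
```

Here eight of the ten grades match, so the accuracy is 0.8 and the model would pass; a failing result would send the process back to step 31 to adjust the topology.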
Step 4: input the data to be evaluated into the personal credit evaluation model built in step 3 and output the evaluation result.
The key points of the invention are as follows. The BP neural network credit evaluation method based on PCA extracts government data relating to an individual from bank data and combines it with the bank's credit evaluation result for that individual to form sample data. Normalizing the sample data improves predictive performance, and reducing its dimensionality with PCA handles complex, multidimensional indicators and better meets the demands of big-data processing. Using the bank's credit evaluation result for the individual as the reference for training the BP neural network model yields a credit evaluation model based on government big data that overcomes the subjectivity of expert scoring, provides credit inquiry to enterprises and individuals, supplements the credit systems of financial institutions, and offers higher classification accuracy, practicality, and better evaluation performance.
The above does not limit the technical scope of the invention in any way. Any minor modification, equivalent change, or refinement made to the above embodiment in accordance with the technical essence of the invention therefore still falls within the scope of the technical solution of the invention.

Claims (2)

1. A BP neural network credit evaluation method based on PCA, characterized by comprising the following steps:
Step 1: extract the government data relating to an individual from bank data and combine it with the bank's credit evaluation result for that individual to form sample data; normalize the sample data to obtain the processed sample data matrix $X'$:
$$
X' = \begin{pmatrix}
x'_{11} & x'_{12} & \cdots & x'_{1p} \\
x'_{21} & x'_{22} & \cdots & x'_{2p} \\
\vdots & \vdots & \ddots & \vdots \\
x'_{n1} & x'_{n2} & \cdots & x'_{np}
\end{pmatrix}
$$
where $x'_{ij}$ denotes the j-th indicator of the i-th sample;
Step 11: for numeric indicators of the sample data, normalize using the following formula:
where $i = 1, 2, 3, \ldots, n$, with n the total number of samples; $j = 1, 2, 3, \ldots, p$, with p the total number of numeric indicators; $x_{ij}$ denotes the j-th numeric indicator of the i-th sample; $\max\{x_{ij}\}$ and $\min\{x_{ij}\}$ denote the largest and smallest indicator values in the i-th sample data; and $x'_{ij}$ denotes the j-th indicator of the i-th sample after normalization;
Step 12: for non-numeric indicators of the sample data, normalize using the following formula:
where $i = 1, 2, 3, \ldots, n$, with n the total number of samples; $j = 1, 2, 3, \ldots, q$, with q the total number of non-numeric indicators; $x_{ij}$ denotes the j-th non-numeric indicator of the i-th sample; m indexes the final classification grade; K is the number of classification grades; $w_m$ is the weight assigned to classification grade m; $N_m(x_{ij})$ is the number of samples that share the attribute value of $x_{ij}$ and fall in classification grade m; and $N(x_{ij})$ is the total number of samples that share the attribute value of $x_{ij}$;
Step 2: apply principal component analysis (PCA) to reduce the dimensionality of the sample data matrix normalized in step 1, and determine the principal factors affecting credit evaluation and the ranking of each factor, specifically as follows:
Step 21: from the sample data matrix $X'$ normalized in step 1:
$$
X' = \begin{pmatrix}
x'_{11} & x'_{12} & \cdots & x'_{1p} \\
x'_{21} & x'_{22} & \cdots & x'_{2p} \\
\vdots & \vdots & \ddots & \vdots \\
x'_{n1} & x'_{n2} & \cdots & x'_{np}
\end{pmatrix}
$$
build the covariance matrix R, which reflects how closely the normalized sample data are correlated:
$$
R = \begin{pmatrix}
\dfrac{\sum_{i=1}^{n}(x'_{i1}-\bar{u}_1)(x'_{i1}-\bar{u}_1)}{n-1} & \dfrac{\sum_{i=1}^{n}(x'_{i2}-\bar{u}_2)(x'_{i1}-\bar{u}_1)}{n-1} & \cdots & \dfrac{\sum_{i=1}^{n}(x'_{ip}-\bar{u}_p)(x'_{i1}-\bar{u}_1)}{n-1} \\
\dfrac{\sum_{i=1}^{n}(x'_{i1}-\bar{u}_1)(x'_{i2}-\bar{u}_2)}{n-1} & \dfrac{\sum_{i=1}^{n}(x'_{i2}-\bar{u}_2)(x'_{i2}-\bar{u}_2)}{n-1} & \cdots & \dfrac{\sum_{i=1}^{n}(x'_{ip}-\bar{u}_p)(x'_{i2}-\bar{u}_2)}{n-1} \\
\vdots & \vdots & \ddots & \vdots \\
\dfrac{\sum_{i=1}^{n}(x'_{i1}-\bar{u}_1)(x'_{ip}-\bar{u}_p)}{n-1} & \dfrac{\sum_{i=1}^{n}(x'_{i2}-\bar{u}_2)(x'_{ip}-\bar{u}_p)}{n-1} & \cdots & \dfrac{\sum_{i=1}^{n}(x'_{ip}-\bar{u}_p)(x'_{ip}-\bar{u}_p)}{n-1}
\end{pmatrix}
$$
where $\bar{u}_p$ is the mean of the p-th column of $X'$, and the covariance matrix R is a real symmetric matrix, i.e. $R_{ij} = R_{ji}$;
Step 22: compute the eigenvalues and eigenvectors of the covariance matrix R:
Solve the characteristic equation $|\lambda I - R| = 0$, where I is the identity matrix, to obtain the eigenvalues $\lambda_i$, $i = 1, 2, \ldots, p$, and arrange them in order of magnitude;
For each eigenvalue $\lambda_i$ obtain the corresponding eigenvector $e_i$, requiring $\|e_i\| = 1$, i.e. $\sum_{j=1}^{p} e_{ij}^2 = 1$, where $e_{ij}$ denotes the j-th component of $e_i$;
Select m eigenvectors $e_i$ to form a matrix and multiply it by the sample data $X_i$ to obtain m principal components $F_i$, where $m < p$, by the following formula:
where $F_{i1}$ is called the first principal component of the i-th sample $x_i$;
Step 23: compute the contribution rate and cumulative contribution rate of the principal components:
Eigenvalues and principal components correspond one to one: the i-th principal component $F_i$ is obtained from the eigenvector $e_i$ corresponding to the i-th eigenvalue $\lambda_i$, and its contribution rate is computed from $\lambda_i$; the contribution rate of the i-th principal component $F_i$ is given by:
The cumulative contribution rate is given by:
Take the m eigenvalues whose cumulative contribution rate reaches 85% to 95%, together with their m corresponding principal components, where $m < p$, to obtain the new training sample F:
where the values in the matrix F are computed by formula (1-3);
Step 3: build the personal credit evaluation model with a BP neural network:
Step 31: design the BP neural network topology:
Following the Kolmogorov theorem, build a three-layer BP neural network consisting of an input layer, a hidden layer, and an output layer; the number of input-layer nodes equals m, the number of principal components of the new training sample F produced in step 2; the output layer has 1 node; the number of hidden-layer nodes is determined by the Lippmann empirical formula; the training function is TRAINLM, the adaptive learning function is LEARNGDM, the performance function is MSE, the hidden-layer transfer function is TANSIG, and the output-layer transfer function is PURELIN; the connection weights and thresholds between the nodes of each layer are adjusted according to the Delta learning rule; the network global error E is computed as:
where p is the number of training samples, $E_t$ is the network training error of the t-th training sample, $z_t$ is the actual network output for the t-th training sample, and $c_t$ is the bank's known credit evaluation result for the individual in the t-th training sample;
Step 32: train the BP neural network model:
Train on the dimension-reduced sample data F from step 2, taking part of F as training data and the remainder as test data; set the learning rate and the momentum factor, and initialize the connection weights and output thresholds of each layer with random numbers in (-1, 1); training uses the Delta learning rule; preset an accuracy value for the network global error E and a number of training iterations, then compute E; if E falls below the preset accuracy value or the number of training iterations reaches the set value, stop training and obtain the BP neural network model;
Step 33: evaluate the personal credit model:
Feed the test data into the BP neural network model trained in step 32; if the classification accuracy reaches the threshold, the model is considered to classify well and passes evaluation; otherwise return to step 31 and readjust the BP neural network topology until the trained model reaches the threshold classification accuracy when evaluating the credit of the test data;
Step 4: input the data to be evaluated into the personal credit evaluation model built in step 3 and output the evaluation result.
2. The BP neural network credit evaluation method based on PCA according to claim 1, characterized in that the data relating to individuals in the government data mainly includes human resources and social security data, education data, healthcare data, and employment data.
CN201610686766.9A 2016-08-18 2016-08-18 A kind of BP neutral net credit estimation method based on PCA Pending CN106204246A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610686766.9A CN106204246A (en) 2016-08-18 2016-08-18 A kind of BP neutral net credit estimation method based on PCA

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610686766.9A CN106204246A (en) 2016-08-18 2016-08-18 A kind of BP neutral net credit estimation method based on PCA

Publications (1)

Publication Number Publication Date
CN106204246A true CN106204246A (en) 2016-12-07

Family

ID=57522086

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610686766.9A Pending CN106204246A (en) 2016-08-18 2016-08-18 A kind of BP neutral net credit estimation method based on PCA

Country Status (1)

Country Link
CN (1) CN106204246A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102592171A (en) * 2011-12-30 2012-07-18 南京邮电大学 Method and device for predicting cognitive network performance based on BP (Back Propagation) neural network
CN103514566A (en) * 2013-10-15 2014-01-15 国家电网公司 Risk control system and method
CN104792522A (en) * 2015-04-10 2015-07-22 北京工业大学 Intelligent gear defect analysis method based on fractional wavelet transform and BP neutral network
CN105675807A (en) * 2016-01-07 2016-06-15 中国农业大学 Evaluation method of atrazine residue based on BP neural network

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
刘春玲等: "基于主成分分析法和BP神经网络的银行客户信用评价", 《河北工程技术高等专科学校学报》 *
宋新明等: "基于主成分分析法和BP神经网络的电力客户信用评价", 《技术经济与管理研究》 *

Cited By (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106844663B (en) * 2017-01-23 2020-01-17 中国石油大学(华东) Ship collision risk assessment method and system based on data mining
CN106844663A (en) * 2017-01-23 2017-06-13 中国石油大学(华东) A kind of ship collision methods of risk assessment and system based on data mining
CN108537397A (en) * 2017-03-01 2018-09-14 腾讯科技(深圳)有限公司 A kind of internet reference appraisal procedure and system
CN107063349A (en) * 2017-04-17 2017-08-18 云南电网有限责任公司电力科学研究院 A kind of method and device of Fault Diagnosis Method of Power Transformer
CN107273917A (en) * 2017-05-26 2017-10-20 电子科技大学 A kind of Method of Data with Adding Windows based on parallelization Principal Component Analysis Algorithm
CN107481135A (en) * 2017-08-16 2017-12-15 广东工业大学 A kind of personal credit evaluation method and system based on BP neural network
CN107633265A (en) * 2017-09-04 2018-01-26 深圳市华傲数据技术有限公司 For optimizing the data processing method and device of credit evaluation model
CN107633265B (en) * 2017-09-04 2021-03-30 深圳市华傲数据技术有限公司 Data processing method and device for optimizing credit evaluation model
CN107590737A (en) * 2017-10-24 2018-01-16 厦门大学 Personal credit scores and credit line measuring method
CN108446890A (en) * 2018-02-26 2018-08-24 平安普惠企业管理有限公司 A kind of examination & approval model training method, computer readable storage medium and terminal device
CN109191838A (en) * 2018-09-07 2019-01-11 西北工业大学 The current management method of highway green channel based on artificial intelligence and system
CN109447574A (en) * 2018-10-09 2019-03-08 广州供电局有限公司 Assets based on Fuzzy Optimum Neural Network turn solid project processing method
CN109816513A (en) * 2018-12-21 2019-05-28 上海拍拍贷金融信息服务有限公司 User credit ranking method and device, readable storage medium storing program for executing
CN110344824A (en) * 2019-06-25 2019-10-18 中国矿业大学(北京) A kind of sound wave curve generation method returned based on random forest
CN110561191A (en) * 2019-07-30 2019-12-13 西安电子科技大学 Numerical control machine tool cutter abrasion data processing method based on PCA and self-encoder
CN110381079A (en) * 2019-07-31 2019-10-25 福建师范大学 Network log method for detecting abnormality is carried out in conjunction with GRU and SVDD
CN110533528A (en) * 2019-08-30 2019-12-03 北京市天元网络技术股份有限公司 Assess the method and apparatus of business standing
CN110957024A (en) * 2019-10-25 2020-04-03 卫宁健康科技集团股份有限公司 Medical credit evaluation method, device and storage medium
CN114283023A (en) * 2021-12-29 2022-04-05 工业云制造(四川)创新中心有限公司 Manufacturing management method and system based on cloud manufacturing support technology
DE202022104425U1 (en) 2022-08-03 2022-08-09 Sayed Sayeed Ahmad Intelligent system for secure integration of credit checks and banking systems through machine learning
CN115795314A (en) * 2023-02-07 2023-03-14 山东海量信息技术研究院 Key sample sampling method, system, electronic equipment and storage medium

Similar Documents

Publication Publication Date Title
CN106204246A (en) A kind of BP neutral net credit estimation method based on PCA
Mallikarjuna et al. Evaluation of forecasting methods from selected stock market returns
Xu et al. Composite quantile regression neural network with applications
Kumar et al. Software development cost estimation using wavelet neural networks
WO2018090657A1 (en) Bp_adaboost model-based method and system for predicting credit card user default
Arshad Net present value is better than internal rate of return
CN109118013A (en) A kind of management data prediction technique, readable storage medium storing program for executing and forecasting system neural network based
CN106570525A (en) Method for evaluating online commodity assessment quality based on Bayesian network
Lukić Analysis if the Efficiency of Trade in Oil Derivatives in Serbia by Applying the Fuzzy AHP-TOPSIS Method
Kibekbaev et al. Benchmarking regression algorithms for income prediction modeling
CN110503508A (en) A kind of item recommendation method of the more granularity matrix decompositions of level
Trujillo Patterns in a complex system: an empirical study of valuation in business bankruptcy cases
CN109559042A (en) A kind of fund manager's scoring algorithm based on multidimensional index regression analysis
CN106777402A (en) A kind of image retrieval text method based on sparse neural network
Etemadi et al. Earnings per share forecast using extracted rules from trained neural network by genetic algorithm
Jiang et al. Stock price fluctuation prediction method based on time series analysis.
CN114298834A (en) Personal credit evaluation method and system based on self-organizing mapping network
CN106203864A (en) A kind of brand assets appraisal procedure based on big data and system
Kayakuş et al. Predicting the share of tourism revenues in total exports
Kayakuş Estimating the Changes in the Number of Visitors on the Websites of the Tourism Agencies in the COVID-19 Process by Machine Learning Methods
Nuroğlu Estimating and forecasting trade flows by panel data analysis and neural networks
Du Enterprise credit rating based on genetic neural network
Huang A genetic fuzzy neural network for bankruptcy prediction in chinese corporations
Wu et al. The trend analysis of China's stock market based on fractal method and BP neural network model
Huang et al. Implementation of classifiers for choosing insurance policy using decision trees: A case study

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
CB03 Change of inventor or designer information

Inventor after: Wu Liangbin

Inventor after: Zhan Jinlin

Inventor after: Chen Kunlong

Inventor after: Zhuang Guoqiang

Inventor before: Zhan Jinlin

Inventor before: Zhuang Guoqiang

COR Change of bibliographic data