CN110163743A - A credit scoring method based on hyperparameter optimization - Google Patents

Info

Publication number
CN110163743A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN201910347744.3A
Other languages
Chinese (zh)
Inventor
郭锐
张祥
赵熙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Titanium Rong Intelligent Technology (suzhou) Co Ltd
Original Assignee
Titanium Rong Intelligent Technology (suzhou) Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Titanium Rong Intelligent Technology (suzhou) Co Ltd
Priority to CN201910347744.3A
Publication of CN110163743A
Legal status: Withdrawn

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/03 Credit; Loans; Processing thereof

Abstract

The present invention relates to a credit scoring method based on hyperparameter optimization, comprising the following steps: S1, collect the information data of the scoring subject, perform data preprocessing and feature selection, and build a training data set and a test data set; S2, establish a credit scoring model using the XGBoost algorithm, and optimize the algorithm's hyperparameters using Bayesian optimization with a Gaussian process; S3, fix the XGBoost algorithm with the optimal hyperparameter group and train the credit scoring model on the training data set; S4, use the test data set to run predictions with the credit scoring model and evaluate it, and calculate the credit score by the formula score = A - B*ln(p/(1-p)). During hyperparameter optimization, when the shape of the objective function cannot be determined, the invention hypothesizes that the objective function follows a multivariate Gaussian distribution and then progressively corrects this assumed evaluation model. This improves the efficiency and reliability of hyperparameter optimization, speeds up model building, helps the enterprise iterate its models faster, and strengthens risk control.

Description

A credit scoring method based on hyperparameter optimization
Technical field
The present invention relates to the technical field of computer-based credit scoring, and in particular to a credit scoring method based on hyperparameter optimization.
Background
With the rapid development of the internet credit industry, risk problems keep emerging. Controlling risk with models has become the preferred option for many enterprises: model-based credit evaluation greatly improves approval efficiency and saves labor costs.
In the modeling process, tuning model hyperparameters consumes a large amount of time. The usual parameter optimization methods are grid search and Monte Carlo (random) search. Grid search is not suitable for continuous parameters, and once the number of parameter combinations grows, the traversal time increases exponentially and becomes very long. Random search cannot use prior knowledge to select the next group of hyperparameters.
Summary of the invention
The purpose of the present invention is to provide a credit scoring method based on hyperparameter optimization, intended to solve the problem that tuning model hyperparameters during modeling consumes a large amount of time.
To achieve the above object, the technical solution of the present invention is as follows:
A credit scoring method based on hyperparameter optimization, comprising the following steps:
S1, collect the information data of the scoring subject, perform data preprocessing and feature selection, and build a training data set and a test data set;
S2, model with the XGBoost algorithm, assume that the evaluation function is a Gaussian process function, and select the optimal hyperparameter group using the EI (expected improvement) criterion;
S3, train the credit scoring model with the optimal hyperparameter group and the training data set;
S4, use the test data set to run predictions with the credit scoring model and evaluate it, and calculate the credit score by the formula score = A - B*ln(p/(1-p)), where p is the probability predicted by the model and A and B are constants.
In step S1, the information data of the scoring subject include application information, telecom carrier data, UnionPay profile, and main strategies; data preprocessing includes desensitization and WOE (weight of evidence) encoding; the training data set and the test data set are then built.
The evaluation function is assumed to be [formula image not reproduced], where y is the output value, f is the Gaussian process function, N denotes the Gaussian distribution, f(cx) is the prior probability model, and the remaining symbol denotes the variance. For a set of data points $x_{1:n} = \{x_1, \dots, x_n\}$, the values of the evaluation function $f_{1:n} = \{f(x_1), \dots, f(x_n)\}$ are described by a Gaussian distribution, $f_{1:n} \sim N(m(x_{1:n}), k)$, where $x_n$ is the n-th input vector, $x_{1:n}$ is the matrix of input vectors from the 1st to the n-th, $f(x_n)$ is the evaluation function value corresponding to $x_n$, and $f_{1:n}$ is the set of evaluation function values corresponding to $x_1$ through $x_n$.
According to the formula [formula image not reproduced], where Φ(Z) is the cumulative distribution function, φ(Z) is the probability density function, E is the expectation, f(x) is the Gaussian process function value, μ(x) is the mean of the Gaussian process function at x, σ(x) is the standard deviation of the Gaussian process function at x, and the remaining term is the Gaussian process function value of the hyperparameters. The EI optimization process takes a given evaluation function input value x, uses the Gaussian process function to update the posterior error estimate of the assumed evaluation function, obtains a new input value x by maximizing EI, recomputes the output value y of the assumed evaluation function from the new input value x, and repeats this for a fixed number of iterations, until convergence.
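The formula itself does not survive in this text. As a hedged reconstruction only, the standard closed form of expected improvement that matches the symbols listed above is given below. The notation $x^{+}$ for the current best hyperparameter group is an assumption, and the form shown treats larger f as better; for a loss that is minimized, both differences change sign.

```latex
\mathrm{EI}(x) = \mathbb{E}\left[\max\left(0,\; f(x) - f(x^{+})\right)\right]
             = \left(\mu(x) - f(x^{+})\right)\Phi(Z) + \sigma(x)\,\varphi(Z),
\qquad
Z = \frac{\mu(x) - f(x^{+})}{\sigma(x)}, \quad \sigma(x) > 0
```

The first term rewards candidates whose predicted mean already beats the current best, and the second rewards candidates where the Gaussian process is still uncertain.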
With the credit scoring method based on hyperparameter optimization of the present invention, when the shape of the objective function cannot be determined, the objective function is hypothesized to follow a multivariate Gaussian distribution and the assumed evaluation model is then progressively corrected. This improves the efficiency and reliability of hyperparameter optimization, speeds up model building, helps the enterprise iterate its models faster, and strengthens risk control.
Brief description of the drawings
Fig. 1 is a flow chart of the credit scoring method based on hyperparameter optimization in one embodiment of the present invention.
Detailed description of the embodiments
The technical solution of the present invention is described in further detail below with reference to the accompanying drawing and embodiments.
The credit scoring method based on hyperparameter optimization of the present invention, as shown in Fig. 1, comprises the following steps:
S1, collect the information data of the scoring subject, perform data preprocessing and feature selection, and build a training data set and a test data set;
S2, establish a credit scoring model with the XGBoost algorithm, assume that the evaluation function is a Gaussian process function, and select the optimal hyperparameter group using the EI (expected improvement) criterion;
S3, train the credit scoring model with the optimal hyperparameter group and the training data set;
S4, use the test data set to run predictions with the credit scoring model and evaluate it, and calculate the credit score by the formula score = A - B*ln(p/(1-p)), where p is the probability predicted by the model and A and B are constants.
The information data of the scoring subject include application information, telecom carrier data, UnionPay profile, and main strategies. Data preprocessing is carried out through data desensitization and WOE (weight of evidence) encoding. Of the processed data, 70% is used to train the model and forms the training data set, and 30% is used to test the model and forms the test data set.
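The WOE encoding and the 70/30 split can be sketched as follows. This is a minimal illustration, not the patent's implementation: the sign convention WOE = ln(share of good / share of bad), the smoothing count `eps`, the fixed shuffle seed, and the example field values are all assumptions.

```python
import math
import random
from collections import defaultdict


def woe_table(values, labels, eps=0.5):
    """Map each category to its weight of evidence.

    labels: 0 = good sample (not overdue), 1 = bad sample (overdue).
    eps is an assumed smoothing count that avoids division by zero.
    """
    good = defaultdict(float)
    bad = defaultdict(float)
    for value, label in zip(values, labels):
        (bad if label == 1 else good)[value] += 1
    categories = set(good) | set(bad)
    total_good = sum(good.values()) + eps * len(categories)
    total_bad = sum(bad.values()) + eps * len(categories)
    return {
        c: math.log(((good[c] + eps) / total_good) / ((bad[c] + eps) / total_bad))
        for c in categories
    }


def split_70_30(rows, seed=0):
    """Shuffle the rows and return (70% training set, 30% test set)."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    cut = int(round(0.7 * len(rows)))
    return rows[:cut], rows[cut:]


# Hypothetical field: nature of employment, with overdue labels.
employment = ["salaried"] * 6 + ["self_employed"] * 4
overdue = [0, 0, 0, 0, 0, 1, 0, 1, 1, 1]
table = woe_table(employment, overdue)
encoded = [table[v] for v in employment]
train_rows, test_rows = split_70_30(range(10))
```

With this sign convention, a category dominated by good samples gets a positive WOE and one dominated by bad samples gets a negative WOE.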
The specific method of optimizing the algorithm's hyperparameters using Bayesian optimization with a Gaussian process is as follows: assume that the evaluation function is a Gaussian process function, obtain a multivariate Gaussian distribution model by determining the mean (expectation) and covariance functions, and then select the optimal hyperparameter group using the EI criterion.
The evaluation function is assumed to be [formula image not reproduced], where y is the output value of the evaluation function, f is the Gaussian process function, N denotes the Gaussian distribution, f(cx) is the prior probability model, and the remaining symbol denotes the variance. For a set of data points $x_{1:n} = \{x_1, \dots, x_n\}$, the values of the evaluation function $f_{1:n} = \{f(x_1), \dots, f(x_n)\}$ are described by a Gaussian distribution, $f_{1:n} \sim N(m(x_{1:n}), k)$, where $x_n$ is the n-th input vector, $x_{1:n}$ is the matrix of input vectors from the first to the n-th, $f(x_n)$ is the evaluation function value corresponding to $x_n$, and $f_{1:n}$ is the set of evaluation function values corresponding to $x_1$ through $x_n$.
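To make the covariance term k concrete, the sketch below builds the covariance matrix for a set of hyperparameter vectors. The text does not name a covariance function, so the squared-exponential (RBF) kernel, its `length_scale` and `variance` parameters, and the ordering of the vector components are assumptions made for illustration.

```python
import math


def rbf_kernel(a, b, length_scale=1.0, variance=1.0):
    """Squared-exponential covariance between two input vectors (assumed kernel)."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return variance * math.exp(-0.5 * sq_dist / length_scale ** 2)


def covariance_matrix(points, kernel=rbf_kernel):
    """Return the n x n matrix with entry (i, j) = kernel(x_i, x_j)."""
    return [[kernel(p, q) for q in points] for p in points]


# Two hyperparameter vectors, assumed to be ordered as
# (min_child_weight, learning_rate, max_depth, n_estimators).
x_1n = [(1, 0.01, 1, 10), (1, 0.03, 2, 12)]
K = covariance_matrix(x_1n)
```

In practice the components would usually be rescaled to comparable ranges first, since otherwise the component with the widest range dominates the distance.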
According to the formula [formula image not reproduced], where x is the input value, Φ(Z) is the cumulative distribution function, φ(Z) is the probability density function, E is the expectation, f(x) is the Gaussian process function value, μ(x) is the mean of the Gaussian process function at x, σ(x) is the standard deviation of the Gaussian process function at x, and the two remaining terms are the current best hyperparameter group and the Gaussian process function value at that group. The EI optimization process takes a given evaluation function input value x, uses the Gaussian process function to update the posterior error estimate of the assumed evaluation function, obtains a new x by maximizing EI, recomputes the output value y of the assumed evaluation function from the new x, and repeats this for a fixed number of iterations, until it converges to the optimal hyperparameter group.
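The EI criterion can be sketched in Python as follows. This is a minimal sketch assuming the standard closed-form expected improvement; the function names and the `minimize` flag are illustrative and not taken from the patent. Because log loss is minimized, the default treats smaller values as better.

```python
import math


def normal_pdf(z):
    """Probability density function of the standard normal distribution."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_improvement(mu, sigma, best, minimize=True):
    """Expected improvement over `best` for a Gaussian prediction N(mu, sigma^2)."""
    improvement = (best - mu) if minimize else (mu - best)
    if sigma <= 0.0:
        # No uncertainty left: the improvement is known exactly.
        return max(improvement, 0.0)
    z = improvement / sigma
    return improvement * normal_cdf(z) + sigma * normal_pdf(z)
```

For a fixed predicted mean, a larger standard deviation gives a larger EI, which is what lets the search explore regions the Gaussian process has not yet seen.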
Finally, the credit scoring model is verified using the test data set. The evaluation indicators include the ROC curve, the KS curve, and the confusion matrix. The credit score is calculated by the formula score = A - B*ln(p/(1-p)), where p is the probability predicted by the model and A and B are constants. The constants A and B are computed by substituting two known or assumed scores into the formula. Typically two assumptions need to be set: first, a specific expected score at a specified odds value p/(1-p); second, the score change that corresponds to a doubling of the odds.
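The two assumptions determine A and B as sketched below. The base score of 600, the base odds of 1:20, and the 20 points per doubling of the odds are hypothetical values chosen for illustration, and p is taken to be the predicted probability of a bad (overdue) sample, so a higher p gives a lower score.

```python
import math


def scorecard_constants(base_score, base_odds, pdo):
    """Solve score = A - B*ln(odds) for A and B.

    base_score: expected score when odds == base_odds.
    pdo: points by which the score drops when the odds double.
    """
    b = pdo / math.log(2.0)
    a = base_score + b * math.log(base_odds)
    return a, b


def credit_score(p, a, b):
    """score = A - B*ln(p / (1 - p)), with p the predicted bad probability."""
    return a - b * math.log(p / (1.0 - p))


# Hypothetical calibration: 600 points at odds 1:20, 20 points per doubling.
A, B = scorecard_constants(base_score=600.0, base_odds=1.0 / 20.0, pdo=20.0)
score_at_base = credit_score(1.0 / 21.0, A, B)     # odds = 1/20
score_at_double = credit_score(1.0 / 11.0, A, B)   # odds = 1/10
```

Doubling the odds from 1:20 to 1:10 lowers the score by exactly the chosen number of points, because B*ln(2) equals `pdo`.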
In one embodiment, the data dimensions of the scoring subject's information data are determined as follows. The credit dimensions are age, nature of employment, and credit card credit limit. The risk dimension is the overdue label: overdue samples are bad samples and non-overdue samples are good samples. Good samples are labeled 0 and bad samples are labeled 1, and the data are used to train the model.
The credit scoring model is established using the XGBoost algorithm, and log loss is used to evaluate model performance under different parameters. The loss function is the binary log loss, $L = -\frac{1}{N}\sum_{i=1}^{N}\left[y_i \ln p_i + (1 - y_i)\ln(1 - p_i)\right]$, where N is the number of samples, $y_i$ is the true value of the i-th sample, and $p_i$ is the predicted value of the i-th sample. The hyperparameters with the greatest influence on the algorithm are selected for optimization: min_child_weight, learning_rate, max_depth, and n_estimators. The parameter ranges are set to min_child_weight: (1, 5), learning_rate: (0.01, 0.1), max_depth: (1, 100), n_estimators: (10, 500).
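The log loss and the search ranges can be written down directly. In this sketch the clipping constant `eps` and the helper `in_search_space` are assumptions added for numerical safety and illustration; the four ranges are the ones stated above.

```python
import math


def log_loss(y_true, p_pred, eps=1e-15):
    """Binary log loss: -(1/N) * sum(y*ln(p) + (1-y)*ln(1-p))."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)  # keep ln() finite at p = 0 or 1
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / len(y_true)


# Search ranges (lower bound, upper bound) for the tuned hyperparameters.
SEARCH_SPACE = {
    "min_child_weight": (1, 5),
    "learning_rate": (0.01, 0.1),
    "max_depth": (1, 100),
    "n_estimators": (10, 500),
}


def in_search_space(params):
    """Check that every tuned hyperparameter lies within its range."""
    return all(lo <= params[name] <= hi for name, (lo, hi) in SEARCH_SPACE.items())


uninformed = log_loss([0, 1], [0.5, 0.5])
confident = log_loss([0, 1], [0.1, 0.9])
```

A model that always predicts 0.5 scores ln(2), so any useful model should score below that.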
The Gaussian process function is initialized by choosing initial x values, for example x1 = (1, 0.01, 1, 10) and x2 = (1, 0.03, 2, 12). The Gaussian process function is used to update the posterior error value of the assumed evaluation function. Then, using the EI criterion, the x value that maximizes EI is computed and the value of the Gaussian process function f is calculated. Whether the target is met is decided by whether the loss function has reached its minimum or the number of iterations has been reached. If the target is met, the x value is saved; if not, the update of the Gaussian process function is repeated. The best hyperparameter value obtained is, for example, x = (1, 0.01, 1, 10). Finally, the XGBoost parameters are fixed to the best hyperparameter value x = (1, 0.01, 1, 10), and the credit scoring model is trained on the training data set.
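The iteration described here can be sketched end to end on a toy problem. Everything below is an illustrative assumption rather than the patent's implementation: a single hyperparameter rescaled to [0, 1], a squared-exponential kernel, a fixed candidate grid, and `toy_loss` standing in for "train XGBoost with these hyperparameters and return the validation log loss".

```python
import math


def kernel(a, b, length_scale=0.2):
    """Squared-exponential covariance between two scalar inputs (assumed kernel)."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)


def cholesky(m):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(m)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(m[i][i] - s) if i == j else (m[i][j] - s) / low[j][j]
    return low


def solve_lower(low, b):
    """Forward substitution: solve low @ y = b."""
    y = []
    for i in range(len(b)):
        y.append((b[i] - sum(low[i][k] * y[k] for k in range(i))) / low[i][i])
    return y


def solve_upper_t(low, y):
    """Back substitution: solve transpose(low) @ x = y."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(low[k][i] * x[k] for k in range(i + 1, n))) / low[i][i]
    return x


def fit_gp(xs, ys, noise=1e-4):
    """Fit a Gaussian process and return predict(x) -> (posterior mean, posterior std)."""
    xs = list(xs)
    mean_y = sum(ys) / len(ys)
    k_mat = [[kernel(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
             for i, a in enumerate(xs)]
    low = cholesky(k_mat)
    alpha = solve_upper_t(low, solve_lower(low, [y - mean_y for y in ys]))

    def predict(x_new):
        k_star = [kernel(a, x_new) for a in xs]
        mu = mean_y + sum(k * a for k, a in zip(k_star, alpha))
        v = solve_lower(low, k_star)
        var = max(kernel(x_new, x_new) - sum(t * t for t in v), 0.0)
        return mu, math.sqrt(var)

    return predict


def expected_improvement(mu, sigma, best):
    """EI for minimization: expected amount by which the loss falls below `best`."""
    if sigma < 1e-12:
        return max(best - mu, 0.0)
    z = (best - mu) / sigma
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return (best - mu) * cdf + sigma * pdf


def toy_loss(u):
    """Hypothetical stand-in for the validation log loss at rescaled hyperparameter u."""
    return 0.4 + (u - 0.55) ** 2


def bayes_opt(objective, init_points, n_iter, candidates):
    """Repeat: update the GP, pick the candidate maximizing EI, evaluate it."""
    xs = list(init_points)
    ys = [objective(x) for x in xs]
    for _ in range(n_iter):
        predict = fit_gp(xs, ys)
        best = min(ys)
        x_next = max(candidates, key=lambda c: expected_improvement(*predict(c), best))
        xs.append(x_next)
        ys.append(objective(x_next))
    i_best = min(range(len(ys)), key=ys.__getitem__)
    return xs[i_best], ys[i_best]


grid = [i / 100 for i in range(101)]
best_u, best_loss = bayes_opt(toy_loss, init_points=[0.0, 1.0], n_iter=8, candidates=grid)
```

A rescaled value u would be mapped back to a real range before training, for example learning_rate = 0.01 + 0.09 * u for the range (0.01, 0.1). The fixed iteration count plays the role of the stopping rule described above.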
In the credit scoring method based on hyperparameter optimization of the present invention, during hyperparameter optimization the evaluation function is assumed to follow a Gaussian distribution. An input value x is chosen by maximizing EI, the Gaussian process function value is calculated, and whether the target is met is judged; the Gaussian process function is updated through repeated iteration. In this way, when the shape of the objective function cannot be determined, the objective function is hypothesized to follow a multivariate Gaussian distribution and the assumed evaluation model is progressively corrected.
The embodiments described above explain the technical solution and beneficial effects of the present invention in detail. It should be understood that the above are only specific embodiments of the present invention and are not intended to limit it; any modification, supplement, or equivalent replacement made within the spirit of the present invention shall fall within the protection scope of the present invention.

Claims (4)

1. A credit scoring method based on hyperparameter optimization, characterized by comprising the following steps:
S1, collect the information data of the scoring subject, perform data preprocessing and feature selection, and build a training data set and a test data set;
S2, establish a credit scoring model with the XGBoost algorithm, assume that the hyperparameter evaluation function is a Gaussian process function, and select the optimal hyperparameter group using the EI (expected improvement) criterion;
S3, train the credit scoring model with the optimal hyperparameter group and the training data set;
S4, use the test data set to run predictions with the credit scoring model and evaluate it, and calculate the credit score by the formula score = A - B*ln(p/(1-p)), where p is the probability predicted by the model and A and B are constants.
2. The credit scoring method based on hyperparameter optimization according to claim 1, characterized in that: in step S1, the information data of the scoring subject include application information, telecom carrier data, UnionPay profile, and main strategies; data preprocessing includes desensitization and WOE (weight of evidence) encoding; and feature selection means selecting, from all fields, the fields used for modeling, including age and education from the application information, and carrier data such as time on network and the call detail records of the most recent month.
3. The credit scoring method based on hyperparameter optimization according to claim 1, characterized in that: the hyperparameter evaluation function is assumed to be [formula image not reproduced], where y is the value of the evaluation function, f is the Gaussian process function, N denotes the Gaussian distribution, f(cx) is the prior probability model, and the remaining symbol denotes the variance; for a set of data points $x_{1:n} = \{x_1, \dots, x_n\}$, the values of the evaluation function $f_{1:n} = \{f(x_1), \dots, f(x_n)\}$ are described by a Gaussian distribution, $f_{1:n} \sim N(m(x_{1:n}), k)$, where $x_n$ is the n-th input vector, $x_{1:n}$ is the matrix of input vectors from the 1st to the n-th, $f(x_n)$ is the evaluation function value corresponding to $x_n$, and $f_{1:n}$ is the set of evaluation function values corresponding to $x_1$ through $x_n$.
4. The credit scoring method based on hyperparameter optimization according to claim 3, characterized in that: according to the formula [formula image not reproduced], where f(x) is the Gaussian process function value, E is the expectation, μ(x) is the mean of the Gaussian process function at x, σ(x) is the standard deviation of the Gaussian process function at x, Φ(Z) is the cumulative distribution function, φ(Z) is the probability density function, and the remaining term is the Gaussian process function value of the hyperparameters; given an evaluation function input value x, the Gaussian process function is used to update the posterior error estimate of the assumed evaluation function, a new evaluation function input value x is obtained by maximizing EI, the output value y of the assumed evaluation function is recomputed from the new input value x, and this is repeated for a fixed number of iterations, until convergence.
CN201910347744.3A 2019-04-28 2019-04-28 A kind of credit-graded approach based on hyperparameter optimization Withdrawn CN110163743A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910347744.3A CN110163743A (en) 2019-04-28 2019-04-28 A kind of credit-graded approach based on hyperparameter optimization

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910347744.3A CN110163743A (en) 2019-04-28 2019-04-28 A kind of credit-graded approach based on hyperparameter optimization

Publications (1)

Publication Number Publication Date
CN110163743A true CN110163743A (en) 2019-08-23

Family

ID=67640152

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910347744.3A Withdrawn CN110163743A (en) 2019-04-28 2019-04-28 A kind of credit-graded approach based on hyperparameter optimization

Country Status (1)

Country Link
CN (1) CN110163743A (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180322406A1 (en) * 2017-05-04 2018-11-08 Zestfinance, Inc. Systems and methods for providing machine learning model explainability information
CN107798600A (en) * 2017-12-05 2018-03-13 深圳信用宝金融服务有限公司 The credit risk recognition methods of the small micro- loan of internet finance and device
CN108154430A (en) * 2017-12-28 2018-06-12 上海氪信信息技术有限公司 A kind of credit scoring construction method based on machine learning and big data technology
CN108596757A (en) * 2018-04-23 2018-09-28 大连火眼征信管理有限公司 A kind of personal credit file method and system of intelligences combination

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
王重仁 等: "基于超参数优化和集成学习的互联网信贷个人信用评估", 《统计与决策》 *
韩修龙: "基于XGBOOST的用户信用评分建模", 《电脑知识与技术》 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication (application publication date: 20190823)