CN112364298A - Strategy method for automatically adjusting model based on model effect function

Info

Publication number: CN112364298A
Application number: CN202011234901.9A
Authority: CN
Inventors: 梁协君; 卢成伟; 周恒�; 蒋涛
Original assignee: Hangzhou Youshu Finance Information Services Co ltd
Current assignee: Hangzhou Youshu Finance Information Services Co ltd
Priority date: 2020-11-08
Filing date: 2020-11-08
Publication date: 2021-02-12

Abstract

The invention discloses a strategy method for automatically adjusting a model based on a model effect function. The method comprises the following steps. The evaluation index (KS statistic) and the stability index (PSI) of the model are computed and monitored, and threshold values that trigger automatic parameter tuning are set. An objective function reflecting the model effect is constructed from the definitions of KS and PSI. If a monitored index crosses its threshold, the corresponding algorithm parameters are tuned with the TPE algorithm, and the parameter values and the model effect function value are recorded at each iteration. The parameter values that maximize the model effect function are adopted as the new parameters of the model, so that the effect of the original model is improved automatically on the samples of the current period. The method greatly reduces the workload of modeling personnel, who would otherwise have to keep adjusting models as old models fail, and greatly improves modeling efficiency. It can be widely applied to online modeling scenarios such as customer value mining and credit review.

Description

Strategy method for automatically adjusting model based on model effect function

Technical Field

The invention relates to machine learning technology, and in particular to a strategy method for automatically improving model effect based on machine learning model evaluation indexes and actual business objectives.

Background

As the internet penetrates ever more deeply into the daily lives of consumers, financial institutions face growing demand for online modeling scenarios such as value mining of existing customers and credit review. Customer group data can drift suddenly or gradually for many reasons, so the predictive effect of the original model gradually declines and may fail altogether. This hinders business development and causes substantial losses to the financial institution. Adjusting models from time to time is time-consuming and labor-intensive for modelers. Once the number of models reaches a certain magnitude, the maintenance cost of the models rises exponentially. Model adjustment will therefore inevitably have to be automated.

At present there are many methods for automatically adjusting a model, but they are essentially limited to optimizing a single index such as accuracy or stability, and their effect is not stable. Their results fall far short of those obtained by manual model adjustment.

Disclosure of Invention

To address the defects of the prior art, the invention provides a strategy method for automatically adjusting a model based on a model effect function, which effectively improves both the predictive power and the stability of the model after it is automatically rebuilt.

The technical problem is solved by the following technical scheme:

First, an original model unit is constructed. The unit comprises:

1) The configuration of the original model: the time at which the original model was built and the details of the training sample set (specifically, the data dictionary describing the input features and the target variable to be learned).

2) The algorithm information of the original model: the name of the algorithm used and the values of its hyperparameter set.

3) The evaluation indexes of the original model: the KS statistic, the population stability index (PSI), AUC, GINI value, and so on. The evaluation indexes are recorded daily.

The KS (Kolmogorov-Smirnov) statistic mentioned in the first step is defined mathematically as $KS = \max(TPR - FPR)$, where TPR denotes the true positive rate (the share of actual positive samples that are predicted positive) and FPR denotes the false positive rate (the share of actual negative samples that are predicted positive), the maximum being taken over all score thresholds. The KS statistic is thus the maximum difference between TPR and FPR, and it measures the gap between the cumulative score distributions of good and bad samples. It evaluates the risk discrimination capability of the model and is a typical index of model predictive power.
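As a minimal sketch of the definition KS = max(TPR - FPR), the statistic could be computed by sweeping the decision threshold over the sorted model scores. The function name `ks_statistic`, the 0/1 label encoding, and the handling of tied scores are assumptions made for illustration only.

```python
def ks_statistic(labels, scores):
    """Return max(TPR - FPR) over all score thresholds.

    labels: 1 for a positive sample, 0 for a negative sample (assumed encoding).
    scores: predicted probability of the positive class.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative sample")

    # Sort by descending score so that each step lowers the threshold.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])

    tp = fp = 0
    best = 0.0
    i = 0
    while i < len(ranked):
        # Consume every sample that shares the current score before measuring,
        # so that tied scores are treated as a single threshold.
        current = ranked[i][0]
        while i < len(ranked) and ranked[i][0] == current:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        best = max(best, tp / n_pos - fp / n_neg)
    return best


example_ks = ks_statistic([1, 1, 0, 1, 0, 0], [0.9, 0.8, 0.7, 0.6, 0.3, 0.2])
```

A model that separates the two classes perfectly yields a KS of 1.0, while a model whose scores carry no information yields a value near 0.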

The population stability index (PSI) mentioned in the first step is defined mathematically as:

$$PSI = \sum_{i=1}^{n} (A_i - E_i) \ln\frac{A_i}{E_i}$$

where $A_i$ is the share of cross-time samples whose prediction probability (the output of the model) falls in the i-th group after the prediction probabilities are divided into n groups, and $E_i$ is the corresponding share for the training samples. PSI quantifies the degree of difference between the two sample distributions and is a typical index of model stability.
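The following sketch illustrates one way PSI could be computed from two sets of model scores, using the commonly used form PSI = sum over groups of (A_i - E_i) * ln(A_i / E_i). The text does not say how the n groups are formed, so equal-width bins on [0, 1] are assumed here; the function name `psi` and the small floor `eps` for empty bins are likewise illustrative choices.

```python
import math


def psi(expected_scores, actual_scores, n_bins=10, eps=1e-6):
    """Population stability index between two sets of scores in [0, 1].

    expected_scores: scores of the training samples (E_i shares).
    actual_scores: scores of the cross-time samples (A_i shares).
    """
    def bin_shares(scores):
        counts = [0] * n_bins
        for s in scores:
            # Equal-width bins (an assumption); a score of 1.0 goes in the last bin.
            idx = min(int(s * n_bins), n_bins - 1)
            counts[idx] += 1
        total = len(scores)
        # Floor each share at eps so the logarithm is defined for empty bins.
        return [max(c / total, eps) for c in counts]

    e = bin_shares(expected_scores)
    a = bin_shares(actual_scores)
    return sum((ai - ei) * math.log(ai / ei) for ai, ei in zip(a, e))


# Two bins: training shares are (0.5, 0.5), cross-time shares are (0.25, 0.75).
shifted = psi([0.1, 0.2, 0.6, 0.7], [0.1, 0.6, 0.7, 0.8], n_bins=2)
```

Identical score distributions give a PSI of 0, and the value grows as the cross-time distribution moves away from the training distribution.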

Second, a strategy unit is constructed. The unit comprises:

1) Setting the strategy trigger value. The strategy is triggered when an evaluation index of the original model crosses a set threshold.

2) Determining the strategy trigger time point.

3) Preparing data and recording the details of the sample set. The new data set is uploaded to serve as the raw data for the training samples and the validation set of the new model.

4) Customizing the strategy, which comprises setting the model effect function, setting the domain space, and setting the function solving strategy.

The strategy trigger value mentioned in the second step can be set from experience of when model indexes fail. For the KS statistic: below 0.2, the model has no discrimination capability; 0.2-0.4, the model has some discrimination capability; 0.41-0.5, fairly good discrimination capability; 0.51-0.6, good discrimination capability; 0.61-0.75, very good discrimination capability; above 0.75, the value is abnormally high and the model probably has a problem. The strategy trigger can therefore be KS < 0.2 or KS > 0.75. The trigger indexes may include any index that can reveal a failure of the model, such as the KS statistic, PSI, AUC, and GINI value.
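The KS bands and the resulting trigger condition can be expressed as a small lookup, sketched below. The function names and band labels are hypothetical, and values that fall in the gaps between the listed bands (for example between 0.4 and 0.41) are assigned to the higher band as an assumption.

```python
def ks_band(ks):
    """Map a KS value to the qualitative band listed in the text."""
    if ks < 0.2:
        return "no discrimination"
    if ks <= 0.4:
        return "some discrimination"
    if ks <= 0.5:
        return "fairly good"
    if ks <= 0.6:
        return "good"
    if ks <= 0.75:
        return "very good"
    return "abnormally high"


def ks_triggers_strategy(ks, low=0.2, high=0.75):
    """True when KS is too low to be useful or suspiciously high."""
    return ks < low or ks > high


band_at_build = ks_band(0.478)                 # KS of the original model on 20200101
trigger_on_0831 = ks_triggers_strategy(0.198)  # KS of the original model on 20200831
```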

The model effect function mentioned in the second step is defined as follows:

where $KS_t(D_t, \theta_t)$ denotes the KS statistic calculated in period t from the period-t sample $D_t$ and the hyperparameter set $\theta_t$; $PSI_t(D_t, \theta_t)$ denotes the PSI calculated in period t from $D_t$ and $\theta_t$; $|KS_t(D_{t-1}, \theta_{t-1}) - KS_t(D_t, \theta_t)|$ measures the cross-time stability between the new model and the old model; and the remaining term, the percentage change in the PSI of the original model over the cross-time period, measures how much attention business personnel need to pay to the cross-time stability of the model.

The model effect function accounts for both the predictive power and the stability of the model. It also uses the percentage change in the original model's PSI over the cross-time period as a measure of how much attention business personnel pay to future changes in business data, and applies this measure when assessing the cross-time stability between the new and old models. This setting rests on the assumption that future data evolves from historical data, so the degree of change in the historical data is quantified as the percentage change in the original model's PSI. In this way the magnitude of change at the data level is incorporated into the model effect function.
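The defining formula of the model effect function is not reproduced in this text, so the sketch below is only a hypothetical combination that matches the qualitative description: it rewards a high KS of the new model, penalizes its PSI, and penalizes the KS gap between the old and new models, with that gap weighted by the percentage change in the original model's PSI. The linear form and both function names are assumptions and should not be read as the patented formula.

```python
def psi_change_pct(psi_start, psi_now):
    """Relative change of the original model's PSI over the cross-time period."""
    if psi_start <= 0:
        raise ValueError("psi_start must be positive")
    return (psi_now - psi_start) / psi_start


def model_effect(ks_new, psi_new, ks_old, attention_weight):
    """ASSUMED form, not the patent's formula.

    ks_new: KS of the new model on the period-t sample.
    psi_new: PSI of the new model on the period-t sample.
    ks_old: KS of the original model on the period t-1 sample.
    attention_weight: weight on the cross-time KS gap, for example
        the value returned by psi_change_pct.
    """
    return ks_new - psi_new - attention_weight * abs(ks_old - ks_new)


# Weight derived from the original model's PSI on 20200101 and 20200830.
weight = psi_change_pct(0.025, 0.162)
```

Under this assumed form, a larger drift in the original model's PSI makes any KS gap between the old and new models more costly, which is one way to read the statement that the magnitude of data change is built into the function.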

The domain space mentioned in the second step is the value range space of the hyperparameter set of each candidate underlying algorithm of the new model. The algorithms include decision trees, logistic regression, neural networks, generalized linear regression, random forests, and ensemble models based on boosting, stacking, and bootstrap aggregating (bagging). The value range space of a hyperparameter set can be defined from the experience of modeling experts or from statistical distributions.
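A domain space of this kind can be pictured as a mapping from hyperparameter names to value ranges, together with a way to draw a point from it. The sketch below is not the code of FIG. 2, which is not reproduced in this text; the bounds, the `DOMAIN_SPACE` dictionary, and the `sample_point` helper are all hypothetical.

```python
import random

# Hypothetical search ranges for a gradient-boosted tree learner.
# Each entry is (kind, low, high); the bounds are illustrative only.
DOMAIN_SPACE = {
    "eta": ("uniform", 0.01, 1.0),
    "gamma": ("uniform", 0.0, 5.0),
    "max_depth": ("int", 2, 12),
    "min_child_weight": ("uniform", 0.0, 10.0),
    "subsample": ("uniform", 0.5, 1.0),
    "colsample_bytree": ("uniform", 0.5, 1.0),
}


def sample_point(space, rng):
    """Draw one hyperparameter set uniformly from the domain space."""
    point = {}
    for name, (kind, low, high) in space.items():
        if kind == "int":
            point[name] = rng.randint(low, high)  # both ends inclusive
        elif kind == "uniform":
            point[name] = rng.uniform(low, high)
        else:
            raise ValueError(f"unknown parameter kind: {kind}")
    return point


candidate = sample_point(DOMAIN_SPACE, random.Random(0))
```

Narrow ranges chosen from expert experience shrink the search, while wide ranges leave more of the work to the solving strategy.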

The function solving strategy mentioned in the second step is as follows. The TPE algorithm is used to solve the following constrained problem for the optimal hyperparameter set $\theta_t$:

$$\max_{\theta_t} F(D_t, \theta_t, D_{t-1}, \theta_{t-1}) \quad \text{s.t.} \quad PSI_t(D_t, \theta_t) < 0.1, \quad KS_t(D_t, \theta_t) > 0.2$$

where $F(D_t, \theta_t, D_{t-1}, \theta_{t-1})$ is the model effect function of the second step. The constraint $PSI_t(D_t, \theta_t) < 0.1$ requires that the PSI calculated in period t from the sample $D_t$ and the hyperparameter set $\theta_t$ be below 0.1. The value 0.1 comes from experience of PSI failure: a PSI below 0.1 indicates that the model does not need to be updated. This constraint is included because a new model for period t must at least remain stable on the period-t sample $D_t$. The constraint $KS_t(D_t, \theta_t) > 0.2$ requires that the KS statistic calculated in period t from $D_t$ and $\theta_t$ exceed 0.2. The value 0.2 comes from experience of KS failure: a new model for period t whose KS statistic does not reach 0.2 has essentially no predictive power.
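One simple way to honor the two constraints (PSI below 0.1 and KS above 0.2) while maximizing the effect function is to log every trial and keep only the feasible ones when picking the winner. The sketch below assumes each trial is recorded as a dictionary with the hypothetical keys `params`, `effect`, `ks`, and `psi`; it shows the selection step only, not the search itself.

```python
PSI_MAX = 0.1  # PSI of the new model on the period-t sample must stay below this
KS_MIN = 0.2   # KS of the new model on the period-t sample must exceed this


def is_feasible(ks_new, psi_new):
    """Check both constraints with the strict inequalities used in the text."""
    return psi_new < PSI_MAX and ks_new > KS_MIN


def best_feasible(trials):
    """Return the feasible trial with the largest effect value, or None."""
    feasible = [t for t in trials if is_feasible(t["ks"], t["psi"])]
    if not feasible:
        return None
    return max(feasible, key=lambda t: t["effect"])


# Made-up trial log for illustration.
trial_log = [
    {"params": {"eta": 0.3}, "effect": 0.90, "ks": 0.60, "psi": 0.15},  # unstable
    {"params": {"eta": 0.5}, "effect": 0.50, "ks": 0.45, "psi": 0.02},  # feasible
    {"params": {"eta": 0.7}, "effect": 0.40, "ks": 0.19, "psi": 0.01},  # too weak
]
winner = best_feasible(trial_log)
```

Here the first trial has the largest effect value but is rejected because its PSI is not below 0.1, so the second trial is selected.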

The model effect function is maximized rather than minimized because the new model is expected to have predictive power as high as possible, a PSI as low as possible, and a cross-time difference from the old model as small as possible. Combined with the mathematical definition of the model effect function, this leads to the conclusion that the function should be maximized.

The TPE (Tree-structured Parzen Estimator) algorithm used in the function solving strategy of the second step is in essence an improved form of Bayesian optimization. Like Bayesian optimization, it tracks the evaluation results of past models and uses them to form a probability model that maps hyperparameters to the probability of a score on the model effect function, $P(score \mid \theta)$. By iteratively updating this score probability function, the hyperparameter set that maximizes the model effect function can be found quickly according to the expected improvement criterion. In the TPE algorithm, the expected improvement function (Expected Improvement) is defined as:

where $g(\theta)$ denotes $P(score \mid \theta)$ for $score \ge score^*$, and $l(\theta)$ denotes $P(score \mid \theta)$ for $score < score^*$.

The expected improvement function specifies the criterion for selecting the next hyperparameter set from the score probability function $P(score \mid \theta)$.
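To make the role of $g(\theta)$ and $l(\theta)$ concrete, the sketch below implements a toy one-dimensional TPE-style loop for maximization. It is a simplified illustration under stated assumptions, not the patent's implementation and not a library's: past trials are split at a score quantile, the good group (score at or above the cut) and the bad group are each modeled with a Gaussian kernel density, and the next candidate is the one with the largest ratio g/l. The fixed bandwidth, the `top_fraction` split, and all function names are assumptions.

```python
import math
import random


def kde(points, bandwidth):
    """Return a Gaussian kernel density estimate built from `points`."""
    norm = 1.0 / (len(points) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - p) / bandwidth) ** 2) for p in points)

    return density


def tpe_maximize(objective, low, high, n_startup=10, n_iter=40,
                 top_fraction=0.25, n_candidates=24, seed=0):
    """Toy TPE-style search for the theta in [low, high] that maximizes objective."""
    if n_startup < 2:
        raise ValueError("n_startup must be at least 2")
    rng = random.Random(seed)
    bandwidth = (high - low) / 10.0  # fixed kernel width, an illustrative choice

    # Random startup trials so that both groups have members.
    history = []
    for _ in range(n_startup):
        theta = rng.uniform(low, high)
        history.append((theta, objective(theta)))

    for _ in range(n_iter):
        ranked = sorted(history, key=lambda trial: trial[1], reverse=True)
        n_good = max(1, int(top_fraction * len(ranked)))
        good = [theta for theta, _ in ranked[:n_good]]  # score >= score*
        bad = [theta for theta, _ in ranked[n_good:]]   # score <  score*
        g = kde(good, bandwidth)
        l = kde(bad, bandwidth)

        # Draw candidates near good points and keep the one with the largest g/l.
        candidates = [
            min(max(rng.gauss(rng.choice(good), bandwidth), low), high)
            for _ in range(n_candidates)
        ]
        theta = max(candidates, key=lambda c: g(c) / (l(c) + 1e-12))
        history.append((theta, objective(theta)))

    # Best (theta, score) pair seen over the whole run.
    return max(history, key=lambda trial: trial[1])


def toy_objective(theta):
    """Stand-in for the model effect function, with its maximum at theta = 0.3."""
    return -(theta - 0.3) ** 2


best_theta, best_score = tpe_maximize(toy_objective, 0.0, 1.0)
```

In a real setting the objective would train a model with the candidate hyperparameters and return the model effect function value, and a maintained library implementation of TPE would normally replace this toy loop.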

Third, a new model unit is constructed. The unit comprises:

1) The result of the strategy unit: the new model algorithm name and the optimal hyperparameter set.

2) The model evaluation indexes of the new model: the KS statistic, PSI, AUC, GINI value, Lift value, and so on. The evaluation indexes are recorded daily.

Drawings

The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the invention without limiting the invention. In the drawings:

FIG. 1 is a flowchart of the implementation steps of the strategy method for automatically adjusting a model based on a model effect function according to an embodiment of the present invention.

FIG. 2 is an example of code for constructing a domain space in the method disclosed by the embodiment of the invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the drawings of the embodiments of the present invention. It is to be understood that the embodiments described are only a few embodiments of the present invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the described embodiments of the invention without any inventive step, are within the scope of protection of the invention.

The steps of the technical scheme are explained clearly and completely below with reference to an application example.

Step 1: Construct the original model unit.

1) Record the configuration of the original model.

Model build time: January 1, 2020

Model build sample details:

a. Sample name: Corporate customer data 20200101

b. Sample dimensions: 10000 × 142

c. Sample data dictionary:

2) Record the algorithm information of the model.

Algorithm name: XGBoost algorithm

Hyperparameter set:

Parameter	XGBoost name	Value range	Value
Learning rate	eta	[0, 1]	0.3
Leaf number control	gamma	[0, +∞)	0
Maximum tree depth	max_depth	[0, +∞)	6
Minimum sum of leaf weights	min_child_weight	[0, +∞)	1
Row subsample ratio	subsample	(0, 1]	1
Column subsample ratio	colsample_bytree	(0, 1]	1
L2 regularization parameter	lambda	[0, +∞)	1
L1 regularization parameter	alpha	[0, +∞)	0
Positive/negative class balance weight	scale_pos_weight	[0, +∞)	1
…	…	…	…

3) Record the evaluation indexes of the original model.

Date	KS statistic	PSI	AUC	GINI	…
20200101	0.478	0.025	0.763	0.526	…
20200102	0.477	0.023	0.762	0.525	…
…	…	…	…	…	…
20200830	0.221	0.162	0.550	0.100	…
20200831	0.198	0.173	0.562	0.124	…

Step 2: Construct the strategy unit.

1) Set the strategy trigger values. The strategy is triggered when any rule in the following table is satisfied.

Rule number	Index	Comparison	Threshold
1	KS statistic	<	0.2
2	PSI	>	0.1
3	AUC	<	0.6
4	GINI	<	0.2
…	…	…	…
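The rule table can be read as a list of (index, comparison, threshold) triples, with the strategy triggered as soon as any rule holds. The sketch below applies the four listed rules to the original model's indexes recorded on 20200830; the `RULES` structure and the function names are hypothetical.

```python
import operator

# The four rules listed in the table: (index name, comparison, threshold).
RULES = [
    ("KS", operator.lt, 0.2),
    ("PSI", operator.gt, 0.1),
    ("AUC", operator.lt, 0.6),
    ("GINI", operator.lt, 0.2),
]


def triggered_rules(metrics, rules=RULES):
    """Return the 1-based numbers of every rule that the metrics satisfy."""
    return [
        number
        for number, (name, compare, threshold) in enumerate(rules, start=1)
        if compare(metrics[name], threshold)
    ]


def strategy_triggered(metrics, rules=RULES):
    """The strategy fires when at least one rule is satisfied."""
    return bool(triggered_rules(metrics, rules))


# Indexes of the original model recorded on 20200830.
day_20200830 = {"KS": 0.221, "PSI": 0.162, "AUC": 0.550, "GINI": 0.100}
print(triggered_rules(day_20200830))  # prints [2, 3, 4]
```

On 20200830 the KS statistic is still above 0.2, so the trigger comes from the PSI, AUC, and GINI rules.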

2) Determine the strategy trigger time point.

Strategy trigger time point: 20200830

3) Prepare the data and record the details of the sample set.

a. Sample name: Corporate customer data 20200830

b. Sample dimensions: 10000 × 142

c. Sample data dictionary:

4) Customize the strategy.

Set model effect function:

where t = 20200830; t-1 = 20200101; $D_t$ is the sample "Corporate customer data 20200830"; $D_{t-1}$ is the sample "Corporate customer data 20200101"; $\theta_t$ denotes the hyperparameter set of the algorithm in the new model trained on the data of August 30, 2020; and $\theta_{t-1}$ denotes the hyperparameter set of the algorithm in the original model trained on the data of January 1, 2020.

Set domain space:

a. Algorithm name: XGBoost algorithm

b. Domain space configuration table:

Set function solving strategy:

a. System of equations to solve:

b. Solving algorithm: TPE algorithm

Step 3: Construct the new model unit.

1) Record the result of the strategy unit.

New model algorithm name: XGBoost algorithm

Optimal hyperparameter set:

Parameter	XGBoost name	Value range	Optimal value
Learning rate	eta	[0, 1]	0.7
Leaf number control	gamma	[0, +∞)	1
Maximum tree depth	max_depth	[0, +∞)	10
Minimum sum of leaf weights	min_child_weight	[0, +∞)	3
Row subsample ratio	subsample	(0, 1]	2
Column subsample ratio	colsample_bytree	(0, 1]	1
L2 regularization parameter	lambda	[0, +∞)	0.8
L1 regularization parameter	alpha	[0, +∞)	0.1
Positive/negative class balance weight	scale_pos_weight	[0, +∞)	1
…	…	…	…
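Before a recorded hyperparameter set is put into use, it may be worth checking each value against its declared value range. The sketch below encodes the ranges and optimal values exactly as printed in the table; the `RANGES` layout and the `out_of_range` helper are hypothetical.

```python
import math

# Declared ranges as (low, high, low_inclusive, high_inclusive).
RANGES = {
    "eta": (0.0, 1.0, True, True),
    "gamma": (0.0, math.inf, True, False),
    "max_depth": (0.0, math.inf, True, False),
    "min_child_weight": (0.0, math.inf, True, False),
    "subsample": (0.0, 1.0, False, True),
    "colsample_bytree": (0.0, 1.0, False, True),
    "lambda": (0.0, math.inf, True, False),
    "alpha": (0.0, math.inf, True, False),
    "scale_pos_weight": (0.0, math.inf, True, False),
}

# Optimal values as printed in the table.
REPORTED = {
    "eta": 0.7, "gamma": 1, "max_depth": 10, "min_child_weight": 3,
    "subsample": 2, "colsample_bytree": 1, "lambda": 0.8, "alpha": 0.1,
    "scale_pos_weight": 1,
}


def out_of_range(values, ranges):
    """Return the names of parameters whose value lies outside its range."""
    offenders = []
    for name, value in values.items():
        low, high, low_inclusive, high_inclusive = ranges[name]
        above_low = value >= low if low_inclusive else value > low
        below_high = value <= high if high_inclusive else value < high
        if not (above_low and below_high):
            offenders.append(name)
    return offenders


print(out_of_range(REPORTED, RANGES))  # prints ['subsample']
```

Applied to the table as printed, the check flags the row subsample ratio, whose listed value of 2 lies outside the declared range (0, 1].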

2) Record the model evaluation indexes of the new model.

Date	KS statistic	PSI	AUC	GINI	…
20200830	0.517	0.007	0.812	0.624	…
20200831	0.516	0.007	0.811	0.622	…
…	…	…	…	…	…

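In this embodiment, the indexes of the new model trained with the model effect function exceed even those of the original model at the time it was built: on 20200830 the new model reaches a KS statistic of 0.517 and an AUC of 0.812, compared with 0.478 and 0.763 for the original model on 20200101.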
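The comparison can be made explicit by differencing the two recorded rows, the original model on 20200101 and the new model on 20200830. In the sketch below the dictionary layout and the `improvements` helper are hypothetical, and PSI is treated as better when lower while the other indexes are treated as better when higher.

```python
# Recorded indexes of the original model on its build date (20200101)
# and of the new model on the trigger date (20200830).
ORIGINAL_AT_BUILD = {"KS": 0.478, "PSI": 0.025, "AUC": 0.763, "GINI": 0.526}
NEW_AT_TRIGGER = {"KS": 0.517, "PSI": 0.007, "AUC": 0.812, "GINI": 0.624}

LOWER_IS_BETTER = {"PSI"}


def improvements(old, new):
    """Return {index: (delta, improved)} for each index present in `old`."""
    result = {}
    for name, old_value in old.items():
        delta = round(new[name] - old_value, 3)
        improved = delta < 0 if name in LOWER_IS_BETTER else delta > 0
        result[name] = (delta, improved)
    return result


comparison = improvements(ORIGINAL_AT_BUILD, NEW_AT_TRIGGER)
all_improved = all(improved for _, improved in comparison.values())
```

With these rows every index moves in the favorable direction: KS, AUC, and GINI rise while PSI falls.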

Finally, it should be noted that the above examples are intended only to illustrate the technical solution of the present invention, not to limit it. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be equivalently replaced, and that such modifications or substitutions do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.

In summary, the above-mentioned embodiments are only preferred embodiments of the present invention, and all equivalent changes and modifications made in the claims of the present invention should be covered by the claims of the present invention.

Claims

1. A strategy method for automatically adjusting a model based on a model effect function is characterized by comprising the following steps:

S1, constructing an original model unit, which comprises recording the configuration of the original model, recording the algorithm information of the original model, and recording each evaluation index of the original model;

S2, constructing a strategy unit, which comprises setting a strategy trigger value, determining a strategy trigger time point, preparing data, and customizing a strategy, wherein customizing the strategy comprises setting a model effect function, setting a domain space, and setting a function solving strategy;

S3, constructing a new model unit, which comprises recording the name of the new model algorithm, the optimal hyperparameter set, and the model evaluation indexes of the new model.

2. The method of claim 1, wherein the method comprises:

the model effect function described in step S2, which is defined as:

to measure how much stability over time needs attention.

3. The method of claim 1, wherein the method comprises:

the function solving strategy set in step S2 is: using the TPE algorithm to solve the following constrained problem for the optimal hyperparameter set $\theta_t$:

$$\max_{\theta_t} F(D_t, \theta_t, D_{t-1}, \theta_{t-1}) \quad \text{s.t.} \quad PSI_t(D_t, \theta_t) < 0.1, \quad KS_t(D_t, \theta_t) > 0.2$$

where $F(D_t, \theta_t, D_{t-1}, \theta_{t-1})$ is the model effect function in step S2; $PSI_t(D_t, \theta_t) < 0.1$ means that the PSI calculated in period t from the period-t sample $D_t$ and the hyperparameter set $\theta_t$ must be below 0.1; and $KS_t(D_t, \theta_t) > 0.2$ means that the KS statistic calculated in period t from $D_t$ and $\theta_t$ must exceed 0.2.