CN113283673A - Model performance attenuation evaluation method, model training method and device

Info

Publication number: CN113283673A
Application number: CN202110710086.7A
Authority: CN
Inventors: 朱惠嘉; 高民东; 蒋宁; 卞正; 吴海英; 林亚臣; 陈俊旭
Original assignee: Mashang Xiaofei Finance Co Ltd
Current assignee: Mashang Xiaofei Finance Co Ltd
Priority date: 2021-06-25
Filing date: 2021-06-25
Publication date: 2021-08-20

Abstract

The invention provides a model performance attenuation evaluation method, a model training method and a model training device, wherein the method comprises the following steps: acquiring initial evaluation characteristics of a target model; the initial evaluation characteristics of the target model comprise at least one of N groups of service related characteristics of the target model and model performance index values of the target model, and the N groups of service related characteristics of the target model correspond to N evaluation samples of the target model one to one; determining the target evaluation characteristics of the target model according to the initial evaluation characteristics of the target model; and inputting the target evaluation characteristics of the target model into a pre-trained performance attenuation evaluation model to obtain a performance attenuation evaluation result of the target model. The model performance attenuation evaluation method provided by the invention can improve the efficiency of model performance attenuation evaluation.

Description

Model performance attenuation evaluation method, model training method and device

Technical Field

The invention relates to the technical field of information processing, in particular to a model performance attenuation evaluation method, a model training method and a model training device.

Background

With the continuous development of artificial intelligence technology, machine learning models are applied more and more widely in various technical fields. Taking internet finance as an example, anti-fraud, pre-loan approval, in-loan management and post-loan collection models based on logistic regression, ensemble trees, neural networks and the like are built for financial risk control approval. Combined with rules, these models help risk control personnel carry out business operations more effectively: they identify as many high-quality customers as possible while minimizing the number of customers with low repayment willingness or repayment capacity who are misjudged as normal or high-quality customers.

To ensure that a model remains effective, its performance often needs to be evaluated to confirm whether the model has attenuated. At present, once a model has passed performance verification, its online performance is usually evaluated by monitoring, in real time, the model scores, performance index values such as KS (Kolmogorov-Smirnov), AUC (Area Under the ROC Curve) and PSI (Population Stability Index), and the variable distributions. When an index changes abnormally or crosses a set threshold, an abnormal state is prompted, and the responsible model developer evaluates the prompt and judges whether the model has attenuated. The efficiency of model attenuation evaluation in the prior art is therefore low.

Disclosure of Invention

The embodiment of the invention provides a model performance attenuation evaluation method, a model training method and a model training device, and aims to solve the problem that the efficiency of model attenuation evaluation is low in the prior art.

In order to solve the technical problem, the invention is realized as follows:

In a first aspect, an embodiment of the present invention provides a model performance degradation evaluation method. The method comprises the following steps:

acquiring initial evaluation features of a target model; the initial evaluation features of the target model comprise at least one of N groups of service-related features of the target model and model performance index values of the target model, the N groups of service-related features of the target model correspond one to one to N evaluation samples of the target model, and each group of service-related features of the target model comprises at least one of the following: the input variable values that the evaluation sample inputs to the target model, the log information obtained by inputting the evaluation sample into the target model, and the sample score obtained by inputting the evaluation sample into the target model; N is an integer greater than 1;

determining the target evaluation characteristics of the target model according to the initial evaluation characteristics of the target model;

and inputting the target evaluation characteristics of the target model into a pre-trained performance attenuation evaluation model to obtain a performance attenuation evaluation result of the target model.

In a second aspect, an embodiment of the present invention provides a model training method. The method comprises the following steps:

respectively obtaining initial evaluation features of each of S model samples, wherein the initial evaluation features of a model sample comprise at least one of P groups of service-related features of the model sample and model performance index values of the model sample, the P groups of service-related features of the model sample correspond one to one to P evaluation samples of the model sample, and each group of service-related features of the model sample comprises at least one of the following: the input variable values that the evaluation sample inputs to the model sample, the log information obtained by inputting the evaluation sample into the model sample, and the sample score obtained by inputting the evaluation sample into the model sample; S and P are both integers greater than 1;

determining the target evaluation characteristics of each model sample according to the initial evaluation characteristics of each model sample;

and training a performance attenuation evaluation model according to the target evaluation features of each model sample and the labeling parameter of each model sample, wherein the labeling parameter of a model sample indicates the true value of the target attenuation evaluation variable for that model sample.

In a third aspect, an embodiment of the present invention further provides a model performance degradation evaluation apparatus. The model performance attenuation evaluation device comprises:

the first acquisition module is used for acquiring initial evaluation features of the target model; the initial evaluation features of the target model comprise at least one of N groups of service-related features of the target model and model performance index values of the target model, the N groups of service-related features of the target model correspond one to one to N evaluation samples of the target model, and each group of service-related features of the target model comprises at least one of the following: the input variable values that the evaluation sample inputs to the target model, the log information obtained by inputting the evaluation sample into the target model, and the sample score obtained by inputting the evaluation sample into the target model; N is an integer greater than 1;

the first determination module is used for determining the target evaluation characteristics of the target model according to the initial evaluation characteristics of the target model;

and the input module is used for inputting the target evaluation characteristics of the target model into a pre-trained performance attenuation evaluation model to obtain the performance attenuation evaluation result of the target model.

In a fourth aspect, an embodiment of the present invention further provides a model training apparatus. The model training device includes:

a second obtaining module, configured to obtain initial evaluation features of each of S model samples, where the initial evaluation features of a model sample include at least one of P groups of service-related features of the model sample and model performance index values of the model sample, the P groups of service-related features of the model sample correspond one to one to P evaluation samples of the model sample, and each group of service-related features of the model sample includes at least one of: the input variable values that the evaluation sample inputs to the model sample, the log information obtained by inputting the evaluation sample into the model sample, and the sample score obtained by inputting the evaluation sample into the model sample; S and P are both integers greater than 1;

the third determining module is used for determining the target evaluation characteristics of each model sample according to the initial evaluation characteristics of each model sample;

and the training module is used for training a performance attenuation evaluation model according to the target evaluation features of each model sample and the labeling parameter of each model sample, wherein the labeling parameter of a model sample indicates the true value of the target attenuation evaluation variable for that model sample.

In a fifth aspect, an embodiment of the present invention further provides an electronic device, which includes a processor, a memory, and a computer program stored on the memory and executable on the processor, where the computer program, when executed by the processor, implements the steps of the above model performance degradation evaluation method, or implements the steps of the above model training method.

In a sixth aspect, the present invention further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and the computer program, when executed by a processor, implements the steps of the above model performance degradation evaluation method, or implements the steps of the above model training method.

In the embodiment of the invention, performance attenuation evaluation is carried out on the target model directly by the pre-trained performance attenuation evaluation model, without manual judgment, so the efficiency of model performance attenuation evaluation can be improved. In addition, when the evaluation takes into account service-related features such as the input variable values that the evaluation sample inputs to the target model, the log information obtained by inputting the evaluation sample into the target model and the sample score obtained by inputting the evaluation sample into the target model, the accuracy of the model attenuation evaluation can be improved, and the cause of a drop in model performance can be located conveniently and quickly.

Drawings

In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without inventive effort.

FIG. 1 is a flow chart of a model performance decay evaluation method provided by an embodiment of the invention;

FIG. 2 is a schematic diagram of the proportion of attenuated models, under different MOBs, in the model groups that went online in different months, according to an embodiment of the present invention;

FIG. 3 is a flow chart of a model training method provided by an embodiment of the invention;

FIG. 4 is a schematic diagram of a model performance degradation evaluation system provided by an embodiment of the invention;

FIG. 5 is a block diagram of a model performance degradation evaluation apparatus according to an embodiment of the present invention;

FIG. 6 is a block diagram of a model training apparatus according to an embodiment of the present invention;

FIG. 7 is a block diagram of a model performance degradation evaluation apparatus according to still another embodiment of the present invention;

FIG. 8 is a block diagram of a model training apparatus according to still another embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

The embodiment of the invention provides a model performance attenuation evaluation method. Referring to FIG. 1, which is a flowchart of the model performance degradation evaluation method provided by an embodiment of the present invention, the method includes the following steps:

Step 101, obtaining initial evaluation features of a target model; the initial evaluation features of the target model comprise at least one of N groups of service-related features of the target model and model performance index values of the target model, the N groups of service-related features of the target model correspond one to one to N evaluation samples of the target model, and each group of service-related features of the target model comprises at least one of the following: the input variable values that the evaluation sample inputs to the target model, the log information obtained by inputting the evaluation sample into the target model, and the sample score obtained by inputting the evaluation sample into the target model; N is an integer greater than 1.

In this embodiment, the target model may be any classification model to be subjected to performance degradation evaluation, for example, a risk control model.

The evaluation sample of the target model may be any data that can be input into the target model to be scored by it, and may include an input variable value for each input variable of the target model. For example, if the target model is a risk control model whose input variables include a user credit record variable, a loan state variable and an application record variable, an evaluation sample of the risk control model may be any account data that includes user credit record data, loan state data and application record data. Inputting the account data into the risk control model scores the account represented by the account data, so as to distinguish whether the account is a good account or a bad account.

The N evaluation samples of the target model may include evaluation samples input to the target model over a period of time, for example in the last 6 months or the last 9 months. Each evaluation sample has its own group of service-related features. For example, the group of service-related features corresponding to evaluation sample A may include the input variable values that evaluation sample A inputs to the target model, the log information obtained by inputting evaluation sample A into the target model, and the sample score obtained by inputting evaluation sample A into the target model; the group of service-related features corresponding to evaluation sample B may include the input variable values that evaluation sample B inputs to the target model, the log information obtained by inputting evaluation sample B into the target model, and the sample score obtained by inputting evaluation sample B into the target model, and so on.

For example, if the target model has input variables a, b and c, and an evaluation sample has values A, B and C for the input variables a, b and c respectively, then the input variable values that the evaluation sample inputs to the target model include A, B and C.

The log information obtained by inputting the evaluation sample into the target model may refer to the log information returned when the target model processes the evaluation sample, and may include, but is not limited to, at least one of a log code, an input parameter, an output parameter, and the like. The log code may be used to indicate whether an abnormality occurred when the target model processed the evaluation sample, for example log codes of types 404, 601, 602, 701 and 702.

The sample score obtained by inputting the evaluation sample into the target model may be the classification probability output when the target model processes the evaluation sample. For example, if the target model is a risk control model, inputting account data into the risk control model yields a probability indicating whether the account is a bad account.
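As an illustration only, one group of service-related features for a single evaluation sample could be held in a small record such as the following. The class name, field names and example values are hypothetical and do not come from the embodiment; the timestamp field is added on the assumption that samples will later be grouped by time.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, Optional


@dataclass
class ServiceFeatures:
    """One group of service-related features for one evaluation sample (hypothetical layout)."""
    scored_at: datetime             # when the sample was input to the target model
    input_values: Dict[str, float]  # input variable name -> input variable value
    log_code: Optional[str] = None  # log code returned by the target model, if any
    score: Optional[float] = None   # classification probability output by the target model


# Hypothetical account data scored by a target model with three input variables.
sample_a = ServiceFeatures(
    scored_at=datetime(2021, 3, 1, 9, 30),
    input_values={"credit_record": 0.72, "loan_state": 1.0, "application_count": 3.0},
    log_code="404",
    score=0.18,
)
```

Because each group may contain only some of the three kinds of information, the log code and score are optional in this sketch.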

The model performance index value may include, but is not limited to, at least one of a KS value, an AUC value, a PSI value, and the like. Optionally, the model performance index value may include a plurality of model performance index sub-values corresponding to statistical times, where each statistical time is based on the same statistical period, the statistical period may be reasonably set according to an actual situation, for example, 1 day, half a month, or 1 month, and the model performance index sub-value may also include, but is not limited to, at least one of a KS value, an AUC value, a PSI value, and the like. Optionally, the embodiment may also calculate and record the corresponding model performance index value according to the dimensions of week, month, and the like.
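The embodiment does not give formulas for these index values. The following sketch uses the commonly used textbook definitions of AUC, KS and PSI and only the Python standard library; the function names, the bin edges and the smoothing constant `eps` are assumptions made for illustration.

```python
import math
from typing import List, Sequence


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def ks(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Largest gap between the cumulative score distributions of positives and negatives."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    best = 0.0
    for t in sorted(set(scores)):
        cdf_pos = sum(s <= t for s in pos) / len(pos)
        cdf_neg = sum(s <= t for s in neg) / len(neg)
        best = max(best, abs(cdf_pos - cdf_neg))
    return best


def psi(expected: Sequence[float], actual: Sequence[float],
        edges: Sequence[float], eps: float = 1e-6) -> float:
    """Population Stability Index between a reference sample and a current sample."""
    def shares(values: Sequence[float]) -> List[float]:
        counts = [0] * (len(edges) + 1)
        for v in values:
            counts[sum(v > e for e in edges)] += 1  # index of the bin that v falls into
        return [max(c / len(values), eps) for c in counts]  # eps avoids log(0)

    e, a = shares(expected), shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ai, ei in zip(a, e))
```

Calling these functions once per statistical period (for example per day or per month) would yield the model performance index sub-values mentioned above.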

Step 102, determining the target evaluation features of the target model according to the initial evaluation features of the target model.

In one embodiment, the initial evaluation feature of the target model may be directly used as the target evaluation feature of the target model in step 102. For example, if the initial evaluation feature of the target model includes a model performance index value of the target model, the model performance index value of the target model may be directly input to the performance degradation evaluation model as the target evaluation feature of the target model.

In another embodiment, the initial evaluation features of the target model may be aggregated to obtain the target evaluation features of the target model. For example, if the initial evaluation features of the target model include N groups of service-related features of the target model, the N groups of service-related features of the target model may be aggregated according to a time dimension, and the target evaluation features of the target model may be determined based on the aggregated evaluation features; alternatively, if the initial evaluation feature of the target model includes N sets of service-related features of the target model and a model performance index value of the target model, where the model performance index value of the target model includes model performance index sub-values corresponding to a plurality of statistical times, the N sets of service-related features may be aggregated according to a time dimension, the model performance index sub-values corresponding to the plurality of statistical times of the target model may be subjected to arithmetic processing such as averaging, maximum value, or minimum value, and the target evaluation feature of the target model may be determined based on the aggregated first evaluation feature and the second evaluation feature after the arithmetic processing.
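The arithmetic processing of model performance index sub-values described above can be sketched as follows. This is a minimal illustration, assuming the sub-values are already available as one list per index; the function name, the feature naming scheme and the example numbers are hypothetical.

```python
from statistics import mean
from typing import Dict, Sequence


def summarize_sub_values(sub_values: Dict[str, Sequence[float]]) -> Dict[str, float]:
    """Collapse per-period index sub-values into mean, maximum and minimum features."""
    features: Dict[str, float] = {}
    for name, values in sub_values.items():
        features[f"{name}_mean"] = mean(values)
        features[f"{name}_max"] = max(values)
        features[f"{name}_min"] = min(values)
    return features


# Hypothetical monthly sub-values for two performance indices.
monthly = {"ks": [0.40, 0.35, 0.30], "psi": [0.02, 0.05, 0.11]}
second_evaluation_feature = summarize_sub_values(monthly)
```

The resulting dictionary plays the role of the second evaluation feature, which would be combined with the aggregated first evaluation feature to form the target evaluation features.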

Step 103, inputting the target evaluation characteristics of the target model into a pre-trained performance attenuation evaluation model to obtain a performance attenuation evaluation result of the target model.

In this embodiment, the performance degradation evaluation model may be a logistic regression model, an ensemble tree model, a decision tree model, or a fusion model combining multiple models. The ensemble tree model may include, but is not limited to, LightGBM, CatBoost, XGBoost, or the like. Specifically, when the performance degradation evaluation model is a logistic regression model, its better interpretability makes it easier to locate the cause of model performance degradation; when the performance degradation evaluation model is an ensemble tree model or a fusion model, the model performance degradation evaluation result is more accurate than with a logistic regression model.

The target attenuation evaluation variable of the performance attenuation evaluation model is used to measure or define whether the evaluated model has performance attenuation, and may be set according to actual requirements. For example, the target attenuation evaluation variable may be the probability that the number of model attenuation days exceeds 30 within the 4 months after the target time point. The performance attenuation evaluation model then outputs that probability value: if the probability value is greater than a threshold, the evaluated model is determined to have performance attenuation; if the probability value is less than or equal to the threshold, the evaluated model is determined not to have performance attenuation. The threshold may be set according to actual requirements, for example 0.5.

The performance degradation evaluation model may be a model trained on the target evaluation features and labeling parameters of a large number of model samples, where the labeling parameter indicates the true value of the target degradation evaluation variable for the model sample. For example, when the target degradation evaluation variable is the probability that the number of model degradation days exceeds 30 within the 4 months after the target time point, the labeling parameter of a model sample is 1 if its number of model degradation days within the 4 months after the target time point exceeds 30, and 0 otherwise.
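The labeling rule in this example can be sketched as below. The sketch assumes that the days on which a model sample counted as degraded are already known (this passage does not say how a degraded day is determined) and approximates 4 months as 120 days; the function and parameter names are hypothetical.

```python
from datetime import date, timedelta
from typing import Iterable


def labeling_parameter(degraded_days: Iterable[date], target_time_point: date,
                       horizon_days: int = 120, min_degraded_days: int = 30) -> int:
    """Return 1 if more than `min_degraded_days` degraded days fall within the
    horizon after the target time point, and 0 otherwise."""
    end = target_time_point + timedelta(days=horizon_days)
    # Count each calendar day once, and only days after the target time point.
    count = sum(1 for d in set(degraded_days) if target_time_point < d <= end)
    return 1 if count > min_degraded_days else 0
```

With this rule, a model sample with exactly 30 degraded days in the horizon is labeled 0, because the example requires the count to exceed 30.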

Correspondingly, the performance degradation evaluation result is a first value used to represent whether the target model has performance degradation, the first value being the probability that the number of model degradation days of the target model exceeds a target number of days within a target time period. Taking as an example a target degradation evaluation variable that is the probability that the number of model degradation days exceeds 30 within the 4 months after the target time point, the performance degradation evaluation result may be the probability value that the number of model degradation days of the target model exceeds 30 within the 4 months after the target time point. If this probability value is greater than a threshold, the target model has performance degradation; if it is less than or equal to the threshold, the target model has no performance degradation.
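A minimal sketch of how such a probability value could be produced and compared with the threshold is given below, assuming a logistic regression form for the performance degradation evaluation model. The weights, bias and feature names are hypothetical placeholders, not values from the embodiment; in practice they would come from training.

```python
import math
from typing import Dict


def degradation_probability(features: Dict[str, float],
                            weights: Dict[str, float], bias: float) -> float:
    """Logistic regression score: sigmoid of a weighted sum of the target evaluation features."""
    z = bias + sum(weights[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))


def has_performance_degradation(probability: float, threshold: float = 0.5) -> bool:
    """The model is judged degraded only if the probability is strictly above the threshold."""
    return probability > threshold


# Hypothetical target evaluation feature and hypothetical trained parameters.
p = degradation_probability({"psi_max": 0.3}, weights={"psi_max": 10.0}, bias=-1.0)
degraded = has_performance_degradation(p)
```

The strict comparison mirrors the text: a probability value equal to the threshold is treated as no performance degradation.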

It should be noted that the target time point may be any selected time point. The target time point is related to the evaluation samples used to determine the target evaluation features that are input to the performance degradation evaluation model. For example, when the target evaluation features input to the performance degradation evaluation model are those of the target model, the evaluation samples used to determine them may be the evaluation samples input to the target model within a preset time period before the target time point; when the target evaluation features input to the performance degradation evaluation model are those of a model sample, the evaluation samples used to determine them may be the evaluation samples input to the model sample within a preset time period before the target time point. The preset time period may be set according to actual conditions, for example 6 months or 3 months.

Further, the target time point in the training phase of the performance degradation evaluation model and the target time point in its application phase may differ. For example, with a preset time period of 3 months, the target time point in the training phase may be March 1, 2021, so that the target evaluation features of a model sample are determined from the evaluation samples input to the model sample within the 3 months before March 1, 2021 and are input to the performance degradation evaluation model to train it; the target time point in the application phase may be the current date, so that the target evaluation features of the target model are determined from the evaluation samples input to the target model within the 3 months before the current date and are input to the performance degradation evaluation model to perform the performance degradation evaluation on the target model.
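The selection of evaluation samples by target time point can be illustrated as follows. The sketch approximates the 3-month preset time period as 90 days and represents each evaluation sample as a (timestamp, payload) pair; both choices, and all names, are assumptions made for illustration.

```python
from datetime import datetime, timedelta
from typing import Any, List, Sequence, Tuple

Sample = Tuple[datetime, Any]  # (time the sample was input to the model, payload)


def samples_in_window(samples: Sequence[Sample], target_time_point: datetime,
                      window_days: int = 90) -> List[Sample]:
    """Keep the samples input within `window_days` before the target time point."""
    start = target_time_point - timedelta(days=window_days)
    return [s for s in samples if start <= s[0] < target_time_point]


history = [
    (datetime(2020, 11, 15), "too old"),
    (datetime(2021, 1, 10), "in window"),
    (datetime(2021, 2, 28), "in window"),
    (datetime(2021, 3, 5), "after target time point"),
]
# Training phase: a fixed past target time point.
training_window = samples_in_window(history, datetime(2021, 3, 1))
```

In the application phase the same function would be called with the current date as the target time point, so the two phases differ only in that argument.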

According to the model performance attenuation evaluation method provided by the embodiment of the invention, performance attenuation evaluation is carried out on the target model directly by the pre-trained performance attenuation evaluation model, without manual judgment, so the efficiency of model performance attenuation evaluation can be improved. In addition, when the evaluation takes into account service-related features such as the input variable values that the evaluation sample inputs to the target model, the log information obtained by inputting the evaluation sample into the target model and the sample score obtained by inputting the evaluation sample into the target model, the accuracy of the model attenuation evaluation can be improved, and causes of model performance degradation, such as variable inconsistency, model package errors and interface configuration errors, can be located conveniently and quickly.

Optionally, the initial evaluation features of the target model include N sets of service-related features of the target model, the N evaluation samples include evaluation samples input into the target model within M statistical times with a same statistical period as a statistical reference, and M is a positive integer;

the determining the target evaluation characteristics of the target model according to the initial evaluation characteristics of the target model comprises:

respectively carrying out aggregation calculation on service relevant features corresponding to the same statistical time in the N groups of service relevant features of the target model to obtain aggregation evaluation features corresponding to each statistical time of the target model;

and determining the target evaluation characteristics of the target model according to the aggregation evaluation characteristics corresponding to the M statistical times of the target model.

In this embodiment, the statistical period may be set reasonably according to actual conditions, for example, 1 hour, 1 day, half month, or 1 month. The M statistical times may be M consecutive statistical times based on the same statistical period, for example, if the statistical period is 1 day, the M statistical times may be M consecutive days; alternatively, the M statistical times may be M discontinuous statistical times based on the same statistical period, for example, if the statistical period is 1 day, the M statistical times may be M discontinuous days.

The service-related features corresponding to the same statistical time may refer to the service-related features corresponding to all the evaluation samples input into the target model within the same statistical time, for example, the service-related features corresponding to all the evaluation samples input into the target model within the same day. The following description will be given by taking an example in which the statistical period is 1 day and M is 60.

For the N groups of service-related features, the service-related features corresponding to all evaluation samples input into the target model on each day may be aggregated to obtain the aggregation evaluation feature corresponding to each of the 60 days, and the target evaluation features of the target model may then be determined from the aggregation evaluation features corresponding to each of the 60 days.
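A per-day aggregation of this kind can be sketched as below. The three aggregates chosen here (sample count, mean score, and share of samples with an abnormal log code) are illustrative assumptions rather than the aggregate evaluation features defined by the embodiment, and the set of codes treated as abnormal is passed in by the caller as a hypothetical parameter.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean
from typing import Dict, Iterable, Optional, Set, Tuple

Record = Tuple[datetime, float, Optional[str]]  # (timestamp, sample score, log code)


def aggregate_by_day(records: Iterable[Record],
                     abnormal_codes: Set[str]) -> Dict[object, Dict[str, float]]:
    """Group service-related features by calendar day and aggregate each group."""
    by_day = defaultdict(list)
    for timestamp, score, log_code in records:
        by_day[timestamp.date()].append((score, log_code))

    aggregated = {}
    for day, rows in sorted(by_day.items()):
        aggregated[day] = {
            "sample_count": len(rows),
            "mean_score": mean(score for score, _ in rows),
            "abnormal_log_share": sum(code in abnormal_codes for _, code in rows) / len(rows),
        }
    return aggregated


records = [
    (datetime(2021, 3, 1, 9), 0.2, None),
    (datetime(2021, 3, 1, 17), 0.4, "404"),
    (datetime(2021, 3, 2, 8), 0.9, None),
]
daily = aggregate_by_day(records, abnormal_codes={"404"})
```

With a statistical period of 1 day and M equal to 60, the returned dictionary would hold 60 entries, one per day, from which the target evaluation features are then derived.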

In the case where the initial evaluation feature further includes a model performance index value, the target evaluation feature may include a first evaluation feature determined from the aggregate evaluation features corresponding to the M statistical times and a model performance index value, or the target evaluation feature may include a first evaluation feature determined from the aggregate evaluation features corresponding to the M statistical times and a second evaluation feature determined based on the model performance index value, for example, the model performance index value includes model performance index sub-values corresponding to the plurality of statistical times, and the second evaluation feature may be an evaluation feature obtained by performing arithmetic processing such as averaging, maximum value, or minimum value on the model performance index sub-values corresponding to the plurality of statistical times.

According to the embodiment of the invention, the aggregation calculation is respectively carried out on the service related characteristics corresponding to the same statistical time in the N groups of service related characteristics of the target model, so that the aggregation evaluation characteristics corresponding to each statistical time of the target model are obtained, and the target evaluation characteristics of the target model are determined according to the aggregation evaluation characteristics corresponding to the M statistical times of the target model, so that more diversified characteristics effective for evaluating model attenuation can be obtained, and the accuracy of model attenuation evaluation can be further improved.

Optionally, each group of service-related features of the target model includes respective input variable values of the evaluation sample input into the target model, and the aggregate evaluation feature corresponding to each statistical time of the target model includes at least one of the following:

a first variable index value of the target model, wherein the first variable index value of the target model is a similarity between an input variable value matrix of the target model and a reference variable value matrix of the target model, the input variable value matrix of the target model is a matrix formed by a first variable value set of the target model, the reference variable value matrix of the target model is a matrix formed by a reference variable value set corresponding to the first variable value set of the target model, and the first variable value set of the target model comprises various input variable values of the target model in the same statistical time;

a second variable index value of the target model, the second variable index value of the target model being a value determined according to the number of first variable values corresponding to each input variable of the target model, the first variable values corresponding to each input variable of the target model being variable values different from corresponding reference variable values in a second variable value set corresponding to each input variable of the target model, the second variable value sets corresponding to each input variable of the target model respectively including input variable values corresponding to each input variable of the target model within the same statistical time;

a third variable index value of the target model, where the third variable index value of the target model is a value determined according to the number of second variable values corresponding to each input variable of the target model, the second variable values corresponding to each input variable of the target model are input variable values of known abnormalities in the third variable value sets corresponding to each input variable of the target model, and the third variable value sets corresponding to each input variable of the target model respectively include input variable values corresponding to each input variable of the target model within the same statistical time;

a fourth variable index value of the target model, wherein the fourth variable index value of the target model is a value determined according to the number of statistical abnormal values corresponding to each input variable of the target model, the statistical abnormal values corresponding to each input variable of the target model are input variable values determined as abnormal by a preset abnormal value test method in third variable values corresponding to each input variable of the target model, the third variable values corresponding to each input variable of the target model are input variable values except for known abnormal input variable values in a fourth variable value set corresponding to each input variable of the target model, and the fourth variable value set corresponding to each input variable of the target model respectively comprises the input variable values corresponding to each input variable of the target model within the same statistical time;

a fifth variable index value of the target model, wherein the fifth variable index value of the target model is a value determined according to PSI values corresponding to the input variables of the target model, the PSI value corresponding to each input variable of the target model is a PSI value calculated from a fifth variable value set and a sixth variable value set corresponding to that input variable, the fifth variable value set corresponding to each input variable of the target model comprises the input variable values corresponding to that input variable within the same statistical time, and the sixth variable value set corresponding to each input variable of the target model comprises the variable values corresponding to that input variable within a reference statistical time.

In this embodiment, the aggregation evaluation feature corresponding to each of the M statistical times may include at least one of a first variable index value, a second variable index value, a third variable index value, a fourth variable index value, and a fifth variable index value.

It is to be understood that the first variable index value, the second variable index value, the third variable index value, the fourth variable index value, and the fifth variable index value corresponding to each statistical time are determined based on the input variable values located within the statistical time, that is, the input variable values of the evaluation samples input to the target model within the statistical time. For example, the aggregation evaluation feature corresponding to the statistical time a is determined based on the input variable value located within the statistical time a, and for example, the first variable value set corresponding to the first variable index value corresponding to the statistical time a includes the input variable value located within the statistical time a among the input variable values of the target model; for the aggregate evaluation feature corresponding to the statistical time b, the aggregate evaluation feature is determined based on the input variable value located within the statistical time b, and for example, the first variable value set corresponding to the first variable index value located within the statistical time b includes the input variable value located within the statistical time b among the input variable values of the target model, and so on.

The reference variable value set corresponding to the first variable value set of the target model may include reference variable values corresponding to respective input variable values in the first variable value set of the target model, where the reference variable values corresponding to the input variable values may refer to variable values stored in a database corresponding to the input variable values, and in actual application, the on-line data input to the target model is often stored in the database correspondingly, and the input variable values are usually consistent with the corresponding reference variable values under the condition of no errors in operation in the target model deployment process.
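The embodiment does not fix a particular similarity measure for the first variable index value. The following is a minimal sketch that assumes cosine similarity over the flattened matrices; the function name `matrix_similarity` and the sample matrices are hypothetical.

```python
import math

def matrix_similarity(input_matrix, reference_matrix):
    """Cosine similarity between two equally shaped matrices, flattened row by row.

    Cosine similarity is an assumption; the source only requires "a similarity".
    """
    a = [v for row in input_matrix for v in row]
    b = [v for row in reference_matrix for v in row]
    if len(a) != len(b):
        raise ValueError("matrices must have the same shape")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Rows are evaluation samples within one statistical time, columns are input variables.
online = [[1.0, 2.0], [3.0, 4.0]]    # values actually input to the target model
stored = [[1.0, 2.0], [3.0, 4.0]]    # reference values stored in the database
drifted = [[1.0, 2.0], [3.0, 0.0]]   # one value differs from the stored copy
```

When deployment is error-free the two matrices coincide and the similarity is 1; any discrepancy lowers it.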

The second variable index value of the target model may be a value determined based on the number of first variable values corresponding to each input variable of the target model, and may be, for example, a maximum value among the number of first variable values corresponding to each input variable of the target model, or an average value of the number of first variable values corresponding to each input variable of the target model.
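A minimal sketch of this computation is given below, assuming that the input variable values and their reference variable values are held as equally long, aligned lists per input variable. The names `mismatch_counts` and `second_variable_index` are hypothetical.

```python
def mismatch_counts(online_values, reference_values):
    """Per input variable, count values that differ from the stored reference value."""
    return {
        name: sum(1 for o, r in zip(values, reference_values[name]) if o != r)
        for name, values in online_values.items()
    }

def second_variable_index(counts, how="max"):
    """Aggregate the per-variable mismatch counts by maximum or by average."""
    values = list(counts.values())
    return max(values) if how == "max" else sum(values) / len(values)

# Input variable values within one statistical time and their reference values.
online = {"x1": [10, 20, 30], "x2": [1.5, 2.5, 3.5]}
reference = {"x1": [10, 21, 30], "x2": [1.5, 2.0, 3.0]}
counts = mismatch_counts(online, reference)
```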

The third variable index value of the target model may be a value determined according to the number of second variable values corresponding to each input variable of the target model. For example, it may be the maximum among the numbers of second variable values corresponding to the respective input variables of the target model, or the maximum among the ratios, for the respective input variables, of the number of second variable values to the number of input variable values in the third variable value set corresponding to that input variable. The input variable values of known abnormality described above may refer to input variable values that can be directly determined to be abnormal, for example, input variable values that have been labeled as abnormal by the data source provider. As another example, if the input variable value corresponding to a user credit variable is -9999, the input variable value is evidently abnormal.
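The counting described above can be sketched as follows, assuming that known abnormal values are identifiable by membership in a sentinel set (here only -9999, as in the example). The names `KNOWN_ANOMALY_VALUES`, `known_anomaly_stats`, and `third_variable_index`, and the variable names in the sample data, are hypothetical.

```python
KNOWN_ANOMALY_VALUES = frozenset({-9999})  # assumed sentinel values

def known_anomaly_stats(values_by_variable, sentinels=KNOWN_ANOMALY_VALUES):
    """Per input variable, return (count, ratio) of known abnormal values."""
    stats = {}
    for name, values in values_by_variable.items():
        count = sum(1 for v in values if v in sentinels)
        stats[name] = (count, count / len(values) if values else 0.0)
    return stats

def third_variable_index(stats, use_ratio=False):
    """Maximum count, or maximum ratio, across the input variables."""
    position = 1 if use_ratio else 0
    return max(s[position] for s in stats.values())

# Third variable value sets for one statistical time.
day = {"credit_score": [650, -9999, 700, -9999], "age": [30, 41, -9999, 25]}
stats = known_anomaly_stats(day)
```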

The fourth variable index value of the target model is determined according to the number of statistical abnormal values corresponding to each input variable of the target model. For example, the fourth variable index value of the target model may be the maximum among the numbers of statistical abnormal values corresponding to the respective input variables of the target model, or the maximum among the statistical abnormal value ratios corresponding to the respective input variables, where the statistical abnormal value ratio corresponding to an input variable may be the ratio of the number of statistical abnormal values corresponding to that input variable to the number of input variable values in the fourth variable value set corresponding to that input variable. A statistical abnormal value is an input variable value determined to be abnormal by a preset abnormal value test method. The preset abnormal value test method may include, but is not limited to, a graph-based test (Tukey test) and a Z-score test. It should be noted that, for the specific manner of the preset abnormal value test method, reference may be made to the related art, and details are not described herein.
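One possible realization is sketched below, assuming Tukey's fences with the conventional multiplier 1.5 as the preset abnormal value test method and -9999 as the only known abnormal value. The function names and sample data are hypothetical, and `statistics.quantiles` requires Python 3.8 or later.

```python
from statistics import quantiles

def tukey_outliers(values, k=1.5):
    """Values outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's fences)."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

def fourth_variable_index(values_by_variable, known_anomalies=frozenset({-9999}),
                          use_ratio=False):
    """Maximum count (or ratio) of statistical abnormal values across variables."""
    scores = []
    for values in values_by_variable.values():
        # Third variable values: the set with known abnormal values removed.
        candidates = [v for v in values if v not in known_anomalies]
        n_outliers = len(tukey_outliers(candidates))
        # The ratio is taken over the full fourth variable value set.
        scores.append(n_outliers / len(values) if use_ratio else n_outliers)
    return max(scores)

day = {
    "x1": [10, 11, 12, 13, 14, 15, 16, 17, 100, -9999],
    "x2": [1, 2, 3, 4, 5],
}
```

Removing the known abnormal values first keeps sentinel values such as -9999 from distorting the quartiles used by the test.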

The fifth variable index value of the target model is determined according to the PSI values corresponding to the input variables of the target model, for example, the fifth variable index value of the target model may be an average value of the PSI values corresponding to the input variables of the target model, or the fifth variable index value of the target model may be a maximum value of the PSI values corresponding to the input variables of the target model. The reference statistical time may be at least one statistical time including a time when the target model is on line, or may include at least one statistical time including a time corresponding to a training set used for training the target model, and for example, a statistical period is 1 day, the reference statistical time may be at least one day including a date when the target model is on line or at least one day including a date corresponding to a training set used for training the target model, where the number of days included in the reference statistical time may be determined according to the amount of data input into the target model and environmental factors, for example, if the amount of data input into the target model is large on the date when the target model is on line, the reference statistical time may be a date when the target model is on line.
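The PSI-based aggregation can be sketched as follows, assuming that PSI is computed with the commonly used formula, the sum over bins of (current share - reference share) * ln(current share / reference share), with fixed bin edges per input variable and a small floor to avoid taking the logarithm of zero. The binning scheme, the floor `eps`, and all names are assumptions.

```python
import math
from bisect import bisect_right

def bin_shares(values, edges, eps=1e-6):
    """Share of values in each bin defined by sorted edges, floored at eps."""
    counts = [0] * (len(edges) + 1)
    for v in values:
        counts[bisect_right(edges, v)] += 1
    return [max(c / len(values), eps) for c in counts]

def psi(current, reference, edges):
    """PSI between the fifth (current) and sixth (reference) variable value sets."""
    cur, ref = bin_shares(current, edges), bin_shares(reference, edges)
    return sum((c - r) * math.log(c / r) for c, r in zip(cur, ref))

def fifth_variable_index(current_by_var, reference_by_var, edges_by_var, how="mean"):
    """Average or maximum of the per-variable PSI values."""
    values = [psi(current_by_var[n], reference_by_var[n], edges_by_var[n])
              for n in current_by_var]
    return max(values) if how == "max" else sum(values) / len(values)

current = {"x1": [1, 2, 3, 4], "x2": [1, 2, 2, 2]}      # one statistical time
reference = {"x1": [1, 2, 3, 4], "x2": [1, 1, 2, 2]}    # reference statistical time
edges = {"x1": [2.5], "x2": [1.5]}
```

Here x1 is unchanged, so its PSI is 0, while x2 has shifted toward the upper bin and yields a positive PSI.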

The present embodiment is described below by taking, as an example, an input variable x1 and an input variable x2 of the target model, and a statistical time a1 and a statistical time a2 among the M statistical times, where the input variable x1 and the input variable x2 may be any two input variables of the target model, and the statistical time a1 and the statistical time a2 may be any two of the M statistical times:

The first variable index value corresponding to the statistical time a1 may be a similarity between the input variable value matrix corresponding to the statistical time a1 and the reference variable value matrix corresponding to the statistical time a1, the input variable value matrix corresponding to the statistical time a1 may be a matrix formed by the first variable value set corresponding to the statistical time a1, the reference variable value matrix corresponding to the statistical time a1 may be a matrix formed by the reference variable value set corresponding to the first variable value set corresponding to the statistical time a1, and the first variable value set corresponding to the statistical time a1 may include the respective input variable values of the target model within the statistical time a1. The first variable index value corresponding to the statistical time a2 may be a similarity between the input variable value matrix corresponding to the statistical time a2 and the reference variable value matrix corresponding to the statistical time a2, the input variable value matrix corresponding to the statistical time a2 may be a matrix formed by the first variable value set corresponding to the statistical time a2, the reference variable value matrix corresponding to the statistical time a2 may be a matrix formed by the reference variable value set corresponding to the first variable value set corresponding to the statistical time a2, and the first variable value set corresponding to the statistical time a2 may include the respective input variable values of the target model within the statistical time a2. By analogy, the first variable index value corresponding to each statistical time in the M statistical times can be obtained.

The second variable index value corresponding to the statistical time a1 may be a value determined according to the number of first variable values corresponding to each input variable of the target model, wherein the first variable value corresponding to the input variable x1 may be a variable value different from a corresponding reference variable value in a second variable value set corresponding to the input variable x1, and the second variable value set corresponding to the input variable x1 includes the input variable values corresponding to the input variable x1 within the statistical time a1; the first variable value corresponding to the input variable x2 may be a variable value different from the corresponding reference variable value in the second variable value set corresponding to the input variable x2, and the second variable value set corresponding to the input variable x2 includes the input variable values corresponding to the input variable x2 within the statistical time a1; by analogy, the first variable values corresponding to the input variables of the target model can be obtained.
The second variable index value corresponding to the statistical time a2 may be a value determined according to the number of first variable values corresponding to each input variable of the target model, where the first variable value corresponding to the input variable x1 may be a variable value different from a corresponding reference variable value in a second variable value set corresponding to the input variable x1, and the second variable value set corresponding to the input variable x1 includes the input variable values corresponding to the input variable x1 within the statistical time a2; the first variable value corresponding to the input variable x2 may be a variable value different from a corresponding reference variable value in the second variable value set corresponding to the input variable x2, and the second variable value set corresponding to the input variable x2 includes the input variable values corresponding to the input variable x2 within the statistical time a2; by analogy, the first variable values corresponding to the input variables of the target model can be obtained. By analogy, the second variable index value corresponding to each statistical time in the M statistical times can be obtained.

The third variable index value corresponding to the statistical time a1 may be a value determined according to the number of second variable values corresponding to each input variable of the target model, where the second variable value corresponding to the input variable x1 is an input variable value of known abnormality in the third variable value set corresponding to the input variable x1, and the third variable value set corresponding to the input variable x1 includes the input variable values corresponding to the input variable x1 within the statistical time a1; the second variable value corresponding to the input variable x2 is an input variable value of known abnormality in the third variable value set corresponding to the input variable x2, and the third variable value set corresponding to the input variable x2 includes the input variable values corresponding to the input variable x2 within the statistical time a1; by analogy, the second variable values corresponding to the input variables of the target model can be obtained.
The third variable index value corresponding to the statistical time a2 may be a value determined according to the number of second variable values corresponding to each input variable of the target model, where the second variable value corresponding to the input variable x1 is an input variable value of known abnormality in the third variable value set corresponding to the input variable x1, and the third variable value set corresponding to the input variable x1 includes the input variable values corresponding to the input variable x1 within the statistical time a2; the second variable value corresponding to the input variable x2 is an input variable value of known abnormality in the third variable value set corresponding to the input variable x2, and the third variable value set corresponding to the input variable x2 includes the input variable values corresponding to the input variable x2 within the statistical time a2; by analogy, the second variable values corresponding to the input variables of the target model can be obtained. By analogy, the third variable index value corresponding to each statistical time in the M statistical times can be obtained.

The fourth variable index value corresponding to the statistical time a1 may be a value determined according to the number of statistical abnormal values corresponding to each input variable of the target model, where the statistical abnormal value corresponding to the input variable x1 is an input variable value determined as abnormal by a preset abnormal value test method among the third variable values corresponding to the input variable x1, the third variable values corresponding to the input variable x1 are the input variable values other than the input variable values of known abnormality in the fourth variable value set corresponding to the input variable x1, and the fourth variable value set corresponding to the input variable x1 includes the input variable values corresponding to the input variable x1 within the statistical time a1; the statistical abnormal value corresponding to the input variable x2 is an input variable value determined as abnormal by the preset abnormal value test method among the third variable values corresponding to the input variable x2, the third variable values corresponding to the input variable x2 are the input variable values other than the input variable values of known abnormality in the fourth variable value set corresponding to the input variable x2, and the fourth variable value set corresponding to the input variable x2 includes the input variable values corresponding to the input variable x2 within the statistical time a1; by analogy, the statistical abnormal values corresponding to the input variables of the target model can be obtained.
The fourth variable index value corresponding to the statistical time a2 may be a value determined according to the number of statistical abnormal values corresponding to each input variable of the target model, where the statistical abnormal value corresponding to the input variable x1 is an input variable value determined as abnormal by a preset abnormal value test method among the third variable values corresponding to the input variable x1, the third variable values corresponding to the input variable x1 are the input variable values other than the input variable values of known abnormality in the fourth variable value set corresponding to the input variable x1, and the fourth variable value set corresponding to the input variable x1 includes the input variable values corresponding to the input variable x1 within the statistical time a2; the statistical abnormal value corresponding to the input variable x2 is an input variable value determined as abnormal by the preset abnormal value test method among the third variable values corresponding to the input variable x2, the third variable values corresponding to the input variable x2 are the input variable values other than the input variable values of known abnormality in the fourth variable value set corresponding to the input variable x2, and the fourth variable value set corresponding to the input variable x2 includes the input variable values corresponding to the input variable x2 within the statistical time a2; by analogy, the statistical abnormal values corresponding to the input variables of the target model can be obtained. By analogy, the fourth variable index value corresponding to each statistical time in the M statistical times can be obtained.

The fifth variable index value corresponding to the statistical time a1 may be determined according to the PSI values corresponding to the input variables of the target model, where the PSI value corresponding to the input variable x1 is a PSI value calculated from a fifth variable value set and a sixth variable value set corresponding to the input variable x1, the fifth variable value set corresponding to the input variable x1 includes the input variable values corresponding to the input variable x1 within the statistical time a1, and the sixth variable value set corresponding to the input variable x1 includes the variable values corresponding to the input variable x1 within the reference statistical time; the PSI value corresponding to the input variable x2 is a PSI value calculated from a fifth variable value set and a sixth variable value set corresponding to the input variable x2, the fifth variable value set corresponding to the input variable x2 includes the input variable values corresponding to the input variable x2 within the statistical time a1, and the sixth variable value set corresponding to the input variable x2 includes the variable values corresponding to the input variable x2 within the reference statistical time; by analogy, the PSI values corresponding to all input variables of the target model can be obtained.
The fifth variable index value corresponding to the statistical time a2 may be determined according to the PSI values corresponding to the input variables of the target model, where the PSI value corresponding to the input variable x1 is a PSI value calculated from a fifth variable value set and a sixth variable value set corresponding to the input variable x1, the fifth variable value set corresponding to the input variable x1 includes the input variable values corresponding to the input variable x1 within the statistical time a2, and the sixth variable value set corresponding to the input variable x1 includes the variable values corresponding to the input variable x1 within the reference statistical time; the PSI value corresponding to the input variable x2 is a PSI value calculated from a fifth variable value set and a sixth variable value set corresponding to the input variable x2, the fifth variable value set corresponding to the input variable x2 includes the input variable values corresponding to the input variable x2 within the statistical time a2, and the sixth variable value set corresponding to the input variable x2 includes the variable values corresponding to the input variable x2 within the reference statistical time; by analogy, the PSI values corresponding to all input variables of the target model can be obtained. By analogy, the fifth variable index value corresponding to each statistical time in the M statistical times can be obtained.

In the embodiment of the present invention, when each group of service related features of the target model includes each input variable value of the evaluation sample input into the target model, each aggregate evaluation feature corresponding to each statistical time includes at least one of a first variable index value of the target model, a second variable index value of the target model, a third variable index value of the target model, a fourth variable index value of the target model, and a fifth variable index value of the target model, and since the aggregate features of the input variables can reflect the abnormal situations of the input variables more accurately and stably, the influence of the abnormal situations related to the input variables of the target model on model performance degradation can be reflected more accurately based on the aggregate features, and thus the accuracy of model degradation evaluation can be improved.

Optionally, each group of service-related features of the target model includes log information obtained by inputting the evaluation sample into the target model, and the aggregate evaluation feature corresponding to each statistical time of the target model includes at least one of the following:

the log code vector set of the target model comprises vectors obtained by performing one-hot coding on each log code in a first log information set of the target model, and the first log information set of the target model comprises log information belonging to the same statistical time in N pieces of log information;

the format detection value set of the target model comprises format detection values corresponding to the respective pieces of log information in a second log information set of the target model, the second log information set of the target model comprises the log information belonging to the same statistical time among the N pieces of log information, and each format detection value is used for indicating whether the input/output parameter format in the corresponding log information is consistent with a reference format.

In this embodiment, the aggregation evaluation feature corresponding to each of the M statistical times may include at least one of a log code vector set of the target model and a format detection value set of the target model. It can be understood that, for each statistical time, the set of log code vectors of the target model and the set of format detection values of the target model are determined based on the log information corresponding to the statistical time.

The types of the log code may include, but are not limited to, 404, 601, 602, 701, and 702. Taking the log code types 404, 601, 602, 701 and 702 as an example, if the log codes returned by the target model include 701, 601 and 404, the log code vector obtained by performing one-hot encoding on the log codes is (1, 1, 0, 1, 0), where each position corresponds to one log code type in the order listed and takes the value 1 when that log code was returned. The set of log code vectors of the target model may include the log code vectors obtained by performing one-hot encoding on each log code in the first log information set of the target model.
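The encoding in this example can be reproduced with the short sketch below, which assumes that the log code types are known in advance and ordered as listed. The names `LOG_CODE_TYPES` and `encode_log_codes` are hypothetical.

```python
LOG_CODE_TYPES = ("404", "601", "602", "701", "702")  # order fixes the vector positions

def encode_log_codes(returned_codes, code_types=LOG_CODE_TYPES):
    """Return a 0/1 vector with a 1 at the position of every returned log code."""
    present = set(returned_codes)
    return tuple(1 if code in present else 0 for code in code_types)

vector = encode_log_codes(["701", "601", "404"])
```

For the returned codes 701, 601 and 404 this yields (1, 1, 0, 1, 0), matching the vector given above.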

The input/output parameter format in the log information may include the format of the input parameters and the format of the output parameters in the log information. For example, suppose the reference format is as follows: (input variable #1: input variable value), (input variable #2: input variable value), (output variable #1: output variable value), (output variable #2: no corresponding output variable). If the input/output parameter format in a certain piece of log information is (input variable #1: input variable value), (input variable #2: input variable value), (output variable #1: output variable value), (output variable #2: ""), the input/output parameter format in the log information does not match the reference format, and the corresponding format detection value may take a first value, for example, 0, to indicate the mismatch. If the input/output parameter format in a certain piece of log information is (input variable #1: input variable value), (input variable #2: input variable value), (output variable #1: output variable value), (output variable #2: no corresponding output variable), the input/output parameter format in the log information matches the reference format, and the corresponding format detection value may take a second value, for example, 1, to indicate the match.
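A format check of this kind might be sketched as follows. The sketch assumes that the parameters of each piece of log information are available as a dictionary and that the reference format can be expressed as an expected type per field, with `None` standing for "no corresponding output variable"; the field names and this representation are hypothetical.

```python
# Hypothetical reference format: field name -> expected Python type.
REFERENCE_FORMAT = {
    "input_1": float,
    "input_2": float,
    "output_1": float,
    "output_2": type(None),  # no corresponding output variable
}

def format_detection_value(log_params, reference=REFERENCE_FORMAT):
    """Return 1 when the log's parameters match the reference format, else 0."""
    if set(log_params) != set(reference):
        return 0
    matches = all(isinstance(log_params[k], t) for k, t in reference.items())
    return 1 if matches else 0

logs = [
    {"input_1": 0.3, "input_2": 1.2, "output_1": 0.8, "output_2": None},
    {"input_1": 0.3, "input_2": 1.2, "output_1": 0.8, "output_2": ""},  # empty string
]
format_detection_values = [format_detection_value(p) for p in logs]
```

The second record carries an empty string where the reference format expects no output variable, so it receives the mismatch value 0.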

The following describes the present embodiment by taking a statistical time a1 and a statistical time a2 included in the M statistical times as an example, where the statistical time a1 and the statistical time a2 may be any two statistical times of the M statistical times:

the set of log code vectors corresponding to the statistical time a1 may include a vector obtained by performing one-hot encoding on each log code in the first log information set corresponding to the statistical time a1, and the first log information set corresponding to the statistical time a1 includes log information located within the statistical time a1 in the N pieces of log information; the set of log code vectors corresponding to the statistical time a2 may include a vector obtained by performing one-hot encoding on each log code in the first log information set corresponding to the statistical time a2, and the first log information set corresponding to the statistical time a2 includes log information located within the statistical time a2 in the N pieces of log information; by analogy, a log code vector set corresponding to each statistical time in the M statistical times can be obtained.

The format detection value set corresponding to the statistical time a1 may include format detection values corresponding to respective log information in a second log information set corresponding to the statistical time a1, and the second log information set corresponding to the statistical time a1 includes log information located within the statistical time a1 of the N pieces of log information; the format detection value set corresponding to the statistical time a2 may include format detection values corresponding to respective log information in a second log information set corresponding to the statistical time a2, and the second log information set corresponding to the statistical time a2 includes log information located within the statistical time a2 of the N pieces of log information; by analogy, a format detection value set corresponding to each statistical time in the M statistical times can be obtained.

In the embodiment of the invention, under the condition that each group of service related characteristics of the target model comprises log information obtained by inputting the evaluation sample into the target model, the aggregation evaluation characteristics corresponding to each statistical time comprise at least one of a log code vector set of the target model and a format detection value set of the target model, and because the aggregation characteristics of the log information can more accurately and stably reflect the abnormal conditions of the log information, the influence of the abnormal conditions related to the log information of the target model on the model performance attenuation can be more accurately reflected on the basis of the characteristics, and the accuracy of model attenuation evaluation can be further improved.

Optionally, each group of service-related features of the target model includes a sample score obtained by inputting the evaluation sample into the target model, and the aggregate evaluation feature corresponding to each statistical time of the target model includes at least one of the following:

the score abnormality index value of the target model is a value determined according to the number of abnormal sample scores of the target model, an abnormal sample score of the target model is a sample score, in a first sample score set of the target model, that is different from the corresponding reference sample score, and the first sample score set of the target model comprises the sample scores belonging to the same statistical time among the N sample scores;

the score abnormality detection value set of the target model comprises score abnormality detection values corresponding to the respective sample scores in a second sample score set of the target model, the second sample score set of the target model comprises the sample scores belonging to the same statistical time among the N sample scores, and each score abnormality detection value is used for indicating whether the corresponding sample score is located in a preset interval.

In this embodiment, the aggregation evaluation feature corresponding to each of the M statistical times may include at least one of a score anomaly index value of the target model and a score anomaly detection value set of the target model. It is understood that, for each statistical time, the score anomaly index value of the target model and the score anomaly detection value set of the target model are determined based on the sample score corresponding to the statistical time.

The reference sample score may be a sample score obtained by inputting a reference sample into the target model. In practical application, the on-line data input into the target model is often also stored in a database, and, provided that no operational error occurs in the deployment process of the target model, the evaluation sample is generally consistent with the corresponding reference sample. The score abnormality index value of the target model is a value determined according to the number of abnormal sample scores of the target model. For example, the score abnormality index value of the target model may be the number of abnormal sample scores of the target model, or may be the ratio of the number of abnormal sample scores of the target model to the number of sample scores in the first sample score set of the target model.
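A minimal sketch of the score abnormality index value follows, assuming that the sample scores and reference sample scores are aligned lists and that two scores count as different when they are not equal within a small tolerance. The tolerance `tol` and all names are assumptions.

```python
import math

def score_anomaly_index(sample_scores, reference_scores, as_ratio=False, tol=1e-9):
    """Count (or ratio) of sample scores that differ from their reference score."""
    n_abnormal = sum(
        1 for s, r in zip(sample_scores, reference_scores)
        if not math.isclose(s, r, abs_tol=tol)
    )
    return n_abnormal / len(sample_scores) if as_ratio else n_abnormal

# First sample score set for one statistical time and the reference sample scores.
online_scores = [0.12, 0.80, 0.45, 0.33]
reference_scores = [0.12, 0.75, 0.45, 0.30]
```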

The preset interval may be reasonably set according to actual conditions; for example, in the case that the sample score is in the form of a probability, the preset interval may be [0, 1]. The score anomaly detection value is used to indicate whether the corresponding sample score is within the preset interval. For example, if a sample score is within the preset interval, the score anomaly detection value may be a third value, for example 1, to indicate that the sample score is within the preset interval; if a sample score is not within the preset interval, the score anomaly detection value may be a fourth value, for example 0, to indicate that the sample score is not within the preset interval, that is, the sample score is abnormal.
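A minimal sketch of the score anomaly detection values, using the example encoding given above (1 for a score inside the preset interval, 0 for a score outside it) and the example interval [0, 1]. The function name and default arguments are illustrative assumptions.

```python
from typing import List, Sequence


def score_detection_values(
    scores: Sequence[float], low: float = 0.0, high: float = 1.0
) -> List[int]:
    """Map each sample score to 1 (inside [low, high]) or 0 (outside)."""
    return [1 if low <= s <= high else 0 for s in scores]


# Hypothetical scores; 1.3 and -0.1 are not valid probabilities.
flags = score_detection_values([0.2, 1.3, -0.1, 1.0])
```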

The following describes the present embodiment by taking M statistical times including statistical time a1 and statistical time a2 as an example, where the statistical time a1 and the statistical time a2 may be any two statistical times of the M statistical times:

the score anomaly index value corresponding to the statistical time a1 may be a value determined according to the number of abnormal sample scores corresponding to the statistical time a1, where an abnormal sample score corresponding to the statistical time a1 is a sample score in the first sample score set corresponding to the statistical time a1 that differs from its corresponding reference sample score, and the first sample score set corresponding to the statistical time a1 comprises the sample scores, among the N sample scores, that fall within the statistical time a1; the score anomaly index value corresponding to the statistical time a2 may be a value determined according to the number of abnormal sample scores corresponding to the statistical time a2, where an abnormal sample score corresponding to the statistical time a2 is a sample score in the first sample score set corresponding to the statistical time a2 that differs from its corresponding reference sample score, and the first sample score set corresponding to the statistical time a2 comprises the sample scores, among the N sample scores, that fall within the statistical time a2; by analogy, the score anomaly index value corresponding to each of the M statistical times can be obtained.

The score abnormality detection value set corresponding to the statistical time a1 may include the score abnormality detection values corresponding to the respective sample scores in a second sample score set corresponding to the statistical time a1, and the second sample score set corresponding to the statistical time a1 comprises the sample scores, among the N sample scores, that fall within the statistical time a1; the score abnormality detection value set corresponding to the statistical time a2 may include the score abnormality detection values corresponding to the respective sample scores in a second sample score set corresponding to the statistical time a2, and the second sample score set corresponding to the statistical time a2 comprises the sample scores, among the N sample scores, that fall within the statistical time a2; by analogy, the score anomaly detection value set corresponding to each of the M statistical times can be obtained.

In the embodiment of the invention, under the condition that each group of service related characteristics of the target model comprises a sample score obtained by inputting the evaluation sample into the target model, the aggregation evaluation characteristics corresponding to each statistical time comprise at least one of a score abnormal index value of the target model and a score abnormal detection value set of the target model, and the aggregation characteristics of the sample score can more accurately and stably reflect the abnormal condition of the sample score, so the influence of the abnormal condition related to the sample score of the target model on the model performance attenuation can be more accurately reflected on the basis of the characteristics, and the accuracy of the model attenuation evaluation can be further improved.

For example, in a case that each set of service related features of the target model includes each input variable value of the evaluation sample input into the target model, log information obtained by the evaluation sample input into the target model, and a sample score obtained by the evaluation sample input into the target model, the aggregate evaluation feature may include: at least one of a first variable index value of the target model, a second variable index value of the target model, a third variable index value of the target model, a fourth variable index value of the target model, and a fifth variable index value of the target model, at least one of a set of log code vectors of the target model and a set of format detection values of the target model, and at least one of a score anomaly index value of the target model and a set of score anomaly detection values of the target model.

Optionally, M is an integer greater than 1, the target evaluation features of the target model at least include RFM features of the target model, and the RFM features of the target model are features obtained by performing RFM analysis on the aggregate evaluation features corresponding to M statistical times of the target model.

In this embodiment, RFM analysis is performed on the aggregate evaluation features corresponding to the M statistical times, that is, these aggregate evaluation features are analyzed along three dimensions, R (Recency, the most recent time), F (Frequency), and M (Monetary, total amount), so as to derive more features for evaluating the attenuation condition of the target model.

Illustratively, the above-mentioned RFM feature may include at least one of an R feature, an F feature, and an M feature. The R feature may include at least one of the second variable index value, the third variable index value, the fourth variable index value, and the score anomaly index value of the target model corresponding to the latest of the M statistical times, and/or the time difference between the current statistical time and the latest of the M statistical times at which the corresponding one of the second variable index value, the third variable index value, the fourth variable index value, and the score anomaly index value of the target model exceeds its respective preset value.

The F feature may include the number of score abnormality detection values, in the score abnormality detection value sets of the target model corresponding to a plurality of statistical times, that indicate that the corresponding sample score is not located in the preset interval, and/or the number of format detection values, in the format detection value sets of the target model corresponding to a plurality of statistical times, that indicate that the input/output parameter format in the corresponding log information is inconsistent with the reference format.

The M-feature may include a sum of second variable index values of the target model corresponding to a plurality of statistical times, and/or a sum of score abnormality index values of the target model corresponding to a plurality of statistical times, and the like.

It is understood that the target evaluation characteristics of the target model of the present embodiment may include the RFM characteristics of the target model, and may also include evaluation characteristics other than the RFM characteristics of the target model, for example, the above-mentioned aggregate evaluation characteristics corresponding to M statistical times may also be included.

According to the embodiment of the invention, more characteristics which are more effective for model attenuation evaluation are derived by performing RFM analysis on the aggregation evaluation characteristics corresponding to M statistical times, so that the accuracy of model attenuation evaluation can be improved.

Optionally, the aggregation evaluation feature corresponding to each statistical time of the target model includes a second variable index value of the target model, a format detection value set of the target model, and a score anomaly index value of the target model;

the second variable index value of the target model is a value determined according to the number of first variable values corresponding to each input variable of the target model, the first variable value corresponding to the input variable of the target model is a variable value different from a corresponding reference variable value in a second variable value set corresponding to the input variable of the target model, and the second variable value set of the target model comprises the input variable values corresponding to the input variables of the target model in the same statistical time;

the format detection value set of the target model comprises the format detection values corresponding to all log information in a second log information set of the target model, the second log information set of the target model comprises the log information, among the N pieces of log information, that belongs to the same statistical time, and each format detection value is used for indicating whether the input/output parameter format in the corresponding log information is consistent with the reference format;

the score abnormal index value of the target model is a value determined according to the number of abnormal sample scores of the target model, the abnormal sample score of the target model is a sample score which is different from a corresponding reference sample score in a first sample score set of the target model, and the first sample score set of the target model comprises sample scores which are positioned in the same statistical time in N sample scores;

the RFM characteristics of the target model comprise an R index value of the target model, an F index value of the target model and an M index value of the target model;

the R index value of the target model is the time difference between a first statistical time of the target model and the current statistical time, the first statistical time of the target model is the statistical time, in a first statistical time set of the target model, with the smallest time interval from the current statistical time, and the first statistical time set of the target model comprises the statistical times, among the M statistical times, at which the corresponding second variable index value exceeds a preset value;

the F index value of the target model is the number of first format detection values in a target format detection value set, the target format detection value set comprises the format detection value sets corresponding to all statistical times in a second statistical time set of the target model, a first format detection value indicates that the input/output parameter format in the corresponding log information is inconsistent with the reference format, the second statistical time set of the target model comprises any K adjacent statistical times among the M statistical times, and K is a positive integer less than or equal to M;

the M index value of the target model is the sum of all score abnormality index values in a score abnormality index value set, the score abnormality index value set comprises score abnormality index values corresponding to all statistical times in a third statistical time set of the target model, the third statistical time set of the target model comprises any L adjacent statistical times in the M statistical times, and L is a positive integer less than or equal to M.

In this embodiment, the second variable index value of the target model, the format detection value set of the target model, and the score anomaly index value of the target model may be referred to the corresponding descriptions, and are not repeated herein to avoid repetition.

For the current statistical time, for example, if the statistical period is 1 day, the current statistical time is the current day, and if the statistical period is 1 week, the current statistical time is the current week. The values of K and L can be set reasonably according to actual requirements, for example, K is 30 or 60, and L is 15 or 30. The preset value can be set reasonably according to actual conditions, for example, 30, 50 or 100.

In addition, when the M pieces of statistical time are M continuous statistical times, the K adjacent statistical times may be any K continuous statistical times among the M pieces of statistical times, and the L adjacent statistical times may be any L continuous statistical times among the M pieces of statistical times.

Taking a statistical period of 1 day, M = 90, K = 30, and L = 15 as an example, the R index value may be the number of days between the current date and the most recent day, among the 90 days, on which the corresponding second variable index value exceeds the preset value; the F index value may be the number of format detection values, in the format detection value sets of the target model corresponding to the last 30 of the 90 days, that indicate that the input/output parameter format in the corresponding log information is inconsistent with the reference format, or the number of such format detection values in the format detection value sets of the target model corresponding to the 30th to 60th days of the 90 days, and so on; the M index value may be the sum of the score abnormality index values of the target model corresponding to the last 15 of the 90 days, or the sum of the score abnormality index values of the target model corresponding to the 15th to 30th days of the 90 days, and so on.
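The R, F and M index values in the example above can be sketched in a few lines. This is only an illustration under stated assumptions: the per-day series are ordered oldest first with the last element being the current statistical time, a format detection value of 0 is assumed to mean "inconsistent with the reference format", the windows are taken as the most recent K and L days, and all function names are hypothetical.

```python
from typing import Optional, Sequence


def r_index(second_var_index: Sequence[float], preset: float) -> Optional[int]:
    """Days between the current day and the most recent day exceeding `preset`.

    Returns None if no day in the window exceeds the preset value.
    """
    for days_ago, value in enumerate(reversed(second_var_index)):
        if value > preset:
            return days_ago
    return None


def f_index(format_flags_per_day: Sequence[Sequence[int]], k: int) -> int:
    """Count format detection values equal to 0 (assumed: inconsistent) in the last k days."""
    return sum(1 for day in format_flags_per_day[-k:] for flag in day if flag == 0)


def m_index(score_index_per_day: Sequence[float], window: int) -> float:
    """Sum of the score abnormality index values over the last `window` days."""
    return sum(score_index_per_day[-window:])


# Hypothetical 5-day history (M = 5), with K = 3 and L = 2 for brevity.
r_value = r_index([0, 40, 10, 5, 0], preset=30)
f_value = f_index([[1, 1], [0, 1], [1, 0], [1, 1], [0, 0]], k=3)
m_value = m_index([1, 0, 2, 3, 4], window=2)
```

With M = 90, K = 30 and L = 15 as in the text, the same functions would be called on 90-element series with `k=30` and `window=15`.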

In this embodiment, the R index value, the F index value, and the M index value are obtained by further performing statistical analysis on the corresponding second variable index value, the format detection value set of the target model, and the score abnormality index value of the target model, respectively, so that the R index value, the F index value, and the M index value have higher stability and validity than the corresponding second variable index value, the corresponding format detection value set of the target model, and the score abnormality index value of the target model, and thus the accuracy of model attenuation evaluation can be further improved.

Optionally, inputting the target evaluation feature of the target model into a pre-trained performance degradation evaluation model to obtain a performance degradation evaluation result of the target model, where the method includes:

performing evidence weight conversion on the target evaluation characteristics of the target model to obtain an evidence weight value of the target model;

and inputting the evidence weight value of the target model into a pre-trained performance attenuation evaluation model to obtain a performance attenuation evaluation result of the target model.

In this embodiment, evidence weight conversion may be performed on the target evaluation features of the target model based on the following formula to obtain the evidence weight value $X_{ij}$ corresponding to each target evaluation feature of the target model:

$$X_{ij} = \sum_{k=1}^{n} w_{jk}\,\delta_{jk}$$

where $w_{jk}$ is the weight of evidence (WOE) value, in the k-th bin, of the j-th evaluation feature $x_{ij}$ of the target evaluation features of the i-th model, and i and j are positive integers. $\delta_{jk}$ is a binary dummy variable: if the value of $x_{ij}$ falls in the k-th bin, then $\delta_{jk} = 1$; otherwise $\delta_{jk} = 0$. The number of bins n and the bin ranges can be reasonably set according to actual requirements, and different evaluation features can correspond to different numbers of bins and different bin ranges. In this embodiment, the i-th model is the target model.
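The evidence weight conversion can be illustrated with a short sketch, assuming the one-hot form implied by the definitions above: exactly one dummy variable equals 1 (the bin containing the feature value), so the converted value is the WOE value of that bin. The bin cut points and WOE values below are hypothetical, and `woe_transform` is an illustrative name.

```python
from bisect import bisect_right
from typing import Sequence


def woe_transform(
    x: float, bin_edges: Sequence[float], woe_values: Sequence[float]
) -> float:
    """Return sum_k w_k * delta_k for a single feature value x.

    bin_edges are ascending interior cut points, giving len(bin_edges) + 1 bins;
    a value equal to a cut point is assigned to the bin on its right (assumption).
    """
    if len(woe_values) != len(bin_edges) + 1:
        raise ValueError("need exactly one WOE value per bin")
    k = bisect_right(bin_edges, x)  # index of the bin that contains x
    deltas = [1 if idx == k else 0 for idx in range(len(woe_values))]
    return sum(w * d for w, d in zip(woe_values, deltas))


# Hypothetical binning of an R index value into 3 bins: <10, [10, 30), >=30.
edges = [10, 30]
woes = [-0.8, 0.1, 1.2]
x_low = woe_transform(3, edges, woes)
x_mid = woe_transform(12, edges, woes)
x_high = woe_transform(45, edges, woes)
```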

According to the embodiment of the invention, the evidence weight conversion is carried out on the target evaluation characteristics of the target model, so that the contribution degree of each evaluation characteristic to the model performance evaluation result can be intuitively obtained based on the evidence weight value corresponding to each evaluation characteristic, and the subsequent analysis on the attenuation reason of the model is further facilitated.

Optionally, the performance decay evaluation model is a logistic regression model; the method further comprises the following steps:

and determining the feature scores of all the evaluation features of the target evaluation features according to the logistic regression model and outputting the feature scores.

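Taking the case in which the target evaluation features include three features, namely the R index value $x_{11}$, the F index value $x_{12}$ and the M index value $x_{13}$, as an example, the above logistic regression model is as follows:

$$\ln(\text{odds}) = \beta_0 + \beta_1\,(w_{11}\delta_{11} + w_{12}\delta_{12} + \dots + w_{1r}\delta_{1r}) + \beta_2\,(w_{21}\delta_{21} + w_{22}\delta_{22} + \dots + w_{2s}\delta_{2s}) + \beta_3\,(w_{31}\delta_{31} + w_{32}\delta_{32} + \dots + w_{3t}\delta_{3t})$$

where r, s and t denote the numbers of bins corresponding to the R index value, the F index value and the M index value, respectively, and $\beta_0$, $\beta_1$, $\beta_2$ and $\beta_3$ are all weights of the logistic regression model. Based on the logistic regression model, the feature score of each evaluation feature of the target evaluation features can be obtained. For example, the feature score of the R index value is $\beta_1\,(w_{11}\delta_{11} + w_{12}\delta_{12} + w_{13}\delta_{13} + \dots + w_{1r}\delta_{1r})$, the feature score of the F index value is $\beta_2\,(w_{21}\delta_{21} + w_{22}\delta_{22} + w_{23}\delta_{23} + \dots + w_{2s}\delta_{2s})$, and the feature score of the M index value is $\beta_3\,(w_{31}\delta_{31} + w_{32}\delta_{32} + w_{33}\delta_{33} + \dots + w_{3t}\delta_{3t})$.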
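As a rough illustration of the feature scores, the sketch below multiplies each feature's converted WOE value by its logistic regression weight and sums the scores with the intercept to obtain the log-odds and the attenuation probability. All weights and WOE values are hypothetical, and the dictionary-based layout is an assumption made for readability.

```python
import math
from typing import Dict


def feature_scores(betas: Dict[str, float], woe: Dict[str, float]) -> Dict[str, float]:
    """Feature score of each evaluation feature: weight times its WOE value."""
    return {name: betas[name] * value for name, value in woe.items()}


def decay_probability(beta0: float, scores: Dict[str, float]) -> float:
    """Probability from log-odds = intercept + sum of feature scores."""
    z = beta0 + sum(scores.values())
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical fitted weights and WOE values for the R, F and M index values.
betas = {"R": 0.5, "F": 2.0, "M": -1.0}
woe = {"R": 0.4, "F": 0.25, "M": 0.1}
scores = feature_scores(betas, woe)
prob = decay_probability(-1.0, scores)
top_feature = max(scores, key=scores.get)
```

In this made-up example the F index value has the largest feature score, so it would be the first candidate to examine when looking for the cause of a predicted attenuation.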

Based on the feature scores of the evaluation features of the target evaluation features, the causes of model attenuation, such as variable inconsistency, model package errors, interface configuration errors and customer group changes, can be analyzed more intuitively. Specifically, on the one hand, the degree to which each evaluation feature contributes to the target model being predicted as performance-attenuated can be determined from the feature score of each evaluation feature, where a larger feature score means a higher contribution. On the other hand, the other evaluated models that the performance attenuation evaluation model predicts as not performance-attenuated can be examined to identify the main factors causing the target model to be predicted as performance-attenuated; that is, the different effects of each evaluation feature on models predicted as attenuated and on models predicted as not attenuated are considered separately. For example, the mean value of each evaluation feature over the other evaluated models may be calculated, and the ratio of the difference between each evaluation feature of the target model and the corresponding mean to that mean may be calculated to obtain a first ratio for each evaluation feature. The first ratio reflects how much each evaluation feature differs between models predicted as performance-attenuated and models predicted as not performance-attenuated: the larger the first ratio, the greater the difference.
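The first ratio can be sketched as follows, assuming each model's evaluation features are held in a dictionary keyed by feature name. The handling of a zero mean (returning `None`) is an assumption, since the text does not address that case, and the feature values are hypothetical.

```python
from typing import Dict, List, Optional


def first_ratios(
    target: Dict[str, float], others: List[Dict[str, float]]
) -> Dict[str, Optional[float]]:
    """(target value - mean over other models) / mean, per evaluation feature."""
    ratios: Dict[str, Optional[float]] = {}
    for name, value in target.items():
        mean = sum(model[name] for model in others) / len(others)
        # Assumed convention: the ratio is undefined when the mean is zero.
        ratios[name] = (value - mean) / mean if mean != 0 else None
    return ratios


# Hypothetical R and F index values of the target model and of two other
# models predicted as not performance-attenuated.
ratios = first_ratios(
    {"R": 3, "F": 12},
    [{"R": 1, "F": 4}, {"R": 3, "F": 6}],
)
```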

Further, the evaluation feature with the largest influence on the target model being predicted as performance-attenuated can be determined by combining the feature scores and the first ratios of the evaluation features, and the corresponding business problems can then be checked against the specific value of that evaluation feature. For example, the R index value may be the number of days within the last 15 days on which the corresponding second variable index value exceeds a preset value, the F index value may be the number of format detection values, in the format detection value sets corresponding to the last 30 days, that indicate that the input/output parameter format in the corresponding log information is inconsistent with the reference format, and the M index value may be the average of the fifth variable index values corresponding to the last 30 days. If the feature score and the first ratio of the F index value are the largest, the F index value may be determined to be the evaluation feature with the largest influence on the target model being predicted as performance-attenuated, and whether the prediction is caused by a change of the log format may be further considered; if the feature score and the first ratio of the R index value are the largest, the R index value may be determined to be the evaluation feature with the largest influence, and whether the prediction is caused by factors such as variable inconsistency and interface configuration errors may be further considered; if the feature score and the first ratio of the M index value are the largest, the M index value may be determined to be the evaluation feature with the largest influence, and whether the prediction is caused by a change in the customer group may be further considered.

Optionally, in this embodiment, the evidence weight value and the feature score of each evaluation feature of the target evaluation features in each bin may further be obtained and output for a user to view, for example, as shown in Table 1.

TABLE 1

According to the method and the device, the characteristic scores of all the evaluation characteristics of the target evaluation characteristics are determined according to the logistic regression model and are output, and the probability of model performance attenuation and the characteristic scores of all the evaluation characteristics can be effectively determined due to the fact that the logistic regression model has good interpretability, and the reason of the model performance attenuation can be conveniently and visually located.

Optionally, the target attenuation evaluation variable of the performance attenuation evaluation model is: the number of model decay days in the target time period exceeds the target number of days;

the performance attenuation evaluation result is a first value used for representing whether the target model has performance attenuation or not, and the first value is a probability value that the number of model attenuation days of the target model exceeds the number of target days in a target time period.

In this embodiment, the target attenuation evaluation variable is used to measure or define whether the model to be evaluated has performance attenuation. For example, if the value of the target attenuation evaluation variable is 1, the model to be evaluated has performance attenuation, and if the value is 0, the model to be evaluated does not have performance attenuation. The target number of days may be a number of days set according to an empirical value, or may be a number of days determined according to the decay condition of the S model samples used for training the performance attenuation evaluation model. For example, the target number of days may be 30. The target time period may be a time period set based on an empirical value, or may be a time period determined based on the decay condition of the S model samples used for training the performance attenuation evaluation model. For example, the target time period may be the 4 months after the target time point.

Optionally, the performance decay evaluation model is trained according to target evaluation features corresponding to each model sample in S model samples and a labeling parameter, where the labeling parameter is used to indicate a true value of the model sample corresponding to the target decay evaluation variable, and S is an integer greater than 1.

In this embodiment, the model sample may be any classification model. The specific content of the target evaluation feature corresponding to each model sample may refer to the target evaluation feature corresponding to the target model, and is not described herein again to avoid repetition.

The above-mentioned labeling parameter is used to indicate the true value of the model sample corresponding to the target decay evaluation variable. For example, when the target decay evaluation variable is whether the number of model decay days exceeds 30 days within the 4 months after the target time point, the true value of a model sample corresponding to the target decay evaluation variable is 1 if its number of model decay days exceeds 30 days within the 4 months after the target time point, and is 0 if its number of model decay days does not exceed 30 days within the 4 months after the target time point.

Optionally, the target evaluation features corresponding to each model sample may be further subjected to evidence weight conversion to obtain an evidence weight value corresponding to the target evaluation features corresponding to each model sample, and the performance decay evaluation model is trained based on the evidence weight value corresponding to the target evaluation features corresponding to each model sample and the labeling parameters.

The following describes training of the performance degradation evaluation model, taking the performance degradation evaluation model as a logistic regression model and three evaluation features including an R index value, an F index value, and an M index value as target evaluation features as examples:

the target evaluation characteristics and labeling parameters corresponding to the S model samples may be as shown in table 2.

TABLE 2

Where model_id represents the identifier of the model sample, $x_{ij}$ represents the value of the j-th feature of the i-th model sample, i takes values in [1, S], and j takes values in [1, 3]. MDD₃₀MOB₄ represents the labeling parameter of the model sample corresponding to the target attenuation evaluation variable, and its value is 0 or 1.

TABLE 3

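The logistic regression model can be expressed as follows:

$$\pi_i = \frac{1}{1 + e^{-z}} \quad \text{(formula III)}$$

where $z = \ln(\text{odds}) = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \beta_3 X_{i3}$ (formula IV)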

By inputting the WOE values of the evaluation features of each model sample shown in Table 3 into formula III above, the estimated probability $\pi_i$ of the target attenuation evaluation variable MDD₃₀MOB₄ can be obtained. According to the maximum likelihood principle, the likelihood function should reach its maximum when the estimated probabilities $\pi_i$ agree with the corresponding actual results (i.e., true values). Since the value of the target attenuation evaluation variable is 0 or 1, the likelihood function can be expressed as:

$$L(\beta) = \prod_{i=1}^{S} \pi_i^{\,y_i}\,(1 - \pi_i)^{\,1 - y_i}$$

where $y_i$ denotes the true value corresponding to the estimated probability $\pi_i$, that is, the value of the target attenuation evaluation variable. Each weight of the logistic regression model (that is, $\beta_0$ to $\beta_3$) can be determined by the gradient ascent method, yielding the final model attenuation evaluation model, namely formula II.
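The training step can be illustrated with a small pure-Python sketch of gradient ascent on the log of the Bernoulli likelihood of a logistic regression model. It is not the applicant's implementation: the learning rate, the number of steps, the averaging of the gradient and the WOE feature values are all assumptions chosen for illustration.

```python
import math
from typing import List, Sequence


def predict(beta: Sequence[float], row: Sequence[float]) -> float:
    """Estimated probability for one model sample (beta[0] is the intercept)."""
    z = beta[0] + sum(b * x for b, x in zip(beta[1:], row))
    return 1.0 / (1.0 + math.exp(-z))


def log_likelihood(beta, X, y) -> float:
    """Log of the product of pi^y * (1 - pi)^(1 - y) over all samples."""
    return sum(
        label * math.log(predict(beta, row))
        + (1 - label) * math.log(1.0 - predict(beta, row))
        for row, label in zip(X, y)
    )


def fit_gradient_ascent(X, y, lr: float = 0.1, steps: int = 500) -> List[float]:
    """Maximize the log-likelihood by batch gradient ascent."""
    beta = [0.0] * (len(X[0]) + 1)
    for _ in range(steps):
        grad = [0.0] * len(beta)
        for row, label in zip(X, y):
            err = label - predict(beta, row)  # derivative of log-likelihood w.r.t. z
            grad[0] += err
            for j, x in enumerate(row, start=1):
                grad[j] += err * x
        beta = [b + lr * g / len(X) for b, g in zip(beta, grad)]
    return beta


# Hypothetical WOE values (R, F, M) and 0/1 labels for 8 model samples.
X = [[0.9, 0.7, 0.5], [0.8, -0.2, 0.6], [-0.5, -0.6, -0.4], [-0.7, 0.3, -0.8],
     [0.6, 0.5, -0.3], [-0.4, -0.5, 0.2], [0.5, 0.4, 0.3], [-0.6, -0.3, -0.2]]
y = [1, 1, 0, 0, 1, 0, 0, 1]
beta_hat = fit_gradient_ascent(X, y)
ll_start = log_likelihood([0.0, 0.0, 0.0, 0.0], X, y)
ll_end = log_likelihood(beta_hat, X, y)
```

In practice a library solver would normally be used instead; the loop is written out only to make the likelihood and its gradient explicit.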

Optionally, the target number of days is a value determined by performing a rolling rate analysis on the decay number of days of the S model samples.

The rolling rate analysis of the decay days of the S model samples is described below with reference to examples:

a determination criterion of model performance degradation may be preset to determine whether the model has performance degradation based on the determination criterion of model performance degradation, for example, the model score satisfies a preset condition.

For example, in practice a model generally can no longer be used once it has decayed for more than 90 days, where model decay is determined based on the above criterion of model performance attenuation. However, 90 days is too long, because a correspondingly longer time is needed for 90 days of decay to be fully exposed. A relatively short value T can therefore be sought such that, once the model has decayed for T days, there is a high probability that it will go on to decay for 90 days (indicating that the model can no longer be used). The value T can be determined by roll rate analysis (Roll Rate Analysis).

Illustratively, the roll rate analysis proceeds as follows. An observation point is selected, and a 6-month performance window is examined both before and after it, so the observation point must be at least 6 months earlier than the current time; it is understood that this embodiment is not limited to 6 months. S model samples that were all online throughout this 12-month period are observed, and for each model sample the number of decay days in the 6 months before the observation point and in the 6 months after the observation point can be determined. For example, for a model Model_i, the number of decay days in the first 6 months is MDD_Bi and the number of decay days in the last 6 months is MDD_Ai. The number of decay days is divided into five segments: 0 days, 1-30 days, 31-60 days, 61-90 days, and more than 90 days. For the S model samples, the decay-day segments corresponding to MDD_Bi and MDD_Ai of each model sample are denoted MDDS_Bi and MDDS_Ai, and the roll rate from MDDS_Bi to MDDS_Ai can then be calculated over the S model samples, that is, the probability that a model sample moves from each of the five segments in the first 6 months to each of the five segments in the last 6 months, as shown for example in Table 4:

TABLE 4

As can be seen from Table 4, the roll rate from each segment of the first 6 months into the more-than-90-days segment of the last 6 months rises sharply at the 31-60 days segment, reaching 55%. Combining business experience with the principle of choosing as short a T value as possible, T may for example be chosen as 30, which means that once a model Model_i has decayed for more than 30 days (denoted MDD₃₀), there is a high probability that it will decay for more than 90 days and can no longer be used.
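A roll rate table of the kind described above can be computed with a short sketch. The segment boundaries follow the five segments named in the text; the decay-day counts are made-up illustrative data, and the function names are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, Sequence

SEGMENTS = ["0", "1-30", "31-60", "61-90", ">90"]


def segment(days: int) -> str:
    """Map a number of decay days to its segment."""
    if days < 0:
        raise ValueError("decay days must be non-negative")
    if days == 0:
        return "0"
    if days <= 30:
        return "1-30"
    if days <= 60:
        return "31-60"
    if days <= 90:
        return "61-90"
    return ">90"


def roll_rates(
    before_days: Sequence[int], after_days: Sequence[int]
) -> Dict[str, Dict[str, float]]:
    """Share of models moving from each 'before' segment to each 'after' segment."""
    counts = defaultdict(Counter)
    for b, a in zip(before_days, after_days):
        counts[segment(b)][segment(a)] += 1
    table = {}
    for src in SEGMENTS:
        total = sum(counts[src].values())
        table[src] = {
            dst: (counts[src][dst] / total if total else 0.0) for dst in SEGMENTS
        }
    return table


# Hypothetical decay days of 6 model samples before and after the observation point.
table = roll_rates([0, 10, 45, 45, 70, 100], [0, 5, 95, 20, 120, 130])
```

Reading down the `">90"` column of such a table shows from which "before" segment the probability of ending up unusable starts to jump, which is how a value such as T = 30 would be picked.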

In the embodiment of the invention, the target days are values determined by performing rolling rate analysis on the decay days of the S model samples, so that the determined target decay evaluation variable can objectively and accurately reflect the performance decay condition of the evaluated model.

Optionally, the target time period is a decay performance period determined by performing account age analysis on the S model samples based on a target model decay index, and the target model decay index is a number of model decay days exceeding the target number of days.

In this embodiment, the target model attenuation index may be used to determine whether a model sample has performance attenuation, for example, the target model attenuation index may be whether the number of model attenuation days exceeds 30 days, if the number of model attenuation days of a certain model sample exceeds 30 days, it is determined that the model sample has performance attenuation, and if the number of model attenuation days of a certain model sample does not exceed 30 days, it is determined that the model sample does not have performance attenuation. Specifically, the target model attenuation index may be an index determined based on the roll rate analysis.

The determination of the decay performance period by performing account age analysis (Vintage Analysis) on the S model samples based on the target model decay index is described below with reference to examples:

The online model sample set [Model₁, Model₂, …, Model_S] is observed. The S model samples are divided into x different groups according to the month in which they went online, MOB denotes the number of months since a model sample went online, and MDD₃₀, determined by the roll rate analysis, is used to judge whether a model has decayed. The proportions of decayed models at different MOBs for the model groups MG_i that went online in different months are shown in FIG. 2.

As can be seen from FIG. 2, after MOB₄ is reached, the proportion of decayed models becomes relatively stable. MOB₄ is therefore considered sufficient to substantially expose the decayed models, and the target time period can be determined as MOB₄.

If the model decay is not found to reach a relatively stable expression period by the account age analysis, the target decay evaluation variable may be defined empirically, for example, decay may reach 10 days or more within 30 days after the target time point.
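The account age (vintage) analysis described above can be sketched as follows. This is an illustrative sketch only, assuming each model sample is reduced to its online month (cohort) and the first MOB at which it met the decay index; the names `decay_ratio_by_mob` and `stable_mob`, the tolerance `tol`, and the toy cohort are hypothetical.

```python
from collections import defaultdict

def decay_ratio_by_mob(samples, max_mob):
    """samples: list of (cohort, first_decay_mob), where first_decay_mob is
    the first month-on-book at which the model met the decay index
    (e.g. MDD30), or None if it never did.
    Returns {cohort: [cumulative decayed share at MOB 1..max_mob]}."""
    groups = defaultdict(list)
    for cohort, first_decay_mob in samples:
        groups[cohort].append(first_decay_mob)
    curves = {}
    for cohort, mobs in groups.items():
        curves[cohort] = [
            sum(1 for m in mobs if m is not None and m <= mob) / len(mobs)
            for mob in range(1, max_mob + 1)
        ]
    return curves

def stable_mob(curve, tol=0.01):
    """Return the first MOB after which every later month-to-month change
    is at most `tol`, or None if the curve is still moving at the end."""
    for i in range(len(curve) - 1):
        later = range(i + 1, len(curve))
        if all(abs(curve[j] - curve[j - 1]) <= tol for j in later):
            return i + 1
    return None

# Toy cohort of 10 models that went online in the same month.
samples = [("2020-01", m) for m in [1, 2, 3, 4] + [None] * 6]
curves = decay_ratio_by_mob(samples, max_mob=6)
period = stable_mob(curves["2020-01"])
```

When `stable_mob` returns `None`, this corresponds to the case where no stable performance period is found and the variable has to be defined empirically.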

The embodiment of the invention also provides a model training method, and the performance attenuation evaluation model provided by any one of the above model performance attenuation evaluation method embodiments can be obtained by training based on the model training method provided by the embodiment of the invention.

Referring to fig. 3, fig. 3 is a flowchart of a model training method according to an embodiment of the present invention, and as shown in fig. 3, the method includes the following steps:

Step 301, respectively obtaining initial evaluation features of each model sample in S model samples, where the initial evaluation features of a model sample include at least one of P groups of service related features of the model sample and model performance index values of the model sample, the P groups of service related features of the model sample correspond one to one to P evaluation samples of the model sample, and each group of service related features of the model sample includes at least one of: the input variable values of the evaluation sample that is input into the model sample, the log information obtained by inputting the evaluation sample into the model sample, and the sample score obtained by inputting the evaluation sample into the model sample; S and P are both integers greater than 1.

In this embodiment, the model sample may be any classification model. For convenience of description, any one of the S model samples is explained as an example below.

An evaluation sample of the model sample may be any data that is input into the model sample to be scored by it and that includes the input variable values corresponding to the respective input variables of the model sample.

The P evaluation samples of the model sample may include evaluation samples input into the model sample over a period of time, for example, evaluation samples input into the model sample during the last 6 months or the last 9 months. Inputting each of the P evaluation samples into the model sample yields the group of service related features corresponding to that evaluation sample.

For example, if the model sample includes input variables e, f, g, and h, and the evaluation sample has values E, F, G and H corresponding to the input variables e, f, g, and h, respectively, then the input variable values of the evaluation sample that is input into the model sample include E, F, G and H.

The log information obtained by inputting the evaluation sample into the model sample may refer to log information returned by processing the evaluation sample by the model sample, and may include, but is not limited to, at least one of a log code, an input parameter, an output parameter, and the like, where the log code may be used to indicate whether there is an abnormality in processing the evaluation sample by the model sample, for example, the log codes of 404, 601, 602, 701, and 702 types.

The sample score obtained by inputting the evaluation sample into the model sample may be a classification probability output by the model sample when it processes the evaluation sample. For example, if the model sample is a risk control model, inputting certain account data into the risk control model yields a probability indicating whether the account is a bad account.

The model performance index value may include, but is not limited to, at least one of a KS value, an AUC value, a PSI value, and the like. Optionally, the model performance index value may include a plurality of model performance index sub-values corresponding to statistical times, where each statistical time is based on the same statistical period, the statistical period may be reasonably set according to an actual situation, for example, 1 day, half a month, or 1 month, and the model performance index sub-value may also include, but is not limited to, at least one of a KS value, an AUC value, a PSI value, and the like.

Step 302, determining the target evaluation features of each model sample according to the initial evaluation features of each model sample.

In one embodiment, the initial evaluation feature of the model sample may be directly used as the target evaluation feature of the model sample in step 302. For example, if the initial evaluation feature of the model sample includes a model performance index value of the model sample, the model performance index value of the model sample may be directly input to the performance degradation evaluation model as the target evaluation feature of the model sample.

In another embodiment, the initial evaluation features of the model sample may be aggregated to obtain the target evaluation features of the model sample. For example, if the initial evaluation features of the model samples include P groups of service-related features of the model samples, the P groups of service-related features of the model samples may be aggregated according to a time dimension, and the target evaluation features of the model samples are determined based on the aggregated evaluation features; alternatively, if the initial evaluation feature of the model sample includes P groups of service related features of the model sample and a model performance index value of the model sample, where the model performance index value of the model sample includes model performance index sub-values corresponding to a plurality of statistical times, the P groups of service related features may be aggregated according to a time dimension, the model performance index sub-values corresponding to the plurality of statistical times of the model sample may be subjected to arithmetic processing such as averaging, maximum value or minimum value, and the target evaluation feature of the model sample may be determined based on the aggregated first evaluation feature and the second evaluation feature after the arithmetic processing.
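The aggregation described above can be sketched as follows. This is a minimal illustration, assuming each per-sample feature is a (statistical time, value) pair and that the mean is used as the aggregation within one statistical time; the function names and toy values are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def aggregate_by_time(records):
    """records: list of (stat_time, value), one per evaluation sample.
    Returns {stat_time: mean value within that statistical time}."""
    by_time = defaultdict(list)
    for stat_time, value in records:
        by_time[stat_time].append(value)
    return {t: mean(values) for t, values in sorted(by_time.items())}

def summarize_metric(sub_values):
    """Reduce the performance index sub-values of several statistical
    times (e.g. monthly KS values) to their mean, maximum and minimum."""
    return {"mean": mean(sub_values),
            "max": max(sub_values),
            "min": min(sub_values)}

# Toy per-sample scores on two days, and three monthly KS sub-values.
records = [("2021-01-01", 0.25), ("2021-01-01", 0.75), ("2021-01-02", 0.5)]
daily = aggregate_by_time(records)
ks_summary = summarize_metric([0.5, 0.25, 0.75])
```

The aggregated features and the summarized index values would then together form the target evaluation features of one model sample.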

It is understood that, in this embodiment, specific implementations of determining the target evaluation features of the model samples may refer to the specific implementations of determining the target evaluation features of the target model described above.

Step 303, training a performance decay evaluation model according to the target evaluation characteristic of each model sample and the labeling parameter of each model sample, wherein the labeling parameter is used for indicating the true value of the model sample corresponding to the target decay evaluation variable.

In this embodiment, the target attenuation evaluation variable is used to measure or define whether the evaluated model has performance attenuation, for example, if the value of the target attenuation evaluation variable is greater than a threshold, it indicates that the evaluated model has performance attenuation, and if the value of the target attenuation evaluation variable is less than or equal to the threshold, it indicates that the evaluated model does not have performance attenuation, and the threshold may be reasonably set according to actual requirements, for example, the threshold may be 0.5. The target attenuation evaluation variable can be reasonably set according to actual requirements, for example, the probability that the number of model attenuation days exceeds the number of target days in a target time period, and the like.

The above-mentioned label parameter is used to indicate the true value of the model sample corresponding to the target decay evaluation variable, for example, the target decay evaluation variable is that the number of model decay days exceeds 30 days, if the number of model decay days of a model sample exceeds 30 days within 4 months after the target time point, the true value of the model sample corresponding to the target decay evaluation variable is 1, if the number of model decay days of a model sample does not exceed 30 days within 4 months after the target time point, the true value of the model sample corresponding to the target decay evaluation variable is 0.
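The labeling rule and the threshold decision described above can be sketched as follows. This is a sketch under stated assumptions: decay days are counted here as the number of days flagged as decayed within the performance period, which may differ from how decay days are defined elsewhere in the method, and the names `label_model_sample` and `is_decayed` are hypothetical.

```python
def label_model_sample(daily_decayed, target_days=30):
    """daily_decayed: one boolean per day of the performance period
    (e.g. the 4 months after the target time point), True when the model
    is flagged as decayed on that day.
    Returns 1 if the number of decay days exceeds target_days, else 0."""
    return 1 if sum(daily_decayed) > target_days else 0

def is_decayed(probability, threshold=0.5):
    """Turn the predicted value of the target decay evaluation variable
    into a decision; values above the threshold mean performance decay."""
    return probability > threshold

# Toy 120-day performance periods.
label_pos = label_model_sample([True] * 31 + [False] * 89)
label_neg = label_model_sample([True] * 30 + [False] * 90)
```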

The performance degradation evaluation model may be a logistic regression model, an ensemble tree model, or a decision tree model, or a fusion model including a plurality of models, or the like. The ensemble tree model may include, but is not limited to, LightGBM, CatBoost, XGBoost, or the like. Specifically, when the performance degradation evaluation model is a logistic regression model, its better interpretability makes it convenient to locate the cause of model performance decay; when the performance degradation evaluation model is an ensemble tree model or a fusion model, the accuracy of the model performance decay evaluation result is higher than with a logistic regression model.

Training the performance degradation evaluation model according to the target evaluation features of each model sample and the labeling parameters of each model sample may mean iteratively training the performance degradation evaluation model on these features and labeling parameters until the loss value of the loss function is minimized, or until the loss value is less than or equal to a preset value, and so on. The loss function may be reasonably set according to actual requirements, for example, for a logistic regression model and a decision tree model, the loss function may be a maximum likelihood function, and for an ensemble tree model, the loss function may be a square loss function.

The following describes training of the performance degradation evaluation model, taking as an example a performance degradation evaluation model that is a logistic regression model and target evaluation features that comprise three evaluation features: an R index value, an F index value, and an M index value. The WOE values of the evaluation features of each model sample shown in Table 3 and the value of the target decay evaluation variable (namely the labeling parameter) are input into formula III to obtain the estimated probability π_i for the target decay evaluation variable MDD₃₀MOB₄. According to the maximum likelihood principle, the likelihood function should reach its maximum when the estimated probability π_i equals the corresponding actual result (namely the true value). Since the value of the target decay evaluation variable is 0 or 1, the mathematical expression of the likelihood function shown in formula V can be obtained, and the weights of the logistic regression model (namely β₀ to β₃) can be determined by a gradient ascent method, which gives the final model decay evaluation model, namely formula II. Based on the model decay evaluation model represented by formula II, the decay of a newly launched model can subsequently be evaluated: the probability that the model decay exceeds 30 days in the 4-month performance period after the target time point is calculated, so that the decay state of the model is evaluated accurately and the decayed model can be iteratively optimized in time. Further, based on the model decay evaluation model represented by formula II, the weight of evidence value and the feature score of the R index value, the F index value, and the M index value on each bin can be obtained, for example, a feature score table as shown in Table 1.
Through the feature score table shown in Table 1, each feature score of the evaluated model can be determined. A feature value can in turn be determined from the feature score, and tracing further back makes it possible to determine the specific causes of model decay, such as variable inconsistency, model package errors, interface configuration errors and the like.
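The gradient ascent fit described above can be sketched as follows. This is a minimal sketch of standard logistic regression fitted by gradient ascent on the log-likelihood, not a reproduction of the formulas referenced above; the WOE values, labels, learning rate and step count are invented for illustration, and the function names are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(beta, x):
    """Estimated probability for one model sample; beta[0] is the intercept."""
    return sigmoid(beta[0] + sum(b * v for b, v in zip(beta[1:], x)))

def log_likelihood(beta, X, y):
    """Bernoulli log-likelihood of the 0/1 labels under the model."""
    return sum(yi * math.log(predict(beta, xi))
               + (1 - yi) * math.log(1 - predict(beta, xi))
               for xi, yi in zip(X, y))

def fit_gradient_ascent(X, y, lr=0.1, steps=500):
    """Fit beta_0..beta_k by repeatedly stepping along the averaged
    gradient of the log-likelihood."""
    beta = [0.0] * (len(X[0]) + 1)
    for _ in range(steps):
        grad = [0.0] * len(beta)
        for xi, yi in zip(X, y):
            err = yi - predict(beta, xi)
            grad[0] += err
            for k, v in enumerate(xi, start=1):
                grad[k] += err * v
        beta = [b + lr * g / len(X) for b, g in zip(beta, grad)]
    return beta

# Hypothetical WOE values of the R, F and M features and 0/1 labels.
X = [[0.8, 0.5, 0.6], [0.6, 0.7, 0.4], [-0.5, -0.4, -0.7],
     [-0.7, -0.6, -0.3], [0.2, -0.1, 0.1], [-0.1, 0.2, -0.2]]
y = [1, 1, 0, 0, 0, 1]
beta = fit_gradient_ascent(X, y)
```

The fitted `beta` plays the role of β₀ to β₃; `predict(beta, x)` then gives the estimated decay probability for a new model's feature vector.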

The embodiment of the invention respectively obtains the initial evaluation features of each model sample in S model samples, where the initial evaluation features of a model sample comprise at least one of P groups of service related features of the model sample and model performance index values of the model sample, the P groups of service related features of the model sample correspond one to one to P evaluation samples of the model sample, and each group of service related features of the model sample comprises at least one of the following: the input variable values of the evaluation sample that is input into the model sample, the log information obtained by inputting the evaluation sample into the model sample, and the sample score obtained by inputting the evaluation sample into the model sample; S and P are both integers greater than 1. The target evaluation features of each model sample are determined according to the initial evaluation features of each model sample, and a performance decay evaluation model is trained according to the target evaluation features of each model sample and the labeling parameters of each model sample, where the labeling parameters are used for indicating the true values of the model samples corresponding to the target decay evaluation variable. The trained performance decay evaluation model can evaluate the performance decay condition of a model more accurately and efficiently.

Optionally, the initial evaluation features of the model samples include P groups of service-related features of the model samples, the P evaluation samples include evaluation samples input into the model samples within Q statistical times with the same statistical period as a statistical reference, and Q is a positive integer;

the determining the target evaluation characteristics of each model sample according to the initial evaluation characteristics of each model sample respectively comprises the following steps:

performing aggregation calculation on the service related features corresponding to the same statistical time in the P groups of service related features of each model sample respectively to obtain the aggregation evaluation feature corresponding to each statistical time of each model sample;

and determining the target evaluation characteristics of each model sample according to the aggregation evaluation characteristics corresponding to the Q statistical times of each model sample.

In this embodiment, the statistical period may be set reasonably according to actual conditions, for example, 1 hour, 1 day, half month, or 1 month. The Q statistical times may be Q consecutive statistical times based on the same statistical period, for example, if the statistical period is 1 day, the Q statistical times may be Q consecutive days; alternatively, the Q statistical times may be Q discontinuous statistical times based on the same statistical period, for example, if the statistical period is 1 day, the Q statistical times may be Q discontinuous days.

The value of Q may be the same as or different from the value of M. It can be understood that, for the implementation of this embodiment, reference may be made to the related description in the above performance degradation evaluation method embodiment, and details are not repeated here.

Optionally, each group of service related features of the model sample includes each input variable value of the evaluation sample input to the model sample, and the aggregate evaluation feature corresponding to each statistical time of the model sample includes at least one of the following:

the first variable index value of the model sample is the similarity between the input variable value matrix of the model sample and the reference variable value matrix of the model sample, the input variable value matrix of the model sample is a matrix formed by a first variable value set of the model sample, the reference variable value matrix of the model sample is a matrix formed by a reference variable value set corresponding to the first variable value set of the model sample, and the first variable value set of the model sample comprises various input variable values of the model sample in the same statistical time;

the second variable index value of the model sample is a value determined according to the number of first variable values corresponding to each input variable of the model sample, the first variable values corresponding to each input variable of the model sample are respectively variable values different from corresponding reference variable values in the second variable value set corresponding to each input variable of the model sample, and the second variable value sets corresponding to each input variable of the model sample respectively comprise input variable values corresponding to each input variable of the model sample within the same statistical time;

the third variable index value of the model sample is a value determined according to the number of second variable values corresponding to each input variable of the model sample, the second variable values corresponding to each input variable of the model sample are the input variable values of known abnormalities in the third variable value sets corresponding to each input variable of the model sample, and the third variable value sets corresponding to each input variable of the model sample respectively comprise the input variable values corresponding to each input variable of the model sample within the same statistical time;

the fourth variable index value of the model sample is a value determined according to the number of statistical abnormal values corresponding to each input variable of the model sample, the statistical abnormal values corresponding to each input variable of the model sample are input variable values determined as abnormal by a preset abnormal value test method in third variable values corresponding to each input variable of the model sample, the third variable values corresponding to each input variable of the model sample are input variable values except the input variable value of the known abnormal value in the fourth variable value set corresponding to each input variable of the model sample, and the fourth variable value set corresponding to each input variable of the model sample respectively comprises the input variable values corresponding to each input variable of the model sample in the same statistical time;

the PSI values corresponding to the input variables of the model samples are calculated according to fifth variable value sets and sixth variable value sets corresponding to the input variables of the model samples, the fifth variable value sets corresponding to the input variables of the model samples respectively comprise input variable values corresponding to the input variables of the model samples in the same statistical time, and the sixth variable value sets corresponding to the input variables of the model samples are variable values corresponding to the input variables of the model samples in the reference statistical time.

It can be understood that, for the implementation of this embodiment, reference may be made to the related description in the above performance degradation evaluation method embodiment, and details are not repeated here.
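The PSI value of one input variable, as listed above, can be sketched as follows. This uses the commonly used population stability index formula, the sum over bins of (actual share - reference share) × ln(actual share / reference share); the bin edges, the `eps` floor for empty bins, and the toy values are assumptions made for illustration.

```python
import math

def psi(reference, current, edges, eps=1e-6):
    """Population stability index of one input variable.
    reference: variable values in the reference statistical time
    current:   variable values in the statistical time being evaluated
    edges:     ascending bin boundaries (hypothetical choice)"""
    def shares(values):
        counts = [0] * (len(edges) + 1)
        for v in values:
            counts[sum(1 for e in edges if v >= e)] += 1
        # Floor empty bins at eps so the logarithm stays defined.
        return [max(c / len(values), eps) for c in counts]

    ref, cur = shares(reference), shares(current)
    return sum((c - r) * math.log(c / r) for c, r in zip(cur, ref))

reference = [0.1, 0.2, 0.4, 0.6, 0.8, 0.9]
shifted = [0.6, 0.7, 0.8, 0.8, 0.9, 0.9]
psi_same = psi(reference, reference, edges=[0.33, 0.66])
psi_shift = psi(reference, shifted, edges=[0.33, 0.66])
```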

Optionally, each group of service related features of the model sample includes log information obtained by inputting the evaluation sample into the model sample, and the aggregate evaluation feature corresponding to each statistical time of the model sample includes at least one of the following:

the log code vector set of the model sample comprises vectors obtained by performing one-hot coding on each log code in a first log information set of the model sample, and the first log information set of the model sample comprises log information in the same statistical time in the P pieces of log information;

the format detection value set of the model sample comprises format detection values corresponding to all log information in a second log information set of the model sample, the second log information set of the model sample comprises log information in the same statistical time in the P pieces of log information, and the format detection values are used for indicating whether the input and output parameter format in the corresponding log information is consistent with the reference format.
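The two log-dimension features above can be sketched as follows. This is an illustration only: the log record layout (a dict with `code`, `input` and `output` keys), the code vocabulary, and the 1/0 encoding of the format detection value are all assumptions.

```python
# Log codes mentioned as examples in the description; treating them as
# the complete vocabulary is an assumption.
LOG_CODES = ["404", "601", "602", "701", "702"]

def one_hot(code, vocabulary=LOG_CODES):
    """One-hot encode a log code; unknown codes give an all-zero vector."""
    return [1 if code == c else 0 for c in vocabulary]

def format_detection_value(log, reference):
    """Return 1 if the input and output parameter names in the log match
    the reference format, else 0 (hypothetical encoding)."""
    same_in = set(log.get("input", {})) == set(reference["input"])
    same_out = set(log.get("output", {})) == set(reference["output"])
    return 1 if same_in and same_out else 0

# Toy logs from one statistical time (hypothetical record layout).
logs = [
    {"code": "601", "input": {"e": 1, "f": 2}, "output": {"score": 0.3}},
    {"code": "404", "input": {"e": 1}, "output": {}},
]
reference = {"input": ["e", "f"], "output": ["score"]}

code_vectors = [one_hot(log["code"]) for log in logs]
format_values = [format_detection_value(log, reference) for log in logs]
```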

Optionally, each group of service related features of the model sample includes a sample score obtained by inputting the evaluation sample into the model sample, and the aggregate evaluation feature corresponding to each statistical time of the model sample includes at least one of the following:

the score abnormality index value of the model sample is a value determined according to the number of abnormal sample scores of the model sample, an abnormal sample score of the model sample is a sample score in the first sample score set of the model sample that differs from the corresponding reference sample score, and the first sample score set of the model sample comprises the sample scores, among the P sample scores, that belong to the same statistical time;

the score abnormality detection value set of the model sample comprises score abnormality detection values corresponding to all sample scores in a second sample score set of the model sample, the second sample score set of the model sample comprises the sample scores, among the P sample scores, that belong to the same statistical time, and the score abnormality detection value is used for indicating whether the corresponding sample score is located in a preset interval.
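The two score-dimension features above can be sketched as follows. This is a sketch only, assuming the score abnormality index value is simply the count of scores that differ from their reference scores, and that a detection value of 1 marks a score outside the preset interval; the tolerance and interval bounds are hypothetical.

```python
def score_anomaly_index(scores, reference_scores, tol=1e-9):
    """Count the sample scores that differ from the corresponding
    reference sample scores by more than `tol`."""
    return sum(1 for s, r in zip(scores, reference_scores)
               if abs(s - r) > tol)

def score_anomaly_detection(scores, low=0.0, high=1.0):
    """Return one detection value per score: 0 if the score lies inside
    the preset interval [low, high], 1 otherwise (assumed encoding)."""
    return [0 if low <= s <= high else 1 for s in scores]

# Toy scores of one statistical time and their reference scores.
scores = [0.2, 0.5, 1.3]
reference_scores = [0.2, 0.4, 1.3]
anomaly_index = score_anomaly_index(scores, reference_scores)
detection_values = score_anomaly_detection(scores)
```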

Optionally, Q is an integer greater than 1, the target evaluation features of the model samples at least include RFM features of the model samples, and the RFM features of the model samples are features obtained by performing RFM analysis on the aggregation evaluation features corresponding to the Q statistical times of the model samples.

Optionally, the aggregation evaluation feature corresponding to each statistical time of the model sample includes a second variable index value of the model sample, a format detection value set of the model sample, and a score anomaly index value of the model sample;

the second variable index value of the model sample is a value determined according to the number of first variable values corresponding to each input variable of the model sample, the first variable value corresponding to the input variable of the model sample is a variable value different from a corresponding reference variable value in a second variable value set corresponding to the input variable of the model sample, and the second variable value set of the model sample comprises the input variable value corresponding to the input variable of the model sample in the same statistical time;

the format detection value set of the model sample comprises format detection values corresponding to all log information in a second log information set of the model sample, the second log information set of the model sample comprises log information belonging to the same statistical time in P pieces of log information, and the format detection values are used for indicating whether the input and output parameter format in the corresponding log information is consistent with the reference format or not;

the score abnormality index value of the model sample is a value determined according to the number of abnormal sample scores of the model sample, the abnormal sample score of the model sample is a sample score which is different from a corresponding reference sample score in a first sample score set of the model sample, and the first sample score set of the model sample comprises sample scores which are positioned in the same statistical time in P sample scores;

the RFM characteristics of the model samples comprise R index values of the model samples, F index values of the model samples and M index values of the model samples;

the R index value of the model sample is a time difference value between a first statistical time of the model sample and a current statistical time, the first statistical time of the model sample is the statistical time with the minimum time interval from the current statistical time in a first statistical time set of the model sample, and the first statistical time set of the model sample comprises the statistical times, among M statistical times, at which the corresponding second variable index value exceeds a preset value;

the F index value of the model sample is the number of first format detection values in the format detection value sets corresponding to all the statistical times in a second statistical time set of the model sample, the first format detection values indicate that the input and output parameter format in the corresponding log information is inconsistent with the reference format, the second statistical time set of the model sample comprises any K adjacent statistical times in M statistical times, and K is a positive integer less than or equal to M;

the M index value of the model sample is the sum of fraction abnormal index values of the model sample corresponding to all statistical times in a third statistical time set of the model sample, the third statistical time set of the model sample comprises any L adjacent statistical times in the M statistical times, and L is a positive integer less than or equal to M.
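The R, F and M index values defined above can be sketched as follows. This is an illustrative sketch, assuming statistical times are integer day indexes, that a format detection value of 0 marks an inconsistent format, and that the adjacent windows are passed in explicitly; all names and toy values are hypothetical.

```python
def r_index(second_var_values, preset, current):
    """Time difference between the current statistical time and the most
    recent statistical time whose second variable index value exceeds
    `preset`; None if no statistical time exceeds it."""
    hits = [t for t, v in second_var_values.items() if v > preset]
    return current - max(hits) if hits else None

def f_index(format_sets, window):
    """Number of 'inconsistent' format detection values (encoded as 0)
    over the adjacent statistical times in `window`."""
    return sum(values.count(0)
               for t, values in format_sets.items() if t in window)

def m_index(score_anomaly_values, window):
    """Sum of the score abnormality index values over the adjacent
    statistical times in `window`."""
    return sum(v for t, v in score_anomaly_values.items() if t in window)

# Toy daily aggregates for days 1..5 (day 5 is the current statistical time).
second_var = {1: 0, 2: 3, 3: 0, 4: 2, 5: 0}
format_sets = {3: [1, 0], 4: [0, 0], 5: [1, 1]}
score_anomaly = {3: 1, 4: 2, 5: 4}

R = r_index(second_var, preset=1, current=5)
F = f_index(format_sets, window={4, 5})
M = m_index(score_anomaly, window={4, 5})
```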

Optionally, the target attenuation evaluation variable is: probability that the number of model decay days exceeds the target number of days within the target time period.

Optionally, the target time period is an attenuation performance period determined by performing account age analysis on the S model samples based on a model sample attenuation index, and the model sample attenuation index is a model attenuation number of days exceeding the target number of days.

An embodiment of the present invention is described below with reference to fig. 4:

As shown in FIG. 4, the basic data module collects the basic monitoring data of the online models (i.e., model samples Model₁ to Model_S). For each model sample, a single sample (namely an evaluation sample) is taken as the minimum dimension, and data statistics are performed in real time. For example, a variable interface configuration may be made to an existing system (e.g., Blaze, Acmp, Luma, etc. systems) to implement model calls. As another example, the basic monitoring data of the online models (i.e., model samples) may be collected by a data development platform (e.g., a DSP). The basic monitoring data is the initial evaluation feature described above.

And the data aggregation module is used for carrying out aggregation calculation on the basic monitoring data. The data aggregation module may include a variable dimension aggregation module, a log dimension aggregation module, and a score dimension aggregation module.

The variable dimension aggregation module is used for performing corresponding aggregation calculation on the variable data in the basic monitoring data by using the model dimension, so as to obtain the variable related index features under the model dimension, namely the first variable index value, the second variable index value, the third variable index value, the fourth variable index value and the fifth variable index value. Note that the current calculation is performed in the daily dimension.

The log dimension aggregation module is configured to perform corresponding aggregation calculation on log information in the basic monitoring data according to the model dimension, so as to obtain log related index features in the model dimension, that is, the log code vector set and the format detection value set. Note that the current calculation is performed in the daily dimension.

The score dimension aggregation module is used for performing corresponding aggregation calculation on the sample scores in the basic monitoring data according to the model dimension, so as to obtain the sample score related index features under the model dimension, namely the score abnormality index value and the score abnormality detection value set. Note that the current calculation is performed in the daily dimension.

The time dimension crossing module is used for further crossing the variable, log and score dimension features, derived from the basic monitoring data at the daily dimension, with the time dimension, so that more stable and effective features are obtained through aggregation. The time dimension crossing module may derive features from the angles of R (recency), F (frequency), and M (monetary) to obtain the R feature, the F feature, and the M feature.

The model attenuation training module mainly comprises a target variable analysis module and an attenuation evaluation training model, wherein the target variable analysis module is used for analyzing and defining a model attenuation target variable, namely the target attenuation evaluation variable. The attenuation evaluation training model is used for performing corresponding model training by adopting the characteristics of the basic monitoring data after aggregation calculation and combining with model attenuation target variables to finally obtain an attenuation evaluation model, namely the performance attenuation evaluation model.

Target attenuation evaluation variable: the change of the model attenuation state is analyzed through Roll Rate Analysis, with MDD_i defined as the model having decayed for i days. It can be observed that once the model passes MDD₃₀, the model effect declines continuously, and continued use of the model can only be maintained through iterative optimization of the model. Vintage Analysis is used to determine the period after which the decay of the model stabilizes, i.e. the decay performance period. For example, with one MOB defined as one month, the proportion of decayed models reaches a stable value after four months, so the decay performance period of the model set is MOB₄. Combining the above analyses, the target variable of the model can be defined as MDD₃₀MOB₄, i.e. the probability that the model decays for more than 30 days during a performance period of 4 months after the target time point.

Model attenuation evaluation training: with the model attenuation target variable determined through the above analysis, a logistic regression model is trained on the time-dimension cross aggregation variables, combined with the data of the variable, log and score dimension modules. This finally yields a scorecard model, built at the model dimension, that can evaluate the attenuation state of a model, namely the attenuation evaluation model. After its conventional model performance evaluation reaches the standard, the attenuation evaluation model is deployed to the model attenuation evaluation module for calling and decision making.

In summary, compared with existing model performance attenuation evaluation approaches, the model performance attenuation evaluation method provided by the embodiment of the invention has the following advantages. First, the observation dimensions are more precise: in addition to the model performance indexes of the model dimension, related evaluation features of the variable dimension and the log dimension are abstracted from routine business, which widens the detection range of model risk. Second, causes can be located in time: because observation is not limited to model performance indexes, when a model decays, causes of abnormal model scores that are unrelated to a decline in the model's own performance can be located quickly by analyzing the variable dimension, the log dimension and other related data, which avoids ineffective model iterations. Third, the attenuation evaluation is more effective: the model attenuation target variable is analyzed and established through data collection, aggregation and feature processing across the three dimension modules, and the attenuation evaluation is finally performed by a logistic regression model with strong interpretability. Compared with traditional manual monitoring, this evaluation is more convincing, can effectively determine the attenuation probability of a model and its feature scores in the variable, log and score dimensions, and makes it easy to trace back to the original values to locate and resolve problems.

Referring to fig. 5, fig. 5 is a structural diagram of a model performance degradation evaluation apparatus according to an embodiment of the present invention. As shown in fig. 5, the model performance degradation evaluation apparatus 500 includes:

a first obtaining module 501, configured to obtain initial evaluation features of a target model; the initial evaluation features of the target model comprise at least one of N groups of service-related features of the target model and model performance index values of the target model, the N groups of service-related features of the target model correspond one-to-one to N evaluation samples of the target model, and each group of service-related features of the target model comprises at least one of the following: the input variable values of the evaluation sample input into the target model, log information obtained by inputting the evaluation sample into the target model, and a sample score obtained by inputting the evaluation sample into the target model; N is an integer greater than 1;

a first determining module 502, configured to determine a target evaluation feature of the target model according to an initial evaluation feature of the target model;

an input module 503, configured to input the target evaluation feature of the target model into a pre-trained performance degradation evaluation model, so as to obtain a performance degradation evaluation result of the target model.

the first determining module is specifically configured to:

a set of log code vectors of the target model, where the set of log code vectors of the target model includes a vector obtained by performing one-hot encoding on each log code in a first log information set of the target model, and the first log information set of the target model includes log information located in the same statistical time in N pieces of log information;

the format detection value set of the target model comprises format detection values corresponding to all log information in a second log information set of the target model, the second log information set of the target model comprises the log information, among the N pieces of log information, that falls within the same statistical time, and each format detection value is used for indicating whether the input/output parameter format in the corresponding log information is consistent with the reference format.
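The two log-dimension features described above can be sketched as follows. The sketch assumes that each piece of log information is a record holding a log code together with the input parameters and output parameters of one call, and that the reference format is a mapping from field name to expected type; the record layout, the field names and the handling of unknown log codes are hypothetical.

```python
def one_hot_log_codes(log_codes, vocabulary):
    """One-hot encode each log code against a fixed, ordered vocabulary."""
    index = {code: i for i, code in enumerate(vocabulary)}
    vectors = []
    for code in log_codes:
        vector = [0] * len(vocabulary)
        position = index.get(code)
        if position is not None:  # unknown codes stay all-zero (assumption)
            vector[position] = 1
        vectors.append(vector)
    return vectors


def matches_reference(payload, reference):
    """True if payload has exactly the reference fields with the expected types."""
    return (
        isinstance(payload, dict)
        and set(payload) == set(reference)
        and all(isinstance(payload[name], kind) for name, kind in reference.items())
    )


def format_detection_values(records, request_reference, response_reference):
    """1 when both input and output parameters match their reference format, else 0."""
    return [
        1
        if matches_reference(r.get("request"), request_reference)
        and matches_reference(r.get("response"), response_reference)
        else 0
        for r in records
    ]


# Hypothetical log records falling within one statistical time (for example, one day).
RECORDS = [
    {"code": "200", "request": {"age": 31, "income": 5200.0}, "response": {"score": 642.0}},
    {"code": "500", "request": {"age": 45}, "response": {"score": 598.0}},
    {"code": "200", "request": {"age": 28, "income": 4100.0}, "response": {"score": "n/a"}},
]
VOCABULARY = ["200", "400", "500"]
REQUEST_REFERENCE = {"age": int, "income": float}
RESPONSE_REFERENCE = {"score": float}

LOG_CODE_VECTORS = one_hot_log_codes([r["code"] for r in RECORDS], VOCABULARY)
FORMAT_VALUES = format_detection_values(RECORDS, REQUEST_REFERENCE, RESPONSE_REFERENCE)
```

Here the second record fails the check because an input field is missing, and the third fails because the output score has an unexpected type.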

Optionally, each group of service-related features of the target model includes a sample score obtained by inputting the evaluation sample into the target model, and the aggregate evaluation feature corresponding to each statistical time of the target model includes at least one of the following items:

The score abnormality index value of the target model is a value determined according to the number of abnormal sample scores of the target model, an abnormal sample score of the target model is a sample score in a first sample score set of the target model that differs from the corresponding reference sample score, and the first sample score set of the target model comprises the sample scores, among the N sample scores, that fall within the same statistical time;

the score abnormality detection value set of the target model comprises score abnormality detection values corresponding to all sample scores in a second sample score set of the target model, the second sample score set of the target model comprises the sample scores, among the N sample scores, that fall within the same statistical time, and each score abnormality detection value is used for indicating whether the corresponding sample score is within a preset interval.
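As an illustration of the two score-dimension features above, the following sketch computes both for the sample scores of one statistical time. It assumes that the score abnormality index value is the proportion of sample scores that differ from their reference sample scores, which is only one possible value "determined according to the number" of abnormal scores; the tolerance, the preset interval and the function names are likewise hypothetical.

```python
def score_abnormality_index(scores, reference_scores, tolerance=1e-9):
    """Share of sample scores that differ from their reference sample score."""
    abnormal = sum(
        1 for s, r in zip(scores, reference_scores) if abs(s - r) > tolerance
    )
    return abnormal / len(scores)


def score_detection_values(scores, low, high):
    """1 if the sample score lies within the preset interval [low, high], else 0."""
    return [1 if low <= s <= high else 0 for s in scores]


# Hypothetical online scores and offline reference scores for one statistical time.
ONLINE_SCORES = [600.0, 612.0, 580.0, 655.0]
REFERENCE_SCORES = [600.0, 610.0, 580.0, 655.0]

INDEX_VALUE = score_abnormality_index(ONLINE_SCORES, REFERENCE_SCORES)
DETECTION_VALUES = score_detection_values([600.0, 1200.0, -5.0, 300.0], 300.0, 850.0)
```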

the format detection value set of the target model comprises format detection values corresponding to all log information in a second log information set of the target model, the second log information set of the target model comprises the log information, among the N pieces of log information, that falls within the same statistical time, and each format detection value is used for indicating whether the input/output parameter format in the corresponding log information is consistent with the reference format;

Optionally, the input module is specifically configured to:

Optionally, the performance decay evaluation model is a logistic regression model; the device further comprises:

and the second determination module is used for determining the feature scores of all the evaluation features of the target evaluation features according to the logistic regression model and outputting the feature scores.
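One way to derive per-feature scores from a logistic regression model is the scaling commonly used for scorecards, sketched below. The sketch assumes that a feature's score is its contribution to the log-odds of attenuation multiplied by a scaling factor, so that a higher score means a higher attenuation risk; the points-to-double-the-odds value (`pdo`), the base score, the weights and the feature names are hypothetical and are not specified by the embodiment.

```python
import math


def feature_scores(weights, bias, features, pdo=20.0, base_score=600.0):
    """Split a logistic regression log-odds into additive per-feature points.

    total = base_score + factor * (bias + sum(weight * value)),
    where factor = pdo / ln(2), so every `pdo` points doubles the odds.
    """
    factor = pdo / math.log(2)
    points = {name: factor * weights[name] * features[name] for name in weights}
    base_points = base_score + factor * bias
    total = base_points + sum(points.values())
    return points, base_points, total


# Hypothetical fitted coefficients and one set of target evaluation features.
WEIGHTS = {
    "variable_missing_rate": 4.0,
    "log_format_error_rate": 2.0,
    "abnormal_score_rate": 3.0,
}
BIAS = -2.0
TARGET_FEATURES = {
    "variable_missing_rate": 0.25,
    "log_format_error_rate": 0.5,
    "abnormal_score_rate": 0.0,
}

POINTS, BASE_POINTS, TOTAL = feature_scores(WEIGHTS, BIAS, TARGET_FEATURES)
```

Because the points are additive, the feature with the largest score is the one contributing most to the predicted attenuation, which supports tracing a result back to the variable, log or score dimension.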

Optionally, the target attenuation evaluation variable of the performance attenuation evaluation model is: the probability that the number of model decay days exceeds the target number of days in the target time period;
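The labeled value for this target variable can be sketched as follows. The sketch assumes that a day counts as a model decay day when a daily performance index (here a hypothetical daily KS value) falls below a threshold, and that the label is 1 when the number of such days within the target time period exceeds the target number of days; the threshold, the dates and the index values are illustrative only.

```python
from datetime import date, timedelta


def count_decay_days(daily_metric, start, period_days, threshold):
    """Count days in [start, start + period_days) whose metric is below threshold."""
    end = start + timedelta(days=period_days)
    return sum(
        1
        for day, value in daily_metric.items()
        if start <= day < end and value < threshold
    )


def decay_label(daily_metric, start, period_days, threshold, target_days):
    """1 if the number of decay days in the period exceeds target_days, else 0."""
    decay_days = count_decay_days(daily_metric, start, period_days, threshold)
    return 1 if decay_days > target_days else 0


# Hypothetical daily KS values for one model sample over ten days.
START = date(2021, 1, 1)
KS_VALUES = [0.42, 0.41, 0.29, 0.28, 0.40, 0.27, 0.26, 0.39, 0.41, 0.43]
DAILY_KS = {START + timedelta(days=i): ks for i, ks in enumerate(KS_VALUES)}

LABEL = decay_label(DAILY_KS, START, period_days=10, threshold=0.30, target_days=3)
```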

Optionally, the performance decay evaluation model is a logistic regression model or an ensemble tree model.

The model performance degradation evaluation apparatus 500 provided in the embodiment of the present invention can implement each process in the above-described model performance degradation evaluation method embodiment, and is not described here again to avoid repetition.

Referring to fig. 6, fig. 6 is a structural diagram of a model training apparatus according to an embodiment of the present invention. As shown in fig. 6, the model training apparatus 600 includes:

a second obtaining module 601, configured to obtain initial evaluation features of each model sample in S model samples, where the initial evaluation features of a model sample comprise at least one of P groups of service-related features of the model sample and model performance index values of the model sample, the P groups of service-related features of the model sample correspond one-to-one to P evaluation samples of the model sample, and each group of service-related features of the model sample comprises at least one of the following: the input variable values of the evaluation sample input into the model sample, log information obtained by inputting the evaluation sample into the model sample, and a sample score obtained by inputting the evaluation sample into the model sample; S and P are both integers greater than 1;

a third determining module 602, configured to determine a target evaluation feature of each model sample according to the initial evaluation feature of each model sample;

a training module 603, configured to train a performance decay evaluation model according to the target evaluation characteristic of each model sample and the labeled parameter of each model sample, where the labeled parameter is used to indicate a true value of the model sample corresponding to the target decay evaluation variable.

Optionally, the initial evaluation features of the model samples include P groups of service-related features of the model samples, the P evaluation samples include evaluation samples input into the model samples within Q statistical times that use the same statistical period as the statistical reference, and Q is a positive integer;

the third determining module is specifically configured to:

The model training device 600 provided in the embodiment of the present invention can implement each process in the above-described model training method embodiment, and is not described here again to avoid repetition.

Referring to fig. 7, fig. 7 is a block diagram of a model performance degradation evaluation apparatus according to still another embodiment of the present invention, and as shown in fig. 7, a model performance degradation evaluation apparatus 700 includes: a processor 701, a memory 702 and a computer program stored on the memory 702 and executable on the processor, the various components of the model performance degradation evaluation apparatus 700 being coupled together by a bus interface 703, the computer program when executed by the processor 701 implementing the steps of:

It should be understood that, in the embodiment of the present invention, when being executed by the processor 701, the computer program can implement each process in the above-described embodiment of the model performance degradation evaluation method, and details are not described here to avoid repetition.

Referring to fig. 8, fig. 8 is a block diagram of a model training apparatus according to still another embodiment of the present invention, and as shown in fig. 8, a model training apparatus 800 includes: a processor 801, a memory 802 and a computer program stored on the memory 802 and executable on the processor, the various components in the model training apparatus 800 being coupled together by a bus interface 803, the computer program when executed by the processor 801 implementing the steps of:

It should be understood that, in the embodiment of the present invention, when being executed by the processor 801, the computer program can implement each process in the above-described embodiment of the model training method, and in order to avoid repetition, details are not described here.

An embodiment of the present invention further provides an electronic device, which includes a processor, a memory, and a computer program stored in the memory and capable of running on the processor, where the computer program, when executed by the processor, implements each process of the model performance degradation evaluation method embodiment or implements each process of the model training method embodiment, and can achieve the same technical effect, and is not described herein again to avoid repetition.

The embodiment of the present invention further provides a computer-readable storage medium on which a computer program is stored. When executed by a processor, the computer program implements each process of the above-mentioned model performance degradation evaluation method embodiment, or each process of the above-mentioned model training method embodiment, and can achieve the same technical effect; to avoid repetition, details are not described here again. The computer-readable storage medium may be a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk.

It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.

Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal (such as a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present invention.

While the present invention has been described with reference to the embodiments shown in the drawings, the present invention is not limited to the embodiments, which are illustrative and not restrictive, and it will be apparent to those skilled in the art that various changes and modifications can be made therein without departing from the spirit and scope of the invention as defined in the appended claims.

Claims

1. A model performance decay evaluation method is characterized by comprising the following steps:

2. The method according to claim 1, wherein the initial evaluation features of the target model comprise N groups of service-related features of the target model, the N evaluation samples comprise evaluation samples input into the target model within M statistical times that use the same statistical period as the statistical reference, and M is a positive integer;

3. The method according to claim 2, wherein each set of service-related features of the target model includes respective input variable values of the evaluation samples input into the target model, and each aggregate evaluation feature corresponding to each statistical time of the target model includes at least one of:

4. The method according to claim 2, wherein each group of service-related features of the target model includes log information obtained by inputting the evaluation sample into the target model, and the aggregate evaluation feature corresponding to each statistical time of the target model includes at least one of:

5. The method according to claim 2, wherein each group of service-related features of the target model includes a sample score obtained by inputting the evaluation sample into the target model, and each aggregate evaluation feature corresponding to each statistical time of the target model includes at least one of:

6. The method of claim 2, wherein M is an integer greater than 1, wherein the target evaluation features of the target model at least include RFM features of the target model, and wherein the RFM features of the target model are features obtained by performing RFM analysis on M aggregate evaluation features corresponding to statistical time of the target model.

7. The method of claim 6, wherein the aggregate evaluation feature for each statistical time of the target model comprises a second variable index value of the target model, a set of format detection values of the target model, and a score abnormality index value of the target model;

the score abnormality index value of the target model is a value determined according to the number of abnormal sample scores of the target model, an abnormal sample score of the target model is a sample score in a first sample score set of the target model that differs from the corresponding reference sample score, and the first sample score set of the target model comprises the sample scores, among the N sample scores, that belong to the same statistical time;

8. The method according to any one of claims 1 to 7, wherein the step of inputting the target evaluation features of the target model into a pre-trained performance degradation evaluation model to obtain a performance degradation evaluation result of the target model comprises the following steps:

9. The method of claim 1, wherein the target decay evaluation variables of the performance decay evaluation model are: the probability that the number of model decay days exceeds the target number of days in the target time period;

the performance attenuation evaluation result is a first value used for representing whether performance attenuation exists in the target model, and the first value is the probability that the number of model attenuation days of the target model within the target time period exceeds the target number of days.

10. The method of claim 9, wherein the performance degradation evaluation model is trained according to target evaluation characteristics corresponding to each of S model samples and a labeled parameter, the labeled parameter is used to indicate a true value of the model sample corresponding to the target degradation evaluation variable, and S is an integer greater than 1.

11. The method of claim 10, wherein the target number of days is a value determined by a rolling rate analysis of decay days for the S model samples.

12. The method of claim 10 or 11, wherein the target time period is a decay performance period determined by performing a credit analysis on the S model samples based on a target model decay indicator, the target model decay indicator being a number of model decay days exceeding the target number of days.

13. A method of model training, comprising:

14. The method according to claim 13, wherein the initial evaluation features of the model samples comprise P groups of service-related features of the model samples, the P evaluation samples comprise evaluation samples input into the model samples within Q statistical times that use the same statistical period as the statistical reference, and Q is a positive integer;

15. A model performance degradation evaluation apparatus, comprising:

16. An electronic device comprising a processor, a memory, and a computer program stored on the memory and executable on the processor, the computer program, when executed by the processor, implementing the steps of the model performance degradation evaluation method of any one of claims 1 to 12, or implementing the steps of the model training method of any one of claims 13 to 14.

17. A computer-readable storage medium, characterized in that a computer program is stored thereon, and the computer program, when executed by a processor, implements the steps of the model performance degradation evaluation method of any one of claims 1 to 12, or implements the steps of the model training method of any one of claims 13 to 14.