CN107274043B - Quality evaluation method and device of prediction model and electronic equipment

Quality evaluation method and device of prediction model and electronic equipment

Info

Publication number
CN107274043B
CN107274043B
Authority
CN
China
Prior art keywords
model
historical
evaluated
evaluation
effect data
Prior art date
Legal status
Active
Application number
CN201610214413.9A
Other languages
Chinese (zh)
Other versions
CN107274043A (en)
Inventor
车九洲
刘磊
姜骁
Current Assignee
Zhejiang Tmall Technology Co Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Priority to CN201610214413.9A
Publication of CN107274043A
Application granted
Publication of CN107274043B

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/06Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063Operations research, analysis or management
    • G06Q10/0639Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06395Quality analysis or management
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/22Matching criteria, e.g. proximity measures

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Theoretical Computer Science (AREA)
  • Development Economics (AREA)
  • Educational Administration (AREA)
  • Economics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Data Mining & Analysis (AREA)
  • Strategic Management (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Tourism & Hospitality (AREA)
  • Artificial Intelligence (AREA)
  • General Business, Economics & Management (AREA)
  • Operations Research (AREA)
  • Marketing (AREA)
  • Game Theory and Decision Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Quality & Reliability (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The application discloses a quality evaluation method and device for a prediction model, an electronic device, and three further quality evaluation methods and devices for prediction models. The quality evaluation method of the prediction model comprises the following steps: acquiring historical business objects with actual implementation effect data to form a historical business object set for model evaluation; according to the historical business object set, obtaining an evaluation result of the model to be evaluated from each of at least two preset model evaluation algorithms based on historical data; obtaining a decision result for the model to be evaluated through a preset decision algorithm according to the obtained evaluation results; and taking the decision result as the quality evaluation result of the model to be evaluated. With the method provided by the application, the quality of a model to be evaluated can be assessed before the model is implemented, thereby improving operational efficiency.

Description

Quality evaluation method and device of prediction model and electronic equipment
Technical Field
The application relates to the technical field of machine learning, and in particular to a quality evaluation method for a prediction model. Correspondingly, the application also relates to a quality evaluation device for the prediction model, an electronic device, and three further quality evaluation methods and devices for prediction models.
Background
With the continuous development and popularization of machine learning technology, more and more fields adopt prediction models generated by machine learning algorithms to guide the implementation of specific business, that is, decisions are made through data-driven operation rather than manual operation. For example, a commodity audit model determines which commodities may participate in group-buying promotion activities, and a commodity inventory forecast model guides the allocation proportion of a commodity across warehouses. Unlike the traditional manual operation mode, data-driven operation guides the business through a prediction model, realizing an efficient operation mode that needs no manual intervention.
Prediction models are obtained by training on relevant historical data with some prediction algorithm, and their quality has a great influence on the effect of business implementation. Specifically, a high-quality prediction model can guide the business accurately, whereas a poor-quality one may misguide it entirely. Quality evaluation of prediction models is therefore a problem to be solved, namely: how to evaluate the effect of implementing a prediction model. An effective quality evaluation method can both assess the quality of a prediction model and guide its adjustment.
Existing evaluation methods generally evaluate a prediction model by observing the actual business effect (that is, the online effect of the business). Even assuming the actual business effect corresponds directly to the prediction model, a long time usually passes between implementing the model and finally observing the actual business effect. Existing methods therefore cannot evaluate the quality of a model before the model is implemented.
Disclosure of Invention
The application provides a quality evaluation method for a prediction model, aiming to solve the prior-art problem that a model cannot be quality-evaluated before it is implemented. The application further provides a quality evaluation device for the prediction model, an electronic device, and three further quality evaluation methods and devices for prediction models.
The application provides a quality evaluation method of a prediction model, comprising the following steps:
acquiring historical business objects with actual implementation effect data to form a historical business object set for model evaluation;
according to the historical business object set, obtaining an evaluation result of the model to be evaluated from each of at least two preset model evaluation algorithms based on historical data;
obtaining a decision result for the model to be evaluated through a preset decision algorithm according to the obtained evaluation results;
and taking the decision result as the quality evaluation result of the model to be evaluated.
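The four steps above compose into a small evaluation driver. The following is a minimal Python sketch of that pipeline, given for illustration only; the names (HistoricalObject, evaluate_fns, decide) are assumptions of this sketch, not terms of the application:

    # A minimal sketch of the claimed pipeline; all names are illustrative.
    from dataclasses import dataclass
    from typing import Callable, Dict, List, Sequence

    @dataclass
    class HistoricalObject:
        features: Dict[str, float]   # feature data of the business object
        actual_effect: float         # actual implementation effect data

    def evaluate_model_quality(
        history: List[HistoricalObject],        # the historical business object set
        model,                                  # the model to be evaluated
        evaluate_fns: Sequence[Callable],       # preset history-based algorithms
        decide: Callable[[List[float]], bool],  # preset decision algorithm
    ) -> bool:
        assert len(evaluate_fns) >= 2, "at least two history-based algorithms"
        # Each preset evaluation algorithm scores the model on the history set.
        results = [fn(model, history) for fn in evaluate_fns]
        # The decision result is taken as the quality evaluation result.
        return decide(results)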
Optionally, the model evaluation algorithm based on historical data includes:
obtaining business objects that pass the audit of the model to be evaluated, to form a model-audited business object set;
according to the feature data of the business objects, obtaining, through a preset similarity algorithm, the similarity between each business object in the model-audited set and each historical business object in the historical business object set;
for each business object in the model-audited set, selecting from the historical business object set, according to the similarity, the historical business object most similar to it, to form a business object mapping set corresponding to the model-audited set;
obtaining the average value of the actual implementation effect data of the historical business objects in the mapping set, according to the actual implementation effect data of each historical business object in the historical business object set;
and generating the evaluation result of the model to be evaluated according to the average value of the actual implementation effect data, a preset minimum average threshold, and a preset expected average.
Optionally, the model evaluation algorithm based on historical data includes:
for each historical business object, taking its feature data as the input of the model to be evaluated, and computing the predicted implementation effect data of each historical business object through the model;
sorting the historical business objects by a preset sorting criterion on the predicted implementation effect data to obtain a first sorted list, and sorting them by the same criterion on the actual implementation effect data to obtain a second sorted list;
obtaining the similarity of the first sorted list and the second sorted list through a preset similarity algorithm;
and determining the evaluation result of the model to be evaluated according to the similarity.
Optionally, the model evaluation algorithm based on historical data includes:
for each historical business object, taking its feature data as the input of the model to be evaluated, obtaining first predicted implementation effect data through the model, and obtaining, according to those data, a first historical business object set that passes model audit; and, for each historical business object, taking its feature data as the input of a reference model, obtaining second predicted implementation effect data through the reference model, and obtaining, according to those data, a second historical business object set that passes model audit;
generating a first relative complement, the complement of the second set in the first set, and a second relative complement, the complement of the first set in the second set;
according to the actual implementation effect data of the historical business objects, obtaining a statistic of the first actual implementation effect data over the first relative complement and a statistic of the second actual implementation effect data over the second relative complement;
obtaining the evaluation score of the model to be evaluated according to the statistic of the first actual implementation effect data and the statistic of the actual implementation effect data of all selected historical business objects, and the evaluation score of the reference model according to the statistic of the second actual implementation effect data and the same overall statistic;
and determining the evaluation result of the model to be evaluated according to the evaluation score of the model to be evaluated and the evaluation score of the reference model.
Optionally, the preset decision algorithm is a voting decision algorithm, a statistic decision algorithm, or a weighted decision algorithm.
Optionally, the voting decision algorithm includes:
generating a normalized score for each evaluation result through the result normalization algorithm corresponding to each model evaluation algorithm;
obtaining the number of results that pass quality evaluation, according to the normalized scores and the preset minimum normalized-score threshold corresponding to each model evaluation algorithm;
and if that number exceeds a minimum vote threshold, judging that the model to be evaluated passes the quality decision.
Optionally, the statistic decision algorithm includes:
generating a normalized score for each evaluation result through the result normalization algorithm corresponding to each model evaluation algorithm;
and if a statistic of the normalized scores exceeds a minimum statistic threshold, judging that the model to be evaluated passes the quality decision.
Optionally, the statistic includes an average value.
Optionally, the weighted decision algorithm includes:
generating a normalized score for each evaluation result through the result normalization algorithm corresponding to each model evaluation algorithm;
generating a composite score from the normalized scores according to weights preset for the results of each model evaluation algorithm;
and if the composite score exceeds a minimum composite-score threshold, judging that the model to be evaluated passes the quality decision.
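For concreteness, a minimal sketch of the statistic and weighted decision rules just listed, assuming numeric evaluation results and one normalization function per evaluation algorithm; the parameter names and thresholds are illustrative, not from the application:

    from typing import Callable, Sequence

    def statistic_decision(results: Sequence[float],
                           normalizers: Sequence[Callable[[float], float]],
                           min_statistic: float) -> bool:
        # Normalize each result with its algorithm's normalizer, then compare
        # the average (the statistic named above) to the minimum threshold.
        scores = [f(r) for f, r in zip(normalizers, results)]
        return sum(scores) / len(scores) > min_statistic

    def weighted_decision(results: Sequence[float],
                          normalizers: Sequence[Callable[[float], float]],
                          weights: Sequence[float],
                          min_composite: float) -> bool:
        # Composite score = weighted sum of the normalized scores.
        scores = [f(r) for f, r in zip(normalizers, results)]
        composite = sum(w * s for w, s in zip(weights, scores))
        return composite > min_composite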
Optionally, the preset evaluation algorithms based on historical data include a specific evaluation algorithm customized for the model to be evaluated.
Optionally, the at least two preset model evaluation algorithms based on historical data are configurable.
Optionally, the method further includes:
setting the at least two preset model evaluation algorithms based on historical data applied by the method.
Optionally, the preset decision algorithm is configurable.
Correspondingly, the present application also provides a quality evaluation device for a prediction model, comprising:
a historical data acquisition unit, configured to acquire historical business objects with actual implementation effect data and form a historical business object set for model evaluation;
an evaluation unit, configured to obtain, according to the historical business object set, an evaluation result of the model to be evaluated from each of at least two preset model evaluation algorithms based on historical data;
a decision unit, configured to obtain a decision result for the model to be evaluated through a preset decision algorithm according to the obtained evaluation results;
and a judging unit, configured to take the decision result as the quality evaluation result of the model to be evaluated.
Correspondingly, the present application also provides an electronic device, comprising:
a display;
a processor; and
a memory for storing a program implementing the quality evaluation method of the prediction model; after being powered on and running the program, the device performs the following steps: acquiring historical business objects with actual implementation effect data to form a historical business object set for model evaluation; according to the historical business object set, obtaining an evaluation result of the model to be evaluated from each of at least two preset model evaluation algorithms based on historical data; obtaining a decision result for the model to be evaluated through a preset decision algorithm according to the obtained evaluation results; and taking the decision result as the quality evaluation result of the model to be evaluated.
In addition, the present application also provides a quality evaluation method of a prediction model, including:
acquiring historical business objects with actual implementation effect data;
for each historical business object, taking its feature data as the input of the model to be evaluated, and computing the predicted implementation effect data of each historical business object through the model;
sorting the historical business objects by a preset sorting criterion on the predicted implementation effect data to obtain a first sorted list, and sorting them by the same criterion on the actual implementation effect data to obtain a second sorted list;
obtaining the similarity of the first sorted list and the second sorted list through a preset similarity algorithm;
and determining the evaluation result of the model to be evaluated according to the similarity.
Correspondingly, the present application also provides a quality evaluation device for a prediction model, comprising:
a historical data acquisition unit, configured to acquire historical business objects with actual implementation effect data;
a model prediction unit, configured to take the feature data of each historical business object as the input of the model to be evaluated and compute the predicted implementation effect data of each historical business object through the model;
a sorting unit, configured to sort the historical business objects by a preset sorting criterion on the predicted implementation effect data to obtain a first sorted list, and to sort them by the same criterion on the actual implementation effect data to obtain a second sorted list;
a similarity calculation unit, configured to obtain the similarity of the first sorted list and the second sorted list through a preset similarity algorithm;
and an evaluation result judging unit, configured to determine the evaluation result of the model to be evaluated according to the similarity.
In addition, the present application also provides a quality evaluation method of a prediction model, including:
obtaining business objects that pass the audit of the model to be evaluated, to form a model-audited business object set; and acquiring historical business objects with actual implementation effect data as a historical business object set;
according to the feature data of the business objects, obtaining, through a preset similarity algorithm, the similarity between each business object in the model-audited set and each historical business object in the historical business object set;
for each business object in the model-audited set, selecting from the historical business object set, according to the similarity, the historical business object most similar to it, to form a business object mapping set corresponding to the model-audited set;
obtaining the average value of the actual implementation effect data of the historical business objects in the mapping set, according to the actual implementation effect data of each historical business object in the historical business object set;
and generating the evaluation result of the model to be evaluated according to the average value of the actual implementation effect data, a preset minimum average threshold, and a preset expected average.
Correspondingly, the present application also provides a quality evaluation device for a prediction model, comprising:
a data acquisition unit, configured to obtain business objects that pass the audit of the model to be evaluated, forming a model-audited business object set, and to acquire historical business objects with actual implementation effect data as a historical business object set;
a similarity calculation unit, configured to obtain, through a preset similarity algorithm and according to the feature data of the business objects, the similarity between each business object in the model-audited set and each historical business object in the historical business object set;
a mapping unit, configured to select, for each business object in the model-audited set and according to the similarity, the historical business object most similar to it from the historical business object set, forming a business object mapping set corresponding to the model-audited set;
an average implementation effect data calculation unit, configured to obtain the average value of the actual implementation effect data of the historical business objects in the mapping set, according to the actual implementation effect data of each historical business object in the historical business object set;
and an evaluation result judging unit, configured to generate the evaluation result of the model to be evaluated according to the average value of the actual implementation effect data, a preset minimum average threshold, and a preset expected average.
In addition, the present application also provides a quality evaluation method of a prediction model, including:
acquiring historical business objects with actual implementation effect data;
for each historical business object, taking its feature data as the input of a first model to be evaluated, obtaining first predicted implementation effect data through that model, and obtaining, according to those data, a first historical business object set that passes model audit; and, for each historical business object, taking its feature data as the input of a second model to be evaluated, obtaining second predicted implementation effect data through that model, and obtaining, according to those data, a second historical business object set that passes model audit;
generating a first relative complement, the complement of the second set in the first set, and a second relative complement, the complement of the first set in the second set;
according to the actual implementation effect data of the historical business objects, obtaining a statistic of the first actual implementation effect data over the first relative complement and a statistic of the second actual implementation effect data over the second relative complement;
obtaining the evaluation score of the first model to be evaluated according to the statistic of the first actual implementation effect data and the statistic of the actual implementation effect data of all selected historical business objects, and the evaluation score of the second model to be evaluated according to the statistic of the second actual implementation effect data and the same overall statistic;
and determining the model to use for prediction according to the evaluation scores and a preset selection rule.
Optionally, the preset selection rule includes:
taking the model to be evaluated with the higher evaluation score as the model for prediction.
Optionally, the statistic includes an average value.
Correspondingly, the present application also provides a quality evaluation device for a prediction model, comprising:
a historical data acquisition unit, configured to acquire historical business objects with actual implementation effect data;
a model prediction unit, configured to take, for each historical business object, its feature data as the input of a first model to be evaluated, obtain first predicted implementation effect data through that model, and obtain, according to those data, a first historical business object set that passes model audit; and to take, for each historical business object, its feature data as the input of a second model to be evaluated, obtain second predicted implementation effect data through that model, and obtain, according to those data, a second historical business object set that passes model audit;
a relative complement generating unit, configured to generate a first relative complement, the complement of the second set in the first set, and a second relative complement, the complement of the first set in the second set;
a statistic calculation unit, configured to obtain, according to the actual implementation effect data of the historical business objects, a statistic of the first actual implementation effect data over the first relative complement and a statistic of the second actual implementation effect data over the second relative complement;
an evaluation score obtaining unit, configured to obtain the evaluation score of the first model to be evaluated according to the statistic of the first actual implementation effect data and the statistic of the actual implementation effect data of all selected historical business objects, and the evaluation score of the second model to be evaluated according to the statistic of the second actual implementation effect data and the same overall statistic;
and a model selecting unit, configured to determine the model to use for prediction according to the evaluation scores and a preset selection rule.
Compared with the prior art, the quality evaluation method provided by the application obtains, according to historical business objects with actual implementation effect data, an evaluation result of the model to be evaluated from each of at least two preset evaluation algorithms based on historical data; then obtains a decision result for the model to be evaluated through a preset decision algorithm according to those evaluation results; and takes the decision result as the quality evaluation result of the model to be evaluated.
With this quality evaluation method, the quality of a model to be evaluated can be assessed, according to historical business objects with actual implementation effect data, before the model is implemented, thereby improving operational efficiency.
Drawings
FIG. 1 is a flow chart of an embodiment of the quality evaluation method of a prediction model provided by the present application;
FIG. 2 is a schematic diagram of an embodiment of the quality evaluation apparatus of a prediction model provided by the present application;
FIG. 3 is a schematic diagram of an embodiment of the electronic device provided by the present application;
FIG. 4 is a flow chart of an embodiment of the second quality evaluation method of a prediction model provided by the present application;
FIG. 5 is a schematic diagram of an embodiment of the second quality evaluation apparatus of a prediction model provided by the present application;
FIG. 6 is a flow chart of an embodiment of the third quality evaluation method of a prediction model provided by the present application;
FIG. 7 is a schematic diagram of an embodiment of the third quality evaluation apparatus of a prediction model provided by the present application;
FIG. 8 is a flow chart of an embodiment of the fourth quality evaluation method of a prediction model provided by the present application;
FIG. 9 is a schematic diagram of an embodiment of the fourth quality evaluation apparatus of a prediction model provided by the present application.
Detailed Description
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present application. The application, however, can be implemented in many ways other than those described herein, and those skilled in the art can make similar generalizations without departing from the spirit of the application; the application is therefore not limited to the specific implementations disclosed below.
The present application provides a quality evaluation method and device for a prediction model, an electronic device, and three further quality evaluation methods and devices for prediction models, which are described in detail in the following embodiments.
The core idea of the first quality evaluation method provided by the application is: according to a historical business object set with actual implementation effect data, obtain an evaluation result of the model to be evaluated from each of at least two preset model evaluation algorithms based on historical data; then, according to the obtained evaluation results, obtain a decision result for the model to be evaluated through a preset decision algorithm, and take the decision result as the quality evaluation result of the model to be evaluated.
Please refer to FIG. 1, which is a flowchart of an embodiment of the quality evaluation method of a prediction model provided by the present application. The method comprises the following steps:
Step S101: acquiring historical business objects with actual implementation effect data to form a historical business object set for model evaluation.
In this quality evaluation method, before the model to be evaluated is actually implemented, its quality is evaluated according to historical business objects with actual implementation effect data.
The model to be evaluated is a prediction model learned from historical business data through a machine learning algorithm, for example, the commodity audit model applied in the group-buying promotion business: it audits commodities according to the feature data of the commodities registered for the promotion, and finally determines which commodities may participate by predicting each commodity's per-slot sales output.
A historical business object with actual implementation effect data is a business object that passed audit in the past and has actually been implemented. For example, in the group-buying promotion business, commodities audited manually or by a historical version of the commodity audit model have already participated in promotion activities and have actual output data (such as the sales generated during their participation); such commodities are historical business objects with actual implementation effect data, the actual implementation effect data being, for example, those sales.
Historical business objects with actual implementation effect data are typically stored in a database. This step may therefore be implemented as follows: query the relevant business database for historical business objects with actual implementation effect data, and form the historical business object set for model evaluation from the objects obtained.
It should be noted that, in practical applications, the timeliness of the historical business objects needs to be considered. To ensure the accuracy of the evaluation result, historical business objects with recent actual implementation effect data should be selected, for example those whose actual implementation effect data were generated within a preset time range (e.g., within 7 days). In addition, the number of historical business objects can be capped to reduce the computation and other resources consumed when executing the method.
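As a minimal illustration of such a selection, the following Python sketch keeps only objects whose actual effect data were generated within the time window and caps the set size; the record fields ("effect_time", "actual_effect") are hypothetical:

    from datetime import datetime, timedelta, timezone
    from typing import Dict, List

    def build_history_set(records: List[Dict], window_days: int = 7,
                          max_objects: int = 1000) -> List[Dict]:
        cutoff = datetime.now(timezone.utc) - timedelta(days=window_days)
        recent = [r for r in records
                  if r.get("actual_effect") is not None
                  and r["effect_time"] >= cutoff]
        # Cap the number of historical business objects to bound computation.
        return recent[:max_objects]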
Step S103: according to the historical business object set, obtaining an evaluation result of the model to be evaluated from each of at least two preset model evaluation algorithms based on historical data.
After the historical business objects are obtained in the previous step, multiple preset model evaluation algorithms based on historical data are applied to them, yielding each algorithm's evaluation result for the model to be evaluated.
Three general model evaluation algorithms are given below; in practical applications they can be combined arbitrarily to evaluate the model to be evaluated comprehensively.
1) Evaluation algorithm one
The core idea of this algorithm is as follows: perform a simulated audit (prediction) on business objects that have not actually been implemented, using the model to be evaluated; then, for the objects that pass the audit but have not been implemented, estimate their implementation effect using the actual implementation effect data of historical business objects, by finding a mapping (formed according to similarity) between those objects and historical business objects with actual implementation effect data.
Taking the commodity audit model of the group-buying promotion business as an example: first use the model to perform a simulated audit of the registered commodities; then, for the commodities that pass the audit but have not actually gone live, estimate their expected output from the actual per-slot output of commodities that did go live online, by finding the mapping to the most similar live commodities.
The algorithm, when implemented, may include the steps of:
step S201: and obtaining the business object which passes the model to be evaluated to be audited, and forming a business object set which passes the model audit.
The business objects audited by the model to be evaluated include but are not limited to: business objects that have not been actually implemented (i.e., business objects that have not yet had actual implementation effect data).
The process of obtaining a business object that is audited by a model to be evaluated may be as follows: firstly, selecting a business object which is not actually implemented according to a preset selection rule, then respectively auditing the selected business objects through a model to be evaluated, and finally obtaining the business object which is audited through the model to be evaluated.
Taking the commodity audit model applied to the cost-effective auction service as an example, in the process of acquiring the business object which passes the audit of the model to be evaluated, the business object which is not actually implemented can be a commodity which is already registered to participate in the cost-effective auction service. The preset selection rule may include a selection rule related to the time of the business object, or a selection rule related to the number of the selected business objects, for example, the selection rule is 100 goods registered in the last week.
Step S203: according to the feature data of the business objects, acquiring, through a preset similarity algorithm, the similarity between each business object in the model-audited set and each historical business object in the historical business object set.
After the business objects that pass the audit of the model to be evaluated are obtained in the previous step, the similarity between each of them and each historical business object obtained in step S101 is calculated from the feature data of the business objects.
The feature data of a business object are the attributes that affect its audit result; for example, the commodity feature data used to judge whether a commodity may participate in the group-buying promotion include the quality of the commodity, the credit of the merchant selling it, and other related data.
Similarity measures how alike two business objects are; the smaller the value of the similarity metric, the less similar the objects and the greater the difference between them. Applicable similarity algorithms include, but are not limited to: vector space cosine similarity, the Pearson correlation coefficient, and the Jaccard similarity coefficient.
Step S205: for each business object in the model-audited set, selecting from the historical business object set, according to the similarity, the historical business object most similar to it, to form a business object mapping set corresponding to the model-audited set.
The previous step yields the similarity between each model-audited business object and each historical business object. According to these similarities, the historical business object most similar to each model-audited object (i.e., with the largest similarity value) is selected from the historical business object set, forming the business object mapping set that corresponds to the model-audited set.
In specific implementation, the historical business object set is usually larger than the model-audited business object set, to ensure the accuracy of the evaluation result.
Step S207: acquiring the average value of the actual implementation effect data of the historical business objects in the mapping set, according to the actual implementation effect data of each historical business object in the historical business object set.
The core idea of this algorithm is to use the actual implementation effect data of historical business objects to estimate the implementation effect of business objects that pass the audit but have not actually been implemented. This step computes the average value of the actual implementation effect data of the historical business objects in the mapping set generated in the previous step.
Step S209: generating the evaluation result of the model to be evaluated according to the average value of the actual implementation effect data, a preset minimum average threshold, and a preset expected average.
Because the historical business objects in the mapping set are the most similar to the business objects in the model-audited set, the average of the actual implementation effect data over the mapping set can stand in for the implementation effect of the model-audited set.
This step generates the evaluation result of the model to be evaluated from the average value obtained in the previous step, the preset minimum average threshold, and the preset expected average.
In specific implementation, the following formula can be adopted to generate the evaluation result of the model to be evaluated:
score = 0, if V < Vbtm
score = 100 * (V - Vbtm) / (Vave - Vbtm), if Vbtm <= V <= Vave
score = 100, if V > Vave
wherein score is the evaluation score of the model to be evaluated, namely the evaluation result; V is the average value of the actual implementation effect data corresponding to the business object mapping set; Vbtm is the preset minimum average threshold; and Vave is the preset expected average. Both the preset minimum average threshold and the preset expected average may be set empirically.
As the formula shows: if the average of the actual implementation effect data corresponding to the business object mapping set is below the preset minimum average threshold, the average output of the model-audited business objects is too small, the model is unusable, and the evaluation score is 0; if the average lies between the preset minimum average threshold and the preset expected average, the model is usable, and the score is given by the middle branch of the formula; if the average exceeds the preset expected average, the average output of the model-audited business objects is very large, the model is usable, and the evaluation score is 100.
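Putting steps S201 through S209 together, the following sketch makes two simplifying assumptions: cosine similarity over feature vectors represented as dictionaries, and the linear middle branch of the score exactly as reconstructed above:

    import math
    from typing import Dict, List

    def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
        dot = sum(a.get(k, 0.0) * b.get(k, 0.0) for k in set(a) | set(b))
        na = math.sqrt(sum(v * v for v in a.values()))
        nb = math.sqrt(sum(v * v for v in b.values()))
        return dot / (na * nb) if na and nb else 0.0

    def evaluate_algorithm_one(audited: List[Dict[str, float]],  # feature vectors
                               history: List[Dict],  # features + actual_effect
                               v_btm: float, v_ave: float) -> float:
        # Steps S203/S205: map each model-audited object to the most similar
        # historical business object.
        mapped = [max(history, key=lambda h: cosine(f, h["features"]))
                  for f in audited]
        # Step S207: average the mapped actual implementation effect data.
        v = sum(h["actual_effect"] for h in mapped) / len(mapped)
        # Step S209: piecewise score against the threshold and expected value.
        if v < v_btm:
            return 0.0
        if v > v_ave:
            return 100.0
        return 100.0 * (v - v_btm) / (v_ave - v_btm)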
2) Evaluation algorithm two
The core idea of this algorithm is as follows: for historical business objects with actual implementation effect data, obtain their predicted implementation effect data through the model to be evaluated, then compare the predicted data with the actual data; the closer the predicted implementation effect data are to the actual implementation effect data, the more accurate the model to be evaluated.
In particular implementations, the algorithm may include the following steps:
step S301: and aiming at each historical business object, taking the characteristic data of the historical business object as the input of a model to be evaluated, and calculating and acquiring the prediction implementation effect data of each historical business object through the model to be evaluated.
In the step S101, for each historical service object obtained, the feature data of the historical service object is used as the input of the model to be evaluated, and the prediction implementation effect data of each historical service object is obtained through calculation by the model to be evaluated.
Step S303: sequencing each historical service object according to a preset sequencing standard of the predicted implementation effect data to obtain a first sequencing list of each historical service object; and sequencing the historical service objects according to the preset sequencing standard of the actual implementation effect data to obtain a second sequencing list of the historical service objects.
After the prediction implementation effect data of each historical business object is obtained through calculation of a model to be evaluated, sorting each historical business object according to a preset sorting standard of the prediction implementation effect data to obtain a first sorting list of each historical business object; and sequencing each historical service object according to a preset sequencing standard of the actual implementation effect data to obtain a second sequencing list of each historical service object.
The preset sorting standard comprises that the ascending order of the implementation effect data is used as the sorting standard, or the descending order of the implementation effect data is used as the sorting standard. It should be noted that the sort criteria by which the first sorted listing and the second sorted listing are formed should be consistent.
Step S305: acquiring the similarity of the first sorted list and the second sorted list through a preset similarity algorithm.
In this step, the similarity of the first and second sorted lists is obtained through a preset similarity algorithm; it measures how close the actual implementation effect data are to the predicted implementation effect data.
The preset similarity algorithm may be vector space cosine similarity, the Pearson correlation coefficient, the Jaccard similarity coefficient, or a similar algorithm; in practical applications, any one of them may be chosen according to specific requirements to calculate the similarity of the two lists.
In specific implementation, the similarity between the first sorted list and the second sorted list can be calculated by the following formula:
r = (n*Σxy - Σx*Σy) / sqrt((n*Σx² - (Σx)²) * (n*Σy² - (Σy)²))
wherein r is the similarity of the first sorted list and the second sorted list; x ranges over the business objects in the second sorted list and y over the business objects in the first sorted list, with x and y taken at the same sorting position in their respective lists; and n is the number of business objects. Note that the first and second sorted lists contain the same number of business objects.
Step S307: determining the evaluation result of the model to be evaluated according to the similarity.
Once the similarity is obtained in the previous step, the evaluation result of the model to be evaluated can be determined from it: the smaller the similarity, the lower the prediction accuracy of the model to be evaluated, and the worse its quality.
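A sketch of this algorithm follows. The pairing of the two lists in the text admits more than one reading; this sketch takes a natural one and correlates each object's rank position in the two lists (Spearman-style), applying the Pearson formula above to the rank sequences:

    from typing import Callable, Dict, List

    def pearson(x: List[float], y: List[float]) -> float:
        n = len(x)
        sx, sy = sum(x), sum(y)
        sxy = sum(a * b for a, b in zip(x, y))
        sxx, syy = sum(a * a for a in x), sum(b * b for b in y)
        den = ((n * sxx - sx * sx) * (n * syy - sy * sy)) ** 0.5
        return (n * sxy - sx * sy) / den if den else 0.0

    def evaluate_algorithm_two(history: List[Dict],
                               predict: Callable[[Dict], float]) -> float:
        # Step S301: predicted implementation effect data from the model.
        preds = {id(h): predict(h["features"]) for h in history}
        # Step S303: both lists sorted by the same (descending) criterion.
        first = sorted(history, key=lambda h: preds[id(h)], reverse=True)
        second = sorted(history, key=lambda h: h["actual_effect"], reverse=True)
        rank_in_first = {id(h): i for i, h in enumerate(first)}
        # Step S305: correlate the rank positions of the same objects.
        x = [float(i) for i in range(len(second))]
        y = [float(rank_in_first[id(h)]) for h in second]
        return pearson(x, y)  # in [-1, 1]; larger means a more accurate model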
3) Evaluation algorithm three
The core idea of this algorithm is as follows: compare the model to be evaluated with the old-version model (also called the reference model); the relative quality of the two models is judged on the parts where the prediction results obtained from the two models differ. If the evaluation concludes that the model to be evaluated is better than the reference model, the model to be evaluated passes the quality evaluation; otherwise it does not.
In particular implementations, the algorithm may include the following steps:
step S401: aiming at each historical business object, taking the characteristic data of the historical business object as the input of a model to be evaluated, obtaining first prediction implementation effect data of each historical business object through the model to be evaluated, and obtaining a first historical business object set which passes model audit according to the first prediction implementation effect data; and aiming at each historical business object, taking the characteristic data of the historical business object as the input of a reference model, acquiring second prediction implementation effect data of each historical business object through the reference model, and acquiring a historical business object set checked by a second passing model according to the second prediction implementation effect data.
In the step S101, for each historical service object obtained in step S101, the historical service object is respectively audited through the model to be evaluated and the reference model, and according to the prediction implementation effect data obtained by prediction of each model, a first historical service object set audited through the model to be evaluated and a second historical service object set audited through the reference model are obtained.
Step S403: and generating a first relative complement of the historical business object set which is subjected to model examination by the second pass model in the historical business object set which is subjected to model examination by the first pass model and a second relative complement of the historical business object set which is subjected to model examination by the first pass model in the historical business object set which is subjected to model examination by the second pass model.
After the historical service object set which passes model audit and the historical service object set which passes model audit are obtained, a first relative complement of the historical service object set which passes model audit in the first model audit is generated, and a second relative complement of the historical service object set which passes model audit in the second model audit is generated.
The business objects included in the first relative complement set belong to the historical business object set which is audited by the first passing model, but do not belong to the historical business object set which is audited by the second passing model; the business objects included in the second relative complement set belong to the historical business object set which is audited by the second passing model, but do not belong to the historical business object set which is audited by the first passing model.
Step S405: according to the actual implementation effect data of the historical business objects, obtaining a statistic of the first actual implementation effect data over the first relative complement and a statistic of the second actual implementation effect data over the second relative complement.
For the two relative complements generated in the previous step, the statistic of the first actual implementation effect data over the first relative complement and the statistic of the second actual implementation effect data over the second relative complement are obtained from the actual implementation effect data of the historical business objects.
The statistic includes, but is not limited to, an average value; any statistic that characterizes the difference between the relative complements may be used.
Step S407: acquiring the evaluation score of the model to be evaluated according to the statistic of the first actual implementation effect data and the statistic of the actual implementation effect data of all selected historical business objects; and acquiring the evaluation score of the reference model according to the statistic of the second actual implementation effect data and the same overall statistic.
In this step, the actual implementation effect reflected by each relative complement (i.e., the statistic of the first and of the second actual implementation effect data) is compared with the statistic of the actual implementation effect data of all the historical business objects selected in step S101, to obtain the evaluation scores of the two models.
In particular implementation, the evaluation scores of the two models can be calculated by adopting the following formula:
score = 50 + 50 * (Vave1 - Vave) / Vave
wherein score is the evaluation score of the model, with value range [0, 100]; Vave1 is the average of the actual implementation effect data over the relative complement set; and Vave is the average of the actual implementation effect data of all the selected historical business objects.
Step S409: determining the evaluation result of the model to be evaluated according to the evaluation score of the model to be evaluated and the evaluation score of the reference model.
Finally, from the evaluation score of the model to be evaluated and that of the reference model, whether the model to be evaluated is better than the reference model can be determined according to a preset selection rule, and thus the evaluation result of the model to be evaluated. Preset selection rules include, but are not limited to: the model with the higher evaluation score is the better one. If the model to be evaluated is judged better than the reference model, the evaluation score obtained in the previous step may serve as its evaluation result.
Taking the commodity audit model of the group-buying promotion business as an example, the two models compared are the currently used commodity audit model (serving as the reference model) and a newly generated version of the commodity audit model (serving as the model to be evaluated). Both versions perform a simulated audit of the same commodities that already have slot output (say, 100 commodities). Suppose 70 commodities pass the audit of the model to be evaluated (the first model-audited set) and 60 pass the audit of the reference model (the second model-audited set). From these two sets, the first relative complement is generated, containing the commodities among the 70 that are not among the 60, and the second relative complement, containing the commodities among the 60 that are not among the 70. The evaluation score of the reference model and that of the model to be evaluated are then calculated from the actual slot output of the commodities, and finally the two scores are compared to determine the evaluation result of the model to be evaluated.
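This example maps directly onto code. A sketch under stated assumptions: each model is represented by an audit predicate over feature data, the statistic is the average named in the text, and the [0, 100] clipping of the score is an assumption of the sketch:

    from typing import Callable, Dict, List

    def relative_score(v_complement: float, v_overall: float) -> float:
        # score = 50 + 50 * (Vave1 - Vave) / Vave, kept within [0, 100]
        s = 50.0 + 50.0 * (v_complement - v_overall) / v_overall
        return max(0.0, min(100.0, s))

    def evaluate_algorithm_three(history: List[Dict],
                                 audit_candidate: Callable[[Dict], bool],
                                 audit_reference: Callable[[Dict], bool]) -> bool:
        # Step S401: the sets passing each model's simulated audit.
        first = [h for h in history if audit_candidate(h["features"])]
        second = [h for h in history if audit_reference(h["features"])]
        ids_first = {id(h) for h in first}
        ids_second = {id(h) for h in second}
        # Step S403: the two relative complements.
        comp_first = [h for h in first if id(h) not in ids_second]
        comp_second = [h for h in second if id(h) not in ids_first]
        if not comp_first or not comp_second:
            return False  # the models (nearly) agree; nothing to compare
        # Steps S405/S407: average actual effect data, then the two scores.
        def avg(objs: List[Dict]) -> float:
            return sum(h["actual_effect"] for h in objs) / len(objs)
        v_all = avg(history)
        score_candidate = relative_score(avg(comp_first), v_all)
        score_reference = relative_score(avg(comp_second), v_all)
        # Step S409: the model to be evaluated passes if it beats the reference.
        return score_candidate > score_reference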
Three preset evaluation algorithms based on historical data have been explained above, all of which are universal model evaluation algorithms. It should be noted that, because different models to be evaluated have different characteristics, the preset evaluation algorithms may further include a specific evaluation algorithm customized for a given model to be evaluated.
In specific implementation, at least two preset model evaluation algorithms based on historical data can be configured for the quality evaluation process of the model to be evaluated. Therefore, the method provided by the present application further comprises: and setting the at least two preset historical data-based model evaluation algorithms.
Step S105: and obtaining the decision result of the model to be evaluated through a preset decision algorithm according to each obtained evaluation result.
The plurality of evaluation results obtained in the previous step need to be normalized into a uniform output. Through this step, the evaluation results are integrated into a normalized evaluation result, from which the final model decision result is generated.
Three available model decision algorithms are given below, and in practical application, any one of the algorithms can be selected to perform comprehensive decision on an evaluation result.
1) Decision algorithm one: a voting decision algorithm.
The core idea of the algorithm is as follows: a plurality of evaluation results are adjudicated by voting; for example, when more than half of the evaluation results agree, that result is taken as the final effective result.
In particular implementations, the algorithm may include the following steps:
step S401: and generating the normalization score of each evaluation result through the evaluation result normalization algorithm respectively corresponding to each model evaluation algorithm.
The evaluation results obtained by different model evaluation algorithms may have different score ranges, and evaluation results in different score ranges are not comparable. Therefore, for evaluation results with different score ranges, the algorithm first needs to generate a normalized score for each evaluation result through the evaluation result normalization algorithm corresponding to each model evaluation algorithm, so that the evaluation results become comparable.
Taking evaluation algorithm two as an example, the similarity it finally obtains has a value range of [-1, 1]; this value therefore needs to be normalized to the range [0, 100]. Specifically, normalization can be realized by linear mapping, which keeps the normalized result completely faithful to the trend of the similarity. The mapping formula can be: Y = 50X + 50, wherein X is the similarity and Y is the normalized evaluation score of the model to be evaluated.
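A one-line Python rendering of this mapping; the added range check is an assumption for the example.

```python
def normalize_similarity(x):
    """Map a similarity x in [-1, 1] linearly onto [0, 100] via Y = 50X + 50."""
    if not -1.0 <= x <= 1.0:  # assumption: reject out-of-range similarities
        raise ValueError("similarity must lie in [-1, 1]")
    return 50.0 * x + 50.0
```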
Step S403: and acquiring the quantity of the results passing the quality evaluation according to the normalization scores of the evaluation results and the preset minimum normalization score threshold value respectively corresponding to the model evaluation algorithms.
After the normalized score of each evaluation result is generated in the previous step, the number of results in which the model to be evaluated passes the quality evaluation can be determined according to the minimum normalized score threshold preset for each model evaluation algorithm. For example, evaluation algorithm two may specify that if the normalized score it obtains is greater than 60 points, the model to be evaluated passes the quality evaluation. In specific implementation, the minimum normalized score thresholds preset for different model evaluation algorithms may be the same or different.
Step S405: and if the result quantity is larger than the minimum voting number threshold value, judging that the decision result is that the model to be evaluated passes the quality decision.
In this step, the number of results passing the quality evaluation obtained in the previous step is examined; if the number of results is greater than the minimum vote-number threshold, the decision result is judged to be that the model to be evaluated passes the quality decision.
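A minimal sketch of the voting decision in Python; the representation of each evaluation result as a (normalized score, per-algorithm threshold) pair is an assumption for the example.

```python
def voting_decision(scored_results, min_votes):
    """scored_results: iterable of (normalized_score, min_score_threshold) pairs.

    Returns True (the model passes the quality decision) if the number of
    per-algorithm passes exceeds the minimum vote-number threshold.
    """
    passes = sum(1 for score, threshold in scored_results if score > threshold)
    return passes > min_votes

# Example: three evaluation algorithms, each with a 60-point pass mark;
# two of the three pass, so with min_votes=1 the model passes.
print(voting_decision([(72, 60), (65, 60), (48, 60)], min_votes=1))  # True
```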
2) Decision algorithm two: a statistical value decision algorithm.
The core idea of the algorithm is as follows: a statistical value is generated from the multiple evaluation scores, and the decision result is determined from that statistic. For example, the evaluation scores are averaged, and if the average exceeds a certain value (for example, 60 points), the model to be evaluated is determined to be usable.
In particular implementations, the algorithm may include the following steps: 1) generating a normalization score of each evaluation result through an evaluation result normalization algorithm respectively corresponding to each model evaluation algorithm; 2) and if the statistic value of the normalization score is larger than the minimum statistic value threshold, judging that the decision result is that the model to be evaluated passes the quality decision.
Like the voting decision algorithm, this algorithm also requires that the various evaluation scores first be normalized to the same value range, for example to [0, 100].
The statistical value includes, but is not limited to, an average value, and the minimum statistical value threshold may be determined empirically. Taking the average of the normalized scores of the evaluation results as an example, with normalized scores in the range [0, 100], an average score of 50 indicates that the prediction effect of the model to be evaluated is consistent with that of the currently used model; below 50, the lower the average score, the worse the prediction effect of the model to be evaluated; above 50, the higher the average score, the better the prediction effect of the model to be evaluated.
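A sketch of this decision in Python, using the average as the statistic; the default 50-point threshold is illustrative, since the text leaves the threshold to empirical choice.

```python
def statistical_decision(normalized_scores, min_statistic=50.0):
    """Pass the quality decision if the average normalized score exceeds
    the minimum statistical value threshold (empirically chosen)."""
    average = sum(normalized_scores) / len(normalized_scores)
    return average > min_statistic
```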
3) Decision algorithm three: a weighted decision algorithm.
The core idea of the algorithm is as follows: the user can customize the score weight of each evaluation result and the final passing-score threshold, so that the decision process can be adjusted according to the user's own model objectives.
In particular implementations, the algorithm may include the following steps: 1) generating a normalization score of each evaluation result through an evaluation result normalization algorithm respectively corresponding to each model evaluation algorithm; 2) generating a comprehensive score of the normalized score according to weights preset for evaluation results obtained by each model evaluation algorithm; 3) and if the comprehensive score is larger than the minimum comprehensive score threshold value, judging that the decision result is that the model to be evaluated passes the quality decision.
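A sketch of the steps above in Python; that the weights sum to 1 is an assumption for the example, not a requirement stated in the text.

```python
def weighted_decision(normalized_scores, weights, min_composite):
    """Pass the quality decision if the weighted composite of the normalized
    scores exceeds the minimum composite score threshold."""
    composite = sum(w * s for w, s in zip(weights, normalized_scores))
    return composite > min_composite

# Example: trust the second evaluation algorithm twice as much as the others.
print(weighted_decision([70, 55, 40], [0.25, 0.5, 0.25], min_composite=50))  # True (composite is 55.0)
```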
Three preset decision algorithms have been explained above, all of which are universal decision algorithms. It should be noted that, in specific implementation, the preset decision algorithm applied to the evaluation process of the model to be evaluated is configurable.
Step S107: and taking the decision result as a quality evaluation result of the model to be evaluated.
Finally, the decision result of the model to be evaluated obtained by the preset decision algorithm is taken as the quality evaluation result of the model to be evaluated. This result can ultimately serve as the basis for judging whether the model to be evaluated can be put into actual application, thereby realizing data-driven operation.
In the above embodiments, a quality evaluation method of a prediction model is provided, and correspondingly, the present application also provides a quality evaluation device of a prediction model. The apparatus corresponds to an embodiment of the method described above.
Please refer to fig. 2, which is a schematic diagram of an embodiment of a quality evaluation apparatus of the prediction model of the present application. Since the apparatus embodiments are substantially similar to the method embodiments, they are described in a relatively simple manner, and reference may be made to some of the descriptions of the method embodiments for relevant points. The device embodiments described below are merely illustrative.
The quality evaluation device of a prediction model of the present embodiment includes: a historical data acquiring unit 101, configured to acquire a historical service object with actual implementation effect data, and form a historical service object set used for model evaluation; the evaluating unit 103 is configured to obtain, according to the historical service object set, evaluation results of models to be evaluated by the historical data-based model evaluating algorithms through at least two preset historical data-based model evaluating algorithms; the decision unit 105 is configured to obtain a decision result of the model to be evaluated through a preset decision algorithm according to each obtained evaluation result; and the judging unit 107 is used for taking the decision result as the quality evaluation result of the model to be evaluated.
Please refer to fig. 3, which is a schematic diagram of an embodiment of an electronic device according to the present application. Since the apparatus embodiments are substantially similar to the method embodiments, they are described in a relatively simple manner, and reference may be made to some of the descriptions of the method embodiments for relevant points. The device embodiments described below are merely illustrative.
An electronic device of the present embodiment includes: a display 101; a processor 102; and a memory 103 for storing a program for implementing a quality evaluation method of the prediction model, the apparatus performing the following steps after being powered on and running the program for the quality evaluation method of the prediction model: acquiring historical service objects with actual implementation effect data to form a historical service object set for model evaluation; obtaining evaluation results of various models to be evaluated by the historical data-based model evaluation algorithms through at least two preset historical data-based model evaluation algorithms according to the historical service object set; obtaining a decision result of the model to be evaluated through a preset decision algorithm according to each obtained evaluation result; and taking the decision result as a quality evaluation result of the model to be evaluated.
Corresponding to the quality evaluation method of the prediction model, the application also provides a quality evaluation method of a second prediction model.
Please refer to fig. 4, which is a flowchart illustrating an embodiment of a quality evaluation method of a second prediction model according to the present application. Since the present embodiment has been described in detail in the above-mentioned first method embodiment, the description is relatively simple, and relevant points can be found in a corresponding part (see second evaluation algorithm) in the first method embodiment. The method embodiments described below are merely illustrative.
The second quality evaluation method for the prediction model provided by the application comprises the following steps:
step S101: and acquiring a historical business object with actual implementation effect data.
Step S103: and aiming at each historical business object, taking the characteristic data of the historical business object as the input of a model to be evaluated, and calculating and acquiring the prediction implementation effect data of each historical business object through the model to be evaluated.
Step S105: sequencing each historical service object according to a preset sequencing standard of the predicted implementation effect data to obtain a first sequencing list of each historical service object; and sequencing the historical service objects according to the preset sequencing standard of the actual implementation effect data to obtain a second sequencing list of the historical service objects.
Step S107: and acquiring the similarity of the first sorted list and the second sorted list by a preset similarity calculation method.
Step S109: and determining the evaluation result of the model to be evaluated according to the similarity.
Step S101 of this method is the same as step S101 in the first method embodiment, and the detailed descriptions of step S103 to step S109 are given in the part of the first method embodiment concerning evaluation algorithm two.
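The text leaves the "preset similarity calculation method" open; Spearman's rank correlation is one natural choice, since it compares two orderings and yields a value in the [-1, 1] range mentioned for evaluation algorithm two. A sketch under that assumption (and assuming no tied effect values) follows.

```python
def spearman_similarity(predicted_effects, actual_effects):
    """Similarity in [-1, 1] between the ranking induced by the predicted
    implementation effect data and the one induced by the actual data.
    Assumes at least two objects and no tied values."""
    def ranks(values):
        # Rank 1 = best (largest) value, matching a "sort by effect" criterion.
        order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
        r = [0] * len(values)
        for rank, idx in enumerate(order, start=1):
            r[idx] = rank
        return r

    n = len(predicted_effects)
    rp, ra = ranks(predicted_effects), ranks(actual_effects)
    d_squared = sum((a - b) ** 2 for a, b in zip(rp, ra))
    return 1 - 6 * d_squared / (n * (n * n - 1))
```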
In the above embodiments, the present application provides a quality evaluation method of the second prediction model, and correspondingly, the present application also provides a quality evaluation device of the second prediction model. The apparatus corresponds to an embodiment of the method described above.
Please refer to fig. 5, which is a schematic diagram of an embodiment of a quality evaluation apparatus of a second prediction model of the present application. Since the apparatus embodiments are substantially similar to the method embodiments, they are described in a relatively simple manner, and reference may be made to some of the descriptions of the method embodiments for relevant points. The device embodiments described below are merely illustrative.
The second quality evaluation device for a prediction model according to the present embodiment includes: a historical data acquiring unit 101, configured to acquire a historical service object having actual implementation effect data; the model prediction unit 103 is used for taking the feature data of each historical business object as the input of a model to be evaluated, and calculating and acquiring the prediction implementation effect data of each historical business object through the model to be evaluated; a sorting unit 105, configured to sort each historical service object according to a preset sorting criterion of the predicted implementation effect data, and obtain a first sorting table of each historical service object; sequencing each historical service object according to the preset sequencing standard of the actual implementation effect data to obtain a second sequencing list of each historical service object; a similarity calculation unit 107, configured to obtain similarities of the first sorted list and the second sorted list by using a preset similarity calculation method; and the evaluation result judging unit 109 is configured to determine an evaluation result of the model to be evaluated according to the similarity.
Corresponding to the quality evaluation method of the prediction model, the application also provides a quality evaluation method of a third prediction model.
Please refer to fig. 6, which is a flowchart illustrating an embodiment of a quality evaluation method of a third prediction model provided in the present application. Since the present embodiment has been described in detail in the above-mentioned first method embodiment, the description is relatively simple, and relevant points can be found in the corresponding part (see the first evaluation algorithm) in the first method embodiment. The quality evaluation method of the third prediction model provided by the application comprises the following steps:
step S101: obtaining a business object which passes model auditing to be evaluated, and forming a business object set which passes the model auditing; and acquiring historical business objects with actual implementation effect data as a historical business object set.
Step S103: and according to the characteristic data of the business objects, acquiring the similarity between each business object included in the business object set which is audited through the model and each historical business object included in the historical business object set through a preset similarity algorithm.
Step S105: and aiming at each business object included in the business object set which passes model auditing, selecting a historical business object which is most similar to each business object included in the business object set which passes model auditing from the historical business object set according to the similarity, and forming a business object mapping set corresponding to the business object set which passes model auditing.
Step S107: and acquiring the average value of the actual implementation effect data of the historical service objects included in the service object mapping set according to the actual implementation effect data of each historical service object included in the historical service object set.
Step S109: and generating an evaluation result of the model to be evaluated according to the average value of the actual implementation effect data, a preset minimum average value threshold value and a preset expected value of the average value.
The step of obtaining the historical business objects with actual implementation effect data in step S101 of this method is the same as step S101 in the first method embodiment; the step of obtaining the business objects approved by the model to be evaluated in step S101, and the detailed descriptions of step S103 to step S109, are given in the part of the first embodiment concerning evaluation algorithm one.
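A sketch of the mapping-and-averaging steps (S103 to S107) in Python; the cosine similarity and the (feature vector, actual effect) pair representation are assumptions, since the text only prescribes "a preset similarity algorithm".

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length numeric feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def mapped_average_effect(approved_features, historical):
    """historical: list of (feature_vector, actual_effect) pairs.

    For each approved business object, select the most similar historical
    object, then average the actual effects of the mapped objects.
    """
    total = 0.0
    for features in approved_features:
        best_match = max(historical, key=lambda h: cosine_similarity(features, h[0]))
        total += best_match[1]
    return total / len(approved_features)
```

The returned average is then compared against the preset minimum average threshold and expected value to generate the evaluation result.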
In the above embodiments, a third quality evaluation method of a prediction model is provided, and correspondingly, the present application also provides a quality evaluation device of the third prediction model. The apparatus corresponds to an embodiment of the method described above.
Please refer to fig. 7, which is a schematic diagram of an embodiment of a quality evaluation apparatus of a third prediction model of the present application. Since the apparatus embodiments are substantially similar to the method embodiments, they are described in a relatively simple manner, and reference may be made to some of the descriptions of the method embodiments for relevant points. The device embodiments described below are merely illustrative.
The quality evaluation device of the third prediction model in this embodiment includes: the data acquisition unit 101 is configured to acquire a service object that passes model audit to be evaluated, and form a service object set that passes model audit; acquiring historical service objects with actual implementation effect data as a historical service object set; a similarity calculation unit 103, configured to obtain, according to the feature data of the service object, similarities between the service objects included in the service object set that is audited by the model and the historical service objects included in the historical service object set by using a preset similarity algorithm; a mapping unit 105, configured to select, for each service object included in the service object set that passes model audit, a historical service object that is most similar to each service object included in the service object set that passes model audit from the historical service object set according to the similarity, and form a service object mapping set corresponding to the service object set that passes model audit; an average implementation effect data calculation unit 107, configured to obtain, according to the actual implementation effect data of each historical service object included in the historical service object set, an average value of the actual implementation effect data of the historical service object included in the service object mapping set; and the evaluation result judging unit 109 is configured to generate an evaluation result of the model to be evaluated according to the average value of the actual implementation effect data, a preset minimum average value threshold, and a preset expected value of the average value.
In addition, the application also provides a quality evaluation method of a fourth prediction model. By applying this method, two models to be evaluated can be compared with each other, so as to determine the model with the better prediction effect. The core idea of the method is as follows: the relative quality of the two models is judged from the difference between the prediction results obtained when the two models respectively predict the same business objects.
Please refer to fig. 8, which is a flowchart illustrating an embodiment of a quality evaluation method for a fourth prediction model provided in the present application, wherein parts of the present embodiment that are the same as the first embodiment are not repeated, and please refer to corresponding parts in the first embodiment. The quality evaluation method of the fourth prediction model provided by the application comprises the following steps:
step S101: and acquiring a historical business object with actual implementation effect data.
This step is the same as step S101 in the first embodiment of the method, and is not described again here.
Step S103: aiming at each historical business object, taking the characteristic data of the historical business object as the input of a first model to be evaluated, obtaining first prediction implementation effect data of each historical business object through the first model to be evaluated, and obtaining a first historical business object set which passes model auditing according to the first prediction implementation effect data; and aiming at each historical business object, taking the characteristic data of the historical business object as the input of a second model to be evaluated, acquiring second prediction implementation effect data of each historical business object through the second model to be evaluated, and acquiring a historical business object set which is audited by a second passing model according to the second prediction implementation effect data.
In this step, each historical business object obtained in step S101 is audited by the first model to be evaluated and by the second model to be evaluated respectively; according to the prediction implementation effect data predicted by each model, a first set of historical business objects passing model audit (those approved by the first model to be evaluated) and a second set of historical business objects passing model audit (those approved by the second model to be evaluated) are obtained.
Step S105: and generating a first relative complement of the historical business object set which is subjected to model examination by the second pass model in the historical business object set which is subjected to model examination by the first pass model and a second relative complement of the historical business object set which is subjected to model examination by the first pass model in the historical business object set which is subjected to model examination by the second pass model.
After the first and second sets of historical business objects passing model audit are obtained, a first relative complement of the second set in the first set is generated, and a second relative complement of the first set in the second set is generated.
The business objects included in the first relative complement set belong to the historical business object set which is audited by the first passing model, but do not belong to the historical business object set which is audited by the second passing model; the business objects included in the second relative complement set belong to the historical business object set which is audited by the second passing model, but do not belong to the historical business object set which is audited by the first passing model.
Step S107: and obtaining a statistical value of first actual implementation effect data of the historical service object included in the first relative complement set and a statistical value of second actual implementation effect data of the historical service object included in the second relative complement set according to the actual implementation effect data of the historical service object.
And for the two relative complement sets generated in the last step, respectively acquiring a statistical value of first actual implementation effect data of the historical service object included in the first relative complement set and a statistical value of second actual implementation effect data of the historical service object included in the second relative complement set according to actual implementation effect data of the historical service object.
The statistical value includes, but is not limited to, an average value, and may be other statistical values capable of characterizing the difference between the relative complement sets.
Step S109: acquiring an evaluation score of the first model to be evaluated according to the statistical value of the first actual implementation effect data and the statistical values of the actual implementation effect data of all the selected historical business objects; and acquiring the evaluation score of the second model to be evaluated according to the statistical value of the second actual implementation effect data and the statistical values of the actual implementation effect data of all the selected historical business objects.
In this step, the actual implementation effect reflected by each of the two relative complement sets (i.e. the statistical value of the first actual implementation effect data and the statistical value of the second actual implementation effect data) is compared with the statistical value of the actual implementation effect data of all the historical business objects selected in step S101, so as to obtain the evaluation scores of the two models.
In specific implementation, the following formula can be adopted to calculate the evaluation scores of the two models to be evaluated:
score = 50 + 50 * (Vave1 - Vave) / Vave
wherein score is the evaluation score of the model to be evaluated, with a value range of [0, 100]; Vave1 is the average actual implementation effect data of the relative complement set; and Vave is the average actual implementation effect data of all the selected historical business objects.
Step S110: and determining a model for prediction according to the evaluation score and a preset selection rule.
Finally, according to the evaluation score of the first model to be evaluated and the evaluation score of the second model to be evaluated, the model with the better prediction effect is determined by a preset selection rule and used as the model for prediction. The preset selection rule includes but is not limited to: taking the model to be evaluated with the higher evaluation score as the model for prediction.
Taking a commodity audit model applied to a cost-effective auction service as an example, the two models to be compared are a currently used commodity audit model and a newly generated version of the commodity audit model. The two versions of the commodity audit model are used to perform simulated audits on the same commodities that have already produced pit-position output (for example, 100 commodities). Assume that the currently used commodity audit model approves 70 commodities (the first set of historical business objects passing model audit) and the new-version commodity audit model approves 60 commodities (the second set of historical business objects passing model audit). A first relative complement set is generated, comprising the commodities that belong to the 70 commodities but not to the 60 commodities, and a second relative complement set is generated, comprising the commodities that belong to the 60 commodities but not to the 70 commodities. The evaluation score of the currently used commodity audit model and the evaluation score of the new-version commodity audit model are then calculated from the actual pit-position output of the commodities, and finally the model with the higher score is taken as the model finally applied.
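An end-to-end Python sketch of this comparison on the example above; the `approves` audit method, the `id` and `effect` fields, the neutral score for an empty complement, and the tie-breaking rule are all assumptions for the example.

```python
def compare_models(objects, model_a, model_b):
    """Return the model with the higher relative-complement score.

    objects: historical business objects, each assumed to carry an `id` and
    an actual implementation effect `effect`; each model is assumed to
    expose an `approves(obj) -> bool` simulated audit.
    """
    set_a = {o.id for o in objects if model_a.approves(o)}
    set_b = {o.id for o in objects if model_b.approves(o)}
    only_a, only_b = set_a - set_b, set_b - set_a  # first and second relative complements
    effects = {o.id: o.effect for o in objects}
    v_all = sum(effects.values()) / len(effects)   # average effect of all selected objects

    def score(ids):
        if not ids:
            return 50.0  # assumption: an empty complement is scored as neutral
        v = sum(effects[i] for i in ids) / len(ids)
        return 50 + 50 * (v - v_all) / v_all

    # Assumption: ties go to model_a; the text only says the higher score wins.
    return model_a if score(only_a) >= score(only_b) else model_b
```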
In the above embodiment, a fourth quality evaluation method of a prediction model is provided, and correspondingly, the present application also provides a quality evaluation apparatus of the fourth prediction model. The apparatus corresponds to an embodiment of the method described above.
Please refer to fig. 9, which is a schematic diagram of an embodiment of a quality evaluation apparatus of a fourth prediction model of the present application. Since the apparatus embodiments are substantially similar to the method embodiments, they are described in a relatively simple manner, and reference may be made to some of the descriptions of the method embodiments for relevant points. The device embodiments described below are merely illustrative.
The quality evaluation device of the fourth prediction model according to the present embodiment includes: a historical data acquiring unit 101, configured to acquire a historical service object having actual implementation effect data; the model prediction unit 103 is configured to, for each historical business object, use feature data of the historical business object as input of a first model to be evaluated, obtain first prediction implementation effect data of each historical business object through the first model to be evaluated, and obtain a first set of historical business objects that are audited by a model according to the first prediction implementation effect data; and for each historical business object, taking the characteristic data of the historical business object as the input of a second model to be evaluated, acquiring second prediction implementation effect data of each historical business object through the second model to be evaluated, and acquiring a historical business object set which is audited by a second passing model according to the second prediction implementation effect data; a relative complement generating unit 105, configured to generate a first relative complement of the historical service object set subjected to model verification by the second model in the historical service object set subjected to model verification by the first model, and a second relative complement of the historical service object set subjected to model verification by the first model in the historical service object set subjected to model verification by the second model; a statistical value calculating unit 107, configured to obtain, according to the actual implementation effect data of the historical service object, a statistical value of first actual implementation effect data of the historical service object included in the first relative complement set, and a statistical value of second actual implementation effect data of the historical service object included in the second relative complement set; an evaluation score obtaining unit 109, configured to obtain an evaluation score of the first model to be evaluated according to the statistical value of the first actual implementation effect data and the statistical values of the actual implementation effect data of all selected historical business objects; acquiring the evaluation score of the second model to be evaluated according to the statistical value of the second actual implementation effect data and the statistical values of the actual implementation effect data of all the selected historical business objects; and the model selecting unit 110 is configured to determine a model for prediction according to the evaluation score and a preset selection rule.
Although the present application has been described with reference to the preferred embodiments, they are not intended to limit the present application. Those skilled in the art can make variations and modifications without departing from the spirit and scope of the present application; therefore, the scope of protection of the present application should be determined by the appended claims.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include forms of volatile memory in a computer readable medium, such as Random Access Memory (RAM), and/or non-volatile memory, such as Read Only Memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), Digital Versatile Discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device. As defined herein, computer readable media does not include transitory computer readable media (transitory media), such as modulated data signals and carrier waves.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

Claims (23)

1. A quality evaluation method of a prediction model is characterized by comprising the following steps:
acquiring historical service objects with actual implementation effect data to form a historical service object set for model evaluation;
obtaining evaluation results of various models to be evaluated by the historical data-based model evaluation algorithms through at least two preset historical data-based model evaluation algorithms according to the historical service object set;
obtaining a decision result of the model to be evaluated through a preset decision algorithm according to each obtained evaluation result, wherein the decision result refers to a decision result of the model to be evaluated obtained by comparing an actual implementation result with a predicted implementation result, and the decision result comprises the model to be evaluated passing the quality decision and the model to be evaluated failing the quality decision;
and taking the decision result as a quality evaluation result of the model to be evaluated.
2. The method for quality evaluation of a prediction model according to claim 1, wherein the model evaluation algorithm based on historical data comprises:
obtaining a business object which passes model auditing to be evaluated, and forming a business object set which passes the model auditing;
according to the characteristic data of the business object, obtaining the similarity between each business object included in the business object set which is audited through the model and each historical business object included in the historical business object set through a preset similarity algorithm;
aiming at each business object included in the business object set which passes model auditing, selecting a historical business object which is most similar to each business object included in the business object set which passes model auditing from the historical business object set according to the similarity, and forming a business object mapping set corresponding to the business object set which passes model auditing;
acquiring an average value of the actual implementation effect data of the historical service objects included in the service object mapping set according to the actual implementation effect data of each historical service object included in the historical service object set;
and generating an evaluation result of the model to be evaluated according to the average value of the actual implementation effect data, a preset minimum average value threshold value and a preset expected value of the average value.
3. The method for quality evaluation of a prediction model according to claim 1, wherein the model evaluation algorithm based on historical data comprises:
aiming at each historical business object, taking the characteristic data of the historical business object as the input of a model to be evaluated, and calculating and acquiring the prediction implementation effect data of each historical business object through the model to be evaluated;
sequencing each historical service object according to a preset sequencing standard of the predicted implementation effect data to obtain a first sequencing list of each historical service object; sequencing each historical service object according to the preset sequencing standard of the actual implementation effect data to obtain a second sequencing list of each historical service object;
acquiring the similarity of the first sorted list and the second sorted list by a preset similarity algorithm;
and determining the evaluation result of the model to be evaluated according to the similarity.
4. The method for quality evaluation of a prediction model according to claim 1, wherein the model evaluation algorithm based on historical data comprises:
aiming at each historical business object, taking the characteristic data of the historical business object as the input of a model to be evaluated, obtaining first prediction implementation effect data of each historical business object through the model to be evaluated, and obtaining a first historical business object set which passes model audit according to the first prediction implementation effect data; and aiming at each historical business object, taking the characteristic data of the historical business object as the input of a reference model, obtaining second prediction implementation effect data of each historical business object through the reference model, and obtaining a historical business object set checked by a second passing model according to the second prediction implementation effect data;
generating a first relative complement of the historical service object set subjected to model examination by the second pass model in the historical service object set subjected to model examination by the first pass model and a second relative complement of the historical service object set subjected to model examination by the first pass model in the historical service object set subjected to model examination by the second pass model;
according to the actual implementation effect data of the historical service object, acquiring a statistical value of first actual implementation effect data of the historical service object included in the first relative complement set and a statistical value of second actual implementation effect data of the historical service object included in the second relative complement set;
acquiring an evaluation score of the model to be evaluated according to the statistical value of the first actual implementation effect data and the statistical values of the actual implementation effect data of all the selected historical business objects; and obtaining the evaluation score of the reference model according to the statistical value of the second actual implementation effect data and the statistical values of the actual implementation effect data of all the selected historical business objects;
and determining an evaluation result of the model to be evaluated according to the evaluation score of the model to be evaluated and the evaluation score of the reference model.
5. The method of claim 1, wherein the predetermined decision algorithm is a voting decision algorithm, a statistical decision algorithm, or a weighted decision algorithm.
6. The method of claim 5, wherein the voting decision algorithm comprises:
generating a normalization score of each evaluation result through an evaluation result normalization algorithm respectively corresponding to each model evaluation algorithm;
obtaining the quantity of the results passing the quality evaluation according to the normalization score of each evaluation result and a preset minimum normalization score threshold value corresponding to each model evaluation algorithm;
and if the result quantity is larger than the minimum voting number threshold value, judging that the decision result is that the model to be evaluated passes the quality decision.
7. The method of claim 5, wherein the statistical value decision algorithm comprises:
generating a normalization score of each evaluation result through an evaluation result normalization algorithm respectively corresponding to each model evaluation algorithm;
and if the statistic value of the normalization score is larger than the minimum statistic value threshold, judging that the decision result is that the model to be evaluated passes the quality decision.
8. The method of claim 7, wherein the statistical value comprises an average value.
9. The method of claim 5, wherein the weighted decision algorithm comprises:
generating a normalization score of each evaluation result through an evaluation result normalization algorithm respectively corresponding to each model evaluation algorithm;
generating a comprehensive score of the normalized score according to weights preset for evaluation results obtained by each model evaluation algorithm;
and if the comprehensive score is larger than the minimum comprehensive score threshold value, judging that the decision result is that the model to be evaluated passes the quality decision.
10. The method for quality evaluation of a predictive model according to claim 1, wherein the predetermined evaluation algorithm based on historical data comprises a specific evaluation algorithm customized for the model to be evaluated.
11. The method for quality assessment of a prediction model according to claim 1, wherein said at least two pre-defined historical data based model evaluation algorithms are configurable.
12. The method of evaluating the quality of a prediction model according to claim 11, further comprising:
and setting the at least two preset historical data-based model evaluation algorithms applied by the method.
13. The method of claim 1, wherein the predetermined decision algorithm is configurable.
14. An apparatus for evaluating the quality of a prediction model, comprising:
the historical data acquisition unit is used for acquiring historical business objects with actual implementation effect data and forming a historical business object set used for model evaluation;
the evaluation unit is used for obtaining evaluation results of various models to be evaluated by the historical data based model evaluation algorithms through at least two preset historical data based model evaluation algorithms according to the historical service object set;
the decision unit is used for obtaining a decision result of the model to be evaluated through a preset decision algorithm according to each obtained evaluation result, wherein the decision result refers to a decision result of the model to be evaluated obtained by comparing an actual implementation result with a predicted implementation result, and the decision result comprises the model to be evaluated passing the quality decision and the model to be evaluated failing the quality decision;
and the judging unit is used for taking the decision result as the quality evaluation result of the model to be evaluated.
15. An electronic device, comprising:
a display;
a processor; and
a memory for storing a program for implementing a quality evaluation method of a prediction model, the apparatus performing the following steps after being powered on and running the program for the quality evaluation method of the prediction model: acquiring historical service objects with actual implementation effect data to form a historical service object set for model evaluation; obtaining evaluation results of various models to be evaluated by the historical data-based model evaluation algorithms through at least two preset historical data-based model evaluation algorithms according to the historical service object set; obtaining a decision result of the model to be evaluated through a preset decision algorithm according to each obtained evaluation result; taking the decision result as a quality evaluation result of the model to be evaluated;
the decision result refers to a decision result of the model to be evaluated, which is obtained by comparing an actual implementation result with a predicted implementation result, wherein the decision result comprises the model to be evaluated passing the quality decision and the model to be evaluated failing the quality decision.
16. A quality evaluation method of a prediction model is characterized by comprising the following steps:
acquiring a historical service object with actual implementation effect data;
aiming at each historical business object, taking the characteristic data of the historical business object as the input of a model to be evaluated, and calculating and acquiring the prediction implementation effect data of each historical business object through the model to be evaluated;
sequencing each historical service object according to a preset sequencing standard of the predicted implementation effect data to obtain a first sequencing list of each historical service object; sequencing each historical service object according to the preset sequencing standard of the actual implementation effect data to obtain a second sequencing list of each historical service object;
acquiring the similarity of the first sorted list and the second sorted list by a preset similarity algorithm;
determining an evaluation result of the model to be evaluated according to the similarity;
and the evaluation result comprises the passing quality decision of the model to be evaluated and the failing quality decision of the model to be evaluated.
17. An apparatus for evaluating the quality of a prediction model, comprising:
a historical data acquisition unit for acquiring a historical service object with actual implementation effect data;
the model prediction unit is used for taking the characteristic data of each historical business object as the input of a model to be evaluated and calculating and acquiring the prediction implementation effect data of each historical business object through the model to be evaluated;
the sorting unit is used for sorting the historical business objects according to the preset sorting standard of the predicted implementation effect data to obtain a first sorting list of the historical business objects; sequencing each historical service object according to the preset sequencing standard of the actual implementation effect data to obtain a second sequencing list of each historical service object;
the similarity calculation unit is used for acquiring the similarity of the first sorted list and the second sorted list by a preset similarity calculation method;
and the evaluation result judging unit is used for determining the evaluation result of the model to be evaluated according to the similarity, and the evaluation result comprises the passing quality decision of the model to be evaluated and the failing quality decision of the model to be evaluated.
18. A quality evaluation method of a prediction model is characterized by comprising the following steps:
obtaining a business object which passes model auditing to be evaluated, and forming a business object set which passes the model auditing; acquiring historical service objects with actual implementation effect data as a historical service object set;
according to the characteristic data of the business object, obtaining the similarity between each business object included in the business object set which is audited through the model and each historical business object included in the historical business object set through a preset similarity algorithm;
aiming at each business object included in the business object set which passes model auditing, selecting a historical business object which is most similar to each business object included in the business object set which passes model auditing from the historical business object set according to the similarity, and forming a business object mapping set corresponding to the business object set which passes model auditing;
acquiring an average value of the actual implementation effect data of the historical service objects included in the service object mapping set according to the actual implementation effect data of each historical service object included in the historical service object set;
generating an evaluation result of the model to be evaluated according to the average value of the actual implementation effect data, a preset minimum average value threshold value and a preset expected value of the average value;
and the evaluation result comprises the passing quality decision of the model to be evaluated and the failing quality decision of the model to be evaluated.
19. An apparatus for evaluating the quality of a prediction model, comprising:
the data acquisition unit is used for acquiring the business objects which pass the model to be evaluated and audited to form a business object set which passes the model audit; acquiring historical service objects with actual implementation effect data as a historical service object set;
the similarity calculation unit is used for acquiring the similarity between each business object included in the business object set which is audited through the model and each historical business object included in the historical business object set through a preset similarity calculation method according to the characteristic data of the business object;
a mapping unit, configured to select, for each service object included in the service object set that passes model audit, a historical service object that is most similar to each service object included in the service object set that passes model audit from the historical service object set according to the similarity, and form a service object mapping set corresponding to the service object set that passes model audit;
an average implementation effect data calculation unit, configured to obtain, according to the actual implementation effect data of each historical service object included in the historical service object set, an average value of the actual implementation effect data of the historical service object included in the service object mapping set;
and the evaluation result judging unit is used for generating an evaluation result of the model to be evaluated according to the average value of the actual implementation effect data, a preset minimum average value threshold value and a preset expected value of the average value, wherein the evaluation result comprises a passing quality decision of the model to be evaluated and a failing quality decision of the model to be evaluated.
20. A quality evaluation method of a prediction model is characterized by comprising the following steps:
acquiring a historical service object with actual implementation effect data;
aiming at each historical business object, taking the characteristic data of the historical business object as the input of a first model to be evaluated, obtaining first prediction implementation effect data of each historical business object through the first model to be evaluated, and obtaining a first historical business object set which passes model auditing according to the first prediction implementation effect data; and for each historical business object, taking the characteristic data of the historical business object as the input of a second model to be evaluated, acquiring second prediction implementation effect data of each historical business object through the second model to be evaluated, and acquiring a historical business object set which is audited by a second passing model according to the second prediction implementation effect data;
generating a first relative complement of the historical service object set subjected to model examination by the second pass model in the historical service object set subjected to model examination by the first pass model and a second relative complement of the historical service object set subjected to model examination by the first pass model in the historical service object set subjected to model examination by the second pass model;
according to the actual implementation effect data of the historical service object, acquiring a statistical value of first actual implementation effect data of the historical service object included in the first relative complement set and a statistical value of second actual implementation effect data of the historical service object included in the second relative complement set;
acquiring an evaluation score of the first model to be evaluated according to the statistical value of the first actual implementation effect data and the statistical values of the actual implementation effect data of all the selected historical business objects; acquiring the evaluation score of the second model to be evaluated according to the statistical value of the second actual implementation effect data and the statistical values of the actual implementation effect data of all the selected historical business objects;
and determining a model for prediction according to the evaluation score and a preset selection rule.
21. The method of claim 20, wherein the predetermined selection rule comprises:
and taking the model to be evaluated with the high evaluation score as a model for prediction.
22. The method of claim 20, wherein the statistical value comprises an average value.
23. An apparatus for evaluating the quality of a prediction model, comprising:
a historical data acquisition unit for acquiring a historical service object with actual implementation effect data;
the model prediction unit is used for taking the feature data of the historical business objects as the input of a first model to be evaluated aiming at each historical business object, acquiring first prediction implementation effect data of each historical business object through the first model to be evaluated, and acquiring a first historical business object set which passes model auditing according to the first prediction implementation effect data; and for each historical business object, taking the characteristic data of the historical business object as the input of a second model to be evaluated, acquiring second prediction implementation effect data of each historical business object through the second model to be evaluated, and acquiring a historical business object set which is audited by a second passing model according to the second prediction implementation effect data;
a relative complement generating unit, configured to generate a first relative complement of the historical service object set subjected to model verification by the second model in the historical service object set subjected to model verification by the first model, and a second relative complement of the historical service object set subjected to model verification by the first model in the historical service object set subjected to model verification by the second model;
a statistical value calculating unit, configured to obtain, according to the actual implementation effect data of the historical business objects, a statistical value of the first actual implementation effect data of the historical business objects included in the first relative complement and a statistical value of the second actual implementation effect data of the historical business objects included in the second relative complement;
an evaluation score obtaining unit, configured to obtain an evaluation score of the first model to be evaluated according to the statistical value of the first actual implementation effect data and the statistical value of the actual implementation effect data of all the selected historical business objects, and to obtain an evaluation score of the second model to be evaluated according to the statistical value of the second actual implementation effect data and the statistical value of the actual implementation effect data of all the selected historical business objects;
and a model selection unit, configured to determine a model for prediction according to the evaluation scores and a preset selection rule.
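
Claim 23 packages the same flow into cooperating units. A rough structural sketch, again under assumed names and reusing evaluate_pair from the sketch above, could mirror the claimed apparatus as follows.

class PredictionModelEvaluator:
    # Structural sketch of the apparatus of claim 23; the comments name the
    # claimed unit each piece stands in for. history_source.fetch() is an
    # assumed accessor, not part of the patent.
    def __init__(self, history_source, model_a, model_b, threshold):
        self.history_source = history_source  # historical data acquisition unit
        self.model_a = model_a
        self.model_b = model_b
        self.threshold = threshold

    def select_model(self):
        # The model prediction, relative complement generating, statistical
        # value calculating, and evaluation score obtaining units collapse
        # into evaluate_pair for brevity.
        objects = self.history_source.fetch()
        score_a, score_b = evaluate_pair(self.model_a, self.model_b,
                                         objects, self.threshold)
        # Model selection unit: the preset rule of claim 21 (higher score wins).
        return self.model_a if score_a >= score_b else self.model_b
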
CN201610214413.9A 2016-04-07 2016-04-07 Quality evaluation method and device of prediction model and electronic equipment Active CN107274043B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610214413.9A CN107274043B (en) 2016-04-07 2016-04-07 Quality evaluation method and device of prediction model and electronic equipment

Publications (2)

Publication Number Publication Date
CN107274043A CN107274043A (en) 2017-10-20
CN107274043B CN107274043B (en) 2021-05-07

Family

ID=60052431

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610214413.9A Active CN107274043B (en) 2016-04-07 2016-04-07 Quality evaluation method and device of prediction model and electronic equipment

Country Status (1)

Country Link
CN (1) CN107274043B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109615187B (en) * 2018-11-20 2023-06-02 创新先进技术有限公司 OD matrix evaluation method, bus load simulation method and device
CN109947651B (en) * 2019-03-21 2022-08-02 上海智臻智能网络科技股份有限公司 Artificial intelligence engine optimization method and device
CN110782129B (en) * 2019-09-29 2023-11-17 中国银联股份有限公司 Business progress monitoring method, device and system and computer readable storage medium

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101556464A (en) * 2009-05-22 2009-10-14 天津大学 Auto recommending method of urban power load forecasting module based on associative rules
CN103592374A (en) * 2013-11-18 2014-02-19 国家电网公司 Transformer oil chromatographic data forecasting method based on D-S evidence theory
CN103617466A (en) * 2013-12-13 2014-03-05 李敬泉 Comprehensive evaluation method for commodity demand predication model
CN103985055A (en) * 2014-05-30 2014-08-13 西安交通大学 Stock market investment decision-making method based on network analysis and multi-model fusion
CN104091226A (en) * 2014-06-17 2014-10-08 电子科技大学 Information model quality evaluation method for operator OSS domain
CN104794327A (en) * 2015-05-06 2015-07-22 西安科技大学 Multi-model mine roof safety early warning model based on decision fusion
CN104850901A (en) * 2015-04-27 2015-08-19 辽宁工程技术大学 Soft measurement method and soft measurement system for predicting gas concentration based on multiple models
CN105046384A (en) * 2015-09-19 2015-11-11 东北电力大学 Wind-electricity power real-time prediction computing method for multi-output model
CN105095396A (en) * 2015-07-03 2015-11-25 北京京东尚科信息技术有限公司 Model establishment method, quality assessment method and device

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104301895A (en) * 2014-09-28 2015-01-21 北京邮电大学 Double-layer trigger intrusion detection method based on flow prediction
CN105095411B (en) * 2015-07-09 2018-08-24 中山大学 A kind of APP rankings prediction technique and system based on APP mass

Also Published As

Publication number Publication date
CN107274043A (en) 2017-10-20

Similar Documents

Publication Publication Date Title
CN108921569B (en) Method and device for determining complaint type of user
US20200090268A1 (en) Method and apparatus for determining level of risk of user, and computer device
CN111932269B (en) Equipment information processing method and device
AU2016328959A1 (en) Updating attribute data structures to indicate trends in attribute data provided to automated modeling systems
CN110457577B (en) Data processing method, device, equipment and computer storage medium
CN112966189B (en) Fund product recommendation system
CN106709318A (en) Recognition method, device and calculation equipment for user equipment uniqueness
CN107274043B (en) Quality evaluation method and device of prediction model and electronic equipment
CN110930218A (en) Method and device for identifying fraudulent customer and electronic equipment
CN111242319A (en) Model prediction result interpretation method and device
CN113407854A (en) Application recommendation method, device and equipment and computer readable storage medium
CN111754287A (en) Article screening method, apparatus, device and storage medium
CN107633332A (en) A kind of Electronic Finance resource prediction method and system
CN118134652A (en) Asset configuration scheme generation method and device, electronic equipment and medium
CN111160929B (en) Method and device for determining client type
WO2017176258A1 (en) Simulation based on precomputed results of the simulation
CN112926991B (en) Method and system for grading severity level of cash-out group
CN115034400B (en) Service data processing method and device, electronic equipment and storage medium
CN114331164B (en) Maturity evaluation method and device for learning management system and electronic equipment
CN113469374B (en) Data prediction method, device, equipment and medium
CN114331164A (en) Learning management system maturity evaluation method and device and electronic equipment
Svensson Rank-Based Selection Strategies for Forecast Combinations: An Evaluation Study
CN114820194A (en) Transaction risk assessment method and device for financial product and storage medium
CN118096342A (en) Account object data processing method, device, computer equipment and storage medium
CN114722907A (en) Method and device for generating game property supply and demand change prediction model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20211115

Address after: Room 507, floor 5, building 3, No. 969, Wenyi West Road, Wuchang Street, Yuhang District, Hangzhou City, Zhejiang Province

Patentee after: Zhejiang Tmall Technology Co., Ltd.

Address before: P.O. Box 847, 4th floor, Grand Cayman capital building, British Cayman Islands

Patentee before: Alibaba Group Holding Limited
