CN118035110A

CN118035110A - Informationized model evaluation method and system based on big data

Info

Publication number: CN118035110A
Application number: CN202410247405.9A
Authority: CN
Inventors: 闫丽
Original assignee: Beijing Yijiubaba Electric Power Technology Development Co ltd
Current assignee: Beijing Yijiubaba Electric Power Technology Development Co ltd
Priority date: 2024-03-05
Filing date: 2024-03-05
Publication date: 2024-05-14
Anticipated expiration: 2044-03-05
Also published as: CN118035110B

Abstract

The invention discloses an informationized model evaluation method and system based on big data, and relates to the technical field of model evaluation. The method comprises the following steps: importing a model to be evaluated into an informationized model evaluation system and setting evaluation configuration items; generating, by the informationized model evaluation system, an evaluation set according to the imported model and the evaluation configuration items; performing, by the informationized model evaluation system, a primary analysis of the imported model using the evaluation set to generate an evaluation index set; and performing a secondary analysis of the imported model and generating a model evaluation report based on the generated evaluation index set and the analysis result. Drawing on data from each application scene in each field, the invention provides data support for different informationized models, unifies data evaluation within the system, generates indexes, ensures the fairness of model evaluation, and provides references for optimization.

Description

Informationized model evaluation method and system based on big data

Technical Field

The invention relates to the technical field of model evaluation, in particular to an informationized model evaluation method and system based on big data.

Background

With the wide application of big data technology, informationized models have become the mainstream of data analysis and processing. For example, GPT and the intelligent assistants built into various applications are all informationized models that can reply quickly to a user's request, and there are also informationized models for prediction and decision-making. How to evaluate these models remains an open question: the technical field of evaluating the practicality and accuracy of models is still largely unaddressed, and no unified evaluation system is available for evaluating these different types of models.

Disclosure of Invention

The invention provides an informationized model evaluation method based on big data, which comprises the following steps:

Step1, importing a model to be evaluated into an informationized model evaluation system, and setting an evaluation configuration item;

Step2, the informationized model evaluation system generates an evaluation set according to the imported model and the evaluation configuration item;

Step3, the informationized model evaluation system performs a primary analysis of the imported model using the evaluation set to generate an evaluation index set;

Step4, performing secondary analysis on the imported model, and generating a model evaluation report based on the generated evaluation index set and the analysis result.

The informationized model evaluation method based on big data, as described above, wherein the configuration items in the evaluation configuration item page comprise application field, purpose, data scene, base model and test case; the configuration items are selected or filled in through single-choice controls, multiple-choice controls and text boxes, and after configuration is completed, a configuration file is formed and shares an identification ID with the imported model.

The informationized model evaluation method based on big data, as described above, wherein an evaluation set is generated according to an imported model and an evaluation configuration item, specifically comprises the following sub-steps:

inquiring the corresponding evaluation configuration item content according to the identification ID of the imported model;

Inquiring corresponding test data according to the classification identification of the content of the evaluation configuration item;

And constructing a data scene of the evaluation model based on the test data, and arranging and outputting the data scene as an evaluation set.

The informationized model evaluation method based on big data, as described above, wherein performing the primary analysis of the imported model using the evaluation set to generate the evaluation index set specifically comprises the following sub-steps:

According to the content of the input set of the test case, extracting corresponding data from the evaluation set, inputting the data into the model, and storing an output result as a verification set;

extracting corresponding data from the evaluation set according to the content of the test case output set and storing the data as a comparison set;

and determining an evaluation standard of the model according to the model application, taking data of the evaluation standard, the verification set and the comparison set as parameters of an evaluation index value calculation function, and outputting an evaluation index set.

The informationized model evaluation method based on big data, as described above, wherein the imported model is subjected to secondary analysis, and a model evaluation report is generated based on the generated evaluation index set and the analysis result, specifically comprising the following sub-steps:

analyzing various model parameters according to the characteristics of a base model adopted by the model to be tested;

The relevance of each model parameter and each evaluation index in the evaluation index set is analyzed;

and generating a model evaluation report based on the relevance of the model parameters and the evaluation indexes.

The informationized model evaluation method based on big data, as described above, wherein each model parameter is resolved according to the characteristics of a base model adopted by a model to be tested, specifically comprises the following sub-steps:

intercepting model data items around basic model parameters in an extending mode to serve as monitoring items;

Along with the input of the input set, the interception range of the monitoring item is gradually reduced until the value of the monitoring item is not changed any more, and the parameters except the basic model parameters in the monitoring item are extracted to be used as additional model parameters of the evaluation model.

The informationized model evaluation method based on big data, as described above, wherein the correlation between each model parameter and each evaluation index in the evaluation index set is analyzed, specifically comprises the following sub-steps:

in the input process of the input set, each model parameter is finely adjusted;

and comparing the changes of each evaluation index set before and after the model parameter adjustment, and analyzing the relevance of the model parameter and the evaluation index.

The invention also provides an informationized model evaluation system based on big data, which comprises: a model importing module, an evaluation set generating module, a model analyzing module and an evaluation result displaying module;

the model importing module is used for importing a model to be evaluated into the informationized model evaluation system and setting an evaluation configuration item;

The evaluation set generation module is used for generating an evaluation set according to the imported model and the evaluation configuration item;

the model analysis module is used for analyzing the imported model by using the evaluation set to obtain various model parameters;

and the evaluation result display module is used for generating a model evaluation report based on the relevance of the model parameters and the evaluation indexes.

In the informationized model evaluation system based on big data as described above, the model analysis module comprises a primary analysis sub-module and a secondary analysis sub-module;

The primary analysis sub-module is used for generating an evaluation index set;

and the secondary analysis sub-module is used for analyzing the relevance between the model parameters and the evaluation indexes.

The beneficial effects achieved by the invention are as follows: based on the data of each application scene in each field, data assurance is provided for different informationized models, data evaluation in the system is unified, indexes are generated, fairness of model evaluation is guaranteed, and meanwhile optimization references are provided.

Drawings

In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or of the prior art are briefly introduced below. The drawings described below obviously show only some embodiments of the present invention, and a person having ordinary skill in the art may obtain other drawings from them.

Fig. 1 is a flowchart of an informationized model evaluation method based on big data according to an embodiment of the present invention.

Detailed Description

The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

Example 1

As shown in fig. 1, a first embodiment of the present invention provides an informationized model evaluation method based on big data, including:

Step S10: importing a model to be evaluated into an informationized model evaluation system, and setting an evaluation configuration item;

The home page of the informationized model evaluation system provides a model import box that supports informationized models developed in multiple languages such as Python, C, C# and C++. After the import is completed, the system jumps to the evaluation configuration item page, which is used to configure the details of the informationized model. The configuration items on this page include: application field, purpose, data scene, base model and test case. The configuration items are selected or filled in through single-choice controls, multiple-choice controls and text boxes. After configuration is completed, a configuration file is formed that shares one identification ID with the imported model; the identification ID may be generated by the snowflake algorithm or by an auto-increment sequence, as long as it serves as a unique identifier.
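The pairing of an imported model with its configuration file under one identification ID can be sketched as follows. This is a minimal illustration, not the system's actual implementation: the `EvaluationConfig` class, the `import_model` function, the record layout and the sample values are all hypothetical, and an auto-increment counter stands in for whichever unique-ID scheme is used.

```python
import itertools
from dataclasses import dataclass, field

# Hypothetical auto-increment ID source. A snowflake-style generator
# would serve equally well, as long as every ID is unique.
_id_counter = itertools.count(1)


@dataclass
class EvaluationConfig:
    """Hypothetical configuration file built from the configuration item page."""
    model_id: int                  # identification ID shared with the imported model
    application_field: str
    purpose: str
    data_scenes: list
    base_model: str
    test_case: dict = field(default_factory=dict)


def import_model(model_path, **config_items):
    """Register a model and its configuration under one shared ID."""
    model_id = next(_id_counter)
    model_record = {"model_id": model_id, "path": model_path}
    config = EvaluationConfig(model_id=model_id, **config_items)
    return model_record, config


model, config = import_model(
    "wind_power_model.py",         # hypothetical file name
    application_field="wind power",
    purpose="prediction",
    data_scenes=["normal operation"],
    base_model="linear",
    test_case={"input_fields": ["wind_speed"], "output_fields": ["power"]},
)
```

Because both records carry the same `model_id`, later steps can look up the configuration from the model alone.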

Step S20: the informationized model evaluation system generates an evaluation set according to the imported model and the evaluation configuration item;

No informationized model is good or bad in itself; a model performs properly only in a suitable data scene. The informationized model evaluation system maintains a large amount of test data for building the data scene of the model under evaluation. The data cover every field and every data scene and are marked with different classification identifiers, which are consistent with the contents of the configuration items on the evaluation configuration item page so as to ensure the accuracy of the generated evaluation set. The method comprises the following steps:

1. inquiring the corresponding evaluation configuration item content according to the identification ID of the imported model;

2. inquiring corresponding test data according to the classification identification of the content of the evaluation configuration item;

3. Constructing a data scene of an evaluation model based on the test data, and sorting and outputting the data scene as an evaluation set;

Taking a wind power prediction model as an example, the data types of the input set and the output set are extracted according to the test case in the evaluation configuration items, and the corresponding data are extracted from the test data according to these data types to form the evaluation set. In addition, data related to wind power prediction are queried according to the contents of the application field and data scene configuration items, including scene data that the model encounters in normal use, such as weather data, wind power equipment data and wind power data, and these scene data are added to the evaluation set.
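The three sub-steps of Step S20 can be sketched as a lookup followed by a filter. The in-memory `CONFIGS` and `TEST_DATA` stores, the record keys and the sample values below are assumptions made for illustration; the source does not specify a storage layer or a sort order.

```python
# Hypothetical in-memory stores standing in for the system's databases.
CONFIGS = {
    1: {"application_field": "wind power", "data_scene": "normal operation"},
}
TEST_DATA = [
    {"field": "wind power", "scene": "normal operation", "type": "wind_power", "value": 1.4},
    {"field": "wind power", "scene": "normal operation", "type": "weather", "value": 7.2},
    {"field": "finance", "scene": "normal operation", "type": "price", "value": 10.0},
]


def generate_evaluation_set(model_id):
    # 1. Query the evaluation configuration by the model's identification ID.
    config = CONFIGS[model_id]
    # 2. Query test data whose classification identifiers match the configuration.
    matched = [
        record for record in TEST_DATA
        if record["field"] == config["application_field"]
        and record["scene"] == config["data_scene"]
    ]
    # 3. Sort the matched data and output it as the evaluation set
    #    (sorting by data type is an arbitrary choice for this sketch).
    return sorted(matched, key=lambda record: record["type"])


evaluation_set = generate_evaluation_set(1)
```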

Step S30: the informationized model evaluation system performs a primary analysis of the imported model using the evaluation set to generate an evaluation index set;

Different models have different evaluation standards. The informationized model evaluation system analyzes the imported model and, based on the evaluation standards of the model and the data feedback of the evaluation set, generates an evaluation index set that describes the model, specifically:

1. According to the content of the input set of the test case, extracting corresponding data from the evaluation set, inputting the data into the model, and storing an output result as a verification set;

The input set content extraction function is: $T_1(P) = \{x_i \mid f_i = f_0 \wedge a_i = a_0 \wedge s_i = s_0\}$, wherein $P$ is the evaluation set, $x_i$ is the $i$-th data item in the evaluation set $P$, $f_i$ is the field code of the $i$-th data item in $P$, $f_0$ is the field code in the test case input set, $a_i$ is the field code of the $i$-th data item in $P$, $a_0$ is the field code in the test case input set, $s_i$ is the scene code of the $i$-th data item in $P$, and $s_0$ is the scene code in the test case input set;

The extracted content is input into the model under evaluation as a new input set, and the output result is stored as the verification set, denoted $Y_p$.
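The extraction function $T_1(P)$ is a conjunctive filter over the evaluation set, and the verification set is the model's output on the filtered items. A minimal sketch follows; the dictionary keys `f`, `a`, `s` mirror the symbols above, while the `value` key, the code strings and `toy_model` are hypothetical stand-ins.

```python
def extract_input_set(evaluation_set, f0, a0, s0):
    """T1(P): keep the items whose three codes all match the test case input set."""
    return [
        x for x in evaluation_set
        if x["f"] == f0 and x["a"] == a0 and x["s"] == s0
    ]


def build_verification_set(model, inputs):
    """Feed each extracted item to the model and store the outputs as Y_p."""
    return [model(x["value"]) for x in inputs]


def toy_model(value):
    # Stand-in for the model under evaluation.
    return 2 * value


P = [
    {"f": "F1", "a": "A1", "s": "S1", "value": 2.0},
    {"f": "F1", "a": "A1", "s": "S2", "value": 3.0},
    {"f": "F2", "a": "A1", "s": "S1", "value": 4.0},
    {"f": "F1", "a": "A1", "s": "S1", "value": 5.0},
]

inputs = extract_input_set(P, f0="F1", a0="A1", s0="S1")
Y_p = build_verification_set(toy_model, inputs)
```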

2. Extracting corresponding data from the evaluation set according to the content of the test case output set and storing the data as a comparison set;

The output set content extraction function is: $T_2(P) = \{x_i \mid g_i = g_0 + t_0 \backslash c \wedge f_i = fo_0\}$, wherein $x_i$ is the $i$-th data item in the evaluation set $P$, $g_i$ is the group code of the $i$-th data item in $P$, $g_0$ is the group code of the data item in the input set, $t_0$ is the timestamp of the data item in the input set, $c$ is the prediction period of the model under evaluation, and $\backslash$ denotes integer division; the term $t_0 \backslash c$ is present only when the model under evaluation is a prediction model; $f_i$ is the field code of the $i$-th data item in $P$, and $fo_0$ is the field code in the test case output set.

After the extraction is finished, the extracted data are stored as the comparison set, denoted $Y_r$;
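Read literally, $T_2(P)$ shifts the group code of each input item by the integer quotient of its timestamp and the prediction period, then keeps the items in that group whose field code matches the output set. The sketch below follows that literal reading; the record keys and sample data are hypothetical, and passing `c=None` is this sketch's way of modelling a non-prediction model, for which the offset term is absent.

```python
def extract_comparison_set(evaluation_set, input_items, fo0, c=None):
    """T2(P): collect the reference data that the model outputs are compared with."""
    comparison = []
    for item in input_items:
        # The offset t0 \ c exists only for prediction models.
        offset = item["t"] // c if c is not None else 0
        target_group = item["g"] + offset
        comparison.extend(
            x for x in evaluation_set
            if x["g"] == target_group and x["f"] == fo0
        )
    return comparison


P = [
    {"g": 0, "f": "power", "value": 1.0},
    {"g": 1, "f": "power", "value": 1.5},
    {"g": 2, "f": "power", "value": 2.0},
    {"g": 1, "f": "wind_speed", "value": 7.0},
]
input_items = [{"g": 0, "t": 10}, {"g": 1, "t": 10}]

# Prediction model with period c = 10: each input looks one group ahead.
Y_r = extract_comparison_set(P, input_items, fo0="power", c=10)
```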

3. determining an evaluation standard of the model according to the model application, taking data of the evaluation standard, the verification set and the comparison set as parameters of an evaluation index value calculation function, and outputting an evaluation index set;

The evaluation standards are the hard performance requirements that a model must meet for its specific application, including real-time performance, accuracy and associative extensibility. Each of these is described by a set of evaluation indexes and has a standard value; a model that fails to reach the standard cannot be regarded as qualified. The analysis uses an evaluation index value calculation function to obtain the index values of the evaluation indexes, and these values characterize the performance of the model.

The evaluation index value calculation function is expressed as:

[Formula image not reproduced in the source text]

Wherein $B$ is the evaluation standard set, $b_1 \sim b_3$ are the data items in $B$, $b_0$ is the standard item that the model under evaluation must reach, $e_1 \sim e_5$ are the different evaluation indexes, each with a default value of 0, $yp_k$ is the $k$-th data item in the verification set $Y_p$, $yr_k$ is the $k$-th data item in the comparison set $Y_r$, $k$ takes values from 1 to $m$, $m$ is the total number of data items in the verification set, $\lambda$ is the standard normal distribution coefficient, $rt$ is the response time of the model, and $\delta$ is the standard association sensitivity coefficient. The evaluation index value calculation function has a two-layer verification structure: when the result of the first-layer verification $b_1 = b_0$ is true, the second-layer verification is entered; when the first second-layer verification expression (not reproduced in the source text) is true, the corresponding result $(e_1, 1)$ is returned; when the next second-layer expression (likewise not reproduced) is true, the corresponding result $(e_2, 1)$ is returned, and so on.

After the calculation is completed, the evaluation index set $D = \{(e_j, v_j)\}$ is output, where $e_j$ represents the evaluation index with subscript $j$ and $v_j$ represents the index value corresponding to that evaluation index.
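Only the two-layer control flow of the calculation function can be illustrated here, since the second-layer expressions themselves are not available in this text. In the sketch below they are therefore supplied by the caller as predicates. The two example predicates (a mean-absolute-error tolerance and a response-time limit), the standard names and all numeric values are hypothetical and are not the expressions used in the patent.

```python
from statistics import mean


def evaluate_indices(required_standards, checks_by_standard, ctx):
    """Return {index_name: value}; every evaluation index defaults to 0."""
    indices = {
        name: 0
        for checks in checks_by_standard.values()
        for name, _ in checks
    }
    for standard, checks in checks_by_standard.items():
        # First layer: does this standard apply to the model under evaluation?
        if standard not in required_standards:
            continue
        # Second layer: evaluate each expression attached to the standard.
        for name, predicate in checks:
            if predicate(ctx):
                indices[name] = 1
    return indices


# Hypothetical second-layer expressions, for illustration only.
checks = {
    "accuracy": [
        ("e1", lambda c: mean(abs(p - r) for p, r in zip(c["Y_p"], c["Y_r"])) <= c["tolerance"]),
    ],
    "real_time": [
        ("e2", lambda c: c["rt"] <= c["rt_limit"]),
    ],
}
ctx = {"Y_p": [1.0, 2.0], "Y_r": [1.1, 1.9], "tolerance": 0.2, "rt": 0.8, "rt_limit": 0.5}

D = evaluate_indices({"accuracy", "real_time"}, checks, ctx)
```

Keeping the predicates outside the function means the real expressions can be substituted without changing the dispatch logic.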

Step S40: performing secondary analysis on the imported model, and generating a model evaluation report based on the generated evaluation index set and analysis results, specifically:

1. Analyzing various model parameters according to the characteristics of a base model adopted by the model to be tested;

The characteristics of different base models are different, model parameters to be set in training are also different, the basic model parameters originally contained in various base models are maintained in an informationized model evaluation system, a plurality of informationized models are additionally added with some parameters to adapt to actual application scenes besides the original model parameters of the base models, the basic model parameters can be inquired in the system according to the content of the base model configuration items in an evaluation configuration page, and other model parameters are required to be resolved, and the resolving method comprises the following steps:

① Intercepting model data items around basic model parameters in an extending mode to serve as monitoring items;

For example, suppose the model under test is the linear model y = ax1 + bx2 + cx3 + d, where a and d are basic model parameters. Taking a as the base point, the interception extends backward until the next basic model parameter is encountered; then, taking that next basic model parameter d as the base point, the interception extends backward until the last data item. After the connecting operators are removed, the intercepted monitoring items are ax1, bx2 and cx3;

② Along with the input of the input set, the interception range of the monitoring item is slowly reduced until the value of the monitoring item is not changed any more, and the parameters except the basic model parameters in the monitoring item are extracted to be used as additional model parameters of an evaluation model;

As the inputs x1 to x3 change, the values of the monitoring items ax1, bx2 and cx3 change as well. After the basic model parameters are removed, the first monitoring item leaves only x1; since x1 keeps changing, it is not a model parameter. The second item bx2 is first shrunk backward by one term, leaving only x2; x2 keeps changing and no data item remains after it, so the first term is taken for verification instead. Since b does not change as the input set changes, b is a model parameter.
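The underlying test in this example is whether a candidate term keeps its value while the inputs vary. The sketch below applies that test directly to recorded snapshots of the terms inside the monitoring items and skips the step-by-step shrinking of the interception range. The function name, the snapshot format and the numeric values are assumptions for illustration.

```python
def resolve_additional_parameters(observations, basic_params):
    """Return the monitored terms that stay constant while the inputs change.

    observations: one {term_name: value} snapshot per input sample.
    basic_params: names already known from the base model, which are skipped.
    """
    additional = []
    for name in observations[0]:
        if name in basic_params:
            continue
        distinct_values = {snapshot[name] for snapshot in observations}
        if len(distinct_values) == 1:   # never changed, so it is a parameter
            additional.append(name)
    return additional


# Linear model y = a*x1 + b*x2 + c*x3 + d with basic parameters a and d.
a, b, c, d = 2.0, 0.5, -1.0, 3.0
input_samples = [(1.0, 2.0, 3.0), (4.0, 5.0, 6.0), (7.0, 8.0, 9.0)]
observations = [
    {"a": a, "x1": x1, "b": b, "x2": x2, "c": c, "x3": x3}
    for x1, x2, x3 in input_samples
]

extra = resolve_additional_parameters(observations, basic_params={"a", "d"})
```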

2. Analyzing the relevance between each model parameter and each evaluation index in the evaluation index set, wherein the analysis method comprises the following steps:

① In the input process of the input set, each model parameter is finely adjusted;

Fine-tuning means iteratively increasing or decreasing a model parameter, preferably with a small step in each iteration.

② Comparing the changes of each evaluation index set before and after the model parameter adjustment, and analyzing the relevance of the model parameter and the evaluation index;

As the model parameters are iterated, the same input set is input repeatedly and each evaluation index set is recalculated from the model output. Only one model parameter is adjusted at a time, and the association degree between the adjusted model parameter and each evaluation index is then calculated according to the grey relational degree formula.
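The source does not reproduce its association degree formula. As an illustration, the sketch below uses the commonly cited form of grey relational analysis (Deng's grey relational grade) with distinguishing coefficient `rho = 0.5`, treating the stepped parameter values as the reference sequence and each evaluation index as a comparison sequence. The mean-value normalisation, the function name and the sample sequences are assumptions of this sketch.

```python
from statistics import mean


def grey_relational_degrees(reference, comparisons, rho=0.5):
    """Grey relational degree of each comparison sequence to the reference."""

    def normalise(seq):
        m = mean(seq)                   # assumes a non-zero mean
        return [v / m for v in seq]

    ref = normalise(reference)
    deltas = {
        name: [abs(r - v) for r, v in zip(ref, normalise(seq))]
        for name, seq in comparisons.items()
    }
    all_deltas = [d for ds in deltas.values() for d in ds]
    d_min, d_max = min(all_deltas), max(all_deltas)
    if d_max == 0:                      # every sequence coincides with the reference
        return {name: 1.0 for name in deltas}
    return {
        name: mean((d_min + rho * d_max) / (d + rho * d_max) for d in ds)
        for name, ds in deltas.items()
    }


param_values = [0.50, 0.55, 0.60, 0.65]     # one parameter, stepped upward
index_values = {
    "e1": [1.00, 1.10, 1.20, 1.30],         # moves in step with the parameter
    "e2": [2.00, 2.00, 2.01, 2.00],         # barely reacts
}

degrees = grey_relational_degrees(param_values, index_values)
```

With these sample sequences, `e1` tracks the parameter far more closely than `e2`, so it receives the higher degree.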

3. Generating a model evaluation report based on the relevance of the model parameters and the evaluation indexes;

The evaluation report first displays each evaluation index; under each unqualified evaluation index, the model parameter with the highest association degree is shown as a hint for the reference of the personnel involved.
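The report layout described above can be sketched as a simple text renderer. The `build_report` function, the threshold-based notion of "qualified", the line format and the sample values are hypothetical; the source does not define the report format.

```python
def build_report(index_set, thresholds, degrees):
    """List every index; under each unqualified one, hint at the top parameter.

    index_set:  {index_name: index_value}
    thresholds: {index_name: minimum value regarded as qualified} (assumed)
    degrees:    {index_name: {parameter_name: association_degree}}
    """
    lines = []
    for name, value in index_set.items():
        qualified = value >= thresholds[name]
        status = "qualified" if qualified else "unqualified"
        lines.append(f"{name}: {value} ({status})")
        if not qualified:
            by_param = degrees[name]
            top_param = max(by_param, key=by_param.get)
            lines.append(
                f"  hint: parameter '{top_param}' has the highest "
                f"association degree ({by_param[top_param]:.2f})"
            )
    return "\n".join(lines)


report = build_report(
    index_set={"e1": 1, "e2": 0},
    thresholds={"e1": 1, "e2": 1},
    degrees={"e1": {"b": 0.9, "c": 0.4}, "e2": {"b": 0.3, "c": 0.8}},
)
```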

Example 2

The second embodiment of the invention provides an informationized model evaluation system based on big data, which comprises: the system comprises a model importing module, an evaluation set generating module, a model analyzing module and an evaluation result displaying module;

(1) The model importing module is used for importing a model to be evaluated into the informationized model evaluation system and setting an evaluation configuration item;

(2) The evaluation set generation module is used for generating an evaluation set according to the imported model and the evaluation configuration item;

(3) The model analysis module is used for analyzing the imported model by using the evaluation set to obtain various model parameters, and comprises a primary analysis sub-module and a secondary analysis sub-module;

The primary analysis sub-module is used for generating an evaluation index set;

The evaluation index value calculation function is expressed as:

[Formula image not reproduced in the source text]

Wherein $B$ is the evaluation standard set, $b_1 \sim b_3$ are the data items in $B$, $e_1 \sim e_5$ are the different evaluation indexes, each with a default value of 0, $yp_k$ is the $k$-th data item in the verification set $Y_p$, $yr_k$ is the $k$-th data item in the comparison set $Y_r$, $k$ takes values from 1 to $m$, $m$ is the total number of data items in the verification set, $\lambda$ is the standard normal distribution coefficient, $rt$ is the response time of the model, and $\delta$ is the standard association sensitivity coefficient. The evaluation index value calculation function has a two-layer verification structure: when the result of the first-layer verification $b_1 = b_0$ is true, the second-layer verification is entered; when the first second-layer verification expression (not reproduced in the source text) is true, the corresponding result $(e_1, 1)$ is returned; when the next second-layer expression (likewise not reproduced) is true, the corresponding result $(e_2, 1)$ is returned, and so on.

The secondary analysis sub-module is used for analyzing the relevance between the model parameters and the evaluation indexes, and specifically:

as the model parameters are iterated, the same input set is input repeatedly and each evaluation index set is recalculated from the model output; only one model parameter is adjusted at a time, and the association degree between the adjusted model parameter and each evaluation index is then calculated according to the grey relational degree formula;

(4) The evaluation result display module is used for generating a model evaluation report based on the relevance of the model parameters and the evaluation indexes;

The foregoing embodiments describe the general principles of the present invention in further detail and are not to be construed as limiting its scope; any modifications, equivalents, improvements, etc. made on the basis of the teachings of the invention are intended to fall within its scope of protection.

Claims

1. An informationized model evaluation method based on big data comprises the following steps:

2. The informationized model evaluation method based on big data according to claim 1, wherein the configuration items in the evaluation configuration item page comprise application fields, purposes, data scenes, base models and test cases; the configuration items are selected and filled in by single selection, multiple selection and text boxes, and after configuration is completed, a configuration file is formed and shares an identification ID with the imported model.

3. The informationized model evaluation method based on big data according to claim 1, wherein the evaluation set is generated according to the imported model and the evaluation configuration item, specifically comprising the following sub-steps:

4. The informationized model evaluation method based on big data according to claim 1, wherein the method comprises the following sub-steps:

5. The informationized model evaluation method based on big data according to claim 1, wherein the imported model is subjected to secondary analysis, and a model evaluation report is generated based on the generated evaluation index set and the analysis result, and specifically comprises the following sub-steps:

6. The method for evaluating an informationized model based on big data according to claim 5, wherein each model parameter is resolved according to the characteristics of a base model adopted by the model to be tested, and specifically comprises the following sub-steps:

Along with the input of the input set, the interception range of the monitoring item is continuously reduced until the value of the monitoring item is not changed any more, and the parameters except the basic model parameters in the monitoring item are extracted to be used as additional model parameters of the evaluation model.

7. The informationized model evaluation method based on big data according to claim 5, wherein the correlation between each model parameter and each evaluation index in the evaluation index set is analyzed, and the method specifically comprises the following sub-steps:

in the input process of the input set, each model parameter is finely adjusted;

8. An informationized model evaluation system based on big data, comprising: the system comprises a model importing module, an evaluation set generating module, a model analyzing module and an evaluation result displaying module;

9. The informationized model evaluation system based on big data according to claim 8, wherein the model analysis module comprises a primary analysis sub-module and a secondary analysis sub-module;

The primary analysis sub-module is used for generating an evaluation index set;