CN112257332B - Simulation model evaluation method and device

Info

Publication number: CN112257332B
Application number: CN202011011804.3A
Authority: CN
Inventors: 赖李媛君; 彭斯雨; 张霖
Original assignee: Beihang University; Beijing Simulation Center
Current assignee: Beihang University; Beijing Simulation Center
Priority date: 2020-09-23
Filing date: 2020-09-23
Publication date: 2022-12-13
Anticipated expiration: 2040-09-23
Also published as: CN112257332A

Abstract

The application discloses a simulation model evaluation method and device. The method comprises: extracting feature points from the simulation model output data and the reference data using a feature point extraction algorithm suited to multi-dimensional data, so as to compress the data as far as possible; performing incremental clustering on the reference data using an enhanced self-organizing incremental neural network algorithm to capture the rules of the reference data and establish its topological network structure; and determining a simulation model evaluation index from the topological network structure and evaluating the simulation result of the simulation model according to that index. The method and device address the technical problem that simulation model evaluation in the prior art is lengthy and inefficient.

Description

Simulation model evaluation method and device

Technical Field

The present application relates to the field of model evaluation technologies, and in particular, to a method and an apparatus for evaluating a simulation model.

Background

Simulation models are typically used to simulate the composition and operation of complex systems. Because the simulation model is not exactly the same as the actual model, the output data of the system after simulation by the simulation model and the output data of the actual model have a certain difference, and therefore the reliability of the output data of the simulation model needs to be verified.

At present, the conventional approach to verifying the reliability of simulation model output data is verification, validation and accreditation (VV&A): the output data of the simulation model are compared directly with the reference data, and the reliability of the simulation model is determined from the comparison result and preset evaluation parameters. However, a simulation model may carry large uncertainty: even with identical input parameters, repeated runs of the same model may produce markedly different output data. Directly comparing an output result with preset reference data therefore yields an evaluation of poor accuracy.

Disclosure of Invention

The technical problem solved by this application is how to evaluate a simulation model efficiently and automatically. In the scheme provided by the embodiments of the application, the output data set and the reference data set of the simulation model are compressed by a preset feature point extraction algorithm into a compressed output data set and a compressed reference data set. This reduces the amount of output and reference data, which in turn reduces the computation required during evaluation and improves evaluation efficiency. In addition, a topological network structure among the reference data in the compressed reference data set is established by the incremental clustering model, i.e., the internal rules among the reference data are determined, and the reliability of the simulation model is evaluated automatically according to those rules, which improves the degree of automation of the evaluation.

In a first aspect, an embodiment of the present application provides a method for evaluating a simulation model, where the method includes:

performing incremental clustering processing on a preset reference data set according to a preset incremental clustering model to establish a topological network structure of reference data;

determining an output data set of a simulation model, and performing credibility evaluation on the simulation model according to the output data set, the topological network structure and a preset similarity threshold.

Optionally, compressing each data in the output data set and the reference data set according to a preset feature point extraction algorithm to obtain a compressed output data set and a compressed reference data set, including:

respectively extracting feature points of each data in the output data set and the reference data set according to a preset feature point extraction algorithm to obtain a plurality of feature points, wherein each data comprises a plurality of data points;

and compressing each data according to the plurality of characteristic points corresponding to each data to obtain compressed data, and obtaining the compressed output data set and the compressed reference data set according to the compressed data.

Optionally, the performing feature point extraction on each data in the output data set and the reference data set according to a preset feature point extraction algorithm to obtain a plurality of feature points includes:

taking the head and tail data points in each data in the reference data set as initial characteristic points, calculating a data point with the maximum distance from the initial characteristic points according to a preset joint distance calculation formula, and taking the data point as a new characteristic point;

and recalculating the data point with the maximum distance between any two feature points according to the new feature point and the initial feature point, and using the data point as the feature point until the feature points with the preset number are determined.

Optionally, performing incremental clustering processing on the compressed reference data set according to a preset incremental clustering model to establish a topological network structure of the reference data, including:

constructing an initial topological network structure from any two data in the compressed reference data set,

and sequentially adding the data in the compressed reference data set other than those two data to the initial topological network structure to obtain the topological network structure.

Optionally, sequentially adding the data in the compressed reference data set other than those two data to the initial topological network structure to obtain the topological network structure includes:

selecting any one of the data in the compressed reference data set other than those two data as the data to be added, and determining, among the data contained in the current topological network structure, first data and second data that have the shortest and the second shortest distance to the data to be added;

calculating a similarity threshold value between the data to be added and the first data and the second data;

judging whether the distance between the data to be added and the first data and the second data is larger than the similarity threshold value or not;

and if so, adding the data to be added into the current topological network structure until all the data in the compressed reference data set are added into the current topological network structure, so as to obtain the topological network structure.

Optionally, the method further comprises: if the distance between the data to be added and the first data and the second data is not larger than the similarity threshold, adding 1 to the existence duration of the edges connected to the two adjacent data nearest to the data to be added;

judging whether the two adjacent data have the same class label, whether any adjacent data is not assigned to any class or whether the classes of the two adjacent data meet a preset density condition;

if so, connecting the two adjacent data, updating the weight vectors of the second data and its adjacent data, and deleting any edge whose existence duration is greater than a preset threshold;

if not, updating the data density, the class label of the data and the connection relation thereof contained in the topological network structure;

judging whether the number of the plurality of data contained in the topological network structure is an integral multiple of a preset parameter lambda or not;

and if so, updating the class labels of all the nodes in the topological network structure, and deleting the noise nodes according to a preset noise condition.

Optionally, performing reliability evaluation on the simulation model according to the compressed output data set, the topological network structure, and a preset similarity threshold, including:

selecting any output data from the compressed output data set, and calculating the matching degree of the any output data and a reference data set according to the any output data, the topological network structure and a preset similarity threshold;

calculating a first average matching degree of any output data corresponding to the topological network structure according to the matching degree, determining a second average matching degree corresponding to all data in the compressed output data set according to the first average matching degree, and evaluating the reliability of the simulation model according to the second average matching degree.

Optionally, calculating a matching degree of the any output data and a reference data set according to the any output data, the topological network structure, and a preset similarity threshold, including:

determining first reference data with the shortest distance to any output data according to the topological network structure;

and calculating the matching degree corresponding to any output data according to the distance between the first reference data and any output data and a preset similarity threshold.

Optionally, determining a second average matching degree corresponding to all data in the output data set according to the first average matching degree includes:

calculating the second average matching degree by the following formula:

wherein $F$ represents the second average matching degree; $\Delta u$ represents the first average matching degree; and $H^{x}$ represents the compressed output data set.

In a second aspect, an embodiment of the present application provides an apparatus for evaluating a simulation model, where the apparatus includes:

a compression unit, configured to determine an output data set and a reference data set of the simulation model, and to compress each data in the output data set and the reference data set according to a preset feature point extraction algorithm to obtain a compressed output data set and a compressed reference data set;

the establishing unit is used for performing incremental clustering processing on the compressed reference data set according to a preset incremental clustering model to establish a topological network structure of reference data;

and the evaluation unit is used for evaluating the reliability of the simulation model according to the compressed output data set, the topological network structure and a preset similarity threshold.

Optionally, the compressing unit is specifically configured to:

Optionally, the establishing unit is specifically configured to:

constructing an initial topological network structure from any two data in the compressed reference data set,

Optionally, the establishing unit is specifically configured to:

Optionally, the establishing unit is further configured to:

if the distance between the data to be added and the first data and the second data is not larger than the similarity threshold, adding 1 to the existence duration of the edges connected to the two adjacent data nearest to the data to be added;

Optionally, the evaluation unit is specifically configured to:

calculating the second average matching degree by the following formula:

Compared with the prior art, the scheme provided by the embodiment of the application has the following beneficial effects:

1. According to the scheme provided by the embodiments of the application, the output data set and the reference data set of the simulation model are compressed by the preset feature point extraction algorithm into a compressed output data set and a compressed reference data set. This reduces the amount of output and reference data, which in turn reduces the computation required during evaluation and improves evaluation efficiency. In addition, a topological network structure among the reference data in the compressed reference data set is constructed by the incremental clustering model, i.e., the internal rules among the reference data are determined, and the reliability of the simulation model is evaluated automatically according to those rules, which improves the degree of automation of the evaluation.

2. According to the scheme provided by the embodiment of the application, the evaluation index of the simulation model is dynamically determined according to the output data set of the simulation model, and the problem of poor applicability caused by determining the evaluation index of the simulation model according to empirical data is avoided.

Drawings

Fig. 1 is a schematic flowchart illustrating an evaluation method of a simulation model according to an embodiment of the present disclosure;

Fig. 2 is a schematic diagram of a process of feature point identification according to an embodiment of the present application;

Fig. 3 is a schematic diagram of determining feature points according to an embodiment of the present disclosure;

Fig. 4 is a schematic diagram of a process of performing incremental clustering processing by using an ESOINN algorithm according to an embodiment of the present application;

Fig. 5 is a schematic structural diagram of an evaluation apparatus for a simulation model according to an embodiment of the present disclosure.

Detailed Description

To aid understanding, the technical solutions of the present application are described in detail below with reference to the drawings and specific embodiments. It should be understood that the specific features in the embodiments and examples are detailed descriptions of the technical solutions of the present application rather than limitations of them, and that the technical features in the embodiments and examples may be combined with each other where no conflict arises.

The specific implementation manner of the method can comprise the following steps (the flow of the method is shown in fig. 1):

Step 101, determining an output data set and a reference data set of a simulation model, and compressing each data in the output data set and the reference data set according to a preset feature point extraction algorithm to obtain a compressed output data set and a compressed reference data set.

Specifically, in the scheme provided by the embodiments of the application, first assume that one output data of the simulation model is $x = \{x_1, x_2, \ldots, x_n\}^T$, where $n$ represents the time series length of the output data. To compare the output data with the reference data, the output data of the simulation model should have the same time length as the reference data. Often more than one output of the simulation model is of interest, and therefore the output data and the reference data of the simulation model may be represented by:

where D represents the number of output data of interest to the simulation model.

Further, let $s_1, s_2, \ldots, s_m$ represent the values of the $m$ input parameters of the simulation model; the input parameters of the simulation model can then be defined as $s = \{s_1, s_2, \ldots, s_m\}$, and the input and output data after one run of the simulation model are expressed as:

$X' = \{s, X\}$ (3)

Owing to the dynamic nature of the real object, the experimental sample consists of the simulation outputs obtained by running the simulation model multiple times under different parameter configurations; that is, the output data obtained from multiple runs are expressed as $H^{x} = \{X_1, X_2, \ldots, X_I\}$. Similarly, the reference sample is expressed as $Y^{y} = \{Y_1, Y_2, \ldots, Y_J\}$, where $I$ is the number of data contained in the experimental sample and $J$ is the number of data contained in the reference sample.
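The notation above can be mirrored in a small data layout. The following Python sketch is illustrative only: the names `Run` and `make_sample` and the nested-list representation (n rows by D columns) are assumptions made for this example and do not come from the application.

```python
from dataclasses import dataclass
from typing import List

Series = List[List[float]]  # n rows (time steps) x D columns (outputs of interest)


@dataclass
class Run:
    """One simulation run X' = {s, X}: input parameters plus output series."""
    s: List[float]  # values of the m input parameters
    X: Series       # n x D output data


def make_sample(runs: List[Run]) -> List[Series]:
    """Collect the outputs of several runs into a sample {X_1, ..., X_I}."""
    n, d = len(runs[0].X), len(runs[0].X[0])
    for r in runs:
        # every run must share the same time length n and dimension D
        if len(r.X) != n or any(len(row) != d for row in r.X):
            raise ValueError("all runs must have the same n x D shape")
    return [r.X for r in runs]


runs = [
    Run(s=[0.1, 2.0], X=[[0.0, 1.0], [0.5, 1.5], [1.0, 2.0]]),
    Run(s=[0.2, 2.0], X=[[0.0, 1.1], [0.6, 1.4], [1.1, 2.1]]),
]
H_x = make_sample(runs)  # I = 2 runs, each with n = 3 and D = 2
```

The shape check reflects the requirement stated above that output data and reference data share the same time length before they are compared.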

Furthermore, because the reference sample data has high dimensionality and more data, comparing the output data of the simulation model according to the original reference sample data brings huge calculation amount, and further reduces the working efficiency of simulation model evaluation. In order to improve the working efficiency of simulation model evaluation, the reference data and the output data need to be compressed, and there are various specific ways of compressing the reference data and the output data, which will be described below by taking a preferred way as an example.

In a possible implementation manner, compressing each data in the output data set and the reference data set according to a preset feature point extraction algorithm to obtain a compressed output data set and a compressed reference data set, includes: respectively extracting feature points of each data in the output data set and the reference data set according to a preset feature point extraction algorithm to obtain a plurality of feature points, wherein each data comprises a plurality of data points; and compressing each data according to the plurality of characteristic points corresponding to each data to obtain compressed data, and obtaining the compressed output data set and the compressed reference data set according to the compressed data.

In a possible implementation manner, the performing feature point extraction on each data in the output data set and the reference data set according to a preset feature point extraction algorithm to obtain a plurality of feature points includes: taking the head and tail data points in each data in the reference data set as initial characteristic points, calculating a data point with the maximum distance from the initial characteristic points according to a preset joint distance calculation formula, and taking the data point as a new characteristic point; and recalculating the data point with the maximum distance between any two feature points according to the new feature point and the initial feature point, and using the data point as the feature point until the feature points with the preset number are determined.

Specifically, the reference data is compressed by a feature point extraction algorithm, the perceptually important points (PIP) algorithm. The idea of the PIP algorithm is to compress the reference data to a preset length by collecting the peaks and valleys in the reference data. For ease of understanding, the PIP process is briefly described below, taking one-dimensional data as an example.

Fig. 2 is a schematic diagram of the feature point identification process; it contains 7 feature points. Assuming that the original reference data has length n and the desired compressed length is k (k < n), the purpose of the PIP algorithm is to select the k most significant peak (valley) values from the n data points. These peak (valley) values are called feature points.

First, the head and tail points of the original reference data are selected as the initial feature points; the remaining feature points are then obtained by iteratively selecting the point with the largest distance between adjacent feature points. Specifically, the distance between adjacent feature points is calculated as follows:

Referring to Fig. 3, suppose the point whose distance is to be calculated is $P_3(t_3, x_3)$, with adjacent feature points $P_1(t_1, x_1)$ and $P_2(t_2, x_2)$. The vertical projection of $P_3$ onto the line $P_1P_2$ is $P_c(t_c, x_c)$, and the distance between $P_3$ and $P_c$ is the distance $d(P_3)$ of point $P_3$. Specifically, the distance calculation formula is as follows:

After $P_3$ is selected as a feature point, the whole data is traversed again to find the point with the largest distance between the adjacent feature points $P_1$, $P_3$ or $P_3$, $P_2$; that point becomes the fourth feature point. The iteration continues until k feature points have been found.
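The one-dimensional procedure can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the application's own listing: the vertical projection is read here as the point on the line $P_1P_2$ at the same time $t_3$ (a perpendicular-distance reading would change only the helper), time stamps are assumed to be strictly increasing, and the names `vertical_distance` and `pip_1d` are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (t, x)


def vertical_distance(p1: Point, p2: Point, p3: Point) -> float:
    """Distance between p3 and the point on the line p1-p2 at the same t."""
    (t1, x1), (t2, x2), (t3, x3) = p1, p2, p3
    x_c = x1 + (x2 - x1) * (t3 - t1) / (t2 - t1)  # value of the line at t3
    return abs(x_c - x3)


def pip_1d(points: List[Point], k: int) -> List[Point]:
    """Select k feature points (k >= 2) from a time-ordered series."""
    n = len(points)
    if k >= n:
        return list(points)
    selected = [0, n - 1]  # head and tail are the initial feature points
    while len(selected) < k:
        best_idx, best_dist = -1, -1.0
        # scan every gap between two adjacent feature points
        for left, right in zip(selected, selected[1:]):
            for i in range(left + 1, right):
                d = vertical_distance(points[left], points[right], points[i])
                if d > best_dist:
                    best_idx, best_dist = i, d
        selected.append(best_idx)
        selected.sort()  # keep feature points in time order
    return [points[i] for i in selected]


series = [(0, 0.0), (1, 1.0), (2, 5.0), (3, 1.0), (4, 0.0), (5, -3.0), (6, 0.0)]
features = pip_1d(series, 4)
```

For this series the sketch first picks the peak at t = 2 and then the valley at t = 5, which matches the intent of keeping the most significant peaks and valleys.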

Further, to handle multi-dimensional data, the PIP algorithm needs to be improved as follows. First, a joint distance is defined. If the reference data has $D$ dimensions, the joint distance from $P_3(t_3, x_{31}, x_{32}, \ldots, x_{3D})$ to its adjacent feature points $P_1(t_1, x_{11}, x_{12}, \ldots, x_{1D})$ and $P_2(t_2, x_{21}, x_{22}, \ldots, x_{2D})$ is defined as $d_{joint}(P_3)$. Specifically, $d_{joint}(P_3)$ is calculated as follows:

In the scheme provided by the embodiments of the application, the joint distance reflects the combined degree of fluctuation of the distances in all dimensions.

Further, the distance of the multi-dimensional data is defined as the minimum value of the distance of each dimension data and the joint distance, namely:

$d_{mul}(P_3) = \min\{d_1(P_3), d_2(P_3), \ldots, d_D(P_3), d_{joint}(P_3)\}$ (6)

The new feature point of each iteration is the point $P_i$ between adjacent feature points whose $d_{mul}(P_i)$, computed according to equation (6), is the largest.

For understanding, the PIP algorithm is briefly introduced below, specifically as follows:

Input: data $P = \{P_1, P_2, \ldots, P_L\}$, where the input data length is $L$, the output data dimension is $D$, and the compressed reference data length is $k$.

Output: the compressed data $Q = \{Q_1, Q_2, \ldots, Q_k\}$.

The specific calculation process and code of the PIP algorithm are as follows:
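A minimal Python sketch of the multi-dimensional procedure is given here for illustration; it is not the listing of the application. Only equation (6) is taken from the text. The per-dimension distance is assumed to be the vertical distance at the same time stamp, and the joint distance is a placeholder (the Euclidean norm of the per-dimension distances) because the formula for $d_{joint}$ is not reproduced in this text. All function names are hypothetical, and the joint distance is passed in as a parameter so that it can be replaced.

```python
import math
from typing import Callable, List, Sequence

Row = Sequence[float]  # (t, x_1, ..., x_D)


def per_dim_distances(p1: Row, p2: Row, p3: Row) -> List[float]:
    """Vertical distance of p3 from the chord p1-p2, one value per dimension."""
    w = (p3[0] - p1[0]) / (p2[0] - p1[0])  # relative position of t3 in [t1, t2]
    return [abs(a + (b - a) * w - c) for a, b, c in zip(p1[1:], p2[1:], p3[1:])]


def euclid_joint(dists: Sequence[float]) -> float:
    """Placeholder joint distance (assumption, not the application's formula)."""
    return math.sqrt(sum(d * d for d in dists))


def d_mul(p1: Row, p2: Row, p3: Row,
          joint: Callable[[Sequence[float]], float] = euclid_joint) -> float:
    """Equation (6): minimum of the per-dimension and joint distances."""
    dists = per_dim_distances(p1, p2, p3)
    return min(dists + [joint(dists)])


def pip_multi(P: List[Row], k: int,
              joint: Callable[[Sequence[float]], float] = euclid_joint) -> List[Row]:
    """Compress P (length L) to k feature points Q, head and tail included."""
    L = len(P)
    if k >= L:
        return list(P)
    idx = [0, L - 1]
    while len(idx) < k:
        # score every unselected point against its two adjacent feature points
        candidates = [
            (d_mul(P[a], P[b], P[i], joint), i)
            for a, b in zip(idx, idx[1:])
            for i in range(a + 1, b)
        ]
        _, best = max(candidates, key=lambda c: c[0])
        idx = sorted(idx + [best])
    return [P[i] for i in idx]


P = [(0, 0.0, 0.0), (1, 2.0, 1.0), (2, 0.5, 4.0), (3, 3.0, 3.0), (4, 0.0, 0.0)]
Q = pip_multi(P, 3)
```

With this placeholder the joint term is never smaller than the smallest per-dimension distance, so it does not change the minimum in equation (6); a different definition of the joint distance may behave differently.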

According to the scheme provided by the embodiments of the application, a plurality of feature points are extracted from the reference data set by a preset feature point extraction algorithm, and the topological network structure is established from those feature points and the incremental clustering model. Compressing the reference data by feature point extraction reduces its volume, which reduces the computation required during simulation model evaluation and improves evaluation efficiency.

Step 102, performing incremental clustering processing on the compressed reference data set according to a preset incremental clustering model to establish a topological network structure of reference data.

Specifically, the reference data in the original reference data set are scattered and unrelated, and each can be regarded as an independent data point. The simulation process of the simulation model is a dynamic evolution process: a single simulation result cannot represent the simulation rule, and multiple simulation results represent only a collection of specific cases. If the reference data and the simulation results are compared directly one by one, the fit is likely to hold only for those specific cases; the internal rules of the reference data set are not fitted against the simulation results, and a reliability verification based on such a fit has low accuracy. Therefore, to improve the accuracy of simulation model reliability verification, the association relationship between the reference data in the reference data set, that is, the association rule contained in the reference data, needs to be determined. In the solution provided in the embodiments of the present application, there are various ways of determining this association relationship; the enhanced self-organizing incremental neural network (ESOINN) is taken as an example for description.

Specifically, there are various ways to establish the topology network structure of the reference data according to the ESOINN algorithm and the compressed reference data set, and a preferred way is described as an example below.

In a possible implementation manner, performing incremental clustering processing on the compressed reference data set according to a preset incremental clustering model to establish a topological network structure of reference data includes: constructing an initial topological network structure from any two data in the compressed reference data set, and sequentially adding the data in the compressed reference data set other than those two data to the initial topological network structure to obtain the topological network structure.

In a possible implementation manner, sequentially adding the data in the compressed reference data set other than those two data to the initial topological network structure to obtain the topological network structure includes: selecting any one of the data in the compressed reference data set other than those two data as the data to be added, and determining, among the data contained in the current topological network structure, first data and second data that have the shortest and the second shortest distance to the data to be added; calculating a similarity threshold between the data to be added and the first data and the second data; judging whether the distance between the data to be added and the first data and the second data is larger than the similarity threshold; and if so, adding the data to be added to the current topological network structure, until all the data in the compressed reference data set have been added to the current topological network structure, so as to obtain the topological network structure.

In one possible implementation manner, the method further includes: if the distance between the data to be added and the first data and the second data is not larger than the similarity threshold, adding 1 to the existence duration of the edges connected to the two adjacent data nearest to the data to be added; judging whether the two adjacent data have the same class label, whether either adjacent data is not assigned to any class, or whether the classes of the two adjacent data meet a preset density condition; if so, connecting the two adjacent data, updating the weight vectors of the second data and its adjacent data, and deleting any edge whose existence duration is greater than a preset threshold; if not, updating the data density, the class labels of the data and their connection relations contained in the topological network structure; judging whether the number of data contained in the topological network structure is an integral multiple of a preset parameter λ; and if so, updating the class labels of all the nodes in the topological network structure, and deleting the noise nodes according to a preset noise condition.

Specifically, it is assumed that each reference data in the reference data set is a $k \times D$ matrix, where $k$ represents the length of the compressed reference data and $D$ represents the number of output data of the simulation model. When the reference data are subjected to incremental clustering processing by the ESOINN algorithm, each reference data is regarded as a $1 \times kD$ row vector, called the weight vector $W_i$ of the reference data.
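The flattening step can be illustrated with a short Python sketch. Row-major ordering (all D values of the first feature point, then the second, and so on) is an assumption of this example, as is the function name `to_weight_vector`; the text states only that the result is a 1 x kD row vector.

```python
from typing import List


def to_weight_vector(Q: List[List[float]]) -> List[float]:
    """Flatten a k x D compressed data item into a 1 x kD row vector."""
    d = len(Q[0])
    if any(len(row) != d for row in Q):
        raise ValueError("every feature point must have D values")
    # row-major order: feature point 1 first, then feature point 2, ...
    return [value for row in Q for value in row]


Q = [[0.0, 1.0], [0.5, 1.5], [1.0, 2.0]]  # k = 3 feature points, D = 2 outputs
W = to_weight_vector(Q)                   # length k * D = 6
```

Whatever ordering is chosen, it must be the same for every reference data and every output data so that distances between weight vectors compare like with like.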

The incremental clustering process of the ESOINN algorithm will be briefly described below, and with reference to fig. 4, the specific steps are as follows.

Step 1, select any first reference data $\varepsilon$ from the compressed reference data set and input it into the incremental clustering model.

Step 2, find the second reference data $a_1$ with the smallest distance to the first reference data and the second reference data $a_2$ with the second smallest distance.

Step 3, calculate the similarity thresholds of the reference data $a_1$ and $a_2$.

Step 4, judge whether the distance between the first reference data and either second reference data is larger than the similarity threshold of $a_1$ and $a_2$; if so, insert the first reference data into the topological network structure as a new node.

Step 5, judge whether to connect the reference data $a_1$ and $a_2$.

Step 6, if yes, connect $a_1$ and $a_2$; otherwise, update the weight vector of the first reference data $\varepsilon$ and increase by 1 the existence duration of every edge connected to the reference data $a_1$ in the topological network structure.

Step 7, if the distance between the first reference data and the second reference data is not larger than the similarity threshold of $a_1$ and $a_2$, update the weight vector of the first reference data $\varepsilon$ and increase by 1 the existence duration of every edge connected to the reference data $a_1$ in the topological network structure.

Step 8, delete the edges whose existence duration is larger than a preset threshold.

Step 9, judge whether the number of reference data is an integral multiple of a preset parameter λ.

Step 10, if yes, update the class labels of all nodes in the topological network structure and delete noise nodes according to a preset noise condition.

Step 11, judge whether all reference data in the reference data set have been input; if yes, output the topological network structure; otherwise, jump to Step 1.
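The core of the steps above (find the two nearest nodes, compare against their similarity thresholds, then either insert a node or update an aged edge) can be sketched in Python. This is a deliberately simplified sketch and not a full ESOINN implementation: node density, class labels, the density-based connection test of Step 5 and the λ-periodic relabeling and noise removal of Steps 9 and 10 are omitted, and the two nearest nodes are always connected. The class name `TopoNet`, the threshold rule and the learning rates (1/M for the winner, 1/(100 M) for its neighbors, where M counts how often the node has won) are assumptions borrowed from common SOINN-style formulations rather than taken from the application.

```python
import math
from typing import Dict, List, Set, Tuple

Vec = List[float]


def dist(a: Vec, b: Vec) -> float:
    """Euclidean distance between two weight vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class TopoNet:
    """Simplified incremental network: weighted nodes joined by aged edges."""

    def __init__(self, first: Vec, second: Vec, max_age: int = 50):
        self.w: Dict[int, Vec] = {0: list(first), 1: list(second)}
        self.wins: Dict[int, int] = {0: 1, 1: 1}   # times each node has won
        self.age: Dict[Tuple[int, int], int] = {}  # edge (i, j) with i < j
        self.max_age = max_age
        self._next_id = 2

    def neighbors(self, i: int) -> Set[int]:
        return {b if a == i else a for (a, b) in self.age if i in (a, b)}

    def threshold(self, i: int) -> float:
        """Farthest neighbor if node i has neighbors, else nearest other node."""
        nb = self.neighbors(i)
        if nb:
            return max(dist(self.w[i], self.w[j]) for j in nb)
        return min(dist(self.w[i], self.w[j]) for j in self.w if j != i)

    def add(self, eps: Vec) -> None:
        # Step 2: nearest node a1 and second nearest node a2
        a1, a2 = sorted(self.w, key=lambda j: dist(eps, self.w[j]))[:2]
        # Steps 3-4: insert eps as a new node if it is too far from a1 or a2
        if (dist(eps, self.w[a1]) > self.threshold(a1)
                or dist(eps, self.w[a2]) > self.threshold(a2)):
            self.w[self._next_id] = list(eps)
            self.wins[self._next_id] = 1
            self._next_id += 1
            return
        # Step 7: increase the existence duration of every edge touching a1
        for e in self.age:
            if a1 in e:
                self.age[e] += 1
        # Steps 5-6 (simplified): always connect a1 and a2 with a fresh edge
        self.age[(min(a1, a2), max(a1, a2))] = 0
        # move a1 toward eps; its neighbors move by a much smaller step
        self.wins[a1] += 1
        rate = 1.0 / self.wins[a1]
        self.w[a1] = [w + rate * (x - w) for w, x in zip(self.w[a1], eps)]
        for j in self.neighbors(a1):
            self.w[j] = [w + rate / 100.0 * (x - w) for w, x in zip(self.w[j], eps)]
        # Step 8: delete edges older than the preset threshold
        self.age = {e: a for e, a in self.age.items() if a <= self.max_age}


data = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0], [0.04, 0.02]]
net = TopoNet(data[0], data[1])  # initial structure from any two data
for item in data[2:]:
    net.add(item)
```

In this small run the two distant points become new nodes, while the last point falls within the thresholds of the first two nodes and therefore creates an edge between them instead of a new node.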

Further, the scheme provided by the embodiments of the application additionally introduces a repeated iteration rate $\gamma$, which reflects the number of times each signal is learned. Assuming that the reference data set contains $J$ reference data, the total number of ESOINN iterations is $\gamma J$. The purpose of $\gamma$ is to reduce the sensitivity of the clustering algorithm to the random input order, so that the clustering result is more complete and the internal rules of the reference sample are learned as fully as possible.
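One way to realize the γJ iterations is to present the reference set γ times, each time in a fresh random order. The Python sketch below assumes an integer γ and per-pass shuffling; both are assumptions of this example, and the function name `presentation_order` is hypothetical.

```python
import random
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def presentation_order(reference: Sequence[T], gamma: int, seed: int = 0) -> List[T]:
    """Return gamma shuffled passes over the reference set (gamma * J items)."""
    rng = random.Random(seed)  # seeded so that a run can be repeated
    order: List[T] = []
    for _ in range(gamma):
        one_pass = list(reference)
        rng.shuffle(one_pass)  # a fresh random order on every pass
        order.extend(one_pass)
    return order


schedule = presentation_order(["Y1", "Y2", "Y3"], gamma=4)
```

Each reference data then appears exactly γ times in the schedule, so no single input order dominates the resulting clusters.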

Step 103, performing reliability evaluation on the simulation model according to the compressed output data set, the topological network structure and a preset similarity threshold.

In a possible implementation manner, performing reliability evaluation on the simulation model according to the compressed output data set, the topological network structure, and a preset similarity threshold includes: selecting any output data from the compressed output data set, and calculating the matching degree of the any output data and a reference data set according to the any output data, the topological network structure and a preset similarity threshold; calculating a first average matching degree of any output data corresponding to the topological network structure according to the matching degree, determining a second average matching degree corresponding to all data in the compressed output data set according to the first average matching degree, and evaluating the reliability of the simulation model according to the second average matching degree.

According to the scheme provided by the embodiment of the application, the matching degree corresponding to any time point is calculated from the output data set of the simulation model, the topological network structure of the reference data and the preset similarity threshold. The first average matching degree corresponding to any output data is then calculated from the matching degree, the second average matching degree corresponding to all data in the output data set is determined from the first average matching degree, and the simulation model is evaluated according to the second average matching degree. In other words, the evaluation index of the simulation model is determined dynamically from the output data set of the simulation model, which avoids the poor applicability that results from determining the evaluation index according to empirical data.

In a possible implementation manner, calculating a matching degree of any output data with a reference data set according to any output data, the topological network structure and a preset similarity threshold includes: determining first reference data with the shortest distance to any output data according to the topological network structure; and calculating the matching degree corresponding to any output data according to the distance between the first reference data and any output data and a preset similarity threshold.

In a possible implementation manner, determining a second average matching degree corresponding to all data in the output data set according to the first average matching degree includes:

calculating the second average matching degree by the following formula:

Specifically, it is assumed that the topological network structure constructed by the incremental clustering model is represented by G = {N, C}, where N represents the compressed and sorted reference data set, whose reference data may be regarded as nodes in vector form, and C represents the edge set established during incremental clustering, whose edges reflect the degree of association between nodes. In the scheme provided by the embodiment of the application, the matching degree between the output data of the simulation model and the reference data is used as the evaluation index of the reliability of the simulation model. For ease of understanding, the process of determining the evaluation index (matching degree) is briefly described below.

Assume that the output data set of the simulation model is H^(x) = {X_1, X_2, …, X_I}. First, any output data X_i of the simulation model is selected, and the reference data Y_g closest to that output data is determined from the preset reference data set. Then, it is checked whether the distance between X_i and Y_g is less than μ times the similarity threshold T_g of Y_g. If it is, the matching degree Δu of the output data X_i with the reference data Y_g is set to 1. If not, each time point of the output data X_i is matched against the similarity threshold independently, and Δu is equal to the ratio of the number of successfully matched time points to the total number of time points; however, if Δu is less than 0.1, Δu = 0.
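The matching degree of a single output data can be sketched as below. This is an illustrative sketch under stated assumptions, not the implementation of the embodiment: the function name `matching_degree` is hypothetical, each data item is assumed to be a one-dimensional vector with one value per time point, the distance between X_i and Y_g is assumed to be Euclidean, and a time point is assumed to match when the absolute difference at that point is below μ·T_g.

```python
import numpy as np


def matching_degree(x, nodes, thresholds, mu=1.0):
    """Return the matching degree (delta u) of one output vector x.

    nodes:      array of shape (G, T), one reference vector per node.
    thresholds: length-G sequence, similarity threshold T_g of each node.
    mu:         similarity threshold magnification.
    """
    x = np.asarray(x, dtype=float)
    nodes = np.asarray(nodes, dtype=float)

    # Nearest reference data Y_g (assumption: Euclidean distance).
    dists = np.linalg.norm(nodes - x, axis=1)
    g = int(np.argmin(dists))
    limit = mu * float(thresholds[g])

    # Whole-vector match: the matching degree is 1.
    if dists[g] < limit:
        return 1.0

    # Otherwise match each time point independently
    # (assumption: absolute difference per time point against the same limit).
    matched = np.abs(x - nodes[g]) < limit
    du = float(np.mean(matched))

    # Ratios below 0.1 are treated as no match at all.
    return du if du >= 0.1 else 0.0
```

For example, with reference nodes `[[0, 0, 0, 0], [10, 10, 10, 10]]`, thresholds `[1, 1]` and μ = 1, the output `[0, 0, 0, 5]` is too far from its nearest node as a whole, but three of its four time points match, so the sketch returns 0.75.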

The above yields the matching degree Δu corresponding to a single output data. Since the output data set H^(x) of the simulation model comprises a plurality of output data, the simulation model reliability index F is defined as the average matching degree corresponding to all the output data in the output data set:
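The formula itself does not appear in this text. Read directly from the definitions above, and writing Δu_i for the matching degree of X_i, d(X_i, Y_g) for the distance between X_i and its nearest reference data, and r_i for the ratio of matched time points (all three symbols are labels introduced here for illustration), the index would plausibly take the following form:

```latex
\Delta u_i =
\begin{cases}
1,   & d(X_i, Y_g) < \mu T_g \\
r_i, & d(X_i, Y_g) \ge \mu T_g \ \text{and}\ r_i \ge 0.1 \\
0,   & d(X_i, Y_g) \ge \mu T_g \ \text{and}\ r_i < 0.1
\end{cases}
\qquad
F = \frac{1}{I} \sum_{i=1}^{I} \Delta u_i
```

Under this reading F lies between 0 and 1, and F = 1 only when every output data matches its nearest reference data as a whole.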

In order to facilitate understanding of how the evaluation index of the reliability of the simulation model is calculated, the steps of the calculation are briefly introduced below.

Input: H^(x) = {X_1, X_2, …, X_I}; the clustered reference samples G = {N, C}; the similarity threshold magnification μ.

Output: F.

In the scheme provided by the embodiment of the application, each data in the output data set and the reference data set of the simulation model is compressed through a preset feature point extraction algorithm to obtain a compressed output data set and a compressed reference data set. Incremental clustering is then performed on the compressed reference data set according to a preset incremental clustering model to establish a topological network structure of the reference data, and the reliability of the simulation model is evaluated according to the compressed output data set, the topological network structure and a preset similarity threshold. Compression reduces the data volume of the output data and the reference data, which reduces the amount of calculation in the simulation model evaluation process and improves the efficiency of simulation model evaluation. In addition, the topological network structure constructed by the incremental clustering model captures the internal rules among the reference data in the compressed reference data set, and the reliability of the simulation model is evaluated automatically according to these rules, which improves the degree of automation of simulation model evaluation.

Based on the same inventive concept as the method shown in fig. 1, the embodiment of the present application provides an evaluation apparatus of a simulation model, see fig. 5, the apparatus including:

a compression unit 501, configured to determine an output data set and a reference data set of a simulation model, and compress each data in the output data set and the reference data set according to a preset feature point extraction algorithm to obtain a compressed output data set and a compressed reference data set;

an establishing unit 502, configured to perform incremental clustering processing on the compressed reference data set according to a preset incremental clustering model to establish a topological network structure of reference data;

an evaluation unit 503, configured to perform reliability evaluation on the simulation model according to the compressed output data set, the topology network structure, and a preset similarity threshold.

Optionally, the compressing unit 501 is specifically configured to:

and recalculating the data point with the maximum distance between any two feature points according to the new feature point and the initial feature point, and taking the data point as the feature point until determining the feature points with the preset number.
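One possible reading of this iterative feature point selection is sketched below. It rests on several assumptions that this text does not confirm: the initial feature points are taken to be the first and last data points, the "distance" of a data point is taken to be its Euclidean distance from the straight line interpolated between the two neighbouring feature points, and the function name `extract_feature_points` is hypothetical.

```python
import numpy as np


def extract_feature_points(series, k):
    """Return the sorted indices of k feature points of a (T,) or (T, D) series."""
    if k < 2:
        raise ValueError("k must be at least 2")
    series = np.asarray(series, dtype=float)
    if series.ndim == 1:
        series = series[:, None]  # treat a 1-D series as T points of dimension 1
    n = len(series)
    if n < 2:
        return list(range(n))

    idx = [0, n - 1]  # assumption: the initial feature points are the endpoints
    while len(idx) < min(k, n):
        best_dist, best_i = -1.0, None
        # Examine every gap between two neighbouring feature points.
        for a, b in zip(idx[:-1], idx[1:]):
            if b - a < 2:
                continue  # no data point lies strictly between a and b
            t = np.arange(a + 1, b)
            w = ((t - a) / (b - a))[:, None]
            chord = (1.0 - w) * series[a] + w * series[b]
            d = np.linalg.norm(series[t] - chord, axis=1)
            j = int(np.argmax(d))
            if d[j] > best_dist:
                best_dist, best_i = float(d[j]), int(t[j])
        if best_i is None:
            break
        idx.append(best_i)  # the farthest data point becomes a new feature point
        idx.sort()
    return idx
```

The compressed data would then be the series restricted to the returned indices, for example `np.asarray(series)[extract_feature_points(series, k)]`.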

Optionally, the establishing unit 502 is specifically configured to:

Optionally, the establishing unit 502 is further configured to:

if the distances between the data to be added and each of the first data and the second data are not larger than the similarity threshold, adding 1 to the existing duration of each edge connected to the two adjacent data closest to the data to be added;

Optionally, the evaluation unit 503 is specifically configured to:

calculating the second average matching degree by the following formula:

As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, optical storage, and the like) having computer-usable program code embodied therein.

The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

It will be apparent to those skilled in the art that various changes and modifications may be made in the present application without departing from the spirit and scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims of the present application and their equivalents, the present application is intended to include such modifications and variations as well.

Claims

1. A method for evaluating a simulation model, comprising:

determining an output data set and a reference data set of a simulation model, and compressing each data in the output data set and the reference data set according to a preset feature point extraction algorithm to obtain a compressed output data set and a compressed reference data set; wherein,

recalculating a data point with the maximum distance between any two feature points according to the new feature point and the initial feature point, and taking the data point as the feature point until a preset number of feature points are determined; compressing each data according to the plurality of characteristic points corresponding to each data to obtain compressed data, and obtaining the compressed output data set and the compressed reference data set according to the compressed data;

performing incremental clustering processing on the compressed reference data set according to a preset incremental clustering model to establish a topological network structure of reference data; constructing an initial topological network structure from any two data in the compressed reference data set, and sequentially adding the data in the compressed reference data set except the any two data to the initial topological network structure to obtain the topological network structure;

and evaluating the reliability of the simulation model according to the compressed output data set, the topological network structure and a preset similarity threshold.

2. The method of claim 1, wherein sequentially adding the data in the compressed reference data set other than the any two data to the initial topological network structure to obtain the topological network structure comprises:

3. The method of claim 2, further comprising: if the distances between the data to be added and each of the first data and the second data are not larger than the similarity threshold, adding 1 to the existing duration of each edge connected to the two adjacent data closest to the data to be added;

4. The method according to any one of claims 1 to 3, wherein performing reliability evaluation on the simulation model according to the compressed output data set, the topological network structure, and a preset similarity threshold comprises:

5. The method of claim 4, wherein calculating a degree of matching of the any output data with a reference data set according to the any output data, the topological network structure, and a preset similarity threshold comprises:

6. The method of claim 5, wherein determining a second average degree of matching for all data in the output data set based on the first average degree of matching comprises:

calculating the second average matching degree by the following formula:

7. An evaluation apparatus for a simulation model, comprising: