CN117421565B - Markov blanket-based equipment assessment method and device and computer equipment - Google Patents

Markov blanket-based equipment assessment method and device and computer equipment

Info

Publication number
CN117421565B
CN117421565B (application CN202311742697.5A)
Authority
CN
China
Prior art keywords
feature
candidate
child
parent
feature set
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311742697.5A
Other languages
Chinese (zh)
Other versions
CN117421565A (English)
Inventor
孙建彬
徐博浩
崔瑞靖
姜江
于海跃
剧伦豪
涂莉
秦宇琪
姚雪湄
Current Assignee
National University of Defense Technology
Original Assignee
National University of Defense Technology
Priority date
Filing date
Publication date
Application filed by National University of Defense Technology
Priority claimed from application CN202311742697.5A
Publication of CN117421565A
Application granted
Publication of CN117421565B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/211 Selection of the most significant subset of features
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N7/00 Computing arrangements based on specific mathematical models
    • G06N7/01 Probabilistic graphical models, e.g. probabilistic networks
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30 Computing systems specially adapted for manufacturing

Abstract

The present disclosure relates to a Markov blanket-based equipment assessment method, device and computer device. In an iterative growth phase, parent-child features are first screened from the candidate feature set and added to a candidate parent-child feature set; the spouse features of the corresponding nodes are then identified from the parent-child features just found, and the size of the candidate feature set is controlled by alternately finding parent-child features and spouse features of the target variable, yielding a candidate parent-child feature superset and candidate spouse feature supersets. In the subsequent feature shrink phase, the candidate parent-child feature superset and the candidate spouse feature supersets are pruned. Compared with the prior art, the method reduces the number of subset searches and of repeated conditional independence tests, cutting unnecessary computational cost without losing accuracy: the time efficiency of screening is greatly improved while the accuracy of screening key equipment test data is preserved, so that equipment assessment can be both accurate and real-time.

Description

Markov blanket-based equipment assessment method and device and computer equipment
Technical Field
The present disclosure relates to the technical field of equipment performance evaluation, and in particular to an equipment evaluation method, apparatus and computer device based on a Markov blanket.
Background
Equipment test data acquisition is the basic work supporting equipment test and evaluation, and massive amounts of test and evaluation data are often generated in that work. However, acquiring large volumes of test data is costly, and the data are frequently high-dimensional, redundant and under-utilized, which makes test and evaluation difficult. That is, when test data are used to evaluate equipment, a large number of redundant features degrade both the evaluation effect and the evaluation efficiency; how to scientifically and accurately mine key features or attribute information from massive historical data is therefore of great significance for guiding subsequent test and evaluation data acquisition and assessment.
Feature selection can mine key features from historical data, realizing dimensionality reduction and feature shrinkage, which greatly aids efficient data utilization and decision support. The Markov blanket of the class attribute, being the smallest feature subset with the greatest predictive power, is highly useful for mining key features during feature selection. Markov blanket-based causal feature selection can therefore be used to identify the key features of equipment test and evaluation data and to make that identification more accurate and efficient.
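The Markov blanket of a target node in a Bayesian network consists of its parents, its children, and its spouses (the other parents of its children); conditioned on this set, the target is independent of every remaining variable. A minimal sketch over a hypothetical five-node network (the graph and all names are illustrative, not from the patent):

```python
# Hypothetical toy Bayesian network, given as child -> parents.
# The Markov blanket of T is parents(T) ∪ children(T) ∪ spouses(T).
parents = {
    "T": ["A"],          # A is a parent of T
    "C": ["T", "S"],     # C is a child of T; S is a spouse of T
    "A": [],
    "S": [],
    "D": ["C"],          # D is a descendant of T but NOT in its blanket
}

def markov_blanket(node, parents):
    children = [v for v, ps in parents.items() if node in ps]
    spouses = {p for c in children for p in parents[c] if p != node}
    return set(parents[node]) | set(children) | spouses

print(sorted(markov_blanket("T", parents)))  # ['A', 'C', 'S']
```

Note that the non-child descendant D is excluded: conditioning on the child C already screens it off from T, which is exactly why such features must be pruned in the shrink phase described below.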
The balanced Markov blanket discovery algorithm is a constraint-based Markov blanket discovery algorithm. It searches subsets of the current candidate feature set via conditional independence tests to find the parent-child nodes and spouse nodes of the target node, integrating the characteristics of simultaneous Markov blanket learning algorithms and divide-and-conquer Markov blanket learning algorithms and weighing accuracy against time efficiency. In practice, it sacrifices some time efficiency in exchange for maintaining high accuracy. As a method developed from divide-and-conquer Markov blanket discovery, the balanced algorithm uses only subsets of the candidate feature set as condition sets when performing conditional independence tests, which reduces the number of samples required during learning; but when the set of currently selected features is large, the truth of each feature must be judged subset by subset, generating a large number of conditional independence tests, greatly increasing the computational cost and markedly reducing the time efficiency of feature selection. In terms of its organization, the algorithm continually repeats the processes of finding parent-child features, identifying spouse features, and removing false-positive features. Each time a new parent-child feature X in the candidate feature set is identified, a step is performed to identify the spouse features of the target variable T with respect to X, while simultaneously judging whether false positives have appeared among the newly added parent-child and spouse features.
In this process, each subset of the candidate parent-child feature set and the candidate spouse feature set must repeatedly be taken as a condition set to determine whether a feature is conditionally independent of T. This amplifies the computational cost of the balanced Markov blanket discovery algorithm and further reduces the time efficiency of feature selection, an effect that is more apparent on larger data samples. In summary, feature selection with this algorithm cannot deliver both high accuracy and high time efficiency when screening equipment evaluation data, so neither the evaluation effect nor the evaluation efficiency of equipment assessment can be guaranteed.
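The primitive behind all of these judgments is the conditional independence test. A minimal sketch, assuming discrete data and estimating conditional mutual information against a fixed threshold (practical implementations usually apply a G² or χ² test with a significance level instead; the function names and threshold here are illustrative):

```python
import math
from collections import Counter

def cond_mutual_info(data, x, y, z):
    """Estimate I(X;Y|Z) from a list of dicts of discrete values.
    A value near zero suggests X is independent of Y given Z."""
    n = len(data)
    cz, cxz, cyz, cxyz = Counter(), Counter(), Counter(), Counter()
    for row in data:
        zv = tuple(row[k] for k in z)
        cz[zv] += 1
        cxz[(row[x], zv)] += 1
        cyz[(row[y], zv)] += 1
        cxyz[(row[x], row[y], zv)] += 1
    mi = 0.0
    for (xv, yv, zv), nxyz in cxyz.items():
        mi += (nxyz / n) * math.log(
            (nxyz * cz[zv]) / (cxz[(xv, zv)] * cyz[(yv, zv)])
        )
    return mi

def is_independent(data, x, y, z=(), eps=0.01):
    return cond_mutual_info(data, x, y, list(z)) <= eps

# Toy data in which Z carries all the information linking X and Y.
data = [{"X": i % 2, "Y": i % 2, "Z": i % 2} for i in range(100)]
print(is_independent(data, "X", "Y"))          # False: X, Y dependent
print(is_independent(data, "X", "Y", ("Z",)))  # True: Z screens X off from Y
```

Each conditional independence test is cheap on its own; the cost the patent targets comes from running one such test per subset of a growing condition set.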
Disclosure of Invention
In view of the foregoing, it is desirable to provide a Markov blanket-based equipment evaluation method, apparatus and computer device that can greatly improve the time efficiency of screening while ensuring the accuracy of screening key equipment test data, so as to achieve accurate and real-time equipment assessment.
A Markov blanket-based equipment assessment method, the method comprising:
S1, selecting, and deleting, from the candidate feature set of the current iteration the associated feature with the highest degree of association with the target variable, to obtain a new candidate feature set; wherein the candidate feature set is a feature set that does not include the target variable; the target variable is the class label data of the equipment; the class label data are obtained by converting the data of one designated dimension feature in the test and evaluation data set of the equipment; the data in the test and evaluation data set include features of multiple dimensions;
S2, if no subset of the candidate parent-child feature set of the current iteration exists such that the associated feature is conditionally independent of the target variable, adding the associated feature to the candidate parent-child feature set of the current iteration to obtain a new candidate parent-child feature set, and if no subset of the relative difference set between the new candidate parent-child feature set and each candidate parent-child feature exists such that the corresponding candidate parent-child feature is conditionally independent of the target variable, executing step S3;
S3, if a feature in the relative difference set between the candidate feature set of the initial iteration and the new candidate parent-child feature set is conditionally dependent on the target variable given the union of its separating set and the newly added candidate parent-child feature, and the feature is in the new candidate feature set, moving the feature from the new candidate feature set into the candidate spouse feature set of the target variable with respect to that candidate parent-child feature, to obtain the candidate feature set output by the current iteration and the candidate spouse feature set corresponding to the newly added candidate parent-child feature;
S4, repeating steps S1-S3 until the candidate feature set is empty, to obtain the candidate parent-child feature superset of the target variable and the candidate spouse feature supersets corresponding to the candidate parent-child features;
S5, selecting, and deleting, from the candidate parent-child feature superset of the current iteration the associated parent-child feature with the highest degree of association with the target variable, to obtain a new candidate parent-child feature superset;
S6, if no subset of the union of the candidate spouse feature superset of the initial iteration and the new candidate parent-child feature superset exists such that the associated parent-child feature is conditionally independent of the target variable, adding the associated parent-child feature to the parent-child feature set of the current iteration;
S7, repeating steps S5-S6 until the candidate parent-child feature superset is empty, to obtain the target parent-child feature set;
S8, selecting, and deleting, from the candidate spouse feature superset of the current iteration the associated spouse feature with the highest degree of association with the corresponding target parent-child feature, to obtain the corresponding new candidate spouse feature superset;
S9, if no subset of the union of the candidate parent-child feature superset of the initial iteration and the spouse feature set of the corresponding target parent-child feature in the current iteration exists such that the associated spouse feature is conditionally independent of the target variable, adding the associated spouse feature to the corresponding spouse feature set as a true spouse feature, to obtain a new spouse feature set;
S10, repeating steps S8-S9 until the corresponding candidate spouse feature superset is empty, taking the obtained spouse feature set as a new spouse feature superset, and repeating steps S8-S9 until each corresponding spouse feature superset is empty, to obtain a plurality of target spouse feature sets;
and S11, obtaining the Markov blanket of the target variable from the target parent-child feature set and the target spouse feature sets, evaluating the performance of the equipment according to the test and evaluation data of the dimensions corresponding to the features in the Markov blanket, and processing the equipment accordingly in light of the evaluation result.
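The parent-child half of the growth phase (steps S1-S2, looped by S4) can be sketched as follows, assuming an association score `assoc(x)` and a conditional independence oracle `indep(x, cond_set)` that tests a feature against the target variable; both callbacks and all names are illustrative stand-ins, not APIs defined by the patent:

```python
from itertools import chain, combinations

def subsets(s):
    """All subsets of s, including the empty set."""
    s = list(s)
    return chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))

def grow_parent_child(features, assoc, indep):
    """S1: repeatedly remove the feature most associated with the target.
    S2: admit it to the candidate parent-child set CPC only if no subset
    of CPC renders it conditionally independent of the target."""
    cand, cpc = set(features), []
    while cand:
        x = max(cand, key=assoc)
        cand.remove(x)
        if not any(indep(x, set(z)) for z in subsets(cpc)):
            cpc.append(x)
    return cpc

# Toy oracle: C carries no information beyond A, so {A} separates C from T.
assoc = {"A": 0.9, "B": 0.8, "C": 0.5}.get
indep = lambda x, z: x == "C" and "A" in z
print(grow_parent_child(["A", "B", "C"], assoc, indep))  # ['A', 'B']
```

In the full method the loop would also trigger the spouse search of step S3 after each admission, and the shrink phase of steps S5-S10 would then prune the candidate set against the spouse superset.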
A Markov blanket-based equipment assessment device, the device comprising:
a feature growth module, configured to execute the following steps:
S1, selecting, and deleting, from the candidate feature set of the current iteration the associated feature with the highest degree of association with the target variable, to obtain a new candidate feature set; wherein the candidate feature set is a feature set that does not include the target variable; the target variable is the class label data of the equipment; the class label data are obtained by converting the data of one designated dimension feature in the test and evaluation data set of the equipment; the data in the test and evaluation data set include features of multiple dimensions;
S2, if no subset of the candidate parent-child feature set of the current iteration exists such that the associated feature is conditionally independent of the target variable, adding the associated feature to the candidate parent-child feature set of the current iteration to obtain a new candidate parent-child feature set, and if no subset of the relative difference set between the new candidate parent-child feature set and each candidate parent-child feature exists such that the corresponding candidate parent-child feature is conditionally independent of the target variable, executing step S3;
S3, if a feature in the relative difference set between the candidate feature set of the initial iteration and the new candidate parent-child feature set is conditionally dependent on the target variable given the union of its separating set and the newly added candidate parent-child feature, and the feature is in the new candidate feature set, moving the feature from the new candidate feature set into the candidate spouse feature set of the target variable with respect to that candidate parent-child feature, to obtain the candidate feature set output by the current iteration and the candidate spouse feature set corresponding to the newly added candidate parent-child feature;
S4, repeating steps S1-S3 until the candidate feature set is empty, to obtain the candidate parent-child feature superset of the target variable and the candidate spouse feature supersets corresponding to the candidate parent-child features;
a feature shrink module, configured to execute the following steps:
S5, selecting, and deleting, from the candidate parent-child feature superset of the current iteration the associated parent-child feature with the highest degree of association with the target variable, to obtain a new candidate parent-child feature superset;
S6, if no subset of the union of the candidate spouse feature superset of the initial iteration and the new candidate parent-child feature superset exists such that the associated parent-child feature is conditionally independent of the target variable, adding the associated parent-child feature to the parent-child feature set of the current iteration;
S7, repeating steps S5-S6 until the candidate parent-child feature superset is empty, to obtain the target parent-child feature set;
S8, selecting, and deleting, from the candidate spouse feature superset of the current iteration the associated spouse feature with the highest degree of association with the corresponding target parent-child feature, to obtain the corresponding new candidate spouse feature superset;
S9, if no subset of the union of the candidate parent-child feature superset of the initial iteration and the spouse feature set of the corresponding target parent-child feature in the current iteration exists such that the associated spouse feature is conditionally independent of the target variable, adding the associated spouse feature to the corresponding spouse feature set as a true spouse feature, to obtain a new spouse feature set;
S10, repeating steps S8-S9 until the corresponding candidate spouse feature superset is empty, taking the obtained spouse feature set as a new spouse feature superset, and repeating steps S8-S9 until each corresponding spouse feature superset is empty, to obtain a plurality of target spouse feature sets;
and an equipment performance evaluation module, configured to obtain the Markov blanket of the target variable from the target parent-child feature set and the target spouse feature sets, evaluate the performance of the equipment according to the test and evaluation data of the dimensions corresponding to the features in the Markov blanket, and process the equipment accordingly in light of the evaluation result.
A computer device comprising a memory storing a computer program and a processor that implements the steps of the method described above when executing the computer program.
In the Markov blanket-based equipment evaluation method and device and the computer device above, parent-child features in the candidate feature set are first screened during the iterative process and added to the candidate parent-child feature set; the spouse features of the corresponding nodes are then identified from the parent-child features already found, and the size of the candidate feature set is controlled by alternately finding parent-child features and spouse features of the target variable, yielding the candidate parent-child feature superset and the candidate spouse feature supersets. In the feature shrink phase, the candidate parent-child feature superset and the candidate spouse feature supersets are then pruned. Compared with the prior art, the method reduces the number of subset searches and of repeated conditional independence tests, cuts unnecessary computational cost and improves the discovery efficiency of the Markov blanket without losing computational accuracy; that is, the time efficiency of screening can be greatly improved while the accuracy of screening key equipment test data is ensured, so that equipment assessment can be both accurate and real-time.
Drawings
FIG. 1 is a schematic diagram of equipment test data tagging;
FIG. 2 is a schematic diagram of a separate feature learning and pruning process;
FIG. 3 is an algorithm flow chart of the feature growth phase of the present method;
FIG. 4 is a schematic diagram of a simple Bayesian network feature screening process;
FIG. 5 is an algorithm flow chart of the feature shrink phase;
fig. 6 is an internal structural diagram of a computer device in one embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be further described in detail with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the present application.
In one embodiment, a markov blanket-based equipment assessment method is provided, comprising the steps of:
step 102, selecting and deleting the associated feature with the highest association degree with the target variable from the candidate feature set of the current iteration to obtain a new candidate feature set.
The candidate feature set is a feature set which does not comprise a target variable, the target variable is category label data of equipment, the category label data is obtained by converting data of one dimension feature appointed in test evaluation data set of the equipment, and the data in the test evaluation data set comprise features of multiple dimensions.
When performing equipment test and evaluation feature selection, time-series data must first be converted into class label data; the class label data are then used as the target variable whose local causal structure is to be found, thereby realizing feature selection, i.e., mining key features and removing redundant ones. When the equipment test and evaluation data provide no label information, score-type evaluation data are converted into grade-type evaluation data through a data processing step: the ungraded evaluation data are divided into different grades (such as excellent, good, medium and poor), and the converted grade label data are then used as the target variable for learning the Markov blanket of the grade variable, realizing feature dimensionality reduction. Fig. 1 provides a schematic diagram of equipment test data labeling.
Converting the data of one designated dimension feature in the test and evaluation data set of the equipment into class label data means selecting, according to the equipment performance to be evaluated, the one feature most closely related to that performance from the multi-dimensional feature data and converting its data into class label data. For example, taking rocket engine operating failure rate as the evaluation object, the rocket engine test and evaluation data set comprises 26 attributes, of which the second is the remaining life of the engine, expressed as a remaining cycle count. The remaining life directly influences the failure rate during operation, so the remaining cycle count can be converted into a class variable, and Markov blanket learning can then be performed with this class variable as the target variable.
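As a concrete illustration of this conversion, a remaining-cycle attribute can be binned into grade labels; the boundaries and values below are invented purely for illustration:

```python
import bisect

# Hypothetical remaining-cycle counts for six engines.
remaining_cycles = [210, 150, 95, 60, 30, 5]

# Illustrative grade boundaries: <50 poor, <100 medium, <200 good, else excellent.
bins = [50, 100, 200]
grades = ["poor", "medium", "good", "excellent"]
labels = [grades[bisect.bisect_right(bins, c)] for c in remaining_cycles]
print(labels)  # ['excellent', 'good', 'medium', 'medium', 'poor', 'poor']
```

The resulting `labels` column would then serve as the target variable for the Markov blanket learning described above.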
Step 104, if no subset of the candidate parent-child feature set of the current iteration exists such that the associated feature is conditionally independent of the target variable, adding the associated feature to the candidate parent-child feature set of the current iteration to obtain a new candidate parent-child feature set, and if no subset of the relative difference set between the new candidate parent-child feature set and each candidate parent-child feature exists such that the corresponding candidate parent-child feature is conditionally independent of the target variable, executing step 106.
After the associated feature is selected from the candidate feature set in step 102, it is determined whether it is a candidate parent-child feature and hence whether it needs to be added to the candidate parent-child feature set. If the new associated feature is conditionally independent of the target variable, it is not regarded as a candidate parent-child feature, because a parent-child feature of the target variable is necessarily conditionally dependent on it.
In addition, after the new candidate parent-child feature set is obtained, the candidate parent-child features added in previous iterations must be re-screened: specifically, for each candidate parent-child feature, a subset of the relative difference set between the new candidate parent-child feature set and that feature is sought which renders the feature conditionally independent of the target variable. Because a new candidate parent-child feature is added in the current iteration, the number of elements in each such relative difference set grows, and the number of subsets grows accordingly; therefore, each time the candidate parent-child feature set is expanded, a timely rough feature shrink must be performed to reduce the computational cost of the formal feature shrink phase. It will be appreciated that, because the judgment condition is the same, the newly added associated feature will not itself be deleted by this rough shrink within the current iteration.
It can be seen that, in the present method, when judging whether to add an associated feature, only condition sets drawn from the current candidate parent-child feature set are tested. This makes the subsequent rough feature shrink and the formal feature shrink phase, i.e., checking each candidate parent-child feature in the candidate parent-child superset and deleting erroneous candidates found in the search so as to keep the candidate parent-child feature set as small as possible, more efficient.
Step 106, if a feature in the relative difference set between the candidate feature set of the initial iteration and the new candidate parent-child feature set is conditionally dependent on the target variable given the union of its separating set and the newly added candidate parent-child feature, and the feature is in the new candidate feature set, moving the feature from the new candidate feature set into the candidate spouse feature set of the target variable with respect to that candidate parent-child feature, to obtain the candidate feature set output by the current iteration and the candidate spouse feature set corresponding to the newly added candidate parent-child feature.
The candidate feature set of the initial iteration is the equipment test data feature set U. The relative difference set of U and the target variable T, written U \ {T}, denotes the features that are in U but are not T. The relative difference set between the candidate feature set of the initial iteration and the new candidate parent-child feature set comprises the new candidate feature set obtained in step 102 together with the features that were deleted from the candidate feature set in previous iterations without being added to the candidate parent-child feature set. This relative difference set thus contains all non-candidate parent-child features, among which there may be candidate spouse features of the target variable with respect to the associated feature newly added to the candidate parent-child feature set in step 104; that is, a non-candidate parent-child feature and the target variable may form a V-structure, both pointing at the same associated feature. It will be appreciated that these non-candidate parent-child features may include not only true spouse features but also parents of the true spouse features, since such a parent may point indirectly, through a true spouse feature, at the child feature at which the target variable points directly.
If the union of the separating set of a non-candidate parent-child feature and the newly added candidate parent-child feature renders the non-candidate parent-child feature conditionally dependent on the target variable, it is regarded as a candidate spouse feature of the target variable. 1) If the non-candidate parent-child feature is in the new candidate feature set of the current iteration, it is deleted from that set within the current iteration and added to the spouse feature set of the target variable with respect to the current associated feature; in subsequent iterations its degree of association with the target variable is no longer computed, i.e., it is no longer judged as a possible candidate parent-child feature. 2) If the non-candidate feature is one that was previously deleted without being added to the candidate parent-child feature set, it is added directly to the corresponding spouse feature set; in that case the new candidate feature set of the current iteration is the candidate feature set output by the current iteration, serving as the source set of candidate parent-child features in the next iteration.
In summary, in the present method, once a new feature is admitted to the candidate parent-child feature set, the search for the candidate spouse set of the target variable with respect to that new feature is triggered immediately.
Step 108, repeating steps 102-106 until the candidate feature set is empty, to obtain the candidate parent-child feature superset of the target variable and the candidate spouse feature supersets corresponding to the candidate parent-child features.
"Candidate parent-child feature superset" means that the set contains not only the true parent-child features but also non-child descendant features, i.e., descendant features at which the target variable points indirectly through its true child features. Such non-child descendants are still added to the candidate parent-child feature superset during the feature growth phase, because at that point no tested subset renders them conditionally independent of the target variable; they are removed from the superset as erroneous nodes during the feature shrink phase, where conditioning on the true child features separates them from the target variable.
"Candidate spouse feature superset" means that the set may contain not only true spouse features but also parents of the true spouse features.
In this application, nodes, features, and variables have the same meaning.
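The spouse-discovery trigger of steps 104-106 can be sketched with a hand-coded oracle for the collider pattern S → C ← T: the spouse S is marginally independent of the target T, but conditioning on the shared child C makes them dependent. Every name and oracle here is an illustrative stand-in, not part of the patent:

```python
# Hand-coded CI oracle for a toy collider S -> C <- T, plus a noise feature N.
def marginally_indep_of_target(x):
    return x in {"S", "N"}            # neither S nor N is linked to T directly

def dep_on_target_given(x, y):
    return x == "S" and y == "C"      # only the true spouse couples via child C

def candidate_spouses(non_candidates, new_pc_feature):
    """Step 106 sketch: a non-candidate parent-child feature becomes a
    candidate spouse when it is independent of the target on its own but
    dependent once the newly admitted parent-child feature is conditioned on."""
    return [x for x in non_candidates
            if marginally_indep_of_target(x)
            and dep_on_target_given(x, new_pc_feature)]

print(candidate_spouses(["S", "N"], "C"))  # ['S']
```

The noise feature N stays out of the candidate spouse set because no conditioning set couples it to the target.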
Step 110, selecting, and deleting, from the candidate parent-child feature superset of the current iteration the associated parent-child feature with the highest degree of association with the target variable, to obtain a new candidate parent-child feature superset.
Step 112, if no subset of the union of the candidate spouse feature superset of the initial iteration and the new candidate parent-child feature superset exists such that the associated parent-child feature is conditionally independent of the target variable, adding the associated parent-child feature to the parent-child feature set of the current iteration.
Step 114, repeating steps 110-112 until the candidate parent-child feature superset is empty, to obtain the target parent-child feature set.
And 116, selecting and deleting the associated partner feature with the highest association degree with the corresponding target father-son feature from the candidate partner feature superset of the current iteration to obtain a corresponding new candidate partner feature superset.
Since a spouse in the candidate spouse feature set of the target variable with respect to a given target parent-child feature is a parent feature of that candidate parent-child feature, it is necessary to decide, according to the degree of association, whether the currently most likely spouse feature should be added to the spouse feature set corresponding to the target parent-child feature, so as to obtain the true spouse feature set as early as possible.
It should be noted that, in the feature contraction stage of this scheme, the target parent-child feature set may be obtained first and the candidate spouse feature sets corresponding to the target parent-child features contracted afterwards; alternatively, the contraction of the candidate spouse feature set corresponding to a target parent-child feature may be triggered each time that target parent-child feature is determined.
And 118, if the subset of the union set of the candidate father-son feature superset of the initial iteration and the corresponding partner feature set of the corresponding target father-son feature in the current iteration does not exist, so that the associated partner feature is independent of the target variable condition, adding the associated partner feature as a real partner feature into the corresponding partner feature set to obtain a new partner feature set.
Step 120, repeating steps 116-118 until the corresponding candidate spouse feature superset is an empty set, taking the obtained spouse feature set as a new spouse feature superset, and repeating steps 116-118 until the corresponding spouse feature superset is an empty set, so as to obtain a plurality of target spouse feature sets;
and step 122, obtaining a Markov blanket of the target variable according to the target father-son feature set and the target spouse feature set, evaluating the performance of the equipment according to test evaluation data of corresponding dimensions of features in the Markov blanket, and correspondingly processing the equipment according to an evaluation result.
As shown in fig. 2, a schematic diagram of the independent feature learning and pruning process in the present solution is provided.
According to the Markov blanket-based equipment evaluation method, the parent-child features in the candidate feature set are first screened iteratively and added to the candidate parent-child feature set; the spouse features of the corresponding nodes are then identified from the parent-child features already found. By alternately finding the parent-child features and the spouse features of the target variable, the scale of the candidate feature set is kept under control, yielding the candidate parent-child feature superset and the candidate spouse feature superset. In the feature contraction stage, the candidate parent-child feature superset and the candidate spouse feature superset are then pruned. Compared with the prior art, the method reduces the number of subset searches and repeated conditional independence tests and eliminates unnecessary computational cost, which improves the discovery efficiency of the Markov blanket without losing computational precision. The time efficiency of screening is thus greatly improved while the accuracy of screening key equipment test data is preserved, achieving accurate and real-time equipment evaluation.
In one embodiment, selecting the associated feature with the highest degree of association with the target variable from the candidate feature set of the current iteration comprises:
the associated feature is selected as F = arg max_{X ∈ CFS} dep(X, T; ∅), where F denotes the associated feature, T the target variable, CFS the candidate feature set, and dep(X, T; ∅) the degree of association of a candidate feature X in the candidate feature set with the target variable.
The degree of association is computed by a conditional independence test: the G² test is used for discrete data and the Fisher-Z test for continuous data, the appropriate test being selected according to the data type of the candidate feature set.
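As a sketch of how such a test can be computed, the snippet below implements the Fisher-Z test for continuous data, with a first-order partial-correlation shortcut for a single conditioning variable, plus a helper that selects the most associated candidate by p-value. The function names, the `alpha` threshold, and the one-variable conditioning shortcut are illustrative, not part of the claimed method; a full implementation would condition on arbitrary feature sets (and use the G² test for discrete data).

```python
import math

def _pearson(xs, ys):
    # Plain sample Pearson correlation coefficient.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    return sxy / math.sqrt(sxx * syy)

def fisher_z_pvalue(xs, ys, zs=None):
    """Two-sided p-value of the Fisher-Z independence test.

    zs, if given, is a single conditioning variable handled via the
    first-order partial-correlation formula (a simplification)."""
    n = len(xs)
    if zs is None:
        r, k = _pearson(xs, ys), 0
    else:
        rxy, rxz, ryz = _pearson(xs, ys), _pearson(xs, zs), _pearson(ys, zs)
        r = (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))
        k = 1
    r = max(min(r, 0.999999), -0.999999)       # guard against |r| = 1
    z = 0.5 * math.log((1 + r) / (1 - r))      # Fisher transform
    stat = math.sqrt(n - k - 3) * abs(z)       # ~ N(0, 1) under independence
    phi = 0.5 * (1 + math.erf(stat / math.sqrt(2)))
    return 2 * (1 - phi)

def most_associated(target, candidates, data, alpha=0.01):
    """Return the candidate most associated with the target (smallest
    p-value); candidates with p-value above alpha count as independent."""
    scored = [(fisher_z_pvalue(data[c], data[target]), c) for c in candidates]
    dependent = [(p, c) for p, c in scored if p <= alpha]
    return min(dependent)[1] if dependent else None
```

Here a smaller p-value is read as a higher degree of association, a common convention in constraint-based Markov blanket learners.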
In one embodiment, in step 104, further comprising:
if, for a candidate parent-child feature in the new candidate parent-child feature set, there exists a subset of the relative difference set of the new candidate parent-child feature set and that feature such that the feature is conditionally independent of the target variable, the feature is deleted from the new candidate parent-child feature set to obtain an updated candidate parent-child feature set, and the candidate spouse feature set of the target variable with respect to that feature is cleared at the same time.
It can be seen that in this scheme, each time an associated feature is successfully added to the candidate parent-child feature set, a rough feature contraction of the new candidate parent-child feature set for the current iteration and an increase of candidate partner features corresponding to the associated feature are triggered.
In one embodiment, in step 104, further comprising:
if the subset of the candidate parent-child feature set of the current iteration exists so that the associated feature and the target variable condition are independent, the associated feature is not added into the candidate parent-child feature set of the current iteration, the current iteration is ended, meanwhile, the subset of the candidate parent-child feature set of the current iteration is used as a separation set of the associated feature, and the associated feature is added into the independent set of the target variable current iteration to obtain a new independent set.
Here, an independent feature in the independent set of the target variable is a feature of the candidate feature set that is independent of the target variable with respect to the empty set, i.e. an associated feature that was deleted from the candidate feature set in some iteration but not added to the candidate parent-child feature set. As follows from the above, an independent feature may still be a spouse feature of the target variable with respect to one or more candidate parent-child features.
After the current iteration is finished, the next iteration is started, namely, the associated feature with the highest association degree with the target variable is selected and deleted from the new candidate feature set of the current iteration, and the new candidate feature set is obtained again.
In one embodiment, in step 108, further comprising:
When the associated feature of the current iteration is the candidate parent-child feature last added to the candidate parent-child feature set, then for each independent feature in the current independent set: if conditioning on the union of its separation set and the last candidate parent-child feature does not leave it independent of the target variable, the independent feature is added to the candidate spouse feature set corresponding to the last candidate parent-child feature.
When the associated feature of the current iteration is the last candidate parent-child feature added to the candidate parent-child feature set, the candidate parent-child feature set is final and no new candidate parent-child features will be added; at that point, at most non-candidate parent-child features remain in the current candidate feature set, and these are added to the independent set. It therefore only remains to judge whether a candidate spouse feature corresponding to the last candidate parent-child feature exists in the independent set.
As shown in fig. 3, an algorithm flow chart of the feature growth phase of the present method is provided.
To this end, the feature growth phase can be summarized as follows:
The feature set is defined as U and the designated target node as T. First, the degree of dependence on T given the empty set is computed for every feature in U except T, and the features independent of T are stored in an independence set. The following steps are then performed until the candidate feature set is empty.
Search for T's candidate parent-child feature set CPC(T) and candidate spouse feature set SP(T). First select the feature X most associated with T in the candidate feature set, then delete X from it. If there exists a subset Z ⊆ CPC(T) such that X and T are conditionally independent given Z, the next feature in the candidate feature set is considered directly and the next step is not performed. Otherwise, X is added to CPC(T). Since the new feature X has been successfully added to CPC(T), the next stage is to check CPC(T) and delete erroneous nodes during the search, thereby keeping the size of CPC(T) as small as possible. Furthermore, when deleting features from CPC(T) that are independent of T in this step, only condition sets containing the newly added feature X are tested, to keep the search process efficient.
At the same time, once the new feature X is added to CPC(T), the search for T's candidate spouses with respect to X is triggered immediately. In a faithful Bayesian network, if Y is a spouse of T with common child X, then T, X and Y form the V-structure T → X ← Y if and only if T and Y are independent while T and Y are dependent given X. Thus this step does not look for T's spouses directly in the candidate feature set, but finds T's spouses in the independence set. If a feature Y in the independence set becomes conditionally dependent on T when X is added to Y's separation set, Y is regarded as a candidate spouse of T and added to SP(T, X), the candidate spouse set of T with respect to X; finally, all the sets SP(T, X) are combined into T's candidate spouse feature set.
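The growth phase described above can be sketched as follows; here a d-separation oracle `indep` stands in for a statistical conditional independence test, all names (`grow_phase`, `assoc`, `indep`) are illustrative, and the contraction-while-growing step is simplified relative to the full method.

```python
from itertools import combinations

def grow_phase(features, assoc, indep):
    """Growth phase: build the candidate parent-child superset CPC and, per
    candidate X, a candidate spouse set SP[X] for an implicit target T.

    assoc[f]    -- degree of association of feature f with the target
    indep(f, Z) -- oracle: True iff f is (conditionally) independent of the
                   target given the feature set Z
    """
    # Features independent of the target given the empty set start in the
    # independence set, each with an empty separation set.
    sepset = {f: frozenset() for f in features if indep(f, frozenset())}
    cfs = [f for f in features if f not in sepset]
    cpc, sp = [], {}
    while cfs:
        x = max(cfs, key=assoc.get)            # most associated feature
        cfs.remove(x)
        # If some subset of the current CPC separates x from the target,
        # record it as x's separation set and move on.
        sep = next((frozenset(s)
                    for r in range(len(cpc) + 1)
                    for s in combinations(cpc, r)
                    if indep(x, frozenset(s))), None)
        if sep is not None:
            sepset[x] = sep
            continue
        cpc.append(x)
        sp[x] = set()
        # Rough contraction: retest earlier candidates, conditioning only on
        # sets that contain the newly added x.
        for w in [f for f in cpc if f != x]:
            rest = [f for f in cpc if f not in (w, x)]
            if any(indep(w, frozenset(s) | {x})
                   for r in range(len(rest) + 1)
                   for s in combinations(rest, r)):
                cpc.remove(w)
                sp.pop(w, None)
        # Spouse search: an independence-set feature that turns dependent
        # once x joins its separation set is a candidate spouse w.r.t. x.
        for y, sy in sepset.items():
            if not indep(y, sy | {x}):
                sp[x].add(y)
    return cpc, sp
```

On a toy network with child C, parent P, and spouse S (S a parent of C), the oracle only declares S independent of the target when C is absent from the condition set; the sketch then recovers CPC = {C, P} with S as a candidate spouse with respect to C.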
Figure 4 provides a schematic illustration of a simple bayesian network feature screening process.
However, as shown in fig. 4, erroneous nodes can still be added to CPC(T) and SP(T). Therefore, after the growth phase has finished executing, these erroneous nodes must also be deleted from CPC(T) and SP(T).
In one embodiment, in step 116, selecting the associated partner feature from the superset of candidate partner features of the current iteration that has the highest degree of association with the corresponding target parent-child feature, comprising:
the associated spouse feature is selected as Y = arg max_{Z ∈ SP(T, X)} dep(Z, X; Sep(Z)), where Y denotes the associated spouse feature, SP(T, X) the candidate spouse feature set of the target variable with respect to the associated feature X, Sep(Z) the separation set of a candidate spouse feature Z, and T the target variable.
In one embodiment, in step 118, if there exists a subset of the union of the candidate parent-child feature superset of the initial iteration and the corresponding spouse feature set in the current iteration such that the associated spouse feature is conditionally independent of the target variable, the associated spouse feature is not added to the corresponding spouse feature set, and the current iteration ends.
As shown in fig. 5, an algorithm flow chart of the feature shrink phase is provided.
The flow of the feature shrink phase can be summarized as follows:
step 1, fromAnd deleting the fake partner feature. In this step, < > use +.>Is taken as a condition set from->And the false partner deleted. At the beginning of this step +.>About->Is->Set to null and then select at +.>Has the formula->The feature with highest association degree is added to +.>. At the same time, the feature will be from->And deleted. Due to->The partner in (a) is->Thus, need to be +.>Is added to +.>In order to delete spurious spouses as soon as possible. Then, for->Each feature of->If at the set->There is a subset let->The conditions are independent of->Will->From the slaveIs removed. This process will be repeated until +.>Is empty.
Step 2: delete false parent-child features from the candidate parent-child nodes of T. Step 2 uses the same strategy as step 1 and likewise does not delete erroneous nodes from CPC(T) directly. Instead, the confirmed parent-child feature set PC(T) is initially empty, and the feature in CPC(T) with the highest degree of association with T is added to PC(T). Then, for each feature X in CPC(T): if there exists a subset of SP(T) ∪ PC(T) such that X is conditionally independent of T, X is removed from CPC(T), and at the same time SP(T, X), the candidate spouse set of T with respect to X, is emptied directly. This process is repeated until the candidate parent-child feature set CPC(T) is empty.
Further, even if a feature processed in step 1 has no spouse, step 2 will still be performed. For example, as shown in fig. 4, a non-child descendant node of T may have been added to CPC(T) during the growth phase although its candidate spouse set is empty; in the contraction phase, that non-child descendant node can be deleted from CPC(T) by conditioning on a suitable subset.
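The two pruning steps above can be sketched as follows; `indep` is again a d-separation oracle standing in for a CI test, the names are illustrative, and the selection-by-association loop is simplified relative to the full method.

```python
from itertools import chain, combinations

def _subsets(pool):
    pool = list(pool)
    return chain.from_iterable(combinations(pool, r)
                               for r in range(len(pool) + 1))

def shrink_phase(cpc, sp, assoc, indep):
    """Contraction phase over the supersets produced by the growth phase.

    Step 1: per candidate parent-child feature x, keep a spouse y only if no
    subset of CPC plus the confirmed spouses, always joined with {x},
    separates y from the target.
    Step 2: keep a parent-child candidate x only if no subset of the spouse
    features plus the already-confirmed parent-child features separates it;
    a deleted candidate has its spouse set emptied directly.
    """
    true_sp = {x: set() for x in cpc}
    for x in cpc:
        cand = set(sp.get(x, ()))
        while cand:
            y = max(cand, key=assoc.get)       # most associated spouse first
            cand.remove(y)
            pool = (set(cpc) | true_sp[x]) - {x}
            if all(not indep(y, set(s) | {x}) for s in _subsets(pool)):
                true_sp[x].add(y)              # no separating subset: keep y
    true_pc, remaining = [], list(cpc)
    all_sp = set().union(*true_sp.values())
    while remaining:
        x = max(remaining, key=assoc.get)
        remaining.remove(x)
        if all(not indep(x, set(s)) for s in _subsets(all_sp | set(true_pc))):
            true_pc.append(x)                  # survives every condition set
        else:
            true_sp.pop(x, None)               # false node: clear its spouses
    return true_pc, {x: true_sp[x] for x in true_pc}
```

On the toy network extended with a non-child descendant D (reached through child C) and a false spouse candidate G (a parent of the true spouse S), the sketch removes G by conditioning on {C, S} and removes D by conditioning on {C}.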
In the growth phase, the feature addition process can find the target node T's true Markov blanket. All features conditionally dependent on T are added to the candidate parent-child feature set CPC(T). For any two distinct features X and Y, if there is an edge between X and Y, then for every subset Z it holds that X and Y are dependent given Z; hence some false-positive parent-child features can be removed from CPC(T) by conditioning on subsets within CPC(T). A true parent-child feature remains dependent on T given any condition set, so CPC(T) contains all of T's true parent-child features. While searching for CPC(T), T's candidate spouses are identified at the same time, but this process does not affect the discovery of CPC(T). If a feature X in CPC(T) is a collider forming a V-structure, a feature in the independence set that becomes dependent on T given X is regarded as a candidate spouse of T. Since an exhaustive search is performed, no true-positive spouse among the features that are conditionally independent of T given the empty set or some false parent-child nodes will be missed, so all true-positive spouse features are found. Thus CPC(T) and SP(T) together contain all of T's true parent-child features and spouse features.
Thus, there are two types of false positives in the candidate sets: 1) non-child descendant nodes of T; 2) parents of T's candidate spouses. Since some nodes in CPC(T) and some nodes in SP(T) can directly constitute separating sets, the parents of spouse nodes can be deleted directly from T's candidate spouse subsets. A true spouse will not be deleted, because the condition set always contains the common child node of T and the true spouse node. After the spouse-pruning step, only T's true spouse features remain. The confirmed subsets of CPC(T), combined with the corresponding subsets of true spouse features, form the candidate set of the target node T's Markov blanket; the parent-child pruning step then deletes the non-child descendant nodes. Since a true parent-child feature always depends on T under any given subset as condition set, only T's true parent-child features remain after this step. After the screening of all features in the candidate feature set has been executed, T's Markov blanket is obtained.
In the contraction phase, the false-positive features found in the growth phase must be removed from CPC(T) and SP(T). Two types of false positives are found among the Markov blanket candidates: 1) non-child descendant nodes of T; 2) parents of T's candidate spouses. In the candidate feature set of the Markov blanket, a conditional independence relation with T given some parent-child and spouse nodes directly indicates a false Markov blanket feature node. Exhaustive subset search and conditional independence testing then ensure that no false-positive spouses remain. The confirmed subset of CPC(T), together with the corresponding true spouse features, forms T's complete and true Markov blanket, because CPC(T) contains all true parent-child nodes; meanwhile, since in a faithful Bayesian network T is conditionally independent of all remaining features given its Markov blanket MB(T), deleting the false positives leaves only T's true parent-child features. Thus, after the contraction phase, the parent-child set and the spouse set together contain all and only the true parent-child and spouse features of T. In other words, they comprise all and only the true Markov blanket nodes.
The scheme removes false-positive features in the contraction phase by performing subset search over CPC(T) and SP(T) to discriminate the false positives; when the subset search is performed at this stage, however, CPC(T) and SP(T) are the complete candidate parent-child superset and candidate spouse superset obtained after the growth phase has finished, rather than partial sets over which subset search and feature recognition are performed repeatedly. This reduces the number of subset searches and repeated conditional independence tests and eliminates unnecessary computational cost, improving the discovery efficiency of the Markov blanket without losing the computational precision of the algorithm.
The computational complexity of a constraint-based Markov blanket discovery algorithm depends on the number of conditional independence tests performed. ECMB first orders the features by their degree of association with T and then performs an exhaustive subset search over the currently selected candidate parent-child feature set in each iteration. The complexity of the growth phase (algorithm two) and of the two pruning steps of the contraction phase (algorithm three) together determine the overall computational complexity of ECMB.
Table 1 summarizes the computational complexity of several typical constraint-based markov blanket discovery algorithms.
Table 1 computational complexity of constraint-based MB discovery algorithm
According to Table 1, ECMB (the present method) has the same overall computational complexity as BAMB, intermediate between the simultaneous and divide-and-conquer algorithms, but the computational complexity of the loop over the candidate feature set is lower for ECMB than for BAMB. Compared with BAMB, ECMB therefore reduces the computational complexity and further improves the running efficiency of the algorithm, greatly improving the time efficiency of screening while preserving the accuracy of screening key equipment test data, thereby achieving accurate and real-time equipment evaluation.
The validity of the present protocol was verified experimentally as follows:
the dataset consists of a plurality of multivariate time series recording parameters during operation of the rocket engine for assessing the remaining life of the engine. Each data set is further divided into a training subset and a testing subset. The data set contains 26 nodes, and the sample size is the sum of the time sequences of all engines in the engine queue. The engine operates normally at the beginning of each time sequence and fails at some point in the sequence. In the training set, the magnitude of the fault is getting larger and larger until the system fails. In the test set, the time series ends some time before the system fails.
The rocket launch dataset comprises four sets of engine operating data over time series; the algorithm is used to learn the Markov blanket of each one by one, and the size of the learned Markov blanket is compared while the running time of the algorithm and the number of CI tests are recorded. Before learning a Markov blanket on the dataset, note that the actual dataset, being an equipment trial evaluation dataset, has no label or class variable, so its Markov blanket cannot be learned directly with the evaluation result as the target variable. Before performing experiments on the actual dataset, the evaluation result data must first be converted into category label data, which then serves as the target variable for Markov blanket learning. Specifically, the dataset contains 26 nodes, i.e., it has 26 attributes, as listed in Table 3. The second attribute is the remaining life of the rocket engine in terms of the number of remaining cycles. As shown in Table 5, the remaining life of the engine directly affects the failure rate during starting operation; the number of remaining cycles of the rocket engine can therefore be converted into a class variable, Markov blanket learning can be performed with this class variable as the target variable, and the key features of the rocket launch dataset can be mined.
Meanwhile, since the reference Bayesian datasets are discrete data, the conditional independence test adopted in those experiments is the G² test; the actual dataset is continuous data, so the experimental section on actual data adopts the Fisher-Z test, while the algorithms themselves are unchanged.
Table 2 engine remaining life data example
The remaining life data of the engine are classified according to the engine remaining-life classification labels of Table 4, converting the remaining life data in units of remaining cycles (exemplified in Table 2) into category data.
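A minimal sketch of this conversion, assuming illustrative life-grade bins (the actual thresholds of Table 4 are not reproduced in this text) and a hypothetical sensor field name:

```python
# Illustrative life-grade bins (upper bound in remaining cycles, label);
# the real thresholds would come from Table 4.
LIFE_BINS = [(50, "critical"), (125, "degraded"), (200, "worn")]

def life_grade(remaining_cycles):
    """Map a remaining-cycle count to a categorical life-grade label."""
    for upper, label in LIFE_BINS:
        if remaining_cycles <= upper:
            return label
    return "healthy"

def add_grade_column(rows, cycle_key="remaining_cycles", grade_key="life_grade"):
    """Replace the numeric remaining-life column of each record with a class
    label, so the label can serve as the target variable for Markov blanket
    learning."""
    out = []
    for row in rows:
        row = dict(row)                      # do not mutate the caller's data
        row[grade_key] = life_grade(row.pop(cycle_key))
        out.append(row)
    return out
```

The resulting `life_grade` column plays the role of the class variable against which the Markov blanket is learned.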
Table 3 26 attributes of rocket launch dataset
TABLE 4 Engine remaining life Classification Label
Table 5 engine remaining life level data example
According to Table 5, the unlabeled rocket engine operation data are converted into engine operation data with classification labels; taking the engine's remaining-life grade as the target variable, the Markov blanket of this class variable is learned and the key features of the rocket engine operation dataset are mined. Six Markov blanket learning algorithms of different classes are also used to learn the Markov blanket of the target node on the dataset, and each method is evaluated.
Table 6 learning markov carpet size on actual dataset
Table 7 number of CI tests performed to learn markov carpets on actual dataset
Table 8 algorithm run time for learning markov carpets on actual dataset
According to Tables 6, 7 and 8, the running speed of ECMB improves on that of BAMB by 43%, 69%, 88% and 65% on the four datasets, respectively. This section uses the size of the learned Markov blanket (i.e. the number of features in the blanket) as the criterion for comparing algorithm accuracy; the change of ECMB relative to BAMB is 0%, -23.5%, 0% and -31.8%, respectively, showing that ECMB maintains computational accuracy at least equivalent to that of BAMB on a general dataset. At the same time, since the actual dataset has no reference Bayesian network, i.e. no standard Markov blanket to serve as a measurement criterion, the Markov blankets learned on the actual dataset contain some false-positive features. Even where the computed accuracy is therefore not fully reliable, a simple comparison of running efficiency shows that the improvement of ECMB over BAMB remains reliable on the actual dataset.
It should be understood that, although the steps in the flowchart of fig. 1 are shown in the sequence indicated by the arrows, they are not necessarily performed in that sequence. Unless explicitly stated herein, the order of execution of these steps is not strictly limited, and the steps may be executed in other orders. Moreover, at least some of the steps in fig. 1 may include multiple sub-steps or stages, which are not necessarily completed at the same moment but may be performed at different moments; nor must these sub-steps or stages be performed sequentially, as they may be performed in turn or alternately with at least a portion of the sub-steps or stages of other steps.
In one embodiment, there is provided a markov blanket-based equipment assessment device comprising:
the feature growth module is used for executing the following steps:
s1, selecting and deleting the associated feature with the highest association degree with the target variable from the candidate feature set of the current iteration to obtain a new candidate feature set; wherein the candidate feature set is a feature set that does not include the target variable; the target variable is class label data of equipment; the category label data are obtained by converting data of one dimension characteristic appointed in the test evaluation data set of the equipment; the data in the trial evaluation dataset includes features of multiple dimensions;
s2, if the subset of the candidate father-son feature set of the current iteration does not exist so that the associated feature is independent of the target variable condition, adding the associated feature to the candidate father-son feature set of the current iteration to obtain a new candidate father-son feature set, and if the subset of the relative difference set of the new candidate father-son feature set and the candidate father-son feature set is not exist so that the corresponding candidate father-son feature is independent of the target variable condition, executing step S3;
s3, if the features in the relative difference set of the candidate feature set of the initial iteration and the new candidate father-son feature set are dependent on the target variable condition when the feature set is given as well as the newly added candidate father-son feature set, and the features are in the new candidate feature set, moving the features from the new candidate feature set to the candidate partner feature set of the target variable about the candidate father-son feature set, and obtaining the candidate feature set output by the current iteration and the candidate partner feature set corresponding to the newly added candidate father-son feature set;
S4, repeating the steps S1-S3 until the candidate feature set is an empty set, and obtaining a candidate father-son feature superset of the target variable and candidate spouse feature supersets corresponding to the candidate father-son features;
the characteristic contraction module is used for executing the following steps:
s5, selecting and deleting the associated parent-child feature with the highest association degree with the target variable from the candidate parent-child feature superset of the current iteration to obtain a new candidate parent-child feature superset;
s6, if the subset of the union set of the candidate partner characteristic superset of the initial iteration and the new candidate father-son characteristic superset does not exist, so that the associated father-son characteristic is independent of the target variable condition, adding the associated father-son characteristic into the father-son characteristic set of the current iteration;
s7, repeating the steps S5-S6 until the candidate parent-child feature superset is an empty set, and obtaining a target parent-child feature set;
s8, selecting and deleting the associated partner feature with the highest association degree with the corresponding target father-son feature from the candidate partner feature superset of the current iteration to obtain a corresponding new candidate partner feature superset;
s9, if the subset of the union set of the candidate father-son feature superset of the initial iteration and the corresponding partner feature set of the corresponding target father-son feature in the current iteration does not exist, so that the associated partner feature is independent of the target variable condition, the associated partner feature is added into the corresponding partner feature set as a real partner feature, and a new partner feature set is obtained;
S10, repeating the steps S8-S9 until the corresponding candidate spouse feature superset is an empty set, taking the obtained spouse feature set as a new spouse feature superset, and repeating the steps S8-S9 until the corresponding spouse feature superset is an empty set, so as to obtain a plurality of target spouse feature sets;
and the equipment performance evaluation module is used for obtaining the Markov blanket of the target variable according to the target father-son feature set and the target spouse feature set, evaluating the performance of equipment according to test evaluation data of feature corresponding dimensions in the Markov blanket, and correspondingly processing the equipment according to an evaluation result.
The specific definition of the markov carpet-based equipment assessment device may be found in the above description of the markov carpet-based equipment assessment method, which is not described in detail herein. The various modules in the markov blanket-based equipment assessment device described above may be implemented in whole or in part by software, hardware, and combinations thereof. The above modules may be embedded in hardware or may be independent of a processor in the computer device, or may be stored in software in a memory in the computer device, so that the processor may call and execute operations corresponding to the above modules.
In one embodiment, a computer device is provided, which may be a server, the internal structure of which may be as shown in fig. 6. The computer device includes a processor, a memory, a network interface, and a database connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer programs, and a database. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage media. The database of the computer device is for storing equipment test evaluation data. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a markov carpet-based equipment assessment method.
It will be appreciated by those skilled in the art that the structure shown in fig. 6 is merely a block diagram of some of the structures associated with the present application and is not limiting of the computer device to which the present application may be applied, and that a particular computer device may include more or fewer components than shown, or may combine certain components, or have a different arrangement of components.
In an embodiment a computer device is provided comprising a memory storing a computer program and a processor implementing the steps of the method of the above embodiments when the computer program is executed.
Those skilled in the art will appreciate that implementing all or part of the above described methods may be accomplished by way of a computer program stored on a non-transitory computer readable storage medium, which when executed, may comprise the steps of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the various embodiments provided herein may include non-volatile and/or volatile memory. The nonvolatile memory can include Read Only Memory (ROM), programmable ROM (PROM), electrically Programmable ROM (EPROM), electrically Erasable Programmable ROM (EEPROM), or flash memory. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as Static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double Data Rate SDRAM (DDRSDRAM), enhanced SDRAM (ESDRAM), synchronous Link DRAM (SLDRAM), memory bus direct RAM (RDRAM), direct memory bus dynamic RAM (DRDRAM), and memory bus dynamic RAM (RDRAM), among others.
The technical features of the above embodiments may be arbitrarily combined, and all possible combinations of the technical features in the above embodiments are not described for brevity of description, however, as long as there is no contradiction between the combinations of the technical features, they should be considered as the scope of the description.
The above examples represent only a few embodiments of the present application; although they are described in some detail, they are not to be construed as limiting the scope of the invention. It should be noted that those skilled in the art could make various modifications and improvements without departing from the spirit of the present application, all of which fall within the scope of protection of the present application. Accordingly, the scope of protection of the present application shall be determined by the appended claims.

Claims (8)

1. A Markov blanket-based equipment assessment method, the method comprising:
S1, selecting and deleting, from the candidate feature set of the current iteration, the associated feature with the highest degree of association with the target variable, to obtain a new candidate feature set; wherein the candidate feature set is a feature set that does not include the target variable; the target variable is class label data of the equipment; the class label data are obtained by converting the data of one designated dimension feature in the test evaluation dataset of the equipment; and the data in the test evaluation dataset comprise features of multiple dimensions;
S2, if no subset of the candidate parent-child feature set of the current iteration exists that makes the associated feature conditionally independent of the target variable, adding the associated feature to the candidate parent-child feature set of the current iteration to obtain a new candidate parent-child feature set; and if no subset of the relative difference set between the new candidate parent-child feature set and the candidate parent-child feature set exists that makes the corresponding candidate parent-child feature conditionally independent of the target variable, executing step S3;
wherein step S2 further comprises: if a subset of the candidate parent-child feature set of the current iteration exists that makes the associated feature conditionally independent of the target variable, not adding the associated feature to the candidate parent-child feature set of the current iteration and ending the current iteration; meanwhile, taking that subset of the candidate parent-child feature set as the separating set of the associated feature, and adding the associated feature to the independent set of the target variable for the current iteration to obtain a new independent set; wherein the independent set of the target variable consists of the features of the candidate feature set that are independent of the target variable conditioned on the empty set;
S3, if a feature in the relative difference set between the candidate feature set of the initial iteration and the new candidate parent-child feature set is conditionally dependent on the target variable given the union of that feature's separating set and the newly added candidate parent-child feature, and the feature is in the new candidate feature set, moving the feature from the new candidate feature set to the candidate spouse feature set of the target variable with respect to that candidate parent-child feature, to obtain the candidate feature set output by the current iteration and the candidate spouse feature set corresponding to the newly added candidate parent-child feature;
S4, repeating steps S1 to S3 until the candidate feature set is empty, to obtain a candidate parent-child feature superset of the target variable and the candidate spouse feature supersets corresponding to the candidate parent-child features;
S5, selecting and deleting, from the candidate parent-child feature superset of the current iteration, the associated parent-child feature with the highest degree of association with the target variable, to obtain a new candidate parent-child feature superset;
S6, if no subset of the union of the candidate spouse feature superset of the initial iteration and the new candidate parent-child feature superset exists that makes the associated parent-child feature conditionally independent of the target variable, adding the associated parent-child feature to the parent-child feature set of the current iteration;
S7, repeating steps S5 to S6 until the candidate parent-child feature superset is empty, to obtain the target parent-child feature set;
S8, selecting and deleting, from the candidate spouse feature superset of the current iteration, the associated spouse feature with the highest degree of association with the corresponding target parent-child feature, to obtain the corresponding new candidate spouse feature superset;
S9, if no subset of the union of the candidate parent-child feature superset of the initial iteration and the spouse feature set of the corresponding target parent-child feature in the current iteration exists that makes the associated spouse feature conditionally independent of the target variable, adding the associated spouse feature to the corresponding spouse feature set as a true spouse feature, to obtain a new spouse feature set;
S10, repeating steps S8 to S9 until the corresponding candidate spouse feature superset is empty; then taking the obtained spouse feature set as a new candidate spouse feature superset and repeating steps S8 to S9 until that superset is empty, to obtain a plurality of target spouse feature sets;
and S11, obtaining the Markov blanket of the target variable according to the target parent-child feature set and the target spouse feature sets, evaluating the performance of the equipment according to the test evaluation data of the dimensions corresponding to the features in the Markov blanket, and processing the equipment accordingly based on the evaluation result.
2. The method according to claim 1, wherein step S2 further comprises:
if a subset of the relative difference set between the new candidate parent-child feature set and the candidate parent-child feature set exists that makes the corresponding candidate parent-child feature conditionally independent of the target variable, deleting the corresponding candidate parent-child feature from the new candidate parent-child feature set to obtain an updated candidate parent-child feature set, and simultaneously emptying the candidate spouse feature set of the target variable with respect to the corresponding candidate parent-child feature.
3. The method according to claim 1, wherein step S4 further comprises:
when the associated feature of the current iteration is the candidate parent-child feature last added to the candidate parent-child feature set, if a feature in the current independent set is conditionally dependent on the target variable given the union of its separating set and the last candidate parent-child feature, adding the corresponding feature to the candidate spouse feature set corresponding to the last candidate parent-child feature.
4. The method according to claim 1, wherein in step S1, selecting the associated feature with the highest degree of association with the target variable from the candidate feature set of the current iteration comprises:
F = argmax_{X ∈ CF} Dep(X, T),
wherein F represents the associated feature, T represents the target variable, CF represents the candidate feature set, and Dep(X, T) represents the degree of association of a candidate feature X in the candidate feature set with the target variable.
5. The method according to claim 4, wherein in step S8, selecting the associated spouse feature with the highest degree of association with the corresponding target parent-child feature from the candidate spouse feature superset of the current iteration comprises:
SP = argmax_{Y ∈ CSP} Dep(Y, F),
wherein SP represents the associated spouse feature, CSP represents the candidate spouse feature set of the target variable with respect to the associated feature, and Dep(Y, F) represents the degree of association of a candidate spouse feature Y with the corresponding target parent-child feature F.
6. The method according to claim 1, wherein step S9 further comprises:
if a subset of the union of the candidate parent-child feature superset of the initial iteration and the spouse feature set of the corresponding target parent-child feature in the current iteration exists that makes the associated spouse feature conditionally independent of the target variable, not adding the associated spouse feature to the corresponding spouse feature set, and ending the current iteration.
7. A Markov blanket-based equipment assessment device, the device comprising:
the feature growth module is used for executing the following steps:
S1, selecting and deleting, from the candidate feature set of the current iteration, the associated feature with the highest degree of association with the target variable, to obtain a new candidate feature set; wherein the candidate feature set is a feature set that does not include the target variable; the target variable is class label data of the equipment; the class label data are obtained by converting the data of one designated dimension feature in the test evaluation dataset of the equipment; and the data in the test evaluation dataset comprise features of multiple dimensions;
S2, if no subset of the candidate parent-child feature set of the current iteration exists that makes the associated feature conditionally independent of the target variable, adding the associated feature to the candidate parent-child feature set of the current iteration to obtain a new candidate parent-child feature set; and if no subset of the relative difference set between the new candidate parent-child feature set and the candidate parent-child feature set exists that makes the corresponding candidate parent-child feature conditionally independent of the target variable, executing step S3;
wherein step S2 further comprises: if a subset of the candidate parent-child feature set of the current iteration exists that makes the associated feature conditionally independent of the target variable, not adding the associated feature to the candidate parent-child feature set of the current iteration and ending the current iteration; meanwhile, taking that subset of the candidate parent-child feature set as the separating set of the associated feature, and adding the associated feature to the independent set of the target variable for the current iteration to obtain a new independent set;
wherein the independent set of the target variable consists of the features of the candidate feature set that are independent of the target variable conditioned on the empty set;
S3, if a feature in the relative difference set between the candidate feature set of the initial iteration and the new candidate parent-child feature set is conditionally dependent on the target variable given the union of that feature's separating set and the newly added candidate parent-child feature, and the feature is in the new candidate feature set, moving the feature from the new candidate feature set to the candidate spouse feature set of the target variable with respect to that candidate parent-child feature, to obtain the candidate feature set output by the current iteration and the candidate spouse feature set corresponding to the newly added candidate parent-child feature;
S4, repeating steps S1 to S3 until the candidate feature set is empty, to obtain a candidate parent-child feature superset of the target variable and the candidate spouse feature supersets corresponding to the candidate parent-child features;
the feature shrinking module is used for executing the following steps:
S5, selecting and deleting, from the candidate parent-child feature superset of the current iteration, the associated parent-child feature with the highest degree of association with the target variable, to obtain a new candidate parent-child feature superset;
S6, if no subset of the union of the candidate spouse feature superset of the initial iteration and the new candidate parent-child feature superset exists that makes the associated parent-child feature conditionally independent of the target variable, adding the associated parent-child feature to the parent-child feature set of the current iteration;
S7, repeating steps S5 to S6 until the candidate parent-child feature superset is empty, to obtain the target parent-child feature set;
S8, selecting and deleting, from the candidate spouse feature superset of the current iteration, the associated spouse feature with the highest degree of association with the corresponding target parent-child feature, to obtain the corresponding new candidate spouse feature superset;
S9, if no subset of the union of the candidate parent-child feature superset of the initial iteration and the spouse feature set of the corresponding target parent-child feature in the current iteration exists that makes the associated spouse feature conditionally independent of the target variable, adding the associated spouse feature to the corresponding spouse feature set as a true spouse feature, to obtain a new spouse feature set;
S10, repeating steps S8 to S9 until the corresponding candidate spouse feature superset is empty; then taking the obtained spouse feature set as a new candidate spouse feature superset and repeating steps S8 to S9 until that superset is empty, to obtain a plurality of target spouse feature sets;
and the equipment performance evaluation module is used for obtaining the Markov blanket of the target variable according to the target parent-child feature set and the target spouse feature sets, evaluating the performance of the equipment according to the test evaluation data of the dimensions corresponding to the features in the Markov blanket, and processing the equipment accordingly based on the evaluation result.
8. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method of any one of claims 1 to 6 when executing the computer program.
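The grow-and-shrink procedure recited in steps S1-S7 can be illustrated with a simplified, IAMB-style sketch in Python. This is not the patent's exact method: the conditional-mutual-information score, the fixed threshold `alpha`, and all function names are illustrative assumptions, and the separating-set bookkeeping and spouse-recovery steps (S8-S10) are omitted for brevity. The sketch only shows the core idea of greedily growing a candidate parent-child set and then shrinking away false positives via conditional-independence tests.

```python
import numpy as np
from collections import Counter

def mi(xs, ys):
    """Empirical mutual information I(X; Y) in nats between two discrete samples."""
    n = len(xs)
    cx, cy, cxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(c / n * np.log((c / n) / ((cx[a] / n) * (cy[b] / n)))
               for (a, b), c in cxy.items())

def cmi(data, x, y, zs):
    """Empirical conditional mutual information I(X; Y | Z), Z = columns in `zs`."""
    if not zs:
        return mi(data[:, x], data[:, y])
    n = len(data)
    zcols = data[:, zs]
    total = 0.0
    for zval in set(map(tuple, zcols)):          # iterate over observed Z configurations
        mask = (zcols == zval).all(axis=1)
        total += mask.sum() / n * mi(data[mask, x], data[mask, y])
    return total

def markov_blanket(data, target, alpha=0.02):
    """Grow phase (cf. S1-S4): greedily add the most associated remaining feature.
    Shrink phase (cf. S5-S7): drop features conditionally independent given the rest."""
    features = [i for i in range(data.shape[1]) if i != target]
    mb = []
    while True:
        candidates = [f for f in features if f not in mb]
        if not candidates:
            break
        best = max(candidates, key=lambda f: cmi(data, f, target, mb))
        if cmi(data, best, target, mb) <= alpha:   # nothing left that is dependent
            break
        mb.append(best)
    for f in list(mb):
        rest = [g for g in mb if g != f]
        if cmi(data, f, target, rest) <= alpha:    # false positive: remove
            mb.remove(f)
    return sorted(mb)

# Synthetic check: the target (column 3) is x0 + x1; column 2 is pure noise.
rng = np.random.default_rng(0)
n = 5000
x0 = rng.integers(0, 2, n)
x1 = rng.integers(0, 2, n)
noise = rng.integers(0, 2, n)
data = np.column_stack([x0, x1, noise, x0 + x1])
print(markov_blanket(data, target=3))  # prints [0, 1]
```

On this synthetic dataset the sketch recovers exactly the two genuinely relevant columns, and the noise column is rejected because its conditional mutual information with the target, given the recovered set, is (exactly) zero.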
CN202311742697.5A 2023-12-18 2023-12-18 Markov blanket-based equipment assessment method and device and computer equipment Active CN117421565B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311742697.5A CN117421565B (en) 2023-12-18 2023-12-18 Markov blanket-based equipment assessment method and device and computer equipment

Publications (2)

Publication Number Publication Date
CN117421565A CN117421565A (en) 2024-01-19
CN117421565B true CN117421565B (en) 2024-03-12

Family

ID=89523343

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311742697.5A Active CN117421565B (en) 2023-12-18 2023-12-18 Markov blanket-based equipment assessment method and device and computer equipment

Country Status (1)

Country Link
CN (1) CN117421565B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017072717A1 (en) * 2015-10-29 2017-05-04 Supsi Learning of the structure of bayesian networks from a complete data set
CN111339165A (en) * 2020-02-28 2020-06-26 重庆邮电大学 Mobile user exit characteristic selection method based on Fisher score and approximate Markov blanket
CN116662674A (en) * 2023-07-28 2023-08-29 安徽省模式识别信息技术有限公司 Service recommendation method and system based on efficient Markov blanket learning mechanism

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8805761B2 (en) * 2008-10-31 2014-08-12 Alexander Statnikov Computer implemented method for determining all markov boundaries and its application for discovering multiple maximally accurate and non-redundant predictive models
US8655821B2 (en) * 2009-02-04 2014-02-18 Konstantinos (Constantin) F. Aliferis Local causal and Markov blanket induction method for causal discovery and feature selection from data

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Efficient Markov Blanket Discovery and its Application; Tian Gao et al.; IEEE Transactions on Cybernetics; 2016-03-31; Vol. 47, No. 5; 1169-1179 *
A Markov Blanket Learning Algorithm Based on Logistic Regression Analysis; Guo Kun; Wang Hao; Yao Hongliang; Li Junzhao; CAAI Transactions on Intelligent Systems; 2012-04-15; No. 02; 63-70 *

Similar Documents

Publication Publication Date Title
CN111597347B (en) Knowledge embedding defect report reconstruction method and device
CN112528035B (en) Knowledge graph reasoning method and device based on relational attention and computer equipment
CN112528519A (en) Method, system, readable medium and electronic device for engine quality early warning service
US6973446B2 (en) Knowledge finding method
CN111767216B (en) Cross-version depth defect prediction method capable of relieving class overlap problem
JP2020004409A (en) Automation and self-optimization type determination of execution parameter of software application on information processing platform
CN110706740B (en) Method, device and equipment for predicting protein function based on module decomposition
CN117421565B (en) Markov blanket-based equipment assessment method and device and computer equipment
CN115062619B (en) Chinese entity linking method, device, equipment and storage medium
CN116595363A (en) Prediction method, apparatus, device, storage medium, and computer program product
CN114579834B (en) Webpage login entity identification method and device, electronic equipment and storage medium
CN116226681A (en) Text similarity judging method and device, computer equipment and storage medium
CN114780443A (en) Micro-service application automatic test method and device, electronic equipment and storage medium
CN113204676B (en) Compression storage method based on graph structure data
CN114201572A (en) Interest point classification method and device based on graph neural network
CN114627076A (en) Industrial detection method combining active learning and deep learning technologies
CN114861818A (en) Main data matching method, device, equipment and storage medium based on artificial intelligence
CN113946365A (en) Page identification method and device, computer equipment and storage medium
CN114357977B (en) Method, system, equipment and storage medium for realizing anti-plagiarism
Yahia et al. K-nearest neighbor and C4.5 algorithms as data mining methods: advantages and difficulties
WO2022217419A1 (en) Neural network model inference method and apparatus, computer device, and storage medium
CN115545125B (en) Software defect association rule network pruning method and system
CN112084577B (en) Data processing method based on simulation test data
CN117034016A (en) Method, system, electronic equipment and medium for constructing communication radiation source data model
CN117216559A (en) Communication radiation source identification method, system, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant