CN113495800B

CN113495800B - Diagnostic prediction data and feature re-cognition method based on extended multi-attribute decision

Info

Publication number: CN113495800B
Application number: CN202110331591.0A
Authority: CN
Inventors: 陶来发; 索明亮; 王超; 程玉杰; 丁宇; 吕琛
Original assignee: Beihang University
Current assignee: Beihang University
Priority date: 2020-04-02
Filing date: 2021-03-26
Publication date: 2024-06-18
Anticipated expiration: 2041-03-26
Also published as: CN113495800A

Abstract

The invention discloses a diagnosis prediction data and feature re-cognition method based on extended multi-attribute decision, which comprises the following steps: defining a sensor for detecting equipment faults as a conditional attribute of the extended multi-attribute decision data model, and defining a label of a typical fault mode or state of the equipment as a decision attribute of the extended multi-attribute decision data model; selecting a condition attribute subset according to the dependency relationship between the decision attribute and the condition attribute so as to screen a valuable sensor set from a plurality of sensors for detecting equipment faults; obtaining the attribute weight of each condition attribute in the condition attribute subset by distributing weight to each condition attribute in the selected condition attribute subset; and inputting the data samples acquired by the sensors corresponding to the condition attribute subsets and subjected to data preprocessing and the attribute weights of the condition attribute subsets into corresponding fault diagnosis prediction models, and diagnosing and predicting equipment faults.

Description

Diagnostic prediction data and feature re-cognition method based on extended multi-attribute decision

Technical Field

The invention relates to a method for extracting fault diagnosis knowledge and prediction knowledge, in particular to a diagnosis prediction data and feature re-cognition method based on extended multi-attribute decision, and belongs to the field of generating fault diagnosis and prediction data based on extended labeled multi-attribute decision making.

Background

Labeled multi-attribute decision making (LMADM) is a decision theory proposed specifically for labeled datasets. The decision attributes (i.e., labels) in the decision system provide only rough classification information for each instance, and are typically set or calibrated manually according to the practical situation. LMADM is a special case of multi-attribute decision making (MADM); the difference is that LMADM has the additional assistance of decision attributes (labels), which general MADM applications lack. The essence of LMADM is to extract further inferred decision information, such as ordering information for the decision attributes and the class membership of new sample data, by mining the dependency relationship between condition attributes and decision attributes. It is this property that makes LMADM well suited to diagnosis and prediction problems in Prognostics and Health Management (PHM). The knowledge used for fault prediction is essentially the gradual-change relation in the analyzed data, namely a time-series ordering relation; the knowledge used for fault diagnosis is essentially the discriminating relation in the analyzed data, namely a class-membership relation of the samples. Therefore, the invention applies labeled multi-attribute decision theory to the extraction of fault diagnosis knowledge and prediction knowledge.

However, existing research on labeled multi-attribute decision theory focuses on improving and applying condition attribute weighting and aggregation operators, attempting to achieve more accurate weighting and aggregation through various data characterization and manual intervention techniques. Meanwhile, as the objects of PHM applications grow in complexity and scale, the number of measurement points and the volume of data collected by PHM systems are growing rapidly. This mass of data contains rich knowledge that can be used in PHM practice, and how to apply labeled multi-attribute decision theory to extract fault diagnosis and prediction knowledge from massive test data is a difficult problem that equipment PHM applications currently need to solve. This is the main problem addressed by the invention, which extends the original labeled multi-attribute decision theory to cope with the challenges that massive data poses to PHM applications and to extract effective fault diagnosis knowledge and prediction knowledge from massive data.

Disclosure of Invention

The invention aims to solve problems in two respects. In engineering application, it addresses the difficulty of extracting fault diagnosis and prediction knowledge from massive data. In theory, it addresses the inability of labeled multi-attribute decision making to effectively mine the knowledge hidden in data and to cope with massive data mining.

The invention relates to a diagnosis prediction data and characteristic re-cognition method based on extended multi-attribute decision, which comprises the following steps:

defining a sensor for detecting equipment faults as a conditional attribute of the extended multi-attribute decision data model, and defining an equipment typical fault mode or state as a decision attribute of the extended multi-attribute decision data model, wherein the typical fault mode or state can be a data set obtained from a knowledge base or a data set obtained according to inspection;

Selecting a subset of condition attributes according to the dependency relationship between the decision attributes and the condition attributes so as to screen a valuable set of sensors (i.e. sensors playing a key role in fault diagnosis) from a plurality of sensors for detecting equipment faults;

Distributing weight to each condition attribute in the selected condition attribute subset to obtain attribute weight of each condition attribute in the condition attribute subset;

and inputting the data samples acquired by the sensors corresponding to the condition attribute subsets and subjected to data preprocessing and the attribute weights of the condition attribute subsets into corresponding fault diagnosis models (namely, any known fault diagnosis model) to diagnose equipment faults. Similarly, for a dataset with decision attributes of device states, conditional attribute subset data and weights are input to a corresponding prediction model to predict future states of the device.

Preferably, the data samples subjected to the data preprocessing are normalized data samples that do not change the original data space structure.

Preferably, for data samples acquired by the sensor that monotonically decrease over time, a revenue normalization model is employed to scale the data samples to [0,1].

Preferably, for data samples acquired by the sensor that monotonically increase over time, a cost normalization model is employed to scale the data samples to [0,1].

Preferably, the extended multi-attribute decision data model is a decision system DS = (U, C ∪ D), where U = {x₁, x₂, …, x_m} is the set of candidate data samples, C = {c₁, c₂, …, c_n} is the set of condition attributes, and D = {d₁, d₂, …, d_K} (K ≤ m) is the set of typical failure modes or states of the equipment.

Preferably, the dependency relationship between the condition attribute and the decision attribute refers to the dependency relationship between a data sample collected by a sensor for detecting equipment failure and a typical failure mode or state; the condition attribute subset selection according to the dependency relationship between the decision attribute and the condition attribute comprises the following steps:

Determining a fuzzy neighborhood decision rough set model by using the fuzzy neighborhood radius δ;

Obtaining, from the determined fuzzy neighborhood decision rough set model, the global decision risk R and the attribute importance Sig_risk used by the attribute reduction algorithm that selects the condition attribute subset;

Inputting DS = (U, C ∪ D) and the attribute reduction parameter ζ into the attribute reduction algorithm, and taking the output of the algorithm as the selected condition attribute subset.

Preferably, a global decision risk R is utilized to assign a weight to each condition attribute of the selected subset of condition attributes.

Preferably, the fuzzy neighborhood decision rough set model is:

POS＝{x∈U|P(X|[x]^δ)≥α},

BND＝{x∈U|β<P(X|[x]^δ)<α},

NEG＝{x∈U|P(X|[x]^δ)≤β},

where POS is the positive region, BND is the boundary region, and NEG is the negative region; P(X|[x]^δ) is the fuzzy conditional probability; X is the class to which the data sample x is assigned in the decision system DS; [x]^δ is the fuzzy neighborhood set of x; and α and β are a pair of threshold parameters with 0 ≤ β < α ≤ 1.

Preferably, the global decision risk R is:

where λ_PN, λ_BP, λ_BN, and λ_NP are loss functions.

Preferably, the attribute importance based on the global decision risk is:

where c ∈ C, C is the condition attribute set of the decision system, R_B is the global decision risk of B, R_C is the global decision risk of C, and R_{B∪c} is the global decision risk of B ∪ {c}.

Preferably, the attribute weight w_ci of each condition attribute c_i in the condition attribute subset is:

Wherein,

The beneficial technical effects of the invention are as follows: it resolves the current difficulty of extracting fault diagnosis and prediction knowledge from massive data, as well as the inability of labeled multi-attribute decision making to effectively mine the knowledge hidden in data and to cope with massive data mining; the method thereby meets the need for data-driven extraction of fault diagnosis and prediction knowledge under massive data.

Drawings

FIG. 1 is a technical block diagram of fault diagnosis and prediction knowledge of the present invention;

FIG. 2 is a schematic diagram of a labeled multi-attribute decision data model of the present invention;

FIG. 3 is a schematic representation of the spatial distribution of a given decision system of the present invention;

FIG. 4 is a schematic diagram of the distribution of all decision attributes of the present invention;

FIG. 5 is a schematic diagram of the process of approximating decision attributes of the present invention;

FIG. 6 is a schematic diagram of the relationship of the extended labeled multi-attribute decision result of the present invention to fault diagnosis and prediction;

FIG. 7 is a schematic diagram of a typical distribution of FD001 data of the present invention;

FIG. 8 is a schematic diagram of statistics for all selected attributes;

FIG. 9 is a schematic diagram of the statistical risk for each attribute associated with 80 training units;

FIG. 10 is a schematic diagram of normalized weights for each attribute;

FIG. 11 is a schematic of a windowed score result for 20 test units;

FIG. 12 is a schematic diagram of statistics of window score data;

FIG. 13 is a schematic diagram of the statistical result of the slope of the fitting function.

Detailed Description

The invention aims to remedy shortcomings in both PHM engineering applications and LMADM theory. In PHM engineering applications, as application objects become more complex and systems larger, the number of acquisition points and the volume of data increase. This growth in data makes it harder to mine fault diagnosis knowledge and prediction knowledge in PHM. Analysis shows that labeled multi-attribute decision making is well suited to extracting diagnosis and prediction knowledge. However, limited by its theoretical framework, traditional labeled multi-attribute decision making cannot cope effectively with decision making on massive data. The invention can therefore be used to extract fault diagnosis and prediction knowledge from massive data in the PHM field, and can also be applied to other decision problems involving classification and ranking.

The diagnosis and prediction knowledge extraction technique based on extended labeled multi-attribute decision making consists mainly of six steps, through which classification and ranking knowledge can be mined from massive data with good classification and prediction results. The technical route is shown in FIG. 1.

Step one, data preprocessing: this includes normalization, which removes the effect of inconsistent data dimensions (units and scales) so that granular computing can be used to measure data uncertainty.

Step two, attribute subset selection: condition attributes that are useful for decision making are selected to form an attribute subset. There are many attribute subset selection methods, including filter, embedded, and wrapper methods, which consider only the condition attribute set, and attribute subset screening methods that consider the condition attribute set and the decision attribute together. Attribute subset selection is an NP-hard problem and is usually solved with optimization methods or heuristics.

Step three, weight assignment: appropriate weights are assigned to the condition attributes in the selected attribute subset. Attribute weighting methods include objective, subjective, and hybrid weighting. Each has its own advantages, disadvantages, and range of applicability, and a suitable method should be chosen for the specific application.

Step four, rule extraction: the type of rule established in this step is not restricted. The rule knowledge mainly serves the practical requirement: for example, classification rules are established for diagnosis problems and ranking rules for prediction problems. The established rules are collected into a rule knowledge base.

Step five, aggregation calculation: a decision result is obtained with suitable aggregation operators according to the practical requirement. Notably, the aggregation result can either be output directly as the decision result, or be stored in the rule knowledge base as rule knowledge for a specific decision and combined with other models to produce the final decision output. For example, the aggregated result of a simple weighted aggregation operator can serve as a judgment value for diagnosis or prediction, or as a decision reference boundary to be combined with a classification, clustering, or regression model to obtain a further decision result.

Step six, decision output: the calculation results are organized and output to the decision maker. Results may be expressed graphically, quantitatively, or qualitatively, and may also be given in interactive form so that the decision maker can make further inferences.

The flow, steps and descriptions of the detailed technical scheme of the invention are as follows.

1) Data system and labeled multi-attribute decision data model

The research object of the labeled multi-attribute decision theory is a decision system, and two types of attributes are included in the decision system: conditional attributes and decision attributes, as distinguished from information systems that contain only a set of conditional attributes. The general definition of a decision system is as follows.

A decision system is a four-tuple DS = (U, A, {V_a | a ∈ A}, {I_a | a ∈ A}) with A = C ∪ D, where U is a finite set of objects called the universe, U = {x₁, x₂, …, x_m}; A is the attribute set; C is the condition attribute set; D is the decision attribute set (generally the decision attribute set consists of a single attribute, and multiple decision attributes can be converted into one decision attribute for study); V_a is the set of values of a ∈ A; and I_a is the information function of a ∈ A. The decision system (DS) can be written simply as DS = (U, C ∪ D).

Given a decision system DS, labeled multi-attribute decision theory can be instantiated as the following data model. U = {x₁, x₂, …, x_m} is the set of candidate data samples; C = {c₁, c₂, …, c_n} is the set of condition attributes; D = {d₁, d₂, …, d_K} (K ≤ m) is the set of labels of the candidate samples; W = {w₁, w₂, …, w_n} is the set of condition attribute weights, with w_j ≥ 0; V = [v_ij]_{m×n} is the decision reference matrix given by the decision maker, where v_ij is the preference value of candidate sample x_i with respect to c_j. For PHM systems, the decision reference matrix V = [v_ij]_{m×n} is usually the acquired data, including sensor data, data recorded during maintenance, and so on.
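The data model above can be sketched as a small container. This is only an illustrative sketch: the class name `DecisionSystem`, its field names, and the sample values are assumptions introduced here, not part of the original text.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DecisionSystem:
    """DS = (U, C ∪ D) stored as a decision reference matrix plus labels."""
    values: List[List[float]]  # V = [v_ij]: m samples x n condition attributes
    labels: List[int]          # one decision-attribute value (label) per sample
    attributes: List[str]      # names of the condition attributes c_1..c_n

    def __post_init__(self) -> None:
        if len(self.labels) != len(self.values):
            raise ValueError("one label per sample is required")
        if any(len(row) != len(self.attributes) for row in self.values):
            raise ValueError("every row needs one value per condition attribute")

    def column(self, name: str) -> List[float]:
        """Return all readings of one condition attribute (one sensor)."""
        j = self.attributes.index(name)
        return [row[j] for row in self.values]


# Hypothetical toy system: three samples, two sensors, two labels (K <= m).
ds = DecisionSystem(
    values=[[0.2, 5.0], [0.4, 4.0], [0.9, 1.0]],
    labels=[0, 0, 1],
    attributes=["sensor_a", "sensor_b"],
)
```

Keeping the labels next to the matrix mirrors the distinction drawn above between a decision system and an information system that holds condition attributes only.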

Based on the two basic definitions, a two-dimensional labeled multi-attribute decision data model can be constructed, which can be represented as shown in fig. 2.

2) Dialectical analysis of extended tagged multi-attribute decision model

In traditional labeled multi-attribute decision theory, the characterization of condition attributes stops at attribute weighting, which is difficult to apply to massive datasets. As the data volume grows, the number of condition attributes reaches tens or hundreds, and after weighting the weights of the condition attributes differ little in actual calculation, which weakens the effect of attribute weighting. Attribute subset selection is an effective way to handle massive data: it extracts a valuable, non-redundant data subset from the mass of data, improving the efficiency and accuracy of data mining. The invention therefore uses an ablation-style test to analyze the respective roles of attribute weighting and attribute subset selection in labeled multi-attribute decision theory, laying the foundation for the extended labeled multi-attribute decision model.

Consider a given decision system DS = (U, C ∪ D) with C = {c₁, c₂, c₃, c₄}. Without loss of generality, the data in U are taken to follow Gaussian distributions. The decision attribute D is likewise represented by data with a Gaussian-type distribution, reflecting the implied results abstracted from knowledge, which is in keeping with the original purpose of building a decision system.

In this analysis, weight assignment and attribute subset selection are determined by the similarity of the Gaussian distribution shapes: the more similar the distribution of c_i (i = 1, 2, 3, 4) is to that of D, the larger the weight of c_i and the more preferentially c_i is selected into the attribute subset. The given decision system is shown in FIG. 3(a) and FIG. 3(b), where FIG. 3(a) is a three-dimensional view of the data distribution and FIG. 3(b) is its projection onto a two-dimensional plane.

As is easily seen from FIG. 3(a), the distribution of attribute c₃ is unimodal, which is completely contrary to the distribution of D; the distribution of c₄ is bimodal, but its peak positions differ from those of D; and the distributions of c₁ and c₂ differ only slightly from that of D. On this basis, four possible LMADM schemes are designed as follows:

1) The average of all condition attributes, i.e., the arithmetic average (AA) operator: D₁ = (c₁ + c₂ + c₃ + c₄)/4.

2) The weighted sum of all condition attributes, i.e., the weighted arithmetic average (WAA) operator: D₂ = w₁c₁ + w₂c₂ + w₃c₃ + w₄c₄, where W = {w₁, w₂, w₃, w₄} = {0.5, 0.3, 0.1, 0.2}.

3) The average of the attribute subset, i.e., the subset arithmetic average (SAA) operator: D₃ = (c₁ + c₂ + c₄)/3.

4) The weighted sum of the attribute subset, i.e., the subset weighted arithmetic average (SWAA) operator: D₄ = w₁c₁ + w₂c₂ + w₄c₄, where W = {w₁, w₂, w₄} = {0.5, 0.4, 0.1}.

In the above four schemes, the selected attribute subset is C_sub = {c₁, c₂, c₄}, and D₁, D₂, D₃, D₄ are the approximate decision attributes produced by the four schemes. The attribute weights are set according to the principle of distribution consistency: the more consistent a condition attribute is with the decision attribute, the larger its weight. The distributions of all decision attributes are shown in FIG. 4.

As can be seen from the results of fig. 4, the distribution of D1 is completely different from that of the target D, and the distributions of D2 and D3 in the range of [8, 12] are not identical to those of D. In contrast, the distribution of D4 is more ideal than other distributions, and thus the target attribute D can be appropriately approximated. This benefits from a rational selection of a subset of attributes and an assignment of appropriate weights.
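The four schemes can be illustrated on toy numbers. In this sketch the helper `aggregate` and the two-sample columns are assumptions made for illustration; only the weight values and the subset {c₁, c₂, c₄} come from the text.

```python
def aggregate(columns, weights=None):
    """Row-wise weighted sum of attribute columns; equal weights 1/n if none are given."""
    n = len(columns)
    if weights is None:
        weights = [1.0 / n] * n
    if len(weights) != n:
        raise ValueError("one weight per column is required")
    rows = range(len(columns[0]))
    return [sum(w * col[i] for w, col in zip(weights, columns)) for i in rows]


# Toy readings (illustrative only): two samples per condition attribute.
c1, c2, c3, c4 = [1.0, 2.0], [2.0, 4.0], [9.0, 0.0], [3.0, 6.0]

d1 = aggregate([c1, c2, c3, c4])                        # AA over all attributes
d2 = aggregate([c1, c2, c3, c4], [0.5, 0.3, 0.1, 0.2])  # WAA over all attributes
d3 = aggregate([c1, c2, c4])                            # SAA over the subset
d4 = aggregate([c1, c2, c4], [0.5, 0.4, 0.1])           # SWAA over the subset
```

In this toy case the contrary column `c3` distorts `d1` and `d2`, while `d3` and `d4` are unaffected because `c3` was left out of the subset.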

Thus, it can be seen from this example that the description of the data distribution characteristics should not be limited to weight assignment only, but should include attribute subset selection. Thus, as shown in FIG. 5, a process of approximating decision properties may be described.

The above analysis shows that both attribute weighting and attribute subset selection play important roles in the labeled multi-attribute decision process. Traditional labeled multi-attribute decision making, which includes only attribute weighting, is therefore insufficient for decision making and cannot meet the challenges of massive data. The invention accordingly considers attribute weighting and attribute subset selection together to obtain an extended labeled multi-attribute decision framework that expresses the essential characteristics of the data more fully and copes better with massive data.

3) Extended tagged multi-attribute decision model

The extended labeling multi-attribute decision model is mainly calculated by the following steps:

Step 1: data normalization

In general, to prevent attributes with large numerical ranges from swamping those with small ranges because of differing dimensions (units and scales), the data must be normalized before subsequent calculations. Normalization maps the original data into a unified value range through a transformation formula, and this mapping does not change the spatial structure of the original data.

The invention uses cost-type and benefit-type (revenue) normalization models to map the original data into [0,1]. In a given decision system DS, for a given c_j ∈ C, the cost-type normalization model (equation (1)) and the benefit-type normalization model (equation (2)) are expressed as follows:

where v_ij is the normalized element, and max(·) and min(·) are the maximum and minimum operators, respectively; that is, max_j(v_ij) is the maximum and min_j(v_ij) the minimum of the elements in the j-th column.
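Equations (1) and (2) are not reproduced in this text. The sketch below assumes the usual min-max forms, (max − v)/(max − min) for the cost type and (v − min)/(max − min) for the benefit type, which match the column-wise max and min operators described above. The function name and the handling of constant columns are assumptions.

```python
def normalize_column(values, kind):
    """Scale one attribute column to [0, 1] with an assumed min-max model."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # Assumption: a constant column has no spread to rescale, so map it to 0.
        return [0.0 for _ in values]
    if kind == "benefit":
        return [(v - lo) / span for v in values]  # order of the values is preserved
    if kind == "cost":
        return [(hi - v) / span for v in values]  # order of the values is reversed
    raise ValueError("kind must be 'benefit' or 'cost'")


benefit = normalize_column([10.0, 20.0, 30.0], "benefit")
cost = normalize_column([10.0, 20.0, 30.0], "cost")
```

Under these assumed forms a monotonically increasing signal treated as a cost attribute becomes monotonically decreasing, so both signal types end up with the same orientation after scaling.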

Step 2: attribute subset selection

Attribute subset selection is also known as feature selection or attribute reduction. In a decision system, attribute subsets usually need to be screened by means of the dependency between decision attributes and condition attributes. The invention uses the decision-theoretic rough set (DTRS) model, which is well suited to attribute reduction. The DTRS model is a generalized form of the probabilistic rough set model; it is grounded in three-way decision theory and obtains a minimum-risk decision result by applying the Bayesian minimum-risk decision principle. To handle mixed (numeric and symbolic) data, the invention uses a neighborhood relation model, which is good at processing mixed data systems, to measure the spatial structure among samples. In addition, to describe the spatial relationship of the data more effectively, a fuzzy relation is added on top of the neighborhood relation to describe the uncertainty among data. The fuzzy neighborhood decision rough set model is therefore chosen as the means of attribute subset selection.

For two given samples x and y, their fuzzy relation r(x, y) can be described by the Euclidean distance as:

where, for symbolic attributes, x_i − y_i = 0 if x_i = y_i, and x_i − y_i = 1 otherwise.

Thus, the fuzzy neighborhood set of a sample is expressed as:

[x]^δ＝{y∈U|r(x,y)≥δ} (4)

where δ is the fuzzy neighborhood radius, 0 ≤ δ ≤ 1.
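Equation (3) is not reproduced in this text, so the similarity below is an assumed stand-in: r(x, y) = 1 − sqrt(mean of squared per-attribute differences), computed on data already scaled to [0, 1]. The rule for symbolic attributes and the neighborhood test of equation (4) follow the text; the function names are hypothetical.

```python
import math


def fuzzy_similarity(x, y):
    """Assumed fuzzy relation r(x, y) in [0, 1]; 1 means identical samples."""
    total = 0.0
    for xi, yi in zip(x, y):
        if isinstance(xi, str) or isinstance(yi, str):
            diff = 0.0 if xi == yi else 1.0  # symbolic attribute: 0 if equal, else 1
        else:
            diff = xi - yi                   # numeric attribute, assumed in [0, 1]
        total += diff * diff
    return 1.0 - math.sqrt(total / len(x))


def fuzzy_neighborhood(index, samples, delta):
    """Indices of samples y with r(x, y) >= delta, as in equation (4)."""
    x = samples[index]
    return [k for k, y in enumerate(samples) if fuzzy_similarity(x, y) >= delta]


# Hypothetical mixed data: one numeric and one symbolic attribute per sample.
samples = [(0.0, "a"), (0.1, "a"), (0.9, "b")]
near_first = fuzzy_neighborhood(0, samples, delta=0.8)
```

A larger δ gives a tighter neighborhood; with δ = 0 every sample is a neighbor of every other.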

Therefore, based on the fuzzy neighborhood relation, the fuzzy conditional probability that a sample belongs to a given partition X can be obtained, expressed as:

Obviously, 0 < P(X|[x]^δ) ≤ 1.
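Equation (5) is likewise missing from this text. One form consistent with the stated bound is the similarity-weighted share of neighbors that carry the same label as x; the sketch below assumes that form and takes a precomputed similarity matrix `sim` with ones on the diagonal. Both the form and the names are assumptions.

```python
def fuzzy_conditional_probability(i, sim, labels, delta):
    """Assumed P(X | [x_i]^delta): similarity mass of same-label neighbors over all neighbors."""
    neighbours = [k for k, r in enumerate(sim[i]) if r >= delta]
    total = sum(sim[i][k] for k in neighbours)
    same = sum(sim[i][k] for k in neighbours if labels[k] == labels[i])
    return same / total  # total > 0 because x_i is always its own neighbor


# Hypothetical symmetric similarity matrix for three samples.
sim = [
    [1.0, 0.9, 0.5],
    [0.9, 1.0, 0.2],
    [0.5, 0.2, 1.0],
]
labels = [0, 0, 1]
p_wide = fuzzy_conditional_probability(0, sim, labels, delta=0.4)
p_tight = fuzzy_conditional_probability(0, sim, labels, delta=0.8)
```

Because x always belongs to its own class and its own neighborhood, the value is strictly positive, which agrees with the bound stated above.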

Thus, the fuzzy neighborhood decision rough set model can be expressed as:

POS＝{x∈U|P(X|[x]^δ)≥α} (6)

BND＝{x∈U|β<P(X|[x]^δ)<α} (7)

NEG＝{x∈U|P(X|[x]^δ)≤β} (8)

where POS is the positive region, BND is the boundary region, NEG is the negative region, and α, β are a pair of threshold parameters with 0 ≤ β < α ≤ 1. Typically, this pair of parameters is determined by experts according to the actual situation. Furthermore, the general derivation of the decision-theoretic rough set model yields a set of loss functions; the loss function matrix is shown in Table 1.

Table 1. Loss function matrix

where a_P, a_B, and a_N denote the actions of assigning a sample to the positive, boundary, and negative regions, respectively, and λ denotes the loss incurred by each action. The two threshold parameters of the decision-theoretic rough set can therefore also be expressed as:
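The threshold expressions are not reproduced in this text. In the standard decision-theoretic rough set derivation they are α = (λ_PN − λ_BN) / ((λ_PN − λ_BN) + (λ_BP − λ_PP)) and β = (λ_BN − λ_NN) / ((λ_BN − λ_NN) + (λ_NP − λ_BP)); the sketch assumes that form and then applies equations (6) to (8). The dictionary keys, function names, and loss values are illustrative assumptions.

```python
def thresholds(loss):
    """Standard DTRS thresholds; loss['XY'] is the cost of action a_X when the true state is Y."""
    alpha = (loss["PN"] - loss["BN"]) / ((loss["PN"] - loss["BN"]) + (loss["BP"] - loss["PP"]))
    beta = (loss["BN"] - loss["NN"]) / ((loss["BN"] - loss["NN"]) + (loss["NP"] - loss["BP"]))
    return alpha, beta


def three_way_regions(probabilities, alpha, beta):
    """Split sample indices into POS, BND, NEG following equations (6) to (8)."""
    if not 0 <= beta < alpha <= 1:
        raise ValueError("thresholds must satisfy 0 <= beta < alpha <= 1")
    regions = {"POS": [], "BND": [], "NEG": []}
    for i, p in enumerate(probabilities):
        if p >= alpha:
            regions["POS"].append(i)
        elif p > beta:
            regions["BND"].append(i)
        else:
            regions["NEG"].append(i)
    return regions


# Hypothetical loss values chosen so that 0 <= beta < alpha <= 1.
loss = {"PP": 0.0, "BP": 2.0, "NP": 6.0, "PN": 8.0, "BN": 2.0, "NN": 0.0}
alpha, beta = thresholds(loss)
regions = three_way_regions([0.9, 0.5, 0.1, 0.75], alpha, beta)
```

Deriving the thresholds from losses is an alternative to having an expert set α and β directly, as mentioned above.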

Using the fuzzy neighborhood decision rough set, an attribute subset selection strategy based on global decision risk minimization can be designed. The global decision risk is expressed as:

The attribute importance based on the global decision risk is expressed as:

where c ∈ C and C is the condition attribute set of the decision system.
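The risk and importance formulas are missing from this text. The sketch below assumes a common three-way form in which correct decisions cost nothing: each sample contributes λ_PN(1 − p) in POS, λ_BP·p + λ_BN(1 − p) in BND, and λ_NP·p in NEG, where p is its fuzzy conditional probability. The importance of an attribute is assumed to be the risk reduction it brings. All names and numbers are hypothetical.

```python
def global_risk(probabilities, alpha, beta, loss):
    """Assumed global decision risk: sum of expected losses over the three regions."""
    risk = 0.0
    for p in probabilities:
        if p >= alpha:    # POS: accepting is wrong with probability 1 - p
            risk += loss["PN"] * (1.0 - p)
        elif p > beta:    # BND: deferring costs something in either state
            risk += loss["BP"] * p + loss["BN"] * (1.0 - p)
        else:             # NEG: rejecting is wrong with probability p
            risk += loss["NP"] * p
    return risk


def significance(risk_b, risk_b_with_c):
    """Assumed attribute importance: how much adding c to B lowers the risk."""
    return risk_b - risk_b_with_c


loss = {"PN": 8.0, "BP": 2.0, "BN": 2.0, "NP": 6.0}
risk = global_risk([1.0, 0.5, 0.0], alpha=0.75, beta=1.0 / 3.0, loss=loss)
```

The probabilities depend on which attributes are used to build the neighborhoods, so the same samples yield a different risk for each attribute subset.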

A greedy heuristic attribute subset selection algorithm can therefore be designed using the above idea of global decision risk minimization.

The attribute subset selection algorithm comprises two main loops: the first is an addition operation and the second is a deletion operation. Both are greedy: the first repeatedly selects the best attribute to add, and the second repeatedly removes the worst attribute. The algorithm is therefore a heuristic attribute subset selection method.
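The algorithm listing itself is not included in this text. The following is a rough sketch of a two-loop greedy search of the kind described, assuming that `risk_of(subset)` returns the global decision risk of an attribute subset and that the reduction parameter ζ acts as a minimum-improvement threshold. Both assumptions, and the toy risk function, are illustrative.

```python
def select_attributes(attributes, risk_of, zeta=0.0):
    """Greedy forward addition followed by backward deletion, driven by risk."""
    selected = []
    current = risk_of(selected)
    # Loop 1: keep adding the attribute that lowers the risk the most.
    while True:
        candidates = [a for a in attributes if a not in selected]
        if not candidates:
            break
        best = min(candidates, key=lambda a: risk_of(selected + [a]))
        best_risk = risk_of(selected + [best])
        if current - best_risk <= zeta:  # no worthwhile improvement left
            break
        selected.append(best)
        current = best_risk
    # Loop 2: drop the attribute whose removal costs the least, while that is harmless.
    while len(selected) > 1:
        worst = min(selected, key=lambda a: risk_of([s for s in selected if s != a]))
        reduced = [s for s in selected if s != worst]
        reduced_risk = risk_of(reduced)
        if reduced_risk - current > zeta:  # removing it would raise the risk
            break
        selected, current = reduced, reduced_risk
    return selected


# Toy risk: each attribute removes a fixed amount of risk (made-up numbers).
gains = {"s1": 3.0, "s2": 2.0, "s3": 0.0}


def toy_risk(subset):
    return 10.0 - sum(gains[a] for a in subset)


chosen = select_attributes(["s1", "s2", "s3"], toy_risk, zeta=0.0)
```

With this toy risk the uninformative `s3` is never added; a larger `zeta` would stop the search earlier and return a smaller subset.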

Step 3: attribute weighting

The weights of the individual attributes can be derived from the global decision risk of equation (9), expressed as:

Obviously, 0 < w_ci ≤ 1. The attribute weights are therefore output in normalized form.

It should be noted that, under the extended labeled multi-attribute decision theory, the final output weights are assigned only to the condition attributes in the selected attribute subset; the condition attributes that were removed are not considered.
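The weight formula is missing from this text, so the mapping below is only a placeholder that satisfies the stated bound: each raw weight is the lowest per-attribute risk divided by that attribute's risk, which lies in (0, 1], and the raw weights are then scaled to sum to one. The function name, the mapping, and the risk values are assumptions, not the patented formula.

```python
def risk_based_weights(risks):
    """Hypothetical mapping from per-attribute risk to normalized weight (lower risk, larger weight)."""
    lowest = min(risks.values())
    if lowest <= 0:
        raise ValueError("this sketch expects strictly positive risks")
    raw = {name: lowest / r for name, r in risks.items()}  # each raw weight is in (0, 1]
    total = sum(raw.values())
    return {name: w / total for name, w in raw.items()}


# Made-up risks for two selected sensors; removed attributes are simply absent.
weights = risk_based_weights({"s1": 2.0, "s2": 4.0})
```

Only attributes that survived subset selection are passed in, in line with the note above that removed attributes receive no weight.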

Step 4: rule extraction

The rule extraction step stores the obtained attribute subset selection and attribute weighting results, as rules, in a knowledge base so that they can be called and used directly later. To some extent, the extracted rules may also be adjusted to the object of application to accommodate more complex application scenarios.

Step 5: aggregation operation

The aggregation operation is to reasonably fuse the extracted rule knowledge by using an aggregation operator. After the aggregation operation, a decision maker can obtain a desired decision basis by using the aggregation operation result. The most common aggregation operator is the weighted average operator (WA):

D＝V·W^T (13)

where D ∈ R^{m×1} is the decision vector and T is the transpose operator.

In addition to the above aggregation operator, there are the arithmetic average (AA), geometric average (GA), weighted geometric average (WGA), and ordered weighted averaging (OWA) operators, among others.
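The operators named above can be written in a few lines each. This sketch uses their textbook definitions; the function names and the sample numbers are illustrative assumptions.

```python
import math


def weighted_average(matrix, weights):
    """WA operator of equation (13): D = V · W^T, one score per row of V."""
    return [sum(v * w for v, w in zip(row, weights)) for row in matrix]


def geometric_average(row):
    """GA: n-th root of the product of the n values."""
    return math.prod(row) ** (1.0 / len(row))


def weighted_geometric_average(row, weights):
    """WGA: product of the values raised to their weights."""
    return math.prod(v ** w for v, w in zip(row, weights))


def ordered_weighted_average(row, weights):
    """OWA: weights apply to rank positions (largest value first), not to attributes."""
    ranked = sorted(row, reverse=True)
    return sum(w * v for w, v in zip(weights, ranked))


scores = weighted_average([[1.0, 0.0], [0.5, 0.5]], [0.75, 0.25])
```

For example, `ordered_weighted_average([0.2, 0.9, 0.5], [0.5, 0.3, 0.2])` gives the weight 0.5 to the largest value 0.9, whichever attribute produced it.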

Step 6: decision output

The decision output means that a decision maker gives out a required decision result according to the aggregated operation result, and in the invention, the given decision result comprises a diagnosis result and a predicted health state evaluation result.

4) Extraction of fault diagnosis knowledge and prediction knowledge

In the PHM technical system, fault diagnosis is equivalent to pattern recognition in machine learning and is a classification problem; fault prediction can be regarded as a regression problem in machine learning, which at root requires a regular ordering relation in the time-series data. On the other hand, the output of labeled multi-attribute decision making mainly comprises two types of results: classification rules for samples and ordering rules for samples. The output of labeled multi-attribute decision making therefore corresponds directly to fault diagnosis and prediction in PHM.

Once the aggregated output (an index obtained by comprehensive calculation) has been obtained from the extended labeled multi-attribute decision model, knowledge useful for diagnosis and prediction can be obtained by examining the spatial relationship and the time-series structure of the index. For diagnosis, the more dispersed the spatial distribution of the data and the clearer the separation between classes, the easier fault diagnosis becomes and the higher the diagnostic accuracy. For prediction, the more pronounced the temporal structure of the data and the clearer its monotonic gradual trend, the easier prediction analysis becomes and the higher the prediction accuracy.

Suppose the index data obtained by the aggregation operation follow the four distributions shown in FIG. 6, of which FIG. 6(c) and (d) have time-series characteristics. If the two data sets in FIG. 6(a) and (b) are used for fault diagnosis with the same diagnosis model and parameters, the data of FIG. 6(a) will give the more accurate result, because they are more clearly separated than those of FIG. 6(b) and the class characteristics of the samples are easier to distinguish. It is therefore more advantageous to use the index data of FIG. 6(a), and the processing rules that generate them, as fault diagnosis knowledge. Likewise, for the prediction application of FIG. 6(c) and (d), the trend of the time-series data in FIG. 6(c) is more pronounced, which favors model training and prediction and gives higher prediction accuracy. It is therefore more advantageous to use the index data of FIG. 6(c), and the processing rules that generate them, as state prediction knowledge.

Therefore, to obtain index data like those of FIG. 6(a) and (c), the key to extracting fault diagnosis and prediction knowledge is to mine the data characteristics in depth and to capture the true characteristics of the data with the method that best represents their essence. The extended labeled multi-attribute decision model of the invention performs data mining and decision making on exactly this principle of fully mining the essential characteristics of the data, and is therefore better suited to extracting diagnosis and prediction knowledge.

For fault diagnosis and prediction knowledge extraction, the 6 steps of the extended labeling multi-attribute decision provided by the invention serve as the main steps of knowledge extraction, and the resulting index results can be output and used to train and test diagnosis models and prediction models.

Description of the preferred embodiments

A fault diagnosis and remaining-life prediction study is performed on the C-MAPSS dataset, which consists of multiple time-series signals generated by a gas turbine simulation platform. The dataset contains four types of data, representing different fault and operating states. The first type, FD001, is the most commonly used and includes 100 units and 20631 samples. We use the FD001 dataset as the test object, with the first 80 units as the training set and the last 20 units as the test set.

In essence, fault diagnosis is equivalent to a classification problem: the greater the difference between the samples used, the more accurate the diagnostic result, and vice versa. For remaining-life prediction, the more pronounced the continuous monotonicity of the time-series data, the more favorable it is for prediction. Thus, the following two criteria were chosen for evaluating LMADM:

1) Sample differences;

2) Continuous monotonicity of time series samples.

The basic steps for applying the extended LMADM to the C-MAPSS dataset are as follows, with the fuzzy neighborhood decision rough set serving as the auxiliary model in the operation:

Step one, data preprocessing

A typical distribution of the FD001 dataset is shown in fig. 7. The FD001 dataset has three typical distribution characteristics: the first decreases monotonically over time (see the warm-colored lines in fig. 7), the second increases monotonically over time (see the cold-colored lines in fig. 7), and the third is a constant that is independent of time (see the red straight line in fig. 7).

Thus, the revenue normalization model (equation (2)) is used to scale samples of the first distribution type (sensor IDs: 7, 9, 12, 14, 20, 21) to [0,1]. Conversely, the cost normalization model (equation (1)) is used for samples of the second distribution type (sensor IDs: 2, 3, 4, 8, 11, 13, 15, 17). Samples of the third distribution type (sensor IDs: 1, 5, 6, 10, 16, 18, 19) are preprocessed without either of the two normalization models.
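As a minimal sketch of this preprocessing step, the snippet below applies the common min-max forms of revenue (benefit) and cost normalization. Equations (1) and (2) are not reproduced in this part of the description, so both formulas are an assumption based on usual multi-attribute decision practice, and the function names are illustrative only.

```python
def revenue_normalize(values):
    """Benefit-type min-max scaling to [0, 1]: (x - min) / (max - min).

    Assumed form of equation (2); not taken from the patent text.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("constant signal cannot be min-max scaled")
    return [(v - lo) / (hi - lo) for v in values]


def cost_normalize(values):
    """Cost-type min-max scaling to [0, 1]: (max - x) / (max - min).

    Assumed form of equation (1); not taken from the patent text.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("constant signal cannot be min-max scaled")
    return [(hi - v) / (hi - lo) for v in values]


# A decreasing signal under revenue scaling and an increasing signal
# under cost scaling both become a score that falls from 1 to 0.
falling = revenue_normalize([4.0, 3.0, 2.0])
rising = cost_normalize([2.0, 3.0, 4.0])
```

Under these assumed forms, both sensor types end up on the same scale with the same direction, so they can be aggregated together later; constant signals are rejected because min-max scaling is undefined for them.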

Step two, selecting attribute subset

The original C-MAPSS dataset does not include corresponding labels. We first divide the life of each engine into three states, namely a healthy state, a degraded state and a failed state, each with an equal number of samples.
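The labeling described above can be sketched as follows. The integer codes (0, 1, 2) and the handling of run lengths that are not divisible by three are assumptions made for illustration; the text only states that the three states receive equal numbers of samples.

```python
def label_life_states(n_cycles):
    """Return one label per cycle of a run-to-failure record.

    0 = healthy, 1 = degraded, 2 = failed (codes are illustrative).
    The record is cut into three consecutive parts of (nearly) equal size.
    """
    return [min(3 * i // n_cycles, 2) for i in range(n_cycles)]


labels = label_life_states(9)
```

For a record of 9 cycles this yields three labels of each kind in chronological order; when the length is not a multiple of three, the earliest state receives the extra samples in this sketch.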

We then selected valuable attributes (gas turbine sensors) from the data of the 80 training units using a simple three-parameter decision model based on fuzzy neighborhood relations with δ=0.95 and ζ=0.2; the statistics are shown in fig. 8. As can be seen from fig. 8, sensor 17 is selected most often, 55 times, whereas the constant-type sensors (sensor IDs: 1, 5, 6, 10, 16, 18, 19) are never selected.

Thus, based on the selection statistics in fig. 8, we finally keep the attributes whose probability of being selected is greater than 25%, i.e., those selected more than 20 times (80 × 25%). The sensors with IDs 3, 8, 13 and 17 are therefore selected as the sensors in the attribute subset.
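The thresholding rule can be illustrated with a short sketch. Only the count for sensor 17 (55) and the zero count for a constant sensor come from the text; the other counts below are placeholder values invented for the example and are not read from fig. 8.

```python
def select_attributes(selection_counts, n_units=80, min_fraction=0.25):
    """Keep attributes selected in more than min_fraction of the units."""
    threshold = n_units * min_fraction  # 80 * 0.25 = 20 selections
    return sorted(s for s, c in selection_counts.items() if c > threshold)


# Sensor ID -> number of training units in which it was selected.
# 17 -> 55 and 1 -> 0 follow the text; all other counts are hypothetical.
example_counts = {1: 0, 2: 12, 3: 30, 8: 28, 13: 35, 17: 55}
subset = select_attributes(example_counts)
```

With these placeholder counts the function returns sensors 3, 8, 13 and 17, matching the subset named in the text; an attribute selected exactly 20 times would be excluded because the rule is strictly "greater than".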

Step three, weight distribution

A statistical risk is obtained for each attribute of the 15 training units as shown in fig. 9.

From a comparison of the results in fig. 8 and 9, it can be seen that although sensor 17 is selected most often, it also carries a relatively high risk once the sensors that produce constant-valued samples (sensor IDs: 1, 5, 6, 10, 16, 18, 19) are set aside. This shows that the importance of sensor 17 depends not only on the risk, but also on the probability of being selected. Thus, the weight of each attribute should be expressed as:

where w_risk is the weight derived from the risk, w_reduct is the weight derived from the number of times the attribute is selected, and N_c is the number of times attribute c is selected, c ∈ C.

Thus, the normalized weight of each attribute is shown in fig. 10. The weights of the selected attributes are normalized to {w_c3, w_c8, w_c13, w_c17} = {0.1948, 0.1889, 0.2239, 0.3925}.

Step four, rule extraction

For fault diagnosis, the extracted rules may be stored as knowledge for fault detection and isolation, such as the sensor subset, the sensor weights, the joint state baselines of the signals, etc. For prediction, the rule knowledge base consists of the sensor subset, the weights, the remaining lifetime for the subset, etc.

Step five, aggregation

We calculate the aggregate score of each sample under four schemes, each based on a different aggregation operator:

1) Arithmetic Average (AA):

2) Weighted Arithmetic Average (WAA):

3) Subset Arithmetic Average (SAA):

4) Subset Weighted Arithmetic Average (SWAA):
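The formulas of the four operators are not reproduced in this part of the description. The sketch below therefore uses their conventional meanings, which is an assumption: AA and WAA run over all attributes, SAA and SWAA run only over the selected subset, and the weighted variants divide by the sum of the weights used. The sample values are hypothetical; the subset weights are those given in step three.

```python
def arithmetic_average(sample):
    """AA: plain mean over all attributes of one sample."""
    return sum(sample.values()) / len(sample)


def weighted_arithmetic_average(sample, weights):
    """WAA: weighted mean over all attributes of one sample."""
    total = sum(weights[k] for k in sample)
    return sum(weights[k] * v for k, v in sample.items()) / total


def subset_arithmetic_average(sample, subset):
    """SAA: plain mean over the selected attribute subset only."""
    return sum(sample[k] for k in subset) / len(subset)


def subset_weighted_arithmetic_average(sample, subset, weights):
    """SWAA: weighted mean over the selected attribute subset only."""
    total = sum(weights[k] for k in subset)
    return sum(weights[k] * sample[k] for k in subset) / total


# Normalized sensor readings of one sample (hypothetical values).
sample = {2: 1.0, 3: 0.2, 8: 0.4, 13: 0.6, 17: 0.8}
subset = [3, 8, 13, 17]
# Subset weights as reported in step three.
subset_weights = {3: 0.1948, 8: 0.1889, 13: 0.2239, 17: 0.3925}

aa = arithmetic_average(sample)
saa = subset_arithmetic_average(sample, subset)
swaa = subset_weighted_arithmetic_average(sample, subset, subset_weights)
```

Dividing by the weight sum keeps each score in [0,1] even when the published weights sum to 1.0001 because of rounding.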

Furthermore, windowing is a common method for time-series data that represents system states. We split the score data with a window of L samples, L = 20. Fig. 11 shows the windowed score results for the 20 test units. In each sub-graph of fig. 11, the blue vertical line marks the starting position of the score after the window is applied, i.e., sampleID = 20. The four solid lines are the windowed score time series obtained by the four schemes, while the four dash-dotted lines are third-order polynomial fits of the windowed time series, used for the subsequent assessment of predictive ability.
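A minimal sketch of the windowing and fitting follows. The text does not say how the window is applied, so a sliding-window mean is assumed here; the cubic fit is an ordinary least-squares fit solved through the normal equations in plain Python, and a library routine such as `numpy.polyfit` could be used instead. All function names are illustrative.

```python
def windowed_mean(scores, window=20):
    """Sliding-window mean; the first output covers samples 1..window."""
    return [sum(scores[i - window:i]) / window
            for i in range(window, len(scores) + 1)]


def polyfit_cubic(xs, ys):
    """Least-squares cubic fit; returns [a0, a1, a2, a3] for
    y = a0 + a1*x + a2*x^2 + a3*x^3."""
    n = 4
    # Normal equations: A[j][k] = sum(x^(j+k)), b[j] = sum(y * x^j).
    A = [[sum(x ** (j + k) for x in xs) for k in range(n)] for j in range(n)]
    b = [sum(y * x ** j for x, y in zip(xs, ys)) for j in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for k in range(col, n):
                A[r][k] -= f * A[col][k]
            b[r] -= f * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(A[r][k] * coeffs[k] for k in range(r + 1, n))
        coeffs[r] = (b[r] - s) / A[r][r]
    return coeffs


# Synthetic check: points lying exactly on a cubic are recovered.
xs = [i / 10 for i in range(11)]
ys = [1 - 2 * x + 0.5 * x ** 3 for x in xs]
coeffs = polyfit_cubic(xs, ys)
```

Rescaling the sample index to a small range such as [0, 1] before fitting, as in the synthetic check, keeps the normal equations well conditioned.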

Step six, outputting verification decision

Considering both evaluation criteria together, we performed a statistical analysis of the numerical coverage of the windowed score data and of the slope data of the fitted function; the results are shown in fig. 12-13.

As can be seen from the statistics in fig. 12, the results for almost all test units indicate that the score data given by scheme 4 is the most widely distributed, which meets the first criterion, i.e., the larger the sample difference, the more accurate the diagnosis result.

Furthermore, as can be seen from the results of fig. 13, except for units 83, 90 and 91, the slope range of the fitted curve generated by scheme 4 is larger than the slope ranges obtained by the other schemes for every test engine. This demonstrates that the time-series score generated by scheme 4 has a more pronounced monotonic character, and that such time series are more conducive to predictive studies.

Furthermore, as can be seen from the results in fig. 12-13, the results generated by the four schemes improve progressively under the two given evaluation criteria, indicating that proper attribute selection and attribute weight assignment can improve the capability of LMADM. These results also indicate that the extended LMADM presented herein is more effective and advantageous.
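The two statistics used in this step can be sketched as below. The text does not define them precisely, so this sketch assumes that the numerical coverage is the spread (maximum minus minimum) of the windowed scores and that the slope data is the first derivative of the fitted cubic evaluated at the sample positions. The function names and inputs are illustrative.

```python
def numerical_coverage(scores):
    """Spread of the windowed score values (assumed: max - min)."""
    return max(scores) - min(scores)


def cubic_slope(coeffs, x):
    """Derivative of y = a0 + a1*x + a2*x^2 + a3*x^3 at position x."""
    _, a1, a2, a3 = coeffs
    return a1 + 2 * a2 * x + 3 * a3 * x * x


def slope_range(coeffs, xs):
    """Smallest and largest slope of the fitted cubic over positions xs."""
    slopes = [cubic_slope(coeffs, x) for x in xs]
    return min(slopes), max(slopes)


# Toy inputs (hypothetical): three scores and the cubic y = x + x^3.
coverage = numerical_coverage([0.2, 0.9, 0.5])
low, high = slope_range([0, 1, 0, 1], [0, 1, 2])
```

Under these assumptions a wider coverage corresponds to the first criterion (sample difference) and a wider slope range corresponds to the second (monotonic trend).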

In summary, the method for diagnosing and predicting data and re-cognizing features based on extended multi-attribute decision according to the present invention comprises:

Defining a sensor for detecting equipment faults as a conditional attribute of an extended multi-attribute decision data model, and defining an equipment typical fault mode or state as a decision attribute of the extended multi-attribute decision data model, wherein the typical fault mode or state can be a data set obtained from a knowledge base or a data set obtained according to inspection;

The data samples subjected to data preprocessing are normalized data samples that do not change the original data space structure.

For data samples acquired by the sensor that monotonically decrease over time, a revenue normalization model is employed to scale the data samples to [0,1].

For data samples acquired by the sensor which monotonically increase with time, a cost normalization model is used to scale the data samples to [0,1].

The extended multi-attribute decision data model is a decision system DS = (U, C ∪ D), where U = {x₁, x₂, …, x_m}.

Claims

1. A method of diagnosing predictive data and feature re-cognition based on extended multi-attribute decisions, comprising:

defining a sensor for detecting equipment faults as a conditional attribute of the extended multi-attribute decision data model, and defining an equipment typical fault mode or state as a decision attribute of the extended multi-attribute decision data model;

selecting a condition attribute subset according to the dependency relationship between the decision attribute and the condition attribute so as to screen a valuable sensor set from a plurality of sensors for detecting equipment faults;

obtaining the attribute weight of each condition attribute in the condition attribute subset by distributing weight to each condition attribute in the selected condition attribute subset;

inputting the data samples acquired by the sensors corresponding to the condition attribute subsets and subjected to data preprocessing and the attribute weights of the condition attribute subsets into corresponding fault diagnosis and prediction models, and diagnosing and predicting equipment faults;

the extended multi-attribute decision data model is a decision system DS = (U, C ∪ D), where U = {x₁, x₂, …, x_m} is a group of candidate data sets, C = {c₁, c₂, …, c_n} is a conditional attribute set, and D = {d₁, d₂, …, d_K} (K ≤ m) is a group of sets reflecting the typical failure mode or state of the equipment;

the dependency relationship between the condition attribute and the decision attribute refers to the dependency relationship between a data sample collected by a sensor for detecting equipment faults and a fault mode or a state set;

wherein, the condition attribute subset selection according to the dependency relationship between the decision attribute and the condition attribute comprises:

2. The method of claim 1, wherein the data samples subjected to data preprocessing are normalized data samples that do not alter the original data space structure.

3. The method of claim 2, wherein for data samples acquired by the sensor that monotonically decrease over time, the data samples are scaled to [0,1] using a revenue normalization model.

4. The method of claim 2, wherein for data samples acquired by the sensor that monotonically increase over time, a cost normalization model is employed to scale the data samples to [0,1].

5. The method of claim 1, wherein each conditional attribute of the selected subset of conditional attributes is assigned a weight using a global decision risk R.

6. The method of claim 1, wherein the fuzzy neighborhood decision rough set model is:

POS = {x ∈ U | P(X | [x]^δ) ≥ α},

BND = {x ∈ U | β < P(X | [x]^δ) < α},

NEG = {x ∈ U | P(X | [x]^δ) ≤ β} (8),

wherein POS is the positive region, BND is the boundary region, and NEG is the negative region; P(X | [x]^δ) is the fuzzy conditional probability; X is the category to which the data sample x is assigned in the decision system DS; [x]^δ is the fuzzy neighborhood set of x; and α and β are a pair of threshold parameters with 0 ≤ β < α ≤ 1.

7. The method of claim 6, wherein the global decision risk R is:

where λ_PN, λ_BP, λ_BN and λ_NP are loss functions.

8. The method of claim 6, wherein the attribute importance based on global decision risk is:

wherein c ∈ C, C is the set of conditional attributes in the decision system, R_B is the global decision risk of B, R_C is the global decision risk of C, and R_{B∪c} is the global decision risk of B ∪ c.

9. The method of claim 1, wherein the attribute weight w_ci for each conditional attribute c_i in the subset of conditional attributes is:

Wherein,