Disclosure of Invention
The invention aims to provide a method for diagnosing faults of sensors of an aircraft system that precisely locates equipment fault points and reduces errors to the maximum extent.
In order to achieve the above object, the embodiments of the present invention provide the following technical solutions:
a method for diagnosing aircraft system sensor faults, comprising the steps of:
performing sample and feature processing on the acquired sensor data to be used as a training set for training a fault diagnosis model;
training a fault diagnosis model by using a method of a decision tree, a random forest or a deep neural network according to the training set to construct the fault diagnosis model;
using sensor data which is not subjected to sample and feature processing as a test set, and verifying the constructed fault diagnosis model;
and after the fault diagnosis model is verified, inputting the newly acquired sensor data into the fault diagnosis model to obtain a diagnosis result.
In this scheme, real sensor data are used as the training set; when the fault diagnosis model is constructed, the decision tree, random forest, or deep neural network method can be applied to a training set with a huge data volume while reducing the difficulty of manual labeling. After the fault diagnosis model is built, it is evaluated with a test set to verify whether it can accurately output the fault result.
The step of performing sample and feature processing on the acquired sensor data as a training set for training a fault diagnosis model includes:
injecting faults of different performances into the equipment whose data is collected by a sensor, using the sensor to collect the data of the equipment under each performance fault as a training set C, and taking the data collected under each performance fault as a training subset C_1, C_2, ..., C_N, where N is the number of equipment performance faults;
wherein the data of each performance fault further includes a plurality of condition data, and one training subset is C_i = {a_1^i, a_2^i, ..., a_M^i}, where C_i is the ith training subset, a is the condition data, and M is the number of condition data.
According to the training set, the step of training the fault diagnosis model by using a deep neural network method comprises the following steps:
carrying out DNN forward propagation calculation and DNN backward propagation calculation through a deep neural network layer;
the deep neural network layer comprises an input layer, a hidden layer and an output layer, wherein the hidden layer is an intermediate layer and comprises a plurality of layers;
performing DNN forward propagation calculation:

a_j^i = σ(z_j^i) = σ(Σ_{k=1}^{m} w_jk^i · a_k^{i-1} + b_j^i)

wherein w_jk^i is a linear relation coefficient, representing the linear coefficient from the kth neuron of the (i-1)th layer to the jth neuron of the ith layer; b_j^i is the bias of the jth neuron of the ith layer; σ is the activation function; and a_j^i is the output value calculated by forward propagation for the jth neuron of the ith layer, where there are m neurons in total at the (i-1)th layer;

the output value of the ith layer is represented using a matrix method:

a^i = σ(z^i) = σ(W^i · a^{i-1} + b^i)

wherein, with m neurons at the (i-1)th layer and n neurons at the ith layer, the linear coefficients w of the ith layer form an n × m matrix W^i, the biases b of the ith layer form an n × 1 vector b^i, the output a of the (i-1)th layer forms an m × 1 vector a^{i-1}, the linear output z of the ith layer before activation forms an n × 1 vector z^i, and the output a of the ith layer forms an n × 1 vector a^i;
performing DNN back propagation calculation:
Input: the total number of layers L, the number of neurons of each hidden layer and the output layer, the activation function, the loss function, the iteration step size β, the maximum number of iterations MAX, the stop-iteration threshold ε, and m input training subsets C_1, C_2, ..., C_m;
Output: the linear relation coefficient matrix W and bias vector b of each hidden layer and the output layer.
The step of verifying the constructed fault diagnosis model by using the sensor data which is not subjected to sample and feature processing as a test set comprises the following steps:
the sensor data not subjected to sample and feature processing is: data of the equipment collected by a sensor under arbitrary conditions, used as a test set; for the collected test set, it is unknown whether the equipment performance has a fault, and unknown what fault the equipment performance has; Z = {b_1, b_2, ..., b_n}, where b is data of the equipment acquired by the sensor under an arbitrary condition and n is the number of data acquired by the sensor;
and inputting the data of the test set into a fault diagnosis model, and judging whether the result output by the fault diagnosis model is consistent with the original equipment performance fault of the data.
Compared with the prior art, the invention has the beneficial effects that:
(1) The training set of the invention contains data of various performance faults of the equipment, so the trained fault diagnosis model can identify which performance of the equipment has failed and can precisely locate the fault point.
(2) Real sensor data are used as the training set; when the fault diagnosis model is constructed, the decision tree, random forest, or deep neural network method can be applied to a training set with a huge data volume while reducing the difficulty of manual labeling. After the fault diagnosis model is built, it is evaluated with a test set to verify whether it can accurately output the fault result.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. The components of embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present invention, presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present invention without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.
The invention is realized by the following technical scheme, as shown in fig. 1, a method for diagnosing the faults of the sensors of the aircraft system comprises the following steps:
step S1: and carrying out sample and feature processing on the acquired sensor data to be used as a training set for training a fault diagnosis model.
Usually, a sensor on the aircraft is used to acquire data of a fixed device: for example, a rate gyroscope acquires the angular rate of the vehicle, an accelerometer acquires the acceleration of the vehicle, a fuel sensor acquires the amount of fuel in the fuel tank, and a pitch lever sensor acquires the operation data of the pitch lever.
However, one device often suffers from different performance failures, such as a possible fuel deficiency in the fuel tank, a possible fuel tank leak, and a possible fuel sensor power failure; for another example, the failure of the operation of the pitch lever may be a failure of a broken pitch lever, a failure of a power supply of a sensor of the pitch lever, or the like.
Therefore, different performance failures occur for a device. They can be roughly divided into electrical performance failures and mechanical performance failures: electrical performance failures involve voltage, current, power, temperature, and the like, and mechanical performance failures involve breakage, jamming, and the like. However, when the sensor collects data, even if the collected data is abnormal, it cannot be directly known which performance of the device has failed.
According to the scheme, faults of different performances are first actively injected into the equipment whose data is collected by the sensor, the data of the equipment under each performance fault are collected by the sensor, and the data under each performance fault are used as training subsets C_1, C_2, ..., C_N, where N is the number of device performance failures.
For example, when the C-redundancy pitch rod fails, it may be caused by a disconnection fault of the C-redundancy pitch rod or a power supply fault of the sensor corresponding to the C-redundancy pitch rod (for the moment, only these two cases are discussed). A disconnection fault is injected into the C-redundancy pitch rod, and the data of the C-redundancy pitch rod during the disconnection performance fault are collected by the sensor as training subset C_1. A sensor power supply fault is then injected into the C-redundancy pitch rod, and the data during the sensor power supply performance fault are collected as training subset C_2.
Training subset C_1 and training subset C_2 are then labeled with their performance faults separately: for example, training subset C_1 is labeled 'C-redundancy pitch rod disconnection fault' and training subset C_2 is labeled 'C-redundancy pitch rod sensor power supply fault'.
When collecting the data of a training subset, several pieces of data of the equipment in different states need to be collected; the collected data are called condition data. For example, when collecting the data of the disconnection performance fault of the C-redundancy pitch rod, the pitch rod is adjusted to 9 different gear values: -20, -15, -10, -5, 0, 5, 10, 15, 20, so that 9 sets of data acquired by the sensor are obtained: C_1 = {a_1^1(-20), a_2^1(-15), a_3^1(-10), a_4^1(-5), a_5^1(0), a_6^1(5), a_7^1(10), a_8^1(15), a_9^1(20)}.
Similarly, when collecting the data of the sensor power supply performance fault, the pitch rod is adjusted to the same 9 gear positions, giving 9 sets of data acquired by the sensor: C_2 = {a_1^2(-20), a_2^2(-15), a_3^2(-10), a_4^2(-5), a_5^2(0), a_6^2(5), a_7^2(10), a_8^2(15), a_9^2(20)}.
Each piece of data in training subset C_1 and training subset C_2 is condition data; the two training subsets are used as the training set, 18 pieces of condition data in total. Each piece of condition data is also labeled with a condition label; for example, condition data a_1^1(-20) is labeled with the condition label '-20'. Thus, for each piece of data in the training set, its specific performance fault and the current state of the device are known.
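As a concrete illustration, the 18 labeled condition data of the pitch-rod example can be sketched in Python. The record layout and the placeholder sensor readings are assumptions for illustration only; the gear values and fault labels come from the example above.

```python
# Illustrative sketch of the labeled training set from the pitch-rod example.
# The dict keys and the placeholder readings are assumptions, not part of the method.

GEARS = [-20, -15, -10, -5, 0, 5, 10, 15, 20]  # the 9 gear values

def build_subset(fault_label, readings):
    """Pair each condition (gear value) with a reading, a condition label,
    and the performance-fault label of the whole subset."""
    return [{"condition": g, "reading": r, "fault": fault_label}
            for g, r in zip(GEARS, readings)]

# Hypothetical readings collected under each injected fault.
C1 = build_subset("C-redundancy pitch rod disconnection fault", [0.0] * 9)
C2 = build_subset("C-redundancy pitch rod sensor power supply fault", [None] * 9)

training_set = C1 + C2  # 2 subsets x 9 condition data = 18 labeled samples
```

Every sample thus carries both a condition label (the gear value) and a performance-fault label, as required for training.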
Step S2: and training the fault diagnosis model by using a method of a decision tree, a random forest or a deep neural network according to the training set so as to construct the fault diagnosis model.
As an implementable mode, the fault diagnosis model is trained and constructed using a decision tree. The condition data in the training set C (comprising training subset C_1 and training subset C_2) are used as leaf nodes of the decision tree, and training subset C_1 and training subset C_2 are each used as a root node. The training set is segmented by recursive optimal feature selection, so that each piece of condition data has an optimal classification result.
Because each condition data is labeled with a condition label, each condition data can be correctly classified into a corresponding training subset after the decision tree training is carried out. And repeating the steps until all the condition data in the training set are correctly classified, namely, all the condition data are finally segmented into corresponding root nodes, so that a decision tree is generated, and the training of the fault diagnosis model is completed. And inputting the test set into a decision tree to complete the construction of the fault diagnosis model. In the present embodiment, the condition tags are used to classify the condition data, and therefore the condition tags are selected features.
However, when the data volume of the training set is very large, labeling the condition data one by one increases the workload. Therefore, when selecting features for classification, the selection criterion may be the information gain, the information gain ratio, or the Gini index.
(I) When features are selected by the information gain method, the condition data are taken as a random variable X with the probability distribution:

P(X = x_i) = p_i,  i = 1, 2, ..., n

wherein x_i is the ith condition datum, n is the number of condition data, and p_i is the probability of the ith condition datum.
The entropy of the random variable X is then:

H(X) = -Σ_{i=1}^{n} p_i log p_i
Entropy is a measure of the uncertainty of a random variable: the larger the entropy value, the larger the uncertainty. From the entropy of each random variable X, the joint entropy of several random variables can be obtained; for example, the joint entropy of the random variables X and Y is:

H(X, Y) = -Σ_{x,y} p(x, y) log p(x, y)
After the joint entropy is obtained, the expression of the conditional entropy follows:

H(X|Y) = H(X, Y) - H(Y) = -Σ_{x,y} p(x, y) log p(x|y)
The conditional entropy measures the uncertainty of the random variable X that remains after the random variable Y is known, so the information gain represents the degree to which knowing the information of feature Y reduces the uncertainty of feature X. Assuming A is a certain feature in the training set C, the information gain of feature A with respect to the training set C is expressed as:

g(C, A) = H(C) - H(C|A)
H(C) represents the uncertainty of classifying the training set C, and H(C|A) represents the uncertainty of classifying the training set C given the feature A; their difference, the information gain g(C, A), represents the degree to which the uncertainty of classifying the training set C is reduced by the given feature A. The larger the information gain, the stronger the classification capability of the feature, so the feature with the larger information gain can be selected as the classification feature.
The method of selecting features by the information gain criterion is to compute the information gain of each feature and select the feature with the largest information gain for classification. Suppose the training set is C, |C| is its sample capacity, and there are K classes D_k, k = 1, 2, ..., K, with |D_k| the number of samples of class D_k.
Feature A has n different values {a_1, a_2, ..., a_n}; according to feature A, the training set C is divided into n training subsets C_1, C_2, ..., C_i, ..., C_n, where |C_i| is the number of samples taking the ith value of feature A. Let the subset of C_i belonging to class D_k be C_ik, i.e. C_ik = C_i ∩ D_k, with |C_ik| the number of samples of C_ik. The information gain algorithm is as follows:
1. Input the training set C and the feature A, and calculate the entropy H(C):

H(C) = -Σ_{k=1}^{K} (|D_k|/|C|) log2(|D_k|/|C|)

2. Calculate the conditional entropy H(C|A):

H(C|A) = Σ_{i=1}^{n} (|C_i|/|C|) H(C_i) = -Σ_{i=1}^{n} (|C_i|/|C|) Σ_{k=1}^{K} (|C_ik|/|C_i|) log2(|C_ik|/|C_i|)

3. Calculate the information gain g(C, A):

g(C, A) = H(C) - H(C|A)
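The entropy and information-gain computation above can be sketched directly in Python; the function names and the dict-of-samples representation are illustrative assumptions, while the formulas follow the algorithm just described.

```python
import math
from collections import Counter

def entropy(labels):
    """H(C) = -sum over classes of (|D_k|/|C|) * log2(|D_k|/|C|)."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def info_gain(samples, feature, label):
    """g(C, A) = H(C) - H(C|A), with samples as a list of dicts.
    H(C|A) weights the entropy of each subset C_i by |C_i|/|C|."""
    base = entropy([s[label] for s in samples])
    cond = 0.0
    for v in {s[feature] for s in samples}:          # each value a_i of feature A
        part = [s[label] for s in samples if s[feature] == v]
        cond += (len(part) / len(samples)) * entropy(part)
    return base - cond
```

For a feature that perfectly separates two balanced classes, the gain equals the full entropy H(C) = 1 bit, matching the intuition that such a feature removes all classification uncertainty.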
(II) When features are selected by the information gain ratio, the adverse effect of the information gain being biased toward features with many values as a division basis can be avoided.
The information gain ratio g_R(C, A) of feature A with respect to the training set C is defined as the ratio of its information gain g(C, A) to the entropy H_A(C) of the training set C with respect to feature A:

g_R(C, A) = g(C, A) / H_A(C)

The feature entropy H_A(C) is expressed as:

H_A(C) = -Σ_{i=1}^{n} (|C_i|/|C|) log2(|C_i|/|C|)

wherein n is the number of values of feature A, |C_i| is the number of samples taking the ith value, and |C| is the sample capacity.
(III) When features are selected by the Gini coefficient, suppose there are K classes and the probability of the kth class is p_k; then the Gini coefficient is expressed as:

Gini(p) = Σ_{k=1}^{K} p_k (1 - p_k) = 1 - Σ_{k=1}^{K} p_k^2

The larger the Gini coefficient, the larger the uncertainty of the training set. For the training set C, if it is divided into training subsets C_1 and C_2 according to a certain value a of feature A, then under the condition of feature A the Gini coefficient of the training set C is expressed as:

Gini(C, A) = (|C_1|/|C|) Gini(C_1) + (|C_2|/|C|) Gini(C_2)
in conclusion, features are selected in the mode of information gain, information gain ratio or a kini coefficient, so that the classification method of the decision tree is generated, and is suitable for processing samples with missing attributes, for example, when data in a training set C has attribute missing; the method is suitable for processing mass data, for example, when the data volume in the training set C is huge, feasible and reliable results can be made for a large data source in a relatively short time; the method is suitable for the cases of which the classification details need to be visually displayed and has strong interpretability.
As another implementable mode, a random forest is used to train and construct the fault diagnosis model. The random forest is an ensemble algorithm: by combining multiple weak classifiers and voting on the final result, the overall fault diagnosis model achieves high accuracy and generalization capability.
The random forest uses decision trees whose features are selected by the Gini coefficient as weak classifiers, and improves the building of the decision tree: an ordinary decision tree selects an optimal feature from all n sample features to divide the left and right subtrees.
A random forest instead selects a portion of the sample features, n_sub (n_sub < n), and chooses an optimal feature among them to divide the left and right subtrees of the decision tree, which further enhances the generalization capability of the constructed fault diagnosis model; the smaller n_sub, the more robust the fault diagnosis model. The random forest algorithm is as follows:
1. Input the training set C and the number of classifier iterations T; for t = 1, 2, ..., T, randomly sample the training set C with replacement to obtain a sampling set C_t.
2. Use the sampling set C_t to train the tth decision tree model G_t(x). When training the nodes of the decision tree model, select a portion of the sample features from all the sample features at each node, and choose an optimal feature among them to divide the left and right subtrees of the decision tree.
3. The class receiving the largest number of votes over the T iterations is taken as the final class of the data in the training set; if two or more classes tie for the largest number of votes, one of them is selected as the final class.
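The bootstrap-and-vote skeleton of the three steps above can be sketched as follows. For brevity the weak classifier is a stub that predicts the majority class of its sampling set; a real implementation would grow a Gini-based decision tree at that point, and all names here are illustrative assumptions.

```python
import random
from collections import Counter

def bootstrap(training_set, rng):
    """Step 1: draw a sampling set C_t of the same size, with replacement."""
    return [rng.choice(training_set) for _ in training_set]

def majority_vote(labels):
    """Step 3: the class with the most votes wins (ties broken by count order)."""
    return Counter(labels).most_common(1)[0][0]

def train_weak_classifier(sampling_set):
    """A deliberately weak G_t(x): always predict the majority class of C_t.
    (A real random forest would train a Gini-criterion decision tree here.)"""
    majority = majority_vote([label for _, label in sampling_set])
    return lambda x: majority

def random_forest_predict(training_set, x, T=5, seed=0):
    """Train T weak classifiers on bootstrap samples and vote on the input x."""
    rng = random.Random(seed)
    classifiers = [train_weak_classifier(bootstrap(training_set, rng))
                   for _ in range(T)]
    return majority_vote([g(x) for g in classifiers])
```

The design choice to sample with replacement means each weak classifier sees a slightly different view of the training set, which is what makes the vote more robust than any single tree.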
A random forest can thus classify the data. Its application conditions include those of the decision tree, and it is additionally suitable when no feature selection is made, when no generalization processing is made, and when the weak classifiers need to be processed in parallel.
Whether a decision tree or a random forest is used, a fault diagnosis model can be constructed from the training set and test set prepared in advance. After the fault diagnosis model is constructed, however, it still needs to be evaluated to guarantee its accuracy in use and to ensure that no errors occurred during its construction.
As another implementable mode, the fault diagnosis model is trained by a deep neural network method; if the output of the fault diagnosis model has errors, the model learns repeatedly to reduce or eliminate them.
The deep neural network (DNN) is a multi-layer feedforward neural network trained by the error back propagation algorithm and is currently the most widely applied neural network. Training the fault diagnosis model consists of two parts: forward propagation of the signal and backward propagation of the error.
In DNN forward propagation, an input sample enters at the input layer of the fault diagnosis model, is processed layer by layer through the hidden layers, and is passed to the output layer. If the output of the output layer differs from the expectation, the error is passed back layer by layer as an adjustment signal, and the connection weight matrices between neurons are adjusted so that the error decreases. Through repeated learning, the error finally falls within an acceptable range.
The deep neural network layers can be divided into three types: input layer, hidden layer, and output layer. Referring to fig. 2, the first layer is the input layer, the middle layers are hidden layers, and the last layer is the output layer. The layers are fully connected: any neuron of the ith layer is connected to every neuron of the (i+1)th layer.
In defining the linear relation coefficient w, referring to fig. 3, for example w_24^3 represents the linear coefficient from the 4th neuron of the second layer to the 2nd neuron of the third layer: the superscript 3 corresponds to the layer number, and the subscripts correspond to the output index 2 of the third layer and the input index 4 of the second layer. The linear coefficient from the kth neuron of the (i-1)th layer to the jth neuron of the ith layer is defined as w_jk^i.
In defining the bias b, referring to fig. 4, for example b_3^2 indicates the bias of the third neuron of the second layer: the superscript 2 represents the layer in which it is located, and the subscript 3 represents the index of the neuron in which it is located. Likewise, the bias of the first neuron of the third layer is expressed as b_1^3. The bias of the jth neuron of the ith layer is defined as b_j^i.
In carrying out the DNN forward propagation algorithm, the activation function is σ and the output values of the hidden layers and the output layer are a; the output of the next layer is calculated from the output of the previous layer. Referring to fig. 5, for example, for the outputs a_1^2, a_2^2, a_3^2 of the second layer (the superscript of a represents the layer number, the subscript represents the neuron index, and x represents the neurons of the input layer):

a_j^2 = σ(z_j^2) = σ(Σ_k w_jk^2 · x_k + b_j^2),  j = 1, 2, 3

Assuming that there are m neurons in the (i-1)th layer, the output a_j^i of the jth neuron of the ith layer is:

a_j^i = σ(z_j^i) = σ(Σ_{k=1}^{m} w_jk^i · a_k^{i-1} + b_j^i)
if the output of the representation of each element by using an algebraic method is complex, the matrix method is simple to use. Assuming that there are m neurons in the i-1 th layer and n neurons in the i-th layer, the linear coefficients w of the i-th layer form an n × m matrix
The offset b of the ith layer constitutes an
n x 1 vector
The output a of the i-1 th layer constitutes an
m x 1 vector
. The output of the ith layer is represented by a matrix method as:
the forward propagation of DNN is to use several weight coefficient matrixes W, bias vectors b and input value vectors x to perform a series of linear operations and activation operations, starting from an input layer, calculating backwards layer by layer until an output layer is operated to obtain an output result.
Thus, DNN forward propagation can be summarized as:
Input: the total number of layers L, the matrices W and bias vectors b corresponding to all hidden layers and the output layer, and the input value vector x;
Output: the output a^L of the output layer.
The method comprises the following steps:
1. Initialize a^1 = x;
2. For i = 2 to L, calculate:

a^i = σ(z^i) = σ(W^i · a^{i-1} + b^i)

The final result is the output a^L.
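The forward propagation summarized above can be sketched in plain Python. This is a minimal illustration: the sigmoid activation and the nested-list representation of W and b are assumptions, not mandated by the method.

```python
import math

def sigmoid(z):
    """An example activation function sigma (an assumption for illustration)."""
    return 1.0 / (1.0 + math.exp(-z))

def layer_forward(W, b, a_prev):
    """a^i = sigma(W^i a^{i-1} + b^i), with W as a list of rows (n x m)."""
    return [sigmoid(sum(w_jk * a_k for w_jk, a_k in zip(row, a_prev)) + b_j)
            for row, b_j in zip(W, b)]

def dnn_forward(weights, biases, x):
    """Step 1: a^1 = x. Step 2: for i = 2..L, propagate layer by layer."""
    a = x
    for W, b in zip(weights, biases):
        a = layer_forward(W, b, a)
    return a  # a^L, the output of the output layer
```

With all weights and biases zero, each layer outputs sigmoid(0) = 0.5 regardless of the input, which is a convenient sanity check of the propagation order.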
During DNN back propagation, if errors exist, the errors are passed back layer by layer as adjustment signals, and the connection weight matrices between neurons are adjusted. The back propagation can be summarized as:
Input: the total number of layers L, the number of neurons of each hidden layer and the output layer, the activation function, the loss function, the iteration step size β, the maximum number of iterations MAX, the stop-iteration threshold ε, and m input training subsets C_1, C_2, ..., C_m;
Output: the linear relation coefficient matrix W and bias vector b of each hidden layer and the output layer.
The method comprises the following steps:
1. Initialize the linear relation coefficient matrix W and the bias vector b of each hidden layer and the output layer to random values;
2. For iter = 1 to MAX:
2-1. For i = 1 to m:
2-1a. Set the DNN input a^{i,1} to x_i;
2-1b. For l = 2 to L, calculate by the forward propagation algorithm: a^{i,l} = σ(z^{i,l}) = σ(W^l · a^{i,l-1} + b^l);
2-1c. Calculate the output-layer gradient δ^{i,L} from the loss function;
2-1d. For l = L-1 to 2, calculate by the back propagation algorithm: δ^{i,l} = (W^{l+1})^T δ^{i,l+1} ⊙ σ'(z^{i,l});
2-2. For l = 2 to L, update the linear relation coefficient matrix W^l and bias vector b^l of the lth layer:

W^l = W^l - β Σ_{i=1}^{m} δ^{i,l} (a^{i,l-1})^T
b^l = b^l - β Σ_{i=1}^{m} δ^{i,l}

2-3. If all changes of W and b are smaller than the stop-iteration threshold ε, jump out of the iteration loop to the next step;
3. Output the linear relation coefficient matrix W and the bias vector b of each hidden layer and the output layer.
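The training loop above can be illustrated for the smallest possible case: a single sigmoid neuron with squared loss. The choice of loss, activation, and parameter names are assumptions for illustration; the structure mirrors steps 2-1 through 2-3 (forward pass, gradient, update, stop threshold).

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_single_neuron(samples, beta=0.5, max_iter=200, eps=1e-6):
    """Gradient descent on one sigmoid neuron with squared loss.
    delta = (a - y) * a * (1 - a) is the output-layer gradient (step 2-1c);
    the summed updates mirror step 2-2; eps mirrors the stop threshold 2-3."""
    w, b = 0.0, 0.0
    for _ in range(max_iter):
        dw = db = 0.0
        for x, y in samples:                    # loop over the m samples (2-1)
            a = sigmoid(w * x + b)              # forward pass (2-1b)
            delta = (a - y) * a * (1.0 - a)     # output-layer gradient (2-1c)
            dw += delta * x
            db += delta
        w, b = w - beta * dw, b - beta * db     # update step (2-2)
        if abs(beta * dw) < eps and abs(beta * db) < eps:  # threshold (2-3)
            break
    return w, b

def loss(samples, w, b):
    """Squared loss J = (1/2) * sum of (a - y)^2 over the samples."""
    return sum((sigmoid(w * x + b) - y) ** 2 for x, y in samples) / 2.0
```

On the toy data {(-1, 0), (1, 1)} the loss starts at 0.25 (both outputs stuck at 0.5) and drops as the weight grows toward separating the two targets.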
The deep neural network DNN can realize data classification, is particularly suitable for the condition of discovering the nonlinear relation between model input and model output, can learn and store a large number of input-output mode mapping relations without a mathematical equation for describing the mapping relations in advance, and is a good choice for training a fault diagnosis model.
Step S3: and using the sensor data which is not subjected to sample and characteristic processing as a test set to verify the constructed fault diagnosis model.
A sensor is used to collect data of the equipment under arbitrary conditions as a test set; for the collected test set, it is unknown whether the equipment performance has a fault, and unknown what fault the equipment performance has. Z = {b_1, b_2, ..., b_n}, where b is data of the equipment acquired by the sensor under an arbitrary condition and n is the number of data acquired by the sensor.
For example, the C-redundancy pitch rod is now in an arbitrary situation, and it is unknown whether its performance is faulty or, specifically, what kind of fault it has. The pitch rod is again adjusted to the 9 different gear positions, so that 9 sets of data acquired by the sensor are obtained as the test set: Z = {b_1(-20), b_2(-15), b_3(-10), b_4(-5), b_5(0), b_6(5), b_7(10), b_8(15), b_9(20)}.
Since it is unknown whether the equipment corresponding to the data in the test set has a performance failure, and also unknown what specific performance failure has occurred, the data in the test set are not labeled; they are random data.
And inputting the data of the test set into a fault diagnosis model, and judging whether the result output by the fault diagnosis model is consistent with the original equipment performance fault of the data.
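The verification step can be sketched as a simple accuracy check, under the assumption (left implicit above) that the true performance fault of each test sample is known to the evaluator for comparison; the function and parameter names are illustrative.

```python
def verify_model(model, test_set):
    """Compare the model's output with the true performance fault of each
    test sample. `model` is any callable mapping sensor data to a fault
    label (an assumption for illustration); returns the fraction correct."""
    correct = sum(1 for data, true_fault in test_set
                  if model(data) == true_fault)
    return correct / len(test_set)
```

A verification accuracy close to 1.0 indicates the constructed fault diagnosis model outputs results consistent with the original equipment performance faults.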
Step S4: and after the fault diagnosis model is verified, inputting the newly acquired sensor data into the fault diagnosis model to obtain a diagnosis result.
The above description is only for the specific embodiments of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present invention, and all the changes or substitutions should be covered within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.