CN116164880A - CNN-GRU neural network-based pressure sensor fault detection method - Google Patents
- Publication number: CN116164880A (application CN202211622727.4A)
- Authority: CN (China)
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G—PHYSICS
- G01—MEASURING; TESTING
- G01L—MEASURING FORCE, STRESS, TORQUE, WORK, MECHANICAL POWER, MECHANICAL EFFICIENCY, OR FLUID PRESSURE
- G01L25/00—Testing or calibrating of apparatus for measuring force, torque, work, mechanical power, or mechanical efficiency
Abstract
The invention relates to a pressure sensor fault detection method based on a CNN-GRU neural network, and belongs to the field of the Internet of Things. The method establishes a fault model of a pressure sensor node and detects pressure sensor array faults based on a CNN-GRU neural network. As common deep learning fault diagnosis methods, CNN and RNN networks each have limitations: a CNN can only extract the spatial features of a signal and cannot mine features of the data in the time dimension, while an RNN suffers from vanishing or exploding gradients. To address these problems, a CNN-GRU neural network fault detection method is provided that fully mines the temporal and spatial characteristics of the data, thereby improving the accuracy of detecting pressure sensor array signal faults.
Description
Technical Field
The invention belongs to the field of the Internet of Things and relates to a pressure sensor fault detection method based on a CNN-GRU neural network.
Background
The pressure sensing array is composed of multiple pressure sensors and converts pressure signals into electrical signals that can be read out. Pressure sensing arrays are widely used in the medical field: equipment based on them is applied to joint disease diagnosis, posture correction, medical rehabilitation, and the like. Yang Lili et al. developed a 64-point piezoelectric array seated-posture body-pressure measuring system that corrects human sitting posture and helps prevent lumbar disease; Yang Min designed a human gait analysis system based on plantar pressure sensors that collects information on three movement states (walking, running, and squatting) and supports medical diagnosis and rehabilitation evaluation.
However, the pressure sensors in medical equipment wear and age under long-term use, so the acquired pressure data become abnormal. Equipment failures or anomalies can interrupt a patient's treatment midway or lead to misdiagnosis, greatly complicating medical care. A reliable fault detection method for pressure sensing arrays is therefore of practical importance. At present, equipment health maintenance falls into three categories: on-demand maintenance, periodic maintenance, and predictive maintenance. Although on-demand and periodic maintenance guarantee equipment precision and reliability to some extent, both have shortcomings. On-demand maintenance repairs the equipment only after it has failed; it is a compensatory measure, and medical work stops during the repair, severely affecting efficiency. Periodic maintenance services the equipment according to prior experience, which can be untimely or unnecessary and wastes many maintenance resources. Predictive maintenance predicts the remaining life of a device and the faults that may occur by analyzing the device's signal data.
Performing predictive maintenance from device signals is commonly called data-driven fault diagnosis. Such methods include signal processing, multivariate statistical analysis, and machine learning. The basic idea of machine-learning fault diagnosis is to collect device data under normal and various fault conditions, train an SVM or neural network model on the collected mass of data so that it extracts the data's features, and then feed newly measured data into the model for diagnosis. This approach automates and adds intelligence to diagnosis, can handle extremely complex data, and yields prediction models with high robustness. By the number of layers in its structure, machine learning is generally divided into shallow learning and deep learning.
A shallow learning model basically has only one hidden layer, or none at all, yet offers strong self-learning ability, nonlinear mapping ability, and good robustness. The mainstream shallow methods include the artificial neural network (ANN), the support vector machine (SVM), and Boosting algorithms. An ANN is a network of many interconnected neurons that abstractly mimics how the human brain processes information, and it generalizes and self-learns well. However, such networks easily fall into local optima, fault features must be extracted manually, diagnosis accuracy depends heavily on the quality of those features, and the feature extraction process requires expert participation. The support vector machine is a machine-learning method grounded in statistical learning theory. For a linearly separable problem, the SVM constructs a classification hyperplane that correctly separates the two sample classes; by the risk minimization principle, solving for this hyperplane is converted into a convex quadratic programming problem, which yields a global optimum. For nonlinear problems, the sample data are first mapped into a high-dimensional feature space, linear classification is achieved after this nonlinear transformation using a suitable kernel function, and the classification hyperplane is finally solved in the high-dimensional space. The SVM's advantages include strong generalization, theoretical support for choosing network parameters, and globally optimal solutions. However, it handles large-scale samples poorly, multi-class classification is difficult, classification performance is strongly affected by the kernel function, and constructing a satisfactory kernel is hard.
A deep neural model is a neural network with multiple hidden layers. A deep learning model is built and trained on massive data; the original data are transformed and features extracted layer by layer, then combined into high-level category attributes or features, uncovering a distributed feature representation of the data and thereby improving classification or prediction accuracy. The currently predominant deep learning methods include the convolutional neural network (CNN) and the recurrent neural network (RNN).
The CNN was inspired by studies of the cat visual cortex and consists mainly of an input layer, convolution layers, downsampling (pooling) layers, fully connected layers, and an output layer. Its basic idea is to reduce parameters and complexity through local receptive fields and weights shared among neurons, while the pooling layers reduce feature dimensionality and so improve the network's computational efficiency. CHEN et al. studied CNNs for gearbox fault recognition and classification, using manually preprocessed vibration signals at 5 rotational frequencies as the CNN's input feature vectors and applying 4 load conditions at each frequency to realistically simulate the most likely industrial scenarios [35]. JANSSENS et al. used a CNN directly as a feature learning model to automatically learn features useful for bearing fault detection; its diagnosis accuracy was clearly higher than that of methods with hand-crafted features, demonstrating the CNN's powerful feature extraction ability. However, a CNN can only extract the spatial features of signals and cannot mine features of the data in the time dimension, which limits it.
The RNN is a deep neural network with a special structure: the output at the current moment of a sequence depends on earlier information, much as a person's current cognition is shaped by prior knowledge, experience, and memory. It effectively extracts the temporal correlations and deep features of fault signals, further improving the accuracy and reliability of intelligent fault diagnosis. Huang et al. accurately predicted motor failure modes from motor vibration time-domain signals alone by constructing an RNN-based variational autoencoder (VAE) architecture. The RNN, however, has its own drawbacks: as the processed sequence lengthens and the network structure deepens, it is prone to vanishing or exploding gradients and insufficient long-term memory, a dilemma that must be resolved by optimizing its structure.
Disclosure of Invention
In view of the above, the present invention aims to provide a pressure sensor fault detection method based on a CNN-GRU neural network, which fully mines the temporal and spatial characteristics of the data and thereby improves the accuracy of detecting pressure sensor array signal faults.
In order to achieve the above purpose, the present invention provides the following technical solutions:
the method comprises the steps of establishing a fault model of a pressure sensor node and detecting a pressure sensor array fault based on the CNN-GRU neural network;
the fault model for building the pressure sensor node is specifically as follows:
the faults are classified, according to their manifestation, into stuck faults, constant gain faults, constant deviation faults and impact faults;
let $Y_{out}^{i}(t)$ be the actual output of the $i$-th sensor at time $t$, and $Y_{out}^{i\prime}(t)$ the signal it would output at time $t$ in its fault-free state; each fault type model is as follows:

stuck fault: the sensing element of the sensor fails and the sensor output is a constant $A$:

$Y_{out}^{i}(t)=A$ (1)

constant gain fault: a complex working environment causes the gain of the sensor to change greatly, with $\beta_{i}$ the scaling factor of the gain variation:

$Y_{out}^{i}(t)=\beta_{i}\,Y_{out}^{i\prime}(t)$ (2)

constant deviation fault: under long-term use the pressure-sensing element of the sensor wears and drifts slowly, so that within a certain period the sensor output exhibits a constant deviation $\Delta$:

$Y_{out}^{i}(t)=Y_{out}^{i\prime}(t)+\Delta$ (3)

impact fault: external interference produces a mutation $D$ within a short time, where $\theta(t)=1$ when $t=p$ and $\theta(t)=0$ when $t\neq p$:

$Y_{out}^{i}(t)=Y_{out}^{i\prime}(t)+D\,\theta(t)$ (4).
optionally, the fault detection of the pressure sensor array based on the CNN-GRU neural network specifically comprises:
the CNN-GRU network consists of a convolutional neural network CNN and a gated recurrent unit network GRU;
the CNN network consists of a convolution layer, an activation layer, a pooling layer and a fully connected layer; the convolution layer uses a convolution kernel to convolve a local region of the previous layer's input and generate the corresponding features; the convolution operation is shown in formula (5):

$y_{l}^{i,j}=\sum_{j'=1}^{W} w_{l}^{i,j'}\,x_{l}^{\,j+j'-1}$ (5)

where $w_{l}^{i,j'}$ denotes the $j'$-th weight of the $i$-th convolution kernel of the $l$-th layer, $x_{l}^{\,j}$ denotes the $j$-th local region of the $l$-th layer's input, and $W$ denotes the width of the convolution kernel;

the activation layer gives the deep neural network its layered nonlinear mapping capability through an activation function, mapping the multidimensional features into a new space; the pooling layers, placed between successive convolution layers, perform downsampling; the excitation function of each neuron of the fully connected layer is the ReLU function; the output value of the last fully connected layer is passed to the output and classified by softmax logistic regression;
the GRU network is composed of a reset gate and an update gate;
the specific structure of the CNN-GRU fusion network comprises an input layer, a CNN network layer, a GRU network layer and an output layer; the CNN consists of three convolution layers, three pooling layers and one fully connected layer, and a three-layer GRU network is connected after the CNN's fully connected layer;
input layer: the data preprocessing is to normalize the time sequence acquired by the pressure sensing array and then input the normalized time sequence as a CNN-GRU fusion network model, as shown in a formula (6):
wherein:i represents the sample of the i-th sample,representing a sensor time sequence at the T-th sampling time, wherein the sensor time sequence internally contains the reading information of n sensors at the current time, n is the number of pressure sensors, and T is the sampling times contained in a sample;
CNN layer: the CNN layer extracts features from the input data; with the 'Valid' convolution mode, the output of the convolution operation undergoes batch normalization, is added to the bias, and is then passed through the ReLU function as the output of the convolution layer; the ReLU function is:

$\mathrm{ReLU}(x)=\max(0,x)$

the pooling layers use max pooling in 'Valid' mode; convolution and pooling are shown in formula (7):

$y=\mathrm{BN}\!\left(W_{i}*P_{i-1}^{*}+b_{i}\right),\quad C_{i}=\mathrm{ReLU}(y),\quad P_{i}=\mathrm{Maxpool}(C_{i}),\quad i=1,2,3$ (7)

where $P_{0}^{*}$ is taken as the network input; $y$, $C_{i}$, $P_{i}$ respectively denote the output matrix after the BN operation, the output of convolution layer $i$ and the output of pooling layer $i$, with $1\le i\le 3$ and $i$ an integer; $P_{1}^{*}$ and $P_{2}^{*}$ are obtained by summing and averaging all feature maps of $P_{1}$ and $P_{2}$ respectively; $W_{1}$, $W_{2}$ and $W_{3}$ are weight matrices; $b_{i}$ is a bias term; $*$ and $\mathrm{Maxpool}(\cdot)$ respectively denote the convolution operation and the max-pooling function;
GRU layer: the high-dimensional abstract feature matrix obtained by convolution is processed by a reshape function and then used as the input of the three-layer GRU network; the reshape function is expressed in formula (8); b feature vectors of dimension a are input into the three-layer GRU to obtain the output $h_{3b}$;

$B=\mathrm{reshape}(A,\mathrm{size})$ (8)

this reconstructs the matrix $A$ into a multidimensional array $B$ with the same elements as $A$, whose dimensions are determined by the vector size;

output layer: the input of the output layer is the output $h_{3b}$ of the previous layer; the output probability value is calculated through the fully connected layer and softmax; the softmax formula is:

$S_{i}=\dfrac{e^{x_{i}}}{\sum_{j=1}^{n}e^{x_{j}}}$

where $x_{i}$ denotes the output value of the $i$-th node, $n$ is the number of output nodes, and $S_{i}$ is the probability that the sample belongs to the $i$-th class.
The invention has the beneficial effects that:
As common deep learning fault diagnosis methods, CNN and RNN networks each have limitations: a CNN can only extract the spatial features of a signal and cannot mine features of the data in the time dimension, while an RNN suffers from vanishing or exploding gradients. To address these problems, a CNN-GRU neural network fault detection method is provided that fully mines the temporal and spatial characteristics of the data, thereby improving the accuracy of detecting pressure sensor array signal faults.
Additional advantages, objects, and features of the invention will be set forth in part in the description which follows and in part will become apparent to those having ordinary skill in the art upon examination of the following or may be learned from practice of the invention. The objects and other advantages of the invention may be realized and obtained by means of the instrumentalities and combinations particularly pointed out in the specification.
Drawings
For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in preferred detail below with reference to the accompanying drawings, in which:
FIG. 1 is a flow chart of the present invention.
Detailed Description
Other advantages and effects of the present invention will become apparent to those skilled in the art from the following disclosure, which describes embodiments of the present invention with reference to specific examples. The invention may also be practiced or carried out in other embodiments, and the details of this description may be modified or varied in various ways without departing from the spirit and scope of the present invention. It should be noted that the illustrations provided in the following embodiments merely illustrate the basic idea of the present invention by way of example, and the following embodiments and the features within them may be combined with each other in the absence of conflict.
Wherein the drawings are for illustrative purposes only and are shown in schematic, non-physical, and not intended to limit the invention; for the purpose of better illustrating embodiments of the invention, certain elements of the drawings may be omitted, enlarged or reduced and do not represent the size of the actual product; it will be appreciated by those skilled in the art that certain well-known structures in the drawings and descriptions thereof may be omitted.
The same or similar reference numbers in the drawings of embodiments of the invention correspond to the same or similar components; in the description of the present invention, it should be understood that, if there are terms such as "upper", "lower", "left", "right", "front", "rear", etc., that indicate an azimuth or a positional relationship based on the azimuth or the positional relationship shown in the drawings, it is only for convenience of describing the present invention and simplifying the description, but not for indicating or suggesting that the referred device or element must have a specific azimuth, be constructed and operated in a specific azimuth, so that the terms describing the positional relationship in the drawings are merely for exemplary illustration and should not be construed as limiting the present invention, and that the specific meaning of the above terms may be understood by those of ordinary skill in the art according to the specific circumstances.
As shown in fig. 1, the present invention is composed of two parts: a fault model of a pressure sensor node and a pressure sensor array fault detection method based on a CNN-GRU neural network.
(1) Fault model of sensor node
Sensor faults are mainly classified into hard faults and soft faults. A hard fault means that a module of the sensor node fails so that it can no longer communicate with other nodes (for example, communication failure caused by battery exhaustion, circuit failure, antenna damage, or CPU module failure). A soft fault refers to sensor errors caused by radiation interference, power interference, sensor failure, and the like; the sensor node may still operate and communicate with other nodes, but the data it perceives are incorrect. When a hard fault occurs, the sensor node effectively exits the wireless sensor network and is replaced or repaired. The present invention mainly targets the detection of node soft faults, which are classified according to their manifestation into stuck faults, constant gain faults, constant deviation faults, and impact faults.
Let $Y_{out}^{i}(t)$ be the actual output of the $i$-th sensor at time $t$, and $Y_{out}^{i\prime}(t)$ the signal it would output at time $t$ in its fault-free state. Each fault type model is as follows.

1) Stuck fault. The sensing element of the sensor fails and the sensor output is a constant $A$:

$Y_{out}^{i}(t)=A$ (1)

2) Constant gain fault. A complex working environment causes the gain of the sensor to change greatly, with $\beta_{i}$ the scaling factor of the gain variation:

$Y_{out}^{i}(t)=\beta_{i}\,Y_{out}^{i\prime}(t)$ (2)

3) Constant deviation fault. Under long-term use the pressure-sensing element of the sensor wears and drifts slowly, so that within a certain period the sensor output exhibits a constant deviation $\Delta$:

$Y_{out}^{i}(t)=Y_{out}^{i\prime}(t)+\Delta$ (3)

4) Impact fault. External interference produces a mutation $D$ within a short time, where $\theta(t)=1$ when $t=p$ and $\theta(t)=0$ when $t\neq p$:

$Y_{out}^{i}(t)=Y_{out}^{i\prime}(t)+D\,\theta(t)$ (4)
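As an illustrative sketch, the four soft-fault models of equations (1)-(4) can be injected into a fault-free signal as follows; the constants A, beta, delta, D and the impact instant p are arbitrary example values, not values from the invention:

```python
# Sketch of the four soft-fault models of equations (1)-(4); a faulty
# output sequence is produced from the fault-free sequence y_clean.

def stuck_fault(y_clean, A):
    # Eq. (1): the sensing element fails and the output is frozen at A
    return [A for _ in y_clean]

def constant_gain_fault(y_clean, beta):
    # Eq. (2): the output is scaled by the gain-variation factor beta
    return [beta * y for y in y_clean]

def constant_deviation_fault(y_clean, delta):
    # Eq. (3): a constant offset (slow drift) delta is added
    return [y + delta for y in y_clean]

def impact_fault(y_clean, D, p):
    # Eq. (4): a mutation D appears only at the instant t = p
    return [y + (D if t == p else 0.0) for t, y in enumerate(y_clean)]

y_clean = [1.0, 2.0, 3.0, 4.0]
print(stuck_fault(y_clean, 5.0))        # [5.0, 5.0, 5.0, 5.0]
print(impact_fault(y_clean, 10.0, p=2)) # [1.0, 2.0, 13.0, 4.0]
```

In training a detector, such injected sequences would supply the labeled fault samples; the labels correspond to the four fault classes.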
(2) CNN-GRU neural network fault detection method
The body of the CNN-GRU network consists of a convolutional neural network (CNN) and a gated recurrent unit network (GRU). Through its characteristic convolution-pooling operations, the CNN automatically mines the real-time features of the time sequence, laying the foundation for improved accuracy in sensor deviation fault diagnosis. The CNN network mainly comprises convolution layers, activation layers, pooling layers, and fully connected layers. A convolution layer uses a convolution kernel to convolve a local region of the previous layer's input and generate the corresponding features. The two main characteristics of the convolution layer are local connection and weight sharing. With local connection, a node of a convolution layer connects only to part of the nodes of its previous layer and learns only local features. This locally perceptive structure is inspired by the structure of the animal visual cortex, in which only part of the neurons respond when perceiving external objects. This connection pattern greatly reduces the number of parameters, speeds up learning, and to some extent lowers the risk of overfitting. The other major characteristic, weight sharing, means that the same convolution kernel traverses the input with a fixed stride; it further reduces the parameter count of the network layer, the memory the system requires, and the risk of overfitting. The convolution operation is shown in formula (5):

$y_{l}^{i,j}=\sum_{j'=1}^{W} w_{l}^{i,j'}\,x_{l}^{\,j+j'-1}$ (5)

where $w_{l}^{i,j'}$ denotes the $j'$-th weight of the $i$-th convolution kernel of the $l$-th layer, $x_{l}^{\,j}$ denotes the $j$-th local region of the $l$-th layer's input, and $W$ denotes the width of the convolution kernel.
The activation layer gives the deep neural network its layered nonlinear mapping capability through an activation function, mapping the multidimensional features into a new space. The pooling layers, placed between successive convolution layers, perform downsampling, which reduces dimensionality, removes redundant information, compresses features, simplifies network complexity, and reduces computation. The fully connected layer acts as the "classifier" of the network. To improve CNN performance, the excitation function of each neuron of the fully connected layer is usually the ReLU function. The output value of the last fully connected layer is passed to the output, where it can be classified with softmax logistic regression.
The GRU network is built from reset gates and update gates. The reset gate determines how the new input is combined with the previous memory, and the update gate determines how much of the previous memory is carried over to the current time. When the reset gate is 1 and the update gate is 0, the GRU degrades to a standard RNN model. These two gates determine the output of the gated recurrent unit; their distinctive property is that information in a long sequence can be preserved rather than removed over time or discarded as irrelevant to prediction. The GRU network can therefore memorize the different temporal correlations that arise from the different dynamic response characteristics of each sensor in the pressure sensor array, fully characterizing and modeling the time sequence and thus enabling deviation fault diagnosis of the array.
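A single GRU time step can be sketched as below. The weight shapes, random initialization, and concatenated-input formulation are illustrative assumptions rather than the invention's parameters; the update convention is chosen so that reset gate r = 1 and update gate z = 0 reduce the cell to a plain tanh RNN, matching the degradation property stated above:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h_prev, Wz, Wr, Wh):
    """One GRU step; each weight matrix acts on the concatenation [h_prev, x]."""
    hx = np.concatenate([h_prev, x])
    z = sigmoid(Wz @ hx)            # update gate: how much old memory to keep
    r = sigmoid(Wr @ hx)            # reset gate: how old memory enters the candidate
    h_cand = np.tanh(Wh @ np.concatenate([r * h_prev, x]))  # candidate state
    return z * h_prev + (1.0 - z) * h_cand  # with z = 0, r = 1: plain tanh RNN

# Illustrative sizes: 3-dimensional input, 4-dimensional hidden state
rng = np.random.default_rng(0)
n_in, n_h = 3, 4
Wz, Wr, Wh = (0.1 * rng.standard_normal((n_h, n_h + n_in)) for _ in range(3))
h = np.zeros(n_h)
for x in rng.standard_normal((5, n_in)):  # process 5 time steps
    h = gru_step(x, h, Wz, Wr, Wh)
print(h.shape)  # (4,)
```

A three-layer GRU, as used in the invention, simply feeds each layer's hidden-state sequence as the input sequence of the next layer.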
The specific structure of the CNN-GRU fusion network comprises an input layer, a CNN network layer, a GRU network layer, and an output layer. The CNN consists of three convolution layers, three pooling layers, and one fully connected layer, and a three-layer GRU network is connected after the CNN's fully connected layer. The fused network retains the CNN's processing power in the frequency and spatial domains while adding the GRU's ability to interpret the sequence in time.
Input layer: data preprocessing normalizes the time sequence acquired by the pressure sensing array, which is then fed as input to the CNN-GRU fusion network model, as shown in formula (6):

$X_{i}=\left[x_{i}^{1},x_{i}^{2},\ldots,x_{i}^{T}\right]$ (6)

where $i$ indexes the sample, $x_{i}^{t}\in\mathbb{R}^{n}$ denotes the sensor time-sequence reading at the $t$-th sampling instant (containing the readings of the $n$ sensors at that instant), $n$ is the number of pressure sensors, and $T$ is the number of sampling instants contained in a sample.
CNN layer: the CNN layer extracts features from the input data. The invention adopts the 'Valid' convolution mode: the output of the convolution operation undergoes batch normalization, is added to the bias, and is then passed through the ReLU function as the output of the convolution layer. The ReLU function is:

$\mathrm{ReLU}(x)=\max(0,x)$

The pooling layers use max pooling in 'Valid' mode; convolution and pooling are shown in formula (7):

$y=\mathrm{BN}\!\left(W_{i}*P_{i-1}^{*}+b_{i}\right),\quad C_{i}=\mathrm{ReLU}(y),\quad P_{i}=\mathrm{Maxpool}(C_{i}),\quad i=1,2,3$ (7)

where $P_{0}^{*}$ is taken as the network input; $y$, $C_{i}$, $P_{i}$ respectively denote the output matrix after the BN operation, the output of convolution layer $i$, and the output of pooling layer $i$, with $1\le i\le 3$ and $i$ an integer; $P_{1}^{*}$ and $P_{2}^{*}$ are obtained by summing and averaging all feature maps of $P_{1}$ and $P_{2}$ respectively; $W_{1}$, $W_{2}$ and $W_{3}$ are weight matrices; $b_{i}$ is a bias term; $*$ and $\mathrm{Maxpool}(\cdot)$ respectively denote the convolution operation and the max-pooling function.
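The per-layer computation described by formulas (5) and (7) — 'Valid' convolution, ReLU activation, then 'Valid' max pooling — can be sketched in one dimension as follows; the kernel values, pool width, and the omission of batch normalization are simplifications for illustration:

```python
import numpy as np

def conv1d_valid(x, w, b=0.0):
    """'Valid' 1-D convolution: the kernel never overruns the input,
    so the output length is len(x) - len(w) + 1 (cf. formula (5))."""
    W = len(w)
    return np.array([np.dot(x[j:j + W], w) + b for j in range(len(x) - W + 1)])

def relu(x):
    # ReLU(x) = max(0, x)
    return np.maximum(x, 0.0)

def maxpool1d(x, width):
    """'Valid' max pooling over non-overlapping windows of the given width."""
    n = (len(x) // width) * width
    return x[:n].reshape(-1, width).max(axis=1)

x = np.array([1.0, -2.0, 3.0, 0.5, -1.0, 2.0])
c = relu(conv1d_valid(x, np.array([1.0, -1.0])))  # convolution layer + ReLU
p = maxpool1d(c, 2)                               # pooling layer: downsample by 2
print(len(c), len(p))  # 5 2
```

Stacking three such conv-pool stages and a fully connected layer reproduces the shape of the CNN portion of the fusion network.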
GRU layer: the high-dimensional abstract feature matrix obtained by convolution is processed by a reshape function and then used as the input of the three-layer GRU network; the reshape function is given in formula (8). The b feature vectors of dimension a are fed into the three-layer GRU to obtain the output h_3b.
B=reshape(A,size) (8)
This reshapes matrix A into a multidimensional array B containing the same elements as A, the dimensions of B being determined by the vector size.
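A minimal plain-Python analogue of this reshape, with illustrative shapes (b = 3 feature vectors of dimension a = 4):

```python
def reshape(A, size):
    """Rebuild flat list A into a nested array B of shape size = (rows, cols),
    keeping exactly the same elements, as in B = reshape(A, size)."""
    rows, cols = size
    assert rows * cols == len(A), "size must match the number of elements"
    return [A[r * cols:(r + 1) * cols] for r in range(rows)]

A = [float(v) for v in range(12)]   # flat CNN feature output (12 values)
B = reshape(A, (3, 4))              # 3 feature vectors of dimension 4
```

The operation changes only the layout, never the values, so the GRU sees the same CNN features grouped into time-step vectors.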
Output layer: the input of the output layer is the output h_3b of the previous layer, and the output probability value is calculated through the fully connected layer and softmax. The Softmax formula is expressed as:

Softmax(x_i) = e^(x_i) / Σ_(j=1)^n e^(x_j)

where x_i denotes the output value of the i-th node, n is the number of output nodes, and Softmax(x_i) is the probability that the sample belongs to the i-th class.
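A sketch of the softmax mapping in plain Python (the max-subtraction is a standard numerical-stability detail, not something stated in the text; the logits are illustrative):

```python
import math

def softmax(x):
    """Map n output node values x_i to class probabilities summing to 1."""
    m = max(x)                          # subtract max for numerical stability
    e = [math.exp(v - m) for v in x]
    s = sum(e)
    return [v / s for v in e]

logits = [2.0, 1.0, 0.1]               # illustrative node outputs x_i
probs = softmax(logits)                # one probability per fault class
```

The predicted fault class is simply the index of the largest probability.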
Finally, it is noted that the above embodiments are only for illustrating the technical solution of the present invention and not for limiting the same, and although the present invention has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications and equivalents may be made thereto without departing from the spirit and scope of the present invention, which is intended to be covered by the claims of the present invention.
Claims (2)
1. The pressure sensor fault detection method based on the CNN-GRU neural network is characterized by comprising the following steps of: establishing a fault model of a pressure sensor node and detecting faults of a pressure sensor array based on a CNN-GRU neural network;
the fault model for building the pressure sensor node is specifically as follows:
according to their manifestation, faults are classified into stuck faults, constant gain faults, constant deviation faults and impact faults;
let Y_out_i(t) be the actual output of the i-th sensor at time t, and Y_out_i'(t) the signal it would output at time t in its fault-free state; the model of each fault type is as follows:
stuck fault: the sensing element of the sensor fails, and the output of the sensor is a constant A:
Y_out_i(t) = A (1)
constant gain fault: the complex working environment causes the gain value of the sensor to change greatly, β_i being the scaling factor of the gain change:
Y_out_i(t) = β_i·Y_out_i'(t) (2)
constant deviation fault: after long-term use the pressure sensing element of the sensor wears, producing a slow drift, so that within a certain time period the output data of the sensor exhibit a constant deviation Δ:
Y_out_i(t) = Y_out_i'(t) + Δ (3)
impact fault: the sensor is disturbed externally and produces an abrupt change D within a short time, where θ(t) = 1 when t = p and θ(t) = 0 when t ≠ p:
Y_out_i(t) = Y_out_i'(t) + D·θ(t) (4).
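The four fault models (1)-(4) can be simulated directly; a plain-Python sketch that injects each fault into a fault-free signal follows (the parameter values A, β, Δ, D and p are illustrative, not taken from the claim):

```python
import math

def inject_fault(y, kind, A=5.0, beta=1.5, delta=0.3, D=2.0, p=10):
    """Apply one of the four fault models to a fault-free signal y (a list)."""
    if kind == "stuck":                     # (1) Y(t) = A
        return [A for _ in y]
    if kind == "gain":                      # (2) Y(t) = beta * Y'(t)
        return [beta * v for v in y]
    if kind == "bias":                      # (3) Y(t) = Y'(t) + delta
        return [v + delta for v in y]
    if kind == "impact":                    # (4) Y(t) = Y'(t) + D*theta(t),
        return [v + (D if t == p else 0.0)  #     theta(t) = 1 only at t = p
                for t, v in enumerate(y)]
    raise ValueError(kind)

clean = [math.sin(0.3 * t) for t in range(20)]   # fault-free output Y'(t)
stuck = inject_fault(clean, "stuck")
gain = inject_fault(clean, "gain")
impact = inject_fault(clean, "impact")
```

Such synthetic injection is a common way to build labeled training samples for the four fault classes when real faulty recordings are scarce.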
2. The CNN-GRU neural network-based pressure sensor fault detection method of claim 1, wherein: the pressure sensor array fault detection based on the CNN-GRU neural network specifically comprises the following steps:
the CNN-GRU network consists of a convolutional neural network CNN and a gated recurrent unit network GRU;
the CNN network consists of a convolution layer, an activation layer, a pooling layer and a fully connected layer; the convolution layer uses a convolution kernel to perform a convolution operation on a local region of the input from the previous layer and generates the corresponding features; the convolution operation is shown in formula (5):

y_i^(l+1)(j) = Σ_(j'=1)^W w_i^l(j')·x_j^l(j') + b_i (5)

wherein w_i^l(j') denotes the j'-th weight of the i-th convolution kernel of the l-th layer, x_j^l denotes the j-th convolved local region of the l-th layer, and W denotes the width of the convolution kernel;
the activation layer endows the deep neural network with layered nonlinear mapping capability through an activation function, mapping the multidimensional features to a new space; the pooling layer is arranged between successive convolution layers and performs a down-sampling operation; the excitation function of each neuron of the fully connected layer adopts the ReLU function; the output value of the last fully connected layer is passed to the output layer and classified by softmax logistic regression;
the GRU network is composed of a reset gate and an update gate;
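The reset-gate/update-gate recurrence can be sketched for a single scalar GRU unit (standard GRU equations under one common convention; the weights are illustrative and bias terms are omitted for brevity):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gru_step(x, h, wz, uz, wr, ur, wh, uh):
    """One GRU update for scalar input x and scalar hidden state h."""
    z = sigmoid(wz * x + uz * h)               # update gate
    r = sigmoid(wr * x + ur * h)               # reset gate
    h_cand = math.tanh(wh * x + uh * (r * h))  # candidate state
    return (1.0 - z) * h + z * h_cand          # blend old state and candidate

# run a short input sequence through the cell (illustrative weights)
h = 0.0
for x in [1.0, -0.5, 0.25]:
    h = gru_step(x, h, wz=0.8, uz=0.1, wr=0.6, ur=0.2, wh=1.2, uh=0.7)
```

The reset gate controls how much past state enters the candidate, while the update gate controls how much of the candidate replaces the old state, which is what lets the GRU track long-term dependencies in the sensor sequence.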
the specific structure of the CNN-GRU fusion network comprises an input layer, a CNN network layer, a GRU network layer and an output layer; the CNN consists of three convolution layers, three pooling layers and one fully connected layer, and a three-layer GRU network is connected behind the fully connected layer of the CNN;
input layer: in data preprocessing, the time sequence acquired by the pressure sensing array is normalized and then used as the input of the CNN-GRU fusion network model, as shown in formula (6):

X_i = [x_1, x_2, …, x_T], x_t ∈ R^n (6)

wherein i denotes the i-th sample, x_t denotes the sensor reading vector at the t-th sampling instant, containing the readings of the n sensors at that instant, n is the number of pressure sensors, and T is the number of sampling instants contained in a sample;
CNN layer: the CNN layer performs feature extraction on the input data using the Valid convolution mode; after the convolution operation the output is batch-normalized, added to a bias and passed through the Relu function, giving the output of the convolution layer; the Relu function formula is:

Relu(x) = max(0, x)
the pooling layer adopts a maximum pooling and Valid pooling mode, and convolution pooling is shown in formula (7):
wherein y, C_i and P_i respectively denote the output matrix after the BN operation, the output of convolution layer i and the output of pooling layer i, 1 ≤ i ≤ 3, i being an integer; P_1* and P_2* are obtained by summing all the feature maps of P_1 and P_2 respectively and averaging; W_1, W_2 and W_3 are weight matrices; b_i is a bias term; * and Maxpool() denote the convolution operation and the maximum pooling function, respectively;
GRU layer: the high-dimensional abstract feature matrix obtained by convolution is processed by a reshape function and then used as the input of the three-layer GRU network, the reshape function being given in formula (8); the b feature vectors of dimension a are fed into the three-layer GRU to obtain the output h_3b;
B=reshape(A,size) (8)
which reshapes matrix A into a multidimensional array B containing the same elements as A, the dimensions of B being determined by the vector size;
output layer: the input of the output layer is the output h_3b of the previous layer, and the output probability value is calculated through the fully connected layer and softmax; the Softmax formula is expressed as:

Softmax(x_i) = e^(x_i) / Σ_(j=1)^n e^(x_j)

wherein x_i denotes the output value of the i-th node, n is the number of output nodes, and Softmax(x_i) is the probability that the sample belongs to the i-th class.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211622727.4A CN116164880A (en) | 2022-12-16 | 2022-12-16 | CNN-GRU neural network-based pressure sensor fault detection method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116164880A true CN116164880A (en) | 2023-05-26 |
Family
ID=86412301
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117195105A (en) * | 2023-11-08 | 2023-12-08 | 北京科技大学 | Gear box fault diagnosis method and device based on multilayer convolution gating circulation unit |
CN117195105B (en) * | 2023-11-08 | 2024-03-19 | 北京科技大学 | Gear box fault diagnosis method and device based on multilayer convolution gating circulation unit |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||