CN115795397A

CN115795397A - Gearbox temperature prediction method based on 1DCNN-LSTM and BiLSTM parallel network

Info

Publication number: CN115795397A
Application number: CN202211571249.9A
Authority: CN
Inventors: 袁仕能; 石开缔; 赖剑晶; 吴和兵; 姚国强; 张光明; 范亚学; 姜欢; 常凯; 杨新功; 田晓勇
Original assignee: Shaanxi Jinyuan New Energy Co ltd
Current assignee: Shaanxi Jinyuan New Energy Co ltd
Priority date: 2022-12-08
Filing date: 2022-12-08
Publication date: 2023-03-14

Abstract

The invention provides a gearbox temperature prediction method based on a 1DCNN-LSTM and BiLSTM parallel network, which comprises the following steps of: collecting temperature signals of key components of a gear box to be predicted; constructing a 1DCNN-LSTM model and a BiLSTM model; inputting a temperature signal, extracting feature vectors of space-time features and periodic features through a 1DCNN-LSTM model and a BiLSTM model respectively, and performing feature back-end fusion through a feature vector series splicing mode to obtain a fused feature vector; inputting the fused features into the full connection layer to perform regression analysis to obtain a regression analysis result, and predicting a short-term temperature data value of the gearbox according to the regression analysis result. The method solves the problems of insufficient prediction precision and the defects of a single network in the aspect of feature extraction in the prior method.

Description

Gearbox temperature prediction method based on 1DCNN-LSTM and BiLSTM parallel network

Technical Field

The invention relates to the technical field of temperature prediction, in particular to a gearbox temperature prediction method based on a 1DCNN-LSTM and BiLSTM parallel network.

Background

With the development of science and technology, gearboxes are used in many automated machines, and industrial machinery is moving toward high precision, ultra-precision and high efficiency. High-speed rotation inside a gearbox generates a large amount of heat, and the accumulated heat raises the temperature. The high temperature lowers the viscosity of the lubricating oil, which increases the friction of the meshing gears, which in turn increases gear wear and heat generation. The result is a vicious circle that ends in a fault shutdown of the equipment. If the equipment cannot be maintained in advance on the basis of its condition, it has to be shut down for overhaul, which disrupts production and can cause the enterprise huge economic losses and even endanger worker safety. There is therefore an urgent need for an improved temperature prediction method that yields accurate predictions, so that latent faults in mechanical equipment can be detected promptly and efficiently and prevented from worsening.

Common temperature prediction methods currently fall into two categories. The first predicts the temperature from the acquired temperature signals by means of signal processing and a mathematical model. The second is based on machine learning: it learns features automatically from the acquired temperature signal and then predicts the temperature.

The traditional mathematical-model approach depends on data preprocessing and on the choice of a suitable mathematical model. Temperature data are mostly nonlinear signals, which linear mathematical models predict and analyze poorly, so prediction accuracy is low. Machine learning, which has attracted much attention in recent years, extracts data features through autonomous learning, captures the nonlinear features of temperature signals well, and gives considerably more accurate predictions than the traditional approach.

A temperature prediction method based on machine learning comprises three parts: signal processing, feature extraction and learning, and temperature prediction. Each part strongly influences the final prediction result. In the signal processing part, temperature signals are acquired by a temperature sensor arranged in the gearbox, problem data are rejected, and after normalization the data set is divided into a training set and a test set. The training set is input into the model for training, the model extracts features, and the temperature is finally predicted. Although machine learning has become popular in recent years and has brought improvements in many fields, it still has some defects:

1) A temperature prediction method based on a single neural network has difficulty extracting the different types of features in a temperature signal effectively. For example, a convolutional neural network is good at extracting spatial features, but a common CNN is usually two-dimensional and is not suited to extracting spatial features from a one-dimensional time signal. If a time-series signal is forcibly converted into a two-dimensional signal, its time-sequence characteristics are destroyed and the prediction result is biased.

2) For time-series signal processing and prediction, the recurrent neural network (RNN) was proposed, and various improved models have since been built on it, such as the long short-term memory network (LSTM) and the bidirectional long short-term memory network (BiLSTM). These models process time-series signals better: the gate structure of the LSTM memorizes signal features well, and the BiLSTM, through its bidirectional LSTM structure, can extract periodic features in temperature signals. They have done much to advance research on time-series signal processing. However, real signals contain more than time-sequence information; they also carry spatial features, which these models extract poorly, while the convolutional neural network, which extracts spatial features well, is weak at time-series processing. Because of its structural limitations, a single network therefore has difficulty extracting the complete feature information of a complex signal.

Disclosure of Invention

In order to solve the problems, the invention provides a gearbox temperature prediction method based on a 1DCNN-LSTM and BiLSTM parallel network, and solves the problems of insufficient prediction precision and insufficient characteristic extraction of a single network in the existing method.

In order to achieve the above purpose, the present invention provides the following technical solutions.

A gearbox temperature prediction method based on a 1DCNN-LSTM and a BiLSTM parallel network comprises the following steps:

collecting temperature signals of key components of a gear box to be predicted;

constructing a 1DCNN-LSTM model and a BiLSTM model; inputting a temperature signal, extracting feature vectors of space-time features and periodic features through a 1DCNN-LSTM model and a BiLSTM model respectively, and performing feature back-end fusion through a feature vector series splicing mode to obtain a fused feature vector;

inputting the fused features into the full connection layer to perform regression analysis to obtain a regression analysis result, and predicting a short-term temperature data value of the gearbox according to the regression analysis result.

Preferably, the method further comprises the following steps:

and normalizing the acquired temperature signals of key parts of the gearbox to be predicted, and mapping the data to be between 0 and 1.
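The normalization step can be sketched as follows. This is a minimal illustration that assumes min-max scaling, a common way of mapping data onto the interval from 0 to 1; the function names `min_max_normalize` and `denormalize` and the sample readings are hypothetical and are not taken from the method itself.

```python
def min_max_normalize(values):
    """Map a sequence of readings linearly onto [0, 1]."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # Constant signal: map every reading to 0.0 (an assumed convention).
        return [0.0 for _ in values], lo, hi
    return [(v - lo) / span for v in values], lo, hi


def denormalize(scaled, lo, hi):
    """Invert the mapping so that predictions can be read as temperatures again."""
    return [s * (hi - lo) + lo for s in scaled]


temps_c = [52.0, 55.5, 61.0, 58.2, 66.0]  # hypothetical readings in degrees Celsius
scaled, t_min, t_max = min_max_normalize(temps_c)
restored = denormalize(scaled, t_min, t_max)
```

Keeping the minimum and maximum alongside the scaled data allows a predicted value to be converted back to a temperature afterwards.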

Preferably, the 1DCNN-LSTM model comprises a 1DCNN model and an LSTM model which are connected in sequence; the 1DCNN model comprises an input layer and a hidden layer which are sequentially connected; the hidden layer comprises two groups of convolution layers and pooling layers;

extracting spatio-temporal features through the 1DCNN-LSTM model, comprising the following steps:

receiving temperature signals of key components of the gearbox through an input layer;

alternately extracting nonlinear characteristics of the temperature signal through two groups of one-dimensional convolution layers and pooling layers;

the convolution layer executes a convolution operation on the signal data and extracts data features; the mathematical expression of the convolution operation is as follows:

$$z^{l+1}(i) = \sum_{x=1}^{f} z^{l}(s_0 i + x)\, w^{l+1}(x) + b$$

in the formula, $z^{l}$ and $z^{l+1}$ are the input and output of the (l+1)-th layer, respectively; $b$ is the offset value; $w^{l+1}$ is the weight value of the corresponding node in layer l+1; $f$ and $s_0$ are the convolution kernel size and the step length, respectively;

wherein the pooling layer carries out a pooling operation and simplifies the network by maximum pooling, with the following expression:

$$p^{l+1}_{i}(j) = \max_{(j-1)W+1 \le t \le jW} a^{l}_{i}(t)$$

in the formula, $a^{l}_{i}(t)$ is the value of the t-th neuron in the i-th feature of the l-th layer; $W$ is the pooling area; $p^{l+1}_{i}(j)$ is the output of the layer l+1 neurons;

taking the output signals of the convolution layer and the pooling layer after being processed as input signals of the LSTM model;

and after the features are extracted through the LSTM model, performing feature activation and outputting the features through the full connection layer.
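The convolution and maximum pooling steps above can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the function names `conv1d` and `max_pool1d`, the kernel weights and the sample signal are hypothetical, a single input channel and a single kernel are used, and no padding is applied, none of which is specified by the method.

```python
def conv1d(signal, kernel, bias=0.0, stride=1):
    """Valid (unpadded) one-dimensional convolution in the CNN sense (no kernel flip)."""
    f = len(kernel)
    out = []
    for start in range(0, len(signal) - f + 1, stride):
        window = signal[start:start + f]
        out.append(sum(w * x for w, x in zip(kernel, window)) + bias)
    return out


def max_pool1d(features, width):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(features[j:j + width])
            for j in range(0, len(features) - width + 1, width)]


signal = [0.1, 0.3, 0.2, 0.6, 0.5, 0.9, 0.8, 0.7]  # hypothetical normalized readings
kernel = [1.0, 0.0, -1.0]                            # hypothetical kernel weights
conv_out = conv1d(signal, kernel, bias=0.0, stride=1)
pooled = max_pool1d(conv_out, width=2)
```

Because both operations slide along the time axis only, the order of the samples is preserved, which is the property the one-dimensional network relies on.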

Preferably, the LSTM model includes an input gate, an output gate, and a forgetting gate;

the input gate is generally used for updating the state of the hidden cell, and determines important information to be reserved according to the information of the previous layer and the information of the current state by combining a sigmoid function and a tanh function; the forgetting gate determines the discarding or retaining of information, the information from the previous hidden layer and the information of the current layer are processed by a sigmoid function, the output value is processed to be between 0 and 1, the data closer to 0 is discarded, and the data closer to 1 is retained; the output gate determines the information value which should be output by the current layer through the processing of the sigmoid function and the tanh function according to the previous input and the current input so as to determine the value of the next hidden state;

the input gate $i_t$, the forgetting gate $f_t$ and the output gate $o_t$ are defined as follows:

$$i_t = \mathrm{Sigmoid}(w_i \cdot [h_{t-1}, x_t] + b_i)$$

$$f_t = \mathrm{Sigmoid}(w_f \cdot [h_{t-1}, x_t] + b_f)$$

$$o_t = \mathrm{Sigmoid}(w_o \cdot [h_{t-1}, x_t] + b_o)$$

in the formula, $h_{t-1}$ is the output information of the previous hidden layer, $x_t$ is the input information of the current layer, $h_t$ is the output information of the current layer, $w$ is the weight of the corresponding gate, and $b$ is the bias term.
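A single LSTM time step built from the three gates can be sketched as follows. The sketch assumes one weight matrix and one bias vector per gate (named `w_i`, `w_f`, `w_o` and `b_i`, `b_f`, `b_o` here) and adds the standard tanh candidate branch and cell-state update, which the text mentions only in words; the function names, the dictionary layout and all numeric values are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(weights, vec, bias):
    """Compute W . vec + b for a weight matrix given as a list of rows."""
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM time step; p holds the weights and biases of each gate."""
    concat = h_prev + x_t  # the concatenation [h_{t-1}, x_t]
    i_t = [sigmoid(z) for z in affine(p["w_i"], concat, p["b_i"])]
    f_t = [sigmoid(z) for z in affine(p["w_f"], concat, p["b_f"])]
    o_t = [sigmoid(z) for z in affine(p["w_o"], concat, p["b_o"])]
    # Candidate cell state: the usual tanh branch (assumed, not written out in the text).
    g_t = [math.tanh(z) for z in affine(p["w_c"], concat, p["b_c"])]
    # Forget part of the old cell state, then add the gated candidate.
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, g_t)]
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t, (i_t, f_t, o_t)


# Hypothetical parameters: hidden size 2, input size 1, so each row has 3 weights.
params = {
    "w_i": [[0.1, 0.0, 0.5], [0.0, 0.1, -0.5]], "b_i": [0.0, 0.0],
    "w_f": [[0.2, 0.0, 0.3], [0.0, 0.2, 0.3]], "b_f": [1.0, 1.0],
    "w_o": [[0.1, 0.1, 0.4], [0.1, 0.1, 0.4]], "b_o": [0.0, 0.0],
    "w_c": [[0.3, 0.0, 0.8], [0.0, 0.3, -0.8]], "b_c": [0.0, 0.0],
}
h_t, c_t, gates = lstm_step([0.6], [0.0, 0.0], [0.0, 0.0], params)
```

Every gate value lies strictly between 0 and 1, so a forgetting-gate value near 0 discards the corresponding part of the cell state and a value near 1 retains it.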

Preferably, the BiLSTM model comprises a forward LSTM network and a backward LSTM network, and the feature signals extracted by the forward LSTM network are combined with the feature signals extracted by the backward LSTM network; wherein the BiLSTM model extraction formula is as follows:

$$\overrightarrow{h_t} = \overrightarrow{\mathrm{LSTM}}(x_t, \overrightarrow{h_{t-1}})$$

$$\overleftarrow{h_t} = \overleftarrow{\mathrm{LSTM}}(x_t, \overleftarrow{h_{t-1}})$$

$$h_t = \omega_t \overrightarrow{h_t} + v_t \overleftarrow{h_t} + b_t$$

wherein $\overrightarrow{h_t}$ is the output vector of the forward LSTM hidden layer at time t, jointly determined by the input vector $x_t$ at the current time and the forward LSTM output vector $\overrightarrow{h_{t-1}}$ at the previous time; $\overleftarrow{h_t}$ is the output vector of the backward LSTM hidden layer at time t, jointly determined by the input vector $x_t$ at the current time and the backward LSTM output vector $\overleftarrow{h_{t-1}}$ at the previous time; $h_t$ is the output of the BiLSTM model, where $\omega_t$ is the weight matrix of the forward LSTM output, $v_t$ is the weight matrix of the backward LSTM output, and $b_t$ is the bias of the weight matrix.
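The combination of the forward and backward passes can be sketched as follows. To keep the example short it assumes a one-unit LSTM with scalar weights, and the combination weights (`omega`, `v`) and bias (`b`) are scalars rather than matrices; the names `scalar_lstm`, `bilstm` and `make_params` and all numeric values are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def scalar_lstm(sequence, p):
    """Run a one-unit LSTM over a sequence and return h_t for every step."""
    h, c, outputs = 0.0, 0.0, []
    for x in sequence:
        i = sigmoid(p["wi_h"] * h + p["wi_x"] * x + p["bi"])
        f = sigmoid(p["wf_h"] * h + p["wf_x"] * x + p["bf"])
        o = sigmoid(p["wo_h"] * h + p["wo_x"] * x + p["bo"])
        g = math.tanh(p["wc_h"] * h + p["wc_x"] * x + p["bc"])
        c = f * c + i * g
        h = o * math.tanh(c)
        outputs.append(h)
    return outputs


def bilstm(sequence, p_fwd, p_bwd, omega, v, b):
    """Combine both directions: h_t = omega * h_fwd_t + v * h_bwd_t + b."""
    h_fwd = scalar_lstm(sequence, p_fwd)
    # The backward pass reads the sequence in reverse; its outputs are flipped
    # back so that index t refers to the same time step in both lists.
    h_bwd = scalar_lstm(sequence[::-1], p_bwd)[::-1]
    return [omega * hf + v * hb + b for hf, hb in zip(h_fwd, h_bwd)]


def make_params(scale):
    """Hypothetical helper: set every weight to `scale` and every bias to 0."""
    names = ["wi_h", "wi_x", "wf_h", "wf_x", "wo_h", "wo_x", "wc_h", "wc_x"]
    p = {name: scale for name in names}
    p.update({"bi": 0.0, "bf": 0.0, "bo": 0.0, "bc": 0.0})
    return p


window = [0.2, 0.5, 0.9, 0.5, 0.2]  # hypothetical normalized readings
h = bilstm(window, make_params(0.5), make_params(0.5), omega=0.5, v=0.5, b=0.0)
```

Each output combines a summary of the samples before time t with a summary of the samples after it, which is how the model relates earlier and later parts of the signal.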

Preferably, the full connection layer is a regression analysis layer, and the full connection layer uses the ReLU activation function.
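A regression layer of this kind can be sketched as follows, assuming a single output neuron; the function names `relu` and `dense_relu`, the weights and the feature values are hypothetical.

```python
def relu(z):
    """ReLU activation: negative inputs become 0, positive inputs pass through."""
    return max(0.0, z)


def dense_relu(features, weights, bias):
    """Single-output fully connected layer followed by ReLU."""
    if len(features) != len(weights):
        raise ValueError("one weight is required per input feature")
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return relu(z)


fused = [0.12, 0.80, 0.33, 0.45, 0.07]   # hypothetical fused feature vector
weights = [0.4, 0.3, -0.2, 0.5, 0.1]     # hypothetical trained weights
prediction = dense_relu(fused, weights, bias=0.05)
```

Since ReLU never returns a negative value, the output is compatible with temperature data that have been normalized to the range from 0 to 1.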

The invention has the beneficial effects that:

Compared with the traditional mathematical-model prediction method, the deep learning temperature prediction method adopted by the invention handles nonlinear temperature signals better. The model automatically extracts and learns the feature information in the data and gives an accurate temperature prediction value, avoiding the traditional method's excessive dependence on professional knowledge and prior experience for feature extraction and mathematical model selection. Compared with the traditional method, the temperature early warning method provided by the invention greatly improves model prediction precision and extracts more temperature signal features.

The 1DCNN model adopted by the invention is different from a common CNN model, adopts a one-dimensional convolution kernel and a one-dimensional pooling layer, can better process a one-dimensional time sequence signal without losing the time sequence, and can better extract important characteristic information hidden in the one-dimensional time sequence signal by utilizing the excellent characteristic extraction capability of a convolution neural network.

The 1DCNN-LSTM combined model designed by the invention keeps the advantages of both the one-dimensional convolutional neural network and the long short-term memory network and avoids the weaknesses of a single network: used together, they provide the spatial signal processing capability of the one-dimensional convolutional neural network and the time-sequence signal extraction capability of the long short-term memory network. The BiLSTM model extracts the periodic signals in the data set, and the two groups of feature signals are fused. Using the two together extracts as much of the feature information in the temperature signal as possible, thereby improving the accuracy of the model's temperature prediction.

Drawings

FIG. 1 is a flow chart of a gearbox temperature prediction method based on a 1DCNN-LSTM and BiLSTM parallel network according to an embodiment of the present invention;

FIG. 2 is a model structure diagram of a conventional convolutional neural network of a gearbox temperature prediction method based on a 1DCNN-LSTM and BiLSTM parallel network according to an embodiment of the present invention;

FIG. 3 is a model structure diagram of a one-dimensional convolution neural network of the gearbox temperature prediction method based on the 1DCNN-LSTM and BiLSTM parallel networks according to the embodiment of the invention;

FIG. 4 is a model structure diagram of a long-term and short-term memory network of the gearbox temperature prediction method based on the 1DCNN-LSTM and BiLSTM parallel networks according to the embodiment of the invention;

FIG. 5 is a model structure diagram of a bidirectional long-and-short term memory network of the gearbox temperature prediction method based on the 1DCNN-LSTM and BiLSTM parallel networks according to the embodiment of the invention;

FIG. 6 is a graph of predicted temperature versus training data temperature for a gearbox temperature prediction method based on 1DCNN-LSTM and BiLSTM parallel networks in accordance with an embodiment of the present invention;

FIG. 7 is a comparison graph of predicted temperature and actual temperature for a gearbox temperature prediction method based on parallel networks of 1DCNN-LSTM and BiLSTM in accordance with an embodiment of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and do not limit the invention.

Example 1

The invention discloses a gearbox temperature prediction method based on a 1DCNN-LSTM and BiLSTM parallel network, which is shown in figures 1-7:

Fig. 3 is a schematic diagram of the improved 1DCNN model structure. The improved 1DCNN model also includes an input layer, a hidden layer and an output layer, and the function of each layer is similar to that in a conventional convolutional neural network. The 1DCNN network, however, is better suited to one-dimensional data and retains the time order of the data during processing; in combination with the subsequent BiLSTM model, this helps the BiLSTM model extract periodic signals in the data and so improves the prediction accuracy of the model.

Fig. 4 is a schematic structural diagram of the long short-term memory network (LSTM), a common time-series signal processing model. With its gate structure, it effectively avoids the vanishing or exploding gradients that often occur when a recurrent neural network processes a time-series signal. The LSTM generally includes three gate structures, namely an input gate, an output gate and a forgetting gate. Using the current state information and the previous state information, in combination with a sigmoid function and a tanh function, the three gates decide which information is input, output and deleted. The LSTM can thus effectively extract short-term time-sequence features and works well on a time-series signal such as the temperature signal.

Fig. 5 shows the bidirectional long short-term memory network (BiLSTM), which builds on the long short-term memory network by combining a forward and a reverse long short-term memory network. During feature extraction the model can therefore take into account the relation between earlier and later parts of the signal and can effectively extract periodic features in the data, which suits temperature signals that may contain periodic features.

The method comprises the following specific steps:

S1: Temperature signals of key components of the gearbox are collected and divided into a test set and a training set at a ratio of 2:8. The acquired temperature signals are normalized so that the data are mapped to between 0 and 1. The normalization formula used is as follows:

$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$

where $x$ is an acquired temperature value, $x_{\min}$ and $x_{\max}$ are the minimum and maximum values of the data, and $x'$ is the normalized value.

S2: Constructing a 1DCNN-LSTM model and a BiLSTM model.

S3: inputting a temperature signal, extracting feature vectors of space-time features and periodic features through a 1DCNN-LSTM model and a BiLSTM model respectively, and performing feature back-end fusion through a feature vector series splicing mode to obtain a fused feature vector.

The training set is input into the 1DCNN network, whose convolution layers and pooling layers alternately extract spatial features, and the data processed by the 1DCNN network are passed to the LSTM model. To extract the periodic signals in the data set, the data are also input into the BiLSTM model. The extracted feature signals are then fused.

In particular, in the 1DCNN-LSTM network the convolution layers and pooling layers of the 1DCNN model are used alternately to obtain preliminary space-time features of the input signal. Both layer types are suited to one-dimensional data. Unlike a common convolutional neural network, this network model contains no full connection layer or output layer of its own: the feature data processed by the convolution and pooling layers are input directly into the LSTM model for time-sequence feature extraction. The 1DCNN model comprises two pairs of convolution layers and pooling layers. The gate structure of the LSTM model handles time-series signals well, using the sigmoid and tanh functions in the gates to determine the important information to be retained. The convolution layers, pooling layers and LSTM layer are combined directly to form a feature extraction layer, and a full connection layer is added after the feature extraction layer for feature activation and output.
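How the sequence length shrinks through two pairs of convolution and pooling layers before reaching the LSTM can be traced with a short sketch. The window length of 64 samples, the kernel size of 3 and the pooling width of 2 are assumed values chosen only for illustration, and the function names are hypothetical; unpadded convolution with a stride of 1 is assumed.

```python
def conv_out_len(n, kernel, stride=1):
    """Output length of an unpadded one-dimensional convolution."""
    return (n - kernel) // stride + 1


def pool_out_len(n, width):
    """Output length of non-overlapping pooling (a partial window is dropped)."""
    return n // width


def trace_lengths(n, stages):
    """Return the sequence length after every stage, starting with the input."""
    lengths = [n]
    for kind, size in stages:
        n = conv_out_len(n, size) if kind == "conv" else pool_out_len(n, size)
        lengths.append(n)
    return lengths


# Two convolution and pooling pairs, as in the 1DCNN model described above.
stages = [("conv", 3), ("pool", 2), ("conv", 3), ("pool", 2)]
lengths = trace_lengths(64, stages)
```

The last entry is the number of time steps the LSTM model would receive under these assumptions.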

It should be noted that the BiLSTM network used in the invention mainly extracts periodic feature signals in the data. Internally it combines a forward LSTM network and a backward LSTM network, so that feature signals are extracted in the backward direction at the same time as the time-sequence signals are extracted in the forward direction, and the combination of the two directions captures the periodic signals in the data. The forward and backward feature signals are combined, and the BiLSTM model is connected in parallel with the 1DCNN-LSTM model: the two branches extract features from the data independently, and the extracted features are then fused.

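The BiLSTM model extraction formula is as follows:

$$\overrightarrow{h_t} = \overrightarrow{\mathrm{LSTM}}(x_t, \overrightarrow{h_{t-1}})$$

$$\overleftarrow{h_t} = \overleftarrow{\mathrm{LSTM}}(x_t, \overleftarrow{h_{t-1}})$$

$$h_t = \omega_t \overrightarrow{h_t} + v_t \overleftarrow{h_t} + b_t$$

wherein $\overrightarrow{h_t}$ is the output vector of the forward LSTM hidden layer at time t, jointly determined by the input vector $x_t$ at the current time and the forward LSTM output vector $\overrightarrow{h_{t-1}}$ at the previous time; $\overleftarrow{h_t}$ is the output vector of the backward LSTM hidden layer at time t, jointly determined by the input vector $x_t$ at the current time and the backward LSTM output vector $\overleftarrow{h_{t-1}}$ at the previous time; $h_t$ is the output of the BiLSTM model, where $\omega_t$ is the weight matrix of the forward LSTM output, $v_t$ is the weight matrix of the backward LSTM output, and $b_t$ is the bias of the weight matrix.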

The feature fusion process is a series splicing (concatenation) process, and the fusion mode is back-end fusion: the feature signals output by the trained predictors, that is, those extracted separately by the 1DCNN-LSTM network and the BiLSTM network of the parallel network, are fused. In this way both the space-time feature signals and the periodic feature signals in the data are obtained.
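Back-end fusion by series splicing amounts to concatenating the two branch outputs, as the following sketch illustrates. The function names `fuse_features` and `fuse_batch` and the feature values are hypothetical, and the feature dimensions of the two branches are assumed only for the example.

```python
def fuse_features(spatiotemporal, periodic):
    """Back-end fusion: splice the two branch outputs end to end."""
    return list(spatiotemporal) + list(periodic)


def fuse_batch(branch_a, branch_b):
    """Fuse two equally long lists of per-sample feature vectors."""
    if len(branch_a) != len(branch_b):
        raise ValueError("both branches must yield one vector per sample")
    return [fuse_features(a, b) for a, b in zip(branch_a, branch_b)]


cnn_lstm_features = [0.12, 0.80, 0.33]  # hypothetical 1DCNN-LSTM branch output
bilstm_features = [0.45, 0.07]          # hypothetical BiLSTM branch output
fused = fuse_features(cnn_lstm_features, bilstm_features)
```

The fused vector has as many entries as the two branch outputs combined, so the following layer needs one weight per entry of both branches.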

Finally, the fused features are input into a full connection layer. The full connection layer is a regression analysis layer and uses the ReLU activation function, whose expression is as follows:

$$f(x) = \max(0, x)$$

S4: Inputting the fused features into the full connection layer to perform regression analysis to obtain a regression analysis result, and predicting a short-term temperature data value of the gearbox according to the regression analysis result.

In this example, the following is a gearbox temperature prediction case analysis:

The gearbox is an important component of modern automated machinery and plays an important role in many mechanical devices. In a wind turbine generator, for example, the gearbox steps up the low rotational speed of the fan blades to a high rotational speed and transmits it to the wind driven generator, improving the generating efficiency. However, a large amount of heat is generated while the gearbox operates. If heat dissipation is poor or abnormal contact occurs, the internal temperature of the gearbox becomes abnormal. In mild cases the viscosity of the lubricating oil is affected, which aggravates the internal wear of the gearbox and further aggravates the temperature abnormality in a vicious circle; in severe cases the gearbox overheats and catches fire, causing more serious economic and even energy losses.

The data used in this embodiment are a wind driven generator gearbox temperature data set collected at an actual wind power plant. Temperature sensors arranged in the gearbox collect the temperature signals of its key components, and the signals are preprocessed to eliminate problem data. The data set consists of temperature signals collected every two minutes. It covers the temperature of the key components of the fan gearbox over a period of time and serves as the training data of the model.
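To train a model on such a series, the readings are commonly cut into input windows with a target value that follows each window. The embodiment does not state how this is done, so the following is only a sketch under assumptions: the function name `make_windows`, the window length of 30 samples and the prediction horizon of one sample are hypothetical.

```python
def make_windows(series, window, horizon=1):
    """Cut a series into (input window, target) pairs for supervised training."""
    if window < 1 or horizon < 1:
        raise ValueError("window and horizon must be positive")
    inputs, targets = [], []
    for start in range(len(series) - window - horizon + 1):
        inputs.append(series[start:start + window])
        # The target is the reading `horizon` samples after the window ends.
        targets.append(series[start + window + horizon - 1])
    return inputs, targets


# With one sample every two minutes, 30 samples cover one hour of history.
series = [float(k) for k in range(100)]  # placeholder values, not real data
X, y = make_windows(series, window=30, horizon=1)
```

With a horizon of one sample, each target is the reading taken two minutes after the end of its window.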

The experiment execution strategy is as follows:

This embodiment first collects the temperature data set of the key components of the wind driven generator gearbox and preprocesses it. Data normalization is used as the preprocessing step and converts the temperature data into values between 0 and 1. Normalization prevents large values in the data set from swamping small values and excessively reducing their influence, which would otherwise cause the model optimization path to deviate.

After normalization, the data set is divided at a ratio of 8:2 into a training set and a test set. The training set is used as the model's input training data for feature extraction and parameter training; the test set is used to verify the model's predicted values, and the model parameters are fine-tuned on the basis of this verification.
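The 8:2 division can be sketched as follows. The sketch assumes a chronological split, in which the earliest readings form the training set, a common choice for time series that the embodiment does not explicitly state; the function name `split_train_test` is hypothetical.

```python
def split_train_test(series, train_parts=8, total_parts=10):
    """Split a series chronologically; the first 8/10 become the training set."""
    cut = len(series) * train_parts // total_parts  # integer arithmetic, no rounding drift
    return series[:cut], series[cut:]


readings = list(range(10))  # placeholder values standing in for normalized data
train, test = split_train_test(readings)
```

Splitting by position rather than at random keeps the test readings later in time than the training readings, so the model is evaluated on data it could not have seen.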

First, the training data set is passed to the 1DCNN model for the first step of spatial feature extraction. The 1DCNN model comprises an input layer and a hidden layer; the hidden layer comprises two pairs of convolution layers and pooling layers, and the 1DCNN network contains no full connection layer. The output data of the 1DCNN model are used as the input data of the subsequent LSTM layer, and the 1DCNN model and the LSTM layer together extract the space-time features in the data.

At the same time, the data are passed to the BiLSTM model for periodic signal extraction.

The 1DCNN-LSTM network and the BiLSTM network extract feature vectors of the space-time features and the periodic features of the data, respectively. The two are fused by feature vector splicing to give a new feature vector, which is input into the full connection layer for regression analysis. Through these steps the space-time features and the periodic features of the data are extracted, the model parameters are optimized step by step during training, and model training is completed.

Finally, the short-term temperature data value of the wind turbine gearbox is predicted from the extracted features in combination with the regression analysis result of the model.

Analysis of experimental results:

The data used are temperatures collected by sensors on the wind turbine generator. During sampling the fan is disturbed by the wind speed, so the temperature data fluctuate to some extent as they rise. Fig. 6 shows the temperature prediction results obtained from the temperature data of three key components in a fan gearbox, with the training data and the prediction data plotted in the same graph. Fig. 7 shows a curve comparison of the corresponding predicted data values against the test data set.
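A comparison between predicted and actual temperatures can also be summarized numerically. The embodiment reports its results as curves and names no error metric, so the two metrics below (mean absolute error and root mean square error) are assumptions chosen because they are widely used for regression; the function names and sample values are hypothetical.

```python
import math


def _check(actual, predicted):
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")


def mae(actual, predicted):
    """Mean absolute error between two equally long sequences."""
    _check(actual, predicted)
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error; penalizes large deviations more than MAE does."""
    _check(actual, predicted)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


actual_c = [61.0, 62.5, 63.1, 64.0]      # hypothetical measured temperatures
predicted_c = [60.6, 62.9, 63.0, 64.8]   # hypothetical model outputs
errors = {"mae": mae(actual_c, predicted_c), "rmse": rmse(actual_c, predicted_c)}
```

Both metrics are expressed in the same unit as the temperature when they are computed on denormalized values.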

The present invention is not limited to the above preferred embodiments, and any modifications, equivalent substitutions and improvements made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims

1. A gearbox temperature prediction method based on a 1DCNN-LSTM and a BiLSTM parallel network is characterized by comprising the following steps:

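collecting temperature signals of key components of a gear box to be predicted;

constructing a 1DCNN-LSTM model and a BiLSTM model; inputting a temperature signal, extracting feature vectors of space-time features and periodic features through the 1DCNN-LSTM model and the BiLSTM model respectively, and performing feature back-end fusion through a feature vector series splicing mode to obtain a fused feature vector;

inputting the fused features into the full-connection layer for regression analysis to obtain a regression analysis result, and predicting the short-term temperature data value of the gearbox according to the regression analysis result.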

2. The gearbox temperature prediction method based on the parallel network of 1DCNN-LSTM and BiLSTM according to claim 1, further comprising:

and normalizing the acquired temperature signals of key parts of the gearbox to be predicted, and mapping data to be between 0 and 1.

3. The gearbox temperature prediction method based on the parallel network of 1DCNN-LSTM and BiLSTM according to claim 1, wherein the 1DCNN-LSTM model comprises a 1DCNN model and an LSTM model which are connected in sequence; the 1DCNN model comprises an input layer and a hidden layer which are connected in sequence; the hidden layer comprises two groups of convolution layers and pooling layers;

extracting space-time characteristics through the 1DCNN-LSTM model, and the method comprises the following steps:

receiving a temperature signal of a key component of the gearbox through an input layer;

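alternately extracting nonlinear characteristics of the temperature signal through two groups of one-dimensional convolution layers and pooling layers;

the convolution layer executes a convolution operation on the signal data and extracts data features; the mathematical expression of the convolution operation is as follows:

$$z^{l+1}(i) = \sum_{x=1}^{f} z^{l}(s_0 i + x)\, w^{l+1}(x) + b$$

in the formula, $z^{l}$ and $z^{l+1}$ are the input and output of the (l+1)-th layer, respectively; $b$ is the offset value; $w^{l+1}$ is the weight value of the corresponding node in layer l+1; $f$ and $s_0$ are the convolution kernel size and the step length, respectively;

wherein the pooling layer carries out a pooling operation and simplifies the network by maximum pooling, with the following expression:

$$p^{l+1}_{i}(j) = \max_{(j-1)W+1 \le t \le jW} a^{l}_{i}(t)$$

in the formula, $a^{l}_{i}(t)$ is the value of the t-th neuron in the i-th feature of the l-th layer; $W$ is the pooling area; $p^{l+1}_{i}(j)$ is the output of the layer l+1 neurons;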

taking the output signals after the processing of the convolutional layer and the pooling layer as input signals of an LSTM model;

4. The gearbox temperature prediction method based on the parallel network of 1DCNN-LSTM and BiLSTM according to claim 3, wherein the LSTM model comprises an input gate, an output gate and a forgetting gate;

$$i_t = \mathrm{Sigmoid}(w_i \cdot [h_{t-1}, x_t] + b_i)$$

$$f_t = \mathrm{Sigmoid}(w_f \cdot [h_{t-1}, x_t] + b_f)$$

$$o_t = \mathrm{Sigmoid}(w_o \cdot [h_{t-1}, x_t] + b_o)$$

5. The gearbox temperature prediction method based on the parallel 1DCNN-LSTM and BiLSTM networks as claimed in claim 1, wherein said BiLSTM model comprises a forward LSTM network and a reverse LSTM network, and the feature signals extracted by the forward LSTM network are combined with the feature signals extracted by the reverse LSTM network; wherein the BiLSTM model extraction formula is as follows:

$$\overrightarrow{h_t} = \overrightarrow{\mathrm{LSTM}}(x_t, \overrightarrow{h_{t-1}})$$

$$\overleftarrow{h_t} = \overleftarrow{\mathrm{LSTM}}(x_t, \overleftarrow{h_{t-1}})$$

$$h_t = \omega_t \overrightarrow{h_t} + v_t \overleftarrow{h_t} + b_t$$

wherein $\overrightarrow{h_t}$ is the output vector of the forward LSTM hidden layer at time t, jointly determined by the input vector $x_t$ at the current time and the forward LSTM output vector $\overrightarrow{h_{t-1}}$ at the previous time; $\overleftarrow{h_t}$ is the output vector of the reverse LSTM hidden layer at time t, jointly determined by the input vector $x_t$ at the current time and the reverse LSTM output vector $\overleftarrow{h_{t-1}}$ at the previous time; $h_t$ is the output of the BiLSTM model, where $\omega_t$ is the weight matrix of the forward LSTM output, $v_t$ is the weight matrix of the reverse LSTM output, and $b_t$ is the bias of the weight matrix.

6. The gearbox temperature prediction method based on the parallel network of 1DCNN-LSTM and BiLSTM according to claim 1, wherein said fully connected layer is a regression analysis layer, and the activation function of said fully connected layer is activated by ReLu activation function.

7. A computer-readable storage medium storing computer-executable instructions for causing a computer to perform the method for gearbox temperature prediction based on parallel networks of 1DCNN-LSTM and BiLSTM as claimed in any one of claims 1 to 6.