CN111723527A

CN111723527A - Gear residual life prediction method based on cocktail long-term and short-term memory neural network

Info

Publication number: CN111723527A
Application number: CN202010599670.5A
Authority: CN
Inventors: 秦毅; 项盛; 陈定粮
Original assignee: Chongqing University
Current assignee: Chongqing University
Priority date: 2020-06-28
Filing date: 2020-06-28
Publication date: 2020-09-29
Anticipated expiration: 2040-06-28
Also published as: CN111723527B

Abstract

The invention relates to a method for predicting the remaining useful life (RUL) of a gear based on a cocktail long short-term memory neural network, and belongs to the technical field of automation. The method comprises the following steps: S1: constructing a gear health index based on a variational autoencoder (VAE); S2: defining a cocktail long short-term memory network (C-LSTM); S3: predicting the remaining useful life of the gear from the VAE-based health index using the C-LSTM. The method first uses the VAE to form a health index that accurately reflects the degradation trend of the gear's health state, then uses the proposed C-LSTM to predict the unknown health indices step by step, and obtains the predicted RUL when the set threshold is reached.

Description

Gear residual life prediction method based on cocktail long-term and short-term memory neural network

Technical Field

The invention belongs to the technical field of automation, and relates to a method for predicting the residual life of a gear based on a cocktail long-term and short-term memory neural network.

Background

Gears are key components widely used in industry, for example in wind turbines, automobiles and aircraft engines. Gear faults such as pitting, spalling and other fatigue damage often trigger a chain of failures across the whole machine, cause shutdowns and, in severe cases, casualties, resulting in huge economic losses and safety hazards. The remaining useful life (RUL) of a gear is defined as the length of time from the current moment to the end of its useful life; predicting it is a viable strategy for scheduling equipment maintenance and avoiding unexpected gear failure. Predicting the life of a gear in service makes it possible to determine maintenance times effectively, improve production efficiency, ensure continuous and efficient production, reduce the accident rate and prevent sudden accidents, and is therefore of great significance for engineering production.

Because of this significance for industrial production, the topic has attracted growing attention from scholars and researchers. The invention therefore provides an online intelligent method for predicting the remaining service life of gears from small samples.

Disclosure of Invention

In view of the above, the present invention provides a method for predicting the remaining life of a gear based on a cocktail long-short term memory neural network.

In order to achieve the purpose, the invention provides the following technical scheme:

A method for predicting the remaining useful life of a gear based on a cocktail long short-term memory neural network comprises the following steps:

S1: constructing a gear health index based on a variational autoencoder (VAE);

S2: defining a cocktail long short-term memory network (C-LSTM);

S3: predicting the remaining useful life of the gear from the VAE-based health index using the C-LSTM.

Optionally, the S1 specifically includes:

S11: principle of the variational autoencoder

The variational autoencoder model is a variational Bayesian inference framework composed of an encoding layer that models the posterior distribution and a decoding layer that models the prior distribution. Consider a randomly generated data set that involves an unobservable continuous random variable vector $Z$. By learning from the data, the encoder obtains a specific posterior distribution $q(z^i \mid x^i)$ for each data point $x^i$. The objective function of the variational autoencoder is written in terms of the following quantities: $D_{KL}(q \,\|\, p)$ denotes the Kullback-Leibler divergence, a measure of the difference between one probability distribution and another; $\lambda$ is the weight of the reconstruction error; and the remaining term is the reconstruction error of the autoencoder.

The posterior $q(z^i \mid x^i)$ obeys a normal distribution, $q(z^i \mid x^i) = \mathcal{N}(z^i; \mu^i, (\sigma^i)^2)$, and the prior $p(z^i)$ is set to the centered isotropic multivariate Gaussian $\mathcal{N}(z^i; 0, I)$. The reconstruction error is the mean square error. With these choices the objective function is rewritten in closed form, where $J$ is the dimension of $x^i$ and the index $j$ runs over the elements of $x^i$;
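The objective-function equation referenced in S11 is not reproduced in this text. As a hedged sketch only, assuming the standard VAE loss with a $\lambda$-weighted reconstruction term and a minimization convention (the symbols $\mathcal{L}$, $\mathcal{L}_{rec}$ and $\hat{x}^i$ are assumptions, not taken from the source), a form consistent with the quantities defined above would be:

```latex
\mathcal{L}(x^i) = D_{KL}\big(q(z^i \mid x^i)\,\|\,p(z^i)\big) + \lambda\,\mathcal{L}_{rec}(x^i, \hat{x}^i)
```

Here $\hat{x}^i$ stands for the decoder output for the input $x^i$, and $\mathcal{L}_{rec}$ for the reconstruction error.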

S12: construction of the health index

1) Time-domain and frequency-domain features are extracted from the vibration signal of the gear and then input into the VAE for further information fusion and dimension reduction. Before this, each extracted feature is normalized by removing its mean and scaling it to unit variance:

$$\hat{x}^{i,j} = \frac{x^{i,j} - \mu^j}{\sigma^j}$$

where $x^{i,j}$ is the $i$-th data point of the $j$-th feature, $\hat{x}^{i,j}$ is the normalized value of $x^{i,j}$, and $\mu^j$ and $\sigma^j$ are the mean and standard deviation of the $j$-th feature, respectively;

2) Second, to ensure that the effect of a single node neither diverges nor converges, the initial weights of the VAE are made to follow a uniform distribution;

3) Finally, the network structure of the VAE used to construct the health index is defined as 21-10-1-10-21: the input layer of the encoder has 21 nodes, matching the dimension of the extracted features; the hidden layer has 10 nodes; and the encoder output layer has 1 node, matching the dimension of the output health index. The decoder mirrors this structure in reverse. The complete VAE is trained by back-propagation (BP), which optimizes the network weights by reducing the reconstruction error between the encoder input and the decoder output.
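The per-feature normalization in step 1) can be sketched as follows. This is a minimal illustration, assuming the features are stored as a NumPy array with one row per sample and one column per feature; the function name `zscore_normalize`, the `eps` guard against constant features, and the random demo data are all hypothetical and not part of the source.

```python
import numpy as np

def zscore_normalize(features, eps=1e-12):
    """Remove the mean of each feature column and scale it to unit variance.

    features: array of shape (n_samples, n_features); n_features would be 21 here.
    eps: hypothetical guard against division by zero for constant features.
    """
    features = np.asarray(features, dtype=float)
    mu = features.mean(axis=0)        # mean of the j-th feature
    sigma = features.std(axis=0)      # standard deviation of the j-th feature
    normalized = (features - mu) / np.maximum(sigma, eps)
    return normalized, mu, sigma

# Hypothetical demo data: 200 samples of 21 features.
rng = np.random.default_rng(0)
raw = rng.normal(loc=5.0, scale=3.0, size=(200, 21))
normalized, mu, sigma = zscore_normalize(raw)
```

Returning `mu` and `sigma` alongside the normalized data lets the same statistics be reused when new samples are encoded later.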

Optionally, the S3 specifically includes:

S31: acquiring a gear vibration signal of duration $T$ at intervals of $\Delta T$ until the gear fails, the number of sampled vibration signal segments being $n$;

S32: calculating 21 time-frequency features of each denoised vibration signal segment, inputting them into the trained VAE, and normalizing the output to obtain an $n \times 1$ health index matrix $X$;

S33: selecting the health indices of the first $m$ sampling points, $X_1 = [y_1, y_2, \ldots, y_m]^T$, as the training matrix;

S34: reconstructing $X_1$ into the matrix $U$;

S35: training the network with the first $k$ rows of the matrix $U$ as the network input and the last row as the network output;

S36: taking the last $k$ outputs as the network input to obtain the output at the next moment;

S37: repeating steps S34 to S36 a set number of times and comparing the denormalized output with the actual health index values to verify the effectiveness of the method; when the denormalized output exceeds the set threshold, the number of predicted sampling points multiplied by the sum $\Delta T + T$ of the sampling interval and the sampling duration gives the remaining useful life of the gear.
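The conversion in S37 from a number of predicted sampling points to a remaining useful life can be sketched as below. This is an illustration under stated assumptions: the function names, the predicted health index values, the threshold of 0.8 and the timing values are all hypothetical, and "exceeds" is read as a strict greater-than comparison.

```python
def points_until_threshold(predicted_hi, threshold):
    """Return the 1-based count of predicted points up to and including the
    first one that exceeds the threshold, or None if it is never exceeded."""
    for count, value in enumerate(predicted_hi, start=1):
        if value > threshold:
            return count
    return None

def remaining_useful_life(n_points, interval, duration):
    """RUL = number of predicted points x (sampling interval + sampling duration)."""
    return n_points * (interval + duration)

# Hypothetical denormalized predictions and timing values (seconds).
predicted = [0.52, 0.61, 0.70, 0.83, 0.91]
n_points = points_until_threshold(predicted, threshold=0.8)
rul_seconds = remaining_useful_life(n_points, interval=50.0, duration=10.0)
```

With these made-up numbers the fourth predicted point is the first to exceed the threshold, so the RUL is four sampling periods.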

Optionally, in the C-LSTM, the positions of the hierarchical split points are defined, according to the level of the information, as $L_l$, $L_{lm}$, $L_{mh}$ and $L_h$, where $L_l$ is the short-term information split point, $L_{lm}$ is the short-to-medium-term information split point, $L_{mh}$ is the medium-to-long-term information split point and $L_h$ is the long-term information split point. From the current input $x_t$ and the recurrent information $h_{t-1}$, the split point of each level is calculated as follows:

$$L_h = F_1(x_t, h_{t-1}) = \operatorname{indexmax}\big(\operatorname{softmax}(W_{f1} x_t + U_{f1} h_{t-1} + b_{f1})\big) \tag{11}$$

$$L_{mh} = F_2(x_t, h_{t-1}) = \operatorname{indexmax}\big(\operatorname{softmax}(W_{f2} x_t + U_{f2} h_{t-1} + b_{f2})\big) \tag{12}$$

$$L_{lm} = F_3(x_t, h_{t-1}) = \operatorname{indexmax}\big(\operatorname{softmax}(W_{i2} x_t + U_{i2} h_{t-1} + b_{i2})\big) \tag{13}$$

$$L_l = F_4(x_t, h_{t-1}) = \operatorname{indexmax}\big(\operatorname{softmax}(W_{i1} x_t + U_{i1} h_{t-1} + b_{i1})\big) \tag{14}$$

where $W$ and $U$ are the weights of the input information and the historical information, respectively, $b$ is the bias, and softmax is the softmax function;

to make the parameters learnable, the evaluation of each split point is softened:

$$d_{L1} = \operatorname{softmax}(W_{f1} x_t + U_{f1} h_{t-1} + b_{f1}) \tag{15}$$

$$d_{L2} = \operatorname{softmax}(W_{f2} x_t + U_{f2} h_{t-1} + b_{f2}) \tag{16}$$

$$d_{L3} = \operatorname{softmax}(W_{i2} x_t + U_{i2} h_{t-1} + b_{i2}) \tag{17}$$

$$d_{L4} = \operatorname{softmax}(W_{i1} x_t + U_{i1} h_{t-1} + b_{i1}) \tag{18}$$
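Equations (11) to (18) can be illustrated with a small NumPy sketch. It assumes that indexmax means the position of the largest softmax entry (here 0-based, as in `numpy.argmax`); the hidden size `k = 8`, the input size, the random weights and the helper names are hypothetical and only serve to make the example runnable.

```python
import numpy as np

def softmax(v):
    """Numerically stable softmax of a 1-D vector."""
    e = np.exp(v - np.max(v))
    return e / e.sum()

def split_point(W, U, b, x_t, h_prev):
    """Return the hard split index (eqs. 11-14) and its softened
    distribution (eqs. 15-18) for one set of weights."""
    d = softmax(W @ x_t + U @ h_prev + b)
    return int(np.argmax(d)), d

rng = np.random.default_rng(0)
k, n_in = 8, 1                      # hypothetical hidden size and input size
x_t = rng.normal(size=n_in)         # current input
h_prev = rng.normal(size=k)         # recurrent information h_{t-1}

# One (W, U, b) triple per split point, keyed by the subscripts used in the text.
weights = {
    name: (rng.normal(size=(k, n_in)), rng.normal(size=(k, k)), rng.normal(size=k))
    for name in ("f1", "f2", "i2", "i1")
}
L_h, d_L1 = split_point(*weights["f1"], x_t, h_prev)
L_mh, d_L2 = split_point(*weights["f2"], x_t, h_prev)
L_lm, d_L3 = split_point(*weights["i2"], x_t, h_prev)
L_l, d_L4 = split_point(*weights["i1"], x_t, h_prev)
```

The hard index is not differentiable with respect to the weights, which is why the softened distributions are the quantities that carry gradients during training.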

the memory cell $c_t$ is updated according to the hierarchy defined by these split points; interaction between $L_{lm}$ and $L_{mh}$ presupposes interaction between $L_l$ and $L_h$;

depending on the relationship among $L_l$, $L_{lm}$, $L_{mh}$ and $L_h$, there are 3 update modes:

1) when $L_l \ge L_{lm} \ge L_{mh} \ge L_h$, the long-term and short-term information overlap, and the short-to-medium-term and medium-to-long-term information overlap as well; the low-level information is directly replaced by the current input, the high-level information is retained for a long time, and the intermediate levels (low-middle, middle and middle-high) are mixed in different proportions; the memory cell $c_t$ is updated by the following rule:

where $z_1$ and $z_2$ are learnable scale parameters that give the proportion of middle-level information in the two mixed levels, low-middle and middle-high, $k$ is the number of hidden-layer units, and the other parameters have the same meaning as in the LSTM;

2) when $L_l \ge L_{mh} \ge L_{lm} \ge L_h$, the long-term and short-term information overlap, but the short-to-medium-term and medium-to-long-term information do not; the low-level information is directly replaced by the current input, the high-level information is retained for a long time, the low-middle and middle-high levels are mixed in different proportions, and the middle level is set to zero because there is no information interaction; the memory cell $c_t$ is updated by the following rule:

3) when $L_l \le L_h$, the level of the current input is lower than that of the historical data, so the long-term and short-term information have no basis for overlap; the low-level information is directly replaced by the current input, the high-level information is retained for a long time, and the intermediate levels are set to zero because there is no information interaction; the memory cell $c_t$ is updated by the following rule:

with the multi-level partition of information and the corresponding update rules, the C-LSTM, a variant structure of the LSTM neural network, is derived by the following formulas:

in which separate terms denote the low level, the low-middle level, the middle level, the middle-high level and the high level, and four gate terms denote the main forget gate, the auxiliary forget gate, the main input gate and the auxiliary input gate, respectively.

The invention has the following beneficial effects: the method first uses a variational autoencoder (VAE) to form a health index that accurately reflects the degradation trend of the gear's health state, then uses the proposed cocktail long short-term memory network to predict the unknown health indices step by step, and obtains the predicted RUL when the set threshold is reached.

Additional advantages, objects, and features of the invention will be set forth in part in the description which follows and in part will become apparent to those having ordinary skill in the art upon examination of the following or may be learned from practice of the invention. The objectives and other advantages of the invention may be realized and attained by the means of the instrumentalities and combinations particularly pointed out hereinafter.

Drawings

For the purposes of promoting a better understanding of the objects, aspects and advantages of the invention, reference will now be made to the following detailed description taken in conjunction with the accompanying drawings in which:

FIG. 1 is a schematic diagram of a VAE neural network;

FIG. 2 is a diagram of a VAE neural network architecture;

FIG. 3 is a diagram of a C-LSTM neural network architecture;

FIG. 4 is a diagram of a hidden layer structure of the C-LSTM neural network;

FIG. 5 shows the hidden layer update mechanism when $L_l \ge L_{lm} \ge L_{mh} \ge L_h$;

FIG. 6 shows the hidden layer update mechanism when $L_l \ge L_{mh} \ge L_{lm} \ge L_h$;

FIG. 7 shows the hidden layer update mechanism when $L_h \ge L_l$;

FIG. 8 is a health index plot of the experimental data set;

FIG. 9 is a graph of failure threshold, training value, predicted value and actual value for predicting 90 HIs;

FIG. 10 is a graph of failure threshold, training value, predicted value and actual value for the prediction of 70 HIs;

FIG. 11 is a graph of failure threshold, training value, predicted value and actual value for predicting 50 HIs;

FIG. 12 is a graph of the MAE of the model predictions for different numbers of known sampling points;

FIG. 13 is a comparison of the prediction capabilities of different models in predicting 60 health indicators.

Detailed Description

The embodiments of the present invention are described below with reference to specific embodiments, and other advantages and effects of the present invention will be easily understood by those skilled in the art from the disclosure of the present specification. The invention is capable of other and different embodiments and of being practiced or of being carried out in various ways, and its several details are capable of modification in various respects, all without departing from the spirit and scope of the present invention. It should be noted that the drawings provided in the following embodiments are only for illustrating the basic idea of the present invention in a schematic way, and the features in the following embodiments and examples may be combined with each other without conflict.

The drawings are for illustrative purposes only; they are schematic representations rather than physical drawings and are not to be construed as limiting the invention. To better illustrate the embodiments of the present invention, some parts of the drawings may be omitted, enlarged or reduced, and do not represent the size of an actual product; it will be understood by those skilled in the art that certain well-known structures in the drawings and their descriptions may be omitted.

The same or similar reference numerals in the drawings of the embodiments of the present invention correspond to the same or similar components; in the description of the present invention, it should be understood that if there is an orientation or positional relationship indicated by terms such as "upper", "lower", "left", "right", "front", "rear", etc., based on the orientation or positional relationship shown in the drawings, it is only for convenience of description and simplification of description, but it is not an indication or suggestion that the referred device or element must have a specific orientation, be constructed in a specific orientation, and be operated, and therefore, the terms describing the positional relationship in the drawings are only used for illustrative purposes, and are not to be construed as limiting the present invention, and the specific meaning of the terms may be understood by those skilled in the art according to specific situations.

Gear health index construction based on the variational autoencoder

1. Principle of the variational autoencoder

The VAE inherits the basic structure of the autoencoder. As shown in FIG. 1, the variational autoencoder model is essentially a variational Bayesian inference framework composed of an encoding layer that models the posterior distribution and a decoding layer that models the prior distribution. Consider a randomly generated data set that involves an unobservable continuous random variable vector $Z$. By learning from the data, the encoder obtains a specific posterior distribution $q(z^i \mid x^i)$ for each data point $x^i$. The objective function of the variational autoencoder therefore includes a term for the reconstruction error of the autoencoder.

The posterior $q(z^i \mid x^i)$ obeys a normal distribution, $q(z^i \mid x^i) = \mathcal{N}(z^i; \mu^i, (\sigma^i)^2)$, and the prior $p(z^i)$ is set to the centered isotropic multivariate Gaussian $\mathcal{N}(z^i; 0, I)$. The reconstruction error is the mean square error. With these choices the objective function can be rewritten in closed form, where $J$ is the dimension of $x^i$ and the index $j$ runs over the elements of $x^i$.
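For the Gaussian posterior and prior stated above, the Kullback-Leibler term has a well-known closed form, and a mean-square reconstruction error averages over the $J$ elements of $x^i$. The rewritten objective itself is not reproduced in this text; a hedged sketch of what it could look like, assuming a minimization convention and writing $\hat{x}^i_j$ for the $j$-th decoder output, $x^i_j$ for the $j$-th input element and $K$ for the latent dimension (all three symbols are assumptions), is:

```latex
\mathcal{L}(x^i) = -\frac{1}{2}\sum_{k=1}^{K}\Big(1 + \log(\sigma^i_k)^2 - (\mu^i_k)^2 - (\sigma^i_k)^2\Big)
                 + \frac{\lambda}{J}\sum_{j=1}^{J}\big(x^i_j - \hat{x}^i_j\big)^2
```

For a one-dimensional latent variable, as in a 21-10-1-10-21 network, the first sum reduces to a single term ($K = 1$).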

2. Construction of health index

1) Twenty-one time-domain and frequency-domain features, such as the peak-to-peak value, mean and variance, are extracted from the vibration signal of the gear. These features are then input into the VAE for further information fusion and dimension reduction. Before this, each extracted feature is normalized by removing its mean and scaling it to unit variance:

$$\hat{x}^{i,j} = \frac{x^{i,j} - \mu^j}{\sigma^j}$$

where $x^{i,j}$ is the $i$-th data point of the $j$-th feature, $\hat{x}^{i,j}$ is the normalized value of $x^{i,j}$, and $\mu^j$ and $\sigma^j$ are the mean and standard deviation of the $j$-th feature, respectively.

3) Finally, the network structure of the VAE used to construct the health index is defined as 21-10-1-10-21, as shown in FIG. 2. The input layer of the encoder has 21 nodes, matching the dimension of the extracted features; the hidden layer has 10 nodes; and the encoder output layer has 1 node, matching the dimension of the output health index. The decoder mirrors this structure in reverse. The complete VAE is trained by back-propagation (BP), which optimizes the network weights by reducing the reconstruction error between the encoder input and the decoder output.
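A forward pass through a 21-10-1-10-21 VAE can be sketched in plain NumPy as follows. This is only an illustration under several assumptions that the source does not state: a Glorot-style bound for the uniform weight initialization, tanh hidden activations, the latent mean used as the health index, and a loss made of the closed-form KL term plus a weighted mean square error. All function and parameter names are hypothetical, and no training loop is shown.

```python
import numpy as np

def uniform_init(fan_in, fan_out, rng):
    # Assumption: Glorot-style bound; the source only says "uniform distribution".
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))

def init_params(rng):
    """Weights for the 21-10-1-10-21 structure (mean and log-variance heads share the 10-unit layer)."""
    return {
        "W_enc": uniform_init(21, 10, rng), "b_enc": np.zeros(10),
        "W_mu": uniform_init(10, 1, rng), "b_mu": np.zeros(1),
        "W_logvar": uniform_init(10, 1, rng), "b_logvar": np.zeros(1),
        "W_dec": uniform_init(1, 10, rng), "b_dec": np.zeros(10),
        "W_out": uniform_init(10, 21, rng), "b_out": np.zeros(21),
    }

def forward(params, x, rng, lam=1.0):
    """Encode, sample with the reparameterization trick, decode, and compute the loss."""
    h = np.tanh(x @ params["W_enc"] + params["b_enc"])          # 21 -> 10
    mu = h @ params["W_mu"] + params["b_mu"]                     # 10 -> 1
    logvar = h @ params["W_logvar"] + params["b_logvar"]         # 10 -> 1
    z = mu + np.exp(0.5 * logvar) * rng.standard_normal(mu.shape)
    h_dec = np.tanh(z @ params["W_dec"] + params["b_dec"])       # 1 -> 10
    x_hat = h_dec @ params["W_out"] + params["b_out"]            # 10 -> 21
    kl = -0.5 * np.sum(1.0 + logvar - mu**2 - np.exp(logvar), axis=1)
    rec = np.mean((x - x_hat) ** 2, axis=1)                      # mean square error
    loss = float(np.mean(kl + lam * rec))
    return mu, x_hat, loss

rng = np.random.default_rng(0)
params = init_params(rng)
x = rng.standard_normal((4, 21))     # 4 hypothetical, already normalized feature vectors
health_index, x_hat, loss = forward(params, x, rng)
```

In practice the gradients would come from an automatic differentiation framework; the sketch only shows how the layer sizes, the sampling step and the two loss terms fit together.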

Cocktail long short term memory network (C-LSTM):

In the training of an ordinary neural network the neurons are unordered. The long short-term memory network based on ordered neurons (ON-LSTM) arranges the neurons in a specific order and gives them a hierarchical structure, so that hierarchical (order) information that an ordinary neural network cannot learn can be exploited. To further strengthen the ordering of the neurons, the cocktail long short-term memory network (C-LSTM) takes the information interaction between adjacent levels into account and divides the mixed (middle) level of the ON-LSTM into three levels: low-middle, middle and middle-high. This multi-level division lets the network use the order information more fully, so that the degradation information in the health index is mined and used more thoroughly, which further improves the model's ability to predict the remaining life of the gear.

High-level information is the more important information that needs to be retained in the network for a long time, and is also called long-term information. Conversely, low-level information is less important information that exists in the network only briefly and is updated at any time, and is also called short-term information. When the high-level and low-level information interact, the middle level is updated in the same way as in a standard LSTM, the low-middle level mixes low-level and middle-level information in proportion, and the middle-high level mixes middle-level and high-level information in proportion; these levels represent medium-term, short-to-medium-term and medium-to-long-term information, respectively. When there is no interaction, the low-middle, middle and middle-high levels are set to zero and the related information takes no part in updating the network.

The invention therefore proposes a new LSTM with multi-level update rules. Because its information update is clearly layered, with mixed zones between layers like those of a cocktail, it is named the cocktail LSTM. Compared with the ON-LSTM, the C-LSTM uses the order information more fully and therefore updates the network structure more reasonably. The neural network model is shown in FIG. 3.

The neural network hidden layer structure is shown in fig. 4.

The update flow for the multi-level ordered neurons is shown in FIGS. 5 to 7: FIG. 5 shows the hidden layer update mechanism when $L_l \ge L_{lm} \ge L_{mh} \ge L_h$; FIG. 6 shows the hidden layer update mechanism when $L_l \ge L_{mh} \ge L_{lm} \ge L_h$; FIG. 7 shows the hidden layer update mechanism when $L_h \ge L_l$.
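The choice among the three update mechanisms depends only on the ordering of the four split points, which can be sketched as a small selector. The function name and the tie-breaking order are assumptions: the stated conditions overlap when split points are equal, and the source does not say which mechanism wins in that case or what happens for orderings outside the three listed, so this sketch checks the no-overlap case first and returns `None` for unlisted orderings.

```python
def select_update_case(L_l, L_lm, L_mh, L_h):
    """Return 1, 2 or 3 for the update mechanisms of FIGS. 5-7, or None if
    the ordering matches none of the three stated conditions."""
    if L_l <= L_h:
        return 3          # no overlap between long-term and short-term information
    if L_l >= L_lm >= L_mh >= L_h:
        return 1          # intermediate levels overlap as well
    if L_l >= L_mh >= L_lm >= L_h:
        return 2          # only long-term and short-term information overlap
    return None

# Hypothetical split-point indices.
case_a = select_update_case(7, 5, 3, 1)
case_b = select_update_case(7, 3, 5, 1)
case_c = select_update_case(1, 3, 5, 7)
case_d = select_update_case(7, 9, 3, 1)
```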

Health index and C-LSTM gear residual life prediction method flow constructed based on VAE

The gear remaining life prediction method combining the VAE and the C-LSTM comprises the following steps:

31. A gear vibration signal of duration $T$ is acquired at intervals of $\Delta T$ until the gear fails; the number of sampled vibration signal segments is $n$.

32. The 21 time-frequency features of each denoised vibration signal segment are calculated and input into the trained VAE, and the output is normalized to obtain an $n \times 1$ health index matrix $X$.

33. The health indices of the first $m$ sampling points, $X_1 = [y_1, y_2, \ldots, y_m]^T$, are selected as the training matrix.

34. $X_1$ is reconstructed into the matrix $U$.

35. The network is trained with the first $k$ rows of the matrix $U$ as the network input and the last row as the network output.

36. The last $k$ outputs are taken as the network input to obtain the output at the next moment.

37. Steps 34 to 36 are repeated a set number of times, and the denormalized output is compared with the actual health index values to verify the effectiveness of the method. When the denormalized output exceeds the set threshold, the number of predicted sampling points multiplied by the sum $\Delta T + T$ of the sampling interval and the sampling duration gives the remaining useful life of the gear.
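Steps 33 to 36 describe a sliding-window scheme followed by recursive one-step-ahead prediction. The layout of the matrix $U$ is not reproduced in this text, so the sketch below assumes a common choice: each column is a window of $k+1$ consecutive health indices, giving a $(k+1) \times (m-k)$ matrix. A least-squares linear predictor stands in for the C-LSTM purely to keep the example short; it is not the network of the invention, and all names and data are hypothetical.

```python
import numpy as np

def build_training_matrix(y, k):
    """Stack sliding windows of length k+1 as columns: shape (k+1, m-k)."""
    y = np.asarray(y, dtype=float)
    m = y.size
    return np.stack([y[j:j + k + 1] for j in range(m - k)], axis=1)

def fit_linear_predictor(U):
    """Stand-in for the C-LSTM: least-squares map from the first k rows to the last row."""
    inputs = U[:-1].T                                   # (m-k, k)
    targets = U[-1]                                     # (m-k,)
    design = np.hstack([inputs, np.ones((inputs.shape[0], 1))])
    coef, *_ = np.linalg.lstsq(design, targets, rcond=None)
    return lambda window: float(np.append(window, 1.0) @ coef)

def forecast(y_known, k, steps, predict):
    """Feed the last k values back in to predict one step at a time."""
    history = list(y_known)
    predictions = []
    for _ in range(steps):
        nxt = predict(np.array(history[-k:]))
        predictions.append(nxt)
        history.append(nxt)
    return np.array(predictions)

y_known = np.linspace(0.0, 1.0, 30) ** 2    # hypothetical smooth health index curve
U = build_training_matrix(y_known, k=5)
predict = fit_linear_predictor(U)
future = forecast(y_known, k=5, steps=10, predict=predict)
```

Because each prediction is fed back as input, errors accumulate with the forecast horizon, which is consistent with longer horizons being harder to predict.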

The neural network model and the prediction method have been described above; selected experimental results are presented below to illustrate the effectiveness of the method.

In the experiment, the first transmission stage increases the speed and the second stage reduces it, so that the overall transmission ratio of the experimental gearbox is exactly 1:1. The experimental gears are made of 40Cr, with accuracy grade 5, a surface hardness of 55 HRC and a module of 5. The large gear has 31 teeth, the small gear has 25 teeth, and the face width of the first-stage gears is 21 mm. The gear life test produced 4 full-life-cycle vibration data sets under two operating conditions, as shown in Table 1. The sampling interval was 60 s, the sampling duration 10 s and the sampling frequency 50,000 Hz. The VAE is trained on data set 1 and data set 3 so that it can construct health indicators (HIs); data set 2 and data set 4 are then encoded with the trained VAE to construct their HIs.

TABLE 1 Gear Life data-related information Table

Most of the samples, which come from the stable-running and early-fault stages, contain no gear degradation information, so there is no need to use them to predict the RUL. The invention therefore uses only part of the samples in each life-cycle data set, calculates the HIs with the VAE and applies them to gear RUL prediction. The HIs obtained for all four gear data sets are shown in FIG. 8. As the figure shows, the constructed HIs reflect the deterioration of gear health well, which greatly helps life prediction. In addition, all the gear HI curves have a descending trend and similar failure thresholds, so a uniform failure threshold can be set for different experiments, which improves the robustness of gear RUL prediction. The failure threshold of the proposed HI also fluctuates less than that of other health indicators such as RMS and frequency center. An HI constructed with the VAE can thus effectively improve the robustness of gear RUL prediction.

To illustrate the long-term and short-term prediction capabilities of C-LSTM, we used dataset 1 to predict 90, 70 and 50 HI points, respectively, with the prediction results shown in FIGS. 9-11.

As the number of known points increases, the prediction capability gradually improves and the predicted values come closer to the actual values. In all the above cases the degradation trend of the predicted curve is similar to that of the actual curve, so the gear RUL can be estimated well. We then calculated the mean absolute error (MAE) to evaluate the predictive ability of the C-LSTM; the MAEs obtained from prediction experiments with different numbers of known HI points are shown in FIG. 12. The MAE decreases as the number of known HI points increases. When 50 HI points are predicted, i.e. the true RUL is 50 min, the percentage error of the RUL prediction is 9%; when 70 HI points are predicted, it is 16%. The C-LSTM thus has long-term prediction capability and good prediction accuracy. To further demonstrate the long-term predictive ability of the C-LSTM, we attempted to predict 90 HIs, i.e. a true RUL of one and a half hours; the percentage error of the RUL prediction was 36.4%. Therefore, even when the number of known samples is small, the C-LSTM retains a certain prediction capability.
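The two quantities used in this paragraph can be sketched as follows. The text does not define the percentage error, so the sketch assumes the absolute difference between true and predicted RUL divided by the true RUL; the function names and the numbers in the demo are made up for illustration and are not experimental results.

```python
import numpy as np

def mean_absolute_error(actual, predicted):
    """MAE between actual and predicted health index values."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs(actual - predicted)))

def rul_percentage_error(true_rul, predicted_rul):
    """Assumed definition: |true - predicted| / true, in percent."""
    return abs(true_rul - predicted_rul) / true_rul * 100.0

# Hypothetical values only.
mae = mean_absolute_error([1.0, 2.0, 3.0], [1.5, 2.0, 2.0])
err = rul_percentage_error(true_rul=50.0, predicted_rul=45.0)
```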

In addition, to demonstrate the superiority of the proposed C-LSTM neural network, it was compared with the conventional LSTM and its variants on four evaluation criteria: MAE (mean absolute error), NRMSE (normalized root mean square error), Score (the IEEE scoring function for life prediction performance) and MAPE (mean absolute percentage error). As shown in FIG. 13, the proposed method has higher prediction accuracy than the conventional LSTMs.

As FIG. 13 shows, because the ON-LSTM and the C-LSTM mine the order information, they can fully capture the health information contained in the input data; they therefore reach a local optimum more easily and achieve high-precision prediction of the remaining service life of the gear more readily than the original LSTM and its variants. In addition, the C-LSTM orders the neural units at multiple levels, so the inter-layer order information is mined and used fully and reasonably, and its prediction performance is improved to a certain extent over that of the ON-LSTM.
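Two of the named criteria can be sketched as below. The text does not give their formulas, so these are common definitions offered as assumptions: NRMSE is taken here as the root mean square error divided by the range of the actual values (other normalizations exist), and MAPE as the mean of the absolute relative errors in percent. The Score function is omitted because its exact form is not given in the text. The demo numbers are hypothetical.

```python
import numpy as np

def nrmse(actual, predicted):
    """Assumed definition: RMSE normalized by the range of the actual values."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    rmse = np.sqrt(np.mean((actual - predicted) ** 2))
    return float(rmse / (actual.max() - actual.min()))

def mape(actual, predicted):
    """Mean absolute percentage error; actual values must be non-zero."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs((actual - predicted) / actual)) * 100.0)

# Hypothetical values only.
actual = [1.0, 2.0, 4.0]
predicted = [1.0, 2.0, 2.0]
nrmse_value = nrmse(actual, predicted)
mape_value = mape(actual, predicted)
```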

Multi-level guided ordered neuron process

The C-LSTM introduces a new memory cell update rule through a multi-level order information guidance mechanism. With this new update mechanism, the C-LSTM updates at multiple levels and can mine and use the order information more fully, which makes it better suited to time series prediction than the ordinary LSTM and its variants.

Through this multi-level mechanism, the cocktail LSTM provides a new way of updating the memory cell. The positions of the hierarchical split points are defined, according to the level of the information, as $L_l$, $L_{lm}$, $L_{mh}$ and $L_h$, where $L_l$ is the short-term information split point, $L_{lm}$ is the short-to-medium-term information split point, $L_{mh}$ is the medium-to-long-term information split point and $L_h$ is the long-term information split point. From the current input $x_t$ and the recurrent information $h_{t-1}$, the split point of each level is calculated as follows:

$$L_h = F_1(x_t, h_{t-1}) = \operatorname{indexmax}\big(\operatorname{softmax}(W_{f1} x_t + U_{f1} h_{t-1} + b_{f1})\big) \tag{11}$$

$$L_{mh} = F_2(x_t, h_{t-1}) = \operatorname{indexmax}\big(\operatorname{softmax}(W_{f2} x_t + U_{f2} h_{t-1} + b_{f2})\big) \tag{12}$$

$$L_{lm} = F_3(x_t, h_{t-1}) = \operatorname{indexmax}\big(\operatorname{softmax}(W_{i2} x_t + U_{i2} h_{t-1} + b_{i2})\big) \tag{13}$$

$$L_l = F_4(x_t, h_{t-1}) = \operatorname{indexmax}\big(\operatorname{softmax}(W_{i1} x_t + U_{i1} h_{t-1} + b_{i1})\big) \tag{14}$$

where $W$ and $U$ are the weights of the input information and the historical information, respectively, $b$ is the bias, and softmax is the softmax function.

To make the parameters learnable, the evaluation of each split point is softened:

$$d_{L1} = \operatorname{softmax}(W_{f1} x_t + U_{f1} h_{t-1} + b_{f1}) \tag{15}$$

$$d_{L2} = \operatorname{softmax}(W_{f2} x_t + U_{f2} h_{t-1} + b_{f2}) \tag{16}$$

$$d_{L3} = \operatorname{softmax}(W_{i2} x_t + U_{i2} h_{t-1} + b_{i2}) \tag{17}$$

$$d_{L4} = \operatorname{softmax}(W_{i1} x_t + U_{i1} h_{t-1} + b_{i1}) \tag{18}$$

The memory cell $c_t$ is updated according to the hierarchy defined by these split points. Note that interaction between $L_{lm}$ and $L_{mh}$ presupposes interaction between $L_l$ and $L_h$. Given the relationship among $L_l$, $L_{lm}$, $L_{mh}$ and $L_h$, there are only 3 update modes:

1) When $L_l \ge L_{lm} \ge L_{mh} \ge L_h$, the long-term and short-term information overlap, and the short-to-medium-term and medium-to-long-term information overlap as well. The low-level information is directly replaced by the current input, the high-level information is retained for a long time, and the intermediate levels (low-middle, middle and middle-high) are mixed in different proportions, so the memory cell $c_t$ is updated by the following rule:

where $z_1$ and $z_2$ are learnable scale parameters that give the proportion of middle-level information in the two mixed levels, low-middle and middle-high, $k$ is the number of hidden-layer units, and the other parameters have the same meaning as in the LSTM.

2) When $L_l \ge L_{mh} \ge L_{lm} \ge L_h$, the long-term and short-term information overlap, but the short-to-medium-term and medium-to-long-term information do not. The low-level information is directly replaced by the current input, the high-level information is retained for a long time, the low-middle and middle-high levels are mixed in different proportions, and the middle level is set to zero because there is no information interaction, so the memory cell $c_t$ is updated by the following rule:

where the parameters have the same meaning as above.

3) When $L_l \le L_h$, the level of the current input is lower than that of the historical data, so the long-term and short-term information have no basis for overlap. The low-level information is directly replaced by the current input, the high-level information is retained for a long time, and the intermediate levels are set to zero because there is no information interaction, so the memory cell $c_t$ is updated by the following rule:

With the multi-level partition of information and the corresponding update rules, the C-LSTM, a variant structure of the LSTM neural network, is derived by the following formulas:

in which separate terms denote the low level, the low-middle level, the middle level, the middle-high level and the high level.

Finally, the above embodiments are only intended to illustrate the technical solutions of the present invention and not to limit the present invention, and although the present invention has been described in detail with reference to the preferred embodiments, it will be understood by those skilled in the art that modifications or equivalent substitutions may be made on the technical solutions of the present invention without departing from the spirit and scope of the technical solutions, and all of them should be covered by the claims of the present invention.

Claims

1. A method for predicting the remaining useful life of a gear based on a cocktail long short-term memory neural network, characterized in that the method comprises the following steps:

S1: constructing a gear health index based on a variational autoencoder (VAE);

S2: defining a cocktail long short-term memory network (C-LSTM);

2. The cocktail long short term memory neural network-based gear remaining life prediction method as claimed in claim 1, wherein: the S1 specifically includes:

S11: principle of the variational autoencoder

The variational autoencoder model is a variational Bayesian inference framework composed of an encoding layer that models the posterior distribution and a decoding layer that models the prior distribution, applied to a randomly generated data set;

where $D_{KL}(q \,\|\, p)$ denotes the Kullback-Leibler divergence, a measure of the difference between one probability distribution and another, and $\lambda$ is the weight of the reconstruction error;

where $J$ is the dimension of $x^i$ and the index $j$ runs over the elements of $x^i$;

S12: construction of the health index

where $x^{i,j}$ is the $i$-th data point of the $j$-th feature;

3. The cocktail long short term memory neural network-based gear remaining life prediction method as claimed in claim 2, wherein: the S3 specifically includes:

S34: constructing the reconstruction matrix;

4. The cocktail long short term memory neural network-based gear remaining life prediction method as claimed in claim 1, wherein: in the C-LSTM, the positions of the hierarchical split points are defined, according to the level of the information, as $L_l$, $L_{lm}$, $L_{mh}$ and $L_h$, where $L_l$ is the short-term information split point, $L_{lm}$ is the short-to-medium-term information split point, $L_{mh}$ is the medium-to-long-term information split point and $L_h$ is the long-term information split point; from the current input $x_t$ and the recurrent information $h_{t-1}$, the split point of each level is calculated as follows:

$$L_h = F_1(x_t, h_{t-1}) = \operatorname{indexmax}\big(\operatorname{softmax}(W_{f1} x_t + U_{f1} h_{t-1} + b_{f1})\big) \tag{11}$$

$$L_{mh} = F_2(x_t, h_{t-1}) = \operatorname{indexmax}\big(\operatorname{softmax}(W_{f2} x_t + U_{f2} h_{t-1} + b_{f2})\big) \tag{12}$$

$$L_{lm} = F_3(x_t, h_{t-1}) = \operatorname{indexmax}\big(\operatorname{softmax}(W_{i2} x_t + U_{i2} h_{t-1} + b_{i2})\big) \tag{13}$$

$$L_l = F_4(x_t, h_{t-1}) = \operatorname{indexmax}\big(\operatorname{softmax}(W_{i1} x_t + U_{i1} h_{t-1} + b_{i1})\big) \tag{14}$$

$$d_{L1} = \operatorname{softmax}(W_{f1} x_t + U_{f1} h_{t-1} + b_{f1}) \tag{15}$$

$$d_{L2} = \operatorname{softmax}(W_{f2} x_t + U_{f2} h_{t-1} + b_{f2}) \tag{16}$$

$$d_{L3} = \operatorname{softmax}(W_{i2} x_t + U_{i2} h_{t-1} + b_{i2}) \tag{17}$$

$$d_{L4} = \operatorname{softmax}(W_{i1} x_t + U_{i1} h_{t-1} + b_{i1}) \tag{18}$$

2) when $L_l \ge L_{mh} \ge L_{lm} \ge L_h$, the long-term and short-term information overlap, but the short-to-medium-term and medium-to-long-term information do not; the low-level information is directly replaced by the current input, the high-level information is retained for a long time, the low-middle and middle-high levels are mixed in different proportions, and the middle level is set to zero because there is no information interaction; the memory cell $c_t$ is updated by the following rule:

in which separate terms denote the low level, the low-middle level, the middle level, the middle-high level and the high level.