CN111723527A - Gear residual life prediction method based on cocktail long-term and short-term memory neural network - Google Patents


Publication number
CN111723527A
Authority
CN
China
Prior art keywords
information
term
short
level
gear
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010599670.5A
Other languages
Chinese (zh)
Other versions
CN111723527B (en)
Inventor
秦毅
项盛
陈定粮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing University
Original Assignee
Chongqing University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing University
Priority to CN202010599670.5A
Publication of CN111723527A
Application granted
Publication of CN111723527B
Legal status: Active
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00Computer-aided design [CAD]
    • G06F30/20Design optimisation, verification or simulation
    • G06F30/27Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01MTESTING STATIC OR DYNAMIC BALANCE OF MACHINES OR STRUCTURES; TESTING OF STRUCTURES OR APPARATUS, NOT OTHERWISE PROVIDED FOR
    • G01M13/00Testing of machine parts
    • G01M13/02Gearings; Transmission mechanisms
    • G01M13/021Gearings
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01MTESTING STATIC OR DYNAMIC BALANCE OF MACHINES OR STRUCTURES; TESTING OF STRUCTURES OR APPARATUS, NOT OTHERWISE PROVIDED FOR
    • G01M13/00Testing of machine parts
    • G01M13/02Gearings; Transmission mechanisms
    • G01M13/028Acoustic or vibration analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00Computer-aided design [CAD]
    • G06F30/10Geometric CAD
    • G06F30/17Mechanical parametric or variational design
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/049Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2119/00Details relating to the type or aim of the analysis or the optimisation
    • G06F2119/04Ageing analysis or optimisation against ageing

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Geometry (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Hardware Design (AREA)
  • Acoustics & Sound (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Optimization (AREA)
  • Mathematical Analysis (AREA)
  • Computational Mathematics (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Testing Of Devices, Machine Parts, Or Other Structures Thereof (AREA)

Abstract

The invention relates to a method for predicting the remaining life of a gear based on a cocktail long short-term memory neural network, and belongs to the technical field of automation. The method comprises the following steps: S1: constructing a gear health index based on a variational autoencoder (VAE); S2: defining the cocktail long short-term memory network (C-LSTM); S3: predicting the remaining life of the gear from the VAE-constructed health index using the C-LSTM. The method first uses the VAE to form a health index that accurately reflects the degradation trend of the gear's health state, then predicts the unknown health index values step by step with the proposed cocktail long short-term memory network, and obtains the predicted remaining useful life (RUL) when the set threshold is reached.

Description

Gear residual life prediction method based on cocktail long-term and short-term memory neural network
Technical Field
The invention belongs to the technical field of automation and relates to a method for predicting the remaining life of a gear based on a cocktail long short-term memory neural network.
Background
Gears are key components widely used in industrial equipment such as wind turbines, automobiles and aircraft engines. Gear faults such as pitting, spalling and other fatigue damage often trigger chain failures across the whole machine, causing downtime and, in severe cases, casualties, heavy economic losses and safety hazards. The remaining useful life (RUL) of a gear is defined as the length of time from the current moment to the end of its useful life. Predicting it is a viable strategy for scheduling equipment maintenance and avoiding unexpected gear failure. Predicting the life of in-service gears makes it possible to determine maintenance times effectively, improve production efficiency, keep production continuous, reduce the accident rate and prevent sudden failures, which is of great significance for engineering production.
Because of this significance for industrial production, the problem has attracted growing attention from scholars and researchers. An online intelligent method for predicting the remaining life of gears that is suited to small samples is therefore provided.
Disclosure of Invention
In view of the above, the present invention provides a method for predicting the remaining life of a gear based on a cocktail long short-term memory neural network.
To achieve this purpose, the invention provides the following technical solution:
The method for predicting the remaining life of a gear based on the cocktail long short-term memory neural network comprises the following steps:
S1: constructing a gear health index based on a variational autoencoder (VAE);
S2: defining the cocktail long short-term memory network (C-LSTM);
S3: predicting the remaining life of the gear from the VAE-constructed health index using the C-LSTM.
Optionally, the S1 specifically includes:
S11: Principle of the variational autoencoder
The variational autoencoder model is a variational Bayesian inference framework composed of an encoding layer that models the posterior distribution and a decoding layer that models the prior distribution. A randomly generated data set
$$X = \{x_i\}_{i=1}^{N}$$
contains an unobservable continuous random variable vector $Z$. By learning from the data, the encoder obtains the posterior distribution $q(z_i|x_i)$ specific to each data sample $x_i$. The objective function of the variational autoencoder is written as:
$$L(x_i) = D_{KL}\big(q(z_i|x_i)\,\|\,p(z_i)\big) + \lambda\, L_{rec}(x_i, \hat{x}_i)$$
where $D_{KL}(q\|p)$ is the Kullback-Leibler divergence, a measure of the difference between one probability distribution and another, $\lambda$ is the weight of the reconstruction error, and $L_{rec}(x_i, \hat{x}_i)$ is the reconstruction error of the autoencoder, with $\hat{x}_i$ the decoder output. $q(z_i|x_i)$ follows the normal distribution $q(z_i|x_i) = N(z_i; \mu_i, (\sigma_i)^2)$, and $p(z_i)$ is set to the centered isotropic multivariate Gaussian $N(z_i; 0, 1)$. The reconstruction error is the mean square error. The objective function above is then rewritten as:
$$L(x_i) = \frac{1}{2}\left(\mu_i^2 + \sigma_i^2 - \log \sigma_i^2 - 1\right) + \frac{\lambda}{J}\sum_{j=1}^{J}\left(x_i^j - \hat{x}_i^j\right)^2$$
where $J$ is the dimension of $x_i$, and $x_i^j$ is the $j$-th element of $x_i$;
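The objective described above can be sketched numerically as follows. This is a minimal illustration, not the patented implementation: it assumes a scalar latent variable, a standard normal prior, and a reconstruction error averaged over the $J$ elements of the input. The function name `vae_loss` and its parameter names are hypothetical.

```python
import numpy as np


def vae_loss(x, x_hat, mu, sigma, lam=1.0):
    """Per-sample VAE objective: KL(q || p) + lam * mean squared error.

    Assumptions (not stated in the source): scalar latent variable,
    q = N(mu, sigma^2), p = N(0, 1), MSE averaged over the elements of x.
    """
    x = np.asarray(x, dtype=float)
    x_hat = np.asarray(x_hat, dtype=float)
    # Closed-form KL divergence between N(mu, sigma^2) and N(0, 1).
    kl = 0.5 * (mu ** 2 + sigma ** 2 - np.log(sigma ** 2) - 1.0)
    # Reconstruction error: mean squared error over the J input elements.
    mse = np.mean((x - x_hat) ** 2)
    return kl + lam * mse
```

With a perfect reconstruction and a posterior equal to the prior (`mu=0`, `sigma=1`), both terms vanish and the loss is zero.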
s12: construction of health index
1) Time-domain and frequency-domain features are extracted from the vibration signal of the gear. The extracted features are then input into the VAE for further information fusion and dimension reduction. First, each extracted feature is normalized by removing its mean and scaling it to unit variance:
$$\hat{x}_{i,j} = \frac{x_{i,j} - \mu_j}{\sigma_j}$$
where $x_{i,j}$ is the $i$-th data point of the $j$-th feature, $\hat{x}_{i,j}$ is the normalized value of $x_{i,j}$, and $\mu_j$ and $\sigma_j$ are the mean and standard deviation of the $j$-th feature, respectively;
2) Secondly, to ensure that the influence of any single node neither diverges nor vanishes, the initial weights of the VAE follow the uniform distribution
Figure BDA0002558189100000027
3) Finally, the network structure of the variational autoencoder used to construct the health index is defined as 21-10-1-10-21. The input layer of the encoder has 21 nodes, matching the dimension of the extracted features; the hidden layer has 10 nodes; the output layer of the encoder has 1 node, matching the dimension of the output health index. The decoder mirrors the encoder. The reconstruction error between the encoder input and the decoder output is reduced by backpropagation (BP) to optimize the network weights, thereby training the complete VAE.
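The normalization in step 1) can be sketched as below. This is a hedged example rather than the inventors' code: the function name `standardize` is hypothetical, the population standard deviation is assumed, and the guard against constant features is an added safeguard not mentioned in the source.

```python
import numpy as np


def standardize(features):
    """Z-score each column of an (n_samples, n_features) array.

    Returns the normalized array together with the per-feature mean and
    standard deviation so the same scaling can be reused on new data.
    """
    features = np.asarray(features, dtype=float)
    mu = features.mean(axis=0)
    sigma = features.std(axis=0)  # population standard deviation (assumption)
    # Added safeguard: avoid division by zero for constant features.
    sigma = np.where(sigma == 0.0, 1.0, sigma)
    return (features - mu) / sigma, mu, sigma
```

For the structure described in step 3), the input to this function would have 21 columns, one per extracted feature.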
Optionally, the S3 specifically includes:
S31: acquiring a gear vibration signal of duration $T$ at intervals of $\Delta T$ until the gear fails, the number of sampled vibration signal segments being $n$;
S32: calculating the 21 time-frequency features of each denoised vibration signal, inputting them into the trained VAE, and normalizing the output to obtain an $n \times 1$ health index matrix $X$;
S33: selecting the health indices of the first $m$ sampling points, $X_1 = [y_1, y_2, \ldots, y_m]^T$, as the training matrix;
S34: reconstructing the matrix
$$U = \begin{bmatrix} y_1 & y_2 & \cdots & y_{m-k} \\ y_2 & y_3 & \cdots & y_{m-k+1} \\ \vdots & \vdots & & \vdots \\ y_{k+1} & y_{k+2} & \cdots & y_m \end{bmatrix}$$
S35: taking the first $k$ rows of the matrix $U$ as the input of the neural network and the last row as its output to train the network;
S36: taking the last $k$ outputs as the network input to obtain the output at the next time step;
S37: repeating steps S34-S36 a certain number of times, and comparing the inverse-normalized output with the actual health index values to verify the effectiveness of the method; meanwhile, when the inverse-normalized output exceeds the set threshold, the number of predicted sampling points is multiplied by $\Delta T + T$, the sum of the sampling interval and the sampling duration of the vibration signal, and the product is the remaining useful life of the gear.
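Steps S33 to S35 can be illustrated with a short sketch. The layout of the matrix $U$ used here (each column a sliding window of $k + 1$ consecutive health index values) is an assumption based on the description of its first $k$ rows and last row; the function names are hypothetical.

```python
import numpy as np


def build_window_matrix(y, k):
    """Return U with k + 1 rows; column i holds y[i], ..., y[i + k]."""
    y = np.asarray(y, dtype=float)
    m = y.size
    if not 0 < k < m:
        raise ValueError("k must satisfy 0 < k < len(y)")
    # Row r holds y[r], y[r + 1], ..., y[r + m - k - 1].
    return np.stack([y[r:r + m - k] for r in range(k + 1)])


def split_inputs_targets(U):
    """First k rows are the network inputs, the last row is the target."""
    return U[:-1], U[-1]
```

Each column of the input block then pairs $k$ consecutive health index values with the value that follows them.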
Optionally, in the C-LSTM, the positions of the hierarchical split points are defined according to the hierarchy level of the information as $L_l$, $L_{lm}$, $L_{mh}$ and $L_h$, where $L_l$ is the short-term information split point, $L_{lm}$ is the short-to-medium-term information split point, $L_{mh}$ is the medium-to-long-term information split point, and $L_h$ is the long-term information split point. From the current input information $x_t$ and the recurrent information $h_{t-1}$, the information of each level is calculated as follows:
$$L_h = F_1(x_t, h_{t-1}) = \operatorname{indexmax}(\operatorname{softmax}(W_{f1} x_t + U_{f1} h_{t-1} + b_{f1})) \tag{11}$$
$$L_{mh} = F_2(x_t, h_{t-1}) = \operatorname{indexmax}(\operatorname{softmax}(W_{f2} x_t + U_{f2} h_{t-1} + b_{f2})) \tag{12}$$
$$L_{lm} = F_3(x_t, h_{t-1}) = \operatorname{indexmax}(\operatorname{softmax}(W_{i2} x_t + U_{i2} h_{t-1} + b_{i2})) \tag{13}$$
$$L_l = F_4(x_t, h_{t-1}) = \operatorname{indexmax}(\operatorname{softmax}(W_{i1} x_t + U_{i1} h_{t-1} + b_{i1})) \tag{14}$$
where $W$ and $U$ are the weights of the input information and the historical information respectively, $b$ is the bias (threshold), and softmax is the softmax function;
To make the parameters learnable, the evaluation of each split point is softened by omitting the indexmax operation:
$$dL_1 = \operatorname{softmax}(W_{f1} x_t + U_{f1} h_{t-1} + b_{f1}) \tag{15}$$
$$dL_2 = \operatorname{softmax}(W_{f2} x_t + U_{f2} h_{t-1} + b_{f2}) \tag{16}$$
$$dL_3 = \operatorname{softmax}(W_{i2} x_t + U_{i2} h_{t-1} + b_{i2}) \tag{17}$$
$$dL_4 = \operatorname{softmax}(W_{i1} x_t + U_{i1} h_{t-1} + b_{i1}) \tag{18}$$
The memory cell $c_t$ is updated according to the hierarchical relationship among the split points. Interaction between $L_{lm}$ and $L_{mh}$ presupposes interaction between $L_l$ and $L_h$;
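The relation between the hard split points of equations (11)-(14) and their softened counterparts in equations (15)-(18) can be sketched as follows. The sketch reads indexmax as the index of the largest softmax entry, which is an interpretation rather than a definition given in the source; the function names and the array shapes are hypothetical.

```python
import numpy as np


def softmax(v):
    """Numerically stable softmax of a 1-D array."""
    e = np.exp(v - np.max(v))
    return e / e.sum()


def split_point(W, U, b, x_t, h_prev):
    """Return (hard split index, softened distribution) for one split point.

    The hard index is not differentiable; the softened distribution is
    what makes the parameters W, U and b learnable by gradient descent.
    """
    d = softmax(W @ x_t + U @ h_prev + b)
    return int(np.argmax(d)), d
```

Calling `split_point` once per parameter set (f1, f2, i2, i1) would give the four split points and their softened forms.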
Depending on the relationship among $L_l$, $L_{lm}$, $L_{mh}$ and $L_h$, there are three update modes:
1) When $L_l \ge L_{lm} \ge L_{mh} \ge L_h$, the long-term and short-term information overlap, and the short-term and medium-term memories also overlap. Low-level information is directly replaced by the current input, high-level information is retained for a long time, and the intermediate levels (low-middle, middle and middle-high) are mixed in different proportions. The memory cell $c_t$ is updated by the following rule:
Figure BDA0002558189100000041
wherein z is1And z2The scale parameters are learnable scale parameters which respectively represent the proportion of the middle-level information in two mixed levels of a middle-level and a middle-level, k is defined as the number of hidden layer units, and the meaning of other parameters is consistent with that of the LSTM;
2) when L isl≥Lmh≥Llm≥LhTime, in long-term information andon the basis of overlapping short-term information, the short-term memory and the medium-term memory are not overlapped; the low-level information is directly replaced by the current input, the high-level information is retained for a long time, the middle-low and middle-high are mixed in different proportions, the middle-level information is set to zero due to no information interaction, and the cell memory unit ctUpdated by the following rules:
Figure BDA0002558189100000042
3) When $L_l \le L_h$, that is, when the level of the current input is lower than the level of the historical data, the long-term and short-term information do not overlap. Low-level information is directly replaced by the current input, high-level information is retained for a long time, and the middle-level information is set to zero because there is no information interaction. The memory cell $c_t$ is updated by the following rule:
Figure BDA0002558189100000043
With the division into multi-level information and the corresponding update rules, the C-LSTM, a variant of the LSTM neural network, is derived as follows:
Figure BDA0002558189100000051
where
Figure BDA0002558189100000052
represents the low level,
Figure BDA0002558189100000053
represents the low-middle level,
Figure BDA0002558189100000054
represents the middle level,
Figure BDA0002558189100000055
represents the middle-high level,
Figure BDA0002558189100000056
represents the high level, and
Figure BDA0002558189100000057
Figure BDA0002558189100000058
represent the main forget gate, the auxiliary forget gate, the main input gate and the auxiliary input gate, respectively.
The beneficial effects of the invention are as follows: the method first uses a variational autoencoder (VAE) to form a health index that accurately reflects the degradation trend of the gear's health state, then predicts the unknown health index values step by step with the proposed cocktail long short-term memory network, and obtains the predicted RUL when the set threshold is reached.
Additional advantages, objects, and features of the invention will be set forth in part in the description which follows and in part will become apparent to those having ordinary skill in the art upon examination of the following or may be learned from practice of the invention. The objectives and other advantages of the invention may be realized and attained by the means of the instrumentalities and combinations particularly pointed out hereinafter.
Drawings
For the purposes of promoting a better understanding of the objects, aspects and advantages of the invention, reference will now be made to the following detailed description taken in conjunction with the accompanying drawings in which:
FIG. 1 is a schematic diagram of a VAE neural network;
FIG. 2 is a diagram of a VAE neural network architecture;
FIG. 3 is a diagram of a C-LSTM neural network architecture;
FIG. 4 is a diagram of a hidden layer structure of the C-LSTM neural network;
FIG. 5 shows the hidden-layer update mechanism when $L_l \ge L_{lm} \ge L_{mh} \ge L_h$;
FIG. 6 shows the hidden-layer update mechanism when $L_l \ge L_{mh} \ge L_{lm} \ge L_h$;
FIG. 7 shows the hidden-layer update mechanism when $L_h \ge L_l$;
FIG. 8 is a health index plot of the experimental data set;
FIG. 9 is a graph of failure threshold, training value, predicted value and actual value for predicting 90 HIs;
FIG. 10 is a graph of failure threshold, training value, predicted value and actual value for the prediction of 70 HIs;
FIG. 11 is a graph of failure threshold, training value, predicted value and actual value for predicting 50 HIs;
FIG. 12 is a graph of the MAE of the model predictions for different numbers of known sampling points;
FIG. 13 is a comparison of the prediction capabilities of different models in predicting 60 health indicators.
Detailed Description
The embodiments of the present invention are described below with reference to specific embodiments, and other advantages and effects of the present invention will be easily understood by those skilled in the art from the disclosure of the present specification. The invention is capable of other and different embodiments and of being practiced or of being carried out in various ways, and its several details are capable of modification in various respects, all without departing from the spirit and scope of the present invention. It should be noted that the drawings provided in the following embodiments are only for illustrating the basic idea of the present invention in a schematic way, and the features in the following embodiments and examples may be combined with each other without conflict.
The drawings are for illustrative purposes only; they are schematic rather than pictorial and are not to be construed as limiting the invention. To better illustrate the embodiments of the present invention, some parts of the drawings may be omitted, enlarged or reduced, and do not represent the size of an actual product; those skilled in the art will understand that certain well-known structures and their descriptions may be omitted from the drawings.
The same or similar reference numerals in the drawings of the embodiments of the present invention correspond to the same or similar components; in the description of the present invention, it should be understood that if there is an orientation or positional relationship indicated by terms such as "upper", "lower", "left", "right", "front", "rear", etc., based on the orientation or positional relationship shown in the drawings, it is only for convenience of description and simplification of description, but it is not an indication or suggestion that the referred device or element must have a specific orientation, be constructed in a specific orientation, and be operated, and therefore, the terms describing the positional relationship in the drawings are only used for illustrative purposes, and are not to be construed as limiting the present invention, and the specific meaning of the terms may be understood by those skilled in the art according to specific situations.
Gear health index construction based on variational self-coding
1. Principle of variational self-encoder
The VAE inherits the basic structure of the autoencoder. As shown in FIG. 1, the variational autoencoder model is essentially a variational Bayesian inference framework composed of an encoding layer that models the posterior distribution and a decoding layer that models the prior distribution. A randomly generated data set
$$X = \{x_i\}_{i=1}^{N}$$
contains an unobservable continuous random variable vector $Z$. By learning from the data, the encoder obtains the posterior distribution $q(z_i|x_i)$ specific to each data sample $x_i$. The objective function of the variational autoencoder can therefore be written as:
$$L(x_i) = D_{KL}\big(q(z_i|x_i)\,\|\,p(z_i)\big) + \lambda\, L_{rec}(x_i, \hat{x}_i)$$
where $D_{KL}(q\|p)$ is the Kullback-Leibler divergence, a measure of the difference between one probability distribution and another, $\lambda$ is the weight of the reconstruction error, and $L_{rec}(x_i, \hat{x}_i)$ is the reconstruction error of the autoencoder, with $\hat{x}_i$ the decoder output. $q(z_i|x_i)$ follows the normal distribution $q(z_i|x_i) = N(z_i; \mu_i, (\sigma_i)^2)$, and $p(z_i)$ is set to the centered isotropic multivariate Gaussian $N(z_i; 0, 1)$. The reconstruction error is the mean square error. The objective function above can then be rewritten as:
$$L(x_i) = \frac{1}{2}\left(\mu_i^2 + \sigma_i^2 - \log \sigma_i^2 - 1\right) + \frac{\lambda}{J}\sum_{j=1}^{J}\left(x_i^j - \hat{x}_i^j\right)^2$$
where $J$ is the dimension of $x_i$, and $x_i^j$ is the $j$-th element of $x_i$.
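As a worked step, and assuming a scalar latent variable, the closed form of the Kullback-Leibler term under the Gaussian assumptions above follows from taking the expectation of the log-density ratio under $q$, using $\mathbb{E}_q[(z-\mu)^2] = \sigma^2$ and $\mathbb{E}_q[z^2] = \mu^2 + \sigma^2$:

```latex
\begin{aligned}
D_{KL}\big(N(\mu,\sigma^2)\,\|\,N(0,1)\big)
  &= \mathbb{E}_{q}\big[\log q(z) - \log p(z)\big] \\
  &= \mathbb{E}_{q}\Big[-\tfrac{1}{2}\log\sigma^2
       - \frac{(z-\mu)^2}{2\sigma^2} + \frac{z^2}{2}\Big] \\
  &= -\tfrac{1}{2}\log\sigma^2 - \tfrac{1}{2} + \tfrac{1}{2}\big(\mu^2 + \sigma^2\big) \\
  &= \tfrac{1}{2}\big(\mu^2 + \sigma^2 - \log\sigma^2 - 1\big).
\end{aligned}
```

The term is zero exactly when $\mu = 0$ and $\sigma = 1$, that is, when the posterior coincides with the prior.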
2. Construction of health index
1) 21 time-domain and frequency-domain features, such as the peak-to-peak value, mean and variance, are extracted from the vibration signal of the gear. The extracted features are then input into the VAE for further information fusion and dimension reduction. First, each extracted feature is normalized by removing its mean and scaling it to unit variance:
$$\hat{x}_{i,j} = \frac{x_{i,j} - \mu_j}{\sigma_j}$$
where $x_{i,j}$ is the $i$-th data point of the $j$-th feature, $\hat{x}_{i,j}$ is the normalized value of $x_{i,j}$, and $\mu_j$ and $\sigma_j$ are the mean and standard deviation of the $j$-th feature, respectively.
2) Secondly, to ensure that the influence of any single node neither diverges nor vanishes, the initial weights of the VAE follow the uniform distribution
Figure BDA0002558189100000076
3) Finally, the network structure of the variational autoencoder used to construct the health index is defined as 21-10-1-10-21, as shown in FIG. 2. The input layer of the encoder has 21 nodes, matching the dimension of the extracted features; the hidden layer has 10 nodes; the output layer of the encoder has 1 node, matching the dimension of the output health index. The decoder mirrors the encoder. The reconstruction error between the encoder input and the decoder output is reduced by backpropagation (BP) to optimize the network weights, thereby training the complete VAE.
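A few of the features mentioned in the text can be computed as in the sketch below. Only the peak-to-peak value, mean and variance are named among the 21 features; RMS and frequency center appear elsewhere in the description as comparison indicators, and their inclusion here, along with the amplitude-weighted definition of the frequency center and the function name `basic_features`, is an assumption for illustration.

```python
import numpy as np


def basic_features(signal, fs):
    """Compute an illustrative subset of time- and frequency-domain features.

    signal: 1-D vibration segment; fs: sampling frequency in Hz.
    """
    signal = np.asarray(signal, dtype=float)
    spectrum = np.abs(np.fft.rfft(signal))               # one-sided amplitude spectrum
    freqs = np.fft.rfftfreq(signal.size, d=1.0 / fs)     # matching frequency axis in Hz
    return {
        "peak_to_peak": signal.max() - signal.min(),
        "mean": signal.mean(),
        "variance": signal.var(),
        "rms": np.sqrt(np.mean(signal ** 2)),
        # Amplitude-weighted mean frequency (assumed definition).
        "frequency_center": np.sum(freqs * spectrum) / np.sum(spectrum),
    }
```

Stacking such feature dictionaries over all signal segments gives the feature table that is normalized and fed to the VAE.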
Cocktail long short term memory network (C-LSTM):
In an ordinary neural network the neurons are unordered during training. The ordered-neuron long short-term memory network (ON-LSTM) arranges the neurons in order according to a certain rule so that they form a hierarchy, which allows hierarchical (order) information that an ordinary neural network cannot learn to be used. To further strengthen the ordering of neurons, the cocktail long short-term memory network (C-LSTM) considers the information interaction between adjacent levels and further divides the mixed (middle) level of the ON-LSTM into three levels, namely low-middle, middle and middle-high. This multi-level division lets the neural network use the order information more fully, so that the degradation information in the health index is mined and used more thoroughly and the model's ability to predict the remaining life of the gear is further improved.
High-level information is more important information that needs to be retained in the network for a long time, and is also called long-term information. Conversely, low-level information is less important information that exists in the network only briefly and is updated at any time, and is also called short-term information. When high-level and low-level information interact, the middle level is updated in the same way as in the LSTM, the low-middle level mixes low-level and middle-level information in proportion, and the middle-high level mixes middle-level and high-level information in proportion; these three levels represent medium-term, short-to-medium-term and medium-to-long-term information respectively. When there is no interaction, the middle, low-middle and middle-high levels are set to zero, and the corresponding information does not take part in updating the network.
The invention therefore provides a new LSTM with multi-level update rules. It is named the cocktail LSTM because its information is updated in distinct layers with mixed zones between them, like a layered cocktail. Compared with the ON-LSTM, the C-LSTM uses order information more fully and thus updates the network structure more reasonably. The neural network model is shown in FIG. 3.
The hidden-layer structure of the neural network is shown in FIG. 4.
The multi-level update of the ordered neurons is illustrated in FIGS. 5-7. FIG. 5 shows the hidden-layer update mechanism when $L_l \ge L_{lm} \ge L_{mh} \ge L_h$; FIG. 6 shows the hidden-layer update mechanism when $L_l \ge L_{mh} \ge L_{lm} \ge L_h$; FIG. 7 shows the hidden-layer update mechanism when $L_h \ge L_l$.
Gear remaining life prediction procedure based on the VAE-constructed health index and the C-LSTM
The gear remaining life prediction method combining the VAE and the C-LSTM comprises the following steps:
31. A gear vibration signal of duration $T$ is acquired at intervals of $\Delta T$ until the gear fails; the number of sampled vibration signal segments is $n$.
32. The 21 time-frequency features of each denoised vibration signal are calculated and input into the trained VAE, and the output is normalized to obtain an $n \times 1$ health index matrix $X$.
33. The health indices of the first $m$ sampling points, $X_1 = [y_1, y_2, \ldots, y_m]^T$, are selected as the training matrix.
34. The matrix is reconstructed as
$$U = \begin{bmatrix} y_1 & y_2 & \cdots & y_{m-k} \\ y_2 & y_3 & \cdots & y_{m-k+1} \\ \vdots & \vdots & & \vdots \\ y_{k+1} & y_{k+2} & \cdots & y_m \end{bmatrix}$$
35. The first $k$ rows of the matrix $U$ are taken as the input of the neural network and the last row as its output to train the network.
36. The last $k$ outputs are taken as the network input to obtain the output at the next time step.
37. Steps 34-36 are repeated a certain number of times, and the inverse-normalized output is compared with the actual health index values to verify the effectiveness of the method. Meanwhile, when the inverse-normalized output exceeds the set threshold, the number of predicted sampling points is multiplied by $\Delta T + T$, the sum of the sampling interval and the sampling duration of the vibration signal, and the product is the remaining useful life of the gear.
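Steps 36 and 37 amount to a recursive multi-step forecast followed by a simple multiplication. The sketch below shows that loop with a stand-in predictor; it is an illustration under assumptions, not the C-LSTM itself. The names `forecast_until_threshold`, `remaining_life_seconds` and `predict_next` are hypothetical, inverse normalization is omitted, and the `max_steps` cap is an added safeguard.

```python
import numpy as np


def forecast_until_threshold(history, k, predict_next, threshold, max_steps=1000):
    """Feed the last k values back into predict_next until the threshold is exceeded.

    predict_next is any callable mapping a length-k array to the next
    health index value (a trained network in practice).
    Returns the list of predicted values, ending with the first value above
    the threshold, or after max_steps predictions if it is never exceeded.
    """
    window = list(history[-k:])
    predictions = []
    for _ in range(max_steps):
        nxt = float(predict_next(np.array(window)))
        predictions.append(nxt)
        if nxt > threshold:
            break
        # Slide the window: drop the oldest value, append the new prediction.
        window = window[1:] + [nxt]
    return predictions


def remaining_life_seconds(n_predicted, interval_s, duration_s):
    """RUL = number of predicted points x (sampling interval + sampling duration)."""
    return n_predicted * (interval_s + duration_s)
```

For example, with a linear-extrapolation stand-in `lambda w: w[-1] + (w[-1] - w[-2])`, the history `[0.1, 0.2, 0.3]`, `k=2` and a threshold of 0.55, three points are predicted before the threshold is exceeded; with an interval of 60 s and a duration of 10 s this corresponds to 210 s.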
The above is the proposed neural network model and prediction method, and the following is a part of experimental results to illustrate the effectiveness of the method.
The experimental gearbox uses a speed-increasing first stage and a speed-reducing second stage, so that its overall transmission ratio is exactly 1:1. The test gears are made of 40Cr with grade-5 machining accuracy, a surface hardness of 55 HRC and a module of 5. Specifically, the large gear has 31 teeth, the small gear has 25 teeth, and the face width of the first-stage gear is 21 mm. The gear life tests yielded four full-life-cycle data sets of vibration signals under two operating conditions, as shown in Table 1. The sampling interval was set to 60 s, the sampling duration to 10 s, and the sampling frequency to 50,000 Hz. The VAE was trained on data set 1 and data set 3 so that it could construct health indices (HIs). Data set 2 and data set 4 were then encoded with the trained VAE to construct their HIs.
TABLE 1 Gear Life data-related information Table
Figure BDA0002558189100000091
Most of the acquired samples come from the smooth-running stage and the early-fault stage and contain no gear degradation information, so there is no need to use them to predict the RUL. The present invention therefore uses only part of the samples in each life-cycle data set, calculates the HIs with the VAE, and applies them to the prediction of the gear RUL. The HIs obtained for all four gear data sets are shown in FIG. 8. As can be seen from the figure, the constructed HIs reflect the deterioration of the gear's health condition well, which greatly helps life prediction. In addition, all gear HI curves show a descending trend and their failure thresholds are similar, so a uniform failure threshold can be set for different experiments, which improves the robustness of gear RUL prediction. The failure threshold of the proposed HI also fluctuates less than that of other health indicators such as RMS and frequency center. An HI constructed with the VAE can therefore effectively improve the robustness of gear RUL prediction.
To illustrate the long-term and short-term prediction capabilities of the C-LSTM, we used data set 1 to predict 90, 70 and 50 HI points respectively; the prediction results are shown in FIGS. 9-11.
As the number of known points increases, the prediction capability gradually improves and the predicted values come closer to the actual values. In all of the above cases the degradation trend of the predicted curve is similar to that of the actual curve, and the gear RUL can be estimated well. We then calculated the mean absolute error (MAE) to evaluate the predictive ability of the C-LSTM; prediction experiments with different numbers of known HI points gave the MAEs shown in FIG. 12. It can be concluded that the MAE decreases as the number of known HI points increases. When 50 HI points are predicted, i.e. the true RUL is 50 min, the percentage error of the RUL prediction is 9%; when 70 HI points are predicted, the percentage error of the RUL prediction is 16%. The C-LSTM thus has long-term prediction capability and good prediction accuracy. To further demonstrate the long-term predictive ability of the C-LSTM, we attempted to predict 90 HIs, i.e. a true RUL of one and a half hours. The percentage error of the RUL prediction was calculated to be 36.4%. Therefore, even when the number of known samples is small, the C-LSTM still has a certain prediction capability.
In addition, to fully demonstrate the superiority of the proposed C-LSTM neural network, it was compared with the conventional LSTM and its variant models on four evaluation criteria: mean absolute error (MAE), normalized root mean square error (NRMSE), Score (the IEEE scoring function for life prediction performance), and mean absolute percentage error (MAPE). As shown in FIG. 13, the proposed method has higher prediction accuracy than the conventional LSTMs.
As can be easily seen from FIG. 13, because the ON-LSTM and the C-LSTM mine the order information, they can fully capture the health information contained in the input data; they therefore reach a local optimum more easily and achieve high-precision prediction of the gear remaining useful life more easily than the original LSTM and its variants. In addition, the C-LSTM performs multi-level ordering on the neural units, so that the inter-level order information is fully and reasonably mined and used, and its prediction performance is improved to a certain extent compared with that of the ON-LSTM.
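Three of the evaluation criteria used in this comparison can be sketched as follows. This is a minimal illustration, not the evaluation code behind FIG. 13: the NRMSE is normalized here by the range of the actual values, which is an assumption (the source does not state the normalization), the Score function is omitted because its formula is not given in this text, and the sample values are made up.

```python
import numpy as np


def mae(actual, predicted):
    """Mean absolute error."""
    actual, predicted = np.asarray(actual, float), np.asarray(predicted, float)
    return float(np.mean(np.abs(actual - predicted)))


def nrmse(actual, predicted):
    """RMSE divided by the range of the actual values (assumed normalization)."""
    actual, predicted = np.asarray(actual, float), np.asarray(predicted, float)
    rmse = np.sqrt(np.mean((actual - predicted) ** 2))
    return float(rmse / (actual.max() - actual.min()))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent; actual values must be nonzero."""
    actual, predicted = np.asarray(actual, float), np.asarray(predicted, float)
    return float(100.0 * np.mean(np.abs((actual - predicted) / actual)))


# Made-up HI values, for illustration only
hi_true = [1.0, 0.8, 0.6, 0.4]
hi_pred = [0.9, 0.8, 0.7, 0.4]
print(mae(hi_true, hi_pred), nrmse(hi_true, hi_pred), mape(hi_true, hi_pred))
```

For all three metrics, lower values indicate a closer match between the predicted and actual curves.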
Multi-level guided ordered neuron process
Through a multi-level order-information guiding mechanism, the C-LSTM introduces a new memory cell update rule. With this update mechanism, the C-LSTM has a multi-level update characteristic and can mine and use the order information more fully, which makes it more suitable for time series prediction than the common LSTM and its variants.
By means of a multi-level mechanism, the cocktail LSTM provides a new way of updating the memory cell. According to the hierarchical level of the information, the positions of the level division points are defined as L_l, L_{lm}, L_{mh} and L_h, where L_l is the short-term information division point, L_{lm} is the short-and-medium-term information division point, L_{mh} is the medium-and-long-term information division point, and L_h is the long-term information division point. Based on the current input information x_t and the recurrent information h_{t-1}, the division point of each level is calculated as follows:
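L_h = F_1(x_t, h_{t-1}) = indexmax(softmax(W_{f1} x_t + U_{f1} h_{t-1} + b_{f1}))    (11)
L_{mh} = F_2(x_t, h_{t-1}) = indexmax(softmax(W_{f2} x_t + U_{f2} h_{t-1} + b_{f2}))    (12)
L_{lm} = F_3(x_t, h_{t-1}) = indexmax(softmax(W_{i2} x_t + U_{i2} h_{t-1} + b_{i2}))    (13)
L_l = F_4(x_t, h_{t-1}) = indexmax(softmax(W_{i1} x_t + U_{i1} h_{t-1} + b_{i1}))    (14)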
where W and U are the weights of the input information and the historical information, respectively, b is the bias (threshold), and softmax denotes the softmax function.
To make the parameters learnable, the evaluation of each division point is softened:
dL_1 = softmax(W_{f1} x_t + U_{f1} h_{t-1} + b_{f1})    (15)
dL_2 = softmax(W_{f2} x_t + U_{f2} h_{t-1} + b_{f2})    (16)
dL_3 = softmax(W_{i2} x_t + U_{i2} h_{t-1} + b_{i2})    (17)
dL_4 = softmax(W_{i1} x_t + U_{i1} h_{t-1} + b_{i1})    (18)
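The relation between the hard division points of equations (11)-(14) and their softened counterparts in equations (15)-(18) can be illustrated with a short NumPy sketch. It assumes that indexmax means the position of the largest softmax entry (argmax, 0-based here); the layer sizes, random weights and function names are illustrative and are not taken from the source.

```python
import numpy as np


def softmax(v):
    """Numerically stable softmax over a 1-D vector."""
    e = np.exp(v - np.max(v))
    return e / e.sum()


def division_point(W, U, b, x_t, h_prev):
    """Return (soft distribution, hard index) for one division point.

    soft: softmax(W x_t + U h_{t-1} + b), as in equations (15)-(18)
    hard: index of its largest entry, as in equations (11)-(14)
    """
    soft = softmax(W @ x_t + U @ h_prev + b)
    return soft, int(np.argmax(soft))


# Illustrative sizes and random weights (assumptions, not values from the source)
rng = np.random.default_rng(0)
n_in, k = 3, 8                     # input size, number of hidden units
x_t, h_prev = rng.normal(size=n_in), rng.normal(size=k)
W, U, b = rng.normal(size=(k, n_in)), rng.normal(size=(k, k)), np.zeros(k)

d_L1, L_h = division_point(W, U, b, x_t, h_prev)
print(L_h, d_L1.sum())
```

The hard index is piecewise constant in the weights and so provides no useful gradient, whereas the softmax output is a differentiable distribution over the k positions, which is why the softened form makes the parameters trainable.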
The memory cell c_t is updated based on the hierarchical relationship formed by these division points. Note that the interaction between L_{lm} and L_{mh} exists only on the premise that L_l and L_h interact. Considering the relationships among L_l, L_{lm}, L_{mh} and L_h, there are only 3 update modes:
1) When L_l ≥ L_{lm} ≥ L_{mh} ≥ L_h, the long-term information and the short-term information overlap, and the short-term memory and the medium-term memory also overlap. The low-level information is directly replaced by the current input, the high-level information is retained for a long time, and the intermediate-level information (middle-low, middle and middle-high) is mixed in different proportions, so the memory cell c_t is updated by the following rule:
Figure BDA0002558189100000111
where z_1 and z_2 are learnable scale parameters that represent the proportion of the middle-level information in the two mixed levels, middle-low and middle-high; k is defined as the number of hidden-layer units; and the other parameters have the same meaning as in the LSTM.
2) When L_l ≥ L_{mh} ≥ L_{lm} ≥ L_h, the long-term information and the short-term information overlap, but the short-term memory and the medium-term memory do not overlap. The low-level information is directly replaced by the current input, the high-level information is retained for a long time, the middle-low and middle-high levels are mixed in different proportions, and the middle-level information is set to zero because there is no information interaction, so the memory cell c_t is updated by the following rule:
Figure BDA0002558189100000112
The parameters have the same meaning as above.
3) When L_l ≤ L_h, the level of the current input is lower than the level of the historical data, and there is no basis for overlap between the long-term information and the short-term information. The low-level information is directly replaced by the current input, the high-level information is retained for a long time, and the intermediate-level information is set to zero because there is no information interaction, so the memory cell c_t is updated by the following rule:
Figure BDA0002558189100000113
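The choice among the three update modes depends only on the ordering of the four division points, which can be sketched as a small selector. This is a hedged illustration: the function name is hypothetical, the conditions are checked in the order listed in the text, and the handling of ties and of orderings that match none of the three conditions is an assumption of the sketch, since the text does not address them.

```python
def update_mode(L_l, L_lm, L_mh, L_h):
    """Return 1, 2 or 3 for the matching update mode, or None if none matches.

    Conditions are checked in the order listed in the text; when several
    hold at once (possible with equalities), the first match wins. That
    tie-breaking order is an assumption of this sketch.
    """
    if L_l >= L_lm >= L_mh >= L_h:
        return 1  # long/short-term overlap and short/medium-term overlap
    if L_l >= L_mh >= L_lm >= L_h:
        return 2  # long/short-term overlap, no short/medium-term overlap
    if L_l <= L_h:
        return 3  # no overlap between long-term and short-term information
    return None


print(update_mode(7, 5, 3, 1))  # mode 1
print(update_mode(7, 3, 5, 1))  # mode 2
print(update_mode(1, 4, 4, 7))  # mode 3
```

The selector only identifies which rule applies; the update rules themselves are given by the corresponding equations.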
With the partitioning of the multi-level information and the use of the corresponding update rules, a variant structure of the LSTM neural network, the C-LSTM, is obtained, which is formulated as follows:
Figure BDA0002558189100000121
wherein
Figure BDA0002558189100000122
represents the low level,
Figure BDA0002558189100000123
represents the middle-low level,
Figure BDA0002558189100000124
represents the middle level,
Figure BDA0002558189100000125
represents the middle-high level,
Figure BDA0002558189100000126
represents the high level, and
Figure BDA0002558189100000127
Figure BDA0002558189100000128
respectively represent the main forget gate, the auxiliary forget gate, the main input gate and the auxiliary input gate.
Finally, the above embodiments are only intended to illustrate the technical solutions of the present invention and not to limit the present invention, and although the present invention has been described in detail with reference to the preferred embodiments, it will be understood by those skilled in the art that modifications or equivalent substitutions may be made on the technical solutions of the present invention without departing from the spirit and scope of the technical solutions, and all of them should be covered by the claims of the present invention.

Claims (4)

1. A gear remaining life prediction method based on a cocktail long short-term memory neural network, characterized by comprising the following steps:
S1: constructing a gear health index based on a variational autoencoder (VAE);
S2: defining the cocktail long short-term memory network C-LSTM;
S3: predicting the gear remaining life based on the VAE-constructed health index and the C-LSTM.
2. The cocktail long short term memory neural network-based gear remaining life prediction method as claimed in claim 1, wherein: the S1 specifically includes:
S11: analyzing the principle of the variational autoencoder
The variational autoencoder is a variational Bayesian inference framework consisting of an encoding layer that models the posterior distribution and a decoding layer that models the prior distribution; for a randomly generated dataset
Figure FDA0002558189090000011
containing an unobservable continuous random variable vector z, the encoder learns from the data a specific posterior distribution q(z_i|x_i) for each data point x_i; the objective function of the variational autoencoder is written as:
Figure FDA0002558189090000012
wherein D_{KL}(q||p) denotes the Kullback-Leibler divergence, a measure of how one probability distribution differs from another; λ is the weight of the reconstruction error, and
Figure FDA0002558189090000013
is the reconstruction error of the autoencoder; q(z_i|x_i) follows the normal distribution q(z_i|x_i) = N(z_i; μ_i, (σ_i)^2), and p(z_i) is set to the centered isotropic multivariate Gaussian N(z_i; 0, I); the reconstruction error is the mean square error; the above objective function is then rewritten as:
Figure FDA0002558189090000014
wherein J is the dimension of x_i, and
Figure FDA0002558189090000015
is defined as the j-th element of x_i;
S12: constructing the health index
1) Time-domain and frequency-domain features are extracted from the vibration signal of the gear and then input into the VAE for further information fusion and dimension reduction; first, each extracted feature is standardized by removing its mean and scaling it to unit variance:
Figure FDA0002558189090000016
where x_{i,j} denotes the i-th data point of the j-th feature,
Figure FDA0002558189090000017
denotes the normalized value of x_{i,j}; μ_j and σ_j denote the mean and the standard deviation of the j-th feature, respectively;
2) Secondly, to ensure that the effect of a single node neither diverges nor converges, the initial weights of the VAE are made to follow a uniform distribution
Figure FDA0002558189090000021
3) Finally, the network structure of the variational autoencoder used to construct the health index is defined as 21-10-1-10-21, wherein the input layer of the encoder has 21 nodes, corresponding to the dimension of the extracted features; the hidden layer has 10 nodes; and the output layer has 1 node, corresponding to the dimension of the output health index; the decoder mirrors this structure in reverse, and the reconstruction error between the encoder input and the decoder output is reduced based on the BP (back-propagation) rule to optimize the network weights, thereby training the complete VAE.
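The health-index construction of steps S11 and S12 can be sketched in NumPy as a single forward pass through a 21-10-1-10-21 network. This is a minimal sketch under stated assumptions, not the patented implementation: the tanh activations, the omission of biases, the Glorot-style bound of the uniform initialization (the source gives its bounds only in an equation image), the use of the latent mean as the health index, and the random stand-in features are all assumptions, and the back-propagation training step is left out.

```python
import numpy as np

rng = np.random.default_rng(0)


def zscore(features):
    """Remove each feature's mean and scale it to unit variance (column-wise)."""
    mu, sigma = features.mean(axis=0), features.std(axis=0)
    return (features - mu) / sigma


def uniform_init(n_in, n_out):
    """Uniform weight init; the Glorot-style bound is an assumption of this sketch."""
    bound = np.sqrt(6.0 / (n_in + n_out))
    return rng.uniform(-bound, bound, size=(n_in, n_out))


# Encoder 21 -> 10 -> (mean, log-variance of a 1-D latent); decoder 1 -> 10 -> 21
W_e1 = uniform_init(21, 10)
W_mu, W_lv = uniform_init(10, 1), uniform_init(10, 1)
W_d1, W_d2 = uniform_init(1, 10), uniform_init(10, 21)


def vae_forward(x, lam=1.0):
    """One forward pass; returns (health index per sample, scalar loss)."""
    h = np.tanh(x @ W_e1)
    mu, log_var = h @ W_mu, h @ W_lv
    # Reparameterization: z = mu + sigma * eps, with eps ~ N(0, 1)
    z = mu + np.exp(0.5 * log_var) * rng.standard_normal(mu.shape)
    x_hat = np.tanh(z @ W_d1) @ W_d2
    # Closed-form KL(N(mu, sigma^2) || N(0, 1)), summed over latent dimensions
    kl = -0.5 * np.sum(1 + log_var - mu**2 - np.exp(log_var), axis=1)
    mse = np.mean((x - x_hat) ** 2, axis=1)  # mean square reconstruction error
    return mu.ravel(), float(np.mean(kl + lam * mse))


raw = rng.normal(loc=5.0, scale=2.0, size=(100, 21))  # stand-in for 21 extracted features
x = zscore(raw)
hi, loss = vae_forward(x)
print(hi.shape, loss)
```

In this sketch `lam` plays the role of the weight λ of the reconstruction error; minimizing the returned loss with respect to the weights would correspond to the training described in step 3).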
3. The cocktail long short term memory neural network-based gear remaining life prediction method as claimed in claim 2, wherein: the S3 specifically includes:
S31: acquiring a gear vibration signal of duration T at intervals of ΔT until the gear fails, the number of sampled vibration signal segments being n;
S32: calculating 21 time-frequency features of each noise-reduced vibration signal, inputting them into the trained VAE, and normalizing the output to obtain an n×1 health index matrix X;
S33: selecting the health index matrix X1 = [y_1, y_2, …, y_m]^T formed by the first m sampling points as the training matrix;
S34: reconstructing the matrix
Figure FDA0002558189090000022
S35: taking the first k rows of the matrix U as the input of the neural network and the last row as its output to train the network;
S36: taking the last k outputs as the network input to obtain the output at the next moment;
S37: repeating steps S34-S36 a certain number of times, and comparing the inverse-normalized output with the actual health index values to verify the effectiveness of the method; meanwhile, when the inverse-normalized output exceeds the set threshold, the number of predicted sampling points multiplied by the sum ΔT + T of the interval time and the sampling time of the vibration signal gives the remaining useful life of the gear.
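Steps S33 to S37 amount to a sliding-window, recursive multi-step forecast followed by a threshold test. The sketch below illustrates that flow under several assumptions: the matrix U is taken to be the (k+1) x (m-k) matrix whose columns are consecutive windows of the health index (the source defines U only in an equation image), a least-squares linear predictor stands in for the C-LSTM, the health index is assumed to decrease toward the threshold, and the data are synthetic. All function names are hypothetical.

```python
import numpy as np


def build_window_matrix(y, k):
    """Stack sliding windows of length k+1 as columns: shape (k+1, m-k)."""
    m = len(y)
    return np.array([y[j:j + k + 1] for j in range(m - k)]).T


def fit_linear_predictor(U):
    """Least-squares one-step predictor (a simple stand-in for the C-LSTM)."""
    inputs, targets = U[:-1].T, U[-1]          # first k rows -> last row
    A = np.hstack([inputs, np.ones((inputs.shape[0], 1))])
    coef, *_ = np.linalg.lstsq(A, targets, rcond=None)
    return lambda window: float(np.append(window, 1.0) @ coef)


def predict_rul(y_known, k, threshold, step_minutes, max_steps=1000):
    """Roll the predictor forward until the HI falls to the threshold."""
    predict = fit_linear_predictor(build_window_matrix(y_known, k))
    history = list(y_known)
    for step in range(1, max_steps + 1):
        nxt = predict(np.array(history[-k:]))   # last k outputs as the next input
        history.append(nxt)
        if nxt <= threshold:                    # assumes a decreasing HI
            return step * step_minutes          # predicted points x (delta_T + T)
    return None                                 # threshold not reached


# Synthetic, linearly decreasing HI from 1.00 to 0.61 (illustration only)
y_known = np.linspace(1.0, 0.61, 40)
print(predict_rul(y_known, k=5, threshold=0.5, step_minutes=1.0))
```

Here `step_minutes` plays the role of ΔT + T, so the returned value is the number of predicted points multiplied by the time between consecutive health index samples.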
4. The cocktail long short term memory neural network-based gear remaining life prediction method as claimed in claim 1, wherein: in the C-LSTM, according to the hierarchical level of the information, the positions of the level division points are defined as L_l, L_{lm}, L_{mh} and L_h, where L_l is the short-term information division point, L_{lm} is the short-and-medium-term information division point, L_{mh} is the medium-and-long-term information division point, and L_h is the long-term information division point; based on the current input information x_t and the recurrent information h_{t-1}, the division point of each level is calculated as follows:
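L_h = F_1(x_t, h_{t-1}) = indexmax(softmax(W_{f1} x_t + U_{f1} h_{t-1} + b_{f1}))    (11)
L_{mh} = F_2(x_t, h_{t-1}) = indexmax(softmax(W_{f2} x_t + U_{f2} h_{t-1} + b_{f2}))    (12)
L_{lm} = F_3(x_t, h_{t-1}) = indexmax(softmax(W_{i2} x_t + U_{i2} h_{t-1} + b_{i2}))    (13)
L_l = F_4(x_t, h_{t-1}) = indexmax(softmax(W_{i1} x_t + U_{i1} h_{t-1} + b_{i1}))    (14)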
where W and U are the weights of the input information and the historical information, respectively, b is the bias (threshold), and softmax denotes the softmax function;
to make the parameters learnable, the evaluation of each division point is softened:
dL_1 = softmax(W_{f1} x_t + U_{f1} h_{t-1} + b_{f1})    (15)
dL_2 = softmax(W_{f2} x_t + U_{f2} h_{t-1} + b_{f2})    (16)
dL_3 = softmax(W_{i2} x_t + U_{i2} h_{t-1} + b_{i2})    (17)
dL_4 = softmax(W_{i1} x_t + U_{i1} h_{t-1} + b_{i1})    (18)
the memory cell c_t is updated based on the hierarchical relationship formed by these division points, and the interaction between L_{lm} and L_{mh} exists only on the premise that L_l and L_h interact;
based on the relationships among L_l, L_{lm}, L_{mh} and L_h, there are 3 update modes:
1) when L_l ≥ L_{lm} ≥ L_{mh} ≥ L_h, the long-term information and the short-term information overlap, and the short-term memory and the medium-term memory also overlap; the low-level information is directly replaced by the current input, the high-level information is retained for a long time, the intermediate-level information, including middle-low, middle and middle-high information, is mixed in different proportions, and the memory cell c_t is updated by the following rule:
Figure FDA0002558189090000031
where z_1 and z_2 are learnable scale parameters that represent the proportion of the middle-level information in the two mixed levels, middle-low and middle-high; k is defined as the number of hidden-layer units; and the other parameters have the same meaning as in the LSTM;
2) when L_l ≥ L_{mh} ≥ L_{lm} ≥ L_h, the long-term information and the short-term information overlap, but the short-term memory and the medium-term memory do not overlap; the low-level information is directly replaced by the current input, the high-level information is retained for a long time, the middle-low and middle-high levels are mixed in different proportions, the middle-level information is set to zero because there is no information interaction, and the memory cell c_t is updated by the following rule:
Figure FDA0002558189090000041
3) when L_l ≤ L_h, the level of the current input is lower than the level of the historical data, and there is no basis for overlap between the long-term information and the short-term information; the low-level information is directly replaced by the current input, the high-level information is retained for a long time, the intermediate-level information is set to zero because there is no information interaction, and the memory cell c_t is updated by the following rule:
Figure FDA0002558189090000042
with the partitioning of multi-level information and the use of corresponding update rules, the derivation formula of C-LSTM, a variant structure of LSTM neural network, is as follows:
Figure FDA0002558189090000043
wherein
Figure FDA0002558189090000044
represents the low level,
Figure FDA0002558189090000045
represents the middle-low level,
Figure FDA0002558189090000046
represents the middle level,
Figure FDA0002558189090000047
represents the middle-high level,
Figure FDA0002558189090000048
represents the high level, and
Figure FDA0002558189090000049
Figure FDA00025581890900000410
respectively represent the main forget gate, the auxiliary forget gate, the main input gate and the auxiliary input gate.
CN202010599670.5A 2020-06-28 2020-06-28 Method for predicting residual life of gear based on cocktail long-short-term memory neural network Active CN111723527B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010599670.5A CN111723527B (en) 2020-06-28 2020-06-28 Method for predicting residual life of gear based on cocktail long-short-term memory neural network

Publications (2)

Publication Number Publication Date
CN111723527A true CN111723527A (en) 2020-09-29
CN111723527B CN111723527B (en) 2024-04-16

Family

ID=72569149

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010599670.5A Active CN111723527B (en) 2020-06-28 2020-06-28 Method for predicting residual life of gear based on cocktail long-short-term memory neural network

Country Status (1)

Country Link
CN (1) CN111723527B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018071005A1 (en) * 2016-10-11 2018-04-19 Hitachi, Ltd. Deep long short term memory network for estimation of remaining useful life of the components
CN108595409A (en) * 2018-03-16 2018-09-28 上海大学 A kind of requirement documents based on neural network and service document matches method
CN109726524A (en) * 2019-03-01 2019-05-07 哈尔滨理工大学 A kind of rolling bearing remaining life prediction technique based on CNN and LSTM
CN110210126A (en) * 2019-05-31 2019-09-06 重庆大学 A kind of prediction technique of the gear remaining life based on LSTMPP
CN110570035A (en) * 2019-09-02 2019-12-13 上海交通大学 people flow prediction system for simultaneously modeling space-time dependency and daily flow dependency
KR20200002665A (en) * 2018-06-29 2020-01-08 성균관대학교산학협력단 Prognostics and health management systems for component of vehicle and methods thereof
CN111047482A (en) * 2019-11-14 2020-04-21 华中师范大学 Knowledge tracking system and method based on hierarchical memory network

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
SHENG XIANG et al.: "Cocktail LSTM and Its Application Into Machine Remaining Useful Life Prediction", IEEE/ASME TRANSACTIONS ON MECHATRONICS, vol. 28, no. 5, 24 February 2023 (2023-02-24), pages 2425 - 2436 *

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112926505A (en) * 2021-03-24 2021-06-08 重庆大学 Rotating machine health index construction method based on DTC-VAE neural network
CN112926505B (en) * 2021-03-24 2022-11-11 重庆大学 Rotating machine health index construction method based on DTC-VAE neural network
CN112966400A (en) * 2021-04-23 2021-06-15 重庆大学 Centrifugal fan trend prediction method based on multi-source information fusion
CN113566953A (en) * 2021-09-23 2021-10-29 中国空气动力研究与发展中心设备设计与测试技术研究所 Online monitoring method for flexible-wall spray pipe
CN113566953B (en) * 2021-09-23 2021-11-30 中国空气动力研究与发展中心设备设计与测试技术研究所 Online monitoring method for flexible-wall spray pipe
CN113836822A (en) * 2021-10-28 2021-12-24 重庆大学 Aero-engine service life prediction method based on MCLSTM model
CN114246563A (en) * 2021-12-17 2022-03-29 重庆大学 Intelligent heart and lung function monitoring equipment based on millimeter wave radar
CN114246563B (en) * 2021-12-17 2023-11-17 重庆大学 Heart and lung function intelligent monitoring equipment based on millimeter wave radar
CN115542172A (en) * 2022-12-01 2022-12-30 湖北工业大学 Power battery fault detection method, system, device and storage medium

Also Published As

Publication number Publication date
CN111723527B (en) 2024-04-16

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant