CN111723527B - Method for predicting residual life of gear based on cocktail long-short-term memory neural network - Google Patents

Method for predicting residual life of gear based on cocktail long-short-term memory neural network

Info

Publication number
CN111723527B
Authority
CN
China
Prior art keywords
information
level
term
short
long
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010599670.5A
Other languages
Chinese (zh)
Other versions
CN111723527A (en)
Inventor
秦毅
项盛
陈定粮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing University
Original Assignee
Chongqing University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing University filed Critical Chongqing University
Priority to CN202010599670.5A priority Critical patent/CN111723527B/en
Publication of CN111723527A publication Critical patent/CN111723527A/en
Application granted granted Critical
Publication of CN111723527B publication Critical patent/CN111723527B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 30/00 Computer-aided design [CAD]
    • G06F 30/20 Design optimisation, verification or simulation
    • G06F 30/27 Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01M TESTING STATIC OR DYNAMIC BALANCE OF MACHINES OR STRUCTURES; TESTING OF STRUCTURES OR APPARATUS, NOT OTHERWISE PROVIDED FOR
    • G01M 13/00 Testing of machine parts
    • G01M 13/02 Gearings; Transmission mechanisms
    • G01M 13/021 Gearings
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01M TESTING STATIC OR DYNAMIC BALANCE OF MACHINES OR STRUCTURES; TESTING OF STRUCTURES OR APPARATUS, NOT OTHERWISE PROVIDED FOR
    • G01M 13/00 Testing of machine parts
    • G01M 13/02 Gearings; Transmission mechanisms
    • G01M 13/028 Acoustic or vibration analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 30/00 Computer-aided design [CAD]
    • G06F 30/10 Geometric CAD
    • G06F 30/17 Mechanical parametric or variational design
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 2119/00 Details relating to the type or aim of the analysis or the optimisation
    • G06F 2119/04 Ageing analysis or optimisation against ageing

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Geometry (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Hardware Design (AREA)
  • Acoustics & Sound (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Optimization (AREA)
  • Mathematical Analysis (AREA)
  • Computational Mathematics (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Testing Of Devices, Machine Parts, Or Other Structures Thereof (AREA)

Abstract

The invention relates to a method for predicting the residual life of a gear based on a cocktail long short-term memory neural network, and belongs to the technical field of automation. The method comprises the following steps: S1: constructing a gear health index based on variational auto-encoding; S2: defining a cocktail long short-term memory network (C-LSTM); S3: predicting the residual life of the gear based on the health index constructed by the VAE and the C-LSTM. In the method, a health index that accurately reflects the degradation trend of the gear health state is first constructed with a variational auto-encoder (VAE); unknown health indexes are then predicted step by step with the proposed cocktail long short-term memory network, and the predicted RUL is obtained when the predicted health index reaches a set threshold.

Description

Method for predicting residual life of gear based on cocktail long-short-term memory neural network
Technical Field
The invention belongs to the technical field of automation, and relates to a method for predicting the residual life of a gear based on a cocktail long short-term memory neural network.
Background
Gears are key components widely used in industrial equipment such as wind turbines, automobiles, and aircraft engines. Gear faults such as pitting, spalling, and other fatigue damage often trigger cascading failures of the whole machine, cause shutdowns, and in severe cases lead to casualties, causing huge economic losses and safety crises. The remaining useful life (RUL) of a gear is defined as the length of time from the current moment to the end of its useful life, and predicting it is one possible strategy for planning equipment maintenance and avoiding unexpected gear failures. Predicting the life of in-service gears helps determine maintenance timing, improves production efficiency, ensures continuous and efficient production, reduces the accident rate, and prevents sudden accidents, which is of great significance to engineering production.
Because of its great significance to industrial production, gear RUL prediction has attracted increasing attention from scholars and researchers. The present invention therefore provides an intelligent, online, small-sample-oriented method for predicting the remaining useful life of gears.
Disclosure of Invention
Accordingly, the present invention is directed to a method for predicting the remaining life of gears based on a cocktail long short-term memory neural network.
In order to achieve the above purpose, the present invention provides the following technical solutions:
The method for predicting the residual life of a gear based on the cocktail long short-term memory neural network comprises the following steps:
S1: constructing a gear health index based on variational auto-encoding;
S2: defining a cocktail long short-term memory network (C-LSTM);
S3: predicting the residual life of the gear based on the health index constructed by the VAE and the C-LSTM.
Optionally, S1 specifically comprises:
S11: analysis of the variational auto-encoder principle
The variational auto-encoding model is, in essence, a variational Bayesian inference framework composed of an encoding layer that approximates the posterior distribution and a decoding layer that approximates the prior distribution; for a randomly generated data set containing an unobservable continuous latent variable vector Z, the encoder learns, for each data sample x_i, a dedicated posterior distribution q(z_i|x_i); the objective function of the variational auto-encoder is written as:
where D_KL(q||p) denotes the Kullback-Leibler divergence, a measure of the difference between one probability distribution and another, and λ is the weight of the auto-encoder's reconstruction error; q(z_i|x_i) obeys the normal distribution q(z_i|x_i) = N(z_i; μ_i, (σ_i)^2), and p(z_i) is set to a centered isotropic multivariate Gaussian distribution N(z_i; 0, 1); the reconstruction error adopts the mean square error, so the objective function above is rewritten as:
where J is the number of dimensions of x_i and x_i^(j) denotes the j-th element of x_i;
s12: construction of health index
1) Extracting time domain and frequency domain features from vibration signals of the gear; then, inputting the extracted features into the VAE for further information fusion and dimension reduction; first, the features extracted above are normalized from each feature by removing the mean and scaling to unity variance:
wherein x is i,j The ith data point representing the jth feature,represents x i,j Is a normalized value of (2); mu (mu) j Sum sigma j Mean and standard deviation of the j-th feature;
2) Secondly to ensure that the effect of a single node is neither divergent nor divergentConvergence to make the VAE initial weight follow uniform distribution
3) Finally, defining the network structure of the variation self-coding used for constructing the health value target as 21-10-1-10-21; wherein the self-encoder input layer is 21, corresponding to the extracted feature dimension; the hidden layer is 10; the output layer is 1, and the dimension of the health index is correspondingly output; on the contrary, the decoder reduces the reconstruction error of the input of the encoder and the output of the decoder based on BP rule, thereby achieving the purpose of optimizing the network weight and further training the complete VAE.
Optionally, S3 specifically comprises:
S31: collecting vibration signals of the gear of duration T at intervals of Δt until the gear fails; the number of sampled vibration-signal segments is n;
S32: calculating 21 time-frequency features from the denoised vibration signals, inputting them into the trained VAE, and normalizing the output to obtain an n×1 health index matrix X;
S33: selecting the health index matrix X1 = [y_1, y_2, …, y_m]^T composed of the first m sampling points as the training matrix;
S34: constructing the reconstruction matrix U from the training matrix;
S35: taking the first k rows of the matrix U as the input of the neural network and the last row as its output to train the network;
S36: taking the last k outputs as the network input to obtain the output at the next moment;
S37: repeating steps S34-S36 a certain number of times, and comparing the de-normalized outputs with the actual health index values to verify the effectiveness of the method; when a de-normalized output exceeds the set threshold, the remaining life of the gear is the number of predicted sampling points multiplied by (Δt + T), the sum of the vibration-signal interval time and the sampling time.
Optionally, in the C-LSTM, the level division points are defined according to the level of the information as L_l, L_lm, L_mh and L_h, where L_l is the short-term information division point, L_lm the short-to-medium-term information division point, L_mh the medium-to-long-term information division point, and L_h the long-term information division point; based on the current input information x_t and the recursive information h_{t-1}, the hierarchical information is calculated as follows:
L_h = F_1(x_t, h_{t-1}) = indexmax(softmax(W_{f1} x_t + U_{f1} h_{t-1} + b_{f1}))   (11)
L_mh = F_2(x_t, h_{t-1}) = indexmax(softmax(W_{f2} x_t + U_{f2} h_{t-1} + b_{f2}))   (12)
L_lm = F_3(x_t, h_{t-1}) = indexmax(softmax(W_{i2} x_t + U_{i2} h_{t-1} + b_{i2}))   (13)
L_l = F_4(x_t, h_{t-1}) = indexmax(softmax(W_{i1} x_t + U_{i1} h_{t-1} + b_{i1}))   (14)
where W and U are the weights of the input information and the historical information respectively, b is the bias (threshold) term, and softmax is the softmax function;
To make the parameters learnable, the segmentation point evaluation process is softened:
d_{L1} = softmax(W_{f1} x_t + U_{f1} h_{t-1} + b_{f1})   (15)
d_{L2} = softmax(W_{f2} x_t + U_{f2} h_{t-1} + b_{f2})   (16)
d_{L3} = softmax(W_{i2} x_t + U_{i2} h_{t-1} + b_{i2})   (17)
d_{L4} = softmax(W_{i1} x_t + U_{i1} h_{t-1} + b_{i1})   (18)
The memory cell c_t is updated based on the hierarchical relationship formed by the division points; the interaction between L_lm and L_mh exists only on the premise that L_l and L_h interact.
Based on the interrelationship among L_l, L_lm, L_mh and L_h, there are three update modes:
1) When L_l ≥ L_lm ≥ L_mh ≥ L_h, the short-term and medium-term memories overlap each other on the basis of the overlap between the long-term and short-term information; the low-level information is directly replaced by the current input, the high-level information is retained for a long time, and the middle-level information (middle-low, middle and middle-high) is mixed in different proportions, so the cell memory unit c_t is updated by the following rule:
where z_1 and z_2 are learnable proportion parameters representing the proportion of middle-level information in the middle-low and middle-high levels respectively, k is defined as the number of hidden-layer units, and the other parameters have the same meaning as in LSTM;
2) When L_l ≥ L_mh ≥ L_lm ≥ L_h, the long-term and short-term information overlap, but the short-term and medium-term memories do not overlap each other; the low-level information is directly replaced by the current input, the high-level information is retained for a long time, the middle-low and middle-high levels are mixed in different proportions, and the middle-level information is set to zero because there is no information interaction, so the cell memory unit c_t is updated by the following rule:
3) When L_l ≤ L_h, the level of the current input is lower than the level of the historical data, so there is no basis for overlap between the long-term information and the short-term information; the low-level information is directly replaced by the current input, the high-level information is retained for a long time, and the middle-level information is set to zero because there is no information interaction, so the cell memory unit c_t is updated by the following rule:
with the use of multi-level information partitioning and corresponding updating rules, a variant structure of an LSTM neural network, C-LSTM, is derived as follows:
where the five level symbols denote the low, middle-low, middle, middle-high and high levels respectively, and the four gate symbols denote the main forget gate, the secondary forget gate, the main input gate and the secondary input gate respectively.
The invention has the following beneficial effects: a health index that accurately reflects the degradation trend of the gear health state is first constructed with a variational auto-encoder (VAE); unknown health indexes are then predicted step by step with the proposed cocktail long short-term memory network, and the predicted RUL is obtained when the predicted health index reaches a set threshold.
Additional advantages, objects, and features of the invention will be set forth in part in the description which follows and in part will become apparent to those having ordinary skill in the art upon examination of the following or may be learned from practice of the invention. The objects and other advantages of the invention may be realized and obtained by means of the instrumentalities and combinations particularly pointed out in the specification.
Drawings
For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in detail below with reference to the accompanying drawings, in which:
FIG. 1 is a schematic diagram of a VAE neural network;
FIG. 2 is a block diagram of a VAE neural network;
FIG. 3 is a block diagram of a C-LSTM neural network;
FIG. 4 is a schematic diagram of a hidden layer of the C-LSTM neural network;
FIG. 5 shows the hidden-layer update mechanism when L_l ≥ L_lm ≥ L_mh ≥ L_h;
FIG. 6 shows the hidden-layer update mechanism when L_l ≥ L_mh ≥ L_lm ≥ L_h;
FIG. 7 shows the hidden-layer update mechanism when L_h ≥ L_l;
FIG. 8 is a health index graph of an experimental dataset;
FIG. 9 shows failure threshold values, training values, predicted values and actual values when 90 HIs are predicted;
FIG. 10 shows failure threshold, training value, predicted value and actual value when 70 HIs are predicted;
FIG. 11 shows failure threshold, training value, predicted value and actual value when 50 HIs are predicted;
FIG. 12 is a graph of MAE for model predictive power at various sample points;
FIG. 13 is a comparison of predictive power of different models in predicting 60 health indicators.
Detailed Description
Other advantages and effects of the present invention will become apparent to those skilled in the art from the disclosure below, which describes embodiments of the present invention with reference to specific examples. The invention may also be practiced or applied through other different embodiments, and the details of this description may be modified or varied in various respects without departing from the spirit of the invention. It should be noted that the illustrations provided in the following embodiments merely illustrate the basic idea of the present invention in a schematic way, and the following embodiments and the features in the embodiments may be combined with each other provided there is no conflict.
Wherein the drawings are for illustrative purposes only and are shown in schematic, non-physical, and not intended to limit the invention; for the purpose of better illustrating embodiments of the invention, certain elements of the drawings may be omitted, enlarged or reduced and do not represent the size of the actual product; it will be appreciated by those skilled in the art that certain well-known structures in the drawings and descriptions thereof may be omitted.
The same or similar reference numbers in the drawings of the embodiments of the invention correspond to the same or similar components. In the description of the invention, terms such as "upper", "lower", "left", "right", "front" and "rear" that indicate an orientation or positional relationship are based on the orientations or positional relationships shown in the drawings; they are used only for convenience of describing the invention and simplifying the description, and do not indicate or imply that the referred device or element must have a specific orientation or be constructed and operated in a specific orientation. Such terms are therefore merely illustrative and should not be construed as limiting the invention; their specific meaning can be understood by those of ordinary skill in the art according to the specific circumstances.
Gear health index construction based on variational auto-encoding
1. Principle of the variational auto-encoder
The VAE inherits the basic structure of the auto-encoder. As shown in FIG. 1, the variational auto-encoding model is essentially a variational Bayesian inference framework consisting of an encoding layer that approximates the posterior distribution and a decoding layer that approximates the prior distribution. For a randomly generated data set containing an unobservable continuous latent variable vector Z, the encoder learns, for each data sample x_i, a dedicated posterior distribution q(z_i|x_i). Therefore, the objective function of the variational auto-encoder can be written as:
where D_KL(q||p) represents the Kullback-Leibler divergence, which is a measure of the difference between one probability distribution and another, and λ is the weight of the auto-encoder's reconstruction error. q(z_i|x_i) obeys the normal distribution q(z_i|x_i) = N(z_i; μ_i, (σ_i)^2), and p(z_i) is set to a centered isotropic multivariate Gaussian distribution N(z_i; 0, 1). The reconstruction error adopts the mean square error, so the objective function above can be rewritten as:
where J is the number of dimensions of x_i and x_i^(j) is defined as the j-th element of x_i.
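The objective-function equations referenced above are rendered as images in the original publication and are not reproduced in this text. For readability, a standard-form sketch consistent with the surrounding definitions (a KL-divergence term plus a λ-weighted mean-square reconstruction error) is given below; the exact constants and sign conventions used in the patent may differ.

```latex
% Assumed standard form of the VAE objective, consistent with the definitions above
\mathcal{L}(x_i) = D_{KL}\bigl(q(z_i \mid x_i)\,\|\,p(z_i)\bigr)
                 + \lambda\,\mathbb{E}_{q(z_i \mid x_i)}\bigl[\lVert x_i - \hat{x}_i \rVert^2\bigr]
% With the Gaussian posterior and the mean-square reconstruction error this becomes
\mathcal{L}(x_i) = \tfrac{1}{2}\sum_{k}\bigl(\mu_{i,k}^2 + \sigma_{i,k}^2 - 1 - \log \sigma_{i,k}^2\bigr)
                 + \frac{\lambda}{J}\sum_{j=1}^{J}\bigl(x_i^{(j)} - \hat{x}_i^{(j)}\bigr)^2
```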
2. Construction of health index
1) 21 time-domain and frequency-domain features, such as the peak-to-peak value, mean and variance, are extracted from the vibration signals of the gears. These extracted features are then input into the VAE for further information fusion and dimensionality reduction. First, each extracted feature is standardized by removing its mean and scaling it to unit variance:
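The standardization formula itself appears as an image in the original publication; from the definitions in the following sentence it takes the usual z-score form, sketched here:

```latex
% Assumed z-score form implied by the definitions of x_{i,j}, \mu_j and \sigma_j
\tilde{x}_{i,j} = \frac{x_{i,j} - \mu_j}{\sigma_j}
```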
where x_{i,j} denotes the i-th data point of the j-th feature, x̃_{i,j} denotes its standardized value, and μ_j and σ_j denote the mean and standard deviation of the j-th feature, respectively.
2) Second, to ensure that the effect of a single node neither diverges nor converges, the initial weights of the VAE are drawn from a uniform distribution.
3) Finally, the network structure of the variational auto-encoder used to construct the health index is defined as 21-10-1-10-21, as shown in FIG. 2: the encoder input layer has 21 nodes, matching the dimension of the extracted features; the hidden layer has 10 nodes; and the output layer has 1 node, matching the dimension of the output health index. Conversely, the decoder reduces the reconstruction error between the encoder input and the decoder output based on the BP (back-propagation) rule, thereby optimizing the network weights and training the complete VAE.
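As a concrete illustration of this 21-10-1-10-21 structure, the following is a minimal sketch in PyTorch. The framework, the sigmoid activations, the uniform-initialization range and the reconstruction weight are assumptions not fixed by the text; only the layer sizes, the uniform weight initialization and the KL-plus-weighted-MSE loss follow the description above.

```python
# Minimal sketch of the 21-10-1-10-21 VAE used to build the health index (HI).
import torch
import torch.nn as nn

class HealthIndexVAE(nn.Module):
    def __init__(self, n_features=21, n_hidden=10, n_latent=1):
        super().__init__()
        self.enc = nn.Linear(n_features, n_hidden)
        self.mu = nn.Linear(n_hidden, n_latent)       # posterior mean
        self.logvar = nn.Linear(n_hidden, n_latent)   # posterior log-variance
        self.dec1 = nn.Linear(n_latent, n_hidden)
        self.dec2 = nn.Linear(n_hidden, n_features)
        # Initial weights drawn from a uniform distribution, as stated in step 2);
        # the range (-0.1, 0.1) is an assumption, not given in the text.
        for m in self.modules():
            if isinstance(m, nn.Linear):
                nn.init.uniform_(m.weight, -0.1, 0.1)
                nn.init.zeros_(m.bias)

    def forward(self, x):
        h = torch.sigmoid(self.enc(x))
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization
        x_hat = self.dec2(torch.sigmoid(self.dec1(z)))
        return x_hat, mu, logvar

def vae_loss(x, x_hat, mu, logvar, lam=1.0):
    # Closed-form KL(q(z|x) || N(0, 1)) plus lambda-weighted MSE reconstruction error
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=1).mean()
    mse = ((x - x_hat) ** 2).mean()
    return kl + lam * mse

# After training, the 1-dimensional latent mean mu(x) serves as the health index.
```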
Cocktail long short-term memory network (C-LSTM):
the neurons are unordered in the general neural network training process, and the neurons are orderly arranged according to a certain rule based ON a long-short-term memory network (ON-LSTM) of the orderly neurons and have a certain hierarchical structure, so that hierarchical information-sequential information which cannot be learned by the general neural network can be used. To further enhance the ordering of neurons, the cocktail long-short-term memory network (C-LSTM) considers the information interaction between two adjacent levels, further dividing the mixed (middle) level of ON-LSTM into three levels, low-medium, medium-high. The multi-level division enables the neural network to more fully utilize the ordered information, so that the degradation information of the neural network to the health index is more fully mined and used, and the prediction capability of the gear residual life of the model is further improved.
High-level information represents more important information that needs to be kept in the network for a long time and is therefore also called long-term information. In contrast, low-level information represents less important information that exists in the network only briefly and is updated at every step, and is also referred to as short-term information. When the high-level and low-level information interact, the middle-level information is updated in the same way as in LSTM, the middle-low level mixes middle-level and low-level information in proportion, and the middle-high level mixes middle-level and high-level information in proportion; these correspond to medium-term, short-to-medium-term and medium-to-long-term information respectively. When there is no interaction, the middle, middle-low and middle-high levels are set to zero and the corresponding information does not participate in updating the network.
Therefore, the invention proposes a new LSTM with multi-level update rules, named cocktail LSTM because its information update scheme has clearly defined layers with mixing zones between them, like a cocktail. Compared with the ON-LSTM, the C-LSTM uses the ordering information more fully, so the network structure is updated more reasonably. The neural network model is shown in FIG. 3.
The hidden layer structure of the neural network is shown in fig. 4.
The flow chart of the multi-level update of the ordered neurons is shown in FIGS. 5-7. FIG. 5 shows the hidden-layer update mechanism when L_l ≥ L_lm ≥ L_mh ≥ L_h; FIG. 6 shows the hidden-layer update mechanism when L_l ≥ L_mh ≥ L_lm ≥ L_h; FIG. 7 shows the hidden-layer update mechanism when L_h ≥ L_l.
Gear residual life prediction method flow based on health index constructed by VAE and C-LSTM
The gear residual life prediction method based on the combination of the VAE and the C-LSTM comprises the following steps:
31. Acquire vibration signals of the gear of duration T at intervals of Δt until the gear fails; the number of sampled vibration-signal segments is n.
32. Calculate 21 time-frequency features from the denoised vibration signals, input them into the trained VAE, and normalize the output to obtain the n×1 health index matrix X.
33. Select the health index matrix X1 = [y_1, y_2, …, y_m]^T composed of the first m sampling points as the training matrix.
34. Construct the reconstruction matrix U from the training matrix.
35. Take the first k rows of the matrix U as the input of the neural network and the last row as its output to train the network.
36. Take the last k outputs as the network input to obtain the output at the next moment.
37. Repeat steps 34-36 a certain number of times, and compare the de-normalized outputs with the actual health index values to verify the effectiveness of the method. When a de-normalized output exceeds the set threshold, the remaining life of the gear is the number of predicted sampling points multiplied by (Δt + T), the sum of the vibration-signal interval time and the sampling time.
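The following sketch illustrates the flow of steps 31-37 in Python. The trained C-LSTM is abstracted behind a model object with fit/predict methods (an assumed interface; the text does not prescribe one), de-normalization is assumed to have been applied to the HI series and the threshold beforehand, and the HI is assumed to grow toward the failure threshold; the window length k, the threshold, Δt and T are supplied by the user.

```python
# Sketch of the rolling prediction flow (steps 33-37); the C-LSTM itself is abstracted.
import numpy as np

def predict_rul(hi, k, model, threshold, dt, t_sample, max_steps=200):
    """hi: 1-D array of the first m (de-normalized) health-index values (step 33)."""
    m = len(hi)
    # Step 34: reconstruction matrix U whose columns are sliding windows of length k+1
    U = np.stack([hi[i:i + k + 1] for i in range(m - k)], axis=1)   # shape (k+1, m-k)
    # Step 35: the first k rows are the network input, the last row is the target
    model.fit(U[:k].T, U[k])
    # Step 36: recursive one-step-ahead prediction from the last k known values
    window = list(hi[-k:])
    preds = []
    for step in range(1, max_steps + 1):
        y_next = float(model.predict(np.array(window[-k:])[None, :])[0])
        preds.append(y_next)
        window.append(y_next)
        # Step 37: once the prediction crosses the failure threshold, the RUL is the
        # number of predicted points times (dt + T); flip the comparison if the HI
        # decreases toward failure instead of increasing.
        if y_next >= threshold:
            return step * (dt + t_sample), np.array(preds)
    return None, np.array(preds)  # threshold not reached within max_steps
```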
The above describes the proposed neural network model and prediction method; the following presents part of the experimental results that demonstrate the effectiveness of the method.
The experiment adopts a speed-increasing first transmission stage and a speed-reducing second stage, so that the overall transmission ratio of the test gearbox is exactly 1:1. The test gears are made of 40Cr with grade-5 machining precision, a surface hardness of 55 HRC, and a module of 5. Specifically, the large gear has 31 teeth, the small gear has 25 teeth, and the width of the first-stage transmission gear is 21 mm. The gear life tests produced four life-cycle vibration-signal data sets under two operating conditions, as shown in Table 1. The sampling interval was set to 60 s, the sampling duration to 10 s, and the sampling frequency to 50000 Hz. The VAE is trained on data set 1 and data set 3 to construct health indicators (HIs); data set 2 and data set 4 are then encoded with the trained VAE to construct their HIs.
Table 1 Information on the gear life data sets
Since most of the collected samples come from the stationary stage and the early stage and contain no gear degradation information, they need not be used for RUL prediction. Therefore, the present invention uses only a portion of the samples in the life-cycle data sets, calculates HIs with the VAE, and applies them to the prediction of the gear RUL. The HIs obtained for all four gear data sets are shown in FIG. 8. As can be seen from the figure, the constructed HIs reflect the degradation trend of the gear health condition well, which greatly helps life prediction. In addition, the HI curves of all gears show a descending trend and their failure thresholds are similar, so a unified failure threshold is set for the different experiments, which improves the robustness of gear RUL prediction. Moreover, the failure threshold of the proposed HI fluctuates less than that of other health indicators such as the RMS and the frequency center. Thus, the HI constructed with the VAE can effectively improve the robustness of gear RUL prediction.
To illustrate the long-term and short-term predictive capabilities of C-LSTM, we use data set 1 to predict 90, 70 and 50 HI points, respectively, with the prediction results shown in FIGS. 9-11.
It is apparent that as the number of known points increases, the prediction capability gradually improves and the predicted values become closer to the actual values. In all of the above cases, the degradation trend of the predicted curve is similar to that of the actual curve, so the gear RUL can be estimated well. We then calculate the mean absolute error (MAE) to evaluate the prediction capability of the C-LSTM; the MAEs obtained from prediction experiments with different numbers of known HI points are shown in FIG. 12. It can be concluded that the MAE decreases as the number of known HI points increases. When 50 points are predicted (the true RUL is 50 min), the percentage error of the RUL prediction is 9%, and for the prediction of 70 HI points the percentage error is 16%. The C-LSTM therefore has long-term prediction capability and good prediction accuracy. To further demonstrate the long-term prediction capability of the C-LSTM, we also tried to predict 90 HIs, i.e., a true RUL of an hour and a half; the percentage error of the RUL prediction is 36.4%. It follows that the C-LSTM still has a certain prediction capability even when the number of known samples is small.
In addition, to fully demonstrate the superiority of the proposed C-LSTM neural network, four evaluation criteria, namely MAE (mean absolute error), NRMSE (normalized root mean square error), Score (the IEEE scoring function for life-prediction performance) and MAPE (mean absolute percentage error), were used to compare it with the conventional LSTM and its variant models. As shown in FIG. 13, the proposed method achieves higher prediction accuracy than the traditional LSTMs.
As can easily be seen from FIG. 13, the ON-LSTM and the C-LSTM can fully capture the health information contained in the input data thanks to their mining of level-order information, so a good optimum is easier to reach and high-precision prediction of the remaining service life of the gear is easier to achieve than with the original LSTM and its variants. In addition, the C-LSTM sorts the neural units at multiple levels, so the level-order information is mined and used more fully and reasonably, and its prediction performance is improved to a certain extent compared with the ON-LSTM.
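For reference, three of the four comparison metrics can be computed as below; this is a sketch, the IEEE Score function is omitted because its exact form is not given in this text, and NRMSE is shown with range normalization, which is one common convention.

```python
# Helper functions for MAE, NRMSE and MAPE as used in the comparison of FIG. 13.
import numpy as np

def mae(y_true, y_pred):
    return np.mean(np.abs(y_true - y_pred))

def nrmse(y_true, y_pred):
    rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))
    return rmse / (np.max(y_true) - np.min(y_true))  # range-normalized (assumed convention)

def mape(y_true, y_pred):
    return 100.0 * np.mean(np.abs((y_true - y_pred) / y_true))
```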
Multi-level guidance process for ordered neurons
A new memory-cell update rule is proposed in the C-LSTM through a multi-level hierarchical information guidance mechanism. With this new update mechanism, the C-LSTM has multi-level update characteristics, so the ordering information can be mined and used more fully, which makes it better suited to time-series prediction than the ordinary LSTM and its variants.
Through the multi-level mechanism, the cocktail LSTM introduces a new way of updating the memory cells. According to the level of the information, the level division points are defined as L_l, L_lm, L_mh and L_h, where L_l is the short-term information division point, L_lm the short-to-medium-term information division point, L_mh the medium-to-long-term information division point, and L_h the long-term information division point. Based on the current input information x_t and the recursive information h_{t-1}, the hierarchical information is calculated as follows:
L_h = F_1(x_t, h_{t-1}) = indexmax(softmax(W_{f1} x_t + U_{f1} h_{t-1} + b_{f1}))   (11)
L_mh = F_2(x_t, h_{t-1}) = indexmax(softmax(W_{f2} x_t + U_{f2} h_{t-1} + b_{f2}))   (12)
L_lm = F_3(x_t, h_{t-1}) = indexmax(softmax(W_{i2} x_t + U_{i2} h_{t-1} + b_{i2}))   (13)
L_l = F_4(x_t, h_{t-1}) = indexmax(softmax(W_{i1} x_t + U_{i1} h_{t-1} + b_{i1}))   (14)
where W and U are the weights of the input information and the historical information respectively, b is the bias (threshold) term, and softmax is the softmax function.
To make the parameters learnable, the segmentation point evaluation process is softened:
d_{L1} = softmax(W_{f1} x_t + U_{f1} h_{t-1} + b_{f1})   (15)
d_{L2} = softmax(W_{f2} x_t + U_{f2} h_{t-1} + b_{f2})   (16)
d_{L3} = softmax(W_{i2} x_t + U_{i2} h_{t-1} + b_{i2})   (17)
d_{L4} = softmax(W_{i1} x_t + U_{i1} h_{t-1} + b_{i1})   (18)
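A small numerical sketch of equations (11)-(18) follows. The parameter container and array shapes are assumptions, but the computation itself, an affine map of x_t and h_{t-1} followed by a softmax and either its argmax index or the softened distribution, is exactly what the equations state.

```python
# Sketch of the level-division points (eqs. 11-14) and their softened versions (eqs. 15-18).
import numpy as np

def softmax(a):
    e = np.exp(a - np.max(a))
    return e / e.sum()

def division_points(x_t, h_prev, params):
    """params: dict with keys 'f1', 'f2', 'i2', 'i1', each holding a (W, U, b) tuple."""
    soft, hard = {}, {}
    for key, name in [('f1', 'L_h'), ('f2', 'L_mh'), ('i2', 'L_lm'), ('i1', 'L_l')]:
        W, U, b = params[key]
        d = softmax(W @ x_t + U @ h_prev + b)   # softened division point (eqs. 15-18)
        soft[name] = d
        hard[name] = int(np.argmax(d))          # indexmax(softmax(...)) in eqs. (11)-(14)
    return hard, soft
```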
The memory cell c_t is updated according to the hierarchical relationship formed by the division points; notably, the interaction between L_lm and L_mh exists only on the premise that L_l and L_h interact. Taking the interrelationship among L_l, L_lm, L_mh and L_h into account, there are only three update modes:
1) When L_l ≥ L_lm ≥ L_mh ≥ L_h, the short-term and medium-term memories overlap each other on the basis of the overlap between the long-term and short-term information. The low-level information is directly replaced by the current input, the high-level information is retained for a long time, and the middle-level information (middle-low, middle and middle-high) is mixed in different proportions, so the cell memory unit c_t is updated by the following rule:
where z_1 and z_2 are learnable proportion parameters that represent the proportion of middle-level information in the middle-low and middle-high levels respectively, k is defined as the number of hidden-layer units, and the other parameters have the same meaning as in LSTM.
2) When L_l ≥ L_mh ≥ L_lm ≥ L_h, the long-term and short-term information overlap, but the short-term and medium-term memories do not overlap each other. The low-level information is directly replaced by the current input, the high-level information is retained for a long time, the middle-low and middle-high levels are mixed in different proportions, and the middle-level information is set to zero because there is no information interaction, so the cell memory unit c_t is updated by the following rule:
the meaning of the parameters is consistent with the above.
3) When L_l ≤ L_h, the level of the current input is lower than the level of the historical data, so there is no basis for overlap between the long-term information and the short-term information. The low-level information is directly replaced by the current input, the high-level information is retained for a long time, and the middle-level information is set to zero because there is no information interaction, so the cell memory unit c_t is updated by the following rule:
with the partitioning of multi-level information and the use of corresponding update rules, a variant structure of an LSTM neural network, C-LSTM, is derived as follows:
where the five level symbols denote the low, middle-low, middle, middle-high and high levels respectively, and the four gate symbols denote the main forget gate, the secondary forget gate, the main input gate and the secondary input gate respectively.
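To make the three cases above concrete, only the branching logic is sketched below; the cell-update formulas themselves and the full C-LSTM equations are given by the patent's figures and are not reproduced in this text, so the mode selection alone is shown.

```python
# Sketch of how the three memory-update modes of the C-LSTM are selected from the
# division points of equations (11)-(14).
def select_update_mode(L_l, L_lm, L_mh, L_h):
    if L_l <= L_h:
        return 3  # no basis for overlap between long-term and short-term information
    if L_l >= L_lm >= L_mh >= L_h:
        return 1  # short-, medium- and long-term information all overlap
    if L_l >= L_mh >= L_lm >= L_h:
        return 2  # long/short overlap, but short- and medium-term memories do not
    # Configurations outside the three modes are not described in this text.
    raise ValueError("division points do not match any of the three update modes")
```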
Finally, it is noted that the above embodiments are only for illustrating the technical solution of the present invention and not for limiting the same, and although the present invention has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications and equivalents may be made thereto without departing from the spirit and scope of the present invention, which is intended to be covered by the claims of the present invention.

Claims (2)

1. A method for predicting the residual life of a gear based on a cocktail long short-term memory neural network, characterized by comprising the following steps:
S1: constructing a gear health index based on variational auto-encoding;
S2: defining a cocktail long short-term memory network (C-LSTM);
S3: predicting the residual life of the gear based on the health index constructed by the VAE and the C-LSTM;
S1 specifically comprises the following steps:
S11: analysis of the variational auto-encoder principle
The variational auto-encoding model is, in essence, a variational Bayesian inference framework composed of an encoding layer that approximates the posterior distribution and a decoding layer that approximates the prior distribution; for a randomly generated data set containing an unobservable continuous latent variable vector Z, the encoder learns, for each data sample x_i, a dedicated posterior distribution q(z_i|x_i); the objective function of the variational auto-encoder is written as:
where D_KL(q||p) denotes the Kullback-Leibler divergence, a measure of the difference between one probability distribution and another, and λ is the weight of the auto-encoder's reconstruction error; q(z_i|x_i) obeys the normal distribution q(z_i|x_i) = N(z_i; μ_i, (σ_i)^2), and p(z_i) is set to a centered isotropic multivariate Gaussian distribution N(z_i; 0, 1); the reconstruction error adopts the mean square error, so the objective function above is rewritten as:
where J is the number of dimensions of x_i and x_i^(j) denotes the j-th element of x_i;
s12: construction of health index
1) Extracting time domain and frequency domain features from vibration signals of the gear; then, inputting the extracted features into the VAE for further information fusion and dimension reduction; first, the features extracted above are normalized from each feature by removing the mean and scaling to unity variance:
wherein x is i,j The ith data point representing the jth feature,represents x i,j Is a normalized value of (2); mu (mu) j Sum sigma j Mean and standard deviation of the j-th feature;
2) Secondly, to ensure that the effect of a single node is neither divergent nor convergent, the VAE initial weights follow a uniform distribution
3) Finally, defining the network structure of the variation self-coding used for constructing the health value target as 21-10-1-10-21; wherein the self-encoder input layer is 21, corresponding to the extracted feature dimension; the hidden layer is 10; the output layer is 1, and the dimension of the health index is correspondingly output; on the contrary, the decoder reduces the reconstruction error of the input of the encoder and the output of the decoder based on BP rule, thereby achieving the purpose of optimizing the network weight and further training the complete VAE;
in the C-LSTM, the level division points are defined according to the level of the information as L_l, L_lm, L_mh and L_h, where L_l is the short-term information division point, L_lm the short-to-medium-term information division point, L_mh the medium-to-long-term information division point, and L_h the long-term information division point; based on the current input information x_t and the recursive information h_{t-1}, the hierarchical information is calculated as follows:
L_h = F_1(x_t, h_{t-1}) = indexmax(softmax(W_{f1} x_t + U_{f1} h_{t-1} + b_{f1}))   (11)
L_mh = F_2(x_t, h_{t-1}) = indexmax(softmax(W_{f2} x_t + U_{f2} h_{t-1} + b_{f2}))   (12)
L_lm = F_3(x_t, h_{t-1}) = indexmax(softmax(W_{i2} x_t + U_{i2} h_{t-1} + b_{i2}))   (13)
L_l = F_4(x_t, h_{t-1}) = indexmax(softmax(W_{i1} x_t + U_{i1} h_{t-1} + b_{i1}))   (14)
where W and U are the weights of the input information and the historical information respectively, b is the bias (threshold) term, and softmax is the softmax function;
To make the parameters learnable, the segmentation point evaluation process is softened:
d_{L1} = softmax(W_{f1} x_t + U_{f1} h_{t-1} + b_{f1})   (15)
d_{L2} = softmax(W_{f2} x_t + U_{f2} h_{t-1} + b_{f2})   (16)
d_{L3} = softmax(W_{i2} x_t + U_{i2} h_{t-1} + b_{i2})   (17)
d_{L4} = softmax(W_{i1} x_t + U_{i1} h_{t-1} + b_{i1})   (18)
The memory cell c_t is updated based on the hierarchical relationship formed by the division points; the interaction between L_lm and L_mh exists only on the premise that L_l and L_h interact;
based on the interrelationship among L_l, L_lm, L_mh and L_h, there are three update modes:
1) When L_l ≥ L_lm ≥ L_mh ≥ L_h, the short-term and medium-term memories overlap each other on the basis of the overlap between the long-term and short-term information; the low-level information is directly replaced by the current input, the high-level information is retained for a long time, and the middle-level information (middle-low, middle and middle-high) is mixed in different proportions, so the cell memory unit c_t is updated by the following rule:
where z_1 and z_2 are learnable proportion parameters representing the proportion of middle-level information in the middle-low and middle-high levels respectively, k is defined as the number of hidden-layer units, and the other parameters have the same meaning as in LSTM;
2) When L_l ≥ L_mh ≥ L_lm ≥ L_h, the long-term and short-term information overlap, but the short-term and medium-term memories do not overlap each other; the low-level information is directly replaced by the current input, the high-level information is retained for a long time, the middle-low and middle-high levels are mixed in different proportions, and the middle-level information is set to zero because there is no information interaction, so the cell memory unit c_t is updated by the following rule:
3) When L_l ≤ L_h, the level of the current input is lower than the level of the historical data, so there is no basis for overlap between the long-term information and the short-term information; the low-level information is directly replaced by the current input, the high-level information is retained for a long time, and the middle-level information is set to zero because there is no information interaction, so the cell memory unit c_t is updated by the following rule:
with the use of multi-level information partitioning and corresponding updating rules, a variant structure of an LSTM neural network, C-LSTM, is derived as follows:
where the five level symbols denote the low, middle-low, middle, middle-high and high levels respectively, and the four gate symbols denote the main forget gate, the secondary forget gate, the main input gate and the secondary input gate respectively.
2. The method for predicting the remaining life of a gear based on a cocktail long short-term memory neural network according to claim 1, wherein S3 specifically comprises:
S31: collecting vibration signals of the gear of duration T at intervals of Δt until the gear fails; the number of sampled vibration-signal segments is n;
S32: calculating 21 time-frequency features from the denoised vibration signals, inputting them into the trained VAE, and normalizing the output to obtain an n×1 health index matrix X;
S33: selecting the health index matrix X1 = [y_1, y_2, …, y_m]^T composed of the first m sampling points as the training matrix;
S34: constructing the reconstruction matrix U from the training matrix;
S35: taking the first k rows of the matrix U as the input of the neural network and the last row as its output to train the network;
S36: taking the last k outputs as the network input to obtain the output at the next moment;
S37: repeating steps S34-S36 a certain number of times, and comparing the de-normalized outputs with the actual health index values to verify the effectiveness of the method; when a de-normalized output exceeds the set threshold, the remaining life of the gear is the number of predicted sampling points multiplied by (Δt + T), the sum of the vibration-signal interval time and the sampling time.
CN202010599670.5A 2020-06-28 2020-06-28 Method for predicting residual life of gear based on cocktail long-short-term memory neural network Active CN111723527B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010599670.5A CN111723527B (en) 2020-06-28 2020-06-28 Method for predicting residual life of gear based on cocktail long-short-term memory neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010599670.5A CN111723527B (en) 2020-06-28 2020-06-28 Method for predicting residual life of gear based on cocktail long-short-term memory neural network

Publications (2)

Publication Number Publication Date
CN111723527A CN111723527A (en) 2020-09-29
CN111723527B true CN111723527B (en) 2024-04-16

Family

ID=72569149

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010599670.5A Active CN111723527B (en) 2020-06-28 2020-06-28 Method for predicting residual life of gear based on cocktail long-short-term memory neural network

Country Status (1)

Country Link
CN (1) CN111723527B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112926505B (en) * 2021-03-24 2022-11-11 重庆大学 Rotating machine health index construction method based on DTC-VAE neural network
CN112966400B (en) * 2021-04-23 2023-04-18 重庆大学 Centrifugal fan fault trend prediction method based on multi-source information fusion
CN113566953B (en) * 2021-09-23 2021-11-30 中国空气动力研究与发展中心设备设计与测试技术研究所 Online monitoring method for flexible-wall spray pipe
CN113836822A (en) * 2021-10-28 2021-12-24 重庆大学 Aero-engine service life prediction method based on MCLSTM model
CN114246563B (en) * 2021-12-17 2023-11-17 重庆大学 Heart and lung function intelligent monitoring equipment based on millimeter wave radar
CN115542172A (en) * 2022-12-01 2022-12-30 湖北工业大学 Power battery fault detection method, system, device and storage medium


Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018071005A1 (en) * 2016-10-11 2018-04-19 Hitachi, Ltd. Deep long short term memory network for estimation of remaining useful life of the components
CN108595409A (en) * 2018-03-16 2018-09-28 上海大学 A kind of requirement documents based on neural network and service document matches method
KR20200002665A (en) * 2018-06-29 2020-01-08 성균관대학교산학협력단 Prognostics and health management systems for component of vehicle and methods thereof
CN109726524A (en) * 2019-03-01 2019-05-07 哈尔滨理工大学 A kind of rolling bearing remaining life prediction technique based on CNN and LSTM
CN110210126A (en) * 2019-05-31 2019-09-06 重庆大学 A kind of prediction technique of the gear remaining life based on LSTMPP
CN110570035A (en) * 2019-09-02 2019-12-13 上海交通大学 people flow prediction system for simultaneously modeling space-time dependency and daily flow dependency
CN111047482A (en) * 2019-11-14 2020-04-21 华中师范大学 Knowledge tracking system and method based on hierarchical memory network

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Cocktail LSTM and Its Application Into Machine Remaining Useful Life Prediction; Sheng Xiang et al.; IEEE/ASME Transactions on Mechatronics; 2023-02-24; Vol. 28, No. 5; pp. 2425-2436 *

Also Published As

Publication number Publication date
CN111723527A (en) 2020-09-29

Similar Documents

Publication Publication Date Title
CN111723527B (en) Method for predicting residual life of gear based on cocktail long-short-term memory neural network
Gorjian et al. A review on degradation models in reliability analysis
CN110046743B (en) Public building energy consumption prediction method and system based on GA-ANN
CN110633792B (en) End-to-end bearing health index construction method based on convolution cyclic neural network
CN111695521B (en) Attention-LSTM-based rolling bearing performance degradation prediction method
CN114580545A (en) Wind turbine generator gearbox fault early warning method based on fusion model
CN104503420A (en) Non-linear process industry fault prediction method based on novel FDE-ELM and EFSM
CN116167527A (en) Pure data-driven power system static safety operation risk online assessment method
CN116503118A (en) Waste household appliance value evaluation system based on classification selection reinforcement prediction model
CN111475986B (en) LSTM-AON-based gear residual life prediction method
Jiang et al. Measurement of health evolution tendency for aircraft engine using a data-driven method based on multi-scale series reconstruction and adaptive hybrid model
Jiang et al. Paired ensemble and group knowledge measurement for health evaluation of wind turbine gearbox under compound fault scenarios
Wang et al. A transformer-based multi-entity load forecasting method for integrated energy systems
CN112767692A (en) Short-term traffic flow prediction system based on SARIMA-GA-Elman combined model
CN111475987B (en) SAE and ON-LSTM-based gear residual life prediction method
Kai et al. Notice of Retraction: A Novel Forecasting Model of Fuzzy Time Series Based on K-means Clustering
Rubinstein et al. Time series forecasting of crude oil consumption using neuro-fuzzy inference
Jiang et al. Multistep degradation tendency prediction for aircraft engines based on CEEMDAN permutation entropy and improved Grey–Markov model
Wei et al. A new BRB model for cloud security-state prediction based on the large-scale monitoring data
CN115577854A (en) Quantile regression wind speed interval prediction method based on EEMD-RBF combination
CN112599205B (en) Event-driven design method for total phosphorus soft measurement model of effluent in sewage treatment process
Tran et al. Effects of Data Standardization on Hyperparameter Optimization with the Grid Search Algorithm Based on Deep Learning: A Case Study of Electric Load Forecasting
Maliyaem The Amount of Solid Waste Forecasting using Time Series ANFIS
CN111143774A (en) Power load prediction method and device based on influence factor multi-state model
Sharma et al. Stochastic behaviour and performance analysis of an industrial system using GABLT technique

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant