CN114429153B - Gearbox incremental fault diagnosis method and system based on lifelong learning - Google Patents
Gearbox incremental fault diagnosis method and system based on lifelong learning
- Publication number
- CN114429153B (application CN202111677774.4A)
- Authority
- CN
- China
- Prior art keywords
- stage
- fault diagnosis
- model
- fault
- learning
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G06F 2218/12 — Pattern recognition adapted for signal processing: classification; matching
- G06F 18/24133 — Classification techniques: distances to prototypes
- G06F 18/24137 — Classification techniques: distances to cluster centroids
- G06F 18/2414 — Smoothing the distance, e.g. radial basis function networks [RBFN]
- G06F 18/2415 — Classification based on parametric or probabilistic models
- G06N 3/045 — Neural networks: combinations of networks
- G06N 3/047 — Probabilistic or stochastic networks
- G06N 3/082 — Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
- G06N 5/01 — Dynamic search techniques; heuristics; dynamic trees; branch-and-bound
- G06F 2218/08 — Pattern recognition adapted for signal processing: feature extraction
Abstract
The invention discloses a gearbox incremental fault diagnosis method and system based on lifelong learning, comprising the following steps. S101: collect gearbox vibration data to construct an incremental health-state dataset, and divide it into fault diagnosis tasks of different stages. S102: learn the initial-stage fault diagnosis task with an original ResNet-32 network to construct the initial-stage diagnosis model. S103: initialize a ResNet-32 dual-branch aggregation network with the initial-stage diagnosis model, and increase the number of classification-layer neurons according to the number of newly added fault types. S104: train the current-stage diagnosis model jointly on the selected exemplars and the current-stage fault diagnosis task data, and select exemplars from the current-stage task data once training is complete. S105: repeat steps S103-S104 in each subsequent incremental stage to obtain the final fault diagnosis model and perform fault diagnosis. The invention aims to solve the problem that existing fault diagnosis models based on deep learning and transfer learning cannot diagnose actual unexpected faults of the gearbox.
Description
Technical Field
The invention relates to the technical field of mechanical fault diagnosis, in particular to a gearbox incremental fault diagnosis method and system based on lifelong learning.
Background
With the rapid development of modern industry, rotating machinery has become increasingly precise and important. It is now among the most widely used classes of industrial equipment, with ever higher demands on its reliability. Rotating machines serve many fields, such as aviation, marine, machinery, chemical, energy, and electric power; their service conditions are increasingly complex, and performance degradation or even failure inevitably occurs during operation, causing large economic losses, rising operation and maintenance costs, and even catastrophic casualties with irrecoverable harm to the environment and society. Research on health-state monitoring and fault diagnosis methods for rotating machinery is therefore of great significance for guaranteeing its safe and reliable operation, preventing failures of critical equipment, and avoiding heavy economic losses and disastrous accidents.
Modern rotating machinery faces ever higher requirements on speed, load, and degree of automation, and its dynamic signals have become more complex. Modern condition-monitoring technology enables multi-point, full-lifetime data acquisition from complex equipment, yielding massive data volumes, which makes processing the dynamic signals and extracting health-state information from them very difficult. Traditional fault diagnosis methods include extracting fault characteristic frequencies from vibration signals, short-time Fourier transform, empirical mode decomposition, sparse representation, and the like. These methods are mature, but for the condition signals of current mechanical equipment, signal-processing-based methods lack the capacity to handle fault data that is sparse, strongly interfered with, and diverse under variable operating conditions within large volumes of signal data.
In recent years, with the rapid development of artificial intelligence and machine learning, more and more machine-learning-based intelligent fault diagnosis methods for rotating machinery have been proposed. Machine-learning-based fault diagnosis generally comprises signal acquisition, feature extraction, fault identification, and prediction. These methods greatly simplify the fault diagnosis process and improve its efficiency, but most rely on shallow networks with simple structures and few layers; their effectiveness depends on the quality of the features extracted in preprocessing, and their processing capacity is limited when facing large numbers of structurally complex equipment condition signals.
In recent years, many researchers have exploited the excellent adaptive feature learning and extraction capability of deep learning to overcome the difficulty shallow models have in characterizing the complex mapping between signals and health conditions, achieving good results. However, these methods rest on two assumptions: the training data and the test data follow the same distribution, and the training data is sufficient. In actual engineering, the operating conditions of mechanical equipment vary and faults occur unexpectedly, so the available samples rarely satisfy these two assumptions, which directly affects the fault diagnosis result. With the rapid development of transfer learning and its ability to mine and transfer knowledge across domains and distributions, transfer-learning solutions have emerged in mechanical fault diagnosis for problems with limited labeled samples (very few or none) or with variable operating conditions. However, transfer learning can only serve the fault diagnosis of a single target task: one transfer is completed for a given source domain and target domain, and because machinery faults and operating conditions are diverse, the generalization capability and universality of such a model degrade greatly when it faces a new task. Moreover, transfer learning does not accumulate knowledge, so when the equipment-state recognition task returns to the working condition of the source-domain data, performance is poor, which is inconsistent with the actual requirements of engineering.
In practice, because operating conditions are complex and changeable, machines often develop unexpected faults, increasing the number of fault types and disabling deep diagnosis models and deep transfer diagnosis models trained on pre-collected, incomplete fault data; the model must then be retrained to identify the new fault types. However, training the deep model directly on the new data causes recognition accuracy on the old fault classes to drop off a cliff, a phenomenon known as catastrophic forgetting. Catastrophic forgetting has long been an important problem in deep learning, and in fault diagnosis it is likewise necessary to study how to overcome the catastrophic forgetting of deep diagnosis models caused by unexpected faults, so as to build a lifelong fault diagnosis model with higher reliability, generalization, and universality.
Disclosure of Invention
The invention aims to provide a gearbox incremental fault diagnosis method and system based on lifelong learning, to solve the problem that existing fault diagnosis models based on deep learning and transfer learning cannot diagnose actual unexpected faults of a gearbox.
To solve the above technical problem, the invention provides a gearbox incremental fault diagnosis method based on lifelong learning, comprising the following steps:
s101: collecting gearbox vibration data to construct an incremental health-state dataset, and dividing it into fault diagnosis tasks of different stages;
s102: learning the initial-stage fault diagnosis task with an original ResNet-32 network to construct the initial-stage diagnosis model, and selecting exemplars from the initial-stage fault diagnosis task data;
s103: initializing a ResNet-32 dual-branch aggregation network with the initial-stage diagnosis model, the network adopting a cosine-normalized classifier, and increasing the number of classification-layer neurons according to the number of newly added fault types;
s104: training the current-stage diagnosis model jointly on the selected exemplars and the current-stage fault diagnosis task data, and selecting exemplars from the current-stage task data once training is complete;
during training, aggregation weights represent the transfer capability of the different residual-block layers, a knowledge distillation loss reduces the discrepancy between the new-stage and old-stage diagnosis models on the old-stage fault diagnosis task data, and a bilevel optimization scheme optimizes the aggregation weights and the model parameters;
s105: repeating steps S103-S104 in each subsequent incremental stage to obtain the final fault diagnosis model, which is then used for fault diagnosis.
As a further improvement of the present invention, step S101 specifically comprises the following steps:
acquiring gearbox vibration signals with an acceleration sensor to construct an incremental health-state dataset $D$;
if there are $N+1$ fault diagnosis tasks in total, there are $N+1$ learning stages, i.e., the initial-stage fault diagnosis task 0 and $N$ incremental stages, during which the number of diagnosis tasks gradually increases;
in the $n$-th stage, the training data of task $n$ is $D_n=\{(x_i^{[n]},y_i^{[n]})\}_{i=1}^{P_n}$, where $P_n$ is the number of fault data samples of task $n$;
if $J_n$ denotes the number of old fault classes $C_{0:n-1}=\{C_0,C_1,\ldots,C_{n-1}\}$ and $K_n$ the number of new fault classes $C_n$, then $J_{n+1}=K_n+J_n$; $x_i^{[n]}$ denotes the $i$-th sample and $y_i^{[n]}\in C_n$ its label.
As a further improvement of the present invention, step S102 specifically comprises the following steps:
after training, a certain number of exemplars $\varepsilon_0$ are selected by a herding algorithm using the feature extractor $F_0$ preceding the classification layer.
As a further improvement of the present invention, selecting a certain number of exemplars by the herding algorithm using the feature extractor $F_0$ preceding the classification layer comprises:
denoting the training samples of fault class $c$ by $\{x_1^c,\ldots,x_{P_c}^c\}$, the class mean is $\mu_c=\frac{1}{P_c}\sum_{i=1}^{P_c}F_0(x_i^c)$, where $P_c$ is the number of training samples of class $c$;
if the number of selected exemplars is $t$, each exemplar is obtained by $e_k=\arg\min_{x}\left\|\mu_c-\frac{1}{k}\bigl[F_0(x)+\sum_{j=1}^{k-1}F_0(e_j)\bigr]\right\|$, giving $\varepsilon=(e_1,e_2,\ldots,e_t)$.
As a further improvement of the present invention, step S103 specifically comprises the following steps:
replacing the original ResNet-32 network with a ResNet-32 dual-branch aggregation network, which comprises a dynamic branch and a steady-state branch;
the dynamic branch is conventional parameter-level fine-tuning: the initial-stage diagnosis model initializes the dynamic branch of the incremental stage, and the task of each stage trains and fine-tunes its parameters $\alpha$;
the steady-state branch is neuron-level fine-tuning after freezing the initial-stage network parameters: each neuron is given a weight $\beta$ that is trained and fine-tuned by the task of each stage; if the $k$-th convolutional layer of the steady-state branch contains $Q$ neurons, with parameters $W_k$ frozen from the initial model and learnable neuron weights $\beta_k\in\mathbb{R}^{Q}$, the input of the $k$-th convolutional layer is $x_{k-1}$ and its output is $x_k=(W_k\odot\beta_k)\,x_{k-1}$, where $\odot$ denotes the Hadamard product;
the cosine-normalized classifier of incremental stage $n$ computes the predictive probability that input $x$ belongs to class $c$ as $p_c(x)=\frac{\exp\bigl(\eta\langle\bar\theta_n^c,\bar h_n\rangle\bigr)}{\sum_{j}\exp\bigl(\eta\langle\bar\theta_n^j,\bar h_n\rangle\bigr)}$, where $\theta_n$ are the fully connected classification-layer parameters of incremental stage $n$, $h_n$ is the feature extracted in incremental stage $n$, $\bar v=v/\|v\|_2$ denotes $l_2$ normalization so that the cosine similarity values lie in the range $[-1,1]$, and $\eta$ is a learnable scaling parameter;
as the number of fault classes increases, the number of classification-layer neurons grows to match the number of fault classes.
As a further improvement of the present invention, representing the transfer capability of the different residual-block layers by aggregation weights comprises:
training the dual-branch aggregation network with the exemplars $\varepsilon_0$ retained from the initial stage together with the current-stage task data, and assigning adaptive aggregation weights $\omega$ and $\xi$ to the dynamic residual block and the steady-state residual block of each residual-block layer according to their different transfer capabilities;
the fault training data $x^{[n]}$ are passed through the dual-branch aggregation network for feature extraction; at the $m$-th residual-block layer, the feature extracted by the dynamic residual block is weighted by $\omega_m$, the feature extracted by the steady-state residual block is weighted by $\xi_m$, and the layer output aggregates the two.
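The layer-wise aggregation just described can be sketched in a few lines. The following is a hedged illustration, not the patent's implementation: the function and variable names are assumptions, and plain callables stand in for the convolutional residual blocks. It shows how $\omega_m$ and $\xi_m$ mix the dynamic-branch and steady-state-branch features of one layer:

```python
import numpy as np

def aggregate_layer(x, dyn_block, steady_block, omega_m, xi_m):
    # Output of the m-th residual-block layer: a weighted sum of the
    # dynamic-branch and steady-state-branch features of the same input.
    return omega_m * dyn_block(x) + xi_m * steady_block(x)

# Toy stand-ins for the two residual blocks (real blocks are convolutional).
dyn = lambda x: 2.0 * x
steady = lambda x: x
out = aggregate_layer(np.ones(4), dyn, steady, omega_m=0.7, xi_m=0.3)
```

A larger $\omega_m$ favors plasticity (the fine-tuned dynamic block), a larger $\xi_m$ favors stability (the mostly frozen steady-state block).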
As a further improvement of the present invention, the loss function of the initial stage is the classification cross-entropy loss $L_{CE}$;
the loss function of the incremental stage combines the classification cross-entropy loss $L_{CE}$ with the knowledge distillation loss $L_{KD}=-\sum_{c=1}^{J_n}\hat\tau_c\log\tau_c$, where $\hat\tau_c=\frac{\exp(\hat o_c/T)}{\sum_{j=1}^{J_n}\exp(\hat o_j/T)}$ and $\tau_c=\frac{\exp(o_c/T)}{\sum_{j=1}^{J_n}\exp(o_j/T)}$ are the temperature-softened outputs of the old model and the new model on the old fault classes, and the temperature $T$ is typically greater than 1.
As a further improvement of the invention, the loss function of the incremental stage is $L=\lambda L_{KD}+(1-\lambda)L_{CE}$, where $0<\lambda\le 1$;
the parameters to be optimized in the incremental stage are the model parameters $\Theta_n$ and the aggregation weights $\omega$ and $\xi$; updating the aggregation weights $\omega$ and $\xi$ requires fixing the model parameters $\Theta_n$, so a bilevel optimization scheme is adopted;
a balanced dataset $D_{bal}$ is built from the retained exemplars together with data randomly sampled from $D_n$, and the aggregation weights are updated by $(\omega,\xi)\leftarrow(\omega,\xi)-\gamma_2\nabla_{(\omega,\xi)}L(D_{bal};\Theta_n)$, where $\gamma_2$ is the learning rate of the upper-level problem.
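As a hedged sketch of the incremental-stage loss described above (function names and the toy logits are illustrative assumptions, not the patent's code), the distillation term compares temperature-softened outputs of the old and new models on the $J_n$ old classes, and the total loss mixes it with the cross-entropy via $\lambda$:

```python
import numpy as np

def softened(logits, T):
    # Temperature-softened softmax over the old-class logits, T > 1.
    e = np.exp(logits / T - np.max(logits / T))
    return e / e.sum()

def kd_loss(old_logits, new_logits, J_n, T=2.0):
    # L_KD = -sum_c tau_hat_c * log(tau_c) over the J_n old fault classes.
    tau_hat = softened(old_logits[:J_n], T)  # soft targets from the old model
    tau = softened(new_logits[:J_n], T)      # new model's softened prediction
    return float(-np.sum(tau_hat * np.log(tau + 1e-12)))

def incremental_loss(l_ce, l_kd, lam):
    # L = lam * L_KD + (1 - lam) * L_CE, with 0 < lam <= 1.
    return lam * l_kd + (1.0 - lam) * l_ce
```

By Gibbs' inequality the distillation term is smallest when the new model reproduces the old model's outputs on the old classes, which is exactly what discourages forgetting.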
As a further improvement of the invention, after each incremental training is completed, the performance of the model on new and old tasks is tested with the test data of all learned tasks, verifying that the model learns without forgetting, which comprises:
testing the model $\Theta_n$ trained in incremental stage $n$ on all learned fault classes $C_{0:n}$; the test data contains every learned fault class, so as to verify that the model has the ability to learn without forgetting.
A gearbox incremental fault diagnosis system based on lifelong learning performs gearbox fault diagnosis by adopting the above gearbox incremental fault diagnosis method based on lifelong learning.
The beneficial effects of the invention are as follows. In the gearbox fault diagnosis method, gearbox vibration signals are first collected with an acceleration sensor to construct an incremental health-state dataset, which is divided into diagnosis tasks of different stages, simulating the growth of diagnosis tasks caused by new fault types arising from unexpected faults in actual scenarios;
in the initial stage, an original ResNet-32 learns the initial gearbox-bearing fault diagnosis task, simulating the incomplete fault diagnosis model trained on pre-collected fault data in a real scenario, and after training a certain number of exemplars are selected from the initial task data by a herding algorithm and stored; in the subsequent incremental stages, the original ResNet-32 is replaced by an improved ResNet-32-based dual-branch aggregation network as the incremental-stage feature extractor, balancing the plasticity (knowledge transfer) and stability (knowledge accumulation) of the model; meanwhile the fully connected classifier is changed to a cosine-normalized classifier to avoid the model's classification-bias problem, and the number of classification-layer neurons is increased according to the number of newly added fault types;
the model of the first incremental stage is trained with the exemplars stored in the initial stage together with the diagnosis task data of that stage, awakening the model's memory of old knowledge and overcoming the catastrophic forgetting of the deep learning model; the loss function of the incremental stage comprises a classification cross-entropy loss and a knowledge distillation loss, the latter reducing the discrepancy between the new-stage and old-stage models on the old task data and thus further preventing catastrophic forgetting;
aggregation weights represent the transfer capability of the different residual-block layers, so the steady-state branch and the dynamic branch can be balanced to trade off the plasticity and stability of the model; the optimization of the aggregation weights and of the model parameters constrain each other, so a bilevel optimization scheme updates both sets of parameters; after the diagnosis task of each incremental stage is trained, a certain number of exemplars of that stage's data are again selected and stored for training in the next incremental stage;
overall, the invention constructs a gearbox incremental fault diagnosis method based on lifelong learning that adopts a dual-branch aggregation network combined with knowledge distillation and exemplars, solves the catastrophic-forgetting problem of deep-learning diagnosis models, and is suitable for continual gearbox fault diagnosis as new unexpected faults appear.
Drawings
FIG. 1 is a flow chart of an embodiment of the method of the present invention;
FIG. 2 is a diagram of the test stand used to generate the gearbox data of the present invention;
FIG. 3 is a fault section view of the gearbox of the present invention;
FIG. 4 is a dual-branch aggregation network structure in a model of the present invention;
FIG. 5 is a graph comparing the diagnostic accuracy of the method of the present invention with that of two fine-tuning methods applied to a deep model that does not use lifelong learning.
Detailed Description
The present invention will be further described below with reference to the accompanying drawings and specific examples, which are not intended to limit the invention, so that those skilled in the art can better understand and practice it.
Referring to fig. 1, the invention provides a gearbox incremental fault diagnosis method based on lifelong learning, comprising the following steps. S101: collecting gearbox vibration data to construct an incremental health-state dataset, and dividing it into fault diagnosis tasks of different stages;
s102: learning the initial-stage fault diagnosis task with an original ResNet-32 network to construct the initial-stage diagnosis model, and selecting exemplars from the initial-stage fault diagnosis task data;
s103: initializing a ResNet-32 dual-branch aggregation network with the initial-stage diagnosis model, the network adopting a cosine-normalized classifier, and increasing the number of classification-layer neurons according to the number of newly added fault types;
s104: training the current-stage diagnosis model jointly on the selected exemplars and the current-stage fault diagnosis task data, and selecting exemplars from the current-stage task data once training is complete;
during training, aggregation weights represent the transfer capability of the different residual-block layers, a knowledge distillation loss reduces the discrepancy between the new-stage and old-stage diagnosis models on the old-stage fault diagnosis task data, and a bilevel optimization scheme optimizes the aggregation weights and the model parameters;
s105: repeating steps S103-S104 in each subsequent incremental stage to obtain the final fault diagnosis model, which is then used for fault diagnosis.
The invention adopts a lifelong learning method to construct a diagnosis model capable of continual knowledge transfer and accumulation, facilitating fault diagnosis when the number of fault types grows under complex operating conditions.
Further, the performance of the model on new and old tasks is tested with the test data of all learned tasks, verifying the model's ability to learn without forgetting.
Examples
This embodiment describes the method in detail in connection with specifically acquired experimental data.
The test bench shown in fig. 2 was used to collect the required experimental data and construct the incremental health-state dataset. To obtain unexpected gearbox faults with compound bearing-and-gear failures as shown in fig. 3, 0.4 mm cracks were cut into the inner ring, outer ring, and rollers of the bearing by wire cutting to simulate local bearing faults; half a tooth was cut from the driving gear by electrical discharge machining to simulate a local gear fault.
In the experiment, the motor speed is 1496 r/min and the sampling frequency is set to 25.6 kHz. The gearbox incremental dataset was constructed from a total of 11 different health states consisting of combinations of gear and bearing conditions, as listed in Table 1. The gear has two health states (normal and faulty); the bearing has four basic health states (normal, inner-ring fault, roller fault, and outer-ring fault) plus three compound bearing faults formed by pairwise combinations of the three basic fault states.
The diagnosis tasks of the different stages are then divided according to the actual scenario. Gearbox vibration signals are acquired with an acceleration sensor to construct the incremental health-state dataset $D$. Assuming there are $N+1$ gearbox fault diagnosis tasks in total, there are $N+1$ learning stages, i.e., the stage learning diagnosis task 0 and $N$ incremental stages, during which the number of diagnosis tasks gradually increases. In the $n$-th stage, the training data of task $n$ is $D_n=\{(x_i^{[n]},y_i^{[n]})\}_{i=1}^{P_n}$, where $P_n$ is the number of fault data samples of task $n$. With $J_n$ denoting the number of old fault classes $C_{0:n-1}=\{C_0,C_1,\ldots,C_{n-1}\}$ and $K_n$ the number of new fault classes $C_n$, it follows that $J_{n+1}=K_n+J_n$; $x_i^{[n]}$ denotes the $i$-th sample and $y_i^{[n]}\in C_n$ its label.
As listed in Table 1, the gearbox health-state data obtained through the experiment serve as the training samples of task 0 to train the model in the initial stage. These health states are common in practice, hence numerous and easy to learn, so the seven gearbox health states in which the gear is normal and only the bearing may be faulty are taken as the fault types learned by task 0. To simulate the growth of fault types caused by unexpected faults in real scenarios, each task learned in a subsequent incremental stage contains one gear-bearing hybrid fault type. Each fault type has 200 training samples and 100 test samples. Table 1: Health states and incremental task settings of the gearbox:
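The bookkeeping $J_{n+1}=K_n+J_n$ for this task layout can be checked with a few lines (a minimal sketch; the helper name is an assumption): task 0 contributes seven classes and each of the four incremental tasks adds one hybrid fault class, for 11 classes in total.

```python
def old_class_counts(new_per_stage):
    # J[n] = number of already-learned ("old") classes entering stage n,
    # built from J[n+1] = K[n] + J[n] with J[0] = 0.
    J = [0]
    for K_n in new_per_stage:
        J.append(J[-1] + K_n)
    return J

# Task 0 learns 7 health states; tasks 1-4 each add one gear-bearing hybrid fault.
J = old_class_counts([7, 1, 1, 1, 1])
```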
therefore, the step S102 specifically includes the steps of:
s102.1: using the data $D_0=\{(x_i^{[0]},y_i^{[0]})\}_{i=1}^{P_0}$ of task 0, the original ResNet-32 is trained to learn the fault classes $C_0$, yielding the initial model $\Theta_0$; the detailed structure of ResNet-32 is shown in Table 2. The loss function of the model is the classification cross-entropy loss $L_{CE}=-\sum_{c}\delta_c\log p_c(x)$, where $\delta$ is the one-hot true label. The model parameters $\Theta_0$ of the initial stage are updated by conventional gradient descent.
Table 2: Structural parameters of the backbone network ResNet-32:
s102.2: after training, a certain number of exemplars $\varepsilon_0$ are selected by a herding algorithm using the feature extractor $F_0$ preceding the classification layer. Denoting the training samples of fault class $c$ by $\{x_1^c,\ldots,x_{P_c}^c\}$, the class mean is $\mu_c=\frac{1}{P_c}\sum_{i=1}^{P_c}F_0(x_i^c)$, where $P_c$ is the number of training samples of class $c$.
There are two schemes for the number of selected exemplars: either the number selected per fault type is fixed at 5, or the total storage is fixed at 55. If the number of exemplars selected for class $c$ is $t$, each exemplar is obtained by $e_k=\arg\min_{x}\left\|\mu_c-\frac{1}{k}\bigl[F_0(x)+\sum_{j=1}^{k-1}F_0(e_j)\bigr]\right\|$, giving $\varepsilon=(e_1,e_2,\ldots,e_t)$.
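A minimal NumPy sketch of the greedy exemplar selection defined by the formula above (the function name and toy features are assumptions, not the patent's code): at step $k$ it picks the sample whose inclusion keeps the running exemplar mean closest to the class mean $\mu_c$.

```python
import numpy as np

def select_exemplars(features, t):
    # features: (P_c, d) array of feature-extractor outputs for one fault class.
    mu = features.mean(axis=0)                 # class mean mu_c
    chosen, acc = [], np.zeros(features.shape[1])
    for k in range(1, t + 1):
        # ||mu_c - (F_0(x) + sum of already-chosen features) / k|| per candidate
        d = np.linalg.norm(mu - (acc + features) / k, axis=1)
        d[chosen] = np.inf                     # never reuse a sample
        i = int(np.argmin(d))
        chosen.append(i)
        acc += features[i]
    return chosen

feats = np.array([[0.0, 0.0], [2.0, 2.0], [4.0, 4.0]])
picked = select_exemplars(feats, 2)
```

The first pick is the sample closest to the class mean; later picks correct the running mean, which is why exemplar sets chosen this way remain representative.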
The step S103 specifically includes the following steps:
s103.1: the original ResNet-32 was replaced with a dual-branch aggregation network, the dual-branch aggregation network structure of which is shown in FIG. 4. Wherein the dual-branch aggregation network comprises a dynamic branch and a steady-state branch.
The dynamic branch performs conventional parameter-level fine-tuning: the initial model initializes the dynamic branch of the incremental stage, and the task of each stage trains and fine-tunes its parameters α;
The steady-state branch performs neuron-level fine-tuning after freezing the initial-stage network parameters: each neuron is given a weight β that is trained and fine-tuned by the task of each stage. Assuming the k-th convolutional layer of the steady-state branch comprises Q neurons with frozen initial-model parameters W_k, the neuron weights are β_k = (β_k^1, β_k^2, ..., β_k^Q); the input of the k-th convolutional layer is x_{k-1} and its output is x_k = (W_k ⊙ β_k) x_{k-1}, where ⊙ denotes the Hadamard product. The learnable parameters β of the steady-state branch are far fewer than α, which allows the steady-state residual blocks to adapt slowly to the knowledge of new tasks while fully retaining old knowledge.
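The neuron-level scaling x_k = (W_k ⊙ β_k) x_{k-1} can be sketched in numpy; for brevity this shows a fully connected layer with one weight per output neuron, whereas the patent applies the idea per convolution kernel (the function name is an illustrative assumption):

```python
import numpy as np

def steady_forward(W_k, beta_k, x_prev):
    """W_k: (Q, d) frozen initial-stage weights; beta_k: (Q,) learnable
    per-neuron weights; x_prev: (d,) layer input. Returns x_k of shape (Q,)."""
    # Broadcasting beta_k over rows realizes the Hadamard-style scaling:
    # row q of W_k is multiplied elementwise by beta_k[q], so only Q scalars
    # are trainable while the Q*d frozen weights are untouched.
    return (W_k * beta_k[:, None]) @ x_prev
```

With β = 1 the frozen layer is recovered exactly, so the steady-state branch starts from the old knowledge and can only drift slowly away from it, which is the mechanism behind "adapt slowly while retaining old knowledge".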
S103.2: the classifier of the initial model is a conventional fully connected classification layer byCalculating a predictive probability of the input x being class c, where θ 0 For the parameters of the full connection classification layer in the initial stage, h 0 Features extracted in an initial stage;
The cosine-normalized classifier of incremental stage n computes the predicted probability that input x belongs to class c as p_c(x) = exp(η⟨θ̄_c, h̄_n⟩) / Σ_j exp(η⟨θ̄_j, h̄_n⟩), where θ_n denotes the fully connected classification-layer parameters of incremental stage n, h_n the features extracted in incremental stage n, θ̄ = θ/||θ||_2 and h̄ = h/||h||_2 are l_2-normalized vectors, and η is a learnable scaling parameter that scales the cosine similarity value, which lies within the range [-1, 1]. The cosine-normalized classifier avoids the classification bias between new and old classes.
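A numpy sketch of this cosine-normalized classifier, assuming the standard form implied above (normalize both the class weight vectors and the feature, scale the cosine similarity by η, then apply softmax):

```python
import numpy as np

def cosine_classifier_probs(theta, h, eta):
    """theta: (C, d) classification-layer weights; h: (d,) feature vector;
    eta: learnable scale. Returns softmax over eta * cos(theta_c, h)."""
    theta_n = theta / np.linalg.norm(theta, axis=1, keepdims=True)  # l2-normalize rows
    h_n = h / np.linalg.norm(h)
    logits = eta * (theta_n @ h_n)        # cosine similarities in [-1, 1], scaled
    z = np.exp(logits - logits.max())     # numerically stable softmax
    return z / z.sum()
```

Because both sides are normalized, the prediction depends only on directions, not magnitudes; this is what removes the bias toward new classes whose weight vectors would otherwise grow larger than those of old classes.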
As fault categories increase, the number of classification-layer neurons is increased to remain consistent with the number of fault categories.
The step S104 specifically includes the following steps:
s104.1: using the classical epsilon reserved in the initial stage 0 And the stage task data D 0 Training a dual-branch aggregation network, and respectively endowing self-adaptive aggregation weights omega and xi aiming at different migration capacities of a dynamic residual block and a steady residual block of each residual block layer, as shown in fig. 4.
The fault training data x^[0] are fed through the dual-branch aggregation network to extract features. At the m-th residual-block layer, the dynamic residual block extracts the features R_m^dyn(x^[m-1]) and the steady-state residual block extracts the features R_m^sta(x^[m-1]); the layer output is the aggregation x^[m] = ω_m R_m^dyn(x^[m-1]) + ξ_m R_m^sta(x^[m-1]).
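One residual-block layer of the dual-branch network can be sketched as follows; the two blocks are stubbed as callables, and the weighted-sum aggregation rule is an assumption read from the description of ω and ξ above:

```python
import numpy as np

def aggregate_layer(x_prev, dynamic_block, steady_block, omega_m, xi_m):
    """Assumed aggregation of the m-th layer:
    x_m = omega_m * R_dyn(x_{m-1}) + xi_m * R_sta(x_{m-1}),
    where R_dyn carries the fine-tuned parameters alpha and R_sta the frozen
    weights with neuron weights beta."""
    return omega_m * dynamic_block(x_prev) + xi_m * steady_block(x_prev)
```

Because ω_m and ξ_m are learned per layer, each layer can lean toward the plastic dynamic branch or the stable steady-state branch depending on how transferable its features are.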
S104.2: the loss function of the increment stage is classified cross entropy lossAnd knowledge distillation loss-> wherein ,/> and />The temperature T is typically greater than 1 for soft labels with old models in the old failure class and hard labels with new models in the old failure class, respectively. Narrowing new models in old failure class C by knowledge distillation loss 0:n-1 The difference between the expression and the old model is approximately constrained to the similarity distribution of the old class in the old model. The loss function of the delta phase is +.>Wherein lambda is more than 0 and less than or equal to 1.
S104.2: the non-optimized parameters of the increment stage have model parameters theta n And aggregate weights ω and ζ for aggregationUpdating the combined weights ω and ζ requires fixing the model parameters Θ n Adopting a double-layer optimization scheme;
The bi-level optimization scheme is divided into the upper-level problem min_{ω,ξ} L(Θ_n*(ω, ξ); D_bal) and the lower-level problem Θ_n*(ω, ξ) = argmin_{Θ_n} L(Θ_n; ω, ξ). The model parameters are updated by Θ_n ← Θ_n − γ_1 ∂L/∂Θ_n, where γ_1 is the learning rate of the lower-level problem;
The aggregation weights are updated in the upper-level problem to balance the dynamic and steady-state residual blocks: a subset is randomly sampled from the data set D_n and combined with the retained exemplars to establish the balanced data D_bal, and the aggregation weights are updated by (ω, ξ) ← (ω, ξ) − γ_2 ∂L(D_bal)/∂(ω, ξ), where γ_2 is the learning rate of the upper-level problem.
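The alternating structure of the two levels can be illustrated on a toy objective (the schedule is assumed from the description; the gradient callables stand in for backpropagation through the real network):

```python
def bilevel_step(theta, agg_w, grad_train, grad_bal, g1, g2):
    """One round: the lower level updates the model parameters theta on the
    training loss with learning rate g1 (gamma_1); the upper level then
    updates the aggregation weights on balanced data with learning rate g2
    (gamma_2), holding theta fixed."""
    theta = theta - g1 * grad_train(theta, agg_w)   # lower-level step
    agg_w = agg_w - g2 * grad_bal(theta, agg_w)     # upper-level step, theta fixed
    return theta, agg_w

# Toy example on the separable objective (theta - 3)^2 + (w - 1)^2:
theta, w = 0.0, 0.0
for _ in range(200):
    theta, w = bilevel_step(theta, w,
                            lambda t, a: 2 * (t - 3),   # d/dtheta of training loss
                            lambda t, a: 2 * (a - 1),   # d/dw of balanced loss
                            0.1, 0.1)
```

Keeping Θ_n fixed during the upper-level step is what prevents the aggregation weights from simply absorbing the new task and lets the balanced data arbitrate between plasticity and stability.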
The step S105 specifically includes the following steps:
The model Θ_n obtained by training in incremental stage n is tested on all learned fault classes C_{0:n}; the test data contain all learned fault classes to verify that the model has the ability to learn without forgetting. After the 4 incremental tasks are learned, the confusion matrices of plain fine-tuning and of the method of the present invention under the two exemplar-number strategies are shown in FIG. 5. The fine-tuning confusion matrices reflect the catastrophic forgetting of a deep-learning diagnosis model without lifelong learning, whereas the method of the invention effectively overcomes catastrophic forgetting and realizes continual gearbox fault diagnosis as new unexpected faults appear.
In summary, the invention designs a method for realizing incremental gearbox fault diagnosis based on lifelong learning. Compared with conventional deep-learning methods, it overcomes catastrophic forgetting and is better suited to practical industrial application scenarios.
The invention also provides a lifelong-learning-based gearbox incremental fault diagnosis system, which performs gearbox fault diagnosis by adopting the above lifelong-learning-based gearbox incremental fault diagnosis method.
The principles of the system are similar to those described above and are not repeated here. It is noted that the present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above-described embodiments are merely preferred embodiments for fully explaining the present invention, and the scope of the present invention is not limited thereto. Equivalent substitutions and modifications will occur to those skilled in the art based on the present invention, and are intended to be within the scope of the present invention. The protection scope of the invention is subject to the claims.
Claims (6)
1. A lifelong-learning-based gearbox incremental fault diagnosis method, characterized in that the method comprises the following steps:
s101: collecting vibration data of a gear box to construct an increment health state data set, and dividing the increment health state data set into fault diagnosis tasks of different stages;
s102: utilizing an original ResNet-32 network to learn a fault diagnosis task at an initial stage, constructing an initial stage diagnosis model, and selecting a typical example of fault diagnosis task data at the initial stage;
s103: initializing a ResNet-32 double-branch aggregation network by using an initial stage diagnosis model, wherein the ResNet-32 double-branch aggregation network adopts a cosine standardized classifier, and the number of neurons of a classification layer is increased according to the number of newly increased fault types;
s104: training the stage diagnosis model through the selected typical cases and the stage fault diagnosis task data together, and selecting the typical cases of the stage fault diagnosis task data after training is completed;
during the training process, the transfer capacities of the different residual-block layers are represented by aggregation weights, the difference between the new-stage and old-stage diagnosis models on the old-stage fault diagnosis task data is reduced by combining a knowledge distillation loss function, and the aggregation weights and model parameters are optimized with a bi-level optimization scheme;
s105: repeating the steps S103-S104 in the subsequent increment stage to obtain a final fault diagnosis model, and performing fault diagnosis;
the step S103 specifically includes the following steps:
replacing the original ResNet-32 network with a ResNet-32 dual-branch aggregation network, wherein the ResNet-32 dual-branch aggregation network comprises a dynamic branch and a steady-state branch;
the dynamic branch is subjected to conventional parameter level fine tuning, namely, an initial stage diagnosis model is used for initializing the dynamic branch of an incremental stage, and each stage task is used for training fine tuning parameters;
the steady-state branch performs neuron-level fine-tuning after freezing the initial-stage network parameters: each neuron is given a weight β that is trained and fine-tuned by the task of each stage; if the k-th convolutional layer of the steady-state branch comprises Q neurons with frozen initial-model parameters W_k, the neuron weights are β_k = (β_k^1, β_k^2, ..., β_k^Q), the input of the k-th convolutional layer is x_{k-1}, and its output is x_k = (W_k ⊙ β_k) x_{k-1}, where ⊙ denotes the Hadamard product;
the cosine-normalized classifier of incremental stage n computes the predicted probability that input x belongs to class c as p_c(x) = exp(η⟨θ̄_c, h̄_n⟩) / Σ_j exp(η⟨θ̄_j, h̄_n⟩), where θ_n denotes the fully connected classification-layer parameters of incremental stage n, h_n the features extracted in incremental stage n, θ̄ = θ/||θ||_2 and h̄ = h/||h||_2 are l_2-normalized vectors, and η is a learnable scaling parameter that scales the cosine similarity value lying within the range [-1, 1]; J_n denotes the number of old fault classes C_{0:n-1} = {C_0, C_1, ..., C_{n-1}}, K_n denotes the number of new fault classes C_n, and J_{n+1} = K_n + J_n;
for the increase of fault categories, the number of classification-layer neurons is increased to be consistent with the number of fault categories;
representing the transfer capacities of the different residual-block layers by aggregation weights comprises the following steps:
the dual-branch aggregation network is trained with the exemplars ε_0 retained in the initial stage and the stage-1 task data D_1, and adaptive aggregation weights ω and ξ are assigned to the dynamic residual block and the steady-state residual block of each residual-block layer, respectively, according to their different transfer capacities;
the fault training data x^[0] are fed through the dual-branch aggregation network to extract features; at the m-th residual-block layer, the dynamic residual block extracts the features R_m^dyn(x^[m-1]) and the steady-state residual block extracts the features R_m^sta(x^[m-1]), which are aggregated as x^[m] = ω_m R_m^dyn(x^[m-1]) + ξ_m R_m^sta(x^[m-1]);
the loss function of the incremental stage combines the categorical cross-entropy loss L_CE and the knowledge distillation loss L_KD = -Σ_{c∈C_{0:n-1}} π̂_c(x) log π_c(x), where π̂_c(x) and π_c(x) are, respectively, the soft labels of the old model on the old fault classes and the softened predictions of the new model on the old fault classes, with temperature T > 1;
the loss function of the incremental stage is L_n = λ L_CE + (1 − λ) L_KD, where 0 < λ ≤ 1;
the parameters to be optimized in the incremental stage are the model parameters Θ_n and the aggregation weights ω and ξ; updating the aggregation weights ω and ξ requires fixing the model parameters Θ_n, and a bi-level optimization scheme is adopted.
2. The lifelong-learning-based gearbox incremental fault diagnosis method according to claim 1, characterized in that the step S101 specifically includes the following steps:
acquiring vibration signals of the gearbox by using an acceleration sensor to construct an incremental health state data set D;
if there are N+1 fault diagnosis tasks in total, there are N+1 learning stages, i.e., fault diagnosis task 0 of the initial stage and N incremental stages, during which the number of diagnosis tasks gradually increases;
in the n-th stage, the training data of task n is D_n = {(x_i^[n], y_i^[n])}_{i=1}^{P_n}, where P_n is the number of fault data samples of task n;
3. The lifelong-learning-based gearbox incremental fault diagnosis method according to claim 2, characterized in that the step S102 specifically includes the following steps:
data using task 0Training the original ResNet-32 learning failure class C 0 Obtaining an initial stage diagnosis model theta 0 The loss function of the initial stage diagnostic model is a classification cross entropy loss function:wherein δ is the true label;
after training, the feature extractor F_0 before the classification layer is used to select a certain number of typical exemplars ε_0 through a herding algorithm.
4. The lifelong-learning-based gearbox incremental fault diagnosis method according to claim 3, characterized in that using the feature extractor F_0 before the classification layer to select a certain number of typical exemplars through a herding algorithm comprises:
letting {x_i^c}_{i=1}^{P_c} denote the training samples of fault class c, the class-c mean is μ_c = (1/P_c) Σ_{i=1}^{P_c} F_0(x_i^c), where P_c is the number of training samples of class c;
5. The lifelong-learning-based gearbox incremental fault diagnosis method according to any one of claims 1 to 4, characterized in that after each incremental training is completed, the performance of the model on new and old tasks is tested with the test data of all learned tasks to verify the learning ability of the model, comprising:
the model Θ_n obtained by training in incremental stage n is tested on all learned fault classes C_{0:n}; the test data contain all learned fault classes to verify that the model has the ability to learn without forgetting.
6. A lifelong-learning-based gearbox incremental fault diagnosis system, characterized in that gearbox fault diagnosis is performed by adopting the lifelong-learning-based gearbox incremental fault diagnosis method according to any one of claims 1 to 4.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111677774.4A CN114429153B (en) | 2021-12-31 | 2021-12-31 | Gear box increment fault diagnosis method and system based on life learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114429153A CN114429153A (en) | 2022-05-03 |
CN114429153B true CN114429153B (en) | 2023-04-28 |
Legal Events

Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |