CN114429153B - Gearbox incremental fault diagnosis method and system based on lifelong learning - Google Patents


Info

Publication number
CN114429153B
CN114429153B (application CN202111677774.4A)
Authority
CN
China
Prior art keywords: stage, fault diagnosis, model, fault, learning
Legal status
Active
Application number
CN202111677774.4A
Other languages
Chinese (zh)
Other versions
CN114429153A (en)
Inventor
沈长青
陈博戬
孔林
陈良
丁传仓
申永军
庄国龙
张艳华
李林
张爱文
祁玉梅
石娟娟
江星星
黄伟国
朱忠奎
Current Assignee
Suzhou University
Original Assignee
Suzhou University
Priority date
Filing date
Publication date
Application filed by Suzhou University
Priority to CN202111677774.4A
Publication of CN114429153A
Application granted
Publication of CN114429153B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 2218/00 Aspects of pattern recognition specially adapted for signal processing
    • G06F 2218/08 Feature extraction
    • G06F 2218/12 Classification; Matching
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F 18/2413 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G06F 18/24133 Distances to prototypes
    • G06F 18/24137 Distances to cluster centroïds
    • G06F 18/2414 Smoothing the distance, e.g. radial basis function networks [RBFN]
    • G06F 18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G06N 3/047 Probabilistic or stochastic networks
    • G06N 3/08 Learning methods
    • G06N 3/082 Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • G06N 5/00 Computing arrangements using knowledge-based models
    • G06N 5/01 Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound

Abstract

The invention discloses a gearbox incremental fault diagnosis method and system based on lifelong learning, comprising the following steps. S101: collecting gearbox vibration data to construct an incremental health-state data set, and dividing it into fault diagnosis tasks of different stages. S102: learning the initial-stage fault diagnosis task with an original ResNet-32 network to construct an initial-stage diagnosis model. S103: initializing a ResNet-32 dual-branch aggregation network with the initial-stage diagnosis model, and increasing the number of classification-layer neurons according to the number of newly added fault classes. S104: training the current-stage diagnosis model on the selected exemplars together with the current-stage fault diagnosis task data, and selecting exemplars of the current-stage task data after training is completed. S105: repeating steps S103-S104 in the subsequent incremental stages to obtain the final fault diagnosis model, which is then used for fault diagnosis. The invention aims to solve the problem that existing fault diagnosis models based on deep learning and transfer learning cannot diagnose unexpected faults that arise in actual gearbox operation.

Description

Gearbox incremental fault diagnosis method and system based on lifelong learning
Technical Field
The invention relates to the technical field of mechanical fault diagnosis, and in particular to a gearbox incremental fault diagnosis method and system based on lifelong learning.
Background
With the rapid development of modern industry, the precision and importance of rotating machinery keep increasing. Rotating machines have become one of the most widely used classes of industrial equipment, and the demands on their reliability keep growing. They serve in many fields such as aviation, marine, machinery, chemical, energy and electric power; their service conditions grow increasingly complex, and performance degradation or even failure inevitably occurs during operation, causing huge economic losses, ever higher operation and maintenance costs, and even catastrophic casualties and irrecoverable harm to the environment and society. Studying health-state monitoring and fault diagnosis methods for rotating machinery is therefore of great significance for guaranteeing the safe and reliable operation of mechanical equipment, preventing failures of critical equipment, and avoiding heavy economic losses and disastrous accidents.
The requirements on the speed, load and degree of automation of modern rotating machinery keep rising, making its dynamic signals ever more complex. Modern condition-monitoring technology can acquire multi-point, whole-service-life data from complex equipment, yielding massive data volumes, which in turn makes processing these dynamic signals and extracting the health-state information they carry very difficult. Traditional fault diagnosis methods include extracting fault characteristic frequencies from vibration signals, short-time Fourier transform, empirical mode decomposition, sparse representation, and so on. These methods are mature, but for the condition signals of present-day machinery, signal-processing-based approaches cannot handle fault data that are sparse, strongly contaminated by interference, and diverse under variable operating conditions within huge amounts of signal data.
In recent years, with the rapid development of artificial intelligence and machine learning, more and more machine-learning-based intelligent fault diagnosis methods for rotating machinery have been proposed. Machine-learning-based fault diagnosis generally comprises signal acquisition, feature extraction, fault identification and prediction. These methods greatly simplify the diagnosis process and improve its efficiency, but most rely on shallow networks with simple structures and few layers, so their effectiveness depends on the quality of the features extracted in pre-processing, and their capacity is limited when facing large numbers of structurally complex equipment condition signals.
In recent years, many researchers have exploited the excellent adaptive feature learning and extraction capability of deep learning to overcome the difficulty shallow models have in characterizing the complex mapping between signals and health conditions, achieving good results. However, these methods rest on two assumptions: the training data follow the same distribution as the test data, and the training data are sufficient. In actual engineering, the operating conditions of mechanical equipment vary and faults occur unexpectedly, so the available samples rarely satisfy these two assumptions, which directly degrades the fault diagnosis results. With the rapid development of transfer learning and its ability to mine and transfer knowledge across fields and across distributions, transfer-learning solutions have emerged in mechanical fault diagnosis for problems with limited labelled samples (very few or none) or variable operating conditions. However, transfer learning can only serve the fault diagnosis of a single target task: one transfer is completed for a given source domain and target domain, and because of the diversity of machinery faults and operating conditions, the model generalizes poorly and lacks universality when facing a new task. Moreover, transfer learning does not accumulate knowledge: when the equipment-state recognition task returns to the working condition of the source-domain data, performance is poor, which conflicts with actual engineering requirements.
In practice, owing to complex and varying operating conditions, machines often develop unexpected faults, so the number of fault types increases and deep diagnosis models or deep transfer diagnosis models trained on pre-collected, incomplete fault data become invalid; the models must then be retrained to recognize the new fault types. However, training a deep model directly on the new data causes its recognition of the old fault classes to drop off a cliff, a phenomenon known as catastrophic forgetting. Catastrophic forgetting has long been an important problem in deep learning; in fault diagnosis it is likewise necessary to study how to overcome the catastrophic forgetting of deep diagnosis models caused by unexpected faults, so as to build lifelong fault diagnosis models with higher reliability, generalization and universality.
Disclosure of Invention
The invention aims to provide a gearbox incremental fault diagnosis method and system based on lifelong learning, to solve the problem that existing fault diagnosis models based on deep learning and transfer learning cannot diagnose unexpected faults that arise in actual gearbox operation.
To solve the above technical problem, the invention provides a gearbox incremental fault diagnosis method based on lifelong learning, comprising the following steps:
S101: collecting gearbox vibration data to construct an incremental health-state data set, and dividing it into fault diagnosis tasks of different stages;
S102: learning the initial-stage fault diagnosis task with an original ResNet-32 network to construct an initial-stage diagnosis model, and selecting exemplars of the initial-stage fault diagnosis task data;
S103: initializing a ResNet-32 dual-branch aggregation network with the initial-stage diagnosis model, the network adopting a cosine-normalized classifier, and increasing the number of classification-layer neurons according to the number of newly added fault classes;
S104: training the current-stage diagnosis model on the selected exemplars together with the current-stage fault diagnosis task data, and selecting exemplars of the current-stage task data after training is completed;
during training, the transfer capabilities of the different residual-block layers are represented by aggregation weights, a knowledge-distillation loss function is combined to reduce the discrepancy between the new-stage and old-stage diagnosis models on the old-stage fault diagnosis task data, and the aggregation weights and model parameters are optimized with a bilevel optimization scheme;
S105: repeating steps S103-S104 in the subsequent incremental stages to obtain the final fault diagnosis model and performing fault diagnosis.
As a further improvement of the invention, step S101 specifically comprises the following steps:
acquiring gearbox vibration signals with an acceleration sensor to construct an incremental health-state data set D;
if there are N+1 fault diagnosis tasks in total, there are N+1 learning stages, i.e. the fault diagnosis task 0 of the initial stage and N incremental stages, during which the number of diagnosis tasks gradually increases;
in the n-th stage, the training data of task n are
D_n = {(x_i^{[n]}, y_i^{[n]})}_{i=1}^{P_n},
where P_n is the number of fault data samples of task n;
if J_n denotes the number of old fault classes C_{0:n-1} = {C_0, C_1, …, C_{n-1}} and K_n the number of new fault classes C_n, then J_{n+1} = K_n + J_n; x_i^{[n]} denotes the i-th sample and y_i^{[n]} ∈ C_{0:n} its label.
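The staged data organization of step S101 can be sketched as follows. This is a minimal NumPy illustration; the array shapes, class labels and task grouping are illustrative assumptions, not values from the patent.

```python
import numpy as np

def split_incremental_tasks(X, y, class_groups):
    """Split a labelled vibration data set into staged fault diagnosis tasks.

    X            : (num_samples, signal_length) array of vibration samples
    y            : (num_samples,) integer fault-class labels
    class_groups : list of class-label lists; entry n holds the new fault
                   classes C_n introduced at stage n (entry 0 is task 0)
    Returns one (X_n, y_n) training set per stage.
    """
    tasks = []
    for classes in class_groups:
        mask = np.isin(y, classes)
        tasks.append((X[mask], y[mask]))
    return tasks

# Toy data: 4 fault classes, 20 samples each; task 0 learns classes {0, 1},
# incremental stage 1 adds class 2, incremental stage 2 adds class 3.
rng = np.random.default_rng(0)
X = rng.normal(size=(80, 16))
y = np.repeat(np.arange(4), 20)
tasks = split_incremental_tasks(X, y, [[0, 1], [2], [3]])
```

Each stage thus sees only the samples of its newly introduced fault classes; old classes reappear only through the stored exemplars described later.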
As a further improvement of the invention, step S102 specifically comprises the following steps:
training the original ResNet-32 on the data of task 0,
D_0 = {(x_i^{[0]}, y_i^{[0]})}_{i=1}^{P_0},
to learn the fault classes C_0 and obtain the initial-stage diagnosis model Θ_0; the loss function of the initial-stage diagnosis model is the classification cross-entropy loss
L_CE = −Σ_{c∈C_0} δ_{c=y} log p_c(x),
where δ is the true (one-hot) label;
after training, a certain number of exemplars ε_0 are selected with a herding algorithm using the feature extractor F_0 before the classification layer.
As a further improvement of the invention, selecting a certain number of exemplars with the herding algorithm using the feature extractor F_0 before the classification layer comprises:
letting {x_i^c}_{i=1}^{P_c} denote the training samples of fault class c, the class-c feature mean is
μ_c = (1/P_c) Σ_{i=1}^{P_c} F_0(x_i^c),
where P_c is the number of training samples of class c;
if the number of exemplars selected for class c is t, each exemplar is computed by
e_j = argmin_x ‖μ_c − (1/j)[F_0(x) + Σ_{k=1}^{j−1} F_0(e_k)]‖, j = 1, …, t,
giving ε = (e_1, e_2, …, e_t).
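The selection rule above is a herding-style greedy choice: each new exemplar keeps the running mean of selected features as close as possible to the class mean. A minimal NumPy sketch, with the feature extractor F_0 replaced by precomputed feature vectors (an illustrative assumption):

```python
import numpy as np

def herding_select(features, t):
    """Select t exemplars whose running feature mean best tracks the class mean.

    features : (P_c, d) array of feature vectors F_0(x_i^c) of one fault class
    t        : number of exemplars to select
    Returns the indices of the selected exemplars, in selection order.
    """
    mu = features.mean(axis=0)           # class mean mu_c
    chosen = []                          # selected exemplar indices e_1, ..., e_t
    acc = np.zeros_like(mu)              # running sum of selected feature vectors
    for j in range(1, t + 1):
        # e_j = argmin_x || mu_c - (1/j) * (F_0(x) + sum_{k<j} F_0(e_k)) ||
        dists = np.linalg.norm(mu - (features + acc) / j, axis=1)
        dists[chosen] = np.inf           # never select the same sample twice
        idx = int(np.argmin(dists))
        chosen.append(idx)
        acc += features[idx]
    return chosen

# Four 2-D feature vectors of one class; select two exemplars.
F = np.array([[0.0, 0.0], [2.0, 0.0], [1.0, 0.0], [5.0, 5.0]])
sel = herding_select(F, 2)
```

The first exemplar is simply the sample nearest to the class mean; later picks correct the running mean toward it.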
As a further improvement of the invention, step S103 specifically comprises the following steps:
replacing the original ResNet-32 network with a ResNet-32 dual-branch aggregation network comprising a dynamic branch and a steady-state branch;
the dynamic branch is conventional parameter-level fine-tuning: the initial-stage diagnosis model initializes the dynamic branch of the incremental stage, and its parameters α are trained and fine-tuned by the task of each stage;
the steady-state branch is neuron-level fine-tuning after freezing the initial-stage network parameters: each neuron is given a weight β that is trained and fine-tuned by the task of each stage. If the k-th convolutional layer of the steady-state branch contains Q neurons whose weights W_k are frozen by the initial model, with neuron weights β_k = (β_k^1, …, β_k^Q), and the input of the k-th convolutional layer is x_{k−1}, its output is
x_k = (W_k ⊙ β_k) x_{k−1},
where ⊙ denotes the Hadamard product;
the cosine-normalized classifier of incremental stage n computes the predicted probability that input x belongs to class c as
p_c(x) = exp(η ⟨θ̄_c, h̄_n⟩) / Σ_j exp(η ⟨θ̄_j, h̄_n⟩),
where θ_n are the fully connected classification-layer parameters of incremental stage n, h_n is the feature extracted at incremental stage n, v̄ = v/‖v‖_2 denotes l_2-normalization, and η is a learnable scaling parameter compensating for the cosine-similarity values being confined to [−1, 1];
as fault classes increase, the number of classification-layer neurons increases to match the number of fault classes.
As a further improvement of the invention, representing the transfer capabilities of the different residual-block layers by aggregation weights comprises:
training the dual-branch aggregation network with the exemplars ε_0 retained in the initial stage together with the current-stage task data, and assigning adaptive aggregation weights ω and ξ to the dynamic and steady-state residual blocks of each residual-block layer according to their different transfer capabilities;
the fault training data x^{[0]} are passed through the dual-branch aggregation network for feature extraction; at the m-th residual-block layer, the dynamic residual block extracts the features R_dyn^{[m]}(x^{[m−1]}) and the steady-state residual block extracts the features R_st^{[m]}(x^{[m−1]}), and the aggregated feature of the m-th residual-block layer is
x^{[m]} = ω^{[m]} R_dyn^{[m]}(x^{[m−1]}) + ξ^{[m]} R_st^{[m]}(x^{[m−1]}),
where ω^{[m]} + ξ^{[m]} = 1.
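The weighted aggregation of one residual-block layer can be sketched as follows. The toy residual blocks stand in for the actual ResNet-32 branches and are illustrative only:

```python
import numpy as np

def aggregate_block(x, dyn_block, steady_block, omega, xi):
    """Aggregate the outputs of one dual-branch residual-block layer.

    dyn_block, steady_block : callables standing in for the dynamic and
                              steady-state residual blocks of layer m
    omega, xi               : adaptive aggregation weights of layer m,
                              constrained so that omega + xi = 1
    """
    return omega * dyn_block(x) + xi * steady_block(x)

# Toy stand-ins for the two ResNet-32 branches.
dyn = lambda v: v + 1.0      # plastic branch: adapts quickly to the new task
steady = lambda v: v + 0.1   # stable branch: frozen weights, neuron-level scaling
x = np.array([1.0, 2.0])
out = aggregate_block(x, dyn, steady, omega=0.7, xi=0.3)
```

A layer whose dynamic branch transfers well receives a larger ω; a layer whose old-task knowledge should dominate receives a larger ξ.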
As a further improvement of the invention, the loss function of the initial stage is the classification cross-entropy loss
L_CE^{[0]} = −Σ_{c∈C_0} δ_{c=y} log p_c(x);
the loss function of the incremental stage combines the classification cross-entropy loss
L_CE^{[n]} = −Σ_{c∈C_{0:n}} δ_{c=y} log p_c(x)
and the knowledge-distillation loss
L_KD = −Σ_{c∈C_{0:n−1}} π_c^{[n−1]}(x) log π_c^{[n]}(x),
where π_c^{[n−1]}(x) and π_c^{[n]}(x) are the temperature-softened softmax outputs on the old fault classes of the old model (the soft labels) and of the new model, respectively,
π_c(x) = exp(z_c(x)/T) / Σ_{j∈C_{0:n−1}} exp(z_j(x)/T),
and the temperature T is typically greater than 1.
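A minimal NumPy sketch of the temperature-softened knowledge-distillation loss above; the concrete logit values and T = 2 are illustrative assumptions:

```python
import numpy as np

def softmax_T(z, T):
    """Temperature-softened softmax: pi_c = exp(z_c / T) / sum_j exp(z_j / T)."""
    z = z / T
    z = z - z.max()          # numerical stability
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(old_logits, new_logits, T=2.0):
    """Knowledge-distillation loss over the old fault classes.

    old_logits : logits of the frozen previous-stage model (old classes only)
    new_logits : logits of the current-stage model restricted to the same classes
    """
    pi_old = softmax_T(old_logits, T)    # soft labels from the old model
    pi_new = softmax_T(new_logits, T)
    return float(-(pi_old * np.log(pi_new + 1e-12)).sum())

z_old = np.array([1.0, 2.0, 3.0])
l_same = distillation_loss(z_old, z_old)          # lower bound: entropy of the soft labels
l_diff = distillation_loss(z_old, z_old[::-1])    # a mismatched new model is penalised more
```

The loss is minimized exactly when the new model reproduces the old model's softened output on the old classes, which is how the discrepancy between new-stage and old-stage models on old-task data is reduced.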
As a further improvement of the invention, the loss function of the incremental stage is
L^{[n]} = (1 − λ) L_CE^{[n]} + λ L_KD,
where 0 < λ ≤ 1;
the model parameters Θ_0 of the initial stage are updated by conventional gradient descent,
Θ_0 ← Θ_0 − γ ∂L_CE^{[0]}/∂Θ_0;
the parameters to be optimized in the incremental stage are the model parameters Θ_n and the aggregation weights ω and ξ; updating the aggregation weights ω and ξ requires the model parameters Θ_n to be fixed, so a bilevel optimization scheme is adopted;
the bilevel optimization scheme is divided into the upper-level problem
min_{ω,ξ} L^{[n]}(Θ_n*; ω, ξ; D̃_n)
and the lower-level problem
Θ_n* = argmin_{Θ_n} L^{[n]}(Θ_n; ω, ξ; D_n ∪ ε_{0:n−1});
the model parameters are updated by
Θ_n ← Θ_n − γ_1 ∂L^{[n]}/∂Θ_n,
where γ_1 is the learning rate of the lower-level problem;
a balanced data set D̃_n is established by randomly sampling the data set D_n and combining the samples with the stored exemplars, and the aggregation weights are updated by
(ω, ξ) ← (ω, ξ) − γ_2 ∂L^{[n]}/∂(ω, ξ),
where γ_2 is the learning rate of the upper-level problem.
As a further improvement of the invention, after each incremental training is completed, the performance of the model on the new and old tasks is tested with the test data of all learned tasks to verify the model's ability to learn without forgetting, comprising the following steps:
the model Θ_n obtained by training in incremental stage n is tested on test data covering all learned fault classes C_{0:n}, so as to verify that the model has the ability to learn without forgetting.
The gearbox incremental fault diagnosis system based on lifelong learning performs gearbox fault diagnosis by adopting the above gearbox incremental fault diagnosis method based on lifelong learning.
The invention has the following beneficial effects. In the proposed gearbox fault diagnosis method, gearbox vibration signals are first collected with an acceleration sensor to construct an incremental health-state data set, which is divided into diagnosis tasks of different stages, simulating the growth of diagnosis tasks caused by the new fault types that unexpected faults introduce in actual scenarios.
In the initial stage, an original ResNet-32 learns the initial gearbox-bearing fault diagnosis task, simulating the incomplete fault diagnosis model trained on pre-collected fault data in a real scenario; after training, a certain number of exemplars are selected from the initial task data by a herding algorithm and stored. In the subsequent incremental stages, the original ResNet-32 is replaced by an improved ResNet-32-based dual-branch aggregation network as the incremental-stage feature extractor, balancing the plasticity (knowledge transfer) and stability (knowledge accumulation) of the model; at the same time, the fully connected classifier is replaced by a cosine-normalized classifier to avoid the model's classification bias, and the number of classification-layer neurons is increased according to the number of newly added fault classes.
The model of the first incremental stage is trained on the exemplars stored in the initial stage together with the diagnosis task data of that stage, reawakening the model's memory of old knowledge and overcoming the catastrophic forgetting of the deep learning model. The loss function of the incremental stage comprises a classification cross-entropy loss and a knowledge-distillation loss; the knowledge-distillation loss reduces the discrepancy between the new-stage and old-stage models on the old task data, further preventing catastrophic forgetting.
The transfer capabilities of the different residual-block layers are represented by aggregation weights, which balance the steady-state and dynamic branches and hence the plasticity and stability of the model. Since the optimization of the aggregation weights and that of the model parameters constrain each other, a bilevel optimization scheme is adopted to update both. After the diagnosis task of each incremental stage has been trained, a certain number of exemplars of that stage's data are again selected and stored for training in the next incremental stage.
Overall, the invention constructs a gearbox incremental fault diagnosis method based on lifelong learning that adopts a dual-branch aggregation network combined with knowledge distillation and exemplars, solves the catastrophic forgetting of deep learning diagnosis models, and is suitable for continuous gearbox fault diagnosis as new unexpected faults keep appearing.
Drawings
FIG. 1 is a flow chart of an embodiment of the method of the invention;
FIG. 2 is a diagram of the test stand used to generate the gearbox data of the invention;
FIG. 3 is a sectional view of the gearbox faults of the invention;
FIG. 4 is the dual-branch aggregation network structure in the model of the invention;
FIG. 5 is a graph of the diagnosis accuracy of two fine-tuning methods applied to a deep model without lifelong learning and of the method of the invention.
Detailed Description
The invention is further described below with reference to the accompanying drawings and specific examples, which are not intended to be limiting, so that those skilled in the art can better understand and practice the invention.
Referring to FIG. 1, the invention provides a gearbox incremental fault diagnosis method based on lifelong learning, comprising the following steps: S101: collecting gearbox vibration data to construct an incremental health-state data set, and dividing it into fault diagnosis tasks of different stages;
S102: learning the initial-stage fault diagnosis task with an original ResNet-32 network to construct an initial-stage diagnosis model, and selecting exemplars of the initial-stage fault diagnosis task data;
S103: initializing a ResNet-32 dual-branch aggregation network with the initial-stage diagnosis model, the network adopting a cosine-normalized classifier, and increasing the number of classification-layer neurons according to the number of newly added fault classes;
S104: training the current-stage diagnosis model on the selected exemplars together with the current-stage fault diagnosis task data, and selecting exemplars of the current-stage task data after training is completed;
during training, the transfer capabilities of the different residual-block layers are represented by aggregation weights, a knowledge-distillation loss function is combined to reduce the discrepancy between the new-stage and old-stage diagnosis models on the old-stage fault diagnosis task data, and the aggregation weights and model parameters are optimized with a bilevel optimization scheme;
S105: repeating steps S103-S104 in the subsequent incremental stages to obtain the final fault diagnosis model and performing fault diagnosis.
The invention adopts a lifelong learning method to construct a diagnosis model capable of continuous knowledge transfer and accumulation, so as to diagnose faults whose types keep increasing under complex operating conditions.
Further, the performance of the model on the new and old tasks is tested with the test data of all learned tasks, verifying the model's ability to learn without forgetting.
Examples
This embodiment describes the method in detail in connection with experimental data actually acquired.
The test stand shown in FIG. 2 was used to collect the required experimental data and construct the incremental health-state data set. To obtain unexpected gearbox faults with compound bearing-gear faults as shown in FIG. 3, 0.4 mm cracks were cut into the inner ring, outer ring and rollers of the bearing by wire cutting to simulate local bearing faults, and half a tooth was cut from the driving gear by electrical discharge machining to simulate a local gear fault.
In the experiment, the motor speed was 1496 r/min and the sampling frequency was set to 25.6 kHz. The gearbox incremental data set was constructed from a total of 11 different health states consisting of combinations of gear and bearing conditions, as listed in Table 1. The gear has two health states, normal and faulty; the bearing has four basic health states (normal, inner-ring fault, roller fault and outer-ring fault) plus three compound bearing faults formed by pairwise combination of the three basic fault states.
The diagnosis tasks of the different stages are then divided according to the actual scenario: gearbox vibration signals are acquired with an acceleration sensor to construct the incremental health-state data set D. Assuming there are N+1 gearbox fault diagnosis tasks in total, there are N+1 learning stages, i.e. the stage learning diagnosis task 0 and N incremental stages, during which the number of diagnosis tasks gradually increases. In the n-th stage, the training data of task n are
D_n = {(x_i^{[n]}, y_i^{[n]})}_{i=1}^{P_n},
where P_n is the number of fault data samples of task n. Letting J_n denote the number of old fault classes C_{0:n-1} = {C_0, C_1, …, C_{n-1}} and K_n the number of new fault classes C_n, then J_{n+1} = K_n + J_n; x_i^{[n]} denotes the i-th sample and y_i^{[n]} ∈ C_{0:n} its label.
As listed in Table 1, in the actual scenario, gearbox health-state data obtained through experiments serve as the training samples of task 0 to train the model in the initial stage. These health states are generally common, hence numerous and easy to learn, so the seven gearbox health states in which the gear is normal and at most the bearing is faulty are taken as the fault classes learned by task 0. To simulate the fault-class growth caused by unexpected faults in real scenarios, each task learned in each subsequent incremental stage contains one gear-bearing compound fault type. Each fault type has 200 training samples and 100 test samples. Table 1: health states and incremental task settings of the gearbox:
[Table 1 appears as an image in the original publication; it lists the eleven gear-bearing health-state combinations and their assignment to task 0 and the incremental tasks.]
Accordingly, step S102 specifically comprises the following steps:
S102.1: training the original ResNet-32 on the data of task 0,
D_0 = {(x_i^{[0]}, y_i^{[0]})}_{i=1}^{P_0},
to learn the fault classes C_0 and obtain the initial model Θ_0; the detailed structure of ResNet-32 is shown in Table 2. The loss function of the model is the classification cross-entropy loss
L_CE = −Σ_{c∈C_0} δ_{c=y} log p_c(x),
where δ is the true label. The model parameters Θ_0 of the initial stage are updated by conventional gradient descent, Θ_0 ← Θ_0 − γ ∂L_CE/∂Θ_0.
Table 2: structural parameters of the backbone network ResNet-32:
[Table 2 appears as an image in the original publication; it lists the layer-by-layer structural parameters of the ResNet-32 backbone.]
S102.2: after training, a certain number of exemplars ε_0 are selected with the herding algorithm using the feature extractor F_0 before the classification layer. Letting {x_i^c}_{i=1}^{P_c} denote the training samples of fault class c, the class-c feature mean is
μ_c = (1/P_c) Σ_{i=1}^{P_c} F_0(x_i^c),
where P_c is the number of training samples of class c.
There are two schemes for the number of selected exemplars: either the number of exemplars per fault class is fixed at 5, or the total storage budget is fixed at 55. If the number of exemplars selected for class c is t, each exemplar is computed by
e_j = argmin_x ‖μ_c − (1/j)[F_0(x) + Σ_{k=1}^{j−1} F_0(e_k)]‖,
giving ε = (e_1, e_2, …, e_t).
The step S103 specifically includes the following steps:
S103.1: The original ResNet-32 is replaced with a dual-branch aggregation network, whose structure is shown in FIG. 4. The dual-branch aggregation network comprises a dynamic branch and a steady-state branch.

The dynamic branch performs conventional parameter-level fine-tuning: the initial model is used to initialize the dynamic branch of the incremental stage, and its parameters $\alpha$ are fine-tuned by training on the task of each stage.

The steady-state branch performs neuron-level fine-tuning after the initial-stage network parameters are frozen: each neuron is assigned a weight $\beta$ that is fine-tuned by training on the task of each stage. Suppose the $k$-th convolutional layer of the steady-state branch contains $Q$ neurons whose weights are the frozen initial-model parameters $W_k$, with neuron weights $\beta_k = (\beta_1, \beta_2, \dots, \beta_Q)$. The input of the $k$-th convolutional layer is $x_{k-1}$ and its output is $x_k = (W_k \odot \beta_k)\, x_{k-1}$, where $\odot$ denotes the Hadamard product. Because the learnable parameters $\beta$ of the steady-state branch are far fewer than $\alpha$, the steady-state residual blocks adapt slowly to the knowledge of new tasks while fully retaining old knowledge.
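A minimal NumPy sketch of the steady-state branch's neuron-level reweighting; a fully connected layer stands in for the convolutional layer, and all names here are illustrative assumptions.

```python
import numpy as np

def steady_state_forward(x_prev, W_frozen, beta):
    """Steady-state branch: x_k = (W_k Hadamard beta_k) x_{k-1}.
    W_frozen (Q x d) is never updated; beta holds one learnable scalar
    per neuron (row), so only Q parameters are trained."""
    return (W_frozen * beta[:, None]) @ x_prev

rng = np.random.default_rng(0)
W = rng.standard_normal((3, 4))   # frozen initial-stage weights, Q = 3 neurons
x = rng.standard_normal(4)        # input features x_{k-1}
y = steady_state_forward(x, W, np.ones(3))   # beta = 1 reproduces the frozen layer
```

Setting every $\beta$ to 1 reproduces the frozen initial-stage layer exactly, which is why the branch starts from the old knowledge and drifts only slowly.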
S103.2: The classifier of the initial model is a conventional fully connected classification layer, which computes the predicted probability that the input $x$ belongs to class $c$ by

$$p_c(x) = \frac{\exp\big(\theta_0^{c\top} h_0(x)\big)}{\sum_j \exp\big(\theta_0^{j\top} h_0(x)\big)},$$

where $\theta_0$ are the parameters of the fully connected classification layer in the initial stage and $h_0$ are the features extracted in the initial stage.

The cosine-normalized classifier of incremental stage $n$ computes the predicted probability that the input $x$ belongs to class $c$ by

$$p_c(x) = \frac{\exp\big(\eta\,\langle \bar{\theta}_n^c, \bar{h}_n(x) \rangle\big)}{\sum_j \exp\big(\eta\,\langle \bar{\theta}_n^j, \bar{h}_n(x) \rangle\big)},$$

where $\theta_n$ are the fully connected classification-layer parameters of incremental stage $n$, $h_n$ are the features extracted in incremental stage $n$, $\bar{v} = v/\|v\|_2$ denotes $l_2$-normalization, and $\eta$ is a learnable scaling parameter that compensates for the cosine similarity being confined to the range $[-1, 1]$. The cosine-normalized classifier avoids the classification bias between new and old classes.
As the fault classes increase, the number of classification-layer neurons is increased to match the number of fault classes.
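The cosine-normalized classifier can be sketched as follows in NumPy; the function name and default scale $\eta = 10$ are illustrative assumptions.

```python
import numpy as np

def cosine_classifier(h, theta, eta=10.0):
    """Cosine-normalized classifier: softmax over eta * <theta_c_bar, h_bar>.
    l2-normalizing both the feature h and every class vector theta_c keeps
    each logit in [-eta, eta], suppressing the magnitude bias between
    new and old classes."""
    h_bar = h / np.linalg.norm(h)
    theta_bar = theta / np.linalg.norm(theta, axis=1, keepdims=True)
    logits = eta * theta_bar @ h_bar
    z = np.exp(logits - logits.max())      # numerically stable softmax
    return z / z.sum()
```

Because both arguments are normalized, rescaling the feature vector or the class weights leaves the predicted probabilities unchanged, which is the bias-avoidance property the text describes.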
The step S104 specifically includes the following steps:
S104.1: The dual-branch aggregation network is trained jointly on the exemplars $\varepsilon_0$ retained in the initial stage and the current-stage task data $D_1$; adaptive aggregation weights $\omega$ and $\xi$ are assigned to the dynamic residual block and the steady-state residual block of each residual-block layer to reflect their different transfer capacities, as shown in FIG. 4.

The fault training data $x^{[0]}$ are passed through the dual-branch aggregation network for feature extraction. At the $m$-th residual-block layer, the dynamic residual block extracts the features $x_{dyn}^{[m]}$ and the steady-state residual block extracts the features $x_{sta}^{[m]}$. The aggregated feature of the $m$-th residual-block layer is

$$x^{[m]} = \omega^{[m]} x_{dyn}^{[m]} + \xi^{[m]} x_{sta}^{[m]}, \quad \text{where } \omega^{[m]} + \xi^{[m]} = 1.$$
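The per-layer aggregation is a convex combination of the two branch outputs; a minimal sketch (the function name is an illustrative assumption):

```python
import numpy as np

def aggregate_layer(x_dyn, x_sta, omega):
    """Aggregated feature of one residual-block layer:
    x[m] = omega[m] * x_dyn[m] + xi[m] * x_sta[m], with omega + xi = 1,
    so the layer interpolates between the plastic (dynamic) branch and
    the stable (steady-state) branch."""
    xi = 1.0 - omega
    return omega * x_dyn + xi * x_sta
```

The scalar `omega` plays the role of the learnable per-layer weight $\omega^{[m]}$; $\omega^{[m]} = 1$ uses only the dynamic branch and $\omega^{[m]} = 0$ only the steady-state branch.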
S104.2: the loss function of the increment stage is classified cross entropy loss
Figure BDA0003452726960000121
And knowledge distillation loss->
Figure BDA0003452726960000122
wherein ,/>
Figure BDA0003452726960000123
Figure BDA0003452726960000124
and />
Figure BDA0003452726960000125
The temperature T is typically greater than 1 for soft labels with old models in the old failure class and hard labels with new models in the old failure class, respectively. Narrowing new models in old failure class C by knowledge distillation loss 0:n-1 The difference between the expression and the old model is approximately constrained to the similarity distribution of the old class in the old model. The loss function of the delta phase is +.>
Figure BDA0003452726960000126
Wherein lambda is more than 0 and less than or equal to 1.
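The knowledge distillation term can be sketched in NumPy. This is a generic temperature-softened distillation sketch; the exact way the patent combines $\mathcal{L}_{CE}$ and $\mathcal{L}_{KD}$ is only constrained to $0 < \lambda \le 1$, so the weighting below is an assumption.

```python
import numpy as np

def softmax(z):
    z = z - z.max()                        # numerically stable softmax
    e = np.exp(z)
    return e / e.sum()

def kd_loss(old_logits, new_logits, T=2.0):
    """Knowledge distillation over the old classes:
    L_KD = -sum_c pihat_c * log(pi_c), where pihat and pi are the
    temperature-T softened distributions of the old and new model."""
    pi_hat = softmax(np.asarray(old_logits) / T)   # old-model soft labels
    pi = softmax(np.asarray(new_logits) / T)       # new-model softened outputs
    return -np.sum(pi_hat * np.log(pi))
```

By Gibbs' inequality the loss is minimized when the new model reproduces the old model's softened distribution on the old classes, which is exactly the anti-forgetting constraint described above.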
S104.2: the non-optimized parameters of the increment stage have model parameters theta n And aggregate weights ω and ζ for aggregationUpdating the combined weights ω and ζ requires fixing the model parameters Θ n Adopting a double-layer optimization scheme;
the double-layer optimization scheme is divided into upper layer problems
Figure BDA0003452726960000127
And lower layer problems
Figure BDA0003452726960000128
By->
Figure BDA0003452726960000129
Updating model parameters Θ n, wherein γ1 Is the lower problem learning rate;
updating aggregate weights in upper layer problems to balance dynamic and steady state residual blocks, using a random sample data set D n Obtaining
Figure BDA00034527269600001210
Establishing balance data->
Figure BDA00034527269600001211
By passing through
Figure BDA00034527269600001212
Updating the aggregate weights, wherein γ 2 Is the learning rate of the upper layer problem.
The step S105 specifically includes the following steps:
the model Θ obtained by training in the increment stage n n All learned faults C 0:n The test data contains all learned fault classes to verify that the model has the ability to learn without forgetting. After 4 incremental task learning is completed, two fine-tuning and confusion matrices for the method of the present invention under two typical case number strategies are shown in FIG. 5. The two fine-tuned confusion matrixes reflect the disastrous forgetting of the deep learning diagnosis model without lifelong learning, and the method can effectively solve the disastrous forgetting and realize the continuous fault diagnosis of the gear box with new unexpected faults.
In summary, the invention designs a method for realizing incremental gearbox fault diagnosis based on lifelong learning. Compared with traditional deep-learning methods, the method overcomes catastrophic forgetting and is better suited to practical industrial application scenarios.

The invention also provides a lifelong-learning-based gearbox incremental fault diagnosis system, which performs gearbox fault diagnosis by adopting the above lifelong-learning-based gearbox incremental fault diagnosis method.
The principles of the system are similar to those described above and are not repeated here. It is noted that the present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above-described embodiments are merely preferred embodiments for fully explaining the present invention, and the scope of the present invention is not limited thereto. Equivalent substitutions and modifications will occur to those skilled in the art based on the present invention, and are intended to be within the scope of the present invention. The protection scope of the invention is subject to the claims.

Claims (6)

1. A gearbox incremental fault diagnosis method based on lifelong learning, characterized in that the method comprises the following steps:
S101: collecting gearbox vibration data to construct an incremental health-state data set, and dividing it into fault diagnosis tasks of different stages;
S102: learning the fault diagnosis task of the initial stage with an original ResNet-32 network to construct an initial-stage diagnosis model, and selecting exemplars of the initial-stage fault diagnosis task data;
S103: initializing a ResNet-32 dual-branch aggregation network with the initial-stage diagnosis model, wherein the ResNet-32 dual-branch aggregation network adopts a cosine-normalized classifier and the number of classification-layer neurons is increased according to the number of newly added fault classes;
S104: training the current-stage diagnosis model jointly on the selected exemplars and the current-stage fault diagnosis task data, and selecting exemplars of the current-stage fault diagnosis task data after training is completed;
wherein, during training, aggregation weights are used to represent the transfer capacities of different residual-block layers, a knowledge distillation loss function is combined to reduce the difference between the new and old stage diagnosis models on the old-stage fault diagnosis task data, and the aggregation weights and model parameters are optimized with a bilevel optimization scheme;
S105: repeating steps S103-S104 in subsequent incremental stages to obtain the final fault diagnosis model, and performing fault diagnosis;
the step S103 specifically includes the following steps:
replacing the original ResNet-32 network with a ResNet-32 dual-branch aggregation network, wherein the ResNet-32 dual-branch aggregation network comprises a dynamic branch and a steady-state branch;
the dynamic branch performs conventional parameter-level fine-tuning, that is, the initial-stage diagnosis model is used to initialize the dynamic branch of the incremental stage, and the parameters are fine-tuned by training on the task of each stage;
the steady-state branch performs neuron-level fine-tuning after the initial-stage network parameters are frozen, that is, each neuron is assigned a weight $\beta$ that is fine-tuned by training on the task of each stage; if the $k$-th convolutional layer of the steady-state branch comprises $Q$ neurons, the neuron weights are the frozen initial-model parameters $W_k$, the input of the $k$-th convolutional layer is $x_{k-1}$, and its output is $x_k = (W_k \odot \beta_k)\, x_{k-1}$, where $\odot$ denotes the Hadamard product;
the cosine-normalized classifier of incremental stage $n$ computes the predicted probability that the input $x$ belongs to class $c$ by

$$p_c(x) = \frac{\exp\big(\eta\,\langle \bar{\theta}_n^c, \bar{h}_n(x)\rangle\big)}{\sum_{j=1}^{J_{n+1}} \exp\big(\eta\,\langle \bar{\theta}_n^j, \bar{h}_n(x)\rangle\big)},$$

where $\theta_n$ are the fully connected classification-layer parameters of incremental stage $n$, $h_n$ are the features extracted in incremental stage $n$, $\bar{v} = v/\|v\|_2$ denotes $l_2$-normalization, $\eta$ is a learnable scaling parameter that compensates for the cosine similarity being confined to the range $[-1,1]$, $J_n$ is the number of old fault classes $C_{0:n-1} = \{C_0, C_1, \dots, C_{n-1}\}$, $K_n$ is the number of new fault classes $C_n$, and $J_{n+1} = K_n + J_n$;
as fault classes increase, the number of classification-layer neurons is increased to match the number of fault classes;
representing the transfer capacities of different residual-block layers with the aggregation weights comprises the following steps:
training the dual-branch aggregation network on the exemplars $\varepsilon_0$ retained in the initial stage and the stage-one task data $D_1$, and assigning adaptive aggregation weights $\omega$ and $\xi$ to the dynamic residual block and the steady-state residual block of each residual-block layer to reflect their different transfer capacities;
passing the fault training data $x^{[0]}$ through the dual-branch aggregation network for feature extraction, wherein at the $m$-th residual-block layer the dynamic residual block extracts the features $x_{dyn}^{[m]}$ and the steady-state residual block extracts the features $x_{sta}^{[m]}$, and the aggregated feature of the $m$-th residual-block layer is

$$x^{[m]} = \omega^{[m]} x_{dyn}^{[m]} + \xi^{[m]} x_{sta}^{[m]}, \quad \text{where } \omega^{[m]} + \xi^{[m]} = 1;$$
the loss function of the initial stage is the classification cross-entropy loss

$$\mathcal{L}_{CE}(x) = -\sum_{c} \delta_{y=c} \log p_c(x);$$

the loss function of the incremental stage comprises the classification cross-entropy loss $\mathcal{L}_{CE}$ and the knowledge distillation loss

$$\mathcal{L}_{KD}(x) = -\sum_{c=1}^{J_n} \hat{\pi}_c(x) \log \pi_c(x),$$

where

$$\hat{\pi}_c(x) = \frac{\exp\big(\eta\,\langle \bar{\theta}_{n-1}^c, \bar{h}_{n-1}(x)\rangle / T\big)}{\sum_{j=1}^{J_n} \exp\big(\eta\,\langle \bar{\theta}_{n-1}^j, \bar{h}_{n-1}(x)\rangle / T\big)} \quad \text{and} \quad \pi_c(x) = \frac{\exp\big(\eta\,\langle \bar{\theta}_n^c, \bar{h}_n(x)\rangle / T\big)}{\sum_{j=1}^{J_n} \exp\big(\eta\,\langle \bar{\theta}_n^j, \bar{h}_n(x)\rangle / T\big)}$$

are, respectively, the soft labels of the old model on the old fault classes and the temperature-softened outputs of the new model on the old fault classes, with temperature $T > 1$;
the loss function of the incremental stage is

$$\mathcal{L} = \mathcal{L}_{CE} + \lambda \mathcal{L}_{KD},$$

where $0 < \lambda \le 1$;
the model parameters theta of the initial stage 0 Is conventional
Figure FDA00039987734900000213
The non-optimized parameters of the incremental phase are modeledParameter theta n And aggregate weights ω and ζ, the update for aggregate weights ω and ζ requires a fixed model parameter Θ n Adopting a double-layer optimization scheme;
the double-layer optimization scheme is divided into upper layer problems
Figure FDA0003998773490000031
And lower layer problems
Figure FDA0003998773490000032
By passing through
Figure FDA0003998773490000033
Updating model parameters Θ n, wherein ,γ1 Is the lower problem learning rate;
using randomly sampled data sets D n Obtaining
Figure FDA0003998773490000034
Establishing balance data->
Figure FDA0003998773490000035
By passing through
Figure FDA0003998773490000036
Updating the aggregate weight, wherein γ 2 Is the learning rate of the upper layer problem.
2. The lifelong-learning-based gearbox incremental fault diagnosis method according to claim 1, wherein the step S101 specifically comprises the following steps:
acquiring gearbox vibration signals with an acceleration sensor to construct the incremental health-state data set D;
if there are N+1 fault diagnosis tasks in total, there are N+1 learning stages, namely the fault diagnosis task 0 of the initial stage and N incremental stages, during which the number of diagnosis tasks gradually increases;
in the n-th stage, the training data of task n are

$$D_n = \big\{(x_i^{[n]}, y_i^{[n]})\big\}_{i=1}^{p_n},$$

where $p_n$ is the number of fault data samples of task n, $x_i^{[n]}$ represents the i-th sample, and $y_i^{[n]} \in C_n$ is its label.
3. The lifelong-learning-based gearbox incremental fault diagnosis method according to claim 2, wherein the step S102 specifically comprises the following steps:
training the original ResNet-32 on the data $D_0$ of task 0 to learn the fault classes $C_0$ and obtain the initial-stage diagnosis model $\Theta_0$, the loss function of the initial-stage diagnosis model being the classification cross-entropy loss function

$$\mathcal{L}_{CE}(x) = -\sum_{c} \delta_{y=c} \log p_c(x),$$

where $\delta$ is the true label;
after training, selecting a number of exemplars $\varepsilon_0$ via the herding algorithm using the feature extractor $F_0$ before the classification layer.
4. The lifelong-learning-based gearbox incremental fault diagnosis method according to claim 3, wherein selecting a number of exemplars via the herding algorithm using the feature extractor $F_0$ before the classification layer comprises:
letting $X^c = \{x_1, \dots, x_{P_c}\}$ denote the training samples of fault class $c$, the class mean of class $c$ is

$$\mu_c = \frac{1}{P_c} \sum_{i=1}^{P_c} F_0(x_i),$$

where $P_c$ is the number of training samples of class $c$;
with the number of selected exemplars being $t$, each exemplar is computed by

$$e_k = \underset{x \in X^c}{\arg\min} \left\| \mu_c - \frac{1}{k}\Big[F_0(x) + \sum_{j=1}^{k-1} F_0(e_j)\Big] \right\|, \quad k = 1,\dots,t,$$

giving $\varepsilon = (e_1, e_2, \dots, e_t)$.
5. The lifelong-learning-based gearbox incremental fault diagnosis method according to any one of claims 1 to 4, wherein, after each incremental training is completed, the performance of the model on new and old tasks is tested with the test data of all learned tasks to verify the learning capability of the model, comprising:
testing the model $\Theta_n$ obtained by training in incremental stage $n$ on all learned fault classes $C_{0:n}$, the test data containing all learned fault classes, to verify that the model has the ability to learn without forgetting.
6. A lifelong-learning-based gearbox incremental fault diagnosis system, characterized in that gearbox fault diagnosis is performed by adopting the lifelong-learning-based gearbox incremental fault diagnosis method according to any one of claims 1-4.
CN202111677774.4A 2021-12-31 2021-12-31 Gear box increment fault diagnosis method and system based on life learning Active CN114429153B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111677774.4A CN114429153B (en) 2021-12-31 2021-12-31 Gear box increment fault diagnosis method and system based on life learning


Publications (2)

Publication Number Publication Date
CN114429153A CN114429153A (en) 2022-05-03
CN114429153B true CN114429153B (en) 2023-04-28

Family

ID=81311970

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111677774.4A Active CN114429153B (en) 2021-12-31 2021-12-31 Gear box increment fault diagnosis method and system based on life learning

Country Status (1)

Country Link
CN (1) CN114429153B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115270956B (en) * 2022-07-25 2023-10-27 苏州大学 Continuous learning-based cross-equipment incremental bearing fault diagnosis method
CN115510963A (en) * 2022-09-20 2022-12-23 同济大学 Incremental equipment fault diagnosis method
CN116029367A (en) * 2022-12-26 2023-04-28 东北林业大学 Fault diagnosis model optimization method based on personalized federal learning
CN116089883B (en) * 2023-01-30 2023-12-19 北京邮电大学 Training method for improving classification degree of new and old categories in existing category increment learning
CN117313000B (en) * 2023-09-19 2024-03-15 北京交通大学 Motor brain learning fault diagnosis method based on sample characterization topology
CN117150377B (en) * 2023-11-01 2024-02-02 北京交通大学 Motor fault diagnosis stepped learning method based on full-automatic motor offset
CN117313251B (en) * 2023-11-30 2024-03-15 北京交通大学 Train transmission device global fault diagnosis method based on non-hysteresis progressive learning
CN117591888B (en) * 2024-01-17 2024-04-12 北京交通大学 Cluster autonomous learning fault diagnosis method for key parts of train

Citations (8)

Publication number Priority date Publication date Assignee Title
US5566092A (en) * 1993-12-30 1996-10-15 Caterpillar Inc. Machine fault diagnostics system and method
CN108376264A (en) * 2018-02-26 2018-08-07 上海理工大学 A kind of handpiece Water Chilling Units method for diagnosing faults based on support vector machines incremental learning
CN109492765A (en) * 2018-11-01 2019-03-19 浙江工业大学 A kind of image Increment Learning Algorithm based on migration models
CN110162018A (en) * 2019-05-31 2019-08-23 天津开发区精诺瀚海数据科技有限公司 The increment type equipment fault diagnosis method that knowledge based distillation is shared with hidden layer
CN111651937A (en) * 2020-06-03 2020-09-11 苏州大学 Method for diagnosing similar self-adaptive bearing fault under variable working conditions
CN112381788A (en) * 2020-11-13 2021-02-19 北京工商大学 Part surface defect increment detection method based on double-branch matching network
CN112990280A (en) * 2021-03-01 2021-06-18 华南理工大学 Class increment classification method, system, device and medium for image big data
CN113281048A (en) * 2021-06-25 2021-08-20 华中科技大学 Rolling bearing fault diagnosis method and system based on relational knowledge distillation

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
US20190339688A1 (en) * 2016-05-09 2019-11-07 Strong Force Iot Portfolio 2016, Llc Methods and systems for data collection, learning, and streaming of machine signals for analytics and maintenance using the industrial internet of things

Patent Citations (8)

Publication number Priority date Publication date Assignee Title
US5566092A (en) * 1993-12-30 1996-10-15 Caterpillar Inc. Machine fault diagnostics system and method
CN108376264A (en) * 2018-02-26 2018-08-07 上海理工大学 A kind of handpiece Water Chilling Units method for diagnosing faults based on support vector machines incremental learning
CN109492765A (en) * 2018-11-01 2019-03-19 浙江工业大学 A kind of image Increment Learning Algorithm based on migration models
CN110162018A (en) * 2019-05-31 2019-08-23 天津开发区精诺瀚海数据科技有限公司 The increment type equipment fault diagnosis method that knowledge based distillation is shared with hidden layer
CN111651937A (en) * 2020-06-03 2020-09-11 苏州大学 Method for diagnosing similar self-adaptive bearing fault under variable working conditions
CN112381788A (en) * 2020-11-13 2021-02-19 北京工商大学 Part surface defect increment detection method based on double-branch matching network
CN112990280A (en) * 2021-03-01 2021-06-18 华南理工大学 Class increment classification method, system, device and medium for image big data
CN113281048A (en) * 2021-06-25 2021-08-20 华中科技大学 Rolling bearing fault diagnosis method and system based on relational knowledge distillation

Non-Patent Citations (4)

Title
A Continual Learning Survey: Defying Forgetting in Classification Tasks; Matthias De Lange et al.; arXiv; 2021-04-16; pp. 3366-3385 *
Highly Accurate Machine Fault Diagnosis Using Deep Transfer Learning; Siyu Shao et al.; IEEE Transactions on Industrial Informatics; 2019; vol. 15, no. 4; pp. 2446-2455 *
Rolling bearing fault diagnosis based on an improved convolutional neural network and LightGBM; Yang Ruishuang et al.; Bearing; 2021-06-15; no. 06; pp. 44-49 *
EEG signal classification based on neural network transfer learning and incremental learning; Han Jiuqi et al.; Proceedings of the 4th National Conference on Neurodynamics; 2018-08-06; pp. 78-80 *

Also Published As

Publication number Publication date
CN114429153A (en) 2022-05-03

Similar Documents

Publication Publication Date Title
CN114429153B (en) Gear box increment fault diagnosis method and system based on life learning
CN111596604B (en) Intelligent fault diagnosis and self-healing control system and method for engineering equipment based on digital twinning
CN111237134B (en) Offshore double-fed wind driven generator fault diagnosis method based on GRA-LSTM-stacking model
CN115270956B (en) Continuous learning-based cross-equipment incremental bearing fault diagnosis method
CN110929918B (en) 10kV feeder fault prediction method based on CNN and LightGBM
CN111351665B (en) Rolling bearing fault diagnosis method based on EMD and residual error neural network
CN111812507A (en) Motor fault diagnosis method based on graph convolution
CN109472097B (en) Fault diagnosis method for online monitoring equipment of power transmission line
CN113343591B (en) Product key part life end-to-end prediction method based on self-attention network
CN115453356B (en) Power equipment operation state monitoring and analyzing method, system, terminal and medium
Shi et al. Study of wind turbine fault diagnosis and early warning based on SCADA data
CN115146718A (en) Depth representation-based wind turbine generator anomaly detection method
CN112949402A (en) Fault diagnosis method for planetary gear box under minimum fault sample size
CN112461543A (en) Rotary machine fault diagnosis method based on multi-classification support vector data description
CN116108346A (en) Bearing increment fault diagnosis life learning method based on generated feature replay
CN116956203B (en) Method and system for measuring action characteristics of tapping switch of transformer
CN107527093B (en) Wind turbine generator running state diagnosis method and device
CN111244937B (en) Method for screening serious faults of transient voltage stability of power system
CN112000923A (en) Power grid fault diagnosis method, system and equipment
CN116432359A (en) Variable topology network tide calculation method based on meta transfer learning
CN102788955A (en) Remaining lifetime prediction method of ESN (echo state network) turbine generator classification submodel based on Kalman filtering
CN102749199A (en) Method for predicting residual service lives of turbine engines on basis of ESN (echo state network)
Velasquez et al. Machine Learning Approach for Predictive Maintenance in Hydroelectric Power Plants
Peng et al. Wind turbine blades icing failure prognosis based on balanced data and improved entropy
CN112651628A (en) Power system transient stability evaluation method based on capsule neural network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant