CN114429153B - Gearbox incremental fault diagnosis method and system based on lifelong learning - Google Patents


Info

Publication number
CN114429153B
CN114429153B (application CN202111677774.4A)
Authority
CN
China
Prior art keywords: stage, fault diagnosis, model, fault, learning
Legal status
Active
Application number
CN202111677774.4A
Other languages
Chinese (zh)
Other versions
CN114429153A (en)
Inventor
沈长青
陈博戬
孔林
陈良
丁传仓
申永军
庄国龙
张艳华
李林
张爱文
祁玉梅
石娟娟
江星星
黄伟国
朱忠奎
Current Assignee
Suzhou University
Original Assignee
Suzhou University
Priority date
Filing date
Publication date
Application filed by Suzhou University
Priority to CN202111677774.4A
Publication of CN114429153A
Application granted
Publication of CN114429153B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 2218/00 Aspects of pattern recognition specially adapted for signal processing
    • G06F 2218/08 Feature extraction
    • G06F 2218/12 Classification; Matching
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F 18/2413 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G06F 18/24133 Distances to prototypes
    • G06F 18/24137 Distances to cluster centroïds
    • G06F 18/2414 Smoothing the distance, e.g. radial basis function networks [RBFN]
    • G06F 18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G06N 3/047 Probabilistic or stochastic networks
    • G06N 3/08 Learning methods
    • G06N 3/082 Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • G06N 5/00 Computing arrangements using knowledge-based models
    • G06N 5/01 Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound

Abstract

The invention discloses a gearbox incremental fault diagnosis method and system based on lifelong learning, comprising the following steps. S101: collecting gearbox vibration data to construct an incremental health-state data set, and dividing it into fault diagnosis tasks of different stages. S102: learning the initial-stage fault diagnosis task with an original ResNet-32 network to construct an initial-stage diagnosis model. S103: initializing a ResNet-32 dual-branch aggregation network with the initial-stage diagnosis model, and increasing the number of classification-layer neurons according to the number of newly added fault classes. S104: training the current-stage diagnosis model on the selected exemplars together with the current-stage fault diagnosis task data, and selecting exemplars of the current-stage task data after training is completed. S105: repeating steps S103-S104 in the subsequent incremental stages to obtain the final fault diagnosis model, which is then used for fault diagnosis. The invention aims to solve the problem that existing fault diagnosis models based on deep learning and transfer learning cannot diagnose unexpected faults that arise in actual gearbox operation.

Description

Gearbox incremental fault diagnosis method and system based on lifelong learning
Technical Field
The invention relates to the technical field of mechanical fault diagnosis, and in particular to a gearbox incremental fault diagnosis method and system based on lifelong learning.
Background
With the rapid development of modern industry, the precision and importance of rotating machinery keep increasing. Rotating machines have become one of the most widely used classes of industrial equipment, and the demands on their reliability keep growing. They serve in many fields such as aviation, marine, machinery, chemical, energy and electric power; their service conditions grow increasingly complex, and performance degradation or even failure inevitably occurs during operation, causing huge economic losses, ever higher operation and maintenance costs, and even catastrophic casualties and irrecoverable harm to the environment and society. Studying health-state monitoring and fault diagnosis methods for rotating machinery is therefore of great significance for guaranteeing the safe and reliable operation of mechanical equipment, preventing failures of critical equipment, and avoiding heavy economic losses and disastrous accidents.
The requirements on the speed, load and degree of automation of modern rotating machinery keep rising, making its dynamic signals ever more complex. Modern condition-monitoring technology can acquire multi-point, whole-service-life data from complex equipment, yielding massive data volumes, which in turn makes processing these dynamic signals and extracting the health-state information they carry very difficult. Traditional fault diagnosis methods include extracting fault characteristic frequencies from vibration signals, short-time Fourier transform, empirical mode decomposition, sparse representation, and so on. These methods are mature, but for the condition signals of present-day machinery, signal-processing-based approaches cannot handle fault data that are sparse, strongly contaminated by interference, and diverse under variable operating conditions within huge amounts of signal data.
In recent years, with the rapid development of artificial intelligence and machine learning, more and more machine-learning-based intelligent fault diagnosis methods for rotating machinery have been proposed. Machine-learning-based fault diagnosis generally comprises signal acquisition, feature extraction, fault identification and prediction. These methods greatly simplify the diagnosis process and improve its efficiency, but most rely on shallow networks with simple structures and few layers, so their effectiveness depends on the quality of the features extracted in pre-processing, and their capacity is limited when facing large numbers of structurally complex equipment condition signals.
In recent years, many researchers have exploited the excellent adaptive feature learning and extraction capability of deep learning to overcome the difficulty shallow models have in characterizing the complex mapping between signals and health conditions, achieving good results. However, these methods rest on two assumptions: the training data follow the same distribution as the test data, and the training data are sufficient. In actual engineering, the operating conditions of mechanical equipment vary and faults occur unexpectedly, so the available samples rarely satisfy these two assumptions, which directly degrades the fault diagnosis results. With the rapid development of transfer learning and its ability to mine and transfer knowledge across fields and across distributions, transfer-learning solutions have emerged in mechanical fault diagnosis for problems with limited labelled samples (very few or none) or variable operating conditions. However, transfer learning can only serve the fault diagnosis of a single target task: one transfer is completed for a given source domain and target domain, and because of the diversity of machinery faults and operating conditions, the model generalizes poorly and lacks universality when facing a new task. Moreover, transfer learning does not accumulate knowledge: when the equipment-state recognition task returns to the working condition of the source-domain data, performance is poor, which conflicts with actual engineering requirements.
In practice, owing to complex and varying operating conditions, machines often develop unexpected faults, so the number of fault types increases and deep diagnosis models or deep transfer diagnosis models trained on pre-collected, incomplete fault data become invalid; the models must then be retrained to recognize the new fault types. However, training a deep model directly on the new data causes its recognition of the old fault classes to drop off a cliff, a phenomenon known as catastrophic forgetting. Catastrophic forgetting has long been an important problem in deep learning; in fault diagnosis it is likewise necessary to study how to overcome the catastrophic forgetting of deep diagnosis models caused by unexpected faults, so as to build lifelong fault diagnosis models with higher reliability, generalization and universality.
Disclosure of Invention
The invention aims to provide a gearbox incremental fault diagnosis method and system based on lifelong learning, to solve the problem that existing fault diagnosis models based on deep learning and transfer learning cannot diagnose unexpected faults that arise in actual gearbox operation.
To solve the above technical problem, the invention provides a gearbox incremental fault diagnosis method based on lifelong learning, comprising the following steps:
S101: collecting gearbox vibration data to construct an incremental health-state data set, and dividing it into fault diagnosis tasks of different stages;
S102: learning the initial-stage fault diagnosis task with an original ResNet-32 network to construct an initial-stage diagnosis model, and selecting exemplars of the initial-stage fault diagnosis task data;
S103: initializing a ResNet-32 dual-branch aggregation network with the initial-stage diagnosis model, the network adopting a cosine-normalized classifier, and increasing the number of classification-layer neurons according to the number of newly added fault classes;
S104: training the current-stage diagnosis model on the selected exemplars together with the current-stage fault diagnosis task data, and selecting exemplars of the current-stage task data after training is completed;
during training, the transfer capabilities of the different residual-block layers are represented by aggregation weights, a knowledge-distillation loss function is combined to reduce the discrepancy between the new-stage and old-stage diagnosis models on the old-stage fault diagnosis task data, and the aggregation weights and model parameters are optimized with a bilevel optimization scheme;
S105: repeating steps S103-S104 in the subsequent incremental stages to obtain the final fault diagnosis model and performing fault diagnosis.
As a further improvement of the invention, step S101 specifically comprises the following steps:
acquiring gearbox vibration signals with an acceleration sensor to construct an incremental health-state data set D;
if there are N+1 fault diagnosis tasks in total, there are N+1 learning stages, i.e. the fault diagnosis task 0 of the initial stage and N incremental stages, during which the number of diagnosis tasks gradually increases;
in the n-th stage, the training data of task n are
D_n = {(x_i^{[n]}, y_i^{[n]})}_{i=1}^{P_n},
where P_n is the number of fault data samples of task n;
if J_n denotes the number of old fault classes C_{0:n-1} = {C_0, C_1, …, C_{n-1}} and K_n the number of new fault classes C_n, then J_{n+1} = K_n + J_n; x_i^{[n]} denotes the i-th sample and y_i^{[n]} ∈ C_{0:n} its label.
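The staged data organization of step S101 can be sketched as follows. This is a minimal NumPy illustration; the array shapes, class labels and task grouping are illustrative assumptions, not values from the patent.

```python
import numpy as np

def split_incremental_tasks(X, y, class_groups):
    """Split a labelled vibration data set into staged fault diagnosis tasks.

    X            : (num_samples, signal_length) array of vibration samples
    y            : (num_samples,) integer fault-class labels
    class_groups : list of class-label lists; entry n holds the new fault
                   classes C_n introduced at stage n (entry 0 is task 0)
    Returns one (X_n, y_n) training set per stage.
    """
    tasks = []
    for classes in class_groups:
        mask = np.isin(y, classes)
        tasks.append((X[mask], y[mask]))
    return tasks

# Toy data: 4 fault classes, 20 samples each; task 0 learns classes {0, 1},
# incremental stage 1 adds class 2, incremental stage 2 adds class 3.
rng = np.random.default_rng(0)
X = rng.normal(size=(80, 16))
y = np.repeat(np.arange(4), 20)
tasks = split_incremental_tasks(X, y, [[0, 1], [2], [3]])
```

Each stage thus sees only the samples of its newly introduced fault classes; old classes reappear only through the stored exemplars described later.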
As a further improvement of the invention, step S102 specifically comprises the following steps:
training the original ResNet-32 on the data of task 0,
D_0 = {(x_i^{[0]}, y_i^{[0]})}_{i=1}^{P_0},
to learn the fault classes C_0 and obtain the initial-stage diagnosis model Θ_0; the loss function of the initial-stage diagnosis model is the classification cross-entropy loss
L_CE = −Σ_{c∈C_0} δ_{c=y} log p_c(x),
where δ is the true (one-hot) label;
after training, a certain number of exemplars ε_0 are selected with a herding algorithm using the feature extractor F_0 before the classification layer.
As a further improvement of the invention, selecting a certain number of exemplars with the herding algorithm using the feature extractor F_0 before the classification layer comprises:
letting {x_i^c}_{i=1}^{P_c} denote the training samples of fault class c, the class-c feature mean is
μ_c = (1/P_c) Σ_{i=1}^{P_c} F_0(x_i^c),
where P_c is the number of training samples of class c;
if the number of exemplars selected for class c is t, each exemplar is computed by
e_j = argmin_x ‖μ_c − (1/j)[F_0(x) + Σ_{k=1}^{j−1} F_0(e_k)]‖, j = 1, …, t,
giving ε = (e_1, e_2, …, e_t).
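The selection rule above is a herding-style greedy choice: each new exemplar keeps the running mean of selected features as close as possible to the class mean. A minimal NumPy sketch, with the feature extractor F_0 replaced by precomputed feature vectors (an illustrative assumption):

```python
import numpy as np

def herding_select(features, t):
    """Select t exemplars whose running feature mean best tracks the class mean.

    features : (P_c, d) array of feature vectors F_0(x_i^c) of one fault class
    t        : number of exemplars to select
    Returns the indices of the selected exemplars, in selection order.
    """
    mu = features.mean(axis=0)           # class mean mu_c
    chosen = []                          # selected exemplar indices e_1, ..., e_t
    acc = np.zeros_like(mu)              # running sum of selected feature vectors
    for j in range(1, t + 1):
        # e_j = argmin_x || mu_c - (1/j) * (F_0(x) + sum_{k<j} F_0(e_k)) ||
        dists = np.linalg.norm(mu - (features + acc) / j, axis=1)
        dists[chosen] = np.inf           # never select the same sample twice
        idx = int(np.argmin(dists))
        chosen.append(idx)
        acc += features[idx]
    return chosen

# Four 2-D feature vectors of one class; select two exemplars.
F = np.array([[0.0, 0.0], [2.0, 0.0], [1.0, 0.0], [5.0, 5.0]])
sel = herding_select(F, 2)
```

The first exemplar is simply the sample nearest to the class mean; later picks correct the running mean toward it.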
As a further improvement of the invention, step S103 specifically comprises the following steps:
replacing the original ResNet-32 network with a ResNet-32 dual-branch aggregation network comprising a dynamic branch and a steady-state branch;
the dynamic branch is conventional parameter-level fine-tuning: the initial-stage diagnosis model initializes the dynamic branch of the incremental stage, and its parameters α are trained and fine-tuned by the task of each stage;
the steady-state branch is neuron-level fine-tuning after freezing the initial-stage network parameters: each neuron is given a weight β that is trained and fine-tuned by the task of each stage. If the k-th convolutional layer of the steady-state branch contains Q neurons whose weights W_k are frozen by the initial model, with neuron weights β_k = (β_k^1, …, β_k^Q), and the input of the k-th convolutional layer is x_{k−1}, its output is
x_k = (W_k ⊙ β_k) x_{k−1},
where ⊙ denotes the Hadamard product;
the cosine-normalized classifier of incremental stage n computes the predicted probability that input x belongs to class c as
p_c(x) = exp(η ⟨θ̄_c, h̄_n⟩) / Σ_j exp(η ⟨θ̄_j, h̄_n⟩),
where θ_n are the fully connected classification-layer parameters of incremental stage n, h_n is the feature extracted at incremental stage n, v̄ = v/‖v‖_2 denotes l_2-normalization, and η is a learnable scaling parameter compensating for the cosine-similarity values being confined to [−1, 1];
as fault classes increase, the number of classification-layer neurons increases to match the number of fault classes.
As a further improvement of the invention, representing the transfer capabilities of the different residual-block layers by aggregation weights comprises:
training the dual-branch aggregation network with the exemplars ε_0 retained in the initial stage together with the current-stage task data, and assigning adaptive aggregation weights ω and ξ to the dynamic and steady-state residual blocks of each residual-block layer according to their different transfer capabilities;
the fault training data x^{[0]} are passed through the dual-branch aggregation network for feature extraction; at the m-th residual-block layer, the dynamic residual block extracts the features R_dyn^{[m]}(x^{[m−1]}) and the steady-state residual block extracts the features R_st^{[m]}(x^{[m−1]}), and the aggregated feature of the m-th residual-block layer is
x^{[m]} = ω^{[m]} R_dyn^{[m]}(x^{[m−1]}) + ξ^{[m]} R_st^{[m]}(x^{[m−1]}),
where ω^{[m]} + ξ^{[m]} = 1.
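The weighted aggregation of one residual-block layer can be sketched as follows. The toy residual blocks stand in for the actual ResNet-32 branches and are illustrative only:

```python
import numpy as np

def aggregate_block(x, dyn_block, steady_block, omega, xi):
    """Aggregate the outputs of one dual-branch residual-block layer.

    dyn_block, steady_block : callables standing in for the dynamic and
                              steady-state residual blocks of layer m
    omega, xi               : adaptive aggregation weights of layer m,
                              constrained so that omega + xi = 1
    """
    return omega * dyn_block(x) + xi * steady_block(x)

# Toy stand-ins for the two ResNet-32 branches.
dyn = lambda v: v + 1.0      # plastic branch: adapts quickly to the new task
steady = lambda v: v + 0.1   # stable branch: frozen weights, neuron-level scaling
x = np.array([1.0, 2.0])
out = aggregate_block(x, dyn, steady, omega=0.7, xi=0.3)
```

A layer whose dynamic branch transfers well receives a larger ω; a layer whose old-task knowledge should dominate receives a larger ξ.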
As a further improvement of the invention, the loss function of the initial stage is the classification cross-entropy loss
L_CE^{[0]} = −Σ_{c∈C_0} δ_{c=y} log p_c(x);
the loss function of the incremental stage combines the classification cross-entropy loss
L_CE^{[n]} = −Σ_{c∈C_{0:n}} δ_{c=y} log p_c(x)
and the knowledge-distillation loss
L_KD = −Σ_{c∈C_{0:n−1}} π_c^{[n−1]}(x) log π_c^{[n]}(x),
where π_c^{[n−1]}(x) and π_c^{[n]}(x) are the temperature-softened softmax outputs on the old fault classes of the old model (the soft labels) and of the new model, respectively,
π_c(x) = exp(z_c(x)/T) / Σ_{j∈C_{0:n−1}} exp(z_j(x)/T),
and the temperature T is typically greater than 1.
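A minimal NumPy sketch of the temperature-softened knowledge-distillation loss above; the concrete logit values and T = 2 are illustrative assumptions:

```python
import numpy as np

def softmax_T(z, T):
    """Temperature-softened softmax: pi_c = exp(z_c / T) / sum_j exp(z_j / T)."""
    z = z / T
    z = z - z.max()          # numerical stability
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(old_logits, new_logits, T=2.0):
    """Knowledge-distillation loss over the old fault classes.

    old_logits : logits of the frozen previous-stage model (old classes only)
    new_logits : logits of the current-stage model restricted to the same classes
    """
    pi_old = softmax_T(old_logits, T)    # soft labels from the old model
    pi_new = softmax_T(new_logits, T)
    return float(-(pi_old * np.log(pi_new + 1e-12)).sum())

z_old = np.array([1.0, 2.0, 3.0])
l_same = distillation_loss(z_old, z_old)          # lower bound: entropy of the soft labels
l_diff = distillation_loss(z_old, z_old[::-1])    # a mismatched new model is penalised more
```

The loss is minimized exactly when the new model reproduces the old model's softened output on the old classes, which is how the discrepancy between new-stage and old-stage models on old-task data is reduced.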
As a further improvement of the invention, the loss function of the incremental stage is
L^{[n]} = (1 − λ) L_CE^{[n]} + λ L_KD,
where 0 < λ ≤ 1;
the model parameters Θ_0 of the initial stage are updated by conventional gradient descent,
Θ_0 ← Θ_0 − γ ∂L_CE^{[0]}/∂Θ_0;
the parameters to be optimized in the incremental stage are the model parameters Θ_n and the aggregation weights ω and ξ; updating the aggregation weights ω and ξ requires the model parameters Θ_n to be fixed, so a bilevel optimization scheme is adopted;
the bilevel optimization scheme is divided into the upper-level problem
min_{ω,ξ} L^{[n]}(Θ_n*; ω, ξ; D̃_n)
and the lower-level problem
Θ_n* = argmin_{Θ_n} L^{[n]}(Θ_n; ω, ξ; D_n ∪ ε_{0:n−1});
the model parameters are updated by
Θ_n ← Θ_n − γ_1 ∂L^{[n]}/∂Θ_n,
where γ_1 is the learning rate of the lower-level problem;
a balanced data set D̃_n is established by randomly sampling the data set D_n and combining the samples with the stored exemplars, and the aggregation weights are updated by
(ω, ξ) ← (ω, ξ) − γ_2 ∂L^{[n]}/∂(ω, ξ),
where γ_2 is the learning rate of the upper-level problem.
As a further improvement of the invention, after each incremental training is completed, the performance of the model on the new and old tasks is tested with the test data of all learned tasks to verify the model's ability to learn without forgetting, comprising the following steps:
the model Θ_n obtained by training in incremental stage n is tested on test data covering all learned fault classes C_{0:n}, so as to verify that the model has the ability to learn without forgetting.
The gearbox incremental fault diagnosis system based on lifelong learning performs gearbox fault diagnosis by adopting the above gearbox incremental fault diagnosis method based on lifelong learning.
The invention has the following beneficial effects. In the proposed gearbox fault diagnosis method, gearbox vibration signals are first collected with an acceleration sensor to construct an incremental health-state data set, which is divided into diagnosis tasks of different stages, simulating the growth of diagnosis tasks caused by the new fault types that unexpected faults introduce in actual scenarios.
In the initial stage, an original ResNet-32 learns the initial gearbox-bearing fault diagnosis task, simulating the incomplete fault diagnosis model trained on pre-collected fault data in a real scenario; after training, a certain number of exemplars are selected from the initial task data by a herding algorithm and stored. In the subsequent incremental stages, the original ResNet-32 is replaced by an improved ResNet-32-based dual-branch aggregation network as the incremental-stage feature extractor, balancing the plasticity (knowledge transfer) and stability (knowledge accumulation) of the model; at the same time, the fully connected classifier is replaced by a cosine-normalized classifier to avoid the model's classification bias, and the number of classification-layer neurons is increased according to the number of newly added fault classes.
The model of the first incremental stage is trained on the exemplars stored in the initial stage together with the diagnosis task data of that stage, reawakening the model's memory of old knowledge and overcoming the catastrophic forgetting of the deep learning model. The loss function of the incremental stage comprises a classification cross-entropy loss and a knowledge-distillation loss; the knowledge-distillation loss reduces the discrepancy between the new-stage and old-stage models on the old task data, further preventing catastrophic forgetting.
The transfer capabilities of the different residual-block layers are represented by aggregation weights, which balance the steady-state and dynamic branches and hence the plasticity and stability of the model. Since the optimization of the aggregation weights and that of the model parameters constrain each other, a bilevel optimization scheme is adopted to update both. After the diagnosis task of each incremental stage has been trained, a certain number of exemplars of that stage's data are again selected and stored for training in the next incremental stage.
Overall, the invention constructs a gearbox incremental fault diagnosis method based on lifelong learning that adopts a dual-branch aggregation network combined with knowledge distillation and exemplars, solves the catastrophic forgetting of deep learning diagnosis models, and is suitable for continuous gearbox fault diagnosis as new unexpected faults keep appearing.
Drawings
FIG. 1 is a flow chart of an embodiment of the method of the invention;
FIG. 2 is a diagram of the test stand used to generate the gearbox data of the invention;
FIG. 3 is a sectional view of the gearbox faults of the invention;
FIG. 4 is the dual-branch aggregation network structure in the model of the invention;
FIG. 5 is a graph of the diagnosis accuracy of two fine-tuning methods applied to a deep model without lifelong learning and of the method of the invention.
Detailed Description
The invention is further described below with reference to the accompanying drawings and specific examples, which are not intended to be limiting, so that those skilled in the art can better understand and practice the invention.
Referring to FIG. 1, the invention provides a gearbox incremental fault diagnosis method based on lifelong learning, comprising the following steps: S101: collecting gearbox vibration data to construct an incremental health-state data set, and dividing it into fault diagnosis tasks of different stages;
S102: learning the initial-stage fault diagnosis task with an original ResNet-32 network to construct an initial-stage diagnosis model, and selecting exemplars of the initial-stage fault diagnosis task data;
S103: initializing a ResNet-32 dual-branch aggregation network with the initial-stage diagnosis model, the network adopting a cosine-normalized classifier, and increasing the number of classification-layer neurons according to the number of newly added fault classes;
S104: training the current-stage diagnosis model on the selected exemplars together with the current-stage fault diagnosis task data, and selecting exemplars of the current-stage task data after training is completed;
during training, the transfer capabilities of the different residual-block layers are represented by aggregation weights, a knowledge-distillation loss function is combined to reduce the discrepancy between the new-stage and old-stage diagnosis models on the old-stage fault diagnosis task data, and the aggregation weights and model parameters are optimized with a bilevel optimization scheme;
S105: repeating steps S103-S104 in the subsequent incremental stages to obtain the final fault diagnosis model and performing fault diagnosis.
The invention adopts a lifelong learning method to construct a diagnosis model capable of continuous knowledge transfer and accumulation, so as to diagnose faults whose types keep increasing under complex operating conditions.
Further, the performance of the model on the new and old tasks is tested with the test data of all learned tasks, verifying the model's ability to learn without forgetting.
Examples
This embodiment describes the method in detail in connection with experimental data actually acquired.
The test stand shown in FIG. 2 was used to collect the required experimental data and construct the incremental health-state data set. To obtain unexpected gearbox faults with compound bearing-gear faults as shown in FIG. 3, 0.4 mm cracks were cut into the inner ring, outer ring and rollers of the bearing by wire cutting to simulate local bearing faults, and half a tooth was cut from the driving gear by electrical discharge machining to simulate a local gear fault.
In the experiment, the motor speed was 1496 r/min and the sampling frequency was set to 25.6 kHz. The gearbox incremental data set was constructed from a total of 11 different health states consisting of combinations of gear and bearing conditions, as listed in Table 1. The gear has two health states, normal and faulty; the bearing has four basic health states (normal, inner-ring fault, roller fault and outer-ring fault) plus three compound bearing faults formed by pairwise combination of the three basic fault states.
The diagnosis tasks of the different stages are then divided according to the actual scenario: gearbox vibration signals are acquired with an acceleration sensor to construct the incremental health-state data set D. Assuming there are N+1 gearbox fault diagnosis tasks in total, there are N+1 learning stages, i.e. the stage learning diagnosis task 0 and N incremental stages, during which the number of diagnosis tasks gradually increases. In the n-th stage, the training data of task n are
D_n = {(x_i^{[n]}, y_i^{[n]})}_{i=1}^{P_n},
where P_n is the number of fault data samples of task n. Letting J_n denote the number of old fault classes C_{0:n-1} = {C_0, C_1, …, C_{n-1}} and K_n the number of new fault classes C_n, then J_{n+1} = K_n + J_n; x_i^{[n]} denotes the i-th sample and y_i^{[n]} ∈ C_{0:n} its label.
As listed in Table 1, in the actual scenario, gearbox health-state data obtained through experiments serve as the training samples of task 0 to train the model in the initial stage. These health states are generally common, hence numerous and easy to learn, so the seven gearbox health states in which the gear is normal and at most the bearing is faulty are taken as the fault classes learned by task 0. To simulate the fault-class growth caused by unexpected faults in real scenarios, each task learned in each subsequent incremental stage contains one gear-bearing compound fault type. Each fault type has 200 training samples and 100 test samples. Table 1: health states and incremental task settings of the gearbox:
[Table 1 appears as an image in the original publication; it lists the eleven gear-bearing health-state combinations and their assignment to task 0 and the incremental tasks.]
Accordingly, step S102 specifically comprises the following steps:
S102.1: training the original ResNet-32 on the data of task 0,
D_0 = {(x_i^{[0]}, y_i^{[0]})}_{i=1}^{P_0},
to learn the fault classes C_0 and obtain the initial model Θ_0; the detailed structure of ResNet-32 is shown in Table 2. The loss function of the model is the classification cross-entropy loss
L_CE = −Σ_{c∈C_0} δ_{c=y} log p_c(x),
where δ is the true label. The model parameters Θ_0 of the initial stage are updated by conventional gradient descent, Θ_0 ← Θ_0 − γ ∂L_CE/∂Θ_0.
Table 2: structural parameters of the backbone network ResNet-32:
[Table 2 appears as an image in the original publication; it lists the layer-by-layer structural parameters of the ResNet-32 backbone.]
S102.2: after training, a certain number of exemplars ε_0 are selected with the herding algorithm using the feature extractor F_0 before the classification layer. Letting {x_i^c}_{i=1}^{P_c} denote the training samples of fault class c, the class-c feature mean is
μ_c = (1/P_c) Σ_{i=1}^{P_c} F_0(x_i^c),
where P_c is the number of training samples of class c.
There are two schemes for the number of selected exemplars: either the number of exemplars per fault class is fixed at 5, or the total storage budget is fixed at 55. If the number of exemplars selected for class c is t, each exemplar is computed by
e_j = argmin_x ‖μ_c − (1/j)[F_0(x) + Σ_{k=1}^{j−1} F_0(e_k)]‖,
giving ε = (e_1, e_2, …, e_t).
The step S103 specifically includes the following steps:
S103.1: The original ResNet-32 is replaced with a dual-branch aggregation network, whose structure is shown in FIG. 4. The dual-branch aggregation network comprises a dynamic branch and a steady-state branch.

The dynamic branch performs conventional parameter-level fine-tuning: the initial model is used to initialize the dynamic branch of the incremental stage, and its parameters $\alpha$ are fine-tuned by training on the task of each stage.

The steady-state branch performs neuron-level fine-tuning after the initial-stage network parameters are frozen: each neuron is assigned a weight $\beta$ that is fine-tuned by training on the task of each stage. Suppose the $k$-th convolutional layer of the steady-state branch contains $Q$ neurons whose weights are the frozen initial-model parameters $W_k$, with neuron weights $\beta_k = (\beta_1, \beta_2, \dots, \beta_Q)$. The input of the $k$-th convolutional layer is $x_{k-1}$ and its output is $x_k = (W_k \odot \beta_k)\, x_{k-1}$, where $\odot$ denotes the Hadamard product. Because the learnable parameters $\beta$ of the steady-state branch are far fewer than $\alpha$, the steady-state residual blocks adapt slowly to the knowledge of new tasks while fully retaining old knowledge.
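A minimal NumPy sketch of the steady-state branch's neuron-level reweighting; a fully connected layer stands in for the convolutional layer, and all names here are illustrative assumptions.

```python
import numpy as np

def steady_state_forward(x_prev, W_frozen, beta):
    """Steady-state branch: x_k = (W_k Hadamard beta_k) x_{k-1}.
    W_frozen (Q x d) is never updated; beta holds one learnable scalar
    per neuron (row), so only Q parameters are trained."""
    return (W_frozen * beta[:, None]) @ x_prev

rng = np.random.default_rng(0)
W = rng.standard_normal((3, 4))   # frozen initial-stage weights, Q = 3 neurons
x = rng.standard_normal(4)        # input features x_{k-1}
y = steady_state_forward(x, W, np.ones(3))   # beta = 1 reproduces the frozen layer
```

Setting every $\beta$ to 1 reproduces the frozen initial-stage layer exactly, which is why the branch starts from the old knowledge and drifts only slowly.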
S103.2: The classifier of the initial model is a conventional fully connected classification layer, which computes the predicted probability that the input $x$ belongs to class $c$ by

$$p_c(x) = \frac{\exp\big(\theta_0^{c\top} h_0(x)\big)}{\sum_j \exp\big(\theta_0^{j\top} h_0(x)\big)},$$

where $\theta_0$ are the parameters of the fully connected classification layer in the initial stage and $h_0$ are the features extracted in the initial stage.

The cosine-normalized classifier of incremental stage $n$ computes the predicted probability that the input $x$ belongs to class $c$ by

$$p_c(x) = \frac{\exp\big(\eta\,\langle \bar{\theta}_n^c, \bar{h}_n(x) \rangle\big)}{\sum_j \exp\big(\eta\,\langle \bar{\theta}_n^j, \bar{h}_n(x) \rangle\big)},$$

where $\theta_n$ are the fully connected classification-layer parameters of incremental stage $n$, $h_n$ are the features extracted in incremental stage $n$, $\bar{v} = v/\|v\|_2$ denotes $l_2$-normalization, and $\eta$ is a learnable scaling parameter that compensates for the cosine similarity being confined to the range $[-1, 1]$. The cosine-normalized classifier avoids the classification bias between new and old classes.
As the fault classes increase, the number of classification-layer neurons is increased to match the number of fault classes.
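The cosine-normalized classifier can be sketched as follows in NumPy; the function name and default scale $\eta = 10$ are illustrative assumptions.

```python
import numpy as np

def cosine_classifier(h, theta, eta=10.0):
    """Cosine-normalized classifier: softmax over eta * <theta_c_bar, h_bar>.
    l2-normalizing both the feature h and every class vector theta_c keeps
    each logit in [-eta, eta], suppressing the magnitude bias between
    new and old classes."""
    h_bar = h / np.linalg.norm(h)
    theta_bar = theta / np.linalg.norm(theta, axis=1, keepdims=True)
    logits = eta * theta_bar @ h_bar
    z = np.exp(logits - logits.max())      # numerically stable softmax
    return z / z.sum()
```

Because both arguments are normalized, rescaling the feature vector or the class weights leaves the predicted probabilities unchanged, which is the bias-avoidance property the text describes.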
The step S104 specifically includes the following steps:
S104.1: The dual-branch aggregation network is trained jointly on the exemplars $\varepsilon_0$ retained in the initial stage and the current-stage task data $D_1$; adaptive aggregation weights $\omega$ and $\xi$ are assigned to the dynamic residual block and the steady-state residual block of each residual-block layer to reflect their different transfer capacities, as shown in FIG. 4.

The fault training data $x^{[0]}$ are passed through the dual-branch aggregation network for feature extraction. At the $m$-th residual-block layer, the dynamic residual block extracts the features $x_{dyn}^{[m]}$ and the steady-state residual block extracts the features $x_{sta}^{[m]}$. The aggregated feature of the $m$-th residual-block layer is

$$x^{[m]} = \omega^{[m]} x_{dyn}^{[m]} + \xi^{[m]} x_{sta}^{[m]}, \quad \text{where } \omega^{[m]} + \xi^{[m]} = 1.$$
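The per-layer aggregation is a convex combination of the two branch outputs; a minimal sketch (the function name is an illustrative assumption):

```python
import numpy as np

def aggregate_layer(x_dyn, x_sta, omega):
    """Aggregated feature of one residual-block layer:
    x[m] = omega[m] * x_dyn[m] + xi[m] * x_sta[m], with omega + xi = 1,
    so the layer interpolates between the plastic (dynamic) branch and
    the stable (steady-state) branch."""
    xi = 1.0 - omega
    return omega * x_dyn + xi * x_sta
```

The scalar `omega` plays the role of the learnable per-layer weight $\omega^{[m]}$; $\omega^{[m]} = 1$ uses only the dynamic branch and $\omega^{[m]} = 0$ only the steady-state branch.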
S104.2: the loss function of the increment stage is classified cross entropy loss
Figure BDA0003452726960000121
And knowledge distillation loss->
Figure BDA0003452726960000122
wherein ,/>
Figure BDA0003452726960000123
Figure BDA0003452726960000124
and />
Figure BDA0003452726960000125
The temperature T is typically greater than 1 for soft labels with old models in the old failure class and hard labels with new models in the old failure class, respectively. Narrowing new models in old failure class C by knowledge distillation loss 0:n-1 The difference between the expression and the old model is approximately constrained to the similarity distribution of the old class in the old model. The loss function of the delta phase is +.>
Figure BDA0003452726960000126
Wherein lambda is more than 0 and less than or equal to 1.
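The knowledge distillation term can be sketched in NumPy. This is a generic temperature-softened distillation sketch; the exact way the patent combines $\mathcal{L}_{CE}$ and $\mathcal{L}_{KD}$ is only constrained to $0 < \lambda \le 1$, so the weighting below is an assumption.

```python
import numpy as np

def softmax(z):
    z = z - z.max()                        # numerically stable softmax
    e = np.exp(z)
    return e / e.sum()

def kd_loss(old_logits, new_logits, T=2.0):
    """Knowledge distillation over the old classes:
    L_KD = -sum_c pihat_c * log(pi_c), where pihat and pi are the
    temperature-T softened distributions of the old and new model."""
    pi_hat = softmax(np.asarray(old_logits) / T)   # old-model soft labels
    pi = softmax(np.asarray(new_logits) / T)       # new-model softened outputs
    return -np.sum(pi_hat * np.log(pi))
```

By Gibbs' inequality the loss is minimized when the new model reproduces the old model's softened distribution on the old classes, which is exactly the anti-forgetting constraint described above.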
S104.2: the non-optimized parameters of the increment stage have model parameters theta n And aggregate weights ω and ζ for aggregationUpdating the combined weights ω and ζ requires fixing the model parameters Θ n Adopting a double-layer optimization scheme;
the double-layer optimization scheme is divided into upper layer problems
Figure BDA0003452726960000127
And lower layer problems
Figure BDA0003452726960000128
By->
Figure BDA0003452726960000129
Updating model parameters Θ n, wherein γ1 Is the lower problem learning rate;
updating aggregate weights in upper layer problems to balance dynamic and steady state residual blocks, using a random sample data set D n Obtaining
Figure BDA00034527269600001210
Establishing balance data->
Figure BDA00034527269600001211
By passing through
Figure BDA00034527269600001212
Updating the aggregate weights, wherein γ 2 Is the learning rate of the upper layer problem.
The step S105 specifically includes the following steps:
the model Θ obtained by training in the increment stage n n All learned faults C 0:n The test data contains all learned fault classes to verify that the model has the ability to learn without forgetting. After 4 incremental task learning is completed, two fine-tuning and confusion matrices for the method of the present invention under two typical case number strategies are shown in FIG. 5. The two fine-tuned confusion matrixes reflect the disastrous forgetting of the deep learning diagnosis model without lifelong learning, and the method can effectively solve the disastrous forgetting and realize the continuous fault diagnosis of the gear box with new unexpected faults.
In summary, the invention designs a method for realizing incremental gearbox fault diagnosis based on lifelong learning. Compared with traditional deep-learning methods, the method overcomes catastrophic forgetting and is better suited to practical industrial application scenarios.

The invention also provides a lifelong-learning-based gearbox incremental fault diagnosis system, which performs gearbox fault diagnosis by adopting the above lifelong-learning-based gearbox incremental fault diagnosis method.
The principles of the system are similar to those described above and are not repeated here. It is noted that the present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above-described embodiments are merely preferred embodiments for fully explaining the present invention, and the scope of the present invention is not limited thereto. Equivalent substitutions and modifications will occur to those skilled in the art based on the present invention, and are intended to be within the scope of the present invention. The protection scope of the invention is subject to the claims.

Claims (6)

1. A gearbox incremental fault diagnosis method based on lifelong learning, characterized in that the method comprises the following steps:
S101: collecting gearbox vibration data to construct an incremental health-state data set, and dividing it into fault diagnosis tasks of different stages;
S102: learning the fault diagnosis task of the initial stage with an original ResNet-32 network to construct an initial-stage diagnosis model, and selecting exemplars of the initial-stage fault diagnosis task data;
S103: initializing a ResNet-32 dual-branch aggregation network with the initial-stage diagnosis model, wherein the ResNet-32 dual-branch aggregation network adopts a cosine-normalized classifier and the number of classification-layer neurons is increased according to the number of newly added fault classes;
S104: training the current-stage diagnosis model jointly on the selected exemplars and the current-stage fault diagnosis task data, and selecting exemplars of the current-stage fault diagnosis task data after training is completed;
wherein, during training, aggregation weights are used to represent the transfer capacities of different residual-block layers, a knowledge distillation loss function is combined to reduce the difference between the new and old stage diagnosis models on the old-stage fault diagnosis task data, and the aggregation weights and model parameters are optimized with a bilevel optimization scheme;
S105: repeating steps S103-S104 in subsequent incremental stages to obtain the final fault diagnosis model, and performing fault diagnosis;
the step S103 specifically includes the following steps:
replacing the original ResNet-32 network with a ResNet-32 dual-branch aggregation network, wherein the ResNet-32 dual-branch aggregation network comprises a dynamic branch and a steady-state branch;
the dynamic branch performs conventional parameter-level fine-tuning, that is, the initial-stage diagnosis model is used to initialize the dynamic branch of the incremental stage, and the parameters are fine-tuned by training on the task of each stage;
the steady-state branch performs neuron-level fine-tuning after the initial-stage network parameters are frozen, that is, each neuron is assigned a weight $\beta$ that is fine-tuned by training on the task of each stage; if the $k$-th convolutional layer of the steady-state branch comprises $Q$ neurons, the neuron weights are the frozen initial-model parameters $W_k$, the input of the $k$-th convolutional layer is $x_{k-1}$, and its output is $x_k = (W_k \odot \beta_k)\, x_{k-1}$, where $\odot$ denotes the Hadamard product;
the cosine-normalized classifier of incremental stage $n$ computes the predicted probability that the input $x$ belongs to class $c$ by

$$p_c(x) = \frac{\exp\big(\eta\,\langle \bar{\theta}_n^c, \bar{h}_n(x)\rangle\big)}{\sum_{j=1}^{J_{n+1}} \exp\big(\eta\,\langle \bar{\theta}_n^j, \bar{h}_n(x)\rangle\big)},$$

where $\theta_n$ are the fully connected classification-layer parameters of incremental stage $n$, $h_n$ are the features extracted in incremental stage $n$, $\bar{v} = v/\|v\|_2$ denotes $l_2$-normalization, $\eta$ is a learnable scaling parameter that compensates for the cosine similarity being confined to the range $[-1,1]$, $J_n$ is the number of old fault classes $C_{0:n-1} = \{C_0, C_1, \dots, C_{n-1}\}$, $K_n$ is the number of new fault classes $C_n$, and $J_{n+1} = K_n + J_n$;
as fault classes increase, the number of classification-layer neurons is increased to match the number of fault classes;
representing the transfer capacities of different residual-block layers with the aggregation weights comprises the following steps:
training the dual-branch aggregation network on the exemplars $\varepsilon_0$ retained in the initial stage and the stage-one task data $D_1$, and assigning adaptive aggregation weights $\omega$ and $\xi$ to the dynamic residual block and the steady-state residual block of each residual-block layer to reflect their different transfer capacities;
passing the fault training data $x^{[0]}$ through the dual-branch aggregation network for feature extraction, wherein at the $m$-th residual-block layer the dynamic residual block extracts the features $x_{dyn}^{[m]}$ and the steady-state residual block extracts the features $x_{sta}^{[m]}$, and the aggregated feature of the $m$-th residual-block layer is

$$x^{[m]} = \omega^{[m]} x_{dyn}^{[m]} + \xi^{[m]} x_{sta}^{[m]}, \quad \text{where } \omega^{[m]} + \xi^{[m]} = 1;$$
the loss function of the initial stage is the classification cross-entropy loss

$$\mathcal{L}_{CE}(x) = -\sum_{c} \delta_{y=c} \log p_c(x);$$

the loss function of the incremental stage comprises the classification cross-entropy loss $\mathcal{L}_{CE}$ and the knowledge distillation loss

$$\mathcal{L}_{KD}(x) = -\sum_{c=1}^{J_n} \hat{\pi}_c(x) \log \pi_c(x),$$

where

$$\hat{\pi}_c(x) = \frac{\exp\big(\eta\,\langle \bar{\theta}_{n-1}^c, \bar{h}_{n-1}(x)\rangle / T\big)}{\sum_{j=1}^{J_n} \exp\big(\eta\,\langle \bar{\theta}_{n-1}^j, \bar{h}_{n-1}(x)\rangle / T\big)} \quad \text{and} \quad \pi_c(x) = \frac{\exp\big(\eta\,\langle \bar{\theta}_n^c, \bar{h}_n(x)\rangle / T\big)}{\sum_{j=1}^{J_n} \exp\big(\eta\,\langle \bar{\theta}_n^j, \bar{h}_n(x)\rangle / T\big)}$$

are, respectively, the soft labels of the old model on the old fault classes and the temperature-softened outputs of the new model on the old fault classes, with temperature $T > 1$;
the loss function of the incremental stage is

$$\mathcal{L} = \mathcal{L}_{CE} + \lambda \mathcal{L}_{KD},$$

where $0 < \lambda \le 1$;
the model parameters theta of the initial stage 0 Is conventional
Figure FDA00039987734900000213
The non-optimized parameters of the incremental phase are modeledParameter theta n And aggregate weights ω and ζ, the update for aggregate weights ω and ζ requires a fixed model parameter Θ n Adopting a double-layer optimization scheme;
the double-layer optimization scheme is divided into upper layer problems
Figure FDA0003998773490000031
And lower layer problems
Figure FDA0003998773490000032
By passing through
Figure FDA0003998773490000033
Updating model parameters Θ n, wherein ,γ1 Is the lower problem learning rate;
using randomly sampled data sets D n Obtaining
Figure FDA0003998773490000034
Establishing balance data->
Figure FDA0003998773490000035
By passing through
Figure FDA0003998773490000036
Updating the aggregate weight, wherein γ 2 Is the learning rate of the upper layer problem.
2. The lifelong-learning-based gearbox incremental fault diagnosis method according to claim 1, wherein the step S101 specifically comprises the following steps:
acquiring gearbox vibration signals with an acceleration sensor to construct the incremental health-state data set D;
if there are N+1 fault diagnosis tasks in total, there are N+1 learning stages, namely the fault diagnosis task 0 of the initial stage and N incremental stages, during which the number of diagnosis tasks gradually increases;
in the n-th stage, the training data of task n are

$$D_n = \big\{(x_i^{[n]}, y_i^{[n]})\big\}_{i=1}^{p_n},$$

where $p_n$ is the number of fault data samples of task n, $x_i^{[n]}$ represents the i-th sample, and $y_i^{[n]} \in C_n$ is its label.
3. The lifelong-learning-based gearbox incremental fault diagnosis method according to claim 2, wherein the step S102 specifically comprises the following steps:
training the original ResNet-32 on the data $D_0$ of task 0 to learn the fault classes $C_0$ and obtain the initial-stage diagnosis model $\Theta_0$, the loss function of the initial-stage diagnosis model being the classification cross-entropy loss function

$$\mathcal{L}_{CE}(x) = -\sum_{c} \delta_{y=c} \log p_c(x),$$

where $\delta$ is the true label;
after training, selecting a number of exemplars $\varepsilon_0$ via the herding algorithm using the feature extractor $F_0$ before the classification layer.
4. The lifelong-learning-based gearbox incremental fault diagnosis method according to claim 3, wherein selecting a number of exemplars via the herding algorithm using the feature extractor $F_0$ before the classification layer comprises:
letting $X^c = \{x_1, \dots, x_{P_c}\}$ denote the training samples of fault class $c$, the class mean of class $c$ is

$$\mu_c = \frac{1}{P_c} \sum_{i=1}^{P_c} F_0(x_i),$$

where $P_c$ is the number of training samples of class $c$;
with the number of selected exemplars being $t$, each exemplar is computed by

$$e_k = \underset{x \in X^c}{\arg\min} \left\| \mu_c - \frac{1}{k}\Big[F_0(x) + \sum_{j=1}^{k-1} F_0(e_j)\Big] \right\|, \quad k = 1,\dots,t,$$

giving $\varepsilon = (e_1, e_2, \dots, e_t)$.
5. The lifelong-learning-based gearbox incremental fault diagnosis method according to any one of claims 1 to 4, wherein, after each incremental training is completed, the performance of the model on new and old tasks is tested with the test data of all learned tasks to verify the learning capability of the model, comprising:
testing the model $\Theta_n$ obtained by training in incremental stage $n$ on all learned fault classes $C_{0:n}$, the test data containing all learned fault classes, to verify that the model has the ability to learn without forgetting.
6. A lifelong-learning-based gearbox incremental fault diagnosis system, characterized in that gearbox fault diagnosis is performed by adopting the lifelong-learning-based gearbox incremental fault diagnosis method according to any one of claims 1-4.
CN202111677774.4A 2021-12-31 2021-12-31 Gear box increment fault diagnosis method and system based on life learning Active CN114429153B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111677774.4A CN114429153B (en) 2021-12-31 2021-12-31 Gear box increment fault diagnosis method and system based on life learning


Publications (2)

Publication Number Publication Date
CN114429153A CN114429153A (en) 2022-05-03
CN114429153B true CN114429153B (en) 2023-04-28

Family

ID=81311970

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111677774.4A Active CN114429153B (en) 2021-12-31 2021-12-31 Gear box increment fault diagnosis method and system based on life learning

Country Status (1)

Country Link
CN (1) CN114429153B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115270956B (en) * 2022-07-25 2023-10-27 苏州大学 Continuous learning-based cross-equipment incremental bearing fault diagnosis method
CN115510963A (en) * 2022-09-20 2022-12-23 同济大学 Incremental equipment fault diagnosis method
CN116029367A (en) * 2022-12-26 2023-04-28 东北林业大学 Fault diagnosis model optimization method based on personalized federal learning
CN116089883B (en) * 2023-01-30 2023-12-19 北京邮电大学 Training method for improving classification degree of new and old categories in existing category increment learning
CN117313000B (en) * 2023-09-19 2024-03-15 北京交通大学 Motor brain learning fault diagnosis method based on sample characterization topology
CN117150377B (en) * 2023-11-01 2024-02-02 北京交通大学 Motor fault diagnosis stepped learning method based on full-automatic motor offset
CN117313251B (en) * 2023-11-30 2024-03-15 北京交通大学 Train transmission device global fault diagnosis method based on non-hysteresis progressive learning
CN117591888B (en) * 2024-01-17 2024-04-12 北京交通大学 Cluster autonomous learning fault diagnosis method for key parts of train

Citations (8)

Publication number Priority date Publication date Assignee Title
US5566092A (en) * 1993-12-30 1996-10-15 Caterpillar Inc. Machine fault diagnostics system and method
CN108376264A (en) * 2018-02-26 2018-08-07 上海理工大学 A kind of handpiece Water Chilling Units method for diagnosing faults based on support vector machines incremental learning
CN109492765A (en) * 2018-11-01 2019-03-19 浙江工业大学 A kind of image Increment Learning Algorithm based on migration models
CN110162018A (en) * 2019-05-31 2019-08-23 天津开发区精诺瀚海数据科技有限公司 The increment type equipment fault diagnosis method that knowledge based distillation is shared with hidden layer
CN111651937A (en) * 2020-06-03 2020-09-11 苏州大学 Method for diagnosing similar self-adaptive bearing fault under variable working conditions
CN112381788A (en) * 2020-11-13 2021-02-19 北京工商大学 Part surface defect increment detection method based on double-branch matching network
CN112990280A (en) * 2021-03-01 2021-06-18 华南理工大学 Class increment classification method, system, device and medium for image big data
CN113281048A (en) * 2021-06-25 2021-08-20 华中科技大学 Rolling bearing fault diagnosis method and system based on relational knowledge distillation

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
US20190339688A1 (en) * 2016-05-09 2019-11-07 Strong Force Iot Portfolio 2016, Llc Methods and systems for data collection, learning, and streaming of machine signals for analytics and maintenance using the industrial internet of things

Patent Citations (8)

Publication number Priority date Publication date Assignee Title
US5566092A (en) * 1993-12-30 1996-10-15 Caterpillar Inc. Machine fault diagnostics system and method
CN108376264A (en) * 2018-02-26 2018-08-07 上海理工大学 A kind of handpiece Water Chilling Units method for diagnosing faults based on support vector machines incremental learning
CN109492765A (en) * 2018-11-01 2019-03-19 浙江工业大学 A kind of image Increment Learning Algorithm based on migration models
CN110162018A (en) * 2019-05-31 2019-08-23 天津开发区精诺瀚海数据科技有限公司 The increment type equipment fault diagnosis method that knowledge based distillation is shared with hidden layer
CN111651937A (en) * 2020-06-03 2020-09-11 苏州大学 Method for diagnosing similar self-adaptive bearing fault under variable working conditions
CN112381788A (en) * 2020-11-13 2021-02-19 北京工商大学 Part surface defect increment detection method based on double-branch matching network
CN112990280A (en) * 2021-03-01 2021-06-18 华南理工大学 Class increment classification method, system, device and medium for image big data
CN113281048A (en) * 2021-06-25 2021-08-20 华中科技大学 Rolling bearing fault diagnosis method and system based on relational knowledge distillation

Non-Patent Citations (4)

Title
A Continual Learning Survey: Defying Forgetting in Classification Tasks; Matthias De Lange et al.; arXiv; 2021-04-16; pp. 3366-3385 *
Highly Accurate Machine Fault Diagnosis Using Deep Transfer Learning; Siyu Shao et al.; IEEE Transactions on Industrial Informatics; 2019; vol. 15, no. 4; pp. 2446-2455 *
Rolling bearing fault diagnosis based on an improved convolutional neural network and LightGBM; Yang Ruishuang et al.; Bearing; 2021-06-15; no. 06; pp. 44-49 *
EEG signal classification based on neural network transfer learning and incremental learning; Han Jiuqi et al.; Proceedings of the 4th National Conference on Neurodynamics; 2018-08-06; pp. 78-80 *

Also Published As

Publication number Publication date
CN114429153A (en) 2022-05-03

Similar Documents

Publication Publication Date Title
CN114429153B (en) Gear box increment fault diagnosis method and system based on life learning
CN111596604B (en) Intelligent fault diagnosis and self-healing control system and method for engineering equipment based on digital twinning
CN111237134B (en) Offshore double-fed wind driven generator fault diagnosis method based on GRA-LSTM-stacking model
CN115270956B (en) Continuous learning-based cross-equipment incremental bearing fault diagnosis method
CN110929918B (en) 10kV feeder fault prediction method based on CNN and LightGBM
CN111351665B (en) Rolling bearing fault diagnosis method based on EMD and residual error neural network
CN111812507A (en) Motor fault diagnosis method based on graph convolution
CN109472097B (en) Fault diagnosis method for online monitoring equipment of power transmission line
CN113343591B (en) Product key part life end-to-end prediction method based on self-attention network
CN115453356B (en) Power equipment operation state monitoring and analyzing method, system, terminal and medium
Shi et al. Study of wind turbine fault diagnosis and early warning based on SCADA data
CN115146718A (en) Depth representation-based wind turbine generator anomaly detection method
CN112949402A (en) Fault diagnosis method for planetary gear box under minimum fault sample size
CN112461543A (en) Rotary machine fault diagnosis method based on multi-classification support vector data description
CN116108346A (en) Bearing increment fault diagnosis life learning method based on generated feature replay
CN116956203B (en) Method and system for measuring action characteristics of tapping switch of transformer
CN107527093B (en) Wind turbine generator running state diagnosis method and device
CN111244937B (en) Method for screening serious faults of transient voltage stability of power system
CN112000923A (en) Power grid fault diagnosis method, system and equipment
CN116432359A (en) Variable topology network tide calculation method based on meta transfer learning
CN102788955A (en) Remaining lifetime prediction method of ESN (echo state network) turbine generator classification submodel based on Kalman filtering
CN102749199A (en) Method for predicting residual service lives of turbine engines on basis of ESN (echo state network)
Velasquez et al. Machine Learning Approach for Predictive Maintenance in Hydroelectric Power Plants
Peng et al. Wind turbine blades icing failure prognosis based on balanced data and improved entropy
CN112651628A (en) Power system transient stability evaluation method based on capsule neural network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant