CN117194983B - Bearing fault diagnosis method based on progressive condition domain countermeasure network

Info

Publication number: CN117194983B
Application number: CN202311157472.3A
Authority: CN
Inventors: 吴楚格; 赵铎; 夏元清
Original assignee: Beijing Institute of Technology BIT
Current assignee: Beijing Institute of Technology BIT
Priority date: 2023-09-08
Filing date: 2023-09-08
Publication date: 2024-04-19
Anticipated expiration: 2043-09-08
Also published as: CN117194983A

Abstract

The invention discloses a bearing fault diagnosis method based on a progressive condition domain countermeasure network. The method performs conditional adversarial training on a source domain and an intermediate domain to obtain pseudo labels for the intermediate domain, then uses the pseudo-labeled intermediate domain as a new source domain for migration to the next intermediate domain, and repeats this until migration to the target domain is complete. By dividing one large transfer between the source domain and the target domain into several smaller transfers, the method addresses the drop in prediction performance that conventional transfer learning suffers when the source domain and the target domain are far apart and their distributions differ greatly, which enhances the generalization of the algorithm and improves fault diagnosis accuracy.

Description

Bearing fault diagnosis method based on progressive condition domain countermeasure network

Technical Field

The invention belongs to the technical field of fault diagnosis, and particularly relates to a bearing fault diagnosis method based on a progressive condition domain countermeasure network.

Background

Fault diagnosis is an important component of intelligent industry, and the intelligent prediction, detection and analysis of faults in industrial manufacturing equipment is important for making discrete manufacturing intelligent. The rolling bearing is a precision machine part and a key component of important equipment such as generators, engines and high-speed trains. It is widely used in the national economy, the national defense industry and scientific research, and is often called the "joint of industry". Rolling bearings offer a small friction coefficient, high running precision, insensitivity to lubricant viscosity, the ability to carry radial and axial loads at both high and low speeds, a high degree of standardization, good interchangeability, ease of mass production and low price. However, the rolling bearing is also one of the parts of mechanical equipment most prone to failure and damage: faults caused by rolling bearing damage account for 21% of all mechanical faults. Faults in mechanical equipment adversely affect production and daily life. Fault prediction and diagnosis technology processes signals of the machine's operating state to judge whether that state is healthy and to issue early warnings of various faults. This reduces the economic losses caused by faults, avoids major production accidents and effectively prevents loss of machine working accuracy, so it is of great significance.

With the rapid development of artificial intelligence, deep learning methods are widely applied to bearing fault diagnosis. Data-driven fault diagnosis applies artificial intelligence to the traditional industrial process of fault diagnosis; it fuses new-generation information technology with manufacturing and adds new momentum to the development of fault diagnosis technology. Deep-learning-based fault diagnosis requires a large amount of data and requires the training data and test data to be identically distributed. In actual production, the working condition of mechanical equipment is affected by factors such as load, rotating speed and external disturbance, so the time series data of a working bearing cannot satisfy the identical-distribution assumption. The model must be retrained to diagnose faults under each different working condition, and the method is of limited use when fault data is insufficient or labeling is difficult. Transfer learning can learn, from existing data and models, the common fault characteristics contained in different fault data, and apply the learned common knowledge to a new domain through domain similarity and association, thereby enhancing the generalization capability of the fault recognition algorithm. Through migration between different working conditions of the same equipment, migration between different equipment, and migration from virtual equipment to physical entities, transfer learning can use existing data and models to improve fault diagnosis performance in new working-condition scenarios with little data, insufficient labels and difficult modeling, and therefore has good prospects for engineering application.

Existing bearing fault diagnosis methods based on deep transfer learning fall mainly into the following types: network-based methods, instance-based methods, mapping-based methods, and adversarial-based methods. Of these, adversarial-based methods perform better on most data sets. The domain adversarial neural network (Domain Adversarial Neural Network, DANN) is a classical adversarial domain adaptation method. DANN comprises a feature extractor, a domain discriminator and a class discriminator; the feature extractor is trained continuously so that the domain discriminator cannot distinguish the source domain from the target domain. DANN is widely used in transfer learning, but it aligns only global features and ignores the relationship between samples and their labels, so marginal distribution alignment is achieved while conditional distribution alignment is not. The condition domain countermeasure network (Conditional Domain Adversarial Network, CDAN) addresses this problem with two conditioning strategies: multilinear conditioning and entropy conditioning. The former improves the recognition rate of the classifier by capturing the cross-covariance between the feature representation and the classifier prediction, and the latter ensures the transferability of the classifier by controlling the uncertainty of the classifier prediction.

In summary, the existing methods mainly have the following problems:

1. Existing transfer learning algorithms train a model in the source domain and then complete domain adaptation using unlabeled data from the target domain. However, the generalization error of domain adaptation grows as the difference between the target distribution and the source distribution grows; that is, when the source domain and the target domain are far apart, prediction performance drops. When the distribution difference between the source domain and the target domain is large, it is difficult for a transfer learning algorithm to adapt to the target in a single step.

Theoretical results indicate that the generalization error of domain adaptation increases with the difference between the two domains. Existing domain adaptation methods perform poorly under large shifts and better under small shifts. When the distance between the source domain and the target domain is large enough, even a classifier that achieves 100% accuracy on the source domain may misclassify all data in the target domain.

For fault diagnosis of industrial equipment, working conditions with labeled data that can serve as a source domain are scarce. In complex and diverse industrial scenarios, fault diagnosis faces changes in machine equipment, sensor position, speed, load, temperature and so on, so a large difference between the source domain and the target domain is the norm. With the development of industrial intelligence, the number of sensors in use has grown rapidly, bringing with it a large amount of data. Because of labor costs, labeling all of this data is not feasible. Unlabeled data can, however, serve as intermediate domains for progressive migration training, which makes it a valuable resource for improving the effect of transfer learning and the generalization capability of fault diagnosis algorithms.

2. Most current bearing fault diagnosis methods are based on unidirectional vibration data. Actual vibration is three-dimensional, so using unidirectional data ignores the fault information contained in the vibration in the other two directions. In addition, a unidirectional vibration sensor may miss detections or be affected by external noise, so its data is not completely reliable and cannot support a reliable decision in bearing fault diagnosis.

Disclosure of Invention

In view of the above, the invention provides a bearing fault diagnosis method based on a progressive condition domain countermeasure network, which realizes accurate diagnosis of bearing faults.

The invention provides a bearing fault diagnosis method based on a progressive condition domain countermeasure network, which comprises the following steps:

Step 1, constructing a training sample data set, wherein the training sample data set comprises a source domain $D_s$, N intermediate domains $D_n$ and a target domain $D_t$, the source domain $D_s$ is composed of labeled data, the intermediate domains $D_n$ are composed of labeled data and unlabeled data which have a certain similarity with the source domain, and the target domain $D_t$ is composed of unlabeled data;

Step 2, a bearing fault diagnosis model is established, wherein the bearing fault diagnosis model comprises a main classification network and a training network, the main classification network is a condition domain countermeasure network CDAN, the training network is a progressive condition domain countermeasure network, and the progressive condition domain countermeasure network is formed by connecting N condition domain countermeasure networks CDAN in series;

Step 3, training a training network by adopting the source domain and N intermediate domains in the training sample set constructed in the step 1, and finally completing migration from the source domain to the target domain by adopting the training network obtained by training to obtain a main classification network model of the target domain;

Step 4, acquiring three-dimensional vibration data of the bearing to be diagnosed in the manner of step 1, and inputting the three-dimensional vibration data into the main classification network of the target domain obtained by the training in step 3 to classify the vibration data of the bearing to be diagnosed, obtaining a label of the bearing state from which the current working state of the bearing is judged, thereby completing the diagnosis of the bearing fault.

Further, the convolution of the first layer of the feature extractor in CDAN is set to: the input is three channels, the output is six channels, and the convolution kernel size is 15.
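The shape of this first layer can be illustrated with a minimal pure-Python sketch. It is not the patented implementation: the stride of 1, the absence of padding, the all-ones weights and the helper name `conv1d` are assumptions made only for illustration, and the input length of 128 is borrowed from the TG dataset sample length given later in the description.

```python
def conv1d(signal, weights, bias):
    """Naive 1-D convolution (cross-correlation), stride 1, no padding.

    signal:  list of C_in channels, each a list of T values
    weights: list of C_out filters, each a list of C_in kernels of length K
    bias:    list of C_out floats
    Returns a list of C_out channels, each of length T - K + 1.
    """
    c_in, t = len(signal), len(signal[0])
    k = len(weights[0][0])
    out_len = t - k + 1
    output = []
    for filt, b in zip(weights, bias):
        row = []
        for start in range(out_len):
            acc = b
            for ch in range(c_in):
                window = signal[ch][start:start + k]
                acc += sum(w * x for w, x in zip(filt[ch], window))
            row.append(acc)
        output.append(row)
    return output


# Layer sizes from the text: 3 input channels (x, y, z), 6 output channels, kernel 15.
IN_CHANNELS, OUT_CHANNELS, KERNEL_SIZE = 3, 6, 15
SAMPLE_LENGTH = 128  # assumed input length (TG dataset sample length)

# Dummy all-ones input and weights, used only to make the shapes visible.
x = [[1.0] * SAMPLE_LENGTH for _ in range(IN_CHANNELS)]
w = [[[1.0] * KERNEL_SIZE for _ in range(IN_CHANNELS)] for _ in range(OUT_CHANNELS)]
b = [0.0] * OUT_CHANNELS

features = conv1d(x, w, b)
print(len(features), len(features[0]))  # 6 114
```

Under these assumptions each three-channel sample of length 128 becomes six feature channels of length 114; a different stride or padding would change the output length.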

Further, in step 3, training the training network with the source domain and the N intermediate domains of the training sample set constructed in step 1, and then completing the migration from the source domain to the target domain with the trained network to obtain the main classification network model of the target domain, specifically comprises: training the first condition domain countermeasure network with the labeled data of the source domain and the unlabeled data of intermediate domain 1, completing the migration from the source domain to intermediate domain 1, and classifying the unlabeled data of intermediate domain 1 with the first main classification network obtained after migration learning to obtain a first classification result; removing the low-confidence data from the first classification result, taking the remaining data as a new source domain, inputting the unlabeled data of intermediate domain 2, training the second condition domain countermeasure network, completing the migration from intermediate domain 1 to intermediate domain 2, and classifying the unlabeled data of intermediate domain 2 with the second main classification network obtained after migration learning to obtain a second classification result; removing the low-confidence data from the second classification result, taking the remaining data as a new source domain, inputting the unlabeled data of intermediate domain 3, training the third condition domain countermeasure network, and so on, until the training of the N-th condition domain countermeasure network is completed and the migration from intermediate domain N to the target domain is complete, yielding the main classification network of the target domain.
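The chaining described here can be sketched as a short loop. This is a hedged illustration only: `progressive_adaptation`, `train_cdan`, `keep_confident`, `toy_train` and `toy_keep` are hypothetical names, and the toy trainer is a nearest-class-mean stand-in on 1-D numbers, not a CDAN. The sketch only shows how each stage's pseudo-labeled output becomes the next stage's source domain.

```python
def progressive_adaptation(source, unlabeled_domains, train_cdan, keep_confident):
    """Chain one adaptation stage per unlabeled domain.

    source:            list of (sample, label) pairs from the labeled source domain
    unlabeled_domains: intermediate domains in order, with the target domain last
    train_cdan:        hypothetical callable (labeled, unlabeled) -> classifier
    keep_confident:    hypothetical callable (classifier, unlabeled) -> list of
                       (sample, pseudo_label) pairs that survive confidence filtering
    """
    labeled = source
    classifier = None
    for stage, unlabeled in enumerate(unlabeled_domains):
        classifier = train_cdan(labeled, unlabeled)
        is_last = stage == len(unlabeled_domains) - 1
        if not is_last:
            # The pseudo-labeled domain becomes the source of the next stage.
            labeled = keep_confident(classifier, unlabeled)
    return classifier


def toy_train(labeled, unlabeled):
    # Stand-in for CDAN training: nearest class mean on 1-D samples.
    # It ignores the unlabeled data, which a real adversarial stage would use.
    means = {}
    for label in {y for _, y in labeled}:
        values = [x for x, y in labeled if y == label]
        means[label] = sum(values) / len(values)
    return lambda x: min(means, key=lambda c: abs(x - means[c]))


def toy_keep(classifier, unlabeled):
    # Stand-in filter: keeps every sample together with its predicted label.
    return [(x, classifier(x)) for x in unlabeled]


source = [(0.0, 0), (4.0, 1)]   # labeled source domain
intermediate = [1.5, 5.5]       # unlabeled, shifted part of the way
target = [3.0, 7.0]             # unlabeled, shifted further

direct = toy_train(source, target)
gradual = progressive_adaptation(source, [intermediate, target], toy_train, toy_keep)
print([direct(x) for x in target])   # [1, 1]
print([gradual(x) for x in target])  # [0, 1]
```

In this toy setting the direct transfer mislabels the first target sample, while the two smaller steps label both correctly, which mirrors the motivation for splitting one large transfer into several smaller ones.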

Further, the confidence level is calculated using the following formula:

$$\gamma(x; f, \phi) = \max_{c} \, p(Y = c \mid x; f, \phi)$$

where $p(Y = c \mid x; f, \phi)$ is the output of the softmax layer, i.e. the probability of the corresponding label class; $x$ is a sample, $f$ is a loss function, $\phi$ is a classifier, $c$ is the predicted class label of the sample, and $Y$ is the true class label of the sample.

Further, a sample has low confidence when $\gamma(x; f, \phi) \le \alpha$, where $\alpha$ is 0.9.

Further, the method also comprises: after the low-confidence data is removed, the label with the highest probability in the classifier output is assigned to each remaining sample, and these samples are used as the new source domain data.
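The confidence rule and the pseudo-labeling step can be sketched as follows. This is a minimal illustration under stated assumptions: the function names `softmax` and `select_pseudo_labeled` and the example logits are hypothetical, while the threshold of 0.9 and the rule "discard when confidence is less than or equal to the threshold" come from the text.

```python
import math

ALPHA = 0.9  # confidence threshold from the text


def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def select_pseudo_labeled(samples, logits_per_sample, alpha=ALPHA):
    """Drop samples whose top softmax probability is <= alpha and attach
    the most probable class as a pseudo label to the remaining ones."""
    kept = []
    for x, logits in zip(samples, logits_per_sample):
        probs = softmax(logits)
        confidence = max(probs)
        if confidence > alpha:
            kept.append((x, probs.index(confidence)))
    return kept


# Hypothetical classifier outputs for three unlabeled samples.
samples = ["a", "b", "c"]
logits = [[5.0, 0.0, 0.0], [1.0, 0.9, 0.8], [0.0, 0.0, 6.0]]

kept = select_pseudo_labeled(samples, logits)
print(kept)  # [('a', 0), ('c', 2)]
```

Sample "b" is discarded because its highest class probability is well below 0.9; the two retained samples carry the class index of their most probable label.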

The beneficial effects are that:

1. The invention introduces progressive domain adaptation and proposes a progressive condition countermeasure network. Conditional adversarial training is performed on a source domain and an intermediate domain to obtain pseudo labels for the intermediate domain, and the pseudo-labeled intermediate domain is used as a new source domain for migration to the next intermediate domain, until the migration to the target domain is complete. The larger transfer between the source domain and the target domain is thereby divided into several smaller transfers, which addresses the drop in prediction performance that conventional transfer learning suffers when the source domain and the target domain are far apart and their distributions differ greatly, enhances the generalization of the algorithm and improves fault diagnosis accuracy.

2. The invention builds a training sample data set from three-dimensional vibration data tailored to bearing characteristics, and builds the bearing fault diagnosis model on three-channel vibration input. The information contained in the mechanical vibration is thereby fully used: supplementing the fault features with data from all three directions effectively reduces the missed-detection rate and further improves diagnosis accuracy.

Drawings

Fig. 1 is a schematic diagram of an overall framework of a bearing fault diagnosis method based on a progressive condition domain countermeasure network.

Fig. 2 is a schematic diagram of a network structure established by the bearing fault diagnosis method based on the progressive condition domain countermeasure network.

Fig. 3 is a schematic diagram of rolling element fault data sample X-direction data adopted by the bearing fault diagnosis method based on the progressive condition domain countermeasure network.

Fig. 4 is a schematic diagram of rolling element fault data sample Y-direction data adopted by the bearing fault diagnosis method based on the progressive condition domain countermeasure network.

Fig. 5 is a schematic diagram of rolling element fault data sample Z-direction data adopted by the bearing fault diagnosis method based on the progressive condition domain countermeasure network provided by the invention.

Detailed Description

The invention will now be described in detail by way of example with reference to the accompanying drawings.

The invention relates to the following basic concepts:

(1) Purpose of fault diagnosis

The purpose of fault diagnosis is to analyze and extract fault characteristics from information obtained by a sensor by adopting a fault diagnosis model, so as to diagnose the state of a bearing and provide the type of fault. Therefore, fault detection and early warning of industrial equipment are realized, the safety and reliability of the equipment are improved, the operation efficiency is improved, and the maintenance cost is reduced. The data in the bearing fault diagnosis data set established by the invention is obtained by monitoring the vibration sensor, each data set comprises vibration displacement time series data of different rotating speed working conditions and different fault categories, and the fault diagnosis aims at classifying the fault categories.

(2) Fault diagnosis model

The fault diagnosis model is a fault diagnosis method, which inputs data obtained by a sensor and outputs a fault diagnosis result. The input of the bearing fault diagnosis model is vibration sensor data, namely time series data collected by a specific bearing at fixed time intervals during working under a specific working condition, and the output is a model prediction result of fault types.

Given a sampling length $k$, a set of bearing vibration data of length $nk$ can be divided sequentially into $n$ data samples, each of which can be used as an input $x_i$ of the fault diagnosis model. The fault class to which $x_i$ belongs is denoted $y_i$. For a fault diagnosis model $\beta$, $\beta(x_i)$ is the output of the model, and the fault diagnosis result is considered correct if $\beta(x_i) = y_i$.
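The sequential division can be sketched in a few lines. The function name `split_into_samples` is hypothetical, and discarding a trailing remainder shorter than k is an assumption; the text only covers the case where the length is an exact multiple of k.

```python
def split_into_samples(series, k):
    """Divide a series sequentially into non-overlapping samples of length k.

    A trailing remainder shorter than k is discarded (an assumption).
    """
    n = len(series) // k
    return [series[i * k:(i + 1) * k] for i in range(n)]


signal = list(range(10))              # stand-in for a vibration time series
samples = split_into_samples(signal, 3)
print(samples)  # [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
```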

(3) Working conditions

The working condition refers to a specific working environment and an operation condition of equipment in the running process. For bearing fault diagnosis, the operating conditions may include the following variables: rotational speed, load, bearing type, lubrication status, environmental conditions, etc. The present invention focuses on the variation of rotational speed.

(4) Transfer learning task

The bearing has different vibration modes under different working conditions, and the fault diagnosis model under the existing working conditions can be migrated to the new working conditions through migration learning, so that fault diagnosis of the new working conditions is realized. In the migration learning task, the existing working condition is called a source domain, and the new working condition is called a target domain. The fault diagnosis model may obtain data containing fault class labels in the source domain and only data without class labels in the target domain.

(5) Probability distribution of data

The probability distribution expresses the probability law governing the values of a random variable. Traditional machine learning requires training data and test data to obey the same probability distribution.

Consider training data $\{(x_i, y_i)\}_{i=1}^{n}$, where $x_i$ is the $i$-th data input, $y_i$ is the class label of the $i$-th data, and $n$ is the total amount of training data; and test data $\{(x_j, y_j)\}_{j=1}^{m}$, where $m$ is the total amount of test data.

In conventional machine learning, the probability distributions of the training data and the test data are required to be the same, that is, $P_{train}(x, y) = P_{test}(x, y)$, where $P_{train}(x, y)$ is the probability distribution of the training data and $P_{test}(x, y)$ is the probability distribution of the test data. In practice, however, the two distributions often differ, so transfer learning must be introduced to perform domain adaptation.

(6) Domain

The domain is a basic concept in transfer learning. A domain consists of data (input $x$ and output $y$) and the probability distribution $P(x, y)$ of the data. $X$ denotes the feature space of the data and $Y$ denotes the label space of the data. For each sample $(x_i, y_i)$, $x_i \in X$ and $y_i \in Y$, so a domain can be expressed as $D = \{X, Y, P(x, y)\}$. Transfer learning involves a source domain, denoted $D_s$, and a target domain, denoted $D_t$. The source domain is the domain being migrated from and has a large amount of labeled data; the target domain is the domain being migrated to and typically has no labeled data. The process of transfer learning consists in passing the knowledge of the source domain to the target domain.

(7) Target of transfer learning

The goal of transfer learning is: for a source domain $D_s$ and a target domain $D_t$, when at least one of the applicable conditions holds, use the labeled data of the source domain to predict the unlabeled data of the target domain such that the prediction function $f: x_t \to y_t$ has minimal prediction error on the target domain, i.e.

$$f^{*} = \arg\min_{f} \, \mathbb{E}_{(x, y) \in D_t} \, L(f(x), y)$$

where $L$ is the loss function, $\arg\min_f$ selects the prediction function $f$ with minimum prediction error on the target domain, and $\mathbb{E}_{(x, y) \in D_t}$ is the expectation over samples of the target domain. The applicable conditions are: different feature spaces $X_s \ne X_t$, different label spaces $Y_s \ne Y_t$, and different probability distributions $P_s(x, y) \ne P_t(x, y)$.

(8) Domain adaptation

Domain adaptation refers to transfer learning in the case where the feature space and the label space of the source domain and the target domain are the same but their probability distributions differ. Domain adaptation is divided into supervised domain adaptation, semi-supervised domain adaptation and unsupervised domain adaptation. In supervised domain adaptation, all target domain data is labeled. In semi-supervised domain adaptation, part of the target domain data is labeled. In unsupervised domain adaptation, none of the target domain data is labeled. Unsupervised domain adaptation is the most difficult of the three, and it is the setting on which the invention focuses.

(9) Transfer learning loss function

The transfer learning loss function can be expressed as $L = L_c + \lambda L_T$, where $L_c$ is the prediction loss on the source domain, $\lambda$ is the trade-off parameter, and $L_T$ is the loss that reduces the feature differences between the source domain and the target domain.

(10) Progressive distribution and intermediate domain

Consider a set of differently distributed domains $\{D_0, D_1, D_2, \dots, D_n, \dots, D_N, D_t\}$, where $D_0$ is the source domain, $D_t$ is the target domain, and $D_1, D_2, \dots, D_n, \dots, D_N$ are the intermediate domains. If there exists $\varepsilon > 0$ such that $\rho(D_n, D_{n+1}) < \varepsilon$ holds for all $0 \le n < N$, then the distribution of this set of domains is progressive and is called a progressive distribution. Here $\rho$ is the Wasserstein distance, a measure of the distance between two distributions.

(11) Wasserstein distance

The Wasserstein distance measures the difference between two probability distributions and is used as the measure of distance between two distributions in progressive domain adaptation. It is expressed as

$$W(P_r, P_g) = \inf_{\gamma \in \Pi(P_r, P_g)} \mathbb{E}_{(x, y) \sim \gamma}\left[\lVert x - y \rVert\right]$$

where $\Pi(P_r, P_g)$ is the set of all possible joint distributions of $P_r$ and $P_g$, and $\mathbb{E}_{(x, y) \sim \gamma}\left[\lVert x - y \rVert\right]$ is the expected distance between a pair of samples drawn from a joint distribution $\gamma$ in $\Pi(P_r, P_g)$. The infimum of this expected value over all possible joint distributions is the Wasserstein distance.

The Wasserstein distance can be illustrated with an example. Assume two discrete distributions P and Q.

Consider the cost of converting the two distributions into the same distribution:

First step: the first term of P is 3; move 2 units from it to the second term.

Second step: the second term of P is now 4; move 2 units from it to the third term.

Third step: the third term of P is now 3, which is smaller than the third term of Q, which is 4; move 1 unit from the third term of Q to the fourth term of Q.

After these operations, P and Q have the same distribution. The cost of the first step is 2, the cost of the second step is 2 and the cost of the third step is 1, so the total cost is 5.
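The total cost in this example can be checked numerically. The values of P and Q did not survive in the text, so the sketch below assumes P = (3, 2, 1, 4) and Q = (1, 2, 4, 3), which are consistent with the three steps described but remain an assumption. The function name `earth_mover_cost` is hypothetical. For one-dimensional distributions with adjacent bins one unit apart, the minimum transport cost equals the sum of the absolute differences of the cumulative sums.

```python
from itertools import accumulate


def earth_mover_cost(p, q):
    """Minimum cost of moving mass between adjacent bins to turn p into q.

    Moving one unit of mass by one bin costs 1. Both lists must have the
    same length and the same total mass.
    """
    if len(p) != len(q) or sum(p) != sum(q):
        raise ValueError("p and q need equal length and equal total mass")
    return sum(abs(a - b) for a, b in zip(accumulate(p), accumulate(q)))


# Assumed values, chosen to match the three steps described in the text.
P = [3, 2, 1, 4]
Q = [1, 2, 4, 3]

print(earth_mover_cost(P, Q))  # 5
```

The per-bin contributions are 2, 2, 1 and 0, matching the costs of the three steps.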

In progressive domain adaptation, the distance between domains based on Wasserstein is the metric function:

Here n is the number of labels, which for fault diagnosis is the number of fault categories.

Existing approaches to intermediate domain selection mainly include the following:

The first is the intuitive approach. Selecting intermediate domains directly according to the physical conditions of the industrial equipment is a simple way to choose them. For example, if the rotational speed of the source domain is 600 rpm, the rotational speed of the target domain is 1000 rpm and all other conditions of the two domains are the same, an unlabeled domain with a rotational speed of 800 rpm is an obvious choice of intermediate domain. Similarly, changes in machine type, sensor position, speed, load, temperature and other physical quantities can serve as the basis for selecting intermediate domains. Intuitive selection is convenient, but selecting intermediate domains when several variables change at once requires a more theoretical method.

The second is the Wasserstein distance method. The Wasserstein distance is a measure of the distance between two distributions and is positively correlated with the error bound of progressive domain adaptation. For an intermediate domain $D_t$, if $\min(W(D_0, D_t), W(D_t, D_T)) > W(D_0, D_T)$, the upper error bound from the intermediate domain to the source domain or the target domain is greater than the upper error bound from the source domain to the target domain, so the requirement for intermediate domain selection is $\max(W(D_0, D_t), W(D_t, D_T)) \le W(D_0, D_T)$.

The third treats intermediate domain selection as a path planning problem. For source domain $D_0$ and target domain $D_T$, define $\Delta_{max}$ as the upper bound of the average Wasserstein distance between any pair of consecutive domains on the path, and let $P$ be the set of paths connecting the source domain $D_0$ and the target domain $D_T$. This gives another form of the progressive domain adaptation error bound. Provided that $T\Delta_{max}$ is at least the length of the geodesic connecting the source domain and the target domain, and with $L$ defined as the Wasserstein distance between the source domain and the target domain, an optimal path length $T^{*}$ for progressive self-training is obtained, which relates the optimal number of intermediate domains to the Wasserstein distance and the number of samples in each domain, as a function of the complexity $f(n)$.
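The Wasserstein distance requirement from the second approach can be sketched as a simple check. This is an illustration under simplifying assumptions: each domain is represented by an equally sized list of scalar values, for which the empirical 1-Wasserstein distance is the mean absolute difference of the sorted values. The function names and the sample values are hypothetical; real vibration features would be high-dimensional.

```python
def wasserstein_1d(a, b):
    """Empirical 1-Wasserstein distance between two equally sized 1-D samples."""
    if len(a) != len(b):
        raise ValueError("this sketch assumes samples of equal size")
    return sum(abs(x - y) for x, y in zip(sorted(a), sorted(b))) / len(a)


def is_admissible_intermediate(source, candidate, target):
    """Check max(W(source, candidate), W(candidate, target)) <= W(source, target)."""
    direct = wasserstein_1d(source, target)
    to_candidate = wasserstein_1d(source, candidate)
    from_candidate = wasserstein_1d(candidate, target)
    return max(to_candidate, from_candidate) <= direct


# Hypothetical scalar features for four domains.
source = [0.0, 1.0, 2.0]
target = [4.0, 5.0, 6.0]
between = [2.0, 3.0, 4.0]     # lies between source and target
outside = [9.0, 10.0, 11.0]   # farther from both than they are from each other

print(is_admissible_intermediate(source, between, target))  # True
print(is_admissible_intermediate(source, outside, target))  # False
```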

The symbol explanations relevant to the invention are shown in the following table:

The invention provides a bearing fault diagnosis method based on a progressive condition domain countermeasure network, the whole framework of which is shown in figure 1, which mainly comprises the following steps:

Step 1, a training sample data set is constructed. The training sample data set comprises a source domain $D_s$, N intermediate domains $D_n$ and a target domain $D_t$; the source domain $D_s$ is composed of labeled data, the intermediate domains $D_n$ are composed of labeled data and unlabeled data which have a certain similarity with the source domain, and the target domain $D_t$ is composed of unlabeled data. The data samples in $D_t$ are randomly divided in a ratio of 8:2, with 80% of the data used as the training data set and the remaining 20% used as the test data set.

Step 1.1, collecting data

(1) Public dataset data

The Case Western Reserve University (CWRU) dataset contains test data for normal and faulty bearings, covering inner ring fault bearings (IF), rolling element fault bearings (BF), outer ring fault bearings (OF) and normal bearings (NA). Vibration data were acquired using a sensor with a sampling frequency of 12 kHz. The CWRU experiments use a motor to drive different loads, and the four motor loads correspond to four running speeds: 1797 rpm, 1772 rpm, 1750 rpm and 1730 rpm. The four rotational speeds are used as four different working conditions. In the transfer learning task, each migration between these working conditions is a separate task. For example, task 0→1 represents migration from a source domain at a rotational speed of 1797 rpm to a target domain at a rotational speed of 1772 rpm. The rotational speed difference between adjacent domains in the CWRU dataset is about 20 rpm.

The JNU dataset is a bearing fault dataset provided by Jiangnan University. Like the CWRU dataset, it includes four bearing types: inner ring fault bearings (IF), rolling element fault bearings (BF), outer ring fault bearings (OF) and normal bearings (NA). Vibration data were acquired using a sensor with a sampling frequency of 50 kHz. The JNU dataset contains three rotational speeds, 600 rpm, 800 rpm and 1000 rpm, as three different working conditions, denoted working condition 0, working condition 1 and working condition 2 respectively. The rotational speed difference between adjacent domains in the JNU dataset is 200 rpm.

(2) The experimental device collects data

The test bench consists of a power supply, a motor speed regulator, a coupling (8 x 12), a 12 mm stainless steel polished shaft, a vibration sensor, a rotational speed sensor and bearings. The motor is at the far right and is connected to the stainless steel polished shaft through an elastic coupling, which compensates for axial and radial displacement between the shafts and increases the allowable shaft concentricity error. A first bearing, the bearing under test, is mounted 6 cm to the left of the elastic coupling, with its bearing seat fixed to the base by a fixing screw. A second bearing, a fault-free bearing, is mounted 12 cm to the left of the bearing under test. The end of the polished shaft is connected to the rotational speed sensor through a rigid coupling; to replace the bearing under test, the fixing screw and the rotational speed sensor must be removed so that the bearing can slide out from the left. The vibration sensor is attached above the bearing under test, with its x axis parallel to the main shaft and pointing left.

The test bench collects three-dimensional vibration data samples for fault diagnosis analysis based on transfer learning and for verifying algorithm performance. Five types of bearing condition were considered in the experiments: inner ring fault bearings (IF), outer ring fault bearings (OF), rolling element fault bearings (BF), cage fault bearings (RF) and normal bearings (NA).

The experimental data collection covers five fault categories, {IF, OF, BF, RF, NA}, and six working conditions (motor speed in rpm), {300, 350, 400, 450, 500, 550}, with data samples collected for each combination. The working conditions are numbered {0, 1, 2, 3, 4, 5} respectively. On this basis the invention builds the TG dataset (three-dimensional gradual bearing fault dataset), which consists of 5×6=30 cases in total. Vibration data were acquired using a sensor with a sampling frequency of 1 kHz. Under each condition, sampling lasted 100 seconds and yielded 100000 three-dimensional vibration displacement data points, each containing the (x, y, z) axis displacement (μm) of the vibration. For an example three-dimensional vibration data sample of a rolling element fault at 500 rpm, the X-direction data sample is shown in fig. 3, the Y-direction data sample in fig. 4 and the Z-direction data sample in fig. 5.

In addition, unlike traditional transfer learning, the invention must also select the intermediate domains. A good intermediate domain can markedly improve the accuracy and running speed of the model. Selecting intermediate domains according to the Wasserstein distance is a necessary condition for applying the progressive condition domain countermeasure network; in practical applications, however, a large number of candidate intermediate domains are available, and this criterion alone is insufficient to find the optimal choice. The invention therefore introduces path planning for intermediate domain selection: a path as close as possible to the geodesic is selected, that is, the shortest path between the source domain and the target domain as measured by the Wasserstein distance.
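One possible reading of this path planning is sketched below in Python. It is a sketch under stated assumptions, not the implementation of the invention: each domain is represented by a one-dimensional sample of equal size (the real data are three-dimensional), the distance is the empirical 1-Wasserstein distance, and a hypothetical `max_hop` limit restricts how far apart two adjacent domains on the path may be. Without such a limit the direct source-to-target hop would always be the shortest path.

```python
import heapq

def wasserstein_1d(a, b):
    """Empirical W1 distance between two equal-size 1-D samples."""
    if len(a) != len(b):
        raise ValueError("this sketch assumes equal sample sizes")
    return sum(abs(x - y) for x, y in zip(sorted(a), sorted(b))) / len(a)

def plan_path(domains, source, target, max_hop):
    """Dijkstra search; only hops with distance <= max_hop are allowed.

    `max_hop` is an assumption of this sketch, not a parameter named
    in the text. Returns the list of domain keys, or None if no path exists.
    """
    best = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        cost, u = heapq.heappop(heap)
        if u == target:
            break
        if cost > best.get(u, float("inf")):
            continue  # stale heap entry
        for v in domains:
            if v == u:
                continue
            d = wasserstein_1d(domains[u], domains[v])
            if d <= max_hop and cost + d < best.get(v, float("inf")):
                best[v] = cost + d
                prev[v] = u
                heapq.heappush(heap, (cost + d, v))
    if target not in best:
        return None
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]

# Toy domains keyed by operating condition number; adjacent conditions
# are exactly 1.0 apart in W1 distance.
toy_domains = {k: [k + 0.0, k + 1.0, k + 2.0] for k in range(6)}
path = plan_path(toy_domains, source=5, target=0, max_hop=1.5)
print(path)  # [5, 4, 3, 2, 1, 0]
```

In this toy setting the planned path passes through every intermediate operating condition in order, which corresponds to the 5→0 transfer task on the TG dataset.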

Step 1.2, data preprocessing

The acquired data is normalized and divided into a plurality of segments. The sample length s is adjusted to obtain the necessary data samples, and the obtained time series data is taken as input. For the TG dataset, the sample length is 128. For CWRU dataset and JNU dataset, the sample length is 1024.

The z-score normalization method is adopted to accelerate the convergence of gradient descent towards the optimal solution; after normalization, the mean of the data becomes 0 and the variance becomes 1. The specific conversion formula is as follows:

$x_t^{normalization} = \frac{x_t - \mu}{\sigma}$

where σ is the standard deviation of the data, μ is the mean of the data, $x_t$ is the raw sample data, and $x_t^{normalization}$ is the data normalized with z-score.
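The preprocessing described above can be sketched in Python as follows. This is a minimal sketch under assumptions not stated in the text: the population standard deviation is used, the windows do not overlap, and a single axis is processed at a time. The function names `z_score` and `segment` are illustrative.

```python
from statistics import fmean, pstdev

def z_score(values):
    """Normalize a sequence to zero mean and unit variance."""
    mu = fmean(values)
    sigma = pstdev(values)  # population standard deviation (assumption)
    if sigma == 0:
        raise ValueError("constant signal cannot be z-score normalized")
    return [(v - mu) / sigma for v in values]

def segment(values, s):
    """Split into non-overlapping windows of length s, dropping the tail."""
    return [values[i:i + s] for i in range(0, len(values) - s + 1, s)]

# Synthetic stand-in for one axis of a vibration recording.
signal = [float(i % 7) for i in range(1000)]
normalized = z_score(signal)
samples = segment(normalized, 128)   # sample length s = 128, as for the TG dataset

print(len(samples))      # 7
print(len(samples[0]))   # 128
```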

Labeled input data are acquired from the source domain $D_s$, unlabeled data are acquired from the intermediate domains $D_n$, and unlabeled data are acquired from the target domain $D_t$, wherein the data samples in $D_t$ are randomly divided so that 80% of the data form the training data set and the remaining 20% form the test data set.
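The random 80%/20% division of the target domain samples may, for example, be carried out as in the following Python sketch. The helper name `split_target` and the fixed seed are assumptions made here for reproducibility; the text does not specify how the random division is performed.

```python
import random

def split_target(samples, train_fraction=0.8, seed=0):
    """Randomly split samples into a training part and a test part."""
    rng = random.Random(seed)      # fixed seed only for reproducibility
    shuffled = list(samples)       # copy so the input is left untouched
    rng.shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]

target_samples = list(range(50))   # stand-in for target domain segments
train_set, test_set = split_target(target_samples)
print(len(train_set), len(test_set))  # 40 10
```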

Step 2, a bearing fault diagnosis model is established, wherein the bearing fault diagnosis model comprises a main classification network and a training network. The main classification network is a condition domain countermeasure network (CDAN, Conditional Domain Adversarial Network), that is, a transfer learning network built on a CNN backbone classification model. The training network is the progressive condition domain countermeasure network (GCDAN, Gradual Conditional Domain Adversarial Network) constructed by the invention. The progressive condition domain countermeasure network is formed by connecting N condition domain countermeasure networks CDAN in series, and its network structure is shown in fig. 2.

The invention uses three-dimensional vibration data to describe the working state of the bearing, so multi-modal information fusion is required for the information from the different sources. Common multi-modal fusion methods include serial fusion (sequential fusion) and parallel fusion. Compared with serial fusion, parallel fusion has the advantages of synchronous processing and information interaction: the data of the different modalities can interact in subsequent layers, so that the neural network can better learn the association and complementarity between modalities and find a better feature representation. Therefore, the invention processes the three-dimensional vibration data through multiple input channels so as to realize parallel fusion of the multi-modal information. Specifically, the convolution of layer1 of the feature extractor in CDAN is set as follows: the input has three channels, the output has six channels, and the convolution kernel size is 15.
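To illustrate how a multi-channel convolution fuses the three axes, the following pure-Python sketch computes a one-dimensional convolution with three input channels, six output channels and a kernel size of 15. Stride 1, no padding, and the uniform weights are assumptions of this sketch; the text states only the channel counts and the kernel size. In PyTorch such a layer would typically be declared as `nn.Conv1d(3, 6, kernel_size=15)`.

```python
def conv1d(x, weight, bias):
    """Valid 1-D cross-correlation (stride 1, no padding).

    x:      [in_channels][length]
    weight: [out_channels][in_channels][kernel_size]
    bias:   [out_channels]
    """
    k = len(weight[0][0])
    out_len = len(x[0]) - k + 1
    out = []
    for w_out, b_out in zip(weight, bias):
        row = []
        for t in range(out_len):
            acc = b_out
            # Every output value sums over ALL input channels, which is
            # where the x, y and z axes are fused.
            for c, w_c in enumerate(w_out):
                for j, w in enumerate(w_c):
                    acc += w * x[c][t + j]
            row.append(acc)
        out.append(row)
    return out

IN_CH, OUT_CH, KERNEL, LENGTH = 3, 6, 15, 128
x = [[1.0] * LENGTH for _ in range(IN_CH)]   # dummy (x, y, z) sample
weight = [[[1.0 / (IN_CH * KERNEL)] * KERNEL for _ in range(IN_CH)]
          for _ in range(OUT_CH)]            # placeholder averaging weights
bias = [0.0] * OUT_CH

y = conv1d(x, weight, bias)
print(len(y), len(y[0]))  # 6 114
```

With a sample length of 128 and no padding, each of the six output channels has length 128 - 15 + 1 = 114.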

When the distance between adjacent domains is large, a considerable part of the pseudo labels of the target domain are wrong, and these wrong labels have a great influence on the next domain adaptation. Theory shows that the influence of wrong labels is amplified during gradual domain adaptation, which seriously degrades performance. The pseudo-labeled data of the target domain must therefore be screened so that wrong labels are removed and only labels with high confidence are retained.

The confidence is defined as $\gamma(x; f, \phi) = \max_c p(Y = c \mid x; f, \phi)$, where $p(Y = c \mid x; f, \phi)$ is the output of the softmax layer, i.e. the probability of the corresponding label class, x is the sample, f is the loss function, φ is the classifier, c is the predicted class label of the sample, and Y is the true class label of the sample. Only data with $\gamma(x; f, \phi) > \alpha$ are retained in the data screening, where α is a threshold; for example, α is set to 0.9.

Label sharpening is then performed on the retained probability labels: the label with the highest probability is selected as the output of the classifier and is input, as new source domain data, into the next condition domain countermeasure network (CDAN).
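The screening and label sharpening described above can be sketched in Python as follows. The function names and the example logits are illustrative only; the sketch assumes that the classifier exposes its pre-softmax outputs (logits) for each sample.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def screen_and_sharpen(samples, logits_per_sample, alpha=0.9):
    """Keep samples whose confidence exceeds alpha and assign hard labels."""
    kept = []
    for x, logits in zip(samples, logits_per_sample):
        probs = softmax(logits)
        gamma = max(probs)                        # confidence of the sample
        if gamma > alpha:                         # data screening
            kept.append((x, probs.index(gamma)))  # label sharpening
    return kept

# Three hypothetical samples with five-class logits (IF, OF, BF, RF, NA).
names = ["a", "b", "c"]
logits = [
    [5.0, 0.0, 0.0, 0.0, 0.0],   # confident, class 0
    [1.0, 0.9, 0.0, 0.0, 0.0],   # ambiguous, removed
    [0.0, 0.0, 0.0, 6.0, 0.0],   # confident, class 3
]
pseudo_labeled = screen_and_sharpen(names, logits, alpha=0.9)
print(pseudo_labeled)  # [('a', 0), ('c', 3)]
```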

Step 3, the training network is trained using the source domain and the N intermediate domains in the training sample set constructed in step 1 (the input of which comprises three-dimensional vibration data), and the trained training network is then used to complete the migration from the source domain to the target domain, obtaining the main classification network model of the target domain.

The specific training process is as follows. The first condition domain countermeasure network is trained using the labeled data of the source domain and the unlabeled data of the first intermediate domain, completing the migration from the source domain to the first intermediate domain; the unlabeled data of the first intermediate domain are then classified by the first main classification network obtained after transfer learning to obtain a first classification result. The low-confidence data in the first classification result are removed, and the remaining data are taken as the source domain; the unlabeled data of the second intermediate domain are input, the second condition domain countermeasure network is trained, and the migration from the first intermediate domain to the second intermediate domain is completed; the unlabeled data of the second intermediate domain are then classified by the second main classification network obtained after transfer learning to obtain a second classification result. The low-confidence data in the second classification result are removed, and the remaining data are taken as the source domain; the unlabeled data of the third intermediate domain are input, the third condition domain countermeasure network is trained, and so on, until the training of the N-th condition domain countermeasure network is completed, the migration from intermediate domain N to the target domain is completed, and the main classification network of the target domain is obtained.
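The control flow of this progressive training can be sketched in Python as shown below. The sketch only illustrates the chaining of domains, the confidence screening and the label sharpening. The function `train_centroid_stub` is a hypothetical stand-in for training one CDAN: it fits a nearest-centroid classifier on one-dimensional toy data and ignores the unlabeled data, whereas a real CDAN would use those data for conditional adversarial alignment. Passing the target domain as the last entry of `domains` is likewise an assumption of the sketch.

```python
import math

def train_centroid_stub(xs, ys, unlabeled):
    """Stand-in for CDAN training (NOT a CDAN); returns a predict function."""
    classes = sorted(set(ys))
    centroids = {c: sum(x for x, y in zip(xs, ys) if y == c) / ys.count(c)
                 for c in classes}

    def predict_proba(x):
        scores = {c: math.exp(-abs(x - m)) for c, m in centroids.items()}
        total = sum(scores.values())
        return {c: s / total for c, s in scores.items()}

    return predict_proba

def progressive_adaptation(source_x, source_y, domains, train_fn, alpha=0.9):
    """Chain the adaptation over `domains` (intermediate domains, then target)."""
    cur_x, cur_y = list(source_x), list(source_y)
    model = None
    for unlabeled in domains:
        model = train_fn(cur_x, cur_y, unlabeled)   # one network per hop
        next_x, next_y = [], []
        for x in unlabeled:
            probs = model(x)
            label = max(probs, key=probs.get)       # label sharpening
            if probs[label] > alpha:                # drop low-confidence data
                next_x.append(x)
                next_y.append(label)
        cur_x, cur_y = next_x, next_y               # new source domain
    return model

# Toy 1-D data whose distribution drifts by +2 per domain.
source_x, source_y = [0.0, 0.5, 10.0, 10.5], [0, 0, 1, 1]
intermediate = [2.0, 2.5, 12.0, 12.5]
target = [4.0, 4.5, 14.0, 14.5]

final_model = progressive_adaptation(
    source_x, source_y, [intermediate, target], train_centroid_stub)
predictions = [max(p, key=p.get) for p in map(final_model, target)]
print(predictions)  # [0, 0, 1, 1]
```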

In order to verify the performance of GCDAN in the present invention, it was compared with existing fault diagnosis methods. Its performance under a large distribution difference was verified by performing the transfer learning experiment 2→0 on JNU, the transfer learning experiment 3→0 on CWRU, and the transfer learning experiment 5→0 on TG. Each experiment was repeated 10 times, and the mean and standard deviation of the accuracy Acc (%) are given.

GCDAN is implemented in the Python language, and the experiments were run with PyTorch 1.13.1 on a computer with Windows 11, an Intel Core i9-13900HX processor, a GeForce RTX 4070 graphics card and 16 GB of RAM.

The following algorithms were compared:

(1) Basic convolutional neural network

The network trained on the source domain is used directly to complete fault diagnosis of the target domain.

(2) Domain adversarial neural network DANN

An adversarial mechanism is adopted to learn the common features of the source domain and the target domain, thereby realizing domain adaptation to the target domain.

(3) Conditional domain adversarial network CDAN

On the basis of DANN, multilinear conditioning and entropy conditioning are adopted to improve the discriminability of the classifier and to ensure its transferability.

(4) Gradual migration GDA based on convolutional neural network self-training

The transition is realized by introducing gradually changing intermediate distributions between the source domain and the target domain; migration is performed by self-training in each domain in turn, and the backbone network is a convolutional neural network.

(5) The progressive condition domain countermeasure network GCDAN proposed in the present invention

Progressive domain adaptation is introduced into the condition domain countermeasure network: condition countermeasure training is performed in turn on the gradually changing intermediate domains to obtain the pseudo labels of each intermediate domain; after the pseudo labels with low confidence are removed, the labeled intermediate domain is used as a new source domain for training and migration to the next domain, until the migration to the target domain is completed.

Therefore, the GCDAN method achieves a remarkable improvement in bearing fault diagnosis. Experiments on the public data sets and the self-built data set demonstrate the superiority of the GCDAN method as well as its feasibility and effectiveness in industrial scenarios, and show that it is suitable for practical engineering applications.

In summary, the above embodiments are only preferred embodiments of the present invention, and are not intended to limit the scope of the present invention. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims

1. The bearing fault diagnosis method based on the progressive condition domain countermeasure network is characterized by comprising the following steps of:

step 4, acquiring three-dimensional vibration data of the bearing to be diagnosed in the mode of step 1, inputting the three-dimensional vibration data into a main classification network of the target domain obtained by training in step 3 to classify the vibration data of the bearing to be diagnosed, obtaining a label of the bearing state so as to judge the current working state of the bearing, and completing diagnosis of bearing faults;

In the step 3, training of the training network is completed by adopting the source domain and the N intermediate domains in the training sample set constructed in the step 1, and then migration from the source domain to the target domain is finally completed by adopting the training network obtained by training, so as to obtain a main classification network model of the target domain, wherein the specific mode is as follows: training the first condition domain countermeasure network by adopting the labeled data of the source domain and the unlabeled data of the first intermediate domain, completing migration from the source domain to the first intermediate domain, and classifying the unlabeled data of the first intermediate domain by using a first main classification network obtained after migration learning to obtain a first classification result; removing low-confidence data in the first classification result, taking the remaining data as a source domain, inputting unlabeled data of the second intermediate domain, training a second condition domain countermeasure network, completing migration from the first intermediate domain to the second intermediate domain, and classifying the unlabeled data of the second intermediate domain by using a second main classification network obtained after migration learning to obtain a second classification result; and removing the data with low confidence in the second classification result, taking the remaining data as a source domain, inputting unlabeled data of the third intermediate domain, training a third condition domain countermeasure network, and the like until the training of the N-th condition domain countermeasure network is completed, and completing the migration from the intermediate domain N to the target domain to obtain the main classification network of the target domain.

2. The bearing fault diagnosis method of claim 1, wherein the convolution of the first layer of feature extractors within CDAN is configured to: the input is three channels, the output is six channels, and the convolution kernel size is 15.

3. The bearing fault diagnosis method according to claim 1, wherein the confidence level is calculated using the following formula:

$\gamma(x; f, \phi) = \max_c p(Y = c \mid x; f, \phi)$

4. A bearing failure diagnosis method according to claim 3, characterized in that the low confidence is: $\gamma(x; f, \phi) \le \alpha$, where α is 0.9.

5. The bearing failure diagnosis method according to claim 1, further comprising: after the data with low confidence are removed, the label with the highest probability is selected as the output of the classifier and used as new source domain data.