CN115876467A

CN115876467A - Pseudo label transfer type two-stage field self-adaptive rolling bearing fault diagnosis method

Info

Publication number: CN115876467A
Application number: CN202211377184.4A
Authority: CN
Inventors: 张楷; 丁国富; 丁昆; 刘永志; 邹益胜; 李致萱; 刘彦涛; 秦国浩
Original assignee: Southwest Jiaotong University
Current assignee: Southwest Jiaotong University
Priority date: 2022-11-04
Filing date: 2022-11-04
Publication date: 2023-03-31

Abstract

The invention discloses a rolling bearing fault diagnosis method based on pseudo-label transfer type (transitive) two-stage domain adaptation. The method addresses the following problem in transfer fault diagnosis of rolling bearings: a significant difference in data distribution between monitoring data from rolling bearings with known health state (source data) and monitoring data from the rolling bearing to be diagnosed, whose health state is unknown (target data), makes accurate diagnosis difficult. Available intermediate data form a bridge between the source data and the target data, so that a single domain-adaptive fault diagnosis process is converted into a domain-adaptive fault diagnosis process consisting of two stages. The domain adaptation method reduces the data distribution difference step by step, the shared health-state knowledge of the rolling bearing is transferred step by step to the target data, and the fault diagnosis accuracy for the rolling bearing to be diagnosed is improved.

Description

Pseudo label transfer type two-stage field self-adaptive rolling bearing fault diagnosis method

Technical Field

The invention relates to the technical field of bearing fault diagnosis, and in particular to a rolling bearing fault diagnosis method based on pseudo-label transfer type two-stage domain adaptation.

Background

Rolling bearings operate for long periods under complex and variable speed, load, temperature and other conditions. Once a fault occurs, the normal operation of the mechanical equipment is seriously affected and casualties may even result. Fault diagnosis of rolling bearings is therefore a problem that urgently needs to be solved. An intelligent fault diagnosis model built for a constant operating condition is applicable only when (1) the training data contain sufficient label information and (2) the training data and the test data follow the same distribution. However, physical signal data acquired under different operating conditions differ significantly in distribution, so a fault diagnosis model built for a constant operating condition has difficulty diagnosing and identifying fault data from other operating conditions. In addition, labeling data is costly and time-consuming, so fully labeled supervised training for the data of every operating condition is neither efficient nor convenient in actual engineering. Using existing rolling bearing monitoring data with known health state (source data) to assist the diagnosis of rolling bearing monitoring data with unknown health state (target data) is therefore a practical engineering requirement.

In recent years, transfer learning has provided a solution to rolling bearing fault diagnosis across different operating conditions. Transfer fault diagnosis applies known supervised diagnostic knowledge to fault identification in a related domain. Feature-based domain adaptation is one of the methods commonly used in transfer diagnosis: by reducing the distribution difference between the source data features and the target data features, it allows a diagnostic model learned from the source data to be applied to the target data. However, most existing transfer diagnosis frameworks ignore the diagnostic difficulty caused by a large distribution difference between the source data and the target data. When the operating conditions change too much, and particularly when they differ greatly between different rolling bearings, direct transfer diagnosis may cause negative transfer and lower the diagnosis accuracy on the target data of the rolling bearing to be diagnosed.

Disclosure of Invention

Object of the invention: when transfer diagnosis is performed between different mechanical equipment, the distribution difference between the source-domain data and the target-domain data is often so large that direct transfer diagnosis is difficult and its accuracy is low. To address this problem, the invention provides a cross-equipment fault diagnosis method based on a deep transitive transfer learning framework.

In order to achieve the above object, the present invention provides the following technical solutions:

A rolling bearing fault diagnosis method based on pseudo-label transfer type two-stage domain adaptation, characterized by comprising the following steps:

Step 1: collecting existing rolling bearing fault data as source data $D^S$, and acquiring signal data of the rolling bearing to be diagnosed as target data $D^T$;

Step 2: calculating the data distribution similarity between the source data $D^S$ and the target data $D^T$ using a data similarity measurement method, and selecting intermediate data $D^I$ from the target data $D^T$ according to the calculation result and the selection conditions;

Step 3: constructing a rolling bearing fault diagnosis model based on a pseudo-label transfer type two-stage domain adaptation network;

Step 4: training the model of step 3 with the source data $D^S$, the target data $D^T$ and the intermediate data $D^I$ from steps 1 and 2, and performing fault diagnosis on the rolling bearing to be diagnosed;

Preferably, step 1 specifically includes:

The source data $D^S$ are the available rolling bearing data with known health state and comprise fault data under $N$ operating conditions, namely

$$D^S = \{D^S_n\}_{n=1}^{N}, \qquad D^S_n = \{(x^S_i, y^S_i)\}_{i=1}^{n_S},$$

where $n_S$ is the number of samples, $x^S_i$ is the $i$-th sample, $y^S_i \in \{1, \dots, C\}$ is the corresponding fault label, and $C$ is the number of health state categories. The target data $D^T$ are the signal data of the rolling bearing to be diagnosed, whose health state is unknown, and comprise fault data under $M$ operating conditions, namely

$$D^T = \{D^T_m\}_{m=1}^{M}, \qquad D^T_m = \{x^T_j\}_{j=1}^{n_T},$$

where $x^T_j$ is the $j$-th sample and $n_T$ is the number of samples;

preferably, the step 2 specifically includes:

Step A: $S(D_i, D_j) \in [0, 1]$ denotes the similarity between data $D_i$ and data $D_j$. A value of $S(D_i, D_j)$ close to 1 indicates that $D_i$ and $D_j$ are very similar; a value close to 0 indicates that their data distributions differ significantly. The data similarity measure is constructed as follows: the samples of $D_i$ and $D_j$ are labeled 0 and 1 respectively, and a linear support vector machine classifier $h$ is trained to discriminate between the samples of $D_i$ and $D_j$. The loss $\mathrm{err}(h)$ of $h$ is then calculated:

$$\mathrm{err}(h) = \frac{1}{m + m'} \left( \sum_{x_i} I[h(x_i) \neq 0] + \sum_{x_j} I[h(x_j) \neq 1] \right)$$

where $I[a]$ is the indicator function, equal to 1 when $a$ is true and 0 otherwise, and $x_i$ and $x_j$ are the $m$ and $m'$ samples drawn from $D_i$ and $D_j$ respectively. $S(D_i, D_j)$ is then calculated from $\mathrm{err}(h)$; the expression for this step is not preserved in this text.
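The error computation in step A can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the patented implementation: the classifier `h` is a hypothetical stand-in (a fixed threshold) for a trained linear support vector machine, and the mapping `S = min(1, 2 * err)` is an assumption introduced here only so that `S` lies in [0, 1] and approaches 1 when the two data sets cannot be told apart. The function names are also illustrative.

```python
from typing import Callable, Sequence

Sample = Sequence[float]


def classification_error(h: Callable[[Sample], int],
                         d_i: Sequence[Sample],
                         d_j: Sequence[Sample]) -> float:
    """Fraction of samples that h assigns to the wrong data set.

    Samples from d_i carry label 0 and samples from d_j carry label 1.
    """
    wrong_i = sum(1 for x in d_i if h(x) != 0)
    wrong_j = sum(1 for x in d_j if h(x) != 1)
    return (wrong_i + wrong_j) / (len(d_i) + len(d_j))


def similarity_from_error(err: float) -> float:
    """ASSUMPTION: S = 2 * err, clipped to [0, 1]; not taken from the source."""
    return max(0.0, min(1.0, 2.0 * err))


def h(x: Sample) -> int:
    """Hypothetical stand-in for a trained linear classifier."""
    return 1 if x[0] > 5.0 else 0


# Well-separated data sets: h never errs, so the similarity is 0.
far_i = [[0.0], [1.0], [2.0], [3.0]]
far_j = [[7.0], [8.0], [9.0], [10.0]]
s_far = similarity_from_error(classification_error(h, far_i, far_j))

# Overlapping data sets: h is wrong on half of the samples, so the similarity is 1.
mixed_i = [[0.0], [6.0]]
mixed_j = [[4.0], [8.0]]
s_mixed = similarity_from_error(classification_error(h, mixed_i, mixed_j))
```

In practice `h` would come from whichever linear SVM implementation is available; only the error count and the mapping to a similarity are shown here.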

Step B: using the data similarity measurement method of step A, calculating the distribution similarity between the source data containing $N$ operating-condition data sets and the target data containing $M$ operating-condition data sets, namely

$$S(D^S_n, D^T_m),$$

where $n \in \{1, \dots, N\}$ and $m \in \{1, \dots, M\}$;

Step C: selecting the available intermediate data $D^I$ according to the calculation result of step B and the following conditions 1 to 3:

Condition 1: $D^I \in \{D^T_m\}_{m=1}^{M}$, that is, the intermediate data are selected from the target data;

Condition 2: $S(D^S, D^I) > S(D^S, D^T)$ and $S(D^I, D^T) > S(D^S, D^T)$, that is, the similarity between the source data and the intermediate data and the similarity between the intermediate data and the target data are both greater than the similarity between the source data and the target data;

Condition 3: $D^I = \arg\min_{D^I} \left| S(D^S, D^I) - S(D^I, D^T) \right|$, that is, when more than one candidate $D^I$ satisfies condition 2, the candidate with the smallest absolute difference between $S(D^S, D^I)$ and $S(D^I, D^T)$ is used as the final intermediate data;
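The selection logic of conditions 1 to 3 can be sketched as follows. This is an illustrative sketch, not the patented implementation: the function name, the dictionary layout and the source-to-target similarity of 0.30 are hypothetical, and the candidate similarity values are illustrative numbers rather than results.

```python
from typing import Dict, Optional, Tuple


def select_intermediate(s_source_target: float,
                        candidates: Dict[str, Tuple[float, float]]) -> Optional[str]:
    """Pick the intermediate data set among candidate target operating conditions.

    candidates maps a name to (S(source, candidate), S(candidate, target)).
    Condition 1 is the caller's responsibility: every candidate must be one
    of the operating-condition data sets contained in the target data.
    """
    # Condition 2: both bridge similarities must exceed the direct similarity.
    eligible = {
        name: (s_si, s_it)
        for name, (s_si, s_it) in candidates.items()
        if s_si > s_source_target and s_it > s_source_target
    }
    if not eligible:
        return None  # no usable bridge between source and target
    # Condition 3: prefer the candidate whose two similarities are most balanced.
    return min(eligible, key=lambda name: abs(eligible[name][0] - eligible[name][1]))


# Illustrative values; 0.30 is a hypothetical source-to-target similarity.
chosen = select_intermediate(0.30, {"B1": (0.5467, 0.6029), "B2": (0.4267, 0.701)})
```

Returning `None` when no candidate passes condition 2 is a design choice of this sketch; the text does not say what happens in that case.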

Preferably, step 3 specifically includes:

Step A: the rolling bearing fault diagnosis model based on the pseudo-label transfer type two-stage domain adaptation network comprises a two-stage network model: stage I, from the source data $D^S$ to the intermediate data $D^I$, and stage II, from the intermediate data $D^I$ to the target data $D^T$. The connection between stage I and stage II is established by a pseudo-label constraint method based on threshold decision. First, the deep diagnosis network model of stage I is constructed and trained. The stage I network mainly comprises a feature extraction module, a classification module and a domain adaptation module. The feature extraction module consists of a convolutional layer, a pooling layer and a global average pooling layer; its mapping function is $f_{g1}(x, \theta_{g1})$ and its parameters to be trained are $\theta_{g1}$. The classification module consists of a fully connected layer, a Dropout layer and a Softmax layer; its mapping function is $f_{c1}(x, \theta_{c1})$ and its parameters to be trained are $\theta_{c1}$. The domain adaptation module uses a domain adaptation method based on the maximum mean discrepancy (MMD) to reduce the feature distribution difference between different data; its mapping function is $d^2$.

The training process of stage I is as follows. The source data sample set with known health state, $\{(x^S_i, y^S_i)\}_{i=1}^{n_S}$, and the intermediate data sample set with unknown health state, $\{x^I_i\}_{i=1}^{N_I}$, are input into the stage I network to optimize the model parameters of this stage. Through the feature extraction module, the source data and the intermediate data produce the corresponding source data features $F^S = f_{g1}(x^S, \theta_{g1})$ and intermediate data features $F^I = f_{g1}(x^I, \theta_{g1})$. The source data features $F^S$ then pass through the classification module of stage I to obtain the predicted labels of the source data samples, and the error between the predicted labels and the true labels is measured by the following cross-entropy cost function:

$$L_{c1} = -\frac{1}{n_S} \sum_{i=1}^{n_S} \sum_{v=1}^{C} I\{y^S_i = v\} \log f^{(v)}_{c1}(F^S_i, \theta_{c1})$$

where $f^{(v)}_{c1}$ is the $v$-th dimension of the classification probability distribution finally output by the classification module, and $I\{\cdot\}$ is the indicator function, equal to 1 when $y^S_i = v$ is satisfied and 0 otherwise. In addition, the distribution difference between the source data features $F^S$ and the intermediate data features $F^I$ is reduced by the domain adaptation module of stage I, and this part of the loss function is expressed as:

$$L_{d1} = d^2(F^S, F^I)$$

Preferably, the domain adaptation method based on the maximum mean discrepancy used in the domain adaptation module is specifically as follows:

Let $H_k$ denote the reproducing kernel Hilbert space (RKHS) associated with a characteristic kernel $k$. Given two distributions $P$ and $Q$, there is a nonlinear mapping function $\phi(\cdot) \in H_k$ that maps the data into the RKHS $H_k$. The maximum mean discrepancy used in the domain adaptation module can theoretically be calculated by the following formula:

$$d(P, Q) = \sup_{\|\phi\|_{H_k} \le 1} \left( E_P[\phi(x^p)] - E_Q[\phi(x^q)] \right)$$

where $\sup(\cdot)$ is the supremum of the input set, $E_P[\cdot]$ and $E_Q[\cdot]$ denote the expectations with respect to the distributions $P$ and $Q$ respectively, and $\|\phi\|_{H_k} \le 1$ means that $\phi(\cdot)$ ranges over the functions within the unit ball of the RKHS. Given two groups of samples drawn independently and identically distributed from $P$ and $Q$, $X^p = \{x^p_i\}_{i=1}^{n_p}$ and $X^q = \{x^q_j\}_{j=1}^{n_q}$, the empirical estimate of the maximum mean discrepancy is:

$$d^2(X^p, X^q) = \left\| \frac{1}{n_p} \sum_{i=1}^{n_p} \phi(x^p_i) - \frac{1}{n_q} \sum_{j=1}^{n_q} \phi(x^q_j) \right\|^2_{H_k}$$

where the kernel mapping is $k(x^p, x^q) = \langle \phi(x^p), \phi(x^q) \rangle$. In the multi-kernel maximum mean discrepancy, an optimal kernel composed of several characteristic kernels can better approximate the distribution in the feature space. The multi-kernel family $\mathcal{K}$ can be defined as the convex combinations of $m$ kernels:

$$\mathcal{K} = \left\{ k = \sum_{u=1}^{m} \beta_u k_u \right\}$$

where the constraint conditions $\sum_{u=1}^{m} \beta_u = 1$ and $\beta_u \ge 0$ ensure that the derived multi-kernel $k$ is characteristic, $u$ is the kernel index, and $\beta_u$ is the coefficient of the $u$-th kernel.
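The empirical multi-kernel maximum mean discrepancy can be illustrated with a short pure-Python sketch. It expands the squared RKHS norm into kernel averages and uses a convex combination of Gaussian kernels. The choice of Gaussian kernels, the bandwidths and the uniform coefficients are assumptions made for illustration; the text does not specify which kernels or coefficients are used.

```python
import math
from typing import List, Optional, Sequence

Vector = Sequence[float]


def multi_gaussian_kernel(x: Vector, y: Vector,
                          bandwidths: Sequence[float],
                          betas: Sequence[float]) -> float:
    """Convex combination sum_u beta_u * k_u(x, y) of Gaussian kernels."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return sum(beta * math.exp(-sq_dist / (2.0 * s * s))
               for beta, s in zip(betas, bandwidths))


def mk_mmd2(xp: Sequence[Vector], xq: Sequence[Vector],
            bandwidths: Sequence[float] = (0.5, 1.0, 2.0),
            betas: Optional[List[float]] = None) -> float:
    """Biased empirical estimate of the squared MMD between two sample groups."""
    if betas is None:
        # ASSUMPTION: uniform coefficients; they satisfy the convexity constraint.
        betas = [1.0 / len(bandwidths)] * len(bandwidths)
    if any(b < 0.0 for b in betas) or abs(sum(betas) - 1.0) > 1e-9:
        raise ValueError("betas must be non-negative and sum to 1")

    def mean_kernel(a: Sequence[Vector], b: Sequence[Vector]) -> float:
        total = sum(multi_gaussian_kernel(u, v, bandwidths, betas)
                    for u in a for v in b)
        return total / (len(a) * len(b))

    # ||mean phi(xp) - mean phi(xq)||^2 expanded with k(x, y) = <phi(x), phi(y)>.
    return mean_kernel(xp, xp) - 2.0 * mean_kernel(xp, xq) + mean_kernel(xq, xq)


source_features = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1]]
shifted_features = [[3.0, 3.0], [3.1, 3.2], [3.2, 3.1]]
same = mk_mmd2(source_features, source_features)
shifted = mk_mmd2(source_features, shifted_features)
```

Identical sample groups give a value of zero, and the value grows as the two groups move apart, which is the quantity the domain adaptation module is trained to reduce.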

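The domain-adapted intermediate sample feature $F^I_i$ passes through the classification module to obtain the predicted probability $p^I_i$ of the rolling bearing health state categories for that sample, that is, the output of the last layer of the classification module:

$$p^I_i = f_{c1}(F^I_i, \theta_{c1})$$

where $F^I_i$ is the intermediate sample feature and $\theta_{c1}$ are the training parameters of the classification module. $p^I_i$ is a one-dimensional probability distribution vector, and the label corresponding to its maximum value is the predicted health state category of the sample $x^I_i$. To avoid model overfitting, an L2 regularization penalty $\|\theta_1\|_2^2$, with $\theta_1 = \{\theta_{g1}, \theta_{c1}\}$, is applied to the model training parameters, and the objective function of stage I is expressed as:

$$L_{\mathrm{I}} = L_{c1} + \alpha L_{d1} + \beta \|\theta_1\|_2^2$$

where $L_{c1}$ is the cross-entropy classification loss on the source data samples, $L_{d1}$ is the domain adaptation loss between the source data features and the intermediate data features, and $\alpha$ and $\beta$ are penalty coefficients.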

Step B: a pseudo-label constraint method based on threshold decision is constructed. By setting a threshold on the predicted probability $p^I_i$ of each intermediate data sample $x^I_i$ in the network model, high-confidence intermediate data samples $\hat{x}^I_i$ are screened out. These screened intermediate data samples $\hat{x}^I_i$ are used for the training of stage II, and the generated pseudo-label $\hat{y}^I_i$ can be expressed as:

$$\hat{y}^I_i = I\{\max(p^I_i) \ge \tau\} \cdot \arg\max_{v} p^I_{i,v}$$

where $\tau \in (0, 1)$ is the threshold for generating a pseudo-label, $p^I_{i,v}$ is the $v$-th component of $p^I_i$, and $I\{\cdot\}$ is the indicator function. When $\max(p^I_i) \ge \tau$ is satisfied, $I\{\cdot\} = 1$ and the sample $x^I_i$ generates the corresponding pseudo-label $\hat{y}^I_i$; otherwise the sample generates no pseudo-label and is not used in stage II. When the number of pseudo-labeled intermediate data samples reaches the proportion $\rho$ of the total number of intermediate data samples participating in the training, the pseudo-labeled intermediate data samples are used for the training of stage II. The high-confidence intermediate data sample set is therefore expressed as:

$$\hat{D}^I = \{(\hat{x}^I_i, \hat{y}^I_i)\}_{i=1}^{\hat{n}_I}, \qquad \hat{n}_I \ge \rho N_I$$

where $\hat{n}_I$ is the number of pseudo-labeled intermediate data samples, $\rho \in (0, 1)$ is a proportion parameter used to decide whether to end the stage I training, and $N_I$ is the total number of intermediate data samples participating in the training.
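The threshold decision and the proportion check can be sketched as below. This is a minimal illustration under stated assumptions: the function names are hypothetical, class labels are 0-based, the comparison with the threshold is taken as greater than or equal, and the probability vectors are made-up values rather than model outputs.

```python
from typing import List, Sequence, Tuple


def generate_pseudo_labels(probabilities: Sequence[Sequence[float]],
                           tau: float) -> List[Tuple[int, int]]:
    """Return (sample index, pseudo-label) for every high-confidence sample.

    A sample receives a pseudo-label only if its largest predicted class
    probability reaches the threshold tau (ASSUMPTION: >= rather than >).
    """
    selected = []
    for index, p in enumerate(probabilities):
        best_class = max(range(len(p)), key=lambda v: p[v])
        if p[best_class] >= tau:
            selected.append((index, best_class))
    return selected


def stage_one_can_end(num_pseudo_labeled: int, n_total: int, rho: float) -> bool:
    """True once the pseudo-labeled share of intermediate samples reaches rho."""
    return num_pseudo_labeled >= rho * n_total


# Made-up softmax outputs for four intermediate samples and three classes.
probs = [
    [0.95, 0.03, 0.02],
    [0.40, 0.35, 0.25],
    [0.05, 0.90, 0.05],
    [0.10, 0.10, 0.80],
]
pseudo = generate_pseudo_labels(probs, tau=0.9)
ready = stage_one_can_end(len(pseudo), len(probs), rho=0.5)
```

Samples that fall below the threshold are simply left out of the returned list, which mirrors the statement that they are not used in stage II.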

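Step C: the deep diagnosis network structure of stage II is constructed. The network structure of stage II is similar to that of stage I and likewise comprises a feature extraction module, a classification module and a domain adaptation module; the corresponding mapping functions are $f_{g2}(x, \theta_{g2})$ and $f_{c2}(x, \theta_{c2})$, and the parameters to be trained are $\theta_{g2}$ and $\theta_{c2}$. The training process of stage II is as follows: the intermediate data sample set with pseudo-label information, $\hat{D}^I = \{(\hat{x}^I_i, \hat{y}^I_i)\}$, and the target data sample set with unknown health state, $\{x^T_j\}_{j=1}^{n_T}$, are input into the stage II network to optimize the model parameters of this stage. Similar to the stage I training process, the objective function of stage II is expressed as:

$$L_{\mathrm{II}} = L_{c2} + \alpha L_{d2} + \beta \|\theta_2\|_2^2$$

where $L_{c2}$ is the cross-entropy classification loss on the pseudo-labeled intermediate data samples, $L_{d2}$ is the domain adaptation loss between the intermediate data features and the target data features, $\theta_2 = \{\theta_{g2}, \theta_{c2}\}$, and $\alpha$ and $\beta$ are penalty coefficients.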

Wherein, step 4 specifically comprises:

The network model parameters of stage I and stage II are optimized by backpropagation with the Adam algorithm. The total number of training iterations for stage I and stage II is $K$. When the number of training iterations reaches $K$, the unlabeled target data samples are input into the trained network model, the Softmax layer of the classification module outputs the predicted probability distribution vector of each target-domain sample, and the label corresponding to the maximum probability is the predicted health state category of the tested sample.

Compared with the prior art, the invention has the beneficial effects that:

Compared with the prior art, the scheme of the application introduces the concept of intermediate data, so that the traditional single transfer diagnosis process from source data to target data is converted into a two-stage transfer diagnosis process: from the source data to the intermediate data (stage I) and from the intermediate data to the target data (stage II). The invention provides a data similarity measurement method for calculating the degree of similarity between source data containing several operating conditions and target data containing several operating conditions. The invention proposes 3 conditions that available intermediate data should satisfy, which are used to select the intermediate data from target data containing several operating conditions. The invention generates pseudo-labels for the intermediate data samples, thereby passing the fault label information on to assist the diagnosis of the target data. The invention provides a pseudo-label constraint method based on threshold decision, which screens out intermediate data samples with higher confidence in order to build the pseudo-label transfer type two-stage domain adaptation network and reduce the accumulated error of transitive transfer. When the distribution difference between the source data and the target data is significant, the method divides the diagnosis process from source data to target data into two diagnosis processes with more similar data distributions, namely from the source data to the intermediate data and from the intermediate data to the target data. The invention provides a two-stage domain adaptation method based on a convolutional neural network, which exploits the strong nonlinear feature mapping capability of deep neural networks and improves the extraction of nonlinear features from the physical signals of rolling bearings. The method can avoid the negative transfer caused by direct transfer diagnosis from source data to target data, so that the health state of the rolling bearing is identified accurately, the diagnosis accuracy on the target data is improved, and a reference method is provided for rolling bearing fault diagnosis under actual industrial conditions.

Description of the drawings:

FIG. 1 is a schematic diagram of the two-stage domain adaptation principle of the invention;

FIG. 2 is a schematic flow diagram of the invention;

FIG. 3 is a schematic diagram of the pseudo-label transfer type two-stage domain adaptation network model of the invention;

FIG. 4 shows the result of the distribution similarity calculation between source data and target data according to the invention;

FIG. 5 shows the details of the source and target bearing data sets used in the invention;

FIG. 6 shows the result of the data distribution similarity calculation between the different data sets;

FIG. 7 shows the diagnostic results of the experimental tests for the different methods.

Detailed Description

In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be described clearly and completely with reference to the accompanying drawings. It is to be understood that the embodiments described are only a few embodiments of the present invention, and not all embodiments.

Thus, the following detailed description of the embodiments of the invention is not intended to limit the scope of the invention as claimed, but is merely representative of some embodiments of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

It should be noted that the embodiments of the present invention and the features and technical solutions thereof may be combined with each other without conflict.

It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.

In the description of the present invention, it should be noted that the terms "upper", "lower", and the like refer to orientations or positional relationships based on orientations or positional relationships shown in the drawings, orientations or positional relationships that are usually used for placing the products of the present invention, or orientations or positional relationships that are usually understood by those skilled in the art, and these terms are only used for convenience of description and simplification of the description, and do not indicate or imply that the devices or elements referred to must have specific orientations, be constructed and operated in specific orientations, and thus, should not be construed as limiting the present invention. Furthermore, the terms "first," "second," and the like are used merely to distinguish one description from another, and are not to be construed as indicating or implying relative importance.

FIG. 1 illustrates the principle of the invention in two parts, a and b. Part a of FIG. 1 shows the scenario to which the invention applies: the monitoring data of the rolling bearing to be diagnosed contain intermediate data whose distribution lies between that of the source data and that of the target data. Once the intermediate data are added, one transfer diagnosis process with a large distribution difference, and therefore high transfer difficulty, can be converted into two transfer diagnosis processes with smaller distribution differences and lower transfer difficulty. Part b of FIG. 1 shows the principle of transitive two-stage domain adaptation, namely converting a direct transfer diagnosis process with a large domain shift into two transfer diagnosis stages (stage I and stage II) with small domain shifts. In this way, the shared health-state knowledge of the source data is transferred step by step to the target data, achieving high-accuracy diagnosis of the target data samples.

FIG. 2 shows the pseudo-label transfer type two-stage domain-adaptive fault diagnosis process of the invention, which specifically includes the following steps:

Step 1: collecting existing rolling bearing fault data as source data $D^S$, and acquiring signal data of the rolling bearing to be diagnosed as target data $D^T$. The source data $D^S$ are the available rolling bearing data with known health state and comprise fault data under $N$ operating conditions, namely

$$D^S = \{D^S_n\}_{n=1}^{N}, \qquad D^S_n = \{(x^S_i, y^S_i)\}_{i=1}^{n_S},$$

where $x^S_i$ is the $i$-th sample, $y^S_i \in \{1, \dots, C\}$ is the fault label of $x^S_i$, $n_S$ is the number of samples, and $C$ is the number of health state categories. The target data $D^T$ are the signal data of the rolling bearing to be diagnosed, whose health state is unknown, and comprise fault data under $M$ operating conditions, namely

$$D^T = \{D^T_m\}_{m=1}^{M}, \qquad D^T_m = \{x^T_j\}_{j=1}^{n_T},$$

where $x^T_j$ is the $j$-th sample and $n_T$ is the number of samples;

Step 2: calculating the data distribution similarity between the source data $D^S$ and the target data $D^T$ using a data similarity measurement method, and selecting intermediate data $D^I$ from the target data $D^T$ according to the calculation result and the selection conditions. First, $S(D_i, D_j) \in [0, 1]$ denotes the similarity between data $D_i$ and data $D_j$. A value of $S(D_i, D_j)$ close to 1 indicates that $D_i$ and $D_j$ are very similar; a value close to 0 indicates that their data distributions differ significantly. The data similarity measure is constructed as follows: the samples of $D_i$ and $D_j$ are labeled 0 and 1 respectively, and a linear support vector machine classifier $h$ is trained to discriminate between the samples of $D_i$ and $D_j$. The loss $\mathrm{err}(h)$ of $h$ is then estimated:

$$\mathrm{err}(h) = \frac{1}{m + m'} \left( \sum_{x_i} I[h(x_i) \neq 0] + \sum_{x_j} I[h(x_j) \neq 1] \right)$$

where $I[a]$ is the indicator function, equal to 1 when $a$ is true and 0 otherwise, and $x_i$ and $x_j$ are the $m$ and $m'$ samples drawn from $D_i$ and $D_j$ respectively. $S(D_i, D_j)$ is then calculated from $\mathrm{err}(h)$; the expression for this step is not preserved in this text.

Then, the data similarity measurement method is used to calculate the distribution similarity between the source data containing $N$ operating-condition data sets and the target data containing $M$ operating-condition data sets, namely

$$S(D^S_n, D^T_m),$$

where $n \in \{1, \dots, N\}$ and $m \in \{1, \dots, M\}$; the result is shown in FIG. 4;

Finally, the available intermediate data $D^I$ are selected according to the above calculation result and the following conditions 1 to 3:

Condition 1: $D^I \in \{D^T_m\}_{m=1}^{M}$, that is, the intermediate data are selected from the target data;

Condition 2: $S(D^S, D^I) > S(D^S, D^T)$ and $S(D^I, D^T) > S(D^S, D^T)$, that is, the similarity between the source data and the intermediate data and the similarity between the intermediate data and the target data are both greater than the similarity between the source data and the target data;

Condition 3: $D^I = \arg\min_{D^I} \left| S(D^S, D^I) - S(D^I, D^T) \right|$, that is, when more than one candidate $D^I$ satisfies condition 2, the candidate with the smallest absolute difference between $S(D^S, D^I)$ and $S(D^I, D^T)$ is used as the final intermediate data;

Step 3: constructing a rolling bearing fault diagnosis model based on a pseudo-label transfer type two-stage domain adaptation network. The model comprises a two-stage network model: stage I, from the source data $D^S$ to the intermediate data $D^I$, and stage II, from the intermediate data $D^I$ to the target data $D^T$. The connection between stage I and stage II is established by a pseudo-label constraint method based on threshold decision. FIG. 3 shows the model structure of the constructed pseudo-label transfer type two-stage domain adaptation network.

First, the deep diagnosis network model of stage I is constructed and trained. The stage I network mainly comprises a feature extraction module, a classification module and a domain adaptation module. The feature extraction module consists of a convolutional layer, a pooling layer and a global average pooling layer; its mapping function is $f_{g1}(x, \theta_{g1})$ and its parameters to be trained are $\theta_{g1}$. The classification module consists of a fully connected layer, a Dropout layer and a Softmax layer; its mapping function is $f_{c1}(x, \theta_{c1})$ and its parameters to be trained are $\theta_{c1}$. The domain adaptation module uses a domain adaptation method based on the maximum mean discrepancy (MMD) to reduce the feature distribution difference between different data; its mapping function is $d^2$.

The training process of stage I is as follows. The source data sample set with known health state, $\{(x^S_i, y^S_i)\}_{i=1}^{n_S}$, and the intermediate data sample set with unknown health state, $\{x^I_i\}_{i=1}^{N_I}$, are input into the stage I network to optimize the model parameters of this stage. Through the feature extraction module, the source data and the intermediate data produce the corresponding source data features $F^S = f_{g1}(x^S, \theta_{g1})$ and intermediate data features $F^I = f_{g1}(x^I, \theta_{g1})$. The source data features $F^S$ then pass through the classification module of stage I to obtain the predicted labels of the source data samples, and the error between the predicted labels and the true labels is measured by the following cross-entropy cost function:

$$L_{c1} = -\frac{1}{n_S} \sum_{i=1}^{n_S} \sum_{v=1}^{C} I\{y^S_i = v\} \log f^{(v)}_{c1}(F^S_i, \theta_{c1})$$

where $f^{(v)}_{c1}$ is the $v$-th dimension of the classification probability distribution finally output by the classification module, and $I\{\cdot\}$ is the indicator function, equal to 1 when $y^S_i = v$ is satisfied and 0 otherwise. In addition, the distribution difference between the source data features $F^S$ and the intermediate data features $F^I$ is reduced by the domain adaptation module of stage I, and this part of the loss function is expressed as:

$$L_{d1} = d^2(F^S, F^I)$$
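The cross-entropy cost with an indicator over the class dimension can be checked numerically with a short sketch. The code below is illustrative only: the function names are hypothetical, class labels are 0-based, and the inputs are made-up logits rather than outputs of the described network.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax, standing in for the Softmax layer."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probabilities: Sequence[Sequence[float]],
                  labels: Sequence[int]) -> float:
    """Mean cross-entropy between predicted distributions and true labels."""
    loss = 0.0
    for p, y in zip(probabilities, labels):
        for v, p_v in enumerate(p):
            if v == y:  # indicator: only the true class contributes
                loss -= math.log(p_v)
    return loss / len(labels)


# A confident correct prediction costs nothing; a uniform one costs log(C).
perfect = cross_entropy([[1.0, 0.0, 0.0]], [0])
uniform = cross_entropy([softmax([0.0, 0.0, 0.0])], [2])
```

With three health state categories, the uniform prediction gives a loss of log 3, which is a convenient sanity value when an untrained classifier is first evaluated.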

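The domain adaptation method based on the maximum mean discrepancy used by the domain adaptation module is specifically as follows:

Let $H_k$ denote the reproducing kernel Hilbert space (RKHS) associated with a characteristic kernel $k$. Given two distributions $P$ and $Q$, there is a nonlinear mapping function $\phi(\cdot) \in H_k$ that maps the data into the RKHS $H_k$. The maximum mean discrepancy used in the domain adaptation module can theoretically be calculated by:

$$d(P, Q) = \sup_{\|\phi\|_{H_k} \le 1} \left( E_P[\phi(x^p)] - E_Q[\phi(x^q)] \right)$$

where $\sup(\cdot)$ is the supremum of the input set, $E_P[\cdot]$ and $E_Q[\cdot]$ denote the expectations with respect to the distributions $P$ and $Q$ respectively, and $\|\phi\|_{H_k} \le 1$ means that $\phi(\cdot)$ ranges over the functions within the unit ball of the RKHS. Given two groups of samples drawn independently and identically distributed from $P$ and $Q$, $X^p = \{x^p_i\}_{i=1}^{n_p}$ and $X^q = \{x^q_j\}_{j=1}^{n_q}$, the empirical estimate of the maximum mean discrepancy is:

$$d^2(X^p, X^q) = \left\| \frac{1}{n_p} \sum_{i=1}^{n_p} \phi(x^p_i) - \frac{1}{n_q} \sum_{j=1}^{n_q} \phi(x^q_j) \right\|^2_{H_k}$$

where the kernel mapping is $k(x^p, x^q) = \langle \phi(x^p), \phi(x^q) \rangle$. In the multi-kernel maximum mean discrepancy, an optimal kernel composed of several characteristic kernels can better approximate the distribution in the feature space. The multi-kernel family $\mathcal{K}$ can be defined as the convex combinations of $m$ kernels:

$$\mathcal{K} = \left\{ k = \sum_{u=1}^{m} \beta_u k_u \right\}$$

where the constraint conditions $\sum_{u=1}^{m} \beta_u = 1$ and $\beta_u \ge 0$ ensure that the derived multi-kernel $k$ is characteristic, $u$ is the kernel index, and $\beta_u$ is the coefficient of the $u$-th kernel.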

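The domain-adapted intermediate sample feature $F^I_i$ passes through the classification module to obtain the predicted probability $p^I_i$ of the rolling bearing health state categories for that sample, that is, the output of the last layer of the classification module:

$$p^I_i = f_{c1}(F^I_i, \theta_{c1})$$

where $F^I_i$ is the intermediate sample feature and $\theta_{c1}$ are the training parameters of the classification module. $p^I_i$ is a one-dimensional probability distribution vector, and the label corresponding to its maximum value is the predicted health state category of the sample $x^I_i$. To avoid model overfitting, an L2 regularization penalty $\|\theta_1\|_2^2$, with $\theta_1 = \{\theta_{g1}, \theta_{c1}\}$, is applied to the model training parameters, and the objective function of stage I is expressed as:

$$L_{\mathrm{I}} = L_{c1} + \alpha L_{d1} + \beta \|\theta_1\|_2^2$$

where $L_{c1}$ is the cross-entropy classification loss on the source data samples, $L_{d1}$ is the domain adaptation loss between the source data features and the intermediate data features, and $\alpha$ and $\beta$ are penalty coefficients. The network model parameters of stage I are optimized by backpropagation with the Adam algorithm.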

Then, a pseudo-label constraint method based on threshold decision is constructed. By setting a threshold on the predicted probability $p^I_i$ of each intermediate data sample $x^I_i$ in the network model, high-confidence intermediate data samples $\hat{x}^I_i$ are screened out. These screened intermediate data samples $\hat{x}^I_i$ are used for the training of stage II, and the generated pseudo-label $\hat{y}^I_i$ can be expressed as:

$$\hat{y}^I_i = I\{\max(p^I_i) \ge \tau\} \cdot \arg\max_{v} p^I_{i,v}$$

where $\tau \in (0, 1)$ is the threshold for generating a pseudo-label, $p^I_{i,v}$ is the $v$-th component of $p^I_i$, and $I\{\cdot\}$ is the indicator function. When $\max(p^I_i) \ge \tau$ is satisfied, $I\{\cdot\} = 1$ and the sample $x^I_i$ generates the corresponding pseudo-label $\hat{y}^I_i$; otherwise the sample generates no pseudo-label and is not used in stage II. When the number of pseudo-labeled intermediate data samples reaches the proportion $\rho$ of the total number of intermediate data samples participating in the training, the pseudo-labeled intermediate data samples are used for the training of stage II. The high-confidence intermediate data sample set is therefore expressed as:

$$\hat{D}^I = \{(\hat{x}^I_i, \hat{y}^I_i)\}_{i=1}^{\hat{n}_I}, \qquad \hat{n}_I \ge \rho N_I$$

where $\hat{n}_I$ is the number of pseudo-labeled intermediate data samples, $\rho \in (0, 1)$ is a proportion parameter used to decide whether to end the stage I training, and $N_I$ is the total number of intermediate data samples participating in the training.

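Finally, the deep diagnosis network structure of stage II is constructed. The network structure of stage II is similar to that of stage I and likewise comprises a feature extraction module, a classification module and a domain adaptation module; the corresponding mapping functions are $f_{g2}(x, \theta_{g2})$ and $f_{c2}(x, \theta_{c2})$, and the parameters to be trained are $\theta_{g2}$ and $\theta_{c2}$. The training process of stage II is as follows: the intermediate data sample set with pseudo-label information, $\hat{D}^I = \{(\hat{x}^I_i, \hat{y}^I_i)\}$, and the target data sample set with unknown health state, $\{x^T_j\}_{j=1}^{n_T}$, are input into the stage II network to optimize the model parameters of this stage. Similar to the stage I training process, the objective function of stage II is expressed as:

$$L_{\mathrm{II}} = L_{c2} + \alpha L_{d2} + \beta \|\theta_2\|_2^2$$

where $L_{c2}$ is the cross-entropy classification loss on the pseudo-labeled intermediate data samples, $L_{d2}$ is the domain adaptation loss between the intermediate data features and the target data features, $\theta_2 = \{\theta_{g2}, \theta_{c2}\}$, and $\alpha$ and $\beta$ are penalty coefficients. The network model parameters of stage II are optimized by backpropagation with the Adam algorithm.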

Step 4: the model of step 3 is trained with the source data $D^S$, the target data $D^T$ and the intermediate data $D^I$ from steps 1 and 2, and fault diagnosis is performed on the rolling bearing to be diagnosed. The total number of training iterations for stage I and stage II is $K$. When the number of training iterations reaches $K$, the unlabeled target data samples are input into the trained network model, and the Softmax layer of the classification module outputs the predicted probability distribution vector of each target-domain sample; the label corresponding to the maximum probability is the predicted health state category of the tested sample.
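The final decision rule of step 4, taking the label with the maximum predicted probability, can be sketched as follows. The class names follow the three health states of the embodiment, but their ordering, the function name and the probability vector are assumptions made for illustration.

```python
from typing import Sequence, Tuple

# ASSUMPTION: this ordering of the health states is illustrative only.
CLASS_NAMES = ("normal", "outer ring fault", "inner ring fault")


def predict_health_state(probabilities: Sequence[float],
                         class_names: Sequence[str] = CLASS_NAMES) -> Tuple[str, float]:
    """Return the health state with the largest predicted probability."""
    if len(probabilities) != len(class_names):
        raise ValueError("one probability is required per health state")
    best = max(range(len(probabilities)), key=lambda v: probabilities[v])
    return class_names[best], probabilities[best]


# Made-up Softmax output for one target data sample.
state, confidence = predict_health_state([0.1, 0.7, 0.2])
```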

Embodiment: the feasibility of the method is verified by taking rolling bearing migration fault diagnosis between two different test rigs under different working conditions as an example.

Data set P is from the Bearing Data Center of Case Western Reserve University (CWRU), USA; the drive-end faulty bearing data are used. The bearing is a 6205-2RS deep groove ball bearing, and the bearing faults are single-point faults introduced artificially by electro-discharge machining. The fault diameter of the faulty bearings used is 0.178 mm, and the bearing health states comprise normal, outer ring fault and inner ring fault. Data for 3 working conditions are used, with motor speeds of 1797, 1772 and 1750 rpm respectively. The sampling frequency is 48 kHz. Samples are drawn with a sliding window; 2000 samples are obtained for each health state, each sample contains 2048 sampling points, and the ratio of training data to test data is 7:3.

Data set Q is from the bearing data set of Paderborn University, Germany; the test bearing of the test rig is a 6203 deep groove ball bearing. Real damage data obtained from accelerated life tests are used, and the bearing health states comprise normal, outer ring fault and inner ring fault. Different working conditions are realized by changing the rotating speed N, the load torque M and the radial force F, and this embodiment uses the bearing data acquired under 3 working conditions. The sampling frequency of the acquired vibration data is 64 kHz. Samples are drawn with a sliding window; 2000 samples are obtained for each health state, each sample contains 2048 sampling points, and the ratio of training data to test data is 7:3.
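
The sampling scheme described for data sets P and Q can be sketched as follows. The window length (2048), sample count (2000) and 7:3 split come from the text; the stride of the sliding window and the random shuffling before the split are not stated and are assumptions here, as are the function names.

```python
import numpy as np

def sliding_window_samples(signal, window=2048, n_samples=2000, step=None):
    """Cut a 1-D vibration signal into fixed-length overlapping samples."""
    signal = np.asarray(signal)
    if step is None:
        # Assumption: largest stride that still yields n_samples windows.
        step = (len(signal) - window) // max(n_samples - 1, 1)
    if step < 1 or (n_samples - 1) * step + window > len(signal):
        raise ValueError("signal too short for the requested sampling")
    starts = np.arange(n_samples) * step
    return np.stack([signal[s:s + window] for s in starts])

def split_train_test(samples, train_ratio=0.7, seed=0):
    """Shuffle and split samples 7:3 (the shuffling is an assumption)."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(samples))
    n_train = int(round(train_ratio * len(samples)))
    return samples[order[:n_train]], samples[order[n_train:]]

# Synthetic stand-in for 10 s of vibration data sampled at 48 kHz.
signal = np.sin(np.linspace(0.0, 1000.0, 480_000))
samples = sliding_window_samples(signal)
train, test = split_train_test(samples)
```

Because neighbouring windows overlap, a random split places near-duplicate segments in both sets; a split by time position would avoid this if it matters for the evaluation.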

Details of the data sets used are shown in fig. 5. Based on fig. 5, the distribution similarity between the working condition data sets of the target rolling bearing and the source rolling bearing is analyzed, and the result is shown in fig. 6.

According to fig. 6, 3 migration diagnosis tasks with low data distribution similarity are established to verify the feasibility of the invention: T₁: A₁→C, T₂: A₂→C and T₃: A₃→C. The aim is to use diagnostic knowledge from artificially damaged bearings to assist the diagnosis of bearing data with real damage. As can be seen from fig. 6 and intermediate data selection conditions 1 and 2, the candidate intermediate data in migration tasks T₁, T₂ and T₃ are B₁ and B₂, so the single migration diagnosis process is converted into a two-stage migration diagnosis process. Taking migration task T₁: A₁→C as an example, if B₁ is selected as the intermediate data, then S(A₁,B₁) = 0.5467 and S(B₁,C) = 0.6029; if B₂ is selected as the intermediate data, then S(A₁,B₂) = 0.4267 and S(B₂,C) = 0.701. The absolute difference between the two similarities is 0.0562 for B₁ and 0.2743 for B₂, so B₁ better satisfies intermediate data selection condition 3 and is selected as the intermediate data for migration task T₁. Similarly, B₁ is selected as the intermediate data in migration tasks T₂ and T₃ to establish the pseudo label transfer type two-stage domain adaptation network.

To verify the feasibility of the invention, two comparison methods are employed: (1) the WDCNN method, without domain adaptation; (2) the CNN-MMD method, with domain adaptation. The base network of both comparison methods has the same structure as the stage I (or stage II) network of the invention, with the same parameter settings. The difference is that the WDCNN method is an ordinary intelligent fault diagnosis method that applies no domain adaptation, while the CNN-MMD method does not use transfer type two-stage domain adaptation and is an ordinary domain adaptation diagnosis method with direct migration.
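
The choice of B₁ for task T₁ described above can be checked with a few lines of code. This sketch covers only selection condition 3 (smallest absolute difference between the two similarities); conditions 1 and 2 are taken as already applied, and the function name `select_intermediate` is hypothetical.

```python
def select_intermediate(candidates):
    """Pick the candidate whose two similarities are closest to each other.

    candidates maps a name to (S(source, candidate), S(candidate, target)).
    """
    gaps = {name: abs(s_src - s_tgt)
            for name, (s_src, s_tgt) in candidates.items()}
    best = min(gaps, key=gaps.get)
    return best, gaps

# Similarity values reported for migration task T1: A1 -> C.
best, gaps = select_intermediate({
    "B1": (0.5467, 0.6029),
    "B2": (0.4267, 0.701),
})
```
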
The following hyper-parameters are used: ε = 10⁻³, τ = 0.9, ρ = 0.9, α = 0.05, β = 10⁻⁶, K = 15000, γ² = {10⁻⁶, 10⁻⁵, 10⁻⁴, 10⁻³, 0.01, 0.1, 1, 5, 10, 15, 20, 25, 30, 35, 100, 10³, 10⁴, 10⁵, 10⁶}. Each experiment is repeated 10 times and the average is taken as the final test accuracy of the invention and of the comparison methods to ensure the reliability of the results; the experimental results are shown in fig. 7.
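
The set of γ² values suggests a bank of kernels for the multi-kernel maximum mean difference used by the domain adaptation module. The sketch below is one possible reading and not the patented formula: it assumes Gaussian kernels of the form exp(-||x - y||² / (2γ²)), equal kernel weights, and the biased empirical estimate; the names `multi_kernel` and `mk_mmd_sq` are hypothetical.

```python
import numpy as np

GAMMA_SQ = [1e-6, 1e-5, 1e-4, 1e-3, 0.01, 0.1, 1, 5, 10, 15,
            20, 25, 30, 35, 100, 1e3, 1e4, 1e5, 1e6]

def pairwise_sq_dists(a, b):
    """Squared Euclidean distances between the rows of a and the rows of b."""
    a2 = (a * a).sum(axis=1)[:, None]
    b2 = (b * b).sum(axis=1)[None, :]
    return np.maximum(a2 + b2 - 2.0 * a @ b.T, 0.0)

def multi_kernel(a, b, gamma_sq=GAMMA_SQ, weights=None):
    """Convex combination of Gaussian kernels (assumed kernel family)."""
    if weights is None:
        # Assumption: equal weights that sum to 1.
        weights = np.full(len(gamma_sq), 1.0 / len(gamma_sq))
    d2 = pairwise_sq_dists(a, b)
    return sum(w * np.exp(-d2 / (2.0 * g)) for w, g in zip(weights, gamma_sq))

def mk_mmd_sq(xs, xt):
    """Biased empirical estimate of the squared multi-kernel MMD."""
    return (multi_kernel(xs, xs).mean()
            + multi_kernel(xt, xt).mean()
            - 2.0 * multi_kernel(xs, xt).mean())

rng = np.random.default_rng(0)
feats_a = rng.normal(size=(50, 8))   # toy feature batch from one data set
same = mk_mmd_sq(feats_a, feats_a)
shifted = mk_mmd_sq(feats_a, feats_a + 3.0)
```

Identical feature batches give a value of essentially zero, while a shifted batch gives a clearly larger value, which is the quantity the domain adaptation loss would push down during training.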

The comparison methods perform direct migration diagnosis, while the method of the invention uses intermediate data B₁ to perform transfer type two-stage domain adaptation diagnosis. As can be seen from fig. 7, the average diagnosis accuracy of CNN-MMD on the 3 migration diagnosis tasks is 64.61%, which is 26.54 percentage points higher than that of the WDCNN method. This indicates that domain adaptation can effectively improve the migration diagnosis performance of an intelligent diagnosis model across different working conditions and different rolling bearings, but the misdiagnosis rate remains relatively high. The average standard deviation of the accuracy of the CNN-MMD method reaches 5.82%, which shows that the ordinary migration diagnosis framework is unstable when there is a large distribution difference between the source data and the target data and cannot solve such problems well. The average diagnosis accuracy of the technical scheme of the invention on the 3 migration diagnosis tasks is 96.30%, which is 31.69 percentage points higher than that of the direct-migration CNN-MMD method, and the test performance of the model is more stable. The reason is that the direct-migration domain adaptation method ignores large distribution differences between the source data and the target data; when such differences exist, forcing a direct migration may yield an unsatisfactory diagnosis result. The invention converts a single diagnosis process with high migration difficulty into two diagnosis processes with lower migration difficulty and, by splitting one step into two and migrating gradually, obtains a higher diagnosis accuracy for the target rolling bearing, which proves that the invention is feasible.

Comparison with the traditional intelligent fault diagnosis method (WDCNN) and the direct-migration domain adaptation diagnosis method (CNN-MMD) shows that the method of the invention can effectively solve the problem of migration fault diagnosis between different rolling bearings under different working conditions, and in particular the problem of unsatisfactory direct migration accuracy on the target data caused by an obvious distribution difference between the source data and the target data.

The above embodiments are only used to illustrate the invention and not to limit the technical solutions described in the invention. Although the invention has been described in detail in this specification with reference to the above embodiments, the invention is not limited to them, and modifications or equivalent replacements may be made to it; all technical solutions and improvements that do not depart from the spirit and scope of the invention are covered by the claims of the invention.

Claims

1. A rolling bearing fault diagnosis method based on pseudo label transfer type two-stage field self-adaptation, characterized by comprising the following steps:

Step 2, calculating the data distribution similarity between source data D^S and target data D^T by using a data similarity measurement method, and selecting the intermediate data D^I of the target data D^T according to the calculation result and the selection conditions;

Step 4, training the rolling bearing fault diagnosis model based on the pseudo label transfer type two-stage field self-adaptive network by using the source data D^S, the target data D^T and the intermediate data D^I, and performing fault diagnosis on the rolling bearing to be diagnosed after training is finished.

2. The rolling bearing fault diagnosis method based on the pseudo label transfer type two-stage field self-adaptation as claimed in claim 1, wherein the step 1 specifically comprises:

the source data D^S are available rolling bearing data with known health state, comprising fault data of the rolling bearing under N operating conditions, obtained as

wherein

n_S is the number of source data samples,

is the i-th sample,

is

and C represents the number of health state categories;

the target data D^T are rolling bearing signal data to be diagnosed with unknown health state, comprising fault data of the rolling bearing under M operating conditions,

wherein

denotes the j-th sample, and n_T represents the number of target data samples.

3. The rolling bearing fault diagnosis method based on the pseudo label transfer type two-stage field self-adaptation as claimed in claim 2, wherein the data similarity measurement method in the step 2 is as follows:

S(D_i, D_j) ∈ [0,1) is the similarity between data D_i and data D_j; the closer S(D_i, D_j) is to 1, the more similar data D_i and data D_j are, and the closer S(D_i, D_j) is to 0, the more significant the difference in data distribution between data D_i and data D_j; D_i is labeled 0 and D_j is labeled 1, and a linear support vector machine classifier h is trained to distinguish D_i from D_j, the loss err(h) of h being calculated as:

wherein I[a] is an indicator function that is 1 if a is true and 0 otherwise; x_i represents the m samples obtained by sampling from D_i, and x_j represents the m' samples obtained by sampling from D_j; S(D_i, D_j) is then calculated as:

the distribution similarity between the source data containing N working condition data and the target data containing M working condition data is calculated by using the data similarity measurement method, obtaining

wherein n ∈ {1, ..., N} and m ∈ {1, ..., N};

according to the above

the data satisfying conditions 1-3 are the intermediate data

Condition 1:

Condition 2:

and

Condition 3:

such that

that is, when the intermediate data satisfying condition 2

are more than one, the selection is made such that, between

and

the target data having the smallest absolute value of the difference are used as the final intermediate data.

4. The rolling bearing fault diagnosis method based on pseudo label transfer type two-stage field self-adaptation as claimed in claim 2, wherein the rolling bearing fault diagnosis model based on the pseudo label transfer type two-stage field self-adaptive network in step 3 comprises a two-stage network model consisting of a stage I network model and a stage II network model; the stage I network model performs the migration from the source data D^S to the intermediate data D^I, the stage II network model performs the migration from the intermediate data D^I to the target data D^T, and the stage I network model and the stage II network model are connected by a pseudo label constraint method based on threshold decision;

the stage I network model comprises a feature extraction module, a classification module and a domain adaptation module; the feature extraction module consists of convolution layers, pooling layers and a global average pooling layer, its mapping function is f_g1(x, θ_g1), and its parameter to be trained is θ_g1; the classification module consists of a fully connected layer, a Dropout layer and a Softmax layer, its mapping function is f_c1(x, θ_c1), and its parameter to be trained is θ_c1; the domain adaptation module adopts a domain adaptation method based on maximum mean difference to reduce the feature distribution difference between different data, and its mapping function is d²;

the training process of stage I is as follows: the source data sample set with known health state

and the intermediate data sample set with unknown health state

are input into the stage I network model to optimize the model parameters; through the feature extraction module, the source data samples and the intermediate data samples respectively generate the corresponding source data features

and intermediate data features

the source data features

pass through the classification module in the stage I network model to obtain the predicted labels of the source data samples, and the error between the predicted labels and the true labels of the source data samples is measured by a cross-entropy cost function as follows:

wherein v is the v-th dimension of the classification probability distribution finally output by the classification module, and I is an indicator function which, when the condition

is met, is 1 and otherwise is 0;

the distribution difference between the source data features

and the intermediate data features

is reduced by the domain adaptation module in the stage I network model, the loss function being:

after the domain adaptation module, the intermediate sample features

pass through the classification module to obtain the predicted rolling bearing health state class probability of the sample

the output

being:

wherein

is the intermediate sample feature, θ_c1 is the training parameter of the classification module,

is a one-dimensional probability distribution vector; in

the label corresponding to the maximum value is that of the sample

namely its predicted health state category;

to avoid model overfitting, an L2 regularization loss

is employed to penalize the model training parameters, so the objective function of the stage I network model is expressed as:

wherein α and β are penalty coefficients;

the pseudo label constraint method based on threshold decision is constructed as follows: for an intermediate data sample

a threshold is set on its prediction probability in the network model

to screen out the high-confidence intermediate data samples

the pseudo label generated by a screened intermediate data sample

and used for training the stage II network model

is:

wherein τ ∈ (0,1) is the threshold used to generate a pseudo label;

I is an indicator function; when

is satisfied,

holds and the sample

generates a corresponding pseudo label

otherwise the sample does not generate a pseudo label and is not used in the stage II network model; when the proportion of pseudo-labeled intermediate data samples among all intermediate data samples participating in training reaches ρ, the pseudo-labeled intermediate data samples are used for training the stage II network model; the high-confidence intermediate data sample set is expressed as follows:

wherein

ρ ∈ (0,1) is a proportion parameter used to decide whether to end the stage I network model training, and n_I is the total number of intermediate data samples participating in training;

the network structure of the stage II network model likewise comprises a feature extraction module, a classification module and a domain adaptation module; the corresponding mapping functions are f_g2(x, θ_g2) and f_c2(x, θ_c2), and the parameters to be trained are θ_g2 and θ_c2;

the training process of stage II is as follows: the intermediate data sample set carrying pseudo label information

and the target data sample set with unknown health state

are input into the stage II network model to optimize the model parameters; the training process is the same as that of the stage I network model, and the objective function of the stage II network model is expressed as:

wherein α and β are penalty coefficients.

5. The rolling bearing fault diagnosis method based on the pseudo label transfer type two-stage field self-adaptation as claimed in claim 4, wherein the step 4 specifically comprises:

the parameters of the stage I network model and the stage II network model are optimized through the Adam back propagation algorithm, the total number of training iterations of the stage I network model and the stage II network model being K; when the number of training iterations reaches K, the target data samples with unknown health state are input into the trained network model, the Softmax layer of the classification module outputs the predicted probability distribution vector of each target data sample, and the label corresponding to the maximum probability is the predicted health state category of the tested sample.

6. The rolling bearing fault diagnosis method based on the pseudo-label transfer type two-stage field adaptation as claimed in claim 1, wherein the field adaptation method based on the maximum mean difference adopted by the field adaptation module in the step 3 specifically comprises:

H_k denotes the reproducing kernel Hilbert space (RKHS) associated with the characteristic kernel k; given two distributions P and Q, there exists in the RKHS H_k a non-linear mapping function φ(·) ∈ H_k that maps the data into the RKHS H_k, and the domain adaptation module then calculates the maximum mean difference as:

wherein sup(·) is the supremum of the input set,

denotes the expectation over distribution P,

denotes the expectation over distribution Q,

denotes that φ(·) is a function in the unit ball of the RKHS; two sets of samples are drawn independently and identically distributed from distributions P and Q,

and

and the empirical estimate of the maximum mean difference is then:

wherein the kernel mapping is k(x^p, x^q) = <φ(x^p), φ(x^q)>; in the multi-kernel maximum mean difference, an optimal kernel function composed of a plurality of characteristic kernels can better approximate the distribution of the feature space, and the multi-kernel

is defined as a convex combination of m kernels:

the constraint condition

guarantees that the resulting multi-kernel k is characteristic, where u is the index of the kernel and β_u is the constraint coefficient of the kernel.