CN115935187A - Mechanical fault diagnosis method under variable working conditions based on kernel sensitivity alignment network - Google Patents


Info

Publication number: CN115935187A (application CN202211599722.4A)
Authority: CN (China)
Legal status: Granted; Active
Other languages: Chinese (zh)
Other versions: CN115935187B (granted publication)
Inventors: 彭雷, 张子蕴, 戴光明, 王茂才, 宋志明, 陈晓宇
Assignee (current and original): China University of Geosciences
Application filed by China University of Geosciences


Abstract

The invention discloses a mechanical fault diagnosis method under variable working conditions based on a kernel sensitivity alignment network, which combines global domain adaptation and subdomain adaptation to construct a subdomain-adaptive deep neural network model with kernel sensitivity alignment; in this network model, Local Maximum Mean Discrepancy (LMMD) is implemented as the subdomain adaptation to align conditional distributions. In addition, based on this model, the invention further provides an adversarial learning method with kernel sensitivity alignment (KSA) to overcome the shortcomings of LMMD. Compared with conventional adversarial domain adaptation methods, the KSA adversarial learning method is sensitive to spatial position and can significantly reduce domain bias by discriminating the relationships between sample features. The invention solves the technical problem of low accuracy of mechanical fault diagnosis under variable working conditions in the prior art.

Description

Mechanical fault diagnosis method under variable working conditions based on kernel sensitivity alignment network
Technical Field
The invention relates to the fields of intelligent mechanical fault diagnosis and artificial intelligence, and in particular to a mechanical fault diagnosis method under variable working conditions based on a kernel sensitivity alignment network.
Background
Rotary machines are important mechanical devices in modern industry and are widely used in industrial facilities and electrification systems. The rolling bearing, as a key component, operates for long periods under severe conditions such as high temperature, high speed, fatigue and widely varying loads, and may eventually fail, resulting in high maintenance costs and even serious accidents. Statistically, engine failures account for about 40% of aircraft mechanical failures, and rolling bearing failures have always made up a large proportion of them.
Intelligent fault diagnosis with deep learning can process massive monitoring data and assess the health state of a machine, thereby improving the reliability and safety of industrial production. Compared with traditional machine learning methods, deep learning models do not require expert experience: they train the parameters of the whole model adaptively, learn key features automatically and predict results. Deep learning techniques nevertheless have limitations. The success of most intelligent fault diagnosis methods relies on two conditions: first, deep learning requires a large amount of labeled fault data for model training; second, the training and test data should follow the same probability distribution. For some machines, however, it is difficult to satisfy both conditions. Given the practical application scenarios of many industrial devices, collecting sufficient labeled data, particularly labeled fault data, is time-consuming, laborious and sometimes impractical. More importantly, industrial machines often operate in harsh, diverse and complex environments, so the data distribution at test time differs from that used to pre-train the model.
For the complex and variable working conditions encountered in fault diagnosis, the cross-domain diagnosis task can be addressed with a branch of transfer learning, namely domain adaptation, which extracts features shared across domains while preserving good classification performance on the source domain data. Previous approaches to such problems usually apply global domain adaptation, but this may confuse the classification of the test data. Subdomain adaptation methods, which align the distributions of the source and target domains class by class, have therefore received increasing attention. However, existing subdomain adaptation methods can align only part of the subdomain distributions, and this limitation leads to low accuracy of mechanical fault diagnosis under variable working conditions.
Improving the accuracy of mechanical fault diagnosis under variable working conditions is therefore an urgent technical problem.
Disclosure of Invention
In order to solve the above technical problem, the invention provides a mechanical fault diagnosis method under variable working conditions based on a kernel sensitivity alignment network, which achieves accurate diagnosis of mechanical faults under variable working conditions.
The technical scheme of the invention is as follows: a mechanical fault diagnosis method under variable working conditions based on a kernel sensitivity alignment network, comprising the following steps:
S1: collecting data from mechanical equipment under different working conditions to form a source domain dataset and a target domain dataset;
S2: slicing the source domain dataset and the target domain dataset to obtain a number of source domain samples and target domain samples, and normalizing each source domain sample and each target domain sample;
S3: constructing a subdomain-adaptive deep neural network model based on kernel sensitivity alignment, comprising: a feature extractor, a label classifier, an LMMD module and a kernel sensitivity discriminator;
S4: inputting the normalized source domain samples and target domain samples into the feature extractor to obtain feature vectors of the source domain and of the target domain;
S5: inputting the feature vectors of the source domain into the label classifier and computing the source domain classification loss from the predictions and the source domain labels; inputting the feature vectors of the target domain into the label classifier to obtain pseudo-labels for the target domain;
S6: inputting the source domain feature vectors, target domain feature vectors, source domain labels and target domain pseudo-labels into the LMMD module to generate a source domain kernel matrix, a target domain kernel matrix and the LMMD loss;
S7: computing the kernel sensitivity of each source domain sample and target domain sample from the feature vectors of the two domains and the two kernel matrices, and inputting the kernel sensitivities into the kernel sensitivity discriminator to obtain the KSA loss;
S8: adding the source domain classification loss, the LMMD loss and the KSA loss to obtain the total loss, and optimizing the model by stochastic gradient descent with minimizing the total loss as the objective;
S9: judging whether the specified number of iterations has been reached; if so, finishing training and performing mechanical fault diagnosis under variable working conditions with the trained deep neural network model to obtain the fault diagnosis result; otherwise, returning to step S4.
Preferably, S1 and S2 specifically include:
collecting bearing vibration signals with known fault information as the source domain dataset $\mathcal{D}_S=\{(x_i^s,y_i^s)\}_{i=1}^{n_s}$, classifying the source domain dataset being the source task $\mathcal{T}_S=\{\mathcal{Y}_S,f_S(\cdot)\}$; collecting bearing vibration signals with unknown fault information under other working conditions as the target domain dataset $\mathcal{D}_T=\{x_j^t\}_{j=1}^{n_t}$, classifying the target domain dataset being the target task $\mathcal{T}_T=\{\mathcal{Y}_T,f_T(\cdot)\}$;
where $\mathcal{X}_S$ and $\mathcal{X}_T$ denote the feature spaces of the source and target domains respectively, $P_S(X_S)$ and $P_T(X_T)$ denote the probability distributions of the source and target domains respectively, $X_S=\{x_i^s\}_{i=1}^{n_s}$ is the dataset of $n_s$ samples in the source domain, $X_T=\{x_j^t\}_{j=1}^{n_t}$ is the dataset of $n_t$ samples in the target domain, $\mathcal{Y}_S$ and $\mathcal{Y}_T$ denote the label spaces of the source and target tasks respectively, and $f_S(\cdot)$ and $f_T(\cdot)$ are the mapping functions of the source and target domains, representing the relationship between the samples of a dataset and the prediction result;
segmenting the collected source domain dataset and target domain dataset with a sliding window to generate source domain samples and target domain samples;
normalizing each source domain sample and each target domain sample.
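The slicing and normalization of S1/S2 can be sketched as follows. The 4096-point window comes from the embodiment described later; the non-overlapping step size and the z-score normalization are assumptions, since the patent does not spell out the normalization formula:

```python
import numpy as np

def make_samples(signal, window=4096, step=4096):
    """Slice a 1-D vibration signal into fixed-length windows (sliding window)
    and z-score normalize each window independently."""
    n = (len(signal) - window) // step + 1
    samples = np.stack([signal[i * step: i * step + window] for i in range(n)])
    mean = samples.mean(axis=1, keepdims=True)
    std = samples.std(axis=1, keepdims=True) + 1e-8  # avoid division by zero
    return (samples - mean) / std
```

An overlapping step (e.g. `step=1024`) would increase the number of training samples at the cost of correlated windows.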
Preferably, in step S3, the feature extractor comprises three one-dimensional convolutional layers, a flattening layer and a fully-connected layer arranged in sequence; the first two convolutional layers use large convolution kernels and the last uses a small convolution kernel, and a max-pooling layer follows each convolutional layer; batch normalization and the Leaky ReLU activation are applied after each convolutional layer, and the ReLU activation after the fully-connected layer.
The label classifier comprises one fully-connected layer whose input dimension is the dimension of the feature vector and whose output dimension is the number of bearing fault categories.
The kernel sensitivity discriminator comprises a gradient reversal layer (GRL) and three fully-connected layers arranged in sequence, with batch normalization, the ReLU activation and dropout applied after each fully-connected layer.
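A minimal PyTorch sketch of the three trainable modules described above. The channel counts, kernel sizes, strides and hidden widths are assumptions (the patent only says the first two convolution kernels are large and the third small), and `alpha` is the usual GRL scaling factor:

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Gradient reversal layer (GRL): identity in the forward pass,
    negated (scaled) gradient in the backward pass."""
    @staticmethod
    def forward(ctx, x, alpha):
        ctx.alpha = alpha
        return x.view_as(x)
    @staticmethod
    def backward(ctx, grad):
        return -ctx.alpha * grad, None

class FeatureExtractor(nn.Module):
    """Three 1-D conv layers (two large kernels, one small), each followed by
    batch norm, Leaky ReLU and max pooling, then flatten and one FC layer."""
    def __init__(self, feat_dim=256):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(1, 16, 64, stride=8, padding=28), nn.BatchNorm1d(16),
            nn.LeakyReLU(), nn.MaxPool1d(2),
            nn.Conv1d(16, 32, 32, stride=4, padding=14), nn.BatchNorm1d(32),
            nn.LeakyReLU(), nn.MaxPool1d(2),
            nn.Conv1d(32, 64, 3, stride=1, padding=1), nn.BatchNorm1d(64),
            nn.LeakyReLU(), nn.MaxPool1d(2),
            nn.Flatten(),
        )
        self.fc = nn.Sequential(nn.LazyLinear(feat_dim), nn.ReLU())
    def forward(self, x):
        return self.fc(self.conv(x))

class LabelClassifier(nn.Module):
    """One FC layer: feature dimension in, number of fault classes out."""
    def __init__(self, feat_dim=256, n_classes=10):
        super().__init__()
        self.fc = nn.Linear(feat_dim, n_classes)
    def forward(self, f):
        return self.fc(f)

class KSDiscriminator(nn.Module):
    """GRL followed by three FC layers with batch norm, ReLU and dropout."""
    def __init__(self, in_dim=256, alpha=1.0):
        super().__init__()
        self.alpha = alpha
        self.net = nn.Sequential(
            nn.Linear(in_dim, 128), nn.BatchNorm1d(128), nn.ReLU(), nn.Dropout(0.5),
            nn.Linear(128, 64), nn.BatchNorm1d(64), nn.ReLU(), nn.Dropout(0.5),
            nn.Linear(64, 1),
        )
    def forward(self, s):
        return self.net(GradReverse.apply(s, self.alpha))
```

With a 4096-point input, the convolution stack above produces a 1024-dimensional flattened vector, which `LazyLinear` projects to the 256-dimensional feature space.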
Preferably, step S4 specifically includes:
for source domain samples $x_i^s \in X_S$ and target domain samples $x_j^t \in X_T$, using the feature extractor $G(\cdot)$ to map $x^s$ and $x^t$ into a common feature space via $f^s = G(x^s)$ and $f^t = G(x^t)$, where $f^s, f^t \in \mathbb{R}^d$ are the $d$-dimensional feature vectors of the source domain and the target domain.
Preferably, step S5 specifically includes:
feeding the source domain feature vectors $f^s$ and the target domain feature vectors $f^t$ into the label classifier $C(\cdot)$ for prediction, obtaining the prediction results $z^s = C(f^s)$ and $z^t = C(f^t)$, where $z^s, z^t \in \mathbb{R}^K$ are the score vectors of the source domain and the target domain and $K$ is the number of sample classes;
according to $z^s$ and the true source domain labels $y^s$, calculating the classification loss of the source domain with the standard cross entropy, and training the classification model composed of the feature extractor $G(\cdot)$ and the label classifier $C(\cdot)$ by back-propagation to minimize this loss; the classification loss of the model on the source domain, $\mathcal{L}_{cls}$, is expressed as follows:

$$\mathcal{L}_{cls} = \frac{1}{n_s} \sum_{i=1}^{n_s} L_c\big(C(G(x_i^s)),\, y_i^s\big)$$

where $L_c(\cdot,\cdot)$ is the cross-entropy loss function;
processing the score vector of the target domain $z^t$ with the softmax function to obtain the vector $\hat{y}^t$, each element $\hat{y}_{j,k}^t$ of which represents the probability that $x_j^t$ belongs to the corresponding class $k$, calculated as follows:

$$\hat{y}_{j,k}^t = \frac{\exp(z_{j,k}^t)}{\sum_{k'=1}^{K} \exp(z_{j,k'}^t)}$$

using $\hat{y}_j^t$ as the pseudo-label of $x_j^t$.
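The softmax/pseudo-label computation of S5 can be sketched as follows (a hypothetical `pseudo_labels` helper, returning both the soft probabilities, which are used later as LMMD weights, and the hard argmax label):

```python
import numpy as np

def pseudo_labels(z_t):
    """Softmax over target-domain score vectors z_t (shape n_t x K);
    returns the soft probabilities and the hard pseudo-labels."""
    z = z_t - z_t.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    p = np.exp(z)
    p /= p.sum(axis=1, keepdims=True)
    return p, p.argmax(axis=1)
```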
Preferably, in step S6: the LMMD module is configured to align the distributions of same-class data in the source domain and the target domain, so that the conditional distributions of the two domains coincide. LMMD is defined as follows:

$$d_{\mathcal{H}}(p, q) = \mathbb{E}_c \left\| \mathbb{E}_{p^{(c)}}\big[\phi(x^s)\big] - \mathbb{E}_{q^{(c)}}\big[\phi(x^t)\big] \right\|_{\mathcal{H}}^2$$

where $x^s$ and $x^t$ are samples of the source domain and the target domain, $\mathbb{E}$ denotes the mathematical expectation, $p^{(c)}$ and $q^{(c)}$ are the distributions of class $c$ in the source domain and the target domain respectively, $\mathcal{H}$ is the reproducing kernel Hilbert space (RKHS) induced by a chosen kernel function $k(\cdot,\cdot)$, and $\phi$ denotes the feature map that maps the raw data into the RKHS;
defining the parameter $w^c$ as the weight of each sample belonging to each class, the unbiased estimate of LMMD is defined as follows:

$$\hat{d}_{\mathcal{H}}(p,q) = \frac{1}{C} \sum_{c=1}^{C} \left\| \sum_{i=1}^{n_s} w_i^{sc}\,\phi(x_i^s) - \sum_{j=1}^{n_t} w_j^{tc}\,\phi(x_j^t) \right\|_{\mathcal{H}}^2$$

where $w_i^{sc}$ and $w_j^{tc}$ denote the weights of the $i$-th source sample $x_i^s$ and the $j$-th target sample $x_j^t$ belonging to class $c$, with $\sum_{i=1}^{n_s} w_i^{sc} = 1$ and $\sum_{j=1}^{n_t} w_j^{tc} = 1$, so that $\sum_i w_i^c \phi(x_i)$ is a weighted sum over the class-$c$ samples; $w_i^c$ is calculated as follows:

$$w_i^c = \frac{y_{ic}}{\sum_{(x_j, y_j) \in \mathcal{D}} y_{jc}}$$

where $y_{ic}$ is the $c$-th entry of the vector $y_i$; for source domain samples $x_i^s$, $w_i^{sc}$ is calculated with the one-hot encoding of the true source domain label $y_i^s$; since each target domain sample $x_j^t$ is unlabeled in unsupervised domain adaptation, $\hat{y}_j^t$ (the softmax output) is taken as a kind of pseudo-label to calculate $w_j^{tc}$ of the target samples;
calculating the LMMD distance of the feature vectors of the source domain $f^s = G(x^s)$ and of the target domain $f^t = G(x^t)$ as follows:

$$\hat{d}_{\mathcal{H}}(p,q) = \frac{1}{C}\sum_{c=1}^{C}\Bigg[\sum_{i=1}^{n_s}\sum_{j=1}^{n_s} w_i^{sc} w_j^{sc}\, k(f_i^s, f_j^s) + \sum_{i=1}^{n_t}\sum_{j=1}^{n_t} w_i^{tc} w_j^{tc}\, k(f_i^t, f_j^t) - 2\sum_{i=1}^{n_s}\sum_{j=1}^{n_t} w_i^{sc} w_j^{tc}\, k(f_i^s, f_j^t)\Bigg]$$

where $k(\cdot,\cdot)$ denotes the kernel function;
calculating the kernel matrix $K$, composed of the inner-product matrices $K^{s,s}$, $K^{t,t}$, $K^{s,t}$, $K^{t,s}$ defined on the source domain, the target domain and across the two domains:

$$K = \begin{bmatrix} K^{s,s} & K^{s,t} \\ K^{t,s} & K^{t,t} \end{bmatrix}$$

expressing the LMMD distance with the kernel matrix, where each element $w_{ij}$ of the weight matrix $W$ is defined as follows:

$$w_{ij} = \begin{cases} \dfrac{1}{C}\displaystyle\sum_{c=1}^{C} w_i^{sc} w_j^{sc}, & x_i, x_j \in \mathcal{D}_S \\[2mm] \dfrac{1}{C}\displaystyle\sum_{c=1}^{C} w_i^{tc} w_j^{tc}, & x_i, x_j \in \mathcal{D}_T \\[2mm] -\dfrac{1}{C}\displaystyle\sum_{c=1}^{C} w_i^{sc} w_j^{tc}, & \text{otherwise} \end{cases}$$

based on the kernel matrix $K$ and the weight matrix $W$, the LMMD loss is expressed as follows:

$$\mathcal{L}_{lmmd} = \sum_{i,j} w_{ij} K_{ij}$$
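A NumPy sketch of the LMMD estimate, written in its per-class form rather than via the stacked matrices K and W (the two arrangements are equivalent); the Gaussian bandwidth `sigma=1.0` is an assumed default:

```python
import numpy as np

def gaussian_kernel(X, Y, sigma=1.0):
    """Gaussian kernel matrix k(x_i, y_j) = exp(-||x_i - y_j||^2 / (2 sigma^2))."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def lmmd_loss(fs, ft, ys_onehot, yt_prob, sigma=1.0):
    """LMMD: class-weighted squared MMD, averaged over classes.
    ys_onehot: one-hot source labels; yt_prob: target softmax pseudo-labels."""
    ws = ys_onehot / (ys_onehot.sum(0, keepdims=True) + 1e-8)  # w_i^{sc}
    wt = yt_prob / (yt_prob.sum(0, keepdims=True) + 1e-8)      # w_j^{tc}
    Kss, Ktt, Kst = (gaussian_kernel(a, b, sigma)
                     for a, b in ((fs, fs), (ft, ft), (fs, ft)))
    loss, C = 0.0, ys_onehot.shape[1]
    for c in range(C):
        loss += (ws[:, c] @ Kss @ ws[:, c] + wt[:, c] @ Ktt @ wt[:, c]
                 - 2 * ws[:, c] @ Kst @ wt[:, c])
    return loss / C
```

Each per-class term is a squared RKHS norm, so the loss is non-negative and vanishes when the class-conditional feature distributions coincide.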
Preferably, step S7 specifically includes:
$\sum_j K^{s,s}_{ij}$ and $\sum_j K^{t,t}_{ij}$ are the sums of inner products between samples of the source domain and of the target domain in the RKHS, and the corresponding source domain kernel matrix $K^{s,s}$ and target domain kernel matrix $K^{t,t}$ are expressed as follows:

$$K^{s,s}_{ij} = k\big(G(x_i^s),\, G(x_j^s)\big), \qquad K^{t,t}_{ij} = k\big(G(x_i^t),\, G(x_j^t)\big)$$

obtaining the kernel sensitivity $s_i$ of each sample by taking the partial derivative of the row sum of the source domain kernel matrix and the target domain kernel matrix with respect to the sample's feature vector, calculated as follows:

$$s_i^s = \frac{\partial}{\partial G(x_i^s)} \sum_{j=1}^{n_s} K^{s,s}_{ij}, \qquad s_j^t = \frac{\partial}{\partial G(x_j^t)} \sum_{i=1}^{n_t} K^{t,t}_{ji}$$

where $G(\cdot)_d$ denotes the $d$-th element of the feature vector, so that each kernel sensitivity is a $d$-dimensional vector, and $x_i^s$ and $x_j^t$ are source domain samples and target domain samples respectively.
The kernel sensitivities are input into the kernel sensitivity discriminator $D_m(\cdot)$; using the binary classification results of the discriminator and the domain labels, the KSA loss is calculated with the binary cross entropy as follows:

$$\mathcal{L}_{ksa} = \frac{1}{n_s}\sum_{i=1}^{n_s} L_b\big(D_m(s_i^s),\, d_i\big) + \frac{1}{n_t}\sum_{j=1}^{n_t} L_b\big(D_m(s_j^t),\, d_j\big)$$

where $L_b(\cdot,\cdot)$ is the binary cross-entropy loss function, $d_i = 0$ is the source domain label, and $d_j = 1$ is the target domain label.
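For the Gaussian kernel the partial derivative in S7 has a closed form, $\partial k(f_i, f_j)/\partial f_i = k(f_i, f_j)\,(f_j - f_i)/\sigma^2$, so the kernel sensitivity of each sample can be sketched as follows (bandwidth `sigma=1.0` assumed):

```python
import numpy as np

def gaussian_kernel(X, Y, sigma=1.0):
    """Gaussian kernel matrix k(x_i, y_j) = exp(-||x_i - y_j||^2 / (2 sigma^2))."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def kernel_sensitivity(F, sigma=1.0):
    """s_i = d/d f_i sum_j k(f_i, f_j): gradient of the i-th kernel-matrix
    row sum with respect to the i-th feature vector (one d-dim vector per sample)."""
    K = gaussian_kernel(F, F, sigma)          # n x n kernel matrix
    diff = F[None, :, :] - F[:, None, :]      # diff[i, j] = f_j - f_i
    return (K[:, :, None] * diff).sum(axis=1) / sigma ** 2
```

In a deep learning framework the same quantity could be obtained with automatic differentiation instead of the closed form.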
Preferably, in step S8, the expression of the total loss is as follows:
Figure BDA0003994774010000068
wherein ,λ1 ,λ 2 Two balance parameters;
Figure BDA0003994774010000069
for loss of the source field>
Figure BDA00039947740100000610
For LMMD loss, is>
Figure BDA00039947740100000611
For KSA loss, back propagation is used to minimize total loss->
Figure BDA00039947740100000612
Training parameters of a deep neural network model for the target;
parameter θ of feature extractor f Parameter θ of label classifier c And the parameter theta of the nuclear sensitivity discriminator m The update by back propagation is as follows:
Figure BDA0003994774010000071
Figure BDA0003994774010000072
Figure BDA0003994774010000073
where η represents the learning rate.
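One optimization step of S8 can be sketched with stand-in linear modules and surrogate loss terms; only the plumbing is illustrated here — joint SGD over $\theta_f$, $\theta_c$, $\theta_m$ on the summed loss — while the real model would use the feature extractor, the LMMD loss and the KSA loss defined above, with the GRL inside the discriminator flipping the KSA gradient that reaches the feature extractor. The balance parameters are assumed values:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
G = nn.Linear(8, 4)   # stand-in feature extractor
C = nn.Linear(4, 3)   # stand-in label classifier
D = nn.Linear(4, 1)   # stand-in kernel-sensitivity discriminator
opt = torch.optim.SGD([*G.parameters(), *C.parameters(), *D.parameters()], lr=0.1)

xs, ys = torch.randn(16, 8), torch.randint(0, 3, (16,))
xt = torch.randn(16, 8)
lam1, lam2 = 0.5, 0.5                                       # balance parameters (assumed)

fs, ft = G(xs), G(xt)
cls_loss = F.cross_entropy(C(fs), ys)                       # source classification loss
lmmd_loss = (fs.mean(0) - ft.mean(0)).pow(2).sum()          # surrogate for the LMMD loss
dom_logits = torch.cat([D(fs), D(ft)]).squeeze(1)
dom_labels = torch.cat([torch.zeros(16), torch.ones(16)])   # d_i = 0 source, d_j = 1 target
ksa_loss = F.binary_cross_entropy_with_logits(dom_logits, dom_labels)

total = cls_loss + lam1 * lmmd_loss + lam2 * ksa_loss
opt.zero_grad(); total.backward(); opt.step()               # theta <- theta - eta * dL/dtheta
```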
The technical scheme provided by the invention has the following beneficial effects:
The invention discloses a mechanical fault diagnosis method under variable working conditions based on a kernel sensitivity alignment network, which combines global domain adaptation and subdomain adaptation to construct a subdomain-adaptive deep neural network model with kernel sensitivity alignment; in this network model, Local Maximum Mean Discrepancy (LMMD) is implemented as the subdomain adaptation to align conditional distributions. In addition, based on this model, the invention further provides an adversarial learning method with kernel sensitivity alignment to overcome the shortcomings of LMMD. Compared with conventional adversarial domain adaptation methods, the kernel sensitivity alignment adversarial learning method is sensitive to spatial position and can significantly reduce domain bias by discriminating the relationships between sample features. The invention solves the technical problem of low accuracy of mechanical fault diagnosis under variable working conditions in the prior art.
Drawings
The invention will be further described with reference to the accompanying drawings and examples, in which:
FIG. 1 is a general flowchart of the method for diagnosing mechanical faults under variable working conditions based on a kernel sensitivity alignment network according to an embodiment of the present invention;
FIG. 2 is a diagram of the subdomain-adaptive deep neural network model framework based on kernel sensitivity alignment according to an embodiment of the present invention;
FIG. 3 shows the experimental platform of the Paderborn dataset according to an embodiment of the present invention;
FIG. 4 shows the prediction accuracy of each method on the target domain for the Paderborn dataset C → A task in an embodiment of the present invention;
FIG. 5 shows confusion matrices for the Paderborn dataset B → C task in an embodiment of the present invention; FIG. 5(a) corresponds to the DSAN method and FIG. 5(b) to the method of the present invention;
FIG. 6 shows t-SNE results for the Paderborn dataset B → C task in an embodiment of the present invention; FIG. 6(a) corresponds to the DSAN method and FIG. 6(b) to the method of the present invention.
Detailed Description
For a clearer understanding of the technical features, objects and effects of the present invention, embodiments of the present invention will now be described in detail with reference to the accompanying drawings.
Referring to fig. 1, the present invention provides a method for diagnosing mechanical faults under variable working conditions based on a kernel sensitivity alignment network, which specifically includes the following steps:
S1: collecting data from mechanical equipment under different working conditions to form a source domain dataset and a target domain dataset;
S2: slicing the source domain dataset and the target domain dataset to obtain a number of source domain samples and target domain samples, and normalizing each source domain sample and each target domain sample;
S3: constructing a subdomain-adaptive deep neural network model based on kernel sensitivity alignment, comprising: a feature extractor, a label classifier, an LMMD module and a kernel sensitivity discriminator;
S4: inputting the normalized source domain samples and target domain samples into the feature extractor to obtain feature vectors of the source domain and of the target domain;
S5: inputting the feature vectors of the source domain into the label classifier and computing the source domain classification loss from the predictions and the source domain labels; inputting the feature vectors of the target domain into the label classifier to obtain pseudo-labels for the target domain;
S6: inputting the source domain feature vectors, target domain feature vectors, source domain labels and target domain pseudo-labels into the LMMD module to generate a source domain kernel matrix, a target domain kernel matrix and the LMMD loss;
S7: computing the kernel sensitivity of each source domain sample and target domain sample from the feature vectors of the two domains and the two kernel matrices, and inputting the kernel sensitivities into the kernel sensitivity discriminator to obtain the KSA loss;
S8: adding the source domain classification loss, the LMMD loss and the KSA loss to obtain the total loss, and optimizing the model by stochastic gradient descent with minimizing the total loss as the objective;
S9: judging whether the specified number of iterations has been reached; if so, finishing training and performing mechanical fault diagnosis under variable working conditions with the trained deep neural network model to obtain the fault diagnosis result; otherwise, returning to step S4.
As a preferred embodiment, the invention takes a bearing fault as an example of a mechanical fault and describes the mechanical fault diagnosis method under variable working conditions based on a kernel sensitivity alignment network in detail:
As shown in fig. 1, the method can be summarized in 3 major parts: collecting bearing fault data under variable working conditions, constructing the subdomain-adaptive deep neural network model based on kernel sensitivity alignment, and training the model.
(1) Collecting bearing fault data under variable working conditions:
In an actual machine fault diagnosis scenario, the main factor causing the distribution difference between training data and test data is the change of the machine operating state caused by frequent variations in speed, load or operating mode. Therefore, in the problem definition, changes of these properties are treated as different working conditions of the bearing, called different domains.
A domain $\mathcal{D}$ can be defined by two parts: the feature space $\mathcal{X}$ and a probability distribution $P(X)$, where $X = \{x_1, \dots, x_n\} \subset \mathcal{X}$ is a dataset of samples in the domain $\mathcal{D}$. A task $\mathcal{T}$ is defined by a label space $\mathcal{Y}$ and a mapping function $f(\cdot)$, where $Y = \{y_1, \dots, y_n\}$ is the label set of the corresponding samples in the domain $\mathcal{D}$ ($y_i$ is the label of $x_i$). The mapping function $f(\cdot)$, also denoted $f(x) = P(y \mid x)$, represents the relationship between the samples of the dataset $X$ and the prediction result.
The invention uses an accelerometer to collect bearing vibration signals with known fault information under one working condition as the source domain data $\mathcal{D}_S=\{(x_i^s,y_i^s)\}_{i=1}^{n_s}$, classifying the source domain data being the source task $\mathcal{T}_S=\{\mathcal{Y}_S,f_S(\cdot)\}$; bearing vibration signals with unknown fault information under other working conditions are collected as the target domain data $\mathcal{D}_T=\{x_j^t\}_{j=1}^{n_t}$, with the corresponding target task $\mathcal{T}_T=\{\mathcal{Y}_T,f_T(\cdot)\}$;
where $\mathcal{X}_S$ and $\mathcal{X}_T$ denote the feature spaces of the source and target domains respectively, $P_S(X_S)$ and $P_T(X_T)$ denote the probability distributions of the source and target domains respectively, $X_S=\{x_i^s\}_{i=1}^{n_s}$ is the dataset of $n_s$ samples in the source domain, $X_T=\{x_j^t\}_{j=1}^{n_t}$ is the dataset of $n_t$ samples in the target domain, $\mathcal{Y}_S$ and $\mathcal{Y}_T$ denote the label spaces of the source and target tasks respectively, and $f_S(\cdot)$ and $f_T(\cdot)$ are the mapping functions of the source and target domains, representing the relationship between the samples of a dataset and the prediction result.
For the collected vibration data, samples are generated by slicing with a sliding window. The size of the sliding window is typically chosen as a power of 2 covering about two rotation periods of the bearing; in this embodiment 4096 points are chosen as the time window. Each collected sample is normalized. There are multiple fault types under each working condition, including the normal state, outer-race fault, inner-race fault and rolling-element fault, where the fault types may have different damage sizes and different compound situations. This embodiment assumes that the source domain and the target domain share the same feature space and fault types, i.e. $\mathcal{X}_S = \mathcal{X}_T$ and $\mathcal{Y}_S = \mathcal{Y}_T$, but that the distributions of the two domains differ, i.e. $P_S(X_S) \neq P_T(X_T)$. The object of the invention is to find a suitable mapping $f_{S \to T}(\cdot): X_S \to Y_T$ that reduces the difference in distribution between the source domain and the target domain. Based on the labeled source domain data $X_S$ and $Y_S$ and the unlabeled target domain data $X_T$, the neural network is trained so that the distributions of the source domain and the target domain after the same mapping $f_{S \to T}(\cdot)$ are as consistent as possible.
(2) Constructing the subdomain-adaptive deep neural network model based on kernel sensitivity alignment:
As shown in fig. 2, the model includes 4 modules: a feature extractor, a label classifier, a local maximum mean discrepancy module (LMMD module) and a kernel sensitivity discriminator (KSA module).
(2.1) Constructing the feature extractor:
The feature extractor acts as a mapping that reduces the dimensionality of the raw data and extracts valid features for subsequent use. Because the collected vibration signals are one-dimensional, the invention processes the raw one-dimensional vibration data directly with a one-dimensional convolutional neural network and extracts features. The feature extractor consists of three one-dimensional convolutional layers, a flattening layer and a fully-connected layer. The invention uses two convolutional layers with large kernels followed by one convolutional layer with a small kernel, each followed by a max-pooling layer. Batch normalization and the Leaky ReLU activation are applied after each convolutional layer, and the ReLU activation after the fully-connected layer.
For source domain samples $x_i^s \in X_S$ and target domain samples $x_j^t \in X_T$, the feature extractor $G(\cdot)$ maps $x^s$ and $x^t$ into a common feature space via $f^s = G(x^s)$ and $f^t = G(x^t)$, where $f^s, f^t \in \mathbb{R}^d$ are the $d$-dimensional feature vectors of the source domain and the target domain.
(2.2) Constructing the label classifier:
The label classifier predicts the label of a sample from the feature vector extracted from it; it consists of one fully-connected layer whose input dimension is the dimension of the feature vector and whose output dimension is the number of bearing fault categories.
The source domain feature vectors $f^s$ and the target domain feature vectors $f^t$ are fed into the label classifier $C(\cdot)$ for prediction, giving the prediction results $z^s = C(f^s)$ and $z^t = C(f^t)$, where $z^s, z^t \in \mathbb{R}^K$ are the score vectors of the source domain and the target domain and $K$ is the number of sample classes.
According to $z^s$ and the true source domain labels $y^s$, the classification loss of the source domain is calculated with the standard cross entropy, and the classification model composed of the feature extractor $G(\cdot)$ and the label classifier $C(\cdot)$ is trained by back-propagation to minimize this loss; the classification loss of the model on the source domain, $\mathcal{L}_{cls}$, is expressed as follows:

$$\mathcal{L}_{cls} = \frac{1}{n_s} \sum_{i=1}^{n_s} L_c\big(C(G(x_i^s)),\, y_i^s\big)$$

where $L_c(\cdot,\cdot)$ is the cross-entropy loss function.
The score vector of the target domain $z^t$ is processed by the softmax function to obtain the vector $\hat{y}^t$, each element $\hat{y}_{j,k}^t$ of which represents the probability that $x_j^t$ belongs to the corresponding class $k$, calculated as follows:

$$\hat{y}_{j,k}^t = \frac{\exp(z_{j,k}^t)}{\sum_{k'=1}^{K} \exp(z_{j,k'}^t)}$$

$\hat{y}_j^t$ is used as the pseudo-label of $x_j^t$.
(2.3) constructing an LMMD module;
in order to make the source domain data and the target domain data have similar distribution after mapping, a domain adaptive method is generally adopted to train the model. Domain adaptation may extract useful knowledge from one or more source tasks and apply the knowledge to a target task, where the distribution of source and target domains is different but related. One common strategy is to find a suitable index to evaluate the similarity of the distributions and optimize the model to minimize the distribution differences between different domains. Therefore, the quality of the index will directly affect the performance of the model.
Among the numerous statistical distance measures, the Maximum Mean Discrepancy (MMD) is the most widely used in transfer learning. MMD finds a kernel function that maps the data samples of the source domain and the target domain into a Reproducing Kernel Hilbert Space (RKHS), computes the mean of each domain's samples in the RKHS, and takes the difference between these means as the distance. The MMD is defined as follows:
$$\mathrm{MMD}(X, Y) = \left\| \frac{1}{n} \sum_{i=1}^{n} \phi(x_i) - \frac{1}{m} \sum_{j=1}^{m} \phi(y_j) \right\|_{\mathcal{H}}$$

where $\mathcal{H}$ is the RKHS generated by a defined kernel function $k(\cdot,\cdot)$, $\phi(\cdot)$ represents the feature map that maps raw data to the RKHS, $n$ is the number of samples in $X$, and $m$ is the number of samples in $Y$. In practical applications, the square of the MMD is typically used as the measure of distribution difference. The squared MMD distance between the features of the source-domain samples $f^s$ and the features of the target-domain samples $f^t$ is calculated as follows:

$$\mathrm{MMD}^2(f^s, f^t) = \frac{1}{n_s^2} \sum_{i=1}^{n_s} \sum_{j=1}^{n_s} k(f_i^s, f_j^s) + \frac{1}{n_t^2} \sum_{i=1}^{n_t} \sum_{j=1}^{n_t} k(f_i^t, f_j^t) - \frac{2}{n_s n_t} \sum_{i=1}^{n_s} \sum_{j=1}^{n_t} k(f_i^s, f_j^t) \qquad (3)$$
where the kernel function $k(x_i, x_j) = \langle \phi(x_i), \phi(x_j) \rangle$ represents the inner product of two samples in the RKHS. The Gaussian kernel function is used in the present invention and is defined as follows:

$$k(x_i, x_j) = \exp\left( -\frac{\| x_i - x_j \|^2}{2\sigma^2} \right)$$
where $\sigma$ is the bandwidth of the kernel function. For convenient calculation, a kernel matrix $K$ is introduced, composed of the inner-product matrices $K_{s,s}$, $K_{t,t}$, $K_{s,t}$, $K_{t,s}$ defined respectively on the source domain, the target domain, and across the two domains, as follows:

$$K = \begin{bmatrix} K_{s,s} & K_{s,t} \\ K_{t,s} & K_{t,t} \end{bmatrix}, \qquad (K_{s,s})_{ij} = k(f_i^s, f_j^s), \quad (K_{t,t})_{ij} = k(f_i^t, f_j^t), \quad (K_{s,t})_{ij} = k(f_i^s, f_j^t)$$
Defining $L$ as a weight matrix, each element $L_{ij}$ is calculated as follows:

$$L_{ij} = \begin{cases} \dfrac{1}{n_s^2}, & x_i, x_j \in \mathcal{D}_s \\[4pt] \dfrac{1}{n_t^2}, & x_i, x_j \in \mathcal{D}_t \\[4pt] -\dfrac{1}{n_s n_t}, & \text{otherwise} \end{cases}$$
With the help of the kernel-matrix technique defined above, the MMD distance in equation (3) can be written as:

$$\mathrm{MMD}^2(f^s, f^t) = \mathrm{tr}(KL)$$
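The squared-MMD formula and its kernel-matrix trace form can be checked with a small NumPy sketch; the Gaussian bandwidth and sample sizes here are arbitrary assumptions, not values from the patent:

```python
import numpy as np

def gaussian_kernel(X, Y, sigma=1.0):
    # pairwise k(x, y) = exp(-||x - y||^2 / (2 sigma^2))
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def mmd2(fs, ft, sigma=1.0):
    # squared MMD from the three kernel blocks (equation (3) in the text)
    return (gaussian_kernel(fs, fs, sigma).mean()
            + gaussian_kernel(ft, ft, sigma).mean()
            - 2.0 * gaussian_kernel(fs, ft, sigma).mean())

def mmd2_trace(fs, ft, sigma=1.0):
    # the same quantity written as tr(KL) with the block weight matrix L
    ns, nt = len(fs), len(ft)
    Z = np.vstack([fs, ft])
    K = gaussian_kernel(Z, Z, sigma)
    L = np.empty((ns + nt, ns + nt))
    L[:ns, :ns] = 1.0 / ns ** 2
    L[ns:, ns:] = 1.0 / nt ** 2
    L[:ns, ns:] = L[ns:, :ns] = -1.0 / (ns * nt)
    return np.trace(K @ L)
```

Identical distributions give a near-zero distance, shifted ones a larger one, and the two forms agree numerically.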
MMD has been widely used to measure the distribution difference between a source domain and a target domain. However, the classical MMD-based approach aligns only the global distributions of the source and target domains, and rarely considers the subdomain distribution differences of features and output labels under different operating conditions. This loses the fine-grained information of each class and can confuse the discriminative structure: errors occur near the classification boundary when the data of different subdomains lie too close together.
To address these challenges, the present invention uses the Local Maximum Mean Discrepancy (LMMD) instead of the traditional MMD to achieve sub-domain adaptation. A subdomain is a class in the source domain or the target domain containing the samples of one fault category. The core of sub-domain adaptation is learning the local, per-class distribution shifts. According to the weights of the different samples, the LMMD calculates the mean discrepancy, in the RKHS, between samples of the same subdomain in the source domain and the target domain. Based on this idea, the LMMD aligns the distributions of same-class data in the source domain and the target domain, so that the conditional distributions of the two domains agree. It is defined as follows:
$$\mathrm{LMMD}(p, q) = \mathbb{E}_c \left\| \mathbb{E}_{p^{(c)}}[\phi(x^s)] - \mathbb{E}_{q^{(c)}}[\phi(x^t)] \right\|_{\mathcal{H}}^2$$

where $x^s$ and $x^t$ are samples of the source domain and the target domain, $\mathbb{E}$ denotes the mathematical expectation, $p^{(c)}$ and $q^{(c)}$ are the distributions of class $c$ in the source domain and the target domain respectively, $\mathcal{H}$ is the reproducing kernel Hilbert space (RKHS) generated by the defined kernel function $k(\cdot,\cdot)$, and $\phi$ represents the feature map that maps raw data to the RKHS;
Defining the parameter $w^c$ as the weight with which each sample belongs to class $c$, the unbiased estimate of LMMD is defined as follows:

$$\widehat{\mathrm{LMMD}}(p, q) = \frac{1}{K} \sum_{c=1}^{K} \left\| \sum_{i=1}^{n_s} w_i^{sc}\, \phi(x_i^s) - \sum_{j=1}^{n_t} w_j^{tc}\, \phi(x_j^t) \right\|_{\mathcal{H}}^2$$

where $w_i^{sc}$ and $w_j^{tc}$ denote the weights with which the $i$-th source sample $x_i^s$ and the $j$-th target sample $x_j^t$ belong to class $c$, with $\sum_{i=1}^{n_s} w_i^{sc} = 1$ and $\sum_{j=1}^{n_t} w_j^{tc} = 1$, so that $\sum_i w_i^c\, \phi(x_i)$ is a weighted sum over the samples of class $c$. The weight $w_i^c$ is calculated as follows:

$$w_i^c = \frac{y_{ic}}{\sum_{(x_j, y_j) \in \mathcal{D}} y_{jc}}$$

where $y_{ic}$ is the $c$-th element of the vector $y_i$. For source-domain samples $x_i^s$, the one-hot encoding of the true source-domain label $y_i^s$ is used to calculate $w_i^{sc}$. Since each target-domain sample $x_j^t$ is unlabeled in unsupervised domain adaptation, the softmax output $\hat{y}_j^t$ is taken as a soft pseudo-label to calculate $w_j^{tc}$. The LMMD distance between the source-domain feature vectors $f^s$ and the target-domain feature vectors $f^t$ is calculated as follows:

$$\widehat{\mathrm{LMMD}}(f^s, f^t) = \frac{1}{K} \sum_{c=1}^{K} \left[ \sum_{i=1}^{n_s} \sum_{j=1}^{n_s} w_i^{sc} w_j^{sc}\, k(f_i^s, f_j^s) + \sum_{i=1}^{n_t} \sum_{j=1}^{n_t} w_i^{tc} w_j^{tc}\, k(f_i^t, f_j^t) - 2 \sum_{i=1}^{n_s} \sum_{j=1}^{n_t} w_i^{sc} w_j^{tc}\, k(f_i^s, f_j^t) \right]$$
The LMMD distance is expressed by the kernel-matrix method; each element $W_{ij}$ of the weight matrix $W$ is defined as follows:

$$W_{ij} = \begin{cases} \dfrac{1}{K} \displaystyle\sum_{c=1}^{K} w_i^{sc} w_j^{sc}, & x_i, x_j \in \mathcal{D}_s \\[6pt] \dfrac{1}{K} \displaystyle\sum_{c=1}^{K} w_i^{tc} w_j^{tc}, & x_i, x_j \in \mathcal{D}_t \\[6pt] -\dfrac{1}{K} \displaystyle\sum_{c=1}^{K} w_i^{sc} w_j^{tc}, & \text{otherwise} \end{cases}$$
Based on the kernel matrix $K$ and the weight matrix $W$, the LMMD loss is represented as follows:

$$\mathcal{L}_{lmmd} = \mathrm{tr}(KW)$$
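A hedged NumPy sketch of the LMMD loss in its kernel-matrix form: one-hot labels weight the source samples and softmax outputs weight the target samples. The helper names are hypothetical, and the single-bandwidth Gaussian kernel is a simplification of what a full implementation might use:

```python
import numpy as np

def gaussian_kernel(X, Y, sigma=1.0):
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def lmmd_loss(fs, ft, ys_onehot, yt_soft, sigma=1.0):
    # per-sample class weights w_i^c = y_ic / sum_j y_jc (guard empty classes)
    ws = ys_onehot / np.maximum(ys_onehot.sum(0, keepdims=True), 1e-12)
    wt = yt_soft / np.maximum(yt_soft.sum(0, keepdims=True), 1e-12)
    num_classes = ys_onehot.shape[1]
    ns = len(fs)
    Z = np.vstack([fs, ft])
    K = gaussian_kernel(Z, Z, sigma)
    # weight matrix W: positive within-domain blocks, negative cross blocks
    W = np.empty((len(Z), len(Z)))
    W[:ns, :ns] = ws @ ws.T
    W[ns:, ns:] = wt @ wt.T
    W[:ns, ns:] = -(ws @ wt.T)
    W[ns:, :ns] = -(wt @ ws.T)
    return np.trace(K @ W) / num_classes   # tr(KW), averaged over classes
```

When source and target features and class weights coincide, the loss vanishes, as the definition requires.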
(2.4) Constructing a nuclear sensitivity discriminator (KSA module)
To calculate the loss value of a given class in the LMMD formula, the current batch must contain samples of that class in both the source domain and the target domain. Owing to the batch size, sampling randomness, the model's prediction accuracy, and the like, the loss of some classes is computed only a few times, so the distributions of the source domain and the target domain are not aligned on those classes. Moreover, when a predicted pseudo-label is incorrect, training the feature extractor with the LMMD loss aligns the source and target domains on the wrong class, and the model becomes more likely to produce erroneous predictions for target-domain samples.
To address these limitations of LMMD, the present invention proposes a new method called Kernel Sensitivity Alignment (KSA) to further reduce the domain shift between the source domain and the target domain. The nuclear sensitivity of a sample can be seen as its influence on the sum, in the RKHS, of the inner products of all samples of the same domain. In deep learning models the mapped RKHS is typically highly complex, so only two samples that lie very close together can have similar kernel sensitivities.
Based on the nuclear sensitivity alignment method, a nuclear sensitivity discriminator is constructed. It consists of a gradient reversal layer (GRL) followed by three fully connected layers, with batch normalization, a ReLU activation, and dropout after each fully connected layer.
According to the definition of the kernel function, it takes vectors of the original space as input and returns the inner product of the corresponding vectors in the feature space. Thus, in equation (3), $\sum_{i,j} k(f_i^s, f_j^s)$ and $\sum_{i,j} k(f_i^t, f_j^t)$ are the sums of the sample inner products of the source domain and the target domain in the RKHS, and the corresponding source-domain kernel matrix $K_{s,s}$ and target-domain kernel matrix $K_{t,t}$ are represented as follows:

$$(K_{s,s})_{ij} = k\big(G(x_i^s), G(x_j^s)\big), \qquad (K_{t,t})_{ij} = k\big(G(x_i^t), G(x_j^t)\big)$$
Each element of the kernel matrix measures the correlation between two samples in the mapped high-dimensional space. The sensitivity of the sum of inner products to each sample is obtained by taking the partial derivative of the kernel-matrix sum with respect to that sample. The kernel sensitivity $s_i$ of each sample in the source domain and the target domain is calculated as follows:

$$s_i^s = \sum_{d=1}^{D} \frac{\partial}{\partial\, G(x_i^s)_d} \sum_{m=1}^{n_s} \sum_{n=1}^{n_s} k\big(G(x_m^s), G(x_n^s)\big), \qquad s_j^t = \sum_{d=1}^{D} \frac{\partial}{\partial\, G(x_j^t)_d} \sum_{m=1}^{n_t} \sum_{n=1}^{n_t} k\big(G(x_m^t), G(x_n^t)\big)$$

where $G(\cdot)_d$ denotes the $d$-th element of the feature vector, and $x^s$ and $x^t$ are source-domain samples and target-domain samples respectively.
To ensure that the nuclear sensitivity distributions of the two domains are consistent, a nuclear sensitivity discriminator $D_m(\cdot)$ is used to judge whether a kernel sensitivity comes from the source domain or the target domain, while the feature extractor $G(\cdot)$ attempts to confuse it. Under the premise of keeping the source-domain data correctly classified, this adversarial learning further reduces the distribution difference between the two domains. Using the binary classification results of the nuclear sensitivity discriminator and the domain labels, the KSA loss is calculated with binary cross entropy as follows:

$$\mathcal{L}_{ksa} = \frac{1}{n_s} \sum_{i=1}^{n_s} L_b\big(D_m(s_i^s),\, d_i\big) + \frac{1}{n_t} \sum_{j=1}^{n_t} L_b\big(D_m(s_j^t),\, d_j\big)$$

where $L_b(\cdot,\cdot)$ is the binary cross-entropy loss function, $d_i = 0$ is the source-domain label, and $d_j = 1$ is the target-domain label.
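Assuming a Gaussian kernel, the kernel sensitivity admits a closed-form gradient. The NumPy sketch below (all names hypothetical, written for one domain's feature matrix) computes it analytically; a deep-learning implementation would instead obtain the same quantity by automatic differentiation:

```python
import numpy as np

def gaussian_gram(X, sigma=1.0):
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def kernel_sensitivity(X, sigma=1.0):
    # s_i = sum over dims d of d/dX[i,d] of sum_{m,n} k(x_m, x_n).
    # For the Gaussian kernel, dk(x_i, x_j)/dx_i = -(x_i - x_j)/sigma^2 * k(x_i, x_j),
    # and each pair contributes twice (k_ij and k_ji) to the Gram-matrix sum.
    K = gaussian_gram(X, sigma)
    diff = X[:, None, :] - X[None, :, :]
    grad_ij = (-diff / sigma ** 2) * K[:, :, None]   # dk_ij / dx_i
    return (2.0 * grad_ij.sum(axis=1)).sum(axis=1)   # sum over j, then over dims
```

The analytic gradient can be verified against a central finite difference of the Gram-matrix sum.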
(3) Training model
The model framework of the method is shown in fig. 2. The backbone of the model is a classification model that predicts the target-domain labels, composed of a one-dimensional convolutional neural network (1D-CNN) serving as the feature extractor and a fully connected layer serving as the label classifier. To solve the domain-shift problem, the invention designs two modules, LMMD and KSA. The LMMD module requires four inputs: the source-domain features $f^s$, the target-domain features $f^t$, the true source-domain labels $y^s$, and the target-domain pseudo-labels $\hat{y}^t$.
The KSA module (nuclear sensitivity discriminator) makes the nuclear sensitivity distributions of the source domain and the target domain consistent through adversarial learning between the feature extractor $G(\cdot)$ and the nuclear sensitivity discriminator $D_m(\cdot)$. After the LMMD module's computation, the kernel matrices of the source domain and the target domain are available; based on them, the kernel sensitivities of the source-domain and target-domain samples are calculated and sent to the nuclear sensitivity discriminator for binary classification.
According to the above, the overall optimization objective, namely the overall loss function, of the subdomain adaptive deep neural network model based on the kernel sensitivity alignment, which is proposed by the invention, consists of three parts including
Figure BDA0003994774010000161
and
Figure BDA0003994774010000162
First, in order to guarantee the accuracy of classification, the classification loss of the source domain needs to be minimized through optimization. Then will->
Figure BDA0003994774010000163
The losses are minimized as sub-domain adaptation to make the conditional distributions of the source and target domains uniform. Third, maximize->
Figure BDA00039947740100001611
The penalty is applied as a global adaptation to align the edge distribution of the source and target domains and further reduce domain offsets. Thus, the overall optimization objective loss function can be expressed as:
Figure BDA0003994774010000164
where $\lambda_1$ and $\lambda_2$ are two balance parameters, $\mathcal{L}_{cls}$ is the classification loss of the source domain, $\mathcal{L}_{lmmd}$ is the LMMD loss, and $\mathcal{L}_{ksa}$ is the KSA loss. The parameters of the deep neural network model are trained by back-propagation with the minimization of the total loss $\mathcal{L}$ as the objective.
It should be noted that KSA is an adversarial method: the kernel sensitivities are first fed into the gradient reversal layer (GRL) and then into the nuclear sensitivity discriminator. The parameters $\theta_f$ of the feature extractor, $\theta_c$ of the label classifier, and $\theta_m$ of the nuclear sensitivity discriminator are updated by back-propagation as follows:

$$\theta_f \leftarrow \theta_f - \eta \left( \frac{\partial \mathcal{L}_{cls}}{\partial \theta_f} + \lambda_1 \frac{\partial \mathcal{L}_{lmmd}}{\partial \theta_f} - \lambda_2 \frac{\partial \mathcal{L}_{ksa}}{\partial \theta_f} \right), \qquad \theta_c \leftarrow \theta_c - \eta\, \frac{\partial \mathcal{L}_{cls}}{\partial \theta_c}, \qquad \theta_m \leftarrow \theta_m - \eta\, \lambda_2\, \frac{\partial \mathcal{L}_{ksa}}{\partial \theta_m}$$

where $\eta$ represents the learning rate.
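The GRL admits a minimal framework-free sketch. In a real model this would be a custom autograd function (e.g. `torch.autograd.Function` in PyTorch); the class below is illustrative only, with all names assumed:

```python
import numpy as np

class GradReverse:
    # gradient reversal layer: identity in the forward pass,
    # multiplies the incoming gradient by -lambda in the backward pass
    def __init__(self, lam=1.0):
        self.lam = lam

    def forward(self, x):
        return x

    def backward(self, grad_out):
        return -self.lam * grad_out

grl = GradReverse(lam=0.5)
x = np.array([1.0, -2.0])
out = grl.forward(x)                       # activations pass through unchanged
g = grl.backward(np.array([0.3, 0.7]))     # reversed, scaled gradient
```

The sign flip is what lets a single minimization pass update the discriminator normally while the feature extractor receives the maximizing gradient.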
(4) Simulation experiment
(4.1) Experimental setup
The data set used in the simulation is the Paderborn data set (PU), provided by the KAT bearing data center at Paderborn University. The experimental platform is shown in fig. 3; the basic components of the test bench are a drive motor (permanent magnet synchronous motor) 1, a torque measuring device 2, a bearing test module 3, a flywheel 4, and a load motor (synchronous servo motor) 5. In the Paderborn data set, rolling bearings of type 6203 manufactured by FAG, MTK, and IBU were used for the fault-diagnosis tests. The stator current of the drive motor 1 and the acceleration vibration signal on the bearing block housing are the main measured variables on the bearing test stand.
Acceleration vibration signals with a sampling frequency of 64 kHz are selected for fault diagnosis and analysis. The PU data set includes 6 healthy bearings and 26 failed bearings in total. The fault data contain 14 sets of faulty bearings from accelerated-degradation experiments and 12 sets of artificially damaged bearings. Bearings failed by accelerated degradation are more prone to exhibit multiple damage types than artificially damaged ones, and different bearings differ in failure mode, degree of damage, and so on. To simulate reality more closely, the monitoring data of the healthy bearings were collected from different periods of bearing operation. Each different set of bearings is treated as one class, so there are 32 classes in the PU data set. By varying the radial force on the bearings and the load torque on the drive system, three operating conditions are defined in the PU data set. The data are described in detail below.
Table 1 Paderborn data set settings
According to the standard unsupervised domain-adaptation experimental protocol, all labeled source-domain data and all unlabeled target-domain data are used as training data. For the three operating conditions A, B, and C of the PU data set described in Table 1, a total of 6 transfer-learning tasks are performed. To verify the transfer-learning capability of the proposed method under different working conditions, four state-of-the-art unsupervised domain-adaptation methods are selected for comparison: Deep Adaptation Network (DAN), Deep CORAL, Domain-Adversarial Neural Network (DANN), and Deep Subdomain Adaptation Network (DSAN).
For fair comparison, the 1D-CNN is selected as the base network, and all domain-adaptation methods are adapted on top of it. A stochastic gradient descent (SGD) optimizer is used to train the model parameters, with momentum 0.9 and weight decay $5 \times 10^{-4}$. The learning rate is adjusted dynamically by $\eta_\theta = \eta_0 / (1 + \alpha\theta)^\beta$, where $\theta$ is the training progress varying linearly from 0 to 1, $\eta_0 = 0.01$, $\alpha = 10$, and $\beta = 0.75$. The batch size is set to 32 and the number of epochs to 200. To avoid an excessive influence on the label classifier in the early training phase, the balance parameters $\lambda_1, \lambda_2$ are adjusted gradually by $2/(1 + \exp(-\gamma\theta)) - 1$, with $\gamma = 3$ for $\lambda_1$ and $\gamma = 5$ for $\lambda_2$. The experiments were run in a Linux server environment with an Intel(R) Xeon(R) Gold 5117 CPU and an NVIDIA GeForce RTX 3090 graphics card. The PyTorch deep learning framework is used to build the model, and the GPU is used to accelerate computation.
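The two schedules quoted above are simple closed forms; a sketch with the constants from the text as defaults (the function names are illustrative):

```python
import math

def lr_schedule(theta, eta0=0.01, alpha=10.0, beta=0.75):
    # eta_theta = eta0 / (1 + alpha * theta)^beta, with theta in [0, 1]
    return eta0 / (1.0 + alpha * theta) ** beta

def balance_schedule(theta, gamma):
    # lambda = 2 / (1 + exp(-gamma * theta)) - 1, rising smoothly from 0 toward 1
    return 2.0 / (1.0 + math.exp(-gamma * theta)) - 1.0
```

The learning rate decays monotonically from 0.01, while the balance parameters start at 0 so the alignment losses barely perturb the classifier early in training.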
(4.2) Fault diagnosis results
Table 2 Paderborn data set experimental results
The experimental results on the Paderborn data set are shown in Table 2. Diagnosing faults of bearings damaged by accelerated degradation is generally more difficult than diagnosing artificially introduced faults. In addition, the Paderborn data set has 32 classes and complex fault formation; common features across domains are not obvious during domain adaptation, which increases its difficulty. All models achieve higher accuracy on tasks A → B and B → A, and lower accuracy on the other tasks. Under conditions A and B the radial force is the same and the load torque differs, which indicates that the load torque has the larger influence on the vibration signal. For all tasks, the proposed method is a significant improvement over DAN, Deep CORAL, DANN, and DSAN; in particular, it improves by about 10% on the tasks other than A → B and B → A. Overall, the average accuracy of the proposed method is the highest, reaching 84.7 ± 1.0, which demonstrates its strong domain-adaptation capability.
(4.3) model analysis
Practical application of a transfer-learning model requires not only good prediction performance but also stable accuracy; the predictions of a model whose accuracy fluctuates strongly are unreliable. The prediction stability can be compared by observing the accuracy curves on the target-domain data recorded during the experiments. From the collected data, the accuracy curves of the different methods on task C → A of the Paderborn data set are plotted, as shown in fig. 4. The method of the invention achieves the highest accuracy among the compared methods, and its accuracy curve fluctuates little.
In addition, the confusion matrices of the target-domain predictions of different methods on the Paderborn data set task B → C are analyzed: fig. 5(a) shows the DSAN method and fig. 5(b) the method of the invention. The rows of the confusion matrix represent the true labels of the samples, and the columns represent the model's predictions. The DSAN method, which aligns subdomains with LMMD only, is completely wrong on the samples with true labels 10 and 20: it mistakes label 10 for label 9 or 11, and misidentifies label 20 entirely as label 23. The method of the invention identifies every class with accuracy above 50% and exhibits no completely misclassified class as in DSAN, which is attributed to the proposed nuclear-sensitivity-based alignment method.
To visually compare the quality of the features obtained by the different domain-adaptation methods, the output of the deep neural network can be visualized, and its ability evaluated by inspecting the dimension-reduced features. The t-SNE technique is applied to embed the test-set features of the last hidden layer of the feature extractor into a two-dimensional space. Fig. 6 shows the visualization of the samples of task B → C in the Paderborn data set, where the gray dots represent source-domain samples and the black "X" marks represent target-domain samples. As can be seen from the figure, in the DSAN method, which aligns subdomains with LMMD only, some classes of source-domain samples have no target-domain samples aligned with them at all. In the visualization of the proposed method, samples of the same class from the source and target domains are more compact, while samples of different classes are well separated.
The above experiments demonstrate that the proposed method has strong transfer capability under different working conditions and is clearly superior to all compared algorithms. Moreover, the nuclear sensitivity alignment method can be implemented conveniently and effectively in other domain-adaptation networks. This advantage enables its wide application across different fields and gives it a good application prospect.
It should be noted that the above-mentioned bearing fault diagnosis method example is only a preferred embodiment of the present invention, and the mechanical fault diagnosis method based on the variable working condition of the nuclear sensitivity alignment network according to the present invention is also applicable to other types of mechanical fault diagnosis, such as motor fault, gear fault, automobile transmission fault, fan fault, etc., and the specific implementation method thereof is similar to the above-mentioned embodiment, and can achieve a good fault diagnosis result, and is not described herein again.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, method, article, or system comprising that element.
The above serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments. In the unit claims enumerating several means, several of these means can be embodied by one and the same item of hardware. The use of the words first, second, third, and the like does not denote any order; such words may be interpreted merely as names.
The above description is only a preferred embodiment of the present invention, and is not intended to limit the scope of the present invention, and all equivalent structures or equivalent processes performed by the present invention or directly or indirectly applied to other related technical fields are also included in the scope of the present invention.

Claims (10)

1. A mechanical fault diagnosis method under variable working conditions based on a nuclear sensitivity alignment network is characterized by comprising the following steps:
S1: collecting data from mechanical equipment under different working conditions to form a source domain data set and a target domain data set;
S2: slicing the source domain data set and the target domain data set to obtain a plurality of source domain samples and target domain samples, and normalizing each source domain sample and each target domain sample;
S3: constructing a subdomain self-adaptive deep neural network model based on nuclear sensitivity alignment, comprising: a feature extractor, a label classifier, an LMMD module, and a nuclear sensitivity discriminator;
S4: respectively inputting the normalized source domain samples and target domain samples into the feature extractor to obtain the feature vector of the source domain and the feature vector of the target domain;
S5: inputting the feature vector of the source domain into the label classifier, and calculating the classification loss of the source domain by using the prediction result and the source domain label; inputting the feature vector of the target domain into the label classifier to obtain the pseudo label of the target domain;
S6: inputting the feature vector of the source domain, the feature vector of the target domain, the source domain label, and the pseudo label of the target domain into the LMMD module to generate a source domain kernel matrix, a target domain kernel matrix, and the LMMD loss;
S7: calculating the kernel sensitivities corresponding to the source domain samples and the target domain samples according to the feature vector of the source domain, the feature vector of the target domain, the source domain kernel matrix, and the target domain kernel matrix, and inputting the kernel sensitivities into the nuclear sensitivity discriminator to obtain the KSA loss;
S8: adding the classification loss of the source domain, the LMMD loss, and the KSA loss to obtain the total loss, and optimizing the model by stochastic gradient descent with minimizing the total loss as the optimization target;
S9: judging whether the specified number of iterations is reached; if so, finishing the training and carrying out mechanical fault diagnosis under variable working conditions with the trained deep neural network model to obtain the fault diagnosis result; otherwise, returning to step S4.
2. The method for diagnosing mechanical faults under variable working conditions based on a nuclear sensitivity alignment network according to claim 1, wherein S1 and S2 specifically comprise:
collecting bearing vibration signals with known fault information as the source domain data set $\mathcal{D}_s = \{(x_i^s, y_i^s)\}_{i=1}^{n_s}$, and taking the classification of the source domain data set as the source task $\mathcal{T}_s$; collecting bearing vibration signals with unknown fault information under other working conditions as the target domain data set $\mathcal{D}_t = \{x_j^t\}_{j=1}^{n_t}$, and taking the classification of the target domain data set as the target task $\mathcal{T}_t$; wherein $\mathcal{X}_S$ and $\mathcal{X}_T$ represent the feature spaces of the source domain and the target domain respectively, $P_S(X_S)$ and $P_T(X_T)$ represent the probability distributions of the source domain and the target domain respectively, $\mathcal{D}_s$ denotes the data set consisting of $n_s$ samples in the source domain, $\mathcal{D}_t$ denotes the data set consisting of $n_t$ samples in the target domain, $\mathcal{Y}_S$ and $\mathcal{Y}_T$ represent the label spaces of the source task and the target task respectively, and $f_S(\cdot)$ and $f_T(\cdot)$ are the mapping functions of the source domain and the target domain, representing the relationship between the samples of a data set and the predicted results;
segmenting the collected source domain data set and target domain data set with a sliding window to generate source domain samples and target domain samples;
and normalizing each source domain sample and each target domain sample.
3. The method for diagnosing mechanical faults under variable working conditions based on a nuclear sensitivity alignment network according to claim 1, wherein in S3 the feature extractor comprises three one-dimensional convolutional layers, a flattening layer, and a fully connected layer arranged in sequence; the convolution kernels of the first two convolutional layers are larger while the kernel of the following convolutional layer is smaller, and a max-pooling layer is arranged behind each convolutional layer; batch normalization and the Leaky ReLU function are used after each convolutional layer, and the ReLU function is used after the fully connected layer.
4. The method for diagnosing the mechanical fault under the variable working condition based on the nuclear sensitivity alignment network as claimed in claim 1, wherein in the step S3, the label classifier comprises a full connection layer, the input dimension number is the dimension number of the feature vector, and the output dimension number is the number of the bearing fault categories.
5. The method for diagnosing mechanical faults under variable working conditions based on a nuclear sensitivity alignment network according to claim 1, wherein in S3 the nuclear sensitivity discriminator comprises a gradient reversal layer GRL and three fully connected layers arranged in sequence, and each fully connected layer is followed by batch normalization, a ReLU function, and a dropout function.
6. The method for diagnosing the mechanical fault under the variable working condition based on the nuclear sensitivity alignment network according to claim 2, wherein the step S4 specifically comprises the following steps:
for the source domain samples $x^s$ and the target domain samples $x^t$, the feature extractor $G(\cdot)$ maps $x^s$ and $x^t$ into a common feature space via $f^s = G(x^s)$ and $f^t = G(x^t)$, where $f^s, f^t \in \mathbb{R}^D$ are the $D$-dimensional feature vectors of the source domain and the target domain.
7. The method for diagnosing the mechanical fault under the variable working condition based on the nuclear sensitivity alignment network according to claim 6, wherein the step S5 specifically comprises the following steps:
the feature vector of the source domain $f^s$ and the feature vector of the target domain $f^t$ are sent into the label classifier $C(\cdot)$ for prediction, giving $z^s = C(f^s)$ and $z^t = C(f^t)$, where $z^s, z^t \in \mathbb{R}^K$ are the score vectors of the source domain and the target domain and $K$ is the number of sample classes;
according to $z^s$ and the true source domain labels $y^s$, the classification loss of the source domain is calculated with the standard cross-entropy formula, and the classification model consisting of the feature extractor $G(\cdot)$ and the label classifier $C(\cdot)$ is trained by back-propagation to minimize this loss; the classification loss of the model on the source domain $\mathcal{L}_{cls}$ is represented as follows:

$$\mathcal{L}_{cls} = \frac{1}{n_s} \sum_{i=1}^{n_s} L_c\big(C(G(x_i^s)),\, y_i^s\big)$$

where $L_c(\cdot,\cdot)$ is the cross-entropy loss function;
the score vector of the target domain $z^t$ is processed by the softmax function to obtain a vector $\hat{y}^t$, each element $\hat{y}^t_k$ of which represents the probability that $x^t$ belongs to class $k$, calculated as follows:

$$\hat{y}^t_k = \frac{\exp(z^t_k)}{\sum_{j=1}^{K} \exp(z^t_j)}$$

$\hat{y}^t$ is used as the pseudo-label of $x^t$.
8. The method for diagnosing the mechanical fault under the variable working condition based on the nuclear sensitivity alignment network according to claim 2, wherein in the step S6: the LMMD module is configured to align distribution of the same category data in the source domain and the target domain, so that conditional distribution of the two domains is the same, where the LMMD is defined as follows:
Figure FDA00039947730000000312
wherein ,xs and xt Is a sample of the source and target domains, E stands for mathematical expectation, p (c) and q(c) Respectively the distribution of c classes in the source domain and the target domain,
Figure FDA00039947730000000322
is a regenerative kernel hilbert space RKHS generated by a defined kernel function k (·, ·), Φ representing a feature mapping that maps raw data to RKHS;
will be the parameter w c Defined as the weight of each sample belonging to each class, the unbiased estimation of LMMD is defined as follows:
Figure FDA00039947730000000313
wherein ,
Figure FDA00039947730000000314
and
Figure FDA00039947730000000315
Respectively represent the ith source sample->
Figure FDA00039947730000000316
And the jth target sample->
Figure FDA00039947730000000317
The weight values belonging to the class C are,
Figure FDA00039947730000000318
and
Figure FDA00039947730000000319
and
Figure FDA00039947730000000320
Is a weighted sum of the class C samples;
Figure FDA00039947730000000321
The calculation method of (c) is as follows:
Figure FDA0003994773000000041
wherein $y_{ic}$ is the $c$-th entry of the label vector $y_i$; for source domain samples $x_i^s$, $w_i^{sc}$ is calculated using the one-hot encoding of the true source domain label $y_i^s$; since each target domain sample $x_j^t$ is unlabeled in unsupervised domain adaptation, the classifier output $\hat{y}_j^t=f(x_j^t)$ is taken as a soft pseudo label to calculate $w_j^{tc}$ of the target sample;
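As a hedged sketch of the weight computation just described (function and variable names are illustrative, not from the patent): source rows are one-hot true labels, target rows are softmax outputs, and each column is normalized so the class-c weights sum to one:

```python
import numpy as np

def class_weights(label_probs):
    """w_ic = y_ic / sum_j y_jc.
    `label_probs` holds one row per sample: a one-hot true label for
    source samples, or the classifier's softmax output (soft pseudo
    label) for target samples."""
    col_sums = label_probs.sum(axis=0, keepdims=True)
    # Guard against classes with no mass in the current batch
    return label_probs / np.where(col_sums == 0, 1.0, col_sums)

# Three source samples, two classes, one-hot true labels
y_src = np.array([[1.0, 0.0],
                  [1.0, 0.0],
                  [0.0, 1.0]])
w_src = class_weights(y_src)
```

The same function serves both domains, which mirrors the claim: only the input (hard one-hot labels versus soft pseudo labels) changes.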
The LMMD distance between the feature vectors $z^s$ of the source domain and the feature vectors $z^t$ of the target domain is calculated as follows:

$$\hat{d}_{\mathcal{H}}(p,q)=\frac{1}{C}\sum_{c=1}^{C}\left[\sum_{i=1}^{n_s}\sum_{j=1}^{n_s}w_i^{sc}w_j^{sc}\,k(z_i^s,z_j^s)+\sum_{i=1}^{n_t}\sum_{j=1}^{n_t}w_i^{tc}w_j^{tc}\,k(z_i^t,z_j^t)-2\sum_{i=1}^{n_s}\sum_{j=1}^{n_t}w_i^{sc}w_j^{tc}\,k(z_i^s,z_j^t)\right]$$
wherein k (·, ·) represents a kernel function;
The kernel matrix $K$ is calculated: the matrix is composed of the inner-product matrices $K^{s,s}$, $K^{t,t}$, $K^{s,t}$, $K^{t,s}$ defined respectively on the source domain, on the target domain, and across the two domains, and is expressed as follows:

$$K=\begin{bmatrix}K^{s,s}&K^{s,t}\\K^{t,s}&K^{t,t}\end{bmatrix},\qquad K^{a,b}_{ij}=k\left(z_i^a,z_j^b\right),\ a,b\in\{s,t\}$$
The LMMD distance is expressed in kernel-matrix form, and each element $w_{ij}$ of the weight matrix $W$ is defined as follows:

$$w_{ij}=\frac{1}{C}\sum_{c=1}^{C}\begin{cases}w_i^{sc}w_j^{sc}, & z_i\in\mathcal{D}_s,\ z_j\in\mathcal{D}_s\\ w_i^{tc}w_j^{tc}, & z_i\in\mathcal{D}_t,\ z_j\in\mathcal{D}_t\\ -\,w_i^{sc}w_j^{tc}, & z_i\in\mathcal{D}_s,\ z_j\in\mathcal{D}_t\\ -\,w_i^{tc}w_j^{sc}, & z_i\in\mathcal{D}_t,\ z_j\in\mathcal{D}_s\end{cases}$$
Based on the kernel matrix $K$ and the weight matrix $W$, the LMMD loss is expressed as follows:

$$\mathcal{L}_{LMMD}=\operatorname{tr}(KW)$$
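Under the assumption of a Gaussian kernel (the claim does not fix the kernel choice), the kernel-matrix form tr(KW) can be sketched as follows; names and the `gamma` parameter are illustrative:

```python
import numpy as np

def gaussian_kernel(A, B, gamma=1.0):
    """k(a, b) = exp(-gamma * ||a - b||^2) for all pairs of rows."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def lmmd_loss(zs, zt, ws, wt, gamma=1.0):
    """LMMD = tr(K W): K is the joint kernel matrix over the stacked
    source/target features; W is assembled block-wise from the
    per-class weights (positive within a domain, negative across)."""
    Z = np.vstack([zs, zt])
    K = gaussian_kernel(Z, Z, gamma)
    C = ws.shape[1]
    W = np.block([[ ws @ ws.T, -ws @ wt.T],
                  [-wt @ ws.T,  wt @ wt.T]]) / C
    return np.trace(K @ W)

zs = np.array([[0.0, 0.0], [1.0, 1.0]])   # toy source features
ws = np.array([[1.0, 0.0], [0.0, 1.0]])   # per-class weights (columns sum to 1)
```

Expanding tr(KW) with this block W reproduces the three double sums of the expanded LMMD distance term by term, which is why the trace form is equivalent.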
9. The method for diagnosing the mechanical fault under the variable working condition based on the nuclear sensitivity alignment network according to claim 2, wherein the step S7 specifically comprises the following steps:
$\mathcal{K}^s$ and $\mathcal{K}^t$ are the sums of the sample inner products of the source domain and of the target domain in the RKHS, and the corresponding source domain kernel matrix $K^{s,s}$ and target domain kernel matrix $K^{t,t}$ are represented as follows:

$$\mathcal{K}^s=\sum_{i=1}^{n_s}\sum_{j=1}^{n_s}K^{s,s}_{ij},\qquad K^{s,s}_{ij}=k\left(G(x_i^s),G(x_j^s)\right)$$

$$\mathcal{K}^t=\sum_{i=1}^{n_t}\sum_{j=1}^{n_t}K^{t,t}_{ij},\qquad K^{t,t}_{ij}=k\left(G(x_i^t),G(x_j^t)\right)$$
The kernel sensitivity $s_i$ of each sample is obtained by taking the partial derivative of the source domain kernel matrix and of the target domain kernel matrix with respect to the samples, calculated as follows:

$$s_{i,d}^{s}=\frac{\partial\mathcal{K}^s}{\partial G(x_i^s)_d}=\sum_{j=1}^{n_s}\frac{\partial k\left(G(x_i^s),G(x_j^s)\right)}{\partial G(x_i^s)_d}$$

$$s_{j,d}^{t}=\frac{\partial\mathcal{K}^t}{\partial G(x_j^t)_d}=\sum_{i=1}^{n_t}\frac{\partial k\left(G(x_i^t),G(x_j^t)\right)}{\partial G(x_j^t)_d}$$

wherein $G(\cdot)_d$ denotes the $d$-th element of the feature vector, and $x_i^s$ and $x_j^t$ are respectively source domain samples and target domain samples.
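The claim's exact sensitivity formula survives only as an equation image, so the sketch below is an assumption-laden reconstruction: it fixes a Gaussian kernel, for which the partial derivative of the kernel row-sum with respect to one sample has a closed form. Function names and `gamma` are illustrative:

```python
import numpy as np

def kernel_row_grad(F, i, gamma=1.0):
    """d/dF[i] of sum_j k(F[i], F[j]) with k(a,b)=exp(-gamma*||a-b||^2),
    i.e. sum_j -2*gamma*(F[i]-F[j])*k_ij: one sensitivity per feature dim.
    The j == i term contributes zero since the difference vanishes."""
    diff = F[i] - F                                   # (n, d)
    kij = np.exp(-gamma * (diff ** 2).sum(axis=-1))   # (n,)
    return (-2.0 * gamma * diff * kij[:, None]).sum(axis=0)

def kernel_sensitivities(F, gamma=1.0):
    """One sensitivity vector per sample; these feed the discriminator D_m."""
    return np.stack([kernel_row_grad(F, i, gamma) for i in range(len(F))])

# Three toy feature vectors from one domain
F = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
g = kernel_row_grad(F, 0)
```

For this toy `F` the analytic gradient at sample 0 is `[2*exp(-1), 4*exp(-4)]`, which a finite-difference check confirms.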
The kernel sensitivities are input into the kernel sensitivity discriminator $D_m(\cdot)$; using the binary classification results of the discriminator and the domain labels, the KSA loss is calculated with binary cross entropy as follows:

$$\mathcal{L}_{KSA}=\frac{1}{n_s}\sum_{i=1}^{n_s}L_b\left(D_m(s_i^s),d_i\right)+\frac{1}{n_t}\sum_{j=1}^{n_t}L_b\left(D_m(s_j^t),d_j\right)$$

wherein $L_b(\cdot,\cdot)$ is the binary cross-entropy loss function, $d_i=0$ is the source domain label, and $d_j=1$ is the target domain label.
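A minimal sketch of the KSA loss using averaged binary cross entropy over the discriminator's outputs; the per-domain averaging and all names are assumptions, since the claim fixes only L_b and the domain labels d_i = 0, d_j = 1:

```python
import numpy as np

def bce(p, label, eps=1e-12):
    """Binary cross entropy L_b for a single prediction p in (0, 1)."""
    return -(label * np.log(p + eps) + (1 - label) * np.log(1 - p + eps))

def ksa_loss(d_src, d_tgt):
    """Discriminator outputs on source sensitivities (domain label 0)
    and on target sensitivities (domain label 1), averaged per domain."""
    ls = float(np.mean([bce(p, 0) for p in d_src]))
    lt = float(np.mean([bce(p, 1) for p in d_tgt]))
    return ls + lt
```

A maximally confused discriminator (all outputs 0.5) yields 2·ln 2, the point at which the sensitivities of the two domains are indistinguishable, which is what the adversarial alignment drives toward.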
10. The method for diagnosing the mechanical fault under the variable working condition based on the nuclear sensitivity alignment network as claimed in claim 1, wherein in the step S8, the expression of the total loss is as follows:
$$\mathcal{L}=\mathcal{L}_{src}+\lambda_1\mathcal{L}_{LMMD}+\lambda_2\mathcal{L}_{KSA}$$

wherein $\lambda_1,\lambda_2$ are two balance parameters, $\mathcal{L}_{src}$ is the source domain classification loss, $\mathcal{L}_{LMMD}$ is the LMMD loss, and $\mathcal{L}_{KSA}$ is the KSA loss; back propagation is used, with minimizing the total loss $\mathcal{L}$ as the target, to train the parameters of the deep neural network model;
the parameter $\theta_f$ of the feature extractor, the parameter $\theta_c$ of the label classifier, and the parameter $\theta_m$ of the kernel sensitivity discriminator are updated by back propagation as follows:

$$\theta_f\leftarrow\theta_f-\eta\,\frac{\partial\mathcal{L}}{\partial\theta_f}$$

$$\theta_c\leftarrow\theta_c-\eta\,\frac{\partial\mathcal{L}}{\partial\theta_c}$$

$$\theta_m\leftarrow\theta_m-\eta\,\frac{\partial\mathcal{L}}{\partial\theta_m}$$
where η represents the learning rate.
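The three update rules are plain gradient-descent steps applied to separate parameter groups; a toy sketch (parameter values and gradients are made up for illustration):

```python
import numpy as np

def sgd_step(params, grads, lr=0.1):
    """theta <- theta - eta * dL/dtheta for each parameter group
    (feature extractor theta_f, label classifier theta_c,
    kernel-sensitivity discriminator theta_m)."""
    return {name: params[name] - lr * grads[name] for name in params}

params = {"theta_f": np.array([1.0, -0.5]),
          "theta_c": np.array([0.2]),
          "theta_m": np.array([0.0, 0.0])}
grads = {"theta_f": np.array([0.1, 0.1]),
         "theta_c": np.array([-1.0]),
         "theta_m": np.array([0.5, -0.5])}
new = sgd_step(params, grads, lr=0.1)
```

In practice a deep-learning framework's autograd would supply the gradient dictionary; the step itself is exactly the update written above.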
CN202211599722.4A 2022-12-12 2022-12-12 Nuclear sensitivity alignment network-based mechanical fault diagnosis method under variable working conditions Active CN115935187B (en)

Publications (2)

Publication Number Publication Date
CN115935187A true CN115935187A (en) 2023-04-07
CN115935187B CN115935187B (en) 2023-08-22





Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant