CN115935187A

CN115935187A - Mechanical Fault Diagnosis Method Under Variable Working Conditions Based on Kernel Sensitivity Alignment Network

Info

Publication number: CN115935187A
Application number: CN202211599722.4A
Authority: CN
Inventors: 彭雷; 张子蕴; 戴光明; 王茂才; 宋志明; 陈晓宇
Original assignee: China University of Geosciences
Current assignee: China University of Geosciences
Priority date: 2022-12-12
Filing date: 2022-12-12
Publication date: 2023-04-07
Anticipated expiration: 2042-12-12
Also published as: CN115935187B

Abstract

The invention discloses a mechanical fault diagnosis method under variable working conditions based on a kernel sensitivity alignment network. The method combines global domain adaptation and subdomain adaptation to construct a subdomain-adaptive deep neural network model with kernel sensitivity alignment. In this model, the Local Maximum Mean Discrepancy (LMMD) is implemented as the subdomain adaptation to align the conditional distributions. Based on the model, the invention also provides an adversarial learning method, kernel sensitivity alignment (KSA), to overcome the shortcomings of LMMD. Compared with conventional adversarial domain adaptation methods, KSA is sensitive to spatial position and can significantly reduce domain shift by discriminating the relationships between sample features. The invention addresses the technical problem of low mechanical fault diagnosis accuracy under variable working conditions in the prior art.

Description

Mechanical fault diagnosis method under variable working conditions based on kernel sensitivity alignment network

Technical Field

The present invention relates to the fields of intelligent mechanical fault diagnosis and computer artificial intelligence, and in particular to a mechanical fault diagnosis method under variable working conditions based on a kernel sensitivity alignment network.

Background Art

Rotating machinery is important equipment in modern industry and is widely used in industrial facilities and electrified systems. Rolling bearings, as key components, operate for long periods in harsh environments involving high temperature, high speed, fatigue, and wide load variation. They may eventually fail, resulting in high maintenance costs and even serious accidents. According to statistics, aero-engine failures account for about 40% of aircraft mechanical failures, and rolling bearing failures have always made up a large proportion of them.

Intelligent fault diagnosis based on deep learning can process massive amounts of monitoring data and determine the health status of a machine, which improves the reliability and safety of industrial production. Compared with the traditional machine learning methods used in the past, deep learning models do not require expert experience: they train all model parameters adaptively, learn key features automatically, and predict the result. Deep learning nevertheless has limitations. The success of most intelligent fault diagnosis methods depends on two conditions: first, a large amount of labeled fault data is needed for model training; second, the training and test data must follow the same probability distribution. For some machines these two conditions are difficult to meet. In many industrial applications, collecting enough labeled data, especially labeled fault data, is time-consuming, labor-intensive, and sometimes unrealistic. More importantly, industrial machinery often operates in harsh, changeable, and complex environments, so the data distribution in future test cases differs from that of the data used to pre-train the model.

For the complex and changeable working conditions encountered in fault diagnosis, such cross-domain diagnosis tasks can be addressed by domain adaptation, a branch of transfer learning, which extracts features that are similar across domains while maintaining good classification performance on the source domain data. Earlier methods for this problem often use global domain adaptation, but this may confuse the classes of the test data. Subdomain adaptation methods, which align the distributions of the source and target domains for each class, are therefore attracting increasing attention. However, existing subdomain adaptation methods also have a limitation: they can only align the distributions of some subdomains, which leads to low mechanical fault diagnosis accuracy under variable working conditions.

Therefore, improving the accuracy of mechanical fault diagnosis under variable working conditions is a technical problem that urgently needs to be solved.

Summary of the Invention

To solve the above technical problem, the present invention provides a mechanical fault diagnosis method under variable working conditions based on a kernel sensitivity alignment network, which can diagnose mechanical faults accurately under variable working conditions.

The technical solution provided by the present invention is specifically a mechanical fault diagnosis method under variable working conditions based on a kernel sensitivity alignment network, comprising the following steps:

S1: Collect data from the mechanical equipment under different working conditions to form a source domain dataset and a target domain dataset;

S2: Slice the source domain dataset and the target domain dataset to obtain multiple source domain samples and target domain samples, and normalize each source domain sample and target domain sample;

S3: Construct a subdomain-adaptive deep neural network model based on kernel sensitivity alignment, comprising a feature extractor, a label classifier, an LMMD module, and a kernel sensitivity discriminator;

S4: Input the normalized source domain samples and target domain samples into the feature extractor to obtain the feature vectors of the source domain and the feature vectors of the target domain;

S5: Input the source domain feature vectors into the label classifier and compute the source domain classification loss from the predictions and the source domain labels; input the target domain feature vectors into the label classifier to obtain the pseudo-labels of the target domain;

S6: Input the source domain feature vectors, the target domain feature vectors, the source domain labels, and the target domain pseudo-labels into the LMMD module to generate the source domain kernel matrix, the target domain kernel matrix, and the LMMD loss;

S7: Compute the kernel sensitivities of the source domain samples and the target domain samples from the source domain feature vectors, the target domain feature vectors, the source domain kernel matrix, and the target domain kernel matrix, and input the kernel sensitivities into the kernel sensitivity discriminator to obtain the KSA loss;

S8: Add the source domain classification loss, the LMMD loss, and the KSA loss to obtain the total loss, and optimize the model by stochastic gradient descent with minimization of the total loss as the objective;

S9: Determine whether the specified number of iterations has been reached. If so, end training, use the trained deep neural network model to diagnose mechanical faults under variable working conditions, and obtain the fault diagnosis result; otherwise, return to step S4.

Preferably, S1 and S2 specifically include:

Collect bearing vibration signals with known fault information as the source domain dataset $\mathcal{D}^S = \{\mathcal{X}^S, P^S(X^S)\}$, and take classification of the source domain dataset as the source task $\mathcal{T}^S = \{\mathcal{Y}^S, f^S(\cdot)\}$;

Collect bearing vibration signals with unknown fault information under other working conditions as the target domain dataset $\mathcal{D}^T = \{\mathcal{X}^T, P^T(X^T)\}$, and take classification of the target domain dataset as the target task $\mathcal{T}^T = \{\mathcal{Y}^T, f^T(\cdot)\}$;

where $\mathcal{X}^S$ and $\mathcal{X}^T$ denote the feature spaces of the source domain and the target domain respectively, $P^S(X^S)$ and $P^T(X^T)$ denote the probability distributions of the source domain and the target domain respectively, $X^S = \{x_i^s\}_{i=1}^{n_s}$ denotes the dataset consisting of the $n_s$ samples of the source domain, $X^T = \{x_j^t\}_{j=1}^{n_t}$ denotes the dataset consisting of the $n_t$ samples of the target domain, $\mathcal{Y}^S$ and $\mathcal{Y}^T$ denote the label spaces of the source task and the target task respectively, and $f^S(\cdot)$ and $f^T(\cdot)$ are the mapping functions of the source domain and the target domain, which represent the relationship between the samples of a dataset and the predictions;

The collected source domain dataset and target domain dataset are segmented with a sliding window to generate the source domain samples and target domain samples;

Each source domain sample and each target domain sample is normalized.
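The slicing and normalization described above can be sketched as follows. This is a minimal illustration rather than the patent's implementation: the function name `slice_and_normalize`, the `stride` parameter, and the choice of per-sample z-score normalization are assumptions, since the text only states that a sliding window is used and that every sample is normalized. The default window of 4096 points is the value the embodiment prefers.

```python
import numpy as np

def slice_and_normalize(signal, window=4096, stride=4096):
    """Cut a 1-D vibration signal into fixed-length windows and z-score each one.

    `stride` and the z-score scheme are illustrative assumptions.
    """
    signal = np.asarray(signal, dtype=np.float64)
    n_windows = (len(signal) - window) // stride + 1
    if n_windows <= 0:
        # Signal shorter than one window: no samples can be produced.
        return np.empty((0, window))
    # Index matrix of shape (n_windows, window): row r covers
    # samples [r * stride, r * stride + window).
    starts = np.arange(n_windows) * stride
    idx = starts[:, None] + np.arange(window)[None, :]
    samples = signal[idx]
    # Standardize every window independently (zero mean, unit variance).
    mean = samples.mean(axis=1, keepdims=True)
    std = samples.std(axis=1, keepdims=True)
    return (samples - mean) / (std + 1e-8)
```

A stride smaller than the window produces overlapping samples, which is a common way to enlarge a small dataset; whether the patent overlaps windows is not stated.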

Preferably, in step S3, the feature extractor comprises three one-dimensional convolutional layers, a flattening layer, and a fully connected layer arranged in sequence. The first two convolutional layers use larger convolution kernels and the last convolutional layer uses a smaller one, and each convolutional layer is followed by a max-pooling layer. Batch normalization and the Leaky ReLU function are applied after each convolutional layer, and the ReLU function is applied after the fully connected layer.

The label classifier comprises one fully connected layer whose input dimension is the dimension of the feature vector and whose output dimension is the number of bearing fault classes.

The kernel sensitivity discriminator comprises a gradient reversal layer (GRL) and three fully connected layers arranged in sequence; batch normalization, the ReLU function, and dropout are applied after each fully connected layer.

Preferably, step S4 specifically includes:

For a source domain sample $x_i^s$ and a target domain sample $x_j^t$, the feature extractor $G(\cdot)$ maps $x^s$ and $x^t$ through $G(x_i^s)$ and $G(x_j^t)$ into a common feature space, where $G(x_i^s), G(x_j^t) \in \mathbb{R}^D$ denote the D-dimensional feature vectors of the source domain and the target domain.

Preferably, step S5 specifically includes:

The source domain feature vector $G(x_i^s)$ and the target domain feature vector $G(x_j^t)$ are fed into the label classifier $C(\cdot)$ for prediction, giving the prediction results

$$z_i^s = C(G(x_i^s)), \qquad z_j^t = C(G(x_j^t)),$$

where $z_i^s, z_j^t \in \mathbb{R}^K$ are the score vectors of the source domain and the target domain respectively, and K is the number of sample classes;

From $z^s$ and the true source domain labels $y_i^s$, the source domain classification loss is computed with the standard cross-entropy formula, and the classification model formed by the feature extractor $G(\cdot)$ and the label classifier $C(\cdot)$ is trained by minimizing this loss through back-propagation. The classification loss $\mathcal{L}_{cls}$ of the model on the source domain is expressed as follows:

$$\mathcal{L}_{cls} = \frac{1}{n_s}\sum_{i=1}^{n_s} L_c(z_i^s, y_i^s)$$

where $L_c(\cdot,\cdot)$ is the cross-entropy loss function;

The target domain score vector $z^t$ is processed by the softmax function to obtain the vector $\hat{y}_j^t$. Each element $\hat{y}_{jk}^t$ of $\hat{y}_j^t$ represents the probability that $x_j^t$ belongs to class k, and is computed as follows:

$$\hat{y}_{jk}^t = \frac{\exp(z_{jk}^t)}{\sum_{m=1}^{K}\exp(z_{jm}^t)}$$

$\hat{y}_j^t$ is used as the pseudo-label of $x_j^t$.
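To make step S5 concrete, the following sketch computes the source domain cross-entropy loss and the softmax pseudo-labels of the target domain from raw score vectors. The function names are illustrative assumptions, and integer class indices are assumed for the source labels; only the softmax and cross-entropy definitions themselves come from the text.

```python
import numpy as np

def softmax(z):
    """Row-wise softmax of a (n, K) score matrix, shifted for numerical stability."""
    z = np.asarray(z, dtype=np.float64)
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def source_classification_loss(z_s, y_s):
    """Mean cross-entropy between source scores z_s (n, K) and integer labels y_s (n,)."""
    p = softmax(z_s)
    n = p.shape[0]
    # Pick the predicted probability of the true class of every sample.
    return float(-np.log(p[np.arange(n), y_s] + 1e-12).mean())

def target_pseudo_labels(z_t):
    """Soft pseudo-labels: the class-probability vector of every target sample."""
    return softmax(z_t)
```

Keeping the pseudo-labels as probability vectors, rather than taking the argmax, lets a later weighting step use the classifier's confidence for each class.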

Preferably, in step S6, the LMMD module aligns the distributions of same-class data in the source domain and the target domain so that the conditional distributions of the two domains become the same. LMMD is defined as follows:

$$d_{\mathcal{H}}(p, q) \triangleq \mathbb{E}_c \left\| \mathbb{E}_{p^{(c)}}\left[\Phi(x^s)\right] - \mathbb{E}_{q^{(c)}}\left[\Phi(x^t)\right] \right\|_{\mathcal{H}}^2$$

where $x^s$ and $x^t$ are samples of the source domain and the target domain, $\mathbb{E}$ denotes the mathematical expectation, $p^{(c)}$ and $q^{(c)}$ are the distributions of class c in the source domain and the target domain respectively, $\mathcal{H}$ is the reproducing kernel Hilbert space (RKHS) induced by the chosen kernel function $k(\cdot,\cdot)$, and $\Phi$ denotes the feature map that maps the original data into the RKHS;

Defining the parameter $w^c$ as the weight with which each sample belongs to each class, the unbiased estimate of LMMD is defined as follows:

$$\hat{d}_{\mathcal{H}}(p, q) = \frac{1}{C}\sum_{c=1}^{C} \left\| \sum_{i=1}^{n_s} w_i^{sc}\,\Phi(x_i^s) - \sum_{j=1}^{n_t} w_j^{tc}\,\Phi(x_j^t) \right\|_{\mathcal{H}}^2$$

where C is the number of classes (the quantity written K in step S5), $w_i^{sc}$ and $w_j^{tc}$ denote the weights with which the i-th source sample $x_i^s$ and the j-th target sample $x_j^t$ belong to class c, $\sum_{i=1}^{n_s} w_i^{sc}$ and $\sum_{j=1}^{n_t} w_j^{tc}$ both equal one, and $\sum_{x_i \in \mathcal{D}} w_i^c\,\Phi(x_i)$ is the weighted sum of the class-c samples. The weight $w_i^c$ is computed as follows:

$$w_i^c = \frac{y_{ic}}{\sum_{(x_j, y_j) \in \mathcal{D}} y_{jc}}$$

where $y_{ic}$ is the c-th element of the vector $y_i$. For a source domain sample $x_i^s$, the one-hot encoding of the true source domain label $y_i^s$ is used to compute $w_i^{sc}$. For each unlabeled target domain sample $x_j^t$ in unsupervised domain adaptation, $\hat{y}_j^t$ is used as a pseudo-label to compute $w_j^{tc}$. The LMMD distance between the source domain feature vectors $G(x_i^s)$ and the target domain feature vectors $G(x_j^t)$ is computed as follows:

$$\hat{d}_{\mathcal{H}}(p, q) = \frac{1}{C}\sum_{c=1}^{C}\left[ \sum_{i=1}^{n_s}\sum_{j=1}^{n_s} w_i^{sc} w_j^{sc}\, k\big(G(x_i^s), G(x_j^s)\big) + \sum_{i=1}^{n_t}\sum_{j=1}^{n_t} w_i^{tc} w_j^{tc}\, k\big(G(x_i^t), G(x_j^t)\big) - 2\sum_{i=1}^{n_s}\sum_{j=1}^{n_t} w_i^{sc} w_j^{tc}\, k\big(G(x_i^s), G(x_j^t)\big) \right]$$

where $k(\cdot,\cdot)$ denotes the kernel function;

The kernel matrix K is then computed. It is composed of the inner-product matrices $K_{s,s}$, $K_{t,t}$, $K_{s,t}$, and $K_{t,s}$ defined on the source domain, the target domain, and across the domains respectively:

$$K = \begin{bmatrix} K_{s,s} & K_{s,t} \\ K_{t,s} & K_{t,t} \end{bmatrix}$$

To express the LMMD distance with the kernel matrix, each element $W_{ij}$ of the weight matrix W is defined as follows:

$$W_{ij} = \frac{1}{C}\sum_{c=1}^{C} \begin{cases} w_i^{sc} w_j^{sc}, & x_i, x_j \in \mathcal{D}^S \\ w_i^{tc} w_j^{tc}, & x_i, x_j \in \mathcal{D}^T \\ -\,w_i^{sc} w_j^{tc}, & x_i \in \mathcal{D}^S,\ x_j \in \mathcal{D}^T \\ -\,w_i^{tc} w_j^{sc}, & x_i \in \mathcal{D}^T,\ x_j \in \mathcal{D}^S \end{cases}$$

Based on the kernel matrix K and the weight matrix W, the LMMD loss is expressed as follows:

$$\mathcal{L}_{lmmd} = \sum_{i,j} W_{ij} K_{ij} = \operatorname{tr}(KW)$$
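The kernel-matrix form of the LMMD loss can be illustrated with a short NumPy sketch. It assumes a single Gaussian kernel with bandwidth `sigma`, averages over all classes, and does not mask classes that are absent from a batch; the function names and the return signature are hypothetical. The weight matrix is built from a signed weight vector (positive for source rows, negative for target rows), which reproduces the sign pattern of the four source/target cases.

```python
import numpy as np

def gaussian_kernel(a, b, sigma=1.0):
    """Gaussian kernel matrix between the rows of a and the rows of b."""
    sq = (a ** 2).sum(1)[:, None] + (b ** 2).sum(1)[None, :] - 2.0 * a @ b.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma ** 2))

def class_weights(y):
    """Normalize each class column of y (n, C) so that it sums to one.

    Columns with zero total mass (class absent from the batch) stay zero.
    """
    y = np.asarray(y, dtype=np.float64)
    col = y.sum(axis=0, keepdims=True)
    return np.divide(y, col, out=np.zeros_like(y), where=col > 0)

def lmmd_loss(f_s, f_t, y_s_onehot, y_t_prob, sigma=1.0):
    """Return (loss, K, W) for source/target features and (pseudo-)labels."""
    feats = np.vstack([f_s, f_t])
    K = gaussian_kernel(feats, feats, sigma)
    # Signed weights: +w for source rows, -w for target rows.
    v = np.vstack([class_weights(y_s_onehot), -class_weights(y_t_prob)])
    # W[i, j] = (1 / C) * sum_c v[i, c] * v[j, c]
    W = (v @ v.T) / v.shape[1]
    return float((W * K).sum()), K, W
```

When both domains carry identical features and labels, the weighted class means coincide and the loss is zero up to rounding, which is a quick sanity check for any implementation.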

Preferably, step S7 specifically includes:

The sums of the sample inner products in the RKHS are formed separately for the source domain and for the target domain. The corresponding source domain kernel matrix $K_{s,s}$ and target domain kernel matrix $K_{t,t}$ are expressed as follows:

$$[K_{s,s}]_{ij} = k\big(G(x_i^s), G(x_j^s)\big), \qquad [K_{t,t}]_{ij} = k\big(G(x_i^t), G(x_j^t)\big)$$

The kernel sensitivity $s_i$ of each sample is obtained by taking the partial derivatives of the source domain kernel matrix and the target domain kernel matrix with respect to the sample, where $G(\cdot)_d$ denotes the d-th element of the feature vector, and $x_i^s$ and $x_j^t$ are a source domain sample and a target domain sample respectively.

The kernel sensitivities are input into the kernel sensitivity discriminator $D_m(\cdot)$, and the KSA loss is computed with binary cross entropy from the discriminator's binary classification results and the domain labels:

$$\mathcal{L}_{ksa} = \frac{1}{n_s}\sum_{i=1}^{n_s} L_b\big(D_m(s_i), d_i\big) + \frac{1}{n_t}\sum_{j=1}^{n_t} L_b\big(D_m(s_j), d_j\big)$$

where $L_b(\cdot,\cdot)$ is the binary cross-entropy loss function, $d_i = 0$ is the source domain label, and $d_j = 1$ is the target domain label.
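The following sketch shows one plausible reading of the kernel sensitivity computation. It is an assumption, not the patent's formula: it uses a Gaussian kernel with bandwidth `sigma` and takes the sensitivity of sample i to be the gradient of the within-domain sum of kernel values with respect to each element of that sample's feature vector. For the Gaussian kernel this gradient has a closed form, so no automatic differentiation is needed. The function names are hypothetical.

```python
import numpy as np

def kernel_sum(f, sigma=1.0):
    """Sum of all pairwise Gaussian kernel values among the rows of f (n, D)."""
    sq = ((f[:, None, :] - f[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2.0 * sigma ** 2)).sum()

def kernel_sensitivity(f, sigma=1.0):
    """Gradient of kernel_sum with respect to every feature element; shape (n, D).

    Assumed definition: s[i, d] = d(kernel_sum) / d f[i, d].
    """
    diff = f[:, None, :] - f[None, :, :]                   # diff[i, j] = f_i - f_j
    k = np.exp(-(diff ** 2).sum(-1) / (2.0 * sigma ** 2))  # (n, n) kernel values
    # Each pair (i, j) appears twice in the double sum, hence the factor 2:
    # d/df_i sum_{p,q} k(f_p, f_q) = 2 * sum_j k_ij * (-(f_i - f_j) / sigma^2)
    return -2.0 / sigma ** 2 * (k[:, :, None] * diff).sum(axis=1)
```

Under this reading, each sensitivity vector points toward nearby samples of the same domain, so it encodes where a sample sits relative to the others. That would be consistent with the text's claim that the method is sensitive to spatial position, although the text itself does not spell this out.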

Preferably, in step S8, the total loss is expressed as follows:

$$\mathcal{L} = \mathcal{L}_{cls} + \lambda_1 \mathcal{L}_{lmmd} + \lambda_2 \mathcal{L}_{ksa}$$

where $\lambda_1$ and $\lambda_2$ are two balance parameters, $\mathcal{L}_{cls}$ is the source domain classification loss, $\mathcal{L}_{lmmd}$ is the LMMD loss, and $\mathcal{L}_{ksa}$ is the KSA loss. Back-propagation is used to train the parameters of the deep neural network model with the objective of minimizing the total loss $\mathcal{L}$;

The parameters $\theta_f$ of the feature extractor, $\theta_c$ of the label classifier, and $\theta_m$ of the kernel sensitivity discriminator are updated through back-propagation, where $\eta$ denotes the learning rate.
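For orientation, a parameter update of the following form would be consistent with the components described: a classification loss, an LMMD loss, and an adversarial KSA loss routed through a gradient reversal layer. This is a hedged reconstruction under the usual gradient-reversal convention, not the patent's own equations. The loss names $\mathcal{L}_{cls}$, $\mathcal{L}_{lmmd}$, $\mathcal{L}_{ksa}$ are labels assumed here for the three losses, and the pseudo-label weights inside the LMMD loss are assumed to be treated as constants, so that loss sends no gradient to $\theta_c$.

$$\theta_f \leftarrow \theta_f - \eta\left(\frac{\partial \mathcal{L}_{cls}}{\partial \theta_f} + \lambda_1 \frac{\partial \mathcal{L}_{lmmd}}{\partial \theta_f} - \lambda_2 \frac{\partial \mathcal{L}_{ksa}}{\partial \theta_f}\right)$$

$$\theta_c \leftarrow \theta_c - \eta\,\frac{\partial \mathcal{L}_{cls}}{\partial \theta_c}$$

$$\theta_m \leftarrow \theta_m - \eta\,\lambda_2\,\frac{\partial \mathcal{L}_{ksa}}{\partial \theta_m}$$

The minus sign in front of the KSA term for $\theta_f$ is the effect of the gradient reversal layer: the discriminator is trained to tell the domains apart from their kernel sensitivities, while the feature extractor is pushed to make them indistinguishable.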

The technical solution provided by the present invention has the following beneficial effects:

The present invention discloses a mechanical fault diagnosis method under variable working conditions based on a kernel sensitivity alignment network. The method combines global domain adaptation and subdomain adaptation to construct a subdomain-adaptive deep neural network model with kernel sensitivity alignment. In this model, the Local Maximum Mean Discrepancy (LMMD) is implemented as the subdomain adaptation to align the conditional distributions. Based on the model, the present invention also proposes an adversarial learning method with kernel sensitivity alignment to overcome the shortcomings of LMMD. Compared with traditional adversarial domain adaptation methods, the proposed adversarial learning method is sensitive to spatial position and can significantly reduce domain shift by discriminating the relationships between sample features. The present invention addresses the technical problem of low mechanical fault diagnosis accuracy under variable working conditions in the prior art.

Brief Description of the Drawings

The present invention is further described below with reference to the accompanying drawings and embodiments, in which:

FIG. 1 is an overall flow chart of a mechanical fault diagnosis method under variable working conditions based on a kernel sensitivity alignment network in an embodiment of the present invention;

FIG. 2 is the framework of the subdomain-adaptive deep neural network model based on kernel sensitivity alignment in an embodiment of the present invention;

FIG. 3 is the experimental platform of the Paderborn dataset in an embodiment of the present invention;

FIG. 4 shows the prediction accuracy of each method on the target domain for the C→A task of the Paderborn dataset in an embodiment of the present invention;

FIG. 5 shows the confusion matrices for the B→C task of the Paderborn dataset in an embodiment of the present invention; FIG. 5(a) corresponds to the DSAN method and FIG. 5(b) corresponds to the method of the present invention;

FIG. 6 shows the t-SNE results for the B→C task of the Paderborn dataset in an embodiment of the present invention; FIG. 6(a) corresponds to the DSAN method and FIG. 6(b) corresponds to the method of the present invention.

Detailed Description

For a clearer understanding of the technical features, purposes, and effects of the present invention, specific embodiments of the present invention are now described in detail with reference to the accompanying drawings.

Referring to FIG. 1, the present invention provides a mechanical fault diagnosis method under variable working conditions based on a kernel sensitivity alignment network, which specifically includes the following steps:

As a preferred embodiment, bearing faults are taken as an example of mechanical faults to describe in detail the mechanical fault diagnosis method under variable working conditions based on a kernel sensitivity alignment network:

As shown in FIG. 1, the method can be summarized in three parts: collecting bearing fault data under variable working conditions, constructing the subdomain-adaptive deep neural network model based on kernel sensitivity alignment, and training the model.

(1) Bearing fault data collection under variable working conditions:

In real mechanical fault diagnosis scenarios, the main cause of the distribution shift between training data and test data is the change in the working state of the machine caused by frequent changes in speed, load, or operation. Therefore, in the problem definition, changes in these attributes define the different working conditions of the bearing, and different working conditions are referred to as different domains.

A domain $\mathcal{D}$ can be defined by two parts: a feature space $\mathcal{X}$ and a probability distribution $P(X)$, where $X = \{x_i\}_{i=1}^{n}$ is a dataset composed of samples from the domain $\mathcal{D}$. A task $\mathcal{T}$ is defined by a label space $\mathcal{Y}$ and a mapping function $f(\cdot)$, where $Y = \{y_i\}_{i=1}^{n}$ is the label set of the corresponding samples in the domain $\mathcal{D}$ ($y_i$ is the label of $x_i$). The mapping function $f(\cdot)$, also written $f(x) = P(y \mid x)$, represents the relationship between the samples of the dataset X and the predictions.

The present invention uses an accelerometer to collect bearing vibration signals with known fault information under one working condition as the source domain data $\mathcal{D}^S = \{\mathcal{X}^S, P^S(X^S)\}$, with classification of the source domain data as the source task $\mathcal{T}^S = \{\mathcal{Y}^S, f^S(\cdot)\}$. Bearing vibration signals with unknown fault information under other working conditions are collected as the target domain data $\mathcal{D}^T = \{\mathcal{X}^T, P^T(X^T)\}$, with the corresponding target task $\mathcal{T}^T = \{\mathcal{Y}^T, f^T(\cdot)\}$. Here $\mathcal{X}^S$ and $\mathcal{X}^T$ denote the feature spaces of the source domain and the target domain, $X^S = \{x_i^s\}_{i=1}^{n_s}$ denotes the dataset consisting of the $n_s$ samples of the source domain, $X^T = \{x_j^t\}_{j=1}^{n_t}$ denotes the dataset consisting of the $n_t$ samples of the target domain, $\mathcal{Y}^S$ and $\mathcal{Y}^T$ denote the label spaces of the source task and the target task respectively, and $f^S(\cdot)$ and $f^T(\cdot)$ are the mapping functions of the source domain and the target domain, which represent the relationship between the samples of a dataset and the predictions.

The collected vibration data is segmented with a sliding window to generate samples. The window size is generally chosen as a multiple of 2 close to the number of sampling points in two rotation periods of the bearing; in this embodiment, 4096 sampling points are preferably used as the time window. Every collected sample is normalized. Each working condition contains several fault types, including the normal state, outer ring faults, inner ring faults, and rolling element faults, and a fault type may occur with different damage sizes and in different combinations. This embodiment assumes that the source domain and the target domain share the same feature space and the same fault types, that is, $\mathcal{X}^S = \mathcal{X}^T$ and $\mathcal{Y}^S = \mathcal{Y}^T$, but that the distributions of the two domains differ, that is, $P^S(X^S) \neq P^T(X^T)$. The purpose of the present invention is to find a suitable mapping $f^{S \to T}(\cdot): X^S \to Y^T$ that reduces the distribution difference between the source domain and the target domain. Based on the labeled source domain data $X^S$ and $Y^S$ and the unlabeled target domain data $X^T$, a neural network is trained so that the distributions of the source domain and the target domain after the same mapping $f^{S \to T}(\cdot)$ are as consistent as possible.

(2) Constructing the subdomain-adaptive deep neural network model based on kernel sensitivity alignment:

As shown in FIG. 2, the model contains four modules: the feature extractor, the label classifier, the Local Maximum Mean Discrepancy module (LMMD module), and the kernel sensitivity discriminator (KSA module).

(2.1) Constructing the feature extractor:

The feature extractor acts as a mapping that reduces the dimension of the raw data and extracts effective features for subsequent use. Because the collected vibration signal is one-dimensional, the present invention uses a one-dimensional convolutional neural network to process the raw vibration data directly and extract features. The feature extractor consists of three one-dimensional convolutional layers, a flattening layer, and a fully connected layer. Two convolutional layers with large kernels and one convolutional layer with a small kernel are used, and each convolutional layer is followed by a max-pooling layer. Batch normalization and the Leaky ReLU function are applied after each convolutional layer, and the ReLU function is applied after the fully connected layer.

For a source domain sample $x_i^s$ and a target domain sample $x_j^t$, the feature extractor $G(\cdot)$ maps $x^s$ and $x^t$ through $G(x_i^s)$ and $G(x_j^t)$ into a common feature space, where $G(x_i^s), G(x_j^t) \in \mathbb{R}^D$ denote the D-dimensional feature vectors of the source domain and the target domain.

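(2.2) Constructing the label classifier:

The label classifier predicts the label of a sample from the feature vector extracted from that sample. In the present invention, the label classifier consists of one fully connected layer whose input dimension is the dimension of the feature vector and whose output dimension is the number of bearing fault classes.

The source domain feature vector $G(x_i^s)$ and the target domain feature vector $G(x_j^t)$ are fed into the label classifier $C(\cdot)$ for prediction, giving the prediction results

$$z_i^s = C(G(x_i^s)), \qquad z_j^t = C(G(x_j^t)),$$

where $z_i^s, z_j^t \in \mathbb{R}^K$ are the score vectors of the source domain and the target domain respectively, and K is the number of sample classes.

From $z^s$ and the true source domain labels $y_i^s$, the source domain classification loss is computed with the standard cross-entropy formula, and the classification model formed by $G(\cdot)$ and $C(\cdot)$ is trained by minimizing this loss through back-propagation. The classification loss $\mathcal{L}_{cls}$ of the model on the source domain is expressed as follows:

$$\mathcal{L}_{cls} = \frac{1}{n_s}\sum_{i=1}^{n_s} L_c(z_i^s, y_i^s)$$

where $L_c(\cdot,\cdot)$ is the cross-entropy loss function;

The target domain score vector $z^t$ is processed by the softmax function to obtain the vector $\hat{y}_j^t$. Each element $\hat{y}_{jk}^t$ of $\hat{y}_j^t$ represents the probability that $x_j^t$ belongs to class k:

$$\hat{y}_{jk}^t = \frac{\exp(z_{jk}^t)}{\sum_{m=1}^{K}\exp(z_{jm}^t)}$$

$\hat{y}_j^t$ is used as the pseudo-label of $x_j^t$.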

(2.3) Constructing the LMMD module:

To give the source domain data and the target domain data similar distributions after the mapping, the model is generally trained with domain adaptation. Domain adaptation extracts useful knowledge from one or more source tasks and applies it to a target task, where the distributions of the source domain and the target domain are different but related. A common strategy is to find a suitable metric for the similarity of the distributions and to optimize the model so that the distribution difference between the domains is minimized. The quality of the metric therefore directly affects the performance of the model.

Among the many statistical distance measures, the Maximum Mean Discrepancy (MMD) is the most widely used in transfer learning. It uses a kernel function to map the data samples of the source domain and the target domain into a reproducing kernel Hilbert space (RKHS), takes the difference between the means of the two domains' samples in the RKHS, and uses this difference as the distance. MMD is defined as follows:

$$\mathrm{MMD}_{\mathcal{H}}(X, Y) = \left\| \frac{1}{n}\sum_{i=1}^{n}\Phi(x_i) - \frac{1}{m}\sum_{j=1}^{m}\Phi(y_j) \right\|_{\mathcal{H}}$$

where $\mathcal{H}$ is the RKHS induced by the chosen kernel function $k(\cdot,\cdot)$, $\Phi(\cdot)$ denotes the feature map that maps the original data into the RKHS, n is the number of samples in X, and m is the number of samples in Y. In practice, the square of MMD is usually used as the measure of distribution difference. The MMD distance between the source domain features $G(x_i^s)$ and the target domain features $G(x_j^t)$ is computed as follows:

$$\mathrm{MMD}^2 = \frac{1}{n_s^2}\sum_{i=1}^{n_s}\sum_{j=1}^{n_s} k\big(G(x_i^s), G(x_j^s)\big) + \frac{1}{n_t^2}\sum_{i=1}^{n_t}\sum_{j=1}^{n_t} k\big(G(x_i^t), G(x_j^t)\big) - \frac{2}{n_s n_t}\sum_{i=1}^{n_s}\sum_{j=1}^{n_t} k\big(G(x_i^s), G(x_j^t)\big)$$

where the kernel function $k(x_i, x_j) = \langle \Phi(x_i), \Phi(x_j) \rangle$ is the inner product of two samples in the RKHS. The present invention uses the Gaussian kernel function, defined as follows:

$$k(x_i, x_j) = \exp\left(-\frac{\|x_i - x_j\|^2}{2\sigma^2}\right)$$

where $\sigma$ is the bandwidth of the kernel function. To simplify the computation, the kernel matrix is introduced. It is composed of the inner-product matrices $K_{s,s}$, $K_{t,t}$, $K_{s,t}$, and $K_{t,s}$ defined on the source domain, the target domain, and across the domains respectively:

$$K = \begin{bmatrix} K_{s,s} & K_{s,t} \\ K_{t,s} & K_{t,t} \end{bmatrix}$$

Define L as a weight matrix whose elements $L_{ij}$ are computed as follows:

$$L_{ij} = \begin{cases} \dfrac{1}{n_s^2}, & x_i, x_j \in \mathcal{D}^S \\ \dfrac{1}{n_t^2}, & x_i, x_j \in \mathcal{D}^T \\ -\dfrac{1}{n_s n_t}, & \text{otherwise} \end{cases}$$

With the kernel matrix defined above, the squared MMD distance between the source and target features can be written as:

$$\mathrm{MMD}^2 = \operatorname{tr}(KL)$$
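The kernel-matrix form of the squared MMD can be checked numerically with a few lines of NumPy. The sketch assumes a single Gaussian kernel with bandwidth `sigma`; the function names are illustrative. The weight matrix L is built as the outer product of a coefficient vector that is $1/n_s$ on source rows and $-1/n_t$ on target rows, which yields the three cases of $L_{ij}$.

```python
import numpy as np

def gaussian_kernel_matrix(x, sigma=1.0):
    """Gaussian kernel matrix among the rows of x (n, D)."""
    sq = ((x[:, None, :] - x[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2.0 * sigma ** 2))

def mmd2(f_s, f_t, sigma=1.0):
    """Squared MMD between source and target features, computed as tr(K L)."""
    n_s, n_t = len(f_s), len(f_t)
    K = gaussian_kernel_matrix(np.vstack([f_s, f_t]), sigma)
    # +1/n_s on source rows, -1/n_t on target rows; the outer product gives
    # 1/n_s^2, 1/n_t^2 and -1/(n_s * n_t) in the corresponding blocks.
    a = np.concatenate([np.full(n_s, 1.0 / n_s), np.full(n_t, -1.0 / n_t)])
    L = np.outer(a, a)
    return float(np.trace(K @ L))
```

Because L has rank one, the trace equals the quadratic form of `a` with K, so the result is non-negative for any positive semi-definite kernel and vanishes when the two sample sets are identical.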

MMD已被广泛用于测量源域和目标域之间的分布差异。然而，基于MMD的经典方法只对齐源域和目标域的全局分布，很少考虑不同工作条件下特征和输出标签的子域分布差异。这可能会失去每个类别的细粒度信息，导致区分结构的混淆。由于不同子域的数据之间的距离太近，在分类边界附近会出现错误。MMD has been widely used to measure the distribution difference between the source and target domains. However, classic MMD-based methods only align the global distribution of the source and target domains, and rarely consider the subdomain distribution differences of features and output labels under different working conditions. This may lose the fine-grained information of each category, leading to confusion in distinguishing structures. Since the distance between data from different subdomains is too close, errors will occur near the classification boundary.

To address these challenges, the present invention uses the Local Maximum Mean Discrepancy (LMMD) instead of the conventional MMD to achieve subdomain adaptation. A subdomain is one category within the source domain or the target domain and contains the samples of a single fault category. The core of subdomain adaptation is learning the transfer between local domains. Using per-sample weights, LMMD computes the mean discrepancy in the RKHS between samples of the same subdomain in the source domain and in the target domain. In this way, LMMD aligns the distributions of same-category data in the two domains so that their conditional distributions match. It is defined as follows:

where the two weights denote how strongly the i-th source sample and the j-th target sample, respectively, belong to class c, and the weighted sum of the samples is taken over class c. The weight w^c of a sample for class c is computed as follows:

where y_ic is the c-th element of the vector y_i. For a source domain sample, the weight is computed from the one-hot encoding of its true source domain label. For each unlabeled target domain sample in unsupervised domain adaptation, the predicted class-probability vector is used as a pseudo-label to compute the weight of the target sample. The LMMD distance between the source domain feature vectors and the target domain feature vectors is then computed as follows:
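The weighting scheme described above can be sketched in NumPy as follows. This is a minimal sketch under stated assumptions, not the patented implementation: the Gaussian kernel form, the averaging over the number of classes, the rule that a class contributes only when it appears in both the source labels and the target argmax predictions of the batch, and all function names and toy data are assumptions made for illustration.

```python
import numpy as np

def gaussian_kernel(a, b, sigma=1.0):
    """Gaussian kernel matrix between the rows of a (n, d) and b (m, d)."""
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dist / (2.0 * sigma ** 2))

def softmax(scores):
    """Row-wise softmax of classifier scores, used as soft pseudo-labels."""
    shifted = scores - scores.max(axis=1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=1, keepdims=True)

def class_weights(y):
    """w[i, c] = y[i, c] / sum_j y[j, c]; classes with zero mass get weight 0."""
    y = np.asarray(y, dtype=float)
    col_sum = y.sum(axis=0, keepdims=True)
    return np.divide(y, col_sum, out=np.zeros_like(y), where=col_sum > 0)

def lmmd(fs, ft, ys_onehot, yt_prob, sigma=1.0):
    """Class-weighted squared MMD, averaged over the number of classes."""
    ws, wt = class_weights(ys_onehot), class_weights(yt_prob)
    k_ss = gaussian_kernel(fs, fs, sigma)
    k_tt = gaussian_kernel(ft, ft, sigma)
    k_st = gaussian_kernel(fs, ft, sigma)
    n_classes = yt_prob.shape[1]
    # Assumption: skip classes missing from the source labels or from the
    # target argmax predictions of this batch.
    in_source = np.asarray(ys_onehot).sum(axis=0) > 0
    in_target = np.bincount(yt_prob.argmax(axis=1), minlength=n_classes) > 0
    loss = 0.0
    for c in np.flatnonzero(in_source & in_target):
        a, b = ws[:, c], wt[:, c]
        loss += a @ k_ss @ a + b @ k_tt @ b - 2.0 * a @ k_st @ b
    return float(loss / n_classes)

rng = np.random.default_rng(0)
labels = np.array([0, 0, 1, 1, 2, 2])
ys_onehot = np.eye(3)[labels]
fs = rng.normal(size=(6, 3))
ft = fs + 1.0                                   # shifted copy as a toy target domain
yt_prob = softmax(rng.normal(size=(6, 3)) + 4.0 * ys_onehot)  # toy classifier scores

print(lmmd(fs, ft, ys_onehot, yt_prob))         # positive: the domains differ
print(lmmd(fs, fs, ys_onehot, ys_onehot))       # close to 0: identical domains
```

Each column of the weight matrices sums to one, so every class term compares two weighted mean embeddings, one per domain.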

(2.4) Constructing the kernel sensitivity discriminator (KSA module)

Computing the LMMD loss for a given class requires that both the source domain and the target domain contain samples of that class in the current batch. Depending on the batch size, random sampling, and the prediction accuracy of the model, the loss for some classes is computed only a few times, so the source and target distributions of those classes are left unaligned. Moreover, when the predicted pseudo-labels are wrong, training the feature extractor with the LMMD loss aligns the source and target domains on the wrong classes, which makes the model more likely to mispredict target domain samples.

To address these limitations of LMMD, the present invention proposes a new method called Kernel Sensitivity Alignment (KSA) to further reduce the shift between the source domain and the target domain. The kernel sensitivity of a sample can be regarded as its ability to influence the sum of the inner products, in the RKHS, of all samples in the same domain. In deep learning models the mapped RKHS is usually highly complex, so only two samples that lie very close together are likely to have similar kernel sensitivities.

Based on this kernel sensitivity alignment method, a kernel sensitivity discriminator is constructed. It consists of a gradient reversal layer (GRL) followed by three fully connected layers arranged in sequence, with batch normalization, a ReLU function, and a dropout function applied after each fully connected layer.

By definition, a kernel function takes vectors in the original space as input and returns the inner product of the corresponding vectors in the feature space. Therefore, the two within-domain terms in formula (3) are the sums of the sample inner products of the source domain and of the target domain in the RKHS. The corresponding source domain kernel matrix K_{s,s} and target domain kernel matrix K_{t,t} are expressed as follows:

Each element of a kernel matrix measures the correlation between two samples in the mapped high-dimensional space. The sensitivity of the inner product sum to a sample can be obtained by taking the partial derivative of the kernel matrix with respect to that sample. The kernel sensitivity s_i of each sample in the source domain and the target domain is computed as follows:

where G(·)_d denotes the d-th element of the feature vector, and x^s and x^t are source domain samples and target domain samples, respectively.
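One way to read the kernel sensitivity described above is as the gradient of the sum of all entries of a within-domain kernel matrix with respect to the feature vector of one sample. The sketch below follows that reading for a Gaussian kernel, where the gradient has the closed form s_i = (2/σ²)·Σ_k K_ik·(f_k - f_i), and checks it against a finite difference. The exact formula of the patent is not reproduced in this text, so this interpretation, the kernel form, and the function names are assumptions.

```python
import numpy as np

def kernel_sum(f, sigma=1.0):
    """Sum of all entries of the Gaussian kernel matrix of the rows of f."""
    sq_dist = ((f[:, None, :] - f[None, :, :]) ** 2).sum(axis=-1)
    return float(np.exp(-sq_dist / (2.0 * sigma ** 2)).sum())

def kernel_sensitivity(f, sigma=1.0):
    """Row i holds d(kernel_sum)/d(f_i), one entry per feature dimension d."""
    diff = f[None, :, :] - f[:, None, :]            # diff[i, k] = f_k - f_i
    K = np.exp(-(diff ** 2).sum(axis=-1) / (2.0 * sigma ** 2))
    # Factor 2: sample i appears once as a row and once as a column of K.
    return 2.0 * (K[:, :, None] * diff).sum(axis=1) / sigma ** 2

rng = np.random.default_rng(1)
features = rng.normal(size=(5, 3))                  # toy features of one domain
s = kernel_sensitivity(features)

# Finite-difference check of one entry (sample 1, dimension 2).
eps = 1e-6
plus, minus = features.copy(), features.copy()
plus[1, 2] += eps
minus[1, 2] -= eps
numeric = (kernel_sum(plus) - kernel_sum(minus)) / (2.0 * eps)
print(s[1, 2], numeric)
```

In an automatic differentiation framework the same quantity could be obtained by differentiating the kernel sum directly, which also covers kernels without a convenient closed form.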

To make the kernel sensitivity distributions of the two domains consistent, a kernel sensitivity discriminator D_m(·) is used to judge whether a kernel sensitivity comes from the source domain or the target domain, while the feature extractor G(·) tries to confuse it. Provided that the source domain data remain correctly classified, this adversarial learning further reduces the distribution difference between the two domains. From the binary classification output of the kernel sensitivity discriminator and the domain labels, the KSA loss is computed with binary cross-entropy as follows:
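The binary cross-entropy computation described above can be sketched in plain Python. The discriminator outputs are replaced by hypothetical probabilities of the "target" class, the domain labels follow the convention stated in the claims (0 for the source domain, 1 for the target domain), and summing the two per-domain means is an assumption about the normalization.

```python
import math

def binary_cross_entropy(probs, label, eps=1e-12):
    """Mean BCE of predicted probabilities against one constant domain label."""
    total = 0.0
    for p in probs:
        p = min(max(p, eps), 1.0 - eps)      # guard against log(0)
        total += -(label * math.log(p) + (1 - label) * math.log(1.0 - p))
    return total / len(probs)

def ksa_loss(p_source, p_target):
    """Source sensitivities carry label 0, target sensitivities carry label 1."""
    return binary_cross_entropy(p_source, 0) + binary_cross_entropy(p_target, 1)

# Hypothetical discriminator outputs: P(domain = target) for each sensitivity.
confused = ksa_loss([0.5, 0.5, 0.5, 0.5], [0.5, 0.5, 0.5, 0.5])
separated = ksa_loss([0.1, 0.2, 0.1, 0.05], [0.9, 0.8, 0.95, 0.9])
print(confused, separated)
```

A discriminator that cannot tell the domains apart outputs 0.5 everywhere and incurs a loss of 2·ln 2, which is the state the feature extractor tries to reach.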

(3) Training the model

The model framework of the method of the present invention is shown in FIG. 2. The backbone of the model is a classification model that predicts target domain labels, consisting of a one-dimensional convolutional neural network (1D-CNN) as the feature extractor and a fully connected layer as the label classifier. To address the domain shift problem, the present invention adds two modules, LMMD and KSA. The LMMD module takes four inputs: the source domain features f^s, the target domain features f^t, the true source domain labels y_s, and the target domain pseudo-labels.

The KSA module (kernel sensitivity discriminator) makes the kernel sensitivity distributions of the source and target domains consistent through adversarial learning between the feature extractor G(·) and the kernel sensitivity discriminator D_m(·). The kernel matrices of the source domain and the target domain are available once the LMMD module has been evaluated. From these kernel matrices, the kernel sensitivities of the source domain samples and the target domain samples are computed and fed into the kernel sensitivity discriminator for binary classification.

Based on the above, the overall optimization objective of the proposed subdomain adaptive deep neural network model based on kernel sensitivity alignment, that is, the overall loss function, consists of three parts: the classification loss on the source domain, the LMMD loss, and the KSA loss. First, to ensure classification accuracy, the classification loss on the source domain is minimized. Second, the LMMD loss is minimized as subdomain adaptation, so that the conditional distributions of the source domain and the target domain become consistent. Third, the KSA loss is maximized as global domain adaptation, to align the marginal distributions of the source and target domains and further reduce the domain shift. The overall optimization objective can therefore be expressed as:

where λ₁ and λ₂ are two balance parameters, and the three loss terms are the classification loss on the source domain, the LMMD loss, and the KSA loss.

In the present invention, the parameters of the deep neural network are trained by backpropagation with the goal of minimizing the total loss. Note that KSA is an adversarial method: the feature vector is first fed into a gradient reversal layer (GRL) and only then into the kernel sensitivity discriminator. The parameters θ_f of the feature extractor, θ_c of the label classifier, and θ_m of the kernel sensitivity discriminator are updated by backpropagation as follows:

where η is the learning rate.
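The role of the gradient reversal layer mentioned above can be illustrated without a deep learning framework: it passes its input through unchanged in the forward pass and negates the gradient in the backward pass, so that a single minimization step trains the discriminator while pushing the feature extractor in the opposite direction. The class below is a hypothetical illustration; the scaling factor `lam` is an assumption and is not specified in the text.

```python
import numpy as np

class GradientReversal:
    """Identity in the forward pass; negated, scaled gradient in the backward pass."""

    def __init__(self, lam=1.0):
        self.lam = lam  # hypothetical scaling of the reversed gradient

    def forward(self, x):
        return x

    def backward(self, grad_output):
        return -self.lam * grad_output

grl = GradientReversal(lam=0.5)
features = np.array([1.0, -2.0, 3.0])
out = grl.forward(features)                       # unchanged features
grad_from_discriminator = np.array([0.2, 0.4, -0.6])
grad_to_extractor = grl.backward(grad_from_discriminator)
print(out, grad_to_extractor)
```

Because the sign flip happens inside the layer, the discriminator parameters still follow ordinary gradient descent on the KSA loss, while the feature extractor receives the reversed gradient.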

(4) Simulation experiment

(4.1) Experimental setup

The dataset used in this simulation experiment is the Paderborn University (PU) dataset, provided by the KAT Bearing Data Center of Paderborn University. The experimental platform is shown in FIG. 3. The basic components of the test rig are a drive motor (permanent magnet synchronous motor) 1 that also serves as a sensor, a torque measurement device 2, a bearing test module 3, a flywheel 4, and a load motor (synchronous servo motor) 5. In the Paderborn dataset, type 6203 rolling bearings manufactured by FAG, MTK, and IBU are used for the fault diagnosis tests. The stator current of the drive motor 1 and the acceleration vibration signal measured on the bearing housing are the main measured variables of the bearing test rig.

The experiments use the acceleration vibration signals, sampled at 64 kHz, for fault diagnosis and analysis. The PU dataset comprises 6 healthy bearings and 26 faulty bearings. The faulty bearings consist of 14 sets damaged in accelerated degradation experiments and 12 sets with artificially induced damage. Compared with artificial damage, bearings damaged by accelerated degradation are more prone to compound faults, and individual bearings differ in failure mode, damage severity, and so on. To reflect real conditions more closely, the monitoring data of the healthy bearings were collected over different periods of bearing operation. Each bearing set is treated as one class, so the PU dataset has 32 classes. Varying the radial force on the bearing and the load torque on the drive system yields three working conditions in the PU dataset. The data are described in detail below.

Table 1. Paderborn dataset settings

Following the standard protocol for unsupervised domain adaptation experiments, all labeled source domain data and all unlabeled target domain data are used as training data. For the three working conditions A, B, and C of the PU dataset described in Table 1, a total of 6 transfer learning tasks are performed. To verify the transfer learning ability of the proposed method under different working conditions, four state-of-the-art unsupervised domain adaptation methods are selected for comparison: Deep Adaptation Network (DAN), DeepCoral, Domain-Adversarial Neural Network (DANN), and Deep Subdomain Adaptation Network (DSAN).
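The six transfer tasks are the ordered (source, target) pairs of the three working conditions. A short, purely illustrative enumeration:

```python
from itertools import permutations

conditions = ["A", "B", "C"]

# Every ordered pair of distinct conditions is one source -> target task.
tasks = [f"{src}→{tgt}" for src, tgt in permutations(conditions, 2)]
print(tasks)
```

Each pair appears in both directions because adapting from A to B is a different task from adapting from B to A.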

For a fair comparison, the present invention uses the 1D-CNN as the base network, and every domain adaptation method is built on this network. The model parameters are trained with the stochastic gradient descent (SGD) optimizer, with the momentum set to 0.9 and the weight decay set to 5×10^-4. The learning rate is adjusted dynamically by the formula η_θ = η₀/(1+αθ)^β, where θ is the training progress, which increases linearly from 0 to 1, η₀ = 0.01, α = 10, and β = 0.75. The batch size is set to 32 and the number of epochs to 200. To avoid disturbing the label classifier too much early in training, the balance parameters λ₁ and λ₂ are increased gradually according to 2/(1+exp(-γθ))-1, with γ = 3 for λ₁ and γ = 5 for λ₂. The experiments were run on a Linux server with an Intel(R) Xeon(R) Gold 5117 CPU and an NVIDIA GeForce RTX 3090 graphics card. The model was built with the PyTorch deep learning framework, using the GPU to accelerate computation.
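The two schedules above can be written down directly. The sketch below uses the stated constants (η₀ = 0.01, α = 10, β = 0.75, γ = 3 and 5, 200 epochs); taking the training progress as θ = epoch / total epochs is an assumption, since the text only says that θ increases linearly from 0 to 1, and the function names are illustrative.

```python
import math

def learning_rate(theta, eta0=0.01, alpha=10.0, beta=0.75):
    """eta_theta = eta0 / (1 + alpha * theta) ** beta."""
    return eta0 / (1.0 + alpha * theta) ** beta

def balance_weight(theta, gamma):
    """2 / (1 + exp(-gamma * theta)) - 1, rising from 0 toward 1."""
    return 2.0 / (1.0 + math.exp(-gamma * theta)) - 1.0

total_epochs = 200
for epoch in (0, 50, 100, 200):
    theta = epoch / total_epochs        # assumed definition of training progress
    print(epoch,
          learning_rate(theta),
          balance_weight(theta, gamma=3.0),   # lambda_1
          balance_weight(theta, gamma=5.0))   # lambda_2
```

Both balance weights start at exactly 0, so the first updates are driven by the source classification loss alone, and the larger γ makes λ₂ ramp up faster than λ₁.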

(4.2) Fault diagnosis results

Table 2. Experimental results on the Paderborn dataset

The experimental results on the Paderborn dataset are shown in Table 2. Diagnosing bearing faults from accelerated degradation is generally harder than diagnosing artificially introduced faults. In addition, the Paderborn dataset has 32 classes with a complex fault composition. During domain adaptation, the features shared across domains are not pronounced, which makes adaptation more difficult. All models achieve relatively high accuracy on tasks A→B and B→A and lower accuracy on the other tasks. Working conditions A and B have the same radial force but different load torques, indicating that the load torque has a considerable influence on the vibration signal. On every task, the proposed method improves clearly on each of DAN, DeepCoral, DANN, and DSAN. Specifically, on the tasks other than A→B and B→A, the proposed method improves accuracy by about 10%. Overall, the method of the present invention achieves the highest average accuracy, 84.7±1.0, which demonstrates its strong domain adaptation ability.

(4.3) Model analysis

In practice, a transfer learning model must not only predict well but also keep its accuracy stable; the predictions of a model whose accuracy fluctuates strongly are unreliable. Prediction stability can be compared by examining the curve of target domain prediction accuracy recorded at each epoch of the experiment. From the recorded data, the accuracy curves of the different methods on task C→A of the Paderborn dataset are plotted in FIG. 4. The method of the present invention not only achieves the highest accuracy among the compared methods, but its accuracy curve also fluctuates less.

In addition, the confusion matrices of the target domain predictions of different methods on task B→C of the Paderborn dataset are analyzed: FIG. 5(a) shows the DSAN method and FIG. 5(b) the method of the present invention. The rows of a confusion matrix are the true labels of the samples, and the columns are the model predictions. DSAN, which aligns subdomains using LMMD only, is completely wrong on the samples whose true labels are 10 and 20: it mistakes samples with label 10 for label 9 or 11, and misidentifies every sample with label 20 as label 23. By contrast, the recognition accuracy of the method of the present invention exceeds 50% for every class, with none of the complete misclassifications seen with DSAN, which is attributed to the proposed kernel sensitivity alignment method.
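With rows as true labels and columns as predictions, the per-class recognition accuracy discussed above is the diagonal entry of each row divided by the row sum. The sketch below uses a made-up three-class matrix, not the results of FIG. 5; a class that is always misclassified shows up as an accuracy of 0.

```python
def per_class_accuracy(confusion):
    """confusion[i][j] = number of samples with true label i predicted as j."""
    accuracies = []
    for i, row in enumerate(confusion):
        total = sum(row)
        accuracies.append(row[i] / total if total else float("nan"))
    return accuracies

# Toy confusion matrix (illustrative values only).
toy_confusion = [
    [8, 2, 0],    # class 0: mostly correct
    [0, 0, 10],   # class 1: every sample predicted as class 2
    [1, 1, 8],    # class 2: mostly correct
]
accuracies = per_class_accuracy(toy_confusion)
print(accuracies)
```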

To compare intuitively the quality of the features obtained by different domain adaptation methods, the output of the deep neural network can be visualized and assessed after dimensionality reduction. The present invention applies t-SNE to embed the test set features from the last hidden layer of the feature extractor into a two-dimensional space and visualizes them. FIG. 6 shows the visualization for task B→C of the Paderborn dataset, where gray dots are source domain samples and black "X" marks are target domain samples. The figure shows that with DSAN, which aligns subdomains using LMMD only, some classes of source domain samples have no target domain samples aligned with them at all. In the visualization of the method of the present invention, source and target samples of the same class lie closer together, while samples of different classes are well separated.

The above experiments show that the method of the present invention transfers well across different working conditions and clearly outperforms all of the compared algorithms. In addition, the kernel sensitivity alignment method of the present invention can be implemented conveniently and effectively in other domain adaptation networks. This advantage allows the kernel sensitivity alignment method to be applied widely in different fields, and it has good application prospects.

It should be noted that the bearing fault diagnosis example above is only one preferred embodiment of the present invention. The mechanical fault diagnosis method under variable working conditions based on the kernel sensitivity alignment network is equally applicable to other types of mechanical fault diagnosis, such as motor faults, gear faults, automobile transmission faults, and fan faults. The specific implementation is similar to the embodiment above and achieves good fault diagnosis results, so it is not repeated here.

It should be noted that, in this document, the terms "comprise", "include", and any other variants thereof are intended to cover a non-exclusive inclusion, so that a process, method, article, or system that comprises a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or system. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or system that comprises the element.

The serial numbers of the embodiments of the present invention above are for description only and do not indicate the relative merits of the embodiments. In a unit claim enumerating several devices, several of these devices may be embodied by one and the same item of hardware. The words first, second, third, and so on do not denote any order and may be interpreted as labels.

The above are only preferred embodiments of the present invention and do not limit the patent scope of the present invention. Any equivalent structure or equivalent process transformation made using the contents of the description and drawings of the present invention, or any direct or indirect application in other related technical fields, is likewise included in the scope of patent protection of the present invention.

Claims

1. A method for diagnosing mechanical faults under variable working conditions based on a kernel sensitivity alignment network, characterized by comprising the following steps:

S1: Collect data from different working conditions of mechanical equipment to form source domain datasets and target domain datasets;

S2: Slice the source domain dataset and the target domain dataset to obtain multiple source domain samples and target domain samples, and normalize each source domain sample and target domain sample;

S3: Construct a subdomain adaptive deep neural network model based on kernel sensitivity alignment, including: feature extractor, label classifier, LMMD module and kernel sensitivity discriminator;

S4: Input the normalized source domain samples and target domain samples into the feature extractor respectively to obtain the feature vector of the source domain and the feature vector of the target domain;

S5: Input the feature vector of the source domain into the label classifier, and use the prediction result and the source domain label to calculate the classification loss of the source domain; input the feature vector of the target domain into the label classifier to obtain the pseudo label of the target domain;

S6: Input the feature vector of the source domain, the feature vector of the target domain, the source domain label and the pseudo label of the target domain into the LMMD module to generate the source domain kernel matrix, the target domain kernel matrix and the LMMD loss;

S7: Calculate the corresponding kernel sensitivities of source domain samples and target domain samples according to the feature vector of the source domain, the feature vector of the target domain, the source domain kernel matrix, and the target domain kernel matrix, and input the kernel sensitivity into the kernel sensitivity discriminator to obtain the KSA loss;

S8: Add the classification loss of the source domain, the LMMD loss, and the KSA loss to obtain the total loss; take minimizing the total loss as the optimization objective, and optimize the model using the stochastic gradient descent method;

S9: Determine whether the specified number of iterations has been reached. If so, end the training and use the trained deep neural network model to perform mechanical fault diagnosis under variable working conditions and obtain fault diagnosis results; otherwise, return to step S4.

2. The method for diagnosing mechanical faults under variable working conditions based on a kernel sensitivity alignment network according to claim 1, characterized in that S1 and S2 specifically comprise:

Classify the source domain dataset as the source task

Collect bearing vibration signals with unknown fault information under other working conditions as the target domain data set

Classify the target domain dataset as the target task

wherein the source domain dataset consists of n_s samples in the source domain, the source task and the target task each have their own label space, and f^S(·) and f^T(·) are the mapping functions of the source domain and the target domain, which describe the relationship between the samples of a dataset and the prediction results;

The collected source domain data set and target domain data set are segmented through a sliding window to generate source domain samples and target domain samples;

Each source domain sample and target domain sample is normalized.

3. The method for diagnosing mechanical faults under variable working conditions based on a kernel sensitivity alignment network according to claim 1, characterized in that in step S3, the feature extractor comprises three one-dimensional convolutional layers, a flattening layer, and a fully connected layer arranged in sequence; the convolution kernels of the first two convolutional layers are larger and the convolution kernel of the last convolutional layer is smaller, and each convolutional layer is followed by a max pooling layer; batch normalization and a Leaky ReLU function are applied after each convolutional layer, and a ReLU function is applied after the fully connected layer.

4. The method for diagnosing mechanical faults under variable working conditions based on a kernel sensitivity alignment network according to claim 1, characterized in that in step S3, the label classifier comprises a fully connected layer whose input dimension is the dimension of the feature vector and whose output dimension is the number of bearing fault categories.

5. The bearing fault diagnosis method under variable working conditions based on a kernel sensitivity alignment network according to claim 1, characterized in that in step S3, the kernel sensitivity discriminator comprises a gradient reversal layer (GRL) and three fully connected layers arranged in sequence, and batch normalization, a ReLU function, and a dropout function are applied after each fully connected layer.

6. The method for diagnosing mechanical faults under variable working conditions based on a kernel sensitivity alignment network according to claim 2, characterized in that step S4 specifically comprises:

For the source domain samples and the target domain samples, the feature extractor G(·) maps x^s and x^t into a common feature space, producing the feature vector of the source domain and the feature vector of the target domain.

7. The method for diagnosing mechanical faults under variable working conditions based on a kernel sensitivity alignment network according to claim 6, characterized in that step S5 specifically comprises:

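The feature vector of the source domain and the feature vector of the target domain are input into the label classifier to obtain the score vectors z^s and z^t, respectively; the classification loss of the source domain is calculated from z^s and the true source domain labels, and is expressed as follows: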

Where L _c (·,·) is the cross entropy loss function;

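The score vector z^t of the target domain is processed by the softmax function to obtain a probability vector, each element of which represents the probability that the target domain sample belongs to the corresponding class; this probability vector is used as the pseudo-label of the target domain sample.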

8. The method for diagnosing mechanical faults under variable working conditions based on a kernel sensitivity alignment network according to claim 2, characterized in that in step S6, the LMMD module is used to align the distributions of same-category data in the source domain and the target domain so that the conditional distributions of the two domains are the same, and the LMMD is defined as follows:

where x^s and x^t are samples of the source domain and the target domain, E denotes the mathematical expectation, and p^(c) and q^(c) are the distributions of class c in the source domain and the target domain, respectively;

Define the parameter w^c as the weight with which each sample belongs to each category; the unbiased estimate of LMMD is then defined as follows:

where the two weights denote how strongly the i-th source sample and the j-th target sample, respectively, belong to class c, and the weighted sum of the samples is taken over class c; the weight w^c of a sample for class c is calculated as follows:

where y_ic is the c-th element of the vector y_i; for a source domain sample, the weight is calculated from the one-hot encoding of its true source domain label; for each unlabeled target domain sample in unsupervised domain adaptation, the predicted class-probability vector is used as a pseudo-label to calculate the weight of the target sample; the LMMD distance between the feature vectors of the source domain and the feature vectors of the target domain is calculated as follows:

Where k(·,·) represents the kernel function;

Calculate the kernel matrix K: this matrix is composed of the inner product matrices K_{s,s}, K_{t,t}, K_{s,t}, and K_{t,s}, defined on the source domain, on the target domain, and across the two domains, and is expressed as follows:

The LMMD distance is expressed using the kernel matrix method, where each element W_ij of the weight matrix W is defined as follows:

Based on the kernel matrix K and the weight matrix W, the LMMD loss is expressed as follows:

9. The method for diagnosing mechanical faults under variable working conditions based on a kernel sensitivity alignment network according to claim 2, characterized in that step S7 specifically comprises:

The sums of the sample inner products within the source domain and within the target domain in the RKHS correspond to the source domain kernel matrix K_{s,s} and the target domain kernel matrix K_{t,t}, which are expressed as follows:

The kernel sensitivity s _i of each sample is obtained by taking partial derivatives of the source domain kernel matrix and the target domain kernel matrix, and is calculated as follows:

where G(·)_d denotes the d-th element of the feature vector, and x^s and x^t are source domain samples and target domain samples, respectively;

The kernel sensitivity is input into the kernel sensitivity discriminator D_m(·), and the KSA loss is calculated with binary cross-entropy from the binary classification output of the discriminator and the domain labels, as follows:

where L_b(·,·) is the binary cross-entropy loss function, d_i = 0 is the source domain label, and d_j = 1 is the target domain label.

10. The method for diagnosing mechanical faults under variable working conditions based on a kernel sensitivity alignment network according to claim 1, characterized in that in step S8, the expression of the total loss is as follows:

where λ₁ and λ₂ are two balance parameters, and the three loss terms are the classification loss of the source domain, the LMMD loss, and the KSA loss; the parameters of the deep neural network model are trained by backpropagation with the goal of minimizing the total loss;

The parameters θ _f of the feature extractor, θ _c of the label classifier, and θ _m of the kernel sensitivity discriminator are updated through back-propagation as follows:

where η is the learning rate.