CN116106012A

CN116106012A - A Domain Adaptive Fault Diagnosis Method for Rolling Bearings Based on Attention Mechanism

Info

Publication number: CN116106012A
Application number: CN202111330197.1A
Authority: CN
Inventors: 杜劲松; 王煜; 高洁; 王伟; 杨旭
Original assignee: Shenyang Institute of Automation of CAS
Current assignee: Shenyang Institute of Automation of CAS
Priority date: 2021-11-11
Filing date: 2021-11-11
Publication date: 2023-05-12

Abstract

The invention relates to an attention-based domain-adaptive fault diagnosis method for rolling bearings. A feature extractor built from one-dimensional separable convolutions with embedded channel attention and length attention mechanisms extracts deep fault features from collected rolling bearing vibration monitoring signals. A local attention domain adaptation module and a global attention domain adaptation module are constructed to select signals and signal segments with good transferability, which improves the generalization ability of the model so that it can better handle fault diagnosis under variable working conditions. Compared with existing transfer-learning-based intelligent fault diagnosis algorithms, the algorithm takes into account that different signals and signal segments differ in transferability, and it improves the interpretability of the model. Experiments on several sets of bearing vibration data verify that the algorithm performs stably and maintains excellent diagnosis results under a variety of working-condition changes.

Description

A domain-adaptive fault diagnosis method for rolling bearings based on attention mechanism

Technical Field

The present invention belongs to the field of bearing fault diagnosis under variable working conditions, and specifically relates to an attention-based domain-adaptive fault diagnosis method for rolling bearings.

Background Art

As key components of rotating machinery, rolling bearings operate in relatively harsh environments and are highly susceptible to damage during service, which can cause industrial accidents that are difficult to predict and control. In mild cases such accidents affect the operation of local equipment and reduce production efficiency; in severe cases they cause serious production safety accidents with irreversible consequences.

With the rapid rise of technologies such as the Industrial Internet of Things, mechanical fault diagnosis has entered the "big data" era, and intelligent fault diagnosis algorithms based on deep learning and similar methods have become a research hotspot in recent years. However, existing deep-learning-based fault diagnosis algorithms generally assume that the training set and the test set are independent and identically distributed, which does not match real operation, where heavy noise and complex variable working conditions are common. Transfer learning algorithms, which can cope with this mismatch, have therefore gradually become a research hotspot for intelligent fault diagnosis.

However, existing transfer-learning-based intelligent fault diagnosis algorithms lack interpretability and do not consider that different signals and signal segments may differ in transferability. It is therefore crucial to propose an intelligent fault diagnosis technique with stronger generalization ability and a degree of interpretability.

Summary of the Invention

In view of the shortcomings of the prior art, the present invention provides an attention-based domain-adaptive fault diagnosis method for rolling bearings, which improves the generalization ability of the model, can cope with various fault diagnosis problems, and also improves the interpretability of the model.

The technical solution adopted by the present invention to achieve the above purpose is:

An attention-based domain-adaptive fault diagnosis method for rolling bearings, comprising the following steps:

For the same bearing system, collecting vibration monitoring data for each bearing health condition under different working conditions, and labeling the data according to fault type to obtain bearing datasets for the different working conditions;

Dividing the data in the bearing datasets into a source-domain dataset and a target-domain dataset, each of which is further divided into a training set and a validation set;

Constructing a rolling bearing domain-adaptive fault diagnosis model, and training the model using the training set of the source-domain dataset and the training set of the target-domain dataset;

Optimizing the rolling bearing domain-adaptive fault diagnosis model;

Inputting the test set of the target-domain data into the optimized rolling bearing domain-adaptive fault diagnosis model to obtain the fault type of the target-domain data.

The source-domain data are vibration monitoring data whose quantity exceeds a threshold and whose labels are complete, and the target-domain data are vibration monitoring data under the working condition to be diagnosed.

The constructed rolling bearing domain-adaptive fault diagnosis model comprises:

A feature extractor, used to extract fault features from the bearing dataset and divide the fault features into multiple segments;

A local attention domain adaptation module, used to calculate the local attention value of the fault feature of each segment, weight the feature accordingly, and merge the weighted features of all segments along the segment dimension;

A global attention domain adaptation module, used to calculate a global attention value from the merged, locally weighted features and weight it into a classification loss, completing the fault type classification of the input data.

The feature extractor comprises, connected in sequence, a spatial convolution, a length attention module, a channel convolution, a channel attention module and a pooling layer.

The local attention domain adaptation module comprises a maximum mean discrepancy (MMD) module, multiple domain classifiers and a residual connection module. The fault features of the multiple segments output by the feature extractor are first processed by the MMD module. The fault feature of each segment is then input into its own domain classifier, which calculates the probability that the segment belongs to the source domain, and an entropy function is used to calculate the local attention value of each segment's fault feature in order to weight it. The weighted fault features of all segments are merged along the segment dimension and output through the residual connection module.

The global attention domain adaptation module comprises one domain classifier. The weighted fault feature is input into this domain classifier, which calculates the probability that the fault feature belongs to the source domain, and an entropy function is used to calculate the global attention value of the fault feature in order to weight it.

In constructing the rolling bearing domain-adaptive fault diagnosis model, four types of loss functions are set, specifically:

1) Bearing fault classification loss on the source-domain dataset, where n_S is the number of source-domain samples, D_S is the source-domain sample space, G_y is the classifier, h_i is the local-attention-weighted feature, y_i is the sample label, and the per-sample loss is the cross-entropy loss function;

2) MMD loss used to reduce the classification difference between the source domain and the target domain, where n_t is the number of target-domain samples, the two sample arguments are signal samples from the source domain and the target domain, φ(·) is a nonlinear mapping that maps source-domain and target-domain samples into the same Hilbert space, and k(·,·) is a kernel function;

3) Local attention domain adaptation loss and global attention domain adaptation loss, used to extract local and global domain-invariant features from signal segments and signal samples, respectively, where D_S is the source-domain sample space, D_T is the target-domain sample space, d_i is the domain label of signal sample x_i, and the per-sample loss is the cross-entropy loss for domain classification;

4) Globally weighted entropy loss for auxiliary classification, where c is the number of fault types and p_{i,j} is the probability that signal sample x_i is classified as label j.
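The globally weighted entropy loss in item 4) can be illustrated with a short sketch. The formula image is not reproduced in this text, so the form below is an assumption: each sample's prediction entropy over the c fault types is multiplied by a global attention value m_i = 1 + H(d_i), where d_i is the domain classifier's source-domain probability. The function names and the choice of natural logarithms are hypothetical.

```python
import math

def binary_entropy(p, eps=1e-12):
    """Entropy (in nats) of a two-outcome variable with probability p."""
    p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))

def weighted_entropy_loss(class_probs, domain_probs, eps=1e-12):
    """Mean over samples of m_i * H(p_i), with m_i = 1 + H(d_i) (assumed form)."""
    total = 0.0
    for p_i, d_i in zip(class_probs, domain_probs):
        m_i = 1.0 + binary_entropy(d_i)  # global attention value (assumption)
        pred_entropy = -sum(p * math.log(max(p, eps)) for p in p_i)
        total += m_i * pred_entropy
    return total / len(class_probs)
```

Under this assumed form, a sample whose domain is hard to tell (d_i near 0.5) contributes more strongly, so confident class predictions are encouraged mainly on samples judged transferable.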

In the total loss function, θ_f, θ_y and θ_d are the model parameters of the feature extractor, the label classifier and the global attention domain adaptation module, respectively; the fourth parameter set belongs to the local attention domain adaptation module; and η, γ and λ are correction weights.
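The formula images of the original filing are not reproduced in this text. As a reading aid, one set of expressions consistent with the symbol definitions above is sketched below. It is an assumption modelled on the usual DANN-style adversarial objective, not the patent's own formulas: the names L_cls, L_MMD, L_l, L_g, L_e, ℓ_y, ℓ_d, f_i^k, G_d^k, θ_d^k, d̂_i, m_i and n = n_S + n_t, the choice m_i = 1 + H(d̂_i), and the assignment of η, γ and λ to particular terms are all hypothetical.

```latex
\begin{align}
L_{cls} &= \frac{1}{n_S} \sum_{x_i \in D_S} \ell_y\bigl(G_y(h_i),\, y_i\bigr) \\
L_{MMD} &= \Bigl\| \frac{1}{n_S}\sum_{i=1}^{n_S}\phi(x_i^s) - \frac{1}{n_t}\sum_{j=1}^{n_t}\phi(x_j^t) \Bigr\|_{\mathcal{H}}^2 \\
        &= \frac{1}{n_S^2}\sum_{i,i'} k(x_i^s, x_{i'}^s) + \frac{1}{n_t^2}\sum_{j,j'} k(x_j^t, x_{j'}^t) - \frac{2}{n_S n_t}\sum_{i,j} k(x_i^s, x_j^t) \\
L_l &= \frac{1}{K n}\sum_{k=1}^{K}\sum_{x_i \in D_S \cup D_T} \ell_d\bigl(G_d^k(f_i^k),\, d_i\bigr), \qquad
L_g = \frac{1}{n}\sum_{x_i \in D_S \cup D_T} \ell_d\bigl(G_D(h_i),\, d_i\bigr) \\
L_e &= -\frac{1}{n}\sum_{x_i \in D_S \cup D_T} \sum_{j=1}^{c} m_i\, p_{i,j} \log p_{i,j}, \qquad m_i = 1 + H(\hat d_i) \\
L(\theta_f, \theta_y, \theta_d, \theta_d^k) &= L_{cls} + \eta\, L_{MMD} + \gamma\, L_e - \lambda\,(L_l + L_g)
\end{align}
```

In this sketch the minus sign in front of λ reflects the adversarial set-up: the domain classifiers are trained to reduce L_l and L_g, while the feature extractor is trained to increase them so that its features become domain-invariant.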

The rolling bearing domain-adaptive fault diagnosis model is optimized using backpropagation and the Adam optimization algorithm so that the total loss function is minimized.

The model parameters are updated by an update formula in which ε is the learning rate.

The present invention has the following beneficial effects and advantages:

1. The present invention can autonomously select signals and signal segments with better transferability, which strengthens the domain adaptation capability of the model and improves the accuracy of fault diagnosis.

2. Compared with several currently popular transfer-learning-based intelligent fault diagnosis algorithms, the present invention achieves the best performance and robustness and can handle a variety of fault diagnosis problems under variable working conditions.

3. The present invention offers a degree of interpretability in the selection of transferable features.

Brief Description of the Drawings

FIG. 1 is a model structure diagram of the attention-based domain-adaptive fault diagnosis method for rolling bearings of the present invention.

Detailed Description

The present invention is described in further detail below with reference to the accompanying drawing and embodiments.

Step 1: For the same bearing system, collect vibration monitoring signals for each bearing health condition under different working conditions, and label the data according to fault type to obtain bearing datasets for the different working conditions;

Step 2: Select two of the datasets obtained in step 1 under different working conditions as the source domain and the target domain, respectively, and divide each into a training set and a test set;

Step 3: Use the training-set data of the source domain and the target domain as model input to train the model, optimize the hyperparameters by grid search, select the trained model with the highest diagnostic accuracy as the final diagnostic model, and fix that model;

Step 4: Test the performance of the final model obtained in step 3 on the test-set data of the target domain.

The one-dimensional separable convolution embedded with the channel attention mechanism and the length attention mechanism is as follows. The present invention uses one-dimensional separable convolution in place of conventional convolution and embeds two attention mechanisms in it. The model mainly comprises the following parts: a spatial convolution with a specific kernel size, which filters the signal samples in each channel; a length attention module, which aggregates the feature information of all channels to obtain the importance of each signal segment for fault diagnosis; a channel convolution, which uses a set of convolution kernels to learn the relationships between channels and also raises the feature dimension; and a channel attention module, which captures the interdependence between channels to judge feature importance and encourage the network to focus on important features.

The specific kernel size of the spatial convolution equals the kernel size of the conventional convolution it replaces.
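The order of operations described above (spatial convolution, length attention, channel convolution, channel attention, pooling) can be sketched in plain Python. This is a minimal illustration and not the patented network: the text does not give the internal form of the two attention modules, so the parameter-free sigmoid gates below are assumptions, as are all function names, the zero padding and the pooling size.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def spatial_conv(x, kernels):
    """Depthwise step: channel c is filtered only by kernels[c] (zero 'same' padding)."""
    out = []
    for channel, kernel in zip(x, kernels):
        pad = len(kernel) // 2
        padded = [0.0] * pad + channel + [0.0] * pad
        out.append([sum(k * padded[t + j] for j, k in enumerate(kernel))
                    for t in range(len(channel))])
    return out

def length_attention(x):
    """Gate each position by a score pooled across all channels (assumed form)."""
    gates = [sigmoid(sum(ch[t] for ch in x) / len(x)) for t in range(len(x[0]))]
    return [[v * g for v, g in zip(ch, gates)] for ch in x]

def channel_conv(x, weights):
    """Pointwise step: mix channels at each position; len(weights) sets the output width."""
    return [[sum(w * ch[t] for w, ch in zip(row, x)) for t in range(len(x[0]))]
            for row in weights]

def channel_attention(x):
    """Gate each channel by a score pooled along its length (assumed form)."""
    return [[v * sigmoid(sum(ch) / len(ch)) for v in ch] for ch in x]

def max_pool(x, size=2):
    return [[max(ch[t:t + size]) for t in range(0, len(ch) - size + 1, size)] for ch in x]

def separable_block(x, kernels, weights):
    x = length_attention(spatial_conv(x, kernels))    # attention after the spatial step
    x = channel_attention(channel_conv(x, weights))   # attention after the channel step
    return max_pool(x)

# Hypothetical input: 2 channels of 8 samples, expanded to 4 channels.
signal = [[float(t) for t in range(8)], [1.0] * 8]
kernels = [[0.0, 1.0, 0.0], [0.0, 1.0, 0.0]]
weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]]
features = separable_block(signal, kernels, weights)
```

The spatial step leaves the number of channels unchanged, while the channel step raises it from 2 to 4 here, which matches the stated role of the channel convolution in increasing the feature dimension.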

The transfer learning module composed of the local attention domain adaptation module and the global attention domain adaptation module is as follows.

The basic structure of the local attention domain adaptation module and the global attention domain adaptation module follows the domain classifier of the DANN model. The features output by the feature extractor are divided into K segments and input into the local attention domain adaptation module, which contains K domain classifiers, one for each signal segment. Each classifier calculates the probability that its segment belongs to the source domain, which indicates the transferability of the segment, and an entropy function converts this probability into a local attention value that is used to weight the features. To prevent negative transfer caused by an excessively large feature difference between the source domain and the target domain, the present invention uses the maximum mean discrepancy (MMD) to reduce the distribution difference between the two domains before the features enter the local attention domain adaptation module. The local-attention-weighted features are then input into the global attention domain adaptation module to measure which signals are more transferable. The global attention domain adaptation module consists of a single domain classifier G_D whose output is the probability that the corresponding input signal belongs to the source domain; an attention value is again calculated with the entropy function and weighted into a classification loss to assist classification.

The number of blocks K into which the input features are divided in the local attention domain adaptation module is determined by the output size of the last layer of the feature extractor and equals the product of the height and width of the feature map. Because the present invention targets one-dimensional vibration signals, the height is 1 and K equals the width of the input feature map.

The global parameters used when training the model are: a learning rate of 0.001, a batch size of 32, and 500 training rounds.

FIG. 1 shows the model structure of the attention-based domain-adaptive fault diagnosis method for rolling bearings of the present invention.

The network proposed by the present invention consists of two parts: a feature extractor and a domain adaptation module. The feature extractor is a stack of one-dimensional separable convolutions embedded with the channel attention mechanism and the length attention mechanism, and is used to extract fault-related features. The domain adaptation module consists of the local attention domain adaptation module and the global attention domain adaptation module; it selects signals and signal segments with good transferability, which improves the generalization ability of the model so that it can better handle fault diagnosis under variable working conditions.

The specific steps are as follows:

Step 1: Data acquisition and preprocessing.

For the same bearing system, vibration monitoring signals for each bearing health condition are collected under different working conditions, and the data are labeled according to fault type to obtain bearing datasets for the different working conditions. Two of these datasets are selected as the source domain and the target domain, respectively, and each is divided into a training set and a test set;
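The split in Step 1 can be sketched as follows. This is a minimal illustration under stated assumptions: the text does not give a split ratio, so the 80/20 fraction, the fixed seed, the function name `split_domain` and the toy signals and label names are all hypothetical.

```python
import random

def split_domain(samples, labels, train_fraction=0.8, seed=0):
    """Shuffle one working condition's labelled signals and split them into train/test parts."""
    order = list(range(len(samples)))
    random.Random(seed).shuffle(order)
    cut = int(len(order) * train_fraction)

    def pick(indices):
        return [samples[i] for i in indices], [labels[i] for i in indices]

    return pick(order[:cut]), pick(order[cut:])

# Hypothetical data: ten short signals from one working condition, labelled by fault type.
source_signals = [[float(i)] * 4 for i in range(10)]
source_labels = ["normal", "inner_race"] * 5
(source_train_x, source_train_y), (source_test_x, source_test_y) = split_domain(
    source_signals, source_labels)
```

The same function would be called once for the source-domain working condition and once for the target-domain working condition.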

Step 2: Extract fault features with the optimized feature extractor.

Using one-dimensional separable convolution in place of conventional convolution reduces the number of model parameters without affecting diagnostic accuracy. The length attention mechanism and the channel attention mechanism are embedded after the two steps of the one-dimensional separable convolution, respectively. The reason is that the spatial convolution in the one-dimensional separable convolution filters the different channels with mutually independent kernels, so each channel has its own distinct features. The length attention mechanism mainly aggregates feature information across channels, so placing it after the spatial convolution layer lets it draw on richer information. The spatial convolution also does not change the feature dimension. In the present invention, the channel convolution in the one-dimensional separable convolution is used not only to learn the relationships between channels by aggregating the output of the spatial convolution, but also to increase the dimension of the feature space. Placing the channel attention mechanism after the channel convolution ensures that this attention module is applied in the higher-dimensional feature space. After passing through the feature extractor G_f, the input signal sample x_i yields the fault-related features.
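The parameter saving mentioned above follows from simple counting. The sketch below compares the weight counts of a conventional one-dimensional convolution and a separable one, ignoring bias terms; the function names and the example sizes (16 input channels, 32 output channels, kernel size 15) are hypothetical.

```python
def standard_conv1d_params(c_in, c_out, kernel_size):
    """Every output channel has its own kernel for every input channel."""
    return c_in * c_out * kernel_size

def separable_conv1d_params(c_in, c_out, kernel_size):
    """Spatial step: one kernel per input channel; channel step: 1x1 mixing weights."""
    spatial = c_in * kernel_size
    pointwise = c_in * c_out
    return spatial + pointwise

standard = standard_conv1d_params(16, 32, 15)    # 16 * 32 * 15 = 7680
separable = separable_conv1d_params(16, 32, 15)  # 16 * 15 + 16 * 32 = 752
```

For these example sizes the separable layer needs roughly a tenth of the weights, and the gap widens as the kernel size grows.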

Step 3: Extract domain-invariant features with the optimized domain adaptation module.

After the signal sample x_i passes through the feature extractor G_f, its features are divided into K segments. The feature of each segment is input into the corresponding local domain classifier, which yields the probability that the segment belongs to the source domain. The domain classifier and the feature extractor form an adversarial relationship: through training, the domain classifier becomes unable to tell whether the features generated by the feature extractor belong to the source domain or the target domain, so that domain-invariant features are extracted. When the probability value is close to 0 or 1, the signal segment can be classified by the domain classifier and its transferability is poor. The probability value cannot be used directly to weight the features, so an entropy function is used to calculate the local attention value of each segment, where x is a value of the discrete random variable X and p is the probability distribution function of X.

A residual connection module is also added to the local domain classifier to prevent negative transfer caused by incorrect attention values, and the feature of each segment after local attention weighting is obtained accordingly.

The locally weighted segments are then merged along the segment dimension according to their positions before segmentation, so the dimension of the resulting feature h_i is the same as that of the feature before segmentation. After the local-attention-weighted feature h_i is input into the global domain classifier G_D, the probability that the signal belongs to the source domain is obtained. The entropy function is likewise used to calculate the global attention value.
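The attention computation of Step 3 can be sketched as follows. The formula images are not reproduced in this text, so the specific forms are assumptions chosen to match the stated reasoning that a probability near 0 or 1 means poor transferability: the local attention value is taken as the entropy w_k = H(d_k) of the segment's source-domain probability, the residual connection is taken as h_k = (1 + w_k) f_k, and the global attention value as m = 1 + H(d). All function names are hypothetical.

```python
import math

def binary_entropy(p, eps=1e-12):
    """Entropy (in nats) of a two-outcome variable with probability p."""
    p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))

def local_attention(segment_features, segment_domain_probs):
    """Weight each of the K segment features; the '1 +' keeps a residual path."""
    weighted = []
    for f_k, d_k in zip(segment_features, segment_domain_probs):
        w_k = binary_entropy(d_k)  # high when the domain classifier is unsure
        weighted.append([(1.0 + w_k) * v for v in f_k])
    return weighted  # same order and shape as the input, i.e. already merged

def global_attention(domain_prob):
    """Attention value of a whole signal from the global domain classifier output."""
    return 1.0 + binary_entropy(domain_prob)

# Hypothetical example: K = 2 segments with 2 feature channels each.
segments = [[1.0, 2.0], [3.0, 4.0]]
weighted = local_attention(segments, [0.5, 1.0])
```

With these inputs the first segment, whose domain is ambiguous, is amplified by a factor of 1 + ln 2, while the second, which the classifier assigns confidently to the source domain, passes through almost unchanged.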

Step 4: Loss function setting.

The loss functions of the present invention fall into the following four categories:

1) Bearing fault classification loss on the source-domain dataset, where n_S is the number of source-domain samples and D_S is the source-domain sample space;

2) MMD loss, where n_t is the number of target-domain samples; the two sample arguments are signal samples from the source domain and the target domain; φ(·) is a nonlinear mapping that maps source-domain and target-domain samples into the same Hilbert space; and k(·,·) is a kernel function used to calculate the inner product of two mapped samples, for example the inner product of a mapped source-domain sample and a mapped target-domain sample;

3) Local attention domain adaptation loss and global attention domain adaptation loss, used to extract local and global domain-invariant features from signal segments and signal samples, respectively. The signal segments are obtained by evenly dividing each signal sample into a fixed number of parts, and this number equals the number of blocks used when the fault features from the feature extractor are segmented. Here D_S is the source-domain sample space, D_T is the target-domain sample space, d_i is the domain label of the vibration signal sample x_i, and the per-sample loss is the cross-entropy loss for domain classification;

4) Globally weighted entropy loss for auxiliary classification, where c is the number of fault types and p_{i,j} is the probability that signal sample x_i is classified as label j.
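Because k(·,·) returns the inner product of two mapped samples, the MMD term can be estimated from kernel evaluations alone, without computing φ explicitly. The sketch below shows the standard biased estimate of the squared MMD between two sets of feature vectors. The text does not name the kernel, so the Gaussian kernel and its bandwidth `sigma` are assumptions, as are the function names.

```python
import math

def gaussian_kernel(x, y, sigma=1.0):
    """k(x, y) = exp(-||x - y||^2 / (2 * sigma^2)); the kernel choice is an assumption."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))

def mmd_squared(source, target, kernel=gaussian_kernel):
    """Biased estimate of the squared MMD between two sets of feature vectors."""
    def mean_kernel(xs, ys):
        return sum(kernel(x, y) for x in xs for y in ys) / (len(xs) * len(ys))

    return (mean_kernel(source, source)
            + mean_kernel(target, target)
            - 2.0 * mean_kernel(source, target))
```

The estimate is zero when both sets are identical and grows as the two feature distributions move apart, which is why minimizing it pulls the source-domain and target-domain features together.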

The total loss function of the present invention is therefore built from these losses. In it, θ_f, θ_y and θ_d are the model parameters of the feature extractor, the label classifier and the global domain classifier, respectively; the fourth parameter set belongs to the local domain classifiers; and η, γ and λ are correction weights.

Step 5: Optimization strategy setting.

The network is trained with standard backpropagation and the Adam optimization algorithm so that the total loss function is minimized. The parameters of each network component are updated by an update formula in which ε is the learning rate.
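For reference, one step of the standard Adam rule for a single scalar parameter is sketched below. It illustrates the optimizer named in the text and is not the patent's own update formula, which is not reproduced here. The default learning rate 0.001 and the 500 rounds come from the training settings stated earlier; `beta1`, `beta2` and `eps` are the commonly used defaults and are assumptions, as is the toy quadratic loss.

```python
import math

def adam_step(theta, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """Return the updated parameter and moment estimates after step t (t starts at 1)."""
    m = beta1 * m + (1.0 - beta1) * grad          # running mean of gradients
    v = beta2 * v + (1.0 - beta2) * grad * grad   # running mean of squared gradients
    m_hat = m / (1.0 - beta1 ** t)                # bias correction
    v_hat = v / (1.0 - beta2 ** t)
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v

# Toy loss (theta - 3)^2 in place of the total loss; its gradient is 2 * (theta - 3).
theta, m, v = 0.0, 0.0, 0.0
for t in range(1, 501):
    grad = 2.0 * (theta - 3.0)
    theta, m, v = adam_step(theta, grad, m, v, t)
```

Each step moves the parameter by about the learning rate in the direction that lowers the loss, largely independent of the gradient's raw magnitude.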

Step 6: Perform fault diagnosis on the target-domain test set.

Grid search is used to find the optimal hyperparameter combination, the model trained under that combination is fixed, and fault diagnosis is performed on the target-domain test set to evaluate model performance.
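A grid search of this kind can be sketched in a few lines. The text does not list which hyperparameters or candidate values are searched, so the grid over `eta` and `gamma` and the stand-in scoring function below are hypothetical; in practice the scoring function would train the model with the given values and return its diagnostic accuracy.

```python
from itertools import product

def grid_search(train_and_score, grid):
    """Evaluate every combination in `grid`; return (best_params, best_score)."""
    names = list(grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = train_and_score(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

def fake_score(eta, gamma):
    """Stand-in for 'train the model and return diagnostic accuracy'."""
    return -((eta - 0.1) ** 2) - ((gamma - 1.0) ** 2)

best_params, best_score = grid_search(
    fake_score, {"eta": [0.01, 0.1, 1.0], "gamma": [0.1, 1.0]})
```

The model trained with `best_params` would then be fixed and evaluated once on the target-domain test set.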

Claims

1. A rolling bearing domain adaptation fault diagnosis method based on attention mechanism, characterized by comprising the following steps:

For the same bearing system, vibration monitoring data of the health status of various types of bearings under different working conditions are collected respectively, and the data are labeled according to the fault type to obtain bearing data sets for different working conditions;

The data in the bearing dataset is divided into a source domain dataset and a target domain dataset, and then divided into a training set and a validation set respectively;

Construct a rolling bearing domain adaptive fault diagnosis model, and train the model using the training set of the source domain dataset and the training set of the target domain dataset;

Optimize the domain-adaptive fault diagnosis model for rolling bearings;

The test set of the target domain data is input into the optimized rolling bearing domain adaptive fault diagnosis model to obtain the fault type of the target domain data.

2. The rolling bearing domain adaptation fault diagnosis method based on an attention mechanism according to claim 1, characterized in that the source-domain data are vibration monitoring data whose quantity exceeds a threshold and whose labels are complete, and the target-domain data are vibration monitoring data under the working condition to be diagnosed.

3. The rolling bearing domain adaptation fault diagnosis method based on an attention mechanism according to claim 1, characterized in that the constructed rolling bearing domain-adaptive fault diagnosis model comprises:

A feature extractor, used to extract fault features from the bearing dataset and divide the fault features into multiple segments;

A local attention domain adaptation module, used to calculate the local attention value of the fault feature of each segment, weight the feature accordingly, and merge the weighted features of all segments along the segment dimension;

A global attention domain adaptation module, used to calculate a global attention value from the merged, locally weighted features and weight it into a classification loss, completing the fault type classification of the input data.

4. The rolling bearing domain adaptation fault diagnosis method based on an attention mechanism according to claim 3, characterized in that the feature extractor comprises, connected in sequence, a spatial convolution, a length attention module, a channel convolution, a channel attention module and a pooling layer.

5. The rolling bearing domain adaptation fault diagnosis method based on an attention mechanism according to claim 3, characterized in that the local attention domain adaptation module comprises a maximum mean discrepancy (MMD) module, multiple domain classifiers and a residual connection module, wherein the fault features of the multiple segments output by the feature extractor are processed by the MMD module, the fault feature of each segment is input into its own domain classifier, which calculates the probability that the segment belongs to the source domain, an entropy function is used to calculate the local attention value of each segment's fault feature in order to weight it, and the weighted fault features of all segments are merged along the segment dimension and output through the residual connection module.

6. The rolling bearing domain adaptation fault diagnosis method based on an attention mechanism according to claim 3, characterized in that the global attention domain adaptation module comprises one domain classifier, into which the weighted fault feature is input to calculate the probability that the fault feature belongs to the source domain, and an entropy function is used to calculate the global attention value of the fault feature in order to weight it.

7. The rolling bearing domain adaptation fault diagnosis method based on an attention mechanism according to claim 3, characterized in that four types of loss functions are set in constructing the rolling bearing domain-adaptive fault diagnosis model, specifically:

1) Bearing fault classification loss on the source-domain dataset, where n_S is the number of source-domain samples and D_S is the source-domain sample space;

2) MMD loss used to reduce the classification difference between the source domain and the target domain, where n_t is the number of target-domain samples;

3) Local attention domain adaptation loss and global attention domain adaptation loss, where D_S is the source-domain sample space, D_T is the target-domain sample space, d_i is the domain label of signal sample x_i, and the per-sample loss is the cross-entropy loss for domain classification;

4) Globally weighted entropy loss for auxiliary classification, where c is the number of fault types and p_{i,j} is the probability that signal sample x_i is classified as label j.

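8. The rolling bearing domain adaptation fault diagnosis method based on an attention mechanism according to claim 7, characterized in that, in the total loss function, θ_f, θ_y and θ_d are the model parameters of the feature extractor, the label classifier and the global attention domain adaptation module, respectively, the fourth parameter set belongs to the local attention domain adaptation module, and η, γ and λ are correction weights.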

9. The rolling bearing domain adaptation fault diagnosis method based on an attention mechanism according to claim 1 or 8, characterized in that the rolling bearing domain-adaptive fault diagnosis model is optimized using backpropagation and the Adam optimization algorithm so that the total loss function is minimized.

10. The rolling bearing domain adaptation fault diagnosis method based on an attention mechanism according to claim 8, characterized in that the model parameters are updated by an update formula in which ε is the learning rate.