WO2024021246A1 - Cross-device incremental bearing fault diagnosis method based on continuous learning - Google Patents

Info

Publication number
WO2024021246A1
Authority
WO
WIPO (PCT)
Application number
PCT/CN2022/118373
Other languages
English (en)
French (fr)
Inventor
沈长青
陈博戬
李林
孔林
谭陆洋
王冬
石娟娟
黄伟国
朱忠奎
Original Assignee
苏州大学
Application filed by 苏州大学
Publication of WO2024021246A1 (zh)

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01M TESTING STATIC OR DYNAMIC BALANCE OF MACHINES OR STRUCTURES; TESTING OF STRUCTURES OR APPARATUS, NOT OTHERWISE PROVIDED FOR
    • G01M13/00 Testing of machine parts
    • G01M13/04 Bearings
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Definitions

  • The present invention relates to the technical fields of bearing fault diagnosis and deep learning, and in particular to a cross-device incremental bearing fault diagnosis method based on continuous learning.
  • Fault diagnosis based on machine learning generally includes steps such as signal collection, feature extraction, and fault identification and prediction. This approach greatly simplifies the fault diagnosis process and improves diagnostic efficiency.
  • However, because most such models are shallow networks with simple structures and few layers, their effectiveness depends on the quality of the features extracted during preprocessing, and their ability to process large volumes of structurally complex equipment status signals is limited. Many researchers have therefore used the adaptive feature learning and extraction capabilities of deep learning to overcome the difficulty shallow models have in representing the complex mapping between signals and health conditions, and have achieved good results.
  • With the rapid development of transfer learning, and with the help of its cross-domain and cross-distribution knowledge mining and transfer capabilities, transfer learning solutions for problems with limited labeled samples (very few samples or none) or with changing working conditions have also been developed in the field of mechanical fault diagnosis.
  • However, transfer learning can only handle fault diagnosis for a single target task, that is, one transfer completed for a given source domain and target domain. Because mechanical equipment faults and operating conditions are diverse, the model's generalization ability drops sharply when it faces new tasks, and its versatility is poor. In addition, transfer learning does not accumulate knowledge, and it often performs poorly on equipment status identification under the working conditions of the source-domain data, which does not match practical engineering requirements.
  • The technical problem to be solved by the present invention is to overcome the problems of the prior art by proposing a cross-device incremental bearing fault diagnosis method based on continuous learning, addressing the inability of existing fault diagnosis models based on deep learning and transfer learning to handle cross-device bearing faults.
  • To solve the above technical problem, the present invention provides a cross-device incremental bearing fault diagnosis method based on continuous learning, which includes the following steps:
  • S101: Use acceleration sensors to collect bearing vibration signals on multiple different devices to build a cross-device incremental bearing health status data set, and divide the data set by device into bearing fault diagnosis tasks for different stages;
  • S102: In the initial stage, use the bearing fault diagnosis task data of the first device to train ResNet-32, build an initial diagnosis model, and select exemplars of each fault type learned in this stage;
  • S103: In the incremental stage, introduce neuron-level fine-tuning to modify the initial diagnosis model and obtain a dual-branch residual adaptive aggregation network, and replace the fully connected layer classifier of the initial diagnosis model with a nearest-neighbor exemplar classifier or a cosine normalized classifier to obtain the diagnosis model;
  • S104: Train the diagnosis model jointly on the exemplars and the bearing fault diagnosis task data of the next device, use the loss function of the incremental stage to reduce the difference in performance between the current-stage and previous-stage diagnosis models on the previous stage's diagnosis task data, and optimize the aggregation weights and model parameters through a bilevel optimization scheme. After training, select exemplars of each fault type learned in this stage;
  • S105: Repeat step S104. After the current stage's task has been learned, use the current diagnosis model to diagnose the bearing faults of all learned tasks, obtain the bearing fault diagnosis results, and verify the diagnosis model's ability to overcome catastrophic forgetting.
  • In one embodiment, acceleration sensors are used to collect bearing vibration signals on multiple different devices to construct a cross-device incremental bearing health status data set D, and the data set is divided by device into bearing fault diagnosis tasks for different stages.
  • T_n denotes the diagnosis task of the n-th device, and P_n is the number of fault data samples of task T_n. Each sample in T_n carries a health status label.
  • J_n denotes the number of fault types C_{0:n-1} = {C_0, C_1, …, C_{n-1}} learned before task T_n, and K_n denotes the number of fault types C_n learned in task T_n.
  • In one embodiment, the bearing fault diagnosis task data of the first device is used to train ResNet-32 and build an initial diagnosis model, and exemplars of each fault type learned in this stage are selected and stored.
  • After model training at this stage, the feature extractor F_0 is used with the herding algorithm to select training samples as exemplars of the fault types learned in this stage.
  • Neuron-level fine-tuning is introduced to modify the initial-stage diagnosis model and obtain a dual-branch residual adaptive aggregation network, which replaces the single-branch ResNet-32 of the initial-stage diagnosis model.
  • The dual-branch residual adaptive aggregation network includes a dynamic branch, which uses parameter-level fine-tuning, and a steady-state branch, which uses neuron-level fine-tuning.
  • In parameter-level fine-tuning, the dynamic branch is initialized with the initial diagnosis model parameters, and all parameters of the branch are then fine-tuned on the training data;
  • In neuron-level fine-tuning, the steady-state branch is initialized with the initial diagnosis model parameters, the network parameters are then frozen, each neuron is given a scaling weight, and the scaling weights are fine-tuned on the task of each stage.
  • Replacing the fully connected layer classifier of the initial diagnosis model with the nearest-neighbor exemplar classifier or the cosine normalized classifier includes the following:
  • The nearest-neighbor exemplar classifier classifies by computing the feature mean of each class's exemplars.
  • The cosine normalized classifier classifies by computing the cosine similarity between the features and the learned prototype of each class.
  • θ_0 is the parameter of the fully connected classification layer in the initial stage;
  • θ_n is the learned prototype of each class;
  • η is a learnable scaling parameter.
  • Training the diagnosis model jointly on the stored exemplars and the bearing fault diagnosis task data of the next device includes the following:
  • The dual-branch residual adaptive aggregation network is trained on the exemplars stored in the initial stage and the bearing fault diagnosis task data of the next device, and the dynamic residual block and steady-state residual block of each residual block layer are given adaptive aggregation weights ω_α and ω_β respectively.
  • The dual-branch residual adaptive aggregation network with adaptive aggregation weights extracts features from the training data x^[0]. At the n-th residual block layer, the features extracted by the dynamic residual block and the steady-state residual block are combined into the aggregated feature of that layer, where W_0 denotes the parameters frozen in the initial stage and f^[n] is the feature extraction process of the n-th residual block layer.
  • The loss function in the incremental stage includes a classification cross-entropy loss function, a classification-level knowledge distillation loss function and a feature-level knowledge distillation loss function;
  • The classification-level knowledge distillation loss function is computed from the soft labels of the old model and the soft predictions of the new model, and T denotes the temperature parameter;
  • The feature-level knowledge distillation loss is computed from the normalized features extracted by the current-stage and previous-stage diagnosis models, and measures the cosine similarity between the two.
  • Optimizing the aggregation weights and model parameters through a bilevel optimization scheme includes the following:
  • The bilevel optimization scheme consists of an upper-level problem and a lower-level problem;
  • The lower-level problem updates the model parameters Θ_n, where μ_1 is the learning rate of the lower-level problem;
  • The upper-level problem randomly samples the task data D_n learned in this stage to build balanced data and uses it to update the adaptive aggregation weights, where μ_2 is the learning rate of the upper-level problem.
  • The diagnosis model trained in the incremental stage must be able to complete all learned tasks, so the test data includes all learned fault classes C_{0:n} to verify the diagnosis model's ability to overcome catastrophic forgetting.
  • The present invention uses a continuous learning method to build a diagnosis model that continuously accumulates and reuses knowledge, meeting the needs of cross-device incremental bearing fault diagnosis. Compared with traditional deep learning methods, it can solve the problem of catastrophic forgetting and better matches practical industrial scenarios.
  • Figure 1 is a flow chart of the cross-device incremental bearing fault diagnosis method based on continuous learning provided by the present invention.
  • Figure 2 is a schematic structural diagram of the dual-branch residual adaptive aggregation network provided by the present invention.
  • Figure 1 is a flow chart of a specific embodiment of the cross-device incremental bearing fault diagnosis method based on continuous learning provided by the present invention. The specific operation steps are as follows:
  • Step S101: Use acceleration sensors to collect bearing vibration signals on multiple different devices to construct a cross-device incremental bearing health status data set, and divide the data set by device into bearing fault diagnosis tasks for different stages;
  • Step S102: In the initial stage, use the bearing fault diagnosis task data of the first device to train ResNet-32, build an initial diagnosis model, and select exemplars of each fault type learned in this stage;
  • Step S103: In the incremental stage, introduce neuron-level fine-tuning to modify the initial diagnosis model and obtain a dual-branch residual adaptive aggregation network, and replace the fully connected layer classifier of the initial diagnosis model with a nearest-neighbor exemplar classifier or a cosine normalized classifier to obtain the diagnosis model;
  • Step S104: Train the diagnosis model jointly on the exemplars and the bearing fault diagnosis task data of the next device, use the loss function of the incremental stage to reduce the difference in performance between the current-stage and previous-stage diagnosis models on the old task data, and optimize the aggregation weights and model parameters through a bilevel optimization scheme. After training, select exemplars of each fault type learned in this stage;
  • Step S105: Repeat step S104. After the current stage's task has been learned, use the current diagnosis model to diagnose the bearing faults of all learned tasks, obtain the bearing fault diagnosis results, and verify the diagnosis model's ability to overcome catastrophic forgetting.
  • The method first uses acceleration sensors to collect bearing vibration signals on multiple different devices to construct a cross-device incremental bearing health status data set, and divides the diagnosis tasks of the different stages by device, simulating the growth in diagnosis tasks that arises in practice when unexpected faults of sub-machines cause cross-device bearing faults.
  • The bearing fault diagnosis task data of the first device is used to train ResNet-32 and build the initial diagnosis model, and exemplars of each fault type are selected and stored. Neuron-level fine-tuning is then introduced to modify the initial diagnosis model, giving a dual-branch residual adaptive aggregation network that serves as the feature extractor in subsequent incremental stages, so that the model keeps a balance between plasticity and stability as it continues to learn new tasks.
  • The stored exemplars are trained together with the fault data of the next device to reawaken the model's memory of old knowledge and overcome the catastrophic forgetting of deep learning models. The aggregation weights maintain the balance between plasticity and stability, and the loss function of the incremental stage reduces the difference in performance between the new and old models on the old task data.
  • After the current stage's task has been learned, the current diagnosis model diagnoses the bearing faults of all learned tasks, giving the bearing fault diagnosis results and verifying the model's ability to overcome catastrophic forgetting. In short, the present invention uses a continuous learning method to build a diagnosis model that continuously accumulates and reuses knowledge, meeting the needs of cross-device incremental bearing fault diagnosis.
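  • Further, step S101 specifically includes the following steps:
  • Step S101.1: Use multiple test benches to collect the required experimental data and build a cross-device incremental bearing health status data set.
  • As listed in Table 1, bearing fault data sets from three machines are used to define the diagnosis tasks of the different stages. Each data set has 6 fault types, for 18 fault types in total, and each fault type has 100 training samples and 100 test samples.
  • The three data sets are learned sequentially. For example, ABC means that the diagnosis tasks of data sets A, B and C are completed in stage 0, incremental stage 1 and incremental stage 2 respectively.
  • Further, step S102 specifically includes the following steps:
  • S102.1: In the initial stage (stage 0), the data of task T_0 is used to train the original ResNet-32 on the fault classes C_0 with a classification cross-entropy loss, giving the initial diagnosis model Θ_0.
  • S102.2: After training, the feature extractor F_0 before the classification layer is used with the herding algorithm to select the exemplars M_0; the number of exemplars selected is 5.
  • Further, step S103 specifically includes the following steps: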
  • S103.1: Redesign and modify the initial diagnosis model, introducing neuron-level fine-tuning to represent model stability, to obtain a dual-branch residual adaptive aggregation network whose structure is shown in Figure 2. This network serves as the feature extractor in subsequent incremental stages so that the model keeps a balance between plasticity and stability as it continues to learn new tasks.
  • The parameter-level fine-tuning used by the dynamic branch is shown in Figure 2(a). During training, the branch is initialized with the initial model parameters and all of its parameters α are then fine-tuned on the training data. The dynamic branch can learn new tasks quickly and therefore represents the plasticity of the model.
  • The neuron-level fine-tuning used by the steady-state branch is shown in Figure 2(b). After initialization with the initial model parameters, the network parameters are frozen and each neuron is given a scaling weight β, which is then fine-tuned on the task of each stage.
  • The original ResNet-32 is replaced by the dual-branch residual adaptive aggregation network, whose structure is shown in Figure 2(c).
  • S103.2: Two classifiers that avoid the class bias problem, the nearest-neighbor exemplar classifier and the cosine normalized classifier, are used in place of the original fully connected layer classifier. Both avoid classification bias between old and new classes.
  • Further, step S104 specifically includes the following steps:
  • The training data x^[0] passes through the dual-branch aggregation network for feature extraction. At the n-th residual block layer, the dynamic residual block and the steady-state residual block each extract features, where W_0 denotes the parameters frozen in stage 0 and f^[n] is the feature extraction process of the n-th residual block layer; the two are combined into the aggregated feature of the n-th residual block layer.
  • The loss function in the incremental stage combines a classification cross-entropy loss and a knowledge distillation loss. The distillation loss uses the soft labels of the old model on the old fault classes and the hard labels of the new model on the old fault classes, and the temperature T is usually greater than 1.
  • The knowledge distillation loss narrows the difference between the performance of the new model and the old model on the old fault classes C_{0:n-1}: the similarity distribution of the old classes in the new model is approximately constrained to their similarity distribution in the old model. The two loss terms are combined with a weighting coefficient between 0 and 1.
  • The loss function in the incremental stage may also be composed of a classification cross-entropy loss function, a classification-level knowledge distillation loss function and a feature-level knowledge distillation loss function;
  • The classification-level knowledge distillation loss uses the soft labels of the old model and the soft predictions of the new model. The temperature parameter T is usually greater than 1. The predictions of the old and new models on the same sample are constrained to be similar in order to overcome catastrophic forgetting;
  • The feature-level knowledge distillation loss uses the normalized features extracted by the new and old models and measures the cosine similarity between them. It encourages the features that the old and new models extract from the same sample to be similar, further overcoming catastrophic forgetting;
  • As the number of learned tasks increases, the amount of old knowledge that must be preserved also increases, which is reflected in the scaling parameter of the incremental-stage loss function.
  • The optimization of the adaptive aggregation weights and that of the model parameters constrain each other: updating the diagnosis model parameters Θ_n requires fixed adaptive aggregation weights, and updating the adaptive aggregation weights ω_n requires fixed model parameters, so a bilevel optimization scheme is adopted;
  • The bilevel optimization scheme is divided into an upper-level problem and a lower-level problem;
  • The lower-level problem updates the model parameters Θ_n, where μ_1 is the learning rate of the lower-level problem;
  • The upper-level problem updates the adaptive aggregation weights in order to balance the plasticity and stability of the model. It randomly samples the task data D_n learned in this stage to build balanced data and uses it to update the adaptive aggregation weights, where μ_2 is the learning rate of the upper-level problem.
  • Further, step S105 specifically includes the following steps:
  • The diagnosis model Θ_n trained in incremental stage n (also referred to as incremental stage 2) must be able to complete all learned tasks, so the test data contains all learned fault classes C_{0:n} to verify the model's ability to overcome catastrophic forgetting.
  • The present invention provides a cross-device incremental bearing fault diagnosis method based on continuous learning. Compared with traditional deep learning methods, it can solve the problem of catastrophic forgetting and better matches practical industrial scenarios.
  • embodiments of the present application may be provided as methods, systems, or computer program products. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment that combines software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk memory, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
  • These computer program instructions may also be stored in a computer-readable memory that causes a computer or other programmable data processing apparatus to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more processes of the flowchart and/or one or more blocks of the block diagram.
  • These computer program instructions may also be loaded onto a computer or other programmable data processing device, causing a series of operating steps to be performed on the computer or other programmable device to produce computer-implemented processing, such that the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more processes of the flowchart and/or one or more blocks of the block diagram.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Testing Of Devices, Machine Parts, Or Other Structures Thereof (AREA)

Abstract

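A cross-device incremental bearing fault diagnosis method based on continuous learning, comprising: constructing a cross-device incremental bearing health status data set and dividing it by device into diagnosis tasks for different stages (S101); building an initial diagnosis model from the diagnosis task data of the first device and selecting exemplars (S102); introducing neuron-level fine-tuning and a classifier on the basis of the initial diagnosis model to obtain a diagnosis model (S103); training the diagnosis model jointly on the exemplars and the bearing fault diagnosis task data of the next device, using a loss function to reduce the difference in performance between the current-stage and previous-stage diagnosis models on the previous stage's diagnosis task data, and selecting exemplars (S104); and repeating step S104 and using the current diagnosis model to diagnose the bearing faults of all learned tasks to obtain bearing fault diagnosis results (S105). The method uses continuous learning to build a diagnosis model that continuously accumulates and reuses knowledge, can solve the problem of catastrophic forgetting, and thus meets the needs of cross-device incremental bearing fault diagnosis.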

Description

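Cross-device incremental bearing fault diagnosis method based on continuous learning
Technical Field
The present invention relates to the technical fields of bearing fault diagnosis and deep learning, and in particular to a cross-device incremental bearing fault diagnosis method based on continuous learning.
Background
As demands on rotating machinery for high speed, heavy load and automation keep rising, the dynamic signals it exhibits become more complex. Modern condition monitoring technology can already collect data from complex equipment at multiple measuring points over the full life cycle and thus obtain massive amounts of data, but this also makes it considerably harder to process the dynamic signals and extract features of the health status information they contain. Traditional fault diagnosis methods include extracting fault characteristic frequencies from vibration signals, the short-time Fourier transform, empirical mode decomposition, and sparse representation. These methods are relatively mature, but for today's machinery status signals, methods based on signal processing cannot cope with large volumes of signal data in which fault data is sparse, interference is strong, and behavior varies under changing working conditions.
In recent years, with the rapid development of artificial intelligence and machine learning, more and more intelligent fault diagnosis methods for rotating machinery based on machine learning have been proposed. Fault diagnosis based on machine learning generally includes steps such as signal collection, feature extraction, and fault identification and prediction. This approach greatly simplifies the fault diagnosis process and improves diagnostic efficiency. However, because most such models are shallow networks with simple structures and few layers, their effectiveness depends on the quality of the features extracted during preprocessing, and their ability to process large volumes of structurally complex equipment status signals is limited. Many researchers have therefore used the adaptive feature learning and extraction capabilities of deep learning to overcome the difficulty shallow models have in representing the complex mapping between signals and health conditions, and have achieved good results. However, these methods rest on two assumptions: that training and test data follow the same distribution, and that there is enough training data. In real engineering, operating conditions vary and faults occur by chance, so the samples obtained rarely satisfy these two assumptions, which directly affects the fault diagnosis results.
With the rapid development of transfer learning, and with the help of its cross-domain and cross-distribution knowledge mining and transfer capabilities, transfer learning solutions for problems with limited labeled samples (very few samples or none) or with changing working conditions have also been developed in the field of mechanical fault diagnosis. However, transfer learning can only handle fault diagnosis for a single target task, that is, one transfer completed for a given source domain and target domain. Because mechanical equipment faults and operating conditions are diverse, the model's generalization ability drops sharply when it faces new tasks, and its versatility is poor. In addition, transfer learning does not accumulate knowledge, and it often performs poorly on equipment status identification under the working conditions of the source-domain data, which does not match practical engineering requirements. Because operating conditions are complex and variable, multiple sub-machines in a mechanical system often develop unexpected faults, giving rise to the cross-device incremental diagnosis problem. Deep diagnosis models and deep transfer diagnosis models trained on pre-collected, semi-complete fault data then fail, so the model must be retrained to recognize the new fault types. However, training a deep model directly on data of the new types causes recognition performance on the old fault classes to drop precipitously, which is called catastrophic forgetting. Catastrophic forgetting has long been an important problem in deep learning. Likewise, in fault diagnosis, the catastrophic forgetting of deep diagnosis models caused by unexpected faults needs to be studied and solved in order to build continual fault diagnosis models with greater reliability, generalization and versatility.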
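Summary of the Invention
To this end, the technical problem to be solved by the present invention is to overcome the problems of the prior art by proposing a cross-device incremental bearing fault diagnosis method based on continuous learning, addressing the inability of existing fault diagnosis models based on deep learning and transfer learning to handle cross-device bearing faults.
To solve the above technical problem, the present invention provides a cross-device incremental bearing fault diagnosis method based on continuous learning, comprising the following steps:
S101: Use acceleration sensors to collect bearing vibration signals on multiple different devices to construct a cross-device incremental bearing health status data set, and divide the data set by device into bearing fault diagnosis tasks for different stages;
S102: In the initial stage, use the bearing fault diagnosis task data of the first device to train ResNet-32, build an initial diagnosis model, and select exemplars of each fault type learned in this stage;
S103: In the incremental stage, introduce neuron-level fine-tuning to modify the initial diagnosis model and obtain a dual-branch residual adaptive aggregation network, and replace the fully connected layer classifier of the initial diagnosis model with a nearest-neighbor exemplar classifier or a cosine normalized classifier to obtain the diagnosis model;
S104: Train the diagnosis model jointly on the exemplars and the bearing fault diagnosis task data of the next device, use the loss function of the incremental stage to reduce the difference in performance between the current-stage and previous-stage diagnosis models on the previous stage's diagnosis task data, and optimize the aggregation weights and model parameters through a bilevel optimization scheme; after training, select exemplars of each fault type learned in this stage;
S105: Repeat step S104. After the current stage's task has been learned, use the current diagnosis model to diagnose the bearing faults of all learned tasks, obtain the bearing fault diagnosis results, and verify the diagnosis model's ability to overcome catastrophic forgetting.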
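In one embodiment of the present invention, using acceleration sensors to collect bearing vibration signals on multiple different devices to construct the cross-device incremental bearing health status data set, and dividing the data set by device into bearing fault diagnosis tasks for different stages, includes:
Using acceleration sensors to collect bearing vibration signals on multiple different devices to construct the cross-device incremental bearing health status data set D;
Dividing the diagnosis tasks of the different stages by device. Suppose the diagnosis task of the n-th device is denoted T_n and the data of T_n is [Figure PCTCN2022118373-appb-000001], where P_n is the number of fault data samples of task T_n, [Figure PCTCN2022118373-appb-000002] denotes the i-th sample in T_n, [Figure PCTCN2022118373-appb-000003] denotes the health status label of [Figure PCTCN2022118373-appb-000004], J_n denotes the number of fault types C_{0:n-1} = {C_0, C_1, …, C_{n-1}} learned before task T_n, and K_n denotes the number of fault types C_n learned in task T_n.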
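In one embodiment of the present invention, using the bearing fault diagnosis task data of the first device to train ResNet-32, build the initial diagnosis model, and select and store exemplars of each fault type in this stage includes:
In the initial stage, the data [Figure PCTCN2022118373-appb-000005] of task T_0 is used to train ResNet-32 to learn the fault types C_0, giving the initial diagnosis model Θ_0, where [Figure PCTCN2022118373-appb-000006]. The loss function of the initial diagnosis model Θ_0 is [Figure PCTCN2022118373-appb-000007], where δ is the ground-truth label;
After model training at this stage, the feature extractor F_0 is used with the herding algorithm to select training samples as exemplars of the fault types learned in this stage.
In one embodiment of the present invention, using the feature extractor F_0 with the herding algorithm to select training samples as exemplars of the fault types learned in this stage includes:
Let [Figure PCTCN2022118373-appb-000008] denote the training samples of fault type c; the class mean of c is then [Figure PCTCN2022118373-appb-000009], where P_c is the number of training samples of class c. Each exemplar ε is computed by [Figure PCTCN2022118373-appb-000010], giving the exemplars of class c, m_c = (ε_0, ε_1, …, ε_{t-1}), where t denotes the number of exemplars.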
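In one embodiment of the present invention, introducing neuron-level fine-tuning to modify the initial-stage diagnosis model and obtain the dual-branch residual adaptive aggregation network includes:
Replacing the single-branch ResNet-32 of the initial-stage diagnosis model with the dual-branch residual adaptive aggregation network, which contains a dynamic branch and a steady-state branch. The dynamic branch uses parameter-level fine-tuning and the steady-state branch uses neuron-level fine-tuning.
In one embodiment of the present invention, the dynamic branch using parameter-level fine-tuning and the steady-state branch using neuron-level fine-tuning includes:
In parameter-level fine-tuning, the dynamic branch is initialized with the initial diagnosis model parameters during training, and all parameters of the branch are then fine-tuned on the training data;
In neuron-level fine-tuning, the steady-state branch is initialized with the initial diagnosis model parameters, the network parameters are then frozen, each neuron is given a scaling weight, and the scaling weights are fine-tuned on the task of each stage.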
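In one embodiment of the present invention, the method of replacing the fully connected layer classifier of the initial diagnosis model with the nearest-neighbor exemplar classifier or the cosine normalized classifier for classification includes:
The nearest-neighbor exemplar classifier classifies by computing the feature mean of each class's exemplars, [Figure PCTCN2022118373-appb-000011], where y = 0, 1, …, J_n + K_n − 1, P_y is the number of samples in m_y, and F(·) is the feature extractor of the current stage. Given an input fault sample x, the type is predicted by [Figure PCTCN2022118373-appb-000012];
The cosine normalized classifier classifies by computing the cosine similarity between the features and the learned prototype of each class. In the initial stage, the predicted probability that input x belongs to class c is computed by [Figure PCTCN2022118373-appb-000013], where θ_0 is the parameter of the fully connected classification layer in the initial stage and h_0 = F_0(x) is the feature extracted in the initial stage. In the incremental stage, the predicted probability that input x belongs to class c is computed by [Figure PCTCN2022118373-appb-000014], where θ_n is the learned prototype of each class, h_n = F_n(x) is the feature extracted in incremental stage n, [Figure PCTCN2022118373-appb-000015] denotes l_2 normalization, [Figure PCTCN2022118373-appb-000016], and η is a learnable scaling parameter.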
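In one embodiment of the present invention, training the diagnosis model jointly on the stored exemplars and the bearing fault diagnosis task data of the next device includes:
Training the dual-branch residual adaptive aggregation network on the exemplars stored in the initial stage and the bearing fault diagnosis task data of the next device, and giving the dynamic residual block and steady-state residual block of each residual block layer the adaptive aggregation weights ω_α and ω_β respectively;
Using the dual-branch residual adaptive aggregation network with adaptive aggregation weights to extract features from the training data x^[0]. At the n-th residual block layer, the features extracted by the dynamic residual block and the steady-state residual block are respectively [Figure PCTCN2022118373-appb-000017], and the aggregated feature of the n-th residual block layer is [Figure PCTCN2022118373-appb-000018], where W_0 denotes the parameters frozen in the initial stage, f^[n] is the feature extraction process of the n-th residual block layer, and [Figure PCTCN2022118373-appb-000019].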
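In one embodiment of the present invention, the loss function of the incremental stage includes a classification cross-entropy loss function, a classification-level knowledge distillation loss function and a feature-level knowledge distillation loss function. The classification cross-entropy loss function is [Figure PCTCN2022118373-appb-000020]. The classification-level knowledge distillation loss function is [Figure PCTCN2022118373-appb-000021], where [Figure PCTCN2022118373-appb-000022], and [Figure PCTCN2022118373-appb-000023] and [Figure PCTCN2022118373-appb-000024] are respectively the soft labels of the old model and the soft predictions of the new model, and T denotes the temperature parameter. The feature-level knowledge distillation loss is [Figure PCTCN2022118373-appb-000025], where [Figure PCTCN2022118373-appb-000026] and [Figure PCTCN2022118373-appb-000027] are respectively the normalized features extracted by the current-stage and previous-stage diagnosis models, and [Figure PCTCN2022118373-appb-000028] measures the cosine similarity between the two.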
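As a rough illustration of the feature-level term, the sketch below assumes it takes the common form of one minus the cosine similarity between the two normalized features. The patent gives the exact expression only as an equation image, so this form and the function names `l2_normalize` and `feature_distillation_loss` are assumptions.

```python
import math

def l2_normalize(vector):
    """Scale a vector to unit l2 norm (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(v * v for v in vector)) or 1.0
    return [v / norm for v in vector]

def feature_distillation_loss(new_feature, old_feature):
    """Assumed form: 1 - cosine similarity of the new and old model features."""
    f_new = l2_normalize(new_feature)
    f_old = l2_normalize(old_feature)
    return 1.0 - sum(a * b for a, b in zip(f_new, f_old))

# Features pointing the same way give a loss near 0; orthogonal features give 1.
same_direction = feature_distillation_loss([3.0, 4.0], [6.0, 8.0])
orthogonal = feature_distillation_loss([1.0, 0.0], [0.0, 1.0])
```

Because both features are normalized first, the loss depends only on the angle between them, so the new model is free to rescale its features while still being penalized for rotating them away from the old model's.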
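In one embodiment of the present invention, optimizing the aggregation weights and model parameters through the bilevel optimization scheme includes:
The bilevel optimization scheme consists of the upper-level problem [Figure PCTCN2022118373-appb-000029] and the lower-level problem [Figure PCTCN2022118373-appb-000030];
The lower-level problem updates the model parameters Θ_n by [Figure PCTCN2022118373-appb-000031], where μ_1 is the learning rate of the lower-level problem;
The upper-level problem randomly samples the task data D_n learned in this stage to obtain [Figure PCTCN2022118373-appb-000032], builds the balanced data [Figure PCTCN2022118373-appb-000033], and updates the adaptive aggregation weights by [Figure PCTCN2022118373-appb-000034], where μ_2 is the learning rate of the upper-level problem.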
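The alternating structure of the bilevel scheme can be sketched on a toy problem. Only the structure follows the text: the lower level updates the model parameters with the aggregation weight held fixed, and the upper level updates the aggregation weight on balanced data with the parameters held fixed. The scalar two-branch model, the squared-error loss, the clipping of the weight to [0, 1], the learning-rate values and every name below are hypothetical choices made for the illustration.

```python
import random

def build_balanced_set(exemplars, new_data, rng):
    """Randomly down-sample the new-task data to the exemplar count and merge."""
    sampled = rng.sample(new_data, min(len(exemplars), len(new_data)))
    return exemplars + sampled

def predict(theta_dyn, theta_std, omega, x):
    # Aggregated output of a trainable (dynamic) and a frozen (steady-state) branch.
    return (omega * theta_dyn + (1.0 - omega) * theta_std) * x

def mse(theta_dyn, theta_std, omega, data):
    return sum((predict(theta_dyn, theta_std, omega, x) - y) ** 2
               for x, y in data) / len(data)

def gradients(theta_dyn, theta_std, omega, data):
    """Analytic gradients of the MSE with respect to theta_dyn and omega."""
    g_theta = g_omega = 0.0
    for x, y in data:
        err = predict(theta_dyn, theta_std, omega, x) - y
        g_theta += 2.0 * err * omega * x / len(data)
        g_omega += 2.0 * err * (theta_dyn - theta_std) * x / len(data)
    return g_theta, g_omega

def bilevel_train(exemplars, new_data, theta_std, steps=100, mu1=0.05, mu2=0.005, seed=0):
    rng = random.Random(seed)
    theta_dyn, omega = theta_std, 0.5
    train_set = exemplars + new_data
    for _ in range(steps):
        # Lower level: update the model parameter with omega held fixed.
        g_theta, _ = gradients(theta_dyn, theta_std, omega, train_set)
        theta_dyn -= mu1 * g_theta
        # Upper level: update omega on balanced data with theta_dyn held fixed.
        balanced = build_balanced_set(exemplars, new_data, rng)
        _, g_omega = gradients(theta_dyn, theta_std, omega, balanced)
        omega = min(1.0, max(0.0, omega - mu2 * g_omega))
    return theta_dyn, omega

exemplars = [(1.0, 1.0), (2.0, 2.0)]                           # old task: y = x
new_data = [(0.5, 1.0), (1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]    # new task: y = 2x
theta_std = 1.0                                                # frozen branch fits the old task
theta_dyn, omega = bilevel_train(exemplars, new_data, theta_std)
```

Using balanced data for the upper level keeps the aggregation weight from being driven by the new task alone, since the new-task samples would otherwise outnumber the stored exemplars.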
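In one embodiment of the present invention, the diagnosis model trained in the incremental stage must be able to complete all learned tasks, so the test data contains all learned fault classes C_{0:n} to verify the diagnosis model's ability to overcome catastrophic forgetting.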
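The evaluation protocol can be sketched as follows: after stage n, the test set is the union of the test data of tasks T_0 to T_n. The helper names, the toy one-dimensional samples and the two stand-in predictors are hypothetical and serve only to show how forgetting appears in the cumulative accuracy.

```python
def cumulative_test_set(test_sets, stage):
    """Merge the test data of tasks T_0..T_stage (all learned fault classes)."""
    merged = []
    for n in range(stage + 1):
        merged.extend(test_sets[n])
    return merged

def accuracy(predict, samples):
    """Fraction of (input, label) pairs that the predictor labels correctly."""
    correct = sum(1 for x, y in samples if predict(x) == y)
    return correct / len(samples)

# Toy data: each stage introduces new class labels; inputs sit near their label.
test_sets = [
    [(0.1, 0), (1.1, 1)],   # stage 0: classes 0 and 1
    [(2.0, 2), (3.2, 3)],   # stage 1: classes 2 and 3
    [(3.9, 4)],             # stage 2: class 4
]

def remembers_everything(x):
    return round(x)

def forgets_stage_0(x):
    # Stand-in for a model that can only output classes learned after stage 0.
    return max(2, round(x))

stage_1_samples = cumulative_test_set(test_sets, 1)
acc_good = accuracy(remembers_everything, stage_1_samples)
acc_forgetful = accuracy(forgets_stage_0, stage_1_samples)
```

A model evaluated only on the newest task would hide the gap between the two predictors, which is why the cumulative test set is used.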
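Compared with the prior art, the above technical solution of the present invention has the following advantages:
The present invention uses a continuous learning method to build a diagnosis model that continuously accumulates and reuses knowledge, meeting the needs of cross-device incremental bearing fault diagnosis. Compared with traditional deep learning methods, it can solve the problem of catastrophic forgetting and better matches practical industrial scenarios.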
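Brief Description of the Drawings
To make the content of the present invention easier to understand clearly, the invention is described in further detail below on the basis of specific embodiments and with reference to the drawings.
Figure 1 is a flow chart of the cross-device incremental bearing fault diagnosis method based on continuous learning provided by the present invention.
Figure 2 is a schematic structural diagram of the dual-branch residual adaptive aggregation network provided by the present invention.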
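Detailed Description
The present invention is further described below with reference to the drawings and specific embodiments so that those skilled in the art can better understand and implement it; the embodiments given do not limit the present invention.
Referring to Figure 1, which is a flow chart of a specific embodiment of the cross-device incremental bearing fault diagnosis method based on continuous learning provided by the present invention, the specific operation steps are as follows:
Step S101: Use acceleration sensors to collect bearing vibration signals on multiple different devices to construct a cross-device incremental bearing health status data set, and divide the data set by device into bearing fault diagnosis tasks for different stages;
Step S102: In the initial stage, use the bearing fault diagnosis task data of the first device to train ResNet-32, build an initial diagnosis model, and select exemplars of each fault type learned in this stage;
Step S103: In the incremental stage, introduce neuron-level fine-tuning to modify the initial diagnosis model and obtain a dual-branch residual adaptive aggregation network, and replace the fully connected layer classifier of the initial diagnosis model with a nearest-neighbor exemplar classifier or a cosine normalized classifier to obtain the diagnosis model;
Step S104: Train the diagnosis model jointly on the exemplars and the bearing fault diagnosis task data of the next device, use the loss function of the incremental stage to reduce the difference in performance between the current-stage and previous-stage diagnosis models on the old task data, and optimize the aggregation weights and model parameters through a bilevel optimization scheme; after training, select exemplars of each fault type learned in this stage;
Step S105: Repeat step S104. After the current stage's task has been learned, use the current diagnosis model to diagnose the bearing faults of all learned tasks, obtain the bearing fault diagnosis results, and verify the diagnosis model's ability to overcome catastrophic forgetting.
The cross-device incremental bearing fault diagnosis method based on continuous learning provided by the present invention first uses acceleration sensors to collect bearing vibration signals on multiple different devices to construct a cross-device incremental bearing health status data set, and divides the diagnosis tasks of the different stages by device, simulating the growth in diagnosis tasks that arises in practice when unexpected faults of sub-machines cause cross-device bearing faults. The bearing fault diagnosis task data of the first device is used to train ResNet-32 and build the initial diagnosis model, and exemplars of each fault type are selected and stored. Neuron-level fine-tuning is then introduced to modify the initial diagnosis model, giving a dual-branch residual adaptive aggregation network that serves as the feature extractor in subsequent incremental stages, so that the model keeps a balance between plasticity and stability as it continues to learn new tasks. The fully connected layer classifier of the initial diagnosis model is replaced by the nearest-neighbor exemplar classifier or the cosine normalized classifier to obtain the diagnosis model, which avoids the class bias problem. The stored exemplars are trained together with the fault data of the next device to reawaken the model's memory of old knowledge and overcome the catastrophic forgetting of deep learning models. The aggregation weights maintain the balance between plasticity and stability, the loss function of the incremental stage reduces the difference in performance between the new and old models on the old task data, and the aggregation weights and model parameters are optimized through a bilevel optimization scheme. After training, exemplars of this stage's data are selected and stored. This training process is repeated; after the current stage's task has been learned, the current diagnosis model is used to diagnose the bearing faults of all learned tasks, giving the bearing fault diagnosis results and verifying the model's ability to overcome catastrophic forgetting. In short, the present invention uses a continuous learning method to build a diagnosis model that continuously accumulates and reuses knowledge, meeting the needs of cross-device incremental bearing fault diagnosis.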
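Further, step S101 specifically includes the following steps:
Step S101.1: Use multiple test benches to collect the required experimental data and construct the cross-device incremental bearing health status data set.
Step S101.2: Divide the diagnosis tasks of the different stages according to the actual scenario. Suppose the diagnosis task of the n-th machine is denoted T_n and the data of T_n is [Figure PCTCN2022118373-appb-000035], where P_n is the number of fault data samples of task T_n. Let J_n denote the number of fault types [Figure PCTCN2022118373-appb-000036] learned before task T_n and K_n the number of fault types C_n learned in task T_n; then J_{n+1} = K_n + J_n, so [Figure PCTCN2022118373-appb-000037] denotes the i-th sample in T_n and [Figure PCTCN2022118373-appb-000038] denotes the health status label of [Figure PCTCN2022118373-appb-000039].
As listed in Table 1, bearing fault data sets from three machines are used to define the diagnosis tasks of the different stages. Each data set has 6 fault types, for 18 fault types in total. Each fault type has 100 training samples and 100 test samples. The three data sets are learned sequentially. For example, ABC means that the diagnosis tasks of data sets A, B and C are completed in stage 0, incremental stage 1 and incremental stage 2 respectively.
Table 1: Description of the cross-device incremental bearing fault data sets
Figure PCTCN2022118373-appb-000040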
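The stage bookkeeping can be sketched as follows, assuming the ABC learning order and six fault types per data set. The recurrence J_{n+1} = J_n + K_n comes from the text; the function names `stage_class_counts` and `global_label`, and the idea of mapping per-device labels into one shared label space, are illustrative assumptions.

```python
from itertools import accumulate

def stage_class_counts(classes_per_task):
    """Return the lists (J, K) over stages n = 0, 1, 2, ...

    K[n] is the number of fault types introduced by task T_n and
    J[n] is the number of fault types learned before T_n, so that
    J[n + 1] = J[n] + K[n].
    """
    K = list(classes_per_task)
    J = [0] + list(accumulate(K))[:-1]
    return J, K

def global_label(stage, local_label, J):
    """Map a per-device label (0..K_n-1) into the shared label space."""
    return J[stage] + local_label

# Three devices (A, B, C), six fault types each, learned in the order ABC.
J, K = stage_class_counts([6, 6, 6])
for n, name in enumerate("ABC"):
    first, last = global_label(n, 0, J), global_label(n, K[n] - 1, J)
    print(f"stage {n} (data set {name}): J={J[n]}, K={K[n]}, labels {first}..{last}")
```

With this layout the classifier at stage n has to separate J_n + K_n classes, which is 18 after the third data set has been learned.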
进一步的,所述步骤S102具体包括以下步骤:
S102.1:在初始阶段(也指阶段0)利用诊断任务T 0的数据
Figure PCTCN2022118373-appb-000041
训练原始的ResNet-32学习故障类C 0得到初始诊断模型Θ 0,ResNet-32的详细结构如表2所示。模型的损失函数为分类交叉熵损失函数:
Figure PCTCN2022118373-appb-000042
其中δ是真实标签。所述初始模型参数Θ 0的更新过程为常规的
Figure PCTCN2022118373-appb-000043
表2骨干网络ResNet-32的结构化参数
Figure PCTCN2022118373-appb-000044
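S102.2: After training, the feature extractor $F_0$ that precedes the classification layer is used to select a certain number of exemplars $M_0$ through the herding algorithm. Let
Figure PCTCN2022118373-appb-000045
denote the training samples of fault type $c$; the class mean of $c$ is then
Figure PCTCN2022118373-appb-000046
where $P_c$ is the number of training samples of class $c$. The number of exemplars selected is 5, and each exemplar $\varepsilon$ is computed through
Figure PCTCN2022118373-appb-000047
which gives the exemplars of class $c$, $m_c=(\varepsilon_0, \varepsilon_1, \ldots, \varepsilon_{t-1})$, where $t$ is the number of exemplars.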
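The exemplar selection of step S102.2 can be sketched as below. The patent's exact selection rule is contained in the formula image above and is not reproduced here. The sketch assumes the commonly used greedy herding rule, in which each new exemplar is the sample that brings the mean of the selected set closest to the class mean, and it assumes the features have already been extracted by $F_0$. The function name `herding_select` is hypothetical.

```python
def herding_select(features, t):
    """Greedily pick t sample indices whose running feature mean tracks the class mean.

    features: list of equal-length feature vectors of one fault type (assumed
              to be outputs of the feature extractor F_0).
    t:        number of exemplars to keep for this fault type.
    """
    n, dim = len(features), len(features[0])
    if t > n:
        raise ValueError("cannot select more exemplars than samples")
    class_mean = [sum(f[d] for f in features) / n for d in range(dim)]

    chosen, running_sum = [], [0.0] * dim
    for k in range(1, t + 1):
        best_idx, best_dist = None, float("inf")
        for i, f in enumerate(features):
            if i in chosen:
                continue
            # Squared distance between the class mean and the mean of the
            # exemplar set that would result from adding sample i.
            dist = sum(
                (class_mean[d] - (running_sum[d] + f[d]) / k) ** 2
                for d in range(dim)
            )
            if dist < best_dist:
                best_idx, best_dist = i, dist
        chosen.append(best_idx)
        running_sum = [running_sum[d] + features[best_idx][d] for d in range(dim)]
    return chosen


# Toy one-dimensional features with class mean 2.0.
toy_features = [[0.0], [1.0], [2.0], [3.0], [4.0]]
print(herding_select(toy_features, 3))  # [2, 1, 3]
```

In the toy run the sample at the class mean is chosen first, and the next two picks keep the running mean of the exemplar set at or near 2.0.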
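Further, step S103 specifically includes the following steps:
S103.1: The initial diagnosis model is redesigned and modified: neuron-level fine-tuning is introduced to represent model stability, which gives the dual-branch residual adaptive aggregation network whose structure is shown in Fig. 2. This network serves as the feature extractor in the subsequent incremental stages, so that the model maintains a balance between plasticity and stability while it keeps learning new tasks.
The dynamic branch uses parameter-level fine-tuning, as shown in Fig. 2(a): during training, the branch is initialized with the initial model parameters and all of its parameters $\alpha$ are then fine-tuned on the training data. The dynamic branch can learn new tasks quickly and therefore represents the plasticity of the model.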
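The stable branch uses neuron-level fine-tuning, as shown in Fig. 2(b): after initialization with the initial model parameters, the network parameters are frozen, each neuron is assigned a scaling weight $\beta$, and $\beta$ is then fine-tuned on the tasks of each stage. Suppose the $k$-th convolutional layer of the stable branch contains $Q$ neurons, whose weights are the frozen parameters of the initial model
Figure PCTCN2022118373-appb-000048
and whose scaling weights are
Figure PCTCN2022118373-appb-000049
Then the $k$-th convolutional layer takes $x_{k-1}$ as input and outputs $x_k=(W_k\odot\beta_k)\,x_{k-1}$, where $\odot$ denotes the Hadamard product. Because the stable branch freezes all parameters from the initial stage and has far fewer parameters to learn than the dynamic branch, it is able to overcome catastrophic forgetting and represents the stability of the model.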
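The neuron-level fine-tuning of the stable branch can be illustrated with a small sketch. For brevity it assumes a fully connected layer rather than a convolutional one, with one scaling weight per output neuron (one per row of the frozen weight matrix); the function name `stable_layer` is hypothetical.

```python
def stable_layer(w_frozen, beta, x):
    """Compute x_k = (W_k ⊙ beta_k) x_{k-1} with one scaling weight per neuron.

    w_frozen: Q rows of frozen weights copied from the initial model.
    beta:     Q trainable scaling weights, one per neuron.
    x:        input vector x_{k-1}.
    """
    return [b * sum(w * xi for w, xi in zip(row, x)) for row, b in zip(w_frozen, beta)]


w_frozen = [[1.0, 2.0], [3.0, 4.0]]  # frozen after stage 0
x_prev = [1.0, 1.0]

print(stable_layer(w_frozen, [1.0, 1.0], x_prev))  # [3.0, 7.0]
print(stable_layer(w_frozen, [0.5, 2.0], x_prev))  # [1.5, 14.0]

# Only beta is trained in the stable branch, whereas the dynamic branch
# trains every entry of its weight matrix.
trainable_stable = len(w_frozen)
trainable_dynamic = sum(len(row) for row in w_frozen)
print(trainable_stable, trainable_dynamic)  # 2 4
```

With all scaling weights equal to 1 the layer reproduces the initial model exactly, which is why the stable branch starts from the old behaviour and can only drift from it through the few parameters in $\beta$.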
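The dual-branch residual adaptive aggregation network replaces the original ResNet-32; the structure of the dual-branch aggregation network is shown in Fig. 2(c).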
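S103.2: Two classifiers that avoid the class-bias problem, the nearest-mean-of-exemplars classifier and the cosine normalization classifier, are used in place of the original fully connected classifier.
The nearest-mean-of-exemplars classifier classifies by computing the mean feature of the exemplars of each class
Figure PCTCN2022118373-appb-000050
where $y=0,1,\ldots,J_n+K_n-1$, $P_y$ is the number of samples in $m_y$, and $F(\cdot)$ is the feature extractor of the current stage. For an input fault sample $x$, the fault type is predicted through
Figure PCTCN2022118373-appb-000051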
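The cosine normalization classifier classifies by computing the cosine similarity between the feature and the learned prototype of each class. In the initial stage, the predicted probability that input $x$ belongs to class $c$ is computed through
Figure PCTCN2022118373-appb-000052
where $\theta_0$ denotes the parameters of the fully connected classification layer in the initial stage and $h_0=F_0(x)$ is the feature extracted in the initial stage. In incremental stage $n$, the cosine normalization classifier is used, that is, the predicted probability that input $x$ belongs to class $c$ is computed through
Figure PCTCN2022118373-appb-000053
where $\theta_n$ denotes the learned prototype of each class, $h_n=F_n(x)$ is the feature extracted in incremental stage $n$,
Figure PCTCN2022118373-appb-000054
denotes $l_2$ normalization,
Figure PCTCN2022118373-appb-000055
and $\eta$ is a learnable scaling parameter. Because the cosine similarity is confined to $[-1,1]$, $\eta$ controls how peaked the softmax prediction probabilities are;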
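Both the nearest-mean-of-exemplars classifier and the cosine normalization classifier avoid the classification bias between new and old classes.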
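The two classifiers of step S103.2 can be sketched as follows. The exact formulas are in the formula images above; this sketch assumes Euclidean distance to the exemplar feature means for the first classifier and a softmax over $\eta$ times the cosine similarity for the second. All function names are hypothetical, and the feature vectors are assumed to be non-zero.

```python
import math


def l2_normalize(v):
    """Scale a non-zero vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def nme_predict(feature, exemplar_features):
    """Return the class whose mean exemplar feature is closest to `feature`.

    exemplar_features maps each class label to the list of feature vectors
    of its stored exemplars.
    """
    best_label, best_dist = None, float("inf")
    for label, feats in exemplar_features.items():
        mean = [sum(col) / len(feats) for col in zip(*feats)]
        dist = sum((a - b) ** 2 for a, b in zip(feature, mean))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def cosine_probs(feature, prototypes, eta):
    """Softmax over eta * cos(prototype, feature) for every class prototype."""
    h = l2_normalize(feature)
    logits = [
        eta * sum(a * b for a, b in zip(l2_normalize(theta), h))
        for theta in prototypes
    ]
    shift = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


exemplars = {0: [[1.0, 0.0], [1.0, 0.2]], 1: [[0.0, 1.0]]}
print(nme_predict([0.9, 0.1], exemplars))  # 0

# The prototype norm does not affect the prediction, only its direction does.
p_small = cosine_probs([2.0, 0.0], [[5.0, 0.0], [0.0, 3.0]], eta=1.0)
p_large = cosine_probs([2.0, 0.0], [[500.0, 0.0], [0.0, 3.0]], eta=1.0)
print(p_small, p_large)
```

Neither classifier depends on the magnitude of class weights, which tends to grow for the newly trained classes in a plain fully connected layer; this is the sense in which both are insensitive to the imbalance between new and old classes.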
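Further, step S104 specifically includes the following steps:
S104.1: Taking incremental stage 1 as an example, the exemplars $M_0$ retained from the initial stage and the task data $D_1$ of this stage are used to train the dual-branch residual adaptive aggregation network. In each residual block layer, the plasticity and stability represented by the dynamic residual block and the stable residual block must be balanced, so the two blocks are assigned the adaptive aggregation weights $\omega_\alpha$ and $\omega_\beta$, respectively, as shown in Fig. 2(c);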
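Features are extracted from the training data $x^{[0]}$ by the dual-branch aggregation network. At the $n$-th residual block layer, the features extracted by the dynamic residual block and the stable residual block are, respectively,
Figure PCTCN2022118373-appb-000056
where $W_0$ denotes the frozen parameters from stage 0 and $f^{[n]}$ is the feature extraction process of the $n$-th residual block layer;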
The aggregated feature of the $n$-th residual block layer is
Figure PCTCN2022118373-appb-000057
where
Figure PCTCN2022118373-appb-000058
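The aggregation in step S104.1 can be sketched as below, assuming that the aggregated feature is a weighted sum of the outputs of the dynamic and stable residual blocks. Any constraint the patent places on the two weights is contained in the formula images above and is not assumed here; the function name `aggregate` is hypothetical.

```python
def aggregate(x_dynamic, x_stable, w_alpha, w_beta):
    """Mix the outputs of the dynamic and stable residual blocks element-wise."""
    return [w_alpha * a + w_beta * b for a, b in zip(x_dynamic, x_stable)]


# A larger w_beta keeps the aggregated feature closer to the stable (old) branch.
print(aggregate([1.0, 2.0], [3.0, 4.0], w_alpha=0.25, w_beta=0.75))  # [2.5, 3.5]
```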
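S104.2: The loss function of the incremental stage consists of the categorical cross-entropy loss
Figure PCTCN2022118373-appb-000059
and the knowledge distillation loss
Figure PCTCN2022118373-appb-000060
where
Figure PCTCN2022118373-appb-000061
Figure PCTCN2022118373-appb-000062
are, respectively, the soft labels of the old model on the old fault classes and the hard labels of the new model on the old fault classes, and the temperature $T$ is usually greater than 1. The knowledge distillation loss reduces the difference between the performance of the new model on the old fault classes $C_{0:n-1}$ and that of the old model: the similarity distribution of the old classes in the new model is approximately constrained to the similarity distribution of the old classes in the old model. The loss function of the incremental stage is
Figure PCTCN2022118373-appb-000063
where $0<\lambda\le 1$.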
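A temperature-based knowledge distillation loss of the kind described above can be sketched as follows. The patent's exact expression is in the formula images and is not reproduced. The sketch assumes the widely used form in which the outputs of both models on the old classes are softened by a temperature $T$ and compared with a cross-entropy. The function names are hypothetical.

```python
import math


def softened(logits, temperature):
    """Softmax of logits / temperature; a larger temperature flattens the result."""
    shift = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp((z - shift) / temperature) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(old_logits, new_logits, temperature=2.0):
    """Cross-entropy between the softened old-model and new-model outputs.

    Both logit vectors cover the old fault classes only.
    """
    p_old = softened(old_logits, temperature)
    p_new = softened(new_logits, temperature)
    return -sum(po * math.log(pn) for po, pn in zip(p_old, p_new))


old_logits = [2.0, 0.0, -1.0]
print(distillation_loss(old_logits, [2.0, 0.0, -1.0]))  # new model agrees with the old one
print(distillation_loss(old_logits, [-1.0, 0.0, 2.0]))  # new model has drifted
```

The loss is smallest when the new model reproduces the old model's softened outputs, so minimising it pulls the new model's behaviour on the old fault classes back toward that of the old model.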
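S104.2: The loss function of the incremental stage is composed of a categorical cross-entropy loss function, a classification-level knowledge distillation loss function and a feature-level knowledge distillation loss function;
The categorical cross-entropy loss function is
Figure PCTCN2022118373-appb-000064
The classification-level knowledge distillation loss function is
Figure PCTCN2022118373-appb-000065
where
Figure PCTCN2022118373-appb-000066
Figure PCTCN2022118373-appb-000067
are, respectively, the soft labels of the old model and the soft predictions of the new model, and the temperature parameter $T$ is usually greater than 1. The predictions of the new and old models on the same sample are constrained to be similar in order to overcome catastrophic forgetting;
The feature-level knowledge distillation loss is
Figure PCTCN2022118373-appb-000068
where
Figure PCTCN2022118373-appb-000069
Figure PCTCN2022118373-appb-000070
are the normalized features extracted by the new and old models, respectively, and
Figure PCTCN2022118373-appb-000071
measures the cosine similarity between the two. The feature-level knowledge distillation loss encourages the features that the new and old models extract from the same sample to be similar, which further overcomes the catastrophic forgetting of the model;
The loss function of the incremental stage is
Figure PCTCN2022118373-appb-000072
where, because the amount of old knowledge that must be preserved grows as the number of learned tasks increases, the scaling parameter is
Figure PCTCN2022118373-appb-000073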
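The feature-level distillation term can be illustrated with a minimal sketch. It assumes the loss takes the form of one minus the cosine similarity between the normalized features of the new and old models, which is one common way to encourage such similarity; the patent's exact expression and its scaling parameter are in the formula images above and are not reproduced. The function names are hypothetical, and the feature vectors are assumed to be non-zero.

```python
import math


def l2_normalize(v):
    """Scale a non-zero vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def feature_distillation_loss(f_new, f_old):
    """Return 1 - cos(f_new, f_old); zero when both features point the same way."""
    cosine = sum(a * b for a, b in zip(l2_normalize(f_new), l2_normalize(f_old)))
    return 1.0 - cosine


print(feature_distillation_loss([3.0, 4.0], [6.0, 8.0]))  # same direction, close to 0
print(feature_distillation_loss([1.0, 0.0], [0.0, 1.0]))  # orthogonal, 1.0
print(feature_distillation_loss([1.0, 0.0], [-1.0, 0.0])) # opposite, 2.0
```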
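The optimal adaptive aggregation weights and the optimal model parameters constrain each other: updating the parameters of the diagnosis model $\Theta_n$ requires the adaptive aggregation weights to be fixed, while updating the adaptive aggregation weights $\omega_n$ requires the model parameters to be fixed. A bilevel optimization scheme is therefore adopted;
The bilevel optimization scheme is divided into the upper-level problem
Figure PCTCN2022118373-appb-000074
and the lower-level problem
Figure PCTCN2022118373-appb-000075
The lower-level problem updates the model parameters $\Theta_n$ through
Figure PCTCN2022118373-appb-000076
where $\mu_1$ is the learning rate of the lower-level problem;
In the upper-level problem, the adaptive aggregation weights are updated so as to balance the plasticity and stability of the model. The task data $D_n$ learned in this stage are randomly sampled to obtain
Figure PCTCN2022118373-appb-000077
and the balanced data
Figure PCTCN2022118373-appb-000078
are constructed; the adaptive aggregation weights are then updated through
Figure PCTCN2022118373-appb-000079
where $\mu_2$ is the learning rate of the upper-level problem.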
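The alternating structure of the bilevel scheme can be illustrated with a deliberately small toy problem. Everything in it is an assumption made for illustration: the "network" is a single slope formed from a trainable dynamic parameter `a` and a frozen stable parameter `b`, the two aggregation weights are tied as `w` and `1 - w`, the loss is a mean squared error, and the data are invented. Only the update pattern follows the text: the model parameter is updated on the training data with the aggregation weight fixed, then the aggregation weight is updated on balanced data with the model parameter fixed.

```python
def slope(a, b, w):
    """Toy aggregated model: dynamic parameter a and frozen parameter b mixed by w."""
    return w * a + (1.0 - w) * b


def mse(a, b, w, data):
    s = slope(a, b, w)
    return sum((s * x - y) ** 2 for x, y in data) / len(data)


def bilevel_train(train_data, balanced_data, b, mu1=0.1, mu2=0.01, steps=100):
    """Alternate lower-level (parameter) and upper-level (weight) gradient steps."""
    a, w = b, 0.5  # the dynamic parameter starts from the initial model
    for _ in range(steps):
        # Lower level: update the model parameter with the aggregation weight fixed.
        s = slope(a, b, w)
        grad_a = sum(2 * (s * x - y) * w * x for x, y in train_data) / len(train_data)
        a -= mu1 * grad_a
        # Upper level: update the aggregation weight on balanced data,
        # with the model parameter fixed.
        s = slope(a, b, w)
        grad_w = sum(2 * (s * x - y) * (a - b) * x for x, y in balanced_data) / len(balanced_data)
        w = min(1.0, max(0.0, w - mu2 * grad_w))
    return a, w


# Invented data: the old task follows y = x, the new task follows y = 3x.
train_data = [(1.0, 1.0)] + [(1.0, 3.0)] * 3    # few exemplars, many new samples
balanced_data = [(1.0, 1.0), (1.0, 3.0)]        # equal share of old and new
b = 1.0                                          # frozen stage-0 parameter

a, w = bilevel_train(train_data, balanced_data, b)
print(mse(b, b, 0.5, balanced_data), mse(a, b, w, balanced_data))
```

Updating the aggregation weight on class-balanced data, rather than on the training data dominated by the new task, is what lets the upper level counteract the bias of the lower level toward the new classes.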
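Further, step S105 specifically includes the following steps:
The diagnosis model $\Theta_n$ trained in incremental stage $n$ (here, incremental stage 2) must be able to complete all learned tasks, so the test data contain all learned fault classes $C_{0:n}$ in order to verify the model's ability to overcome catastrophic forgetting.
Table 3. Diagnosis accuracy under the six task orders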
Figure PCTCN2022118373-appb-000080
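As shown in Table 3, the experiments were repeated five times under the six task orders. The diagnosis accuracy of the two fine-tuning approaches reveals the catastrophic forgetting of a deep learning diagnosis model that does not use continual learning, whereas the method of the present invention effectively resolves catastrophic forgetting and achieves continuous cross-equipment incremental bearing fault diagnosis.
In summary, the present invention provides, on the basis of continual learning, a method for cross-equipment incremental bearing fault diagnosis. Compared with conventional deep learning methods, the invention solves the catastrophic forgetting problem and better matches real industrial application scenarios.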
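Those skilled in the art will appreciate that embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, and the like) containing computer-usable program code.
The present application is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the present application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps are performed on the computer or other programmable device to produce computer-implemented processing; the instructions executed on the computer or other programmable device thereby provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Obviously, the above embodiments are merely examples given for clarity of description and do not limit the implementations. Those of ordinary skill in the art can make other changes or modifications in different forms on the basis of the above description. It is neither necessary nor possible to enumerate all implementations here. Obvious changes or modifications derived therefrom remain within the scope of protection of the present invention.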

Claims (10)

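  1. A cross-equipment incremental bearing fault diagnosis method based on continual learning, characterized by comprising the following steps:
    S101: collecting bearing vibration signals from a plurality of different devices with acceleration sensors to build a cross-equipment incremental bearing health-state dataset, and dividing the dataset by device into bearing fault diagnosis tasks of different stages;
    S102: in the initial stage, training ResNet-32 on the bearing fault diagnosis task data of the first device to build an initial diagnosis model, and selecting exemplars for each fault type of this stage;
    S103: in the incremental stage, modifying the initial diagnosis model by introducing neuron-level fine-tuning to obtain a dual-branch residual adaptive aggregation network, and replacing the fully connected classifier of the initial diagnosis model with a nearest-mean-of-exemplars classifier or a cosine normalization classifier to obtain a diagnosis model;
    S104: training the diagnosis model jointly on the exemplars and the bearing fault diagnosis task data of the next device, using the incremental-stage loss function to reduce the difference between the current-stage diagnosis model and the previous-stage diagnosis model in their performance on the previous-stage diagnosis task data, optimizing the aggregation weights and model parameters through a bilevel optimization scheme, and selecting exemplars for each fault type of this stage after training;
    S105: repeating step S104 and, after the task of the current stage has been learned, using the current diagnosis model to diagnose the bearing faults of all learned tasks, so as to obtain bearing fault diagnosis results and verify the ability of the diagnosis model to overcome catastrophic forgetting.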
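  2. The cross-equipment incremental bearing fault diagnosis method based on continual learning according to claim 1, characterized in that collecting bearing vibration signals from a plurality of different devices with acceleration sensors to build a cross-equipment incremental bearing health-state dataset, and dividing the dataset by device into bearing fault diagnosis tasks of different stages, comprises:
    collecting bearing vibration signals from a plurality of different devices with acceleration sensors to build a cross-equipment incremental bearing health-state dataset $D$;
    dividing the diagnosis tasks into stages by device, wherein the diagnosis task of the $n$-th device is denoted $T_n$ and the data of $T_n$ are
    Figure PCTCN2022118373-appb-100001
    where $P_n$ is the number of fault data samples of task $T_n$,
    Figure PCTCN2022118373-appb-100002
    denotes the $i$-th sample in $T_n$,
    Figure PCTCN2022118373-appb-100003
    denotes the health-state label of
    Figure PCTCN2022118373-appb-100004
    and wherein $J_n$ denotes the number of fault types $C_{0:n-1}=\{C_0, C_1, \ldots, C_{n-1}\}$ learned before task $T_n$, and $K_n$ denotes the number of fault types $C_n$ learned in task $T_n$.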
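  3. The cross-equipment incremental bearing fault diagnosis method based on continual learning according to claim 2, characterized in that training ResNet-32 on the bearing fault diagnosis task data of the first device to build the initial diagnosis model, and selecting and storing exemplars for each fault type of this stage, comprises:
    in the initial stage, using the data of task $T_0$
    Figure PCTCN2022118373-appb-100005
    to train ResNet-32 to learn the fault types $C_0$ and obtain the initial diagnosis model $\Theta_0$, where
    Figure PCTCN2022118373-appb-100006
    the loss function of the initial diagnosis model $\Theta_0$ being:
    Figure PCTCN2022118373-appb-100007
    where $\delta$ is the ground-truth label;
    after the model training of this stage is completed, using the feature extractor $F_0$ to select training samples through the herding algorithm as the exemplars of the fault types learned in this stage.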
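  4. The cross-equipment incremental bearing fault diagnosis method based on continual learning according to claim 3, characterized in that using the feature extractor $F_0$ to select training samples through the herding algorithm as the exemplars of the fault types learned in this stage comprises:
    letting
    Figure PCTCN2022118373-appb-100008
    denote the training samples of fault type $c$, the class mean of $c$ then being
    Figure PCTCN2022118373-appb-100009
    where $P_c$ is the number of training samples of class $c$, and each exemplar $\varepsilon$ is computed through
    Figure PCTCN2022118373-appb-100010
    which gives the exemplars of class $c$
    Figure PCTCN2022118373-appb-100011
    where $t$ denotes the number of exemplars.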
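  5. The cross-equipment incremental bearing fault diagnosis method based on continual learning according to claim 1, characterized in that modifying the initial-stage diagnosis model by introducing neuron-level fine-tuning to obtain the dual-branch residual adaptive aggregation network comprises:
    replacing the single-branch ResNet-32 of the initial-stage diagnosis model with the dual-branch residual adaptive aggregation network, wherein the dual-branch residual adaptive aggregation network comprises a dynamic branch and a stable branch, the dynamic branch using parameter-level fine-tuning and the stable branch using neuron-level fine-tuning.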
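  6. The cross-equipment incremental bearing fault diagnosis method based on continual learning according to claim 5, characterized in that the dynamic branch using parameter-level fine-tuning and the stable branch using neuron-level fine-tuning comprises:
    in the parameter-level fine-tuning used by the dynamic branch, initializing the branch with the parameters of the initial diagnosis model during training and then fine-tuning all parameters of the branch on the training data;
    in the neuron-level fine-tuning used by the stable branch, freezing the network parameters after initialization with the parameters of the initial diagnosis model, assigning each neuron a scaling weight, and fine-tuning the scaling weights on the tasks of each stage.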
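  7. The cross-equipment incremental bearing fault diagnosis method based on continual learning according to claim 1, characterized in that the method of classifying with the nearest-mean-of-exemplars classifier or the cosine normalization classifier in place of the fully connected classifier of the initial diagnosis model comprises:
    the nearest-mean-of-exemplars classifier classifying by computing the mean feature of the exemplars of each class
    Figure PCTCN2022118373-appb-100012
    where $y=0,1,\ldots,J_n+K_n-1$, $P_y$ is the number of samples in $m_y$, and $F(\cdot)$ is the feature extractor of the current stage, the fault type of an input fault sample $x$ being predicted through
    Figure PCTCN2022118373-appb-100013
    the cosine normalization classifier classifying by computing the cosine similarity between the feature and the learned prototype of each class, the predicted probability that input $x$ belongs to class $c$ being computed in the initial stage through
    Figure PCTCN2022118373-appb-100014
    where $\theta_0$ denotes the parameters of the fully connected classification layer in the initial stage and $h_0=F_0(x)$ is the feature extracted in the initial stage, and the predicted probability that input $x$ belongs to class $c$ being computed in the incremental stage through
    Figure PCTCN2022118373-appb-100015
    where $\theta_n$ denotes the learned prototype of each class, $h_n=F_n(x)$ is the feature extracted in incremental stage $n$,
    Figure PCTCN2022118373-appb-100016
    denotes $l_2$ normalization,
    Figure PCTCN2022118373-appb-100017
    and $\eta$ is a learnable scaling parameter.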
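  8. The cross-equipment incremental bearing fault diagnosis method based on continual learning according to claim 6, characterized in that training the diagnosis model jointly on the stored exemplars and the bearing fault diagnosis task data of the next device comprises:
    training the dual-branch residual adaptive aggregation network on the exemplars stored in the initial stage and the bearing fault diagnosis task data of the next device, and assigning the adaptive aggregation weights $\omega_\alpha$ and $\omega_\beta$ to the dynamic residual block and the stable residual block of each residual block layer, respectively;
    extracting features from the training data $x^{[0]}$ with the dual-branch residual adaptive aggregation network to which the adaptive aggregation weights have been assigned, the features extracted at the $n$-th residual block layer by the dynamic residual block and the stable residual block being, respectively,
    Figure PCTCN2022118373-appb-100018
    and the aggregated feature of the $n$-th residual block layer being
    Figure PCTCN2022118373-appb-100019
    where $W_0$ denotes the parameters frozen in the initial stage, $f^{[n]}$ is the feature extraction process of the $n$-th residual block layer, and
    Figure PCTCN2022118373-appb-100020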
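  9. The cross-equipment incremental bearing fault diagnosis method based on continual learning according to claim 1, characterized in that the loss function of the incremental stage comprises a categorical cross-entropy loss function, a classification-level knowledge distillation loss function and a feature-level knowledge distillation loss function; the categorical cross-entropy loss function is
    Figure PCTCN2022118373-appb-100021
    the classification-level knowledge distillation loss function is
    Figure PCTCN2022118373-appb-100022
    where
    Figure PCTCN2022118373-appb-100023
    Figure PCTCN2022118373-appb-100024
    Figure PCTCN2022118373-appb-100025
    are, respectively, the soft labels of the old model and the soft predictions of the new model, and $T$ denotes the temperature parameter; the feature-level knowledge distillation loss is
    Figure PCTCN2022118373-appb-100026
    where
    Figure PCTCN2022118373-appb-100027
    Figure PCTCN2022118373-appb-100028
    are the normalized features extracted by the current-stage diagnosis model and the previous-stage diagnosis model (the new and old models), respectively, and
    Figure PCTCN2022118373-appb-100029
    measures the cosine similarity between the two.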
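  10. The cross-equipment incremental bearing fault diagnosis method based on continual learning according to claim 1, characterized in that optimizing the aggregation weights and model parameters through the bilevel optimization scheme comprises:
    the bilevel optimization scheme comprising the upper-level problem
    Figure PCTCN2022118373-appb-100030
    and the lower-level problem
    Figure PCTCN2022118373-appb-100031
    the lower-level problem updating the model parameters $\Theta_n$ through
    Figure PCTCN2022118373-appb-100032
    where $\mu_1$ is the learning rate of the lower-level problem;
    the upper-level problem randomly sampling the task data $D_n$ learned in this stage to obtain
    Figure PCTCN2022118373-appb-100033
    constructing the balanced data
    Figure PCTCN2022118373-appb-100034
    and updating the adaptive aggregation weights through
    Figure PCTCN2022118373-appb-100035
    where $\mu_2$ is the learning rate of the upper-level problem.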
PCT/CN2022/118373 2022-07-25 2022-09-13 基于持续学习的跨设备增量轴承故障诊断方法 WO2024021246A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210879607.6A CN115270956B (zh) 2022-07-25 2022-07-25 基于持续学习的跨设备增量轴承故障诊断方法
CN202210879607.6 2022-07-25

Publications (1)

Publication Number Publication Date
WO2024021246A1 true WO2024021246A1 (zh) 2024-02-01

Family

ID=83770047

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/118373 WO2024021246A1 (zh) 2022-07-25 2022-09-13 基于持续学习的跨设备增量轴承故障诊断方法

Country Status (2)

Country Link
CN (1) CN115270956B (zh)
WO (1) WO2024021246A1 (zh)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115965057B (zh) * 2022-11-28 2023-09-29 北京交通大学 一种面向列车传动系统的类脑持续学习故障诊断方法
CN116089883B (zh) * 2023-01-30 2023-12-19 北京邮电大学 用于提高已有类别增量学习新旧类别区分度的训练方法
CN116399589B (zh) * 2023-03-29 2024-01-12 哈尔滨理工大学 基于rcmwe的滚动轴承微弱信号特征提取方法
CN116625689B (zh) * 2023-05-24 2023-12-22 石家庄铁道大学 基于smder的滚动轴承故障诊断方法及系统
CN117313000B (zh) * 2023-09-19 2024-03-15 北京交通大学 一种基于样本表征拓扑的电机类脑学习故障诊断方法
CN117150377B (zh) * 2023-11-01 2024-02-02 北京交通大学 基于全自主动机偏移的电机故障诊断阶梯式学习方法
CN117313251B (zh) * 2023-11-30 2024-03-15 北京交通大学 基于非滞后渐进学习的列车传动装置全局故障诊断方法
CN117407797B (zh) * 2023-12-15 2024-03-29 山东能源数智云科技有限公司 基于增量学习的设备故障诊断方法及模型的构建方法

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110162018A (zh) * 2019-05-31 2019-08-23 天津开发区精诺瀚海数据科技有限公司 基于知识蒸馏与隐含层共享的增量式设备故障诊断方法
CN112183581A (zh) * 2020-09-07 2021-01-05 华南理工大学 基于自适应迁移神经网络的半监督机械故障诊断方法
US20210190882A1 (en) * 2019-12-10 2021-06-24 Wuhan University Transformer failure identification and location diagnosis method based on multi-stage transfer learning
CN113935406A (zh) * 2021-09-27 2022-01-14 苏州大学 基于对抗流模型的机械设备无监督故障诊断方法
CN114429153A (zh) * 2021-12-31 2022-05-03 苏州大学 基于终身学习的齿轮箱增量故障诊断方法及系统

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117668623A (zh) * 2024-02-02 2024-03-08 中国海洋大学 船舶管道阀门泄漏多传感器跨域故障诊断方法
CN117668623B (zh) * 2024-02-02 2024-05-14 中国海洋大学 船舶管道阀门泄漏多传感器跨域故障诊断方法
CN117872038A (zh) * 2024-03-11 2024-04-12 浙江大学 一种基于图论的直流微电网失稳故障源定位方法及装置
CN117872038B (zh) * 2024-03-11 2024-05-17 浙江大学 一种基于图论的直流微电网失稳故障源定位方法及装置

Also Published As

Publication number Publication date
CN115270956A (zh) 2022-11-01
CN115270956B (zh) 2023-10-27
