CN115081468A - Multi-task convolutional neural network fault diagnosis method based on knowledge migration - Google Patents


Info

Publication number
CN115081468A
Authority
CN
China
Prior art keywords
knowledge
grained
coarse
fine
neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110276577.5A
Other languages
Chinese (zh)
Inventor
刘若楠
蒲宇胜
王煜
胡清华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin University
Original Assignee
Tianjin University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin University
Priority to CN202110276577.5A
Publication of CN115081468A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • G06N3/084: Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a multi-task convolutional neural network fault diagnosis method based on knowledge migration, comprising the following steps: step 1: data preprocessing; step 2: learning from a coarse structure to a fine structure; step 3: multi-task coarse-to-fine knowledge transfer. The invention shows significant advantages in large-scale fault diagnosis, because a good initialization of the CNN parameters obtained from the coarse-grained task can effectively avoid poor local minima. Meanwhile, effective discriminative information is retained and passed to the fine-grained task to achieve effective fault identification. Compared with a flat CNN, the PKT-MCNN provided by the application converges to a better local minimum, which verifies the significant influence of progressive knowledge transfer on CNN learning.

Description

Multi-task convolutional neural network fault diagnosis method based on knowledge migration
Technical Field
The invention relates to the technical field of intelligent fault diagnosis, in particular to a multi-task convolutional neural network fault diagnosis method based on knowledge migration.
Background
Fault diagnosis is becoming increasingly important in modern society as an effective tool to maintain safe operation of industrial systems and reduce maintenance costs of unnecessary routine shutdowns. Therefore, various diagnostic methods have been proposed to detect faults early and accurately. In recent years, with the development of sensors and information technology, industry data is rapidly accumulated, and Deep Learning (DL) based diagnostic methods are being promoted. Based on a deep framework and a plurality of nonlinear layers, the DL algorithm can adaptively learn high-level representation characteristics, and the inherent defects of the traditional diagnosis method are overcome.
Most deep-learning-based methods work well when diagnosing a small number of fault types, but fail to converge to satisfactory results in large-scale fault diagnosis, because a large number of fault types leads to imbalanced inter-class distances and to local minima in the neural network. In a large-scale diagnosis task that handles many fault types, DL-based diagnosis methods have the following drawbacks. First, DL-based methods tend to fall into a local minimum under random initialization, and handling a large number of fault types aggravates the problem because the number of parameters increases and the structure becomes more complicated. Second, as the label space grows, the upper bound of the generalization error increases, which degrades the final diagnostic performance. Third, from the perspective of algorithm design, an increase in the number of fault categories may cause an imbalance of intra-class and inter-class distances, i.e., large intra-class distances and small inter-class distances. For example, similar faults of one component are close to each other and difficult to distinguish, while faults of different subsystems are far apart and easy to classify. In such large-scale fault diagnosis tasks, the inter-class distance of some similar faults may therefore be even smaller than the intra-class distance of other faults, which makes conventional deep learning methods difficult to apply. Large-scale fault diagnosis of complex systems with many fault types has thus remained a difficult problem.
Disclosure of Invention
The application provides a multi-task convolutional neural network fault diagnosis method based on knowledge migration, which overcomes the current shortcomings of fault diagnosis for large complex systems. Using known fault information, it learns fault information into the network from coarse to fine by means of transfer learning, and thereby obtains better performance.
The invention provides a multi-task convolutional neural network fault diagnosis method based on knowledge migration,
the method comprises the following steps:
step 1: preprocessing data;
step 2: learning from a coarse structure to a fine structure;
step 3: multi-task coarse-to-fine knowledge transfer.
Preferably, step 1 specifically comprises: processing the input sample signal to obtain an input sample matrix.
Preferably, step 2 specifically comprises:
step 21: constructing a similarity graph of the fault types;
step 22: spectral clustering of the fault types;
step 23: outputting the knowledge structure.
Preferably, steps 21 and 22 specifically comprise: clustering similar fault types to form coarse-grained knowledge.
Preferably, step 23 specifically includes:
step 231: training a neural network for extracting coarse-grained knowledge;
step 232: extracting coarse-grained knowledge;
step 233: obtaining a network that has extracted the coarse-grained knowledge;
step 234: transferring the knowledge of this network to a fine-grained network through transfer learning.
Preferably, step 3 specifically comprises:
step 31: constructing a multi-task convolutional neural network based on progressive knowledge transfer (PKT-MCNN);
step 32: training the PKT-MCNN model by progressive knowledge transfer;
step 33: testing a new sample with the trained PKT-MCNN model;
step 34: outputting the fault type of the test sample.
Preferably, in step 3, the PKT-MCNN training process includes three stages:
(1) training a coarse-grained task to master coarse-grained knowledge;
(2) simultaneously training coarse-grained and fine-grained tasks in a multi-task mode;
(3) fine-grained tasks are trained individually by fine-tuning the parameters updated in stage (2).
Preferably, in step 3, the PKT algorithm is embedded on top of the multi-task CNN.
The PKT adjusts the attention paid to the two tasks, i.e., training the coarse-grained task and training the fine-grained task, through a weight parameter λ.
Specifically,
in the first stage, λ is set to 1 and only the coarse-grained task is learned; since the weight of the fine-grained task is 0, its loss is not back-propagated;
in the second stage, λ is gradually reduced to 0 as the number of training epochs increases, so that the two tasks are learned simultaneously and coarse-grained knowledge is gradually transferred to the fine-grained task; the second-stage λ, denoted λ_2, is calculated from the following equation:
Figure BDA0002976882490000031
where B1 and B3 are the numbers of training epochs of the first stage and the second stage, respectively; b is the current epoch number, and Bmax is the maximum epoch number;
finally, in the third stage, the value of λ is the opposite of that in the first stage: λ is set to 0 so that the fine-grained task is fine-tuned without updating the parameters of the coarse-grained task.
Compared with the prior art, the invention has the following advantages.
The invention shows significant advantages in large-scale fault diagnosis, because a good initialization of the CNN parameters obtained from the coarse-grained task can effectively avoid poor local minima. Meanwhile, effective discriminative information is retained and passed to the fine-grained task to achieve effective fault identification. Compared with a flat CNN, the PKT-MCNN provided by the application converges to a better local minimum, which verifies the significant influence of progressive knowledge transfer on CNN learning.
The method can intelligently learn a reasonable knowledge structure consistent with the physical composition of the nuclear power system, is very good at handling large-scale fault diagnosis under a special physical background, and extracts and transfers coarse-grained knowledge to a final fine-grained diagnosis task. The method provided by the application can become a promising tool in future industrial big data analysis research.
Drawings
FIG. 1 is a flow chart of a method of the present application;
FIG. 2 is a flow chart of a coarse-to-fine structure learning method of the present application;
FIG. 3 is a diagram of the PKT-MCNN model architecture of the present application.
Detailed Description
It should be noted that, in the present application, the embodiments and features of the embodiments may be combined with each other without conflict.
The invention is described in further detail below with reference to the figures and specific examples. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
As shown in fig. 1, a method for diagnosing a fault of a multitask convolutional neural network based on knowledge migration according to an embodiment of the present invention includes the following steps:
step 1: preprocessing data;
The input signal is the raw signal of the input device, usually vibration data. It is processed by noise reduction, spectrum extraction, and similar operations to obtain a matrix that is fed to the subsequent algorithm; this matrix is called the input sample matrix.
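The preprocessing step can be sketched as follows. This is only an illustration under stated assumptions: the patent does not name the noise-reduction or spectrum-extraction technique, so a moving-average filter and the FFT magnitude of fixed-length segments are assumed here, and the function name `build_input_matrix` and its parameters are hypothetical.

```python
import numpy as np

def build_input_matrix(signal, segment_len=64, n_segments=32):
    """Turn a 1-D raw signal into a 2-D input sample matrix.

    Assumptions (not taken from the patent): noise reduction is a
    5-point moving average, and spectrum extraction is the magnitude
    of the one-sided FFT of each non-overlapping segment.
    """
    signal = np.asarray(signal, dtype=float)
    # Simple noise reduction: 5-point moving average (hypothetical choice).
    kernel = np.ones(5) / 5.0
    smoothed = np.convolve(signal, kernel, mode="same")
    needed = segment_len * n_segments
    if smoothed.size < needed:
        raise ValueError("signal too short for the requested matrix size")
    # Split into non-overlapping segments, one row per segment.
    segments = smoothed[:needed].reshape(n_segments, segment_len)
    # Spectrum extraction: FFT magnitude of each row.
    return np.abs(np.fft.rfft(segments, axis=1))

# Synthetic vibration-like signal: a 50 Hz sine sampled at 1024 Hz plus noise.
t = np.arange(64 * 32) / 1024.0
rng = np.random.default_rng(0)
raw = np.sin(2 * np.pi * 50 * t) + 0.1 * rng.standard_normal(t.size)
matrix = build_input_matrix(raw)
```

Each row of `matrix` is the spectrum of one segment, so the result can be passed to a CNN as a single-channel two-dimensional input.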
Step 2: learning from a coarse structure to a fine structure;
as shown in fig. 2, the method specifically comprises the following steps:
step 21: constructing a similarity graph of the fault types;
in the present embodiment, the construction of the failure type similarity graph adopts the following method:
To automatically extract the coarse-grained knowledge structure, the information in the data is used to group similar fault types under a common parent node of the structure, which represents a coarse-grained fault concept.
Given the j-th sample of the i-th fault
Figure BDA0002976882490000051
the similarity graph G = (V, E) of the fault types contains the similarity information of every pair of fault types, where each vertex v_i ∈ V is a fault type and each edge e_ij ∈ E is the similarity between the two connected vertices v_i and v_j. To construct this graph, the similarity of each pair of fault types must be calculated. Unlike instance-level clustering, each fault type needs a representation before the similarity can be calculated. In this work, each fault type is vectorized and represented by the centroid of all its samples. Mathematically, the vertex v_i in the similarity graph is expressed as
Figure BDA0002976882490000052
where M_i is the number of samples of the i-th fault. This representation has two advantages for large-scale fault diagnosis. On the one hand, computing the first-order statistic, i.e., the mean vector, is efficient even for many fault samples. On the other hand, representing a fault type by its mean vector mitigates the adverse effect of the noisy samples that are typically present in large data sets. Given this vertex representation, the similarity e_ij between two fault types i and j can be calculated by Gaussian similarity as follows:
Figure BDA0002976882490000053
where σ is the scaling factor that scales the value of e_ij to [0, 1].
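A minimal sketch of the similarity-graph construction described above: each fault type is represented by the mean of its samples, and pairwise Gaussian similarities are computed between these centroids. The exact kernel form is not recoverable from this text, so `exp(-d^2 / (2 * sigma^2))` is an assumption, and the function names are hypothetical.

```python
import numpy as np

def fault_centroids(samples, labels):
    """Represent each fault type by the mean vector of its samples."""
    samples = np.asarray(samples, dtype=float)
    labels = np.asarray(labels)
    types = np.unique(labels)
    centroids = np.stack([samples[labels == t].mean(axis=0) for t in types])
    return types, centroids

def gaussian_similarity(centroids, sigma=1.0):
    """Pairwise Gaussian similarity between centroids.

    Assumed form: exp(-||v_i - v_j||^2 / (2 * sigma^2)).
    """
    diff = centroids[:, None, :] - centroids[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=2)
    return np.exp(-sq_dist / (2.0 * sigma ** 2))

# Toy data: fault types 0 and 1 are close to each other, type 2 is far away.
X = np.array([[0.0, 0.0], [0.2, 0.0],
              [0.0, 1.0], [0.0, 1.2],
              [5.0, 5.0], [5.0, 5.2]])
y = np.array([0, 0, 1, 1, 2, 2])
types, V = fault_centroids(X, y)
G = gaussian_similarity(V, sigma=1.0)
```

In this toy example the entry of `G` for the pair (0, 1) is much larger than the entry for the pair (0, 2), which is the behaviour the clustering step relies on.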
Step 22: spectral clustering of fault types;
Using the similarity graph G, the fault types are assigned to different coarse-grained fault concepts by a clustering technique. The method adopts the normalized cut (NCut) algorithm to cut the graph into several subgraphs, which formally optimizes the following objective function:
Figure BDA0002976882490000054
s.t. H'DH = I
where k is the number of coarse-grained fault concepts to be clustered. If v_i ∈ C_j, then h_ij = 1/|C_j|; otherwise h_ij = 0, where h_ij are the elements of the matrix H, i ∈ {1, 2, ..., N}, and j ∈ {1, 2, ..., k}. L = D - G is the Laplacian matrix, D is the degree matrix of G, I is the identity matrix, and |·| is the cardinality of a set. Shi and Malik showed that this objective can be approximated using the eigenvector of L associated with the second smallest eigenvalue. In this way, k coarse-grained fault concepts C_1, C_2, ..., C_k are obtained and a two-level knowledge structure T is constructed, in which each coarse-grained concept contains several fine-grained fault types.
Through these two steps, similar fault types are clustered to form coarse-grained knowledge.
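The spectral relaxation attributed to Shi and Malik can be illustrated for the simplest case k = 2. The sketch below is not the patent's implementation: it assumes a symmetric similarity matrix with positive degrees, solves the generalized problem L y = λ D y through the symmetric normalized Laplacian, and splits the vertices by the sign of the eigenvector for the second smallest eigenvalue. The function name `ncut_bipartition` is hypothetical.

```python
import numpy as np

def ncut_bipartition(G):
    """Split the vertices of a similarity graph into two groups (k = 2)."""
    G = np.asarray(G, dtype=float)
    d = G.sum(axis=1)                        # vertex degrees
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    L = np.diag(d) - G                       # Laplacian L = D - G
    L_sym = D_inv_sqrt @ L @ D_inv_sqrt      # symmetric normalized Laplacian
    eigvals, eigvecs = np.linalg.eigh(L_sym) # eigenvalues in ascending order
    # Map back to the generalized eigenvector of L y = lambda * D y
    # that belongs to the second smallest eigenvalue.
    y = D_inv_sqrt @ eigvecs[:, 1]
    # Threshold at zero (one common choice) to obtain the two groups.
    return (y > 0).astype(int)

# Four fault types: 0 and 1 are similar, 2 and 3 are similar.
G_demo = np.array([[1.00, 0.90, 0.05, 0.05],
                   [0.90, 1.00, 0.05, 0.05],
                   [0.05, 0.05, 1.00, 0.90],
                   [0.05, 0.05, 0.90, 1.00]])
groups = ncut_bipartition(G_demo)
```

The group ids themselves are arbitrary because the sign of an eigenvector is not unique; only the grouping matters. For k > 2, a common extension is to run k-means on the first k eigenvectors.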
Step 23: outputting the knowledge structure.
Step 23 specifically comprises the following steps:
step 231: training a neural network for extracting coarse-grained knowledge;
step 232: extracting coarse-grained knowledge;
step 233: obtaining a network that has extracted the coarse-grained knowledge;
step 234: transferring the knowledge of this network to a fine-grained network through transfer learning.
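To train the coarse-grained network, each sample needs a coarse-grained label derived from the two-level knowledge structure. The following sketch shows one way to store that structure and to map fine-grained labels to coarse-grained ones; the data layout, the function names, and the fault names are hypothetical and are not taken from the patent.

```python
import numpy as np

def build_knowledge_structure(fault_types, cluster_ids):
    """Return {coarse concept id: [fine-grained fault types]}."""
    structure = {}
    for fault, cluster in zip(fault_types, cluster_ids):
        structure.setdefault(int(cluster), []).append(fault)
    return structure

def coarse_labels_from_fine(fine_labels, fault_types, cluster_ids):
    """Map each sample's fine-grained label to its coarse-grained concept."""
    lookup = {fault: int(c) for fault, c in zip(fault_types, cluster_ids)}
    return np.array([lookup[label] for label in fine_labels])

# Hypothetical fault names and a clustering result for them.
fault_types = ["pump_leak", "pump_jam", "valve_stuck", "valve_leak"]
cluster_ids = [0, 0, 1, 1]
T = build_knowledge_structure(fault_types, cluster_ids)
coarse = coarse_labels_from_fine(
    ["pump_jam", "valve_leak", "pump_leak"], fault_types, cluster_ids)
```

With both label sets available, the same input sample can supervise the coarse-grained task and the fine-grained task.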
Step 3: multi-task coarse-to-fine knowledge transfer.
The method specifically comprises the following steps:
step 31: constructing a multi-task convolutional neural network based on progressive knowledge transfer (PKT-MCNN);
step 32: training the PKT-MCNN model by progressive knowledge transfer;
step 33: testing a new sample with the trained PKT-MCNN model;
step 34: outputting the fault type of the test sample.
In a preferred embodiment, fig. 3 shows the architecture of the PKT-MCNN model of the present application. In step 3, the training process of the PKT-MCNN includes three stages:
(1) training a coarse-grained task to master coarse-grained knowledge;
(2) simultaneously training coarse-grained and fine-grained tasks in a multi-task mode;
(3) fine-grained tasks are trained individually by fine-tuning the parameters updated in stage (2).
The PKT controls and switches between the stages through the weights of the loss layer, where each weight represents the current degree of attention paid to the corresponding task. The loss function used here is the cross-entropy loss, denoted
Figure BDA0002976882490000071
where r_i is 0 or 1, corresponding to the true label, and p_i(·) produces the predicted score, which is a function of the learnable parameters W_f. The loss function L_M of the PKT-MCNN is therefore defined as a weighted combination of the coarse-grained and fine-grained tasks:
Figure BDA0002976882490000072
Figure BDA0002976882490000073
Figure BDA0002976882490000074
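The weighted combination of the two task losses can be sketched in plain NumPy. The exact expressions are given only in the patent figures, so the form λ · L_coarse + (1 - λ) · L_fine used here is an assumption inferred from the stage description (λ = 1 trains only the coarse-grained task, λ = 0 only the fine-grained task); all function names are hypothetical.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax with the usual max-shift for numerical stability."""
    z = logits - np.max(logits, axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(logits, labels):
    """Mean cross-entropy, i.e. -sum_i r_i * log(p_i) with one-hot r."""
    p = softmax(np.asarray(logits, dtype=float))
    n = p.shape[0]
    return float(-np.mean(np.log(p[np.arange(n), labels])))

def pkt_loss(coarse_logits, coarse_labels, fine_logits, fine_labels, lam):
    """Assumed combination: lam * L_coarse + (1 - lam) * L_fine."""
    l_c = cross_entropy(coarse_logits, coarse_labels)
    l_f = cross_entropy(fine_logits, fine_labels)
    return lam * l_c + (1.0 - lam) * l_f

# Two samples, 2 coarse-grained concepts and 3 fine-grained fault types.
coarse_logits = np.array([[2.0, 0.0], [0.0, 2.0]])
coarse_labels = np.array([0, 1])
fine_logits = np.array([[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
fine_labels = np.array([0, 2])
loss_mid = pkt_loss(coarse_logits, coarse_labels, fine_logits, fine_labels, 0.5)
```

In an automatic-differentiation framework, a weight of exactly 0 on one term also zeroes its gradient, which matches the statement that the corresponding loss is not back-propagated.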
Preferably, in step 3, the PKT algorithm is embedded on top of the multi-task CNN.
The PKT adjusts the attention paid to the two tasks, i.e., the coarse-grained task and the fine-grained task, through the weight parameter λ.
Specifically,
in the first stage, λ is set to 1 and only the coarse-grained task is learned; since the weight of the fine-grained task is 0, its loss is not back-propagated;
in the second stage, λ is gradually reduced to 0 as the number of training epochs increases, so that the two tasks are learned simultaneously and coarse-grained knowledge is gradually transferred to the fine-grained task; the second-stage λ, denoted λ_2, is calculated from the following equation:
Figure BDA0002976882490000075
where B1 and B3 are the numbers of training epochs of the first stage and the second stage, respectively; b is the current epoch number, and Bmax is the maximum epoch number;
finally, in the third stage, the value of λ is the opposite of that in the first stage: λ is set to 0 so that the fine-grained task is fine-tuned without updating the parameters of the coarse-grained task.
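The three-stage schedule of λ can be written as a small function of the current epoch. The patent's second-stage formula is available only as a figure, so the linear decay below is an assumption chosen because it satisfies the stated behaviour (λ starts at 1 and is gradually reduced to 0); the name `pkt_lambda` and the parameters `b1` and `b2` are hypothetical.

```python
def pkt_lambda(b, b1, b2):
    """Weight of the coarse-grained task at epoch b (0-based).

    b1, b2: numbers of epochs in stage 1 and stage 2 (b2 must be > 0).
    The stage 2 decay is linear, which is an assumption and not the
    formula from the patent figure.
    """
    if b < b1:
        return 1.0                     # stage 1: coarse-grained task only
    if b < b1 + b2:
        return 1.0 - (b - b1) / b2     # stage 2: decay from 1 toward 0
    return 0.0                         # stage 3: fine-grained task only

# Two epochs of stage 1, four of stage 2, two of stage 3.
schedule = [pkt_lambda(b, b1=2, b2=4) for b in range(8)]
```

Here `schedule` evaluates to `[1.0, 1.0, 1.0, 0.75, 0.5, 0.25, 0.0, 0.0]`. Any other monotonically decreasing schedule would fit the textual description equally well.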
The technical means not described in detail in the present application are known techniques.
The foregoing is only a preferred embodiment of the present invention, and it should be noted that, for those skilled in the art, various modifications and decorations can be made without departing from the principle of the present invention, and these modifications and decorations should also be regarded as the protection scope of the present invention.

Claims (8)

1. A multi-task convolutional neural network fault diagnosis method based on knowledge migration is characterized in that,
the method comprises the following steps:
step 1: preprocessing data;
step 2: learning from a coarse structure to a fine structure;
step 3: multi-task coarse-to-fine knowledge transfer.
2. The knowledge migration based multitask convolutional neural network fault diagnosis method of claim 1,
step 1 specifically comprises: processing the input sample signal to obtain an input sample matrix.
3. The knowledge migration based multitask convolutional neural network fault diagnosis method of claim 1,
the step 2 specifically comprises the following steps:
step 21: constructing a similarity graph of the fault types;
step 22: spectral clustering of the fault types;
step 23: outputting the knowledge structure.
4. The knowledge migration based multitask convolutional neural network fault diagnosis method of claim 3,
steps 21 and 22 specifically comprise: clustering similar fault types to form coarse-grained knowledge.
5. The knowledge migration based multitask convolutional neural network fault diagnosis method of claim 3,
step 23 specifically includes:
step 231: training a neural network for extracting coarse-grained knowledge;
step 232: extracting coarse-grained knowledge;
step 233: obtaining a network that has extracted the coarse-grained knowledge;
step 234: transferring the knowledge of this network to a fine-grained network through transfer learning.
6. The knowledge migration based multitask convolutional neural network fault diagnosis method of claim 1,
the step 3 specifically comprises the following steps:
step 31: constructing a multi-task convolutional neural network based on progressive knowledge transfer (PKT-MCNN);
step 32: training the PKT-MCNN model by progressive knowledge transfer;
step 33: testing a new sample with the trained PKT-MCNN model;
step 34: outputting the fault type of the test sample.
7. The knowledge migration based multitask convolutional neural network fault diagnosis method of claim 6,
in step 3, the PKT-MCNN training process includes three stages:
(1) training a coarse-grained task to master coarse-grained knowledge;
(2) simultaneously training coarse-grained and fine-grained tasks in a multi-task mode;
(3) fine-grained tasks are trained individually by fine-tuning the parameters updated in stage (2).
8. The knowledge migration based multitask convolutional neural network fault diagnosis method of claim 6,
in step 3, the PKT algorithm is embedded on top of the multi-task CNN,
the PKT adjusts the attention paid to the two tasks, i.e., the coarse-grained task and the fine-grained task, through the weight parameter λ,
specifically,
in the first stage, λ is set to 1 and only the coarse-grained task is learned; since the weight of the fine-grained task is 0, its loss is not back-propagated;
in the second stage, λ is gradually reduced to 0 as the number of training epochs increases, so that the two tasks are learned simultaneously and coarse-grained knowledge is gradually transferred to the fine-grained task; the second-stage λ, denoted λ_2, is calculated from the following equation:
Figure FDA0002976882480000021
where B1 and B3 are the numbers of training epochs of the first stage and the second stage, respectively; b is the current epoch number, and Bmax is the maximum epoch number;
finally, in the third stage, the value of λ is the opposite of that in the first stage: λ is set to 0 so that the fine-grained task is fine-tuned without updating the parameters of the coarse-grained task.
CN202110276577.5A 2021-03-15 2021-03-15 Multi-task convolutional neural network fault diagnosis method based on knowledge migration Pending CN115081468A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110276577.5A CN115081468A (en) 2021-03-15 2021-03-15 Multi-task convolutional neural network fault diagnosis method based on knowledge migration

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110276577.5A CN115081468A (en) 2021-03-15 2021-03-15 Multi-task convolutional neural network fault diagnosis method based on knowledge migration

Publications (1)

Publication Number Publication Date
CN115081468A true CN115081468A (en) 2022-09-20

Family

ID=83240950

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110276577.5A Pending CN115081468A (en) 2021-03-15 2021-03-15 Multi-task convolutional neural network fault diagnosis method based on knowledge migration

Country Status (1)

Country Link
CN (1) CN115081468A (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120137367A1 (en) * 2009-11-06 2012-05-31 Cataphora, Inc. Continuous anomaly detection based on behavior modeling and heterogeneous information analysis
US20200167659A1 (en) * 2018-11-27 2020-05-28 Electronics And Telecommunications Research Institute Device and method for training neural network
CN110031227A (en) * 2019-05-23 2019-07-19 桂林电子科技大学 A kind of Rolling Bearing Status diagnostic method based on binary channels convolutional neural networks
CN111221340A (en) * 2020-02-10 2020-06-02 电子科技大学 Design method of migratable visual navigation based on coarse-grained features
CN111444889A (en) * 2020-04-30 2020-07-24 南京大学 Fine-grained action detection method of convolutional neural network based on multi-stage condition influence
CN112488241A (en) * 2020-12-18 2021-03-12 贵州大学 Zero sample picture identification method based on multi-granularity fusion network

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
ZHOUQI MA ET AL.: "Learning via Social Preference A Coarse-to-Fine Training Strategy for Style Transfer Systems", 《2018 IEEE INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOPS (ICDMW)》, 31 December 2018 (2018-12-31) *
陶启生 et al.: "A two-step transfer learning method for bearing fault diagnosis", Computer Engineering and Applications (《计算机工程与应用》), 14 January 2021 (2021-01-14) *

Similar Documents

Publication Publication Date Title
CN108830296B (en) Improved high-resolution remote sensing image classification method based on deep learning
CN110132598B (en) Fault noise diagnosis algorithm for rolling bearing of rotating equipment
CN107505133B (en) The probability intelligent diagnosing method of rolling bearing fault based on adaptive M RVM
CN106895975B (en) Bearing fault diagnosis method based on Stacked SAE deep neural network
CN112926641B (en) Three-stage feature fusion rotating machine fault diagnosis method based on multi-mode data
CN111898095A (en) Deep migration learning intelligent fault diagnosis method and device, storage medium and equipment
CN111914883B (en) Spindle bearing state evaluation method and device based on deep fusion network
CN114048568B (en) Rotary machine fault diagnosis method based on multisource migration fusion shrinkage framework
CN110609524B (en) Industrial equipment residual life prediction model and construction method and application thereof
CN109299741B (en) Network attack type identification method based on multi-layer detection
CN110188047B (en) Double-channel convolutional neural network-based repeated defect report detection method
CN106557782B (en) Hyperspectral image classification method and device based on class dictionary
CN107957946B (en) Software defect prediction method based on neighborhood embedding protection algorithm support vector machine
CN109409425B (en) Fault type identification method based on neighbor component analysis
CN108537257B (en) Zero sample image classification method based on discriminant dictionary matrix pair
CN105678343A (en) Adaptive-weighted-group-sparse-representation-based diagnosis method for noise abnormity of hydroelectric generating set
CN114004252A (en) Bearing fault diagnosis method, device and equipment
CN114355240A (en) Power distribution network ground fault diagnosis method and device
CN110363230A (en) Stacking integrated sewage handling failure diagnostic method based on weighting base classifier
CN110414587A (en) Depth convolutional neural networks training method and system based on progressive learning
CN115112372A (en) Bearing fault diagnosis method and device, electronic equipment and storage medium
Gu et al. Identification of concurrent control chart patterns with singular spectrum analysis and learning vector quantization
CN112464990A (en) Method and device for sensing vibration data based on current and voltage sensor
CN116992953B (en) Model training method, fault diagnosis method and device
CN114357372A (en) Aircraft fault diagnosis model generation method based on multi-sensor data driving

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination