CN116150668A - Rotating equipment fault diagnosis method based on double-stage alignment partial migration network - Google Patents


Info

Publication number
CN116150668A
Authority
CN
China
Prior art keywords
data
domain
source domain
target domain
network
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202211531891.4A
Other languages
Chinese (zh)
Other versions
CN116150668B (en
Inventor
俞昆
战启冉
王雪松
程玉虎
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China University of Mining and Technology CUMT
Original Assignee
China University of Mining and Technology CUMT
Application filed by China University of Mining and Technology CUMT
Priority to CN202211531891.4A
Publication of CN116150668A
Application granted
Publication of CN116150668B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/20 Administration of product repair or maintenance
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/04 Manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Human Resources & Organizations (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Economics (AREA)
  • General Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Tourism & Hospitality (AREA)
  • Strategic Management (AREA)
  • Quality & Reliability (AREA)
  • Operations Research (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Manufacturing & Machinery (AREA)
  • Primary Health Care (AREA)
  • Image Analysis (AREA)
  • Testing And Monitoring For Control Systems (AREA)

Abstract

The invention discloses a rotating equipment fault diagnosis method based on a double-stage alignment partial migration network, intended for rotating equipment fault classification tasks. In terms of network architecture, the partial migration network consists of three basic units: a feature extractor, a domain discriminator and a classifier. The feature extractor is built on a ViT network, and the domain discriminator and the classifier are two independent three-layer fully connected neural networks. In terms of network parameter updating: 1) a weighted balance mechanism is constructed to constrain the feature alignment process between the source domain and the target domain, realizing marginal distribution alignment of the classes shared by the two domains; 2) several metric learning techniques are used to pull closer the feature distances between corresponding subclasses of the shared classes of the source domain and the target domain, realizing state distribution alignment between the shared classes.

Description

Rotating equipment fault diagnosis method based on double-stage alignment partial migration network
Technical Field
The invention belongs to the field of fault diagnosis of rotating equipment, and particularly relates to a rotating equipment fault diagnosis method based on a double-stage alignment partial migration network.
Background
With the continuous advancement of intelligent manufacturing and the growing scale of modern heavy equipment, rotating equipment is becoming more complex, more intelligent, faster and more precise, so the probability of latent faults gradually increases and maintenance becomes more difficult. In the current industrial Internet era, multi-source sensor acquisition networks for rotating equipment are densely interwoven and the volume of monitoring data keeps growing, so accurately diagnosing hidden faults of rotating equipment with intelligent fault diagnosis technology is both necessary and urgent.
Deep learning, a recent development in intelligent fault diagnosis, can autonomously mine the representative diagnostic information hidden in raw data and directly establish an accurate mapping between raw data and operating state, which largely removes the dependence on manual feature design and engineering diagnosis experience. Researchers have proposed a large number of deep-learning-based intelligent fault diagnosis methods for rotating equipment and have achieved good diagnostic results. However, most of these methods rest on two assumptions: 1) the test data and the training data are independent and identically distributed; 2) the task to be diagnosed has sufficient labeled fault samples. At an industrial site, the operating speed, load, environmental noise and other working conditions of rotating equipment change constantly, so the distribution of the monitoring data collected by the sensors also changes continuously. In addition, rotating equipment degrades slowly and gradually over time, so the difference in monitoring data features between the healthy state and the early weak-damage state is hard to detect, and collecting fault samples and labeling them effectively is time-consuming and laborious. Deep-learning-based intelligent fault diagnosis methods therefore cannot be applied directly to the fault diagnosis of rotating equipment at industrial sites.
Intelligent fault diagnosis methods based on transfer learning relax the constraint in deep learning that test data and training data must be independent and identically distributed. By learning knowledge from a previous task and applying it to a new task, transfer learning effectively addresses both the shortage of training data in the new domain and the mismatch between the feature distributions of training data and test data. Researchers have proposed many transfer-learning-based intelligent fault diagnosis methods that transfer diagnostic knowledge across operating conditions, sensor positions and mechanical equipment. However, existing studies all assume that the source domain data and the target domain data share the same label space. In an actual industrial scenario, it is difficult to find source domain and target domain data with identical label information. The operating states covered by the source domain data are often far more numerous than those of the target domain data, and the label space of the target domain data is in most cases a subset of that of the source domain data. Transfer-learning-based intelligent fault diagnosis methods therefore fall short when applied to the fault diagnosis of rotating equipment in actual industrial scenarios.
Partial transfer learning (partial migration learning) limits the contribution of the different classes of source domain data to the feature alignment process. This strengthens feature transfer between the shared-class data of the source domain and the target domain, weakens the influence of source domain outlier-class data on feature transfer, and allows a diagnostic model to be transferred from a large domain containing rich operating state information to a small domain containing only a few operating states. Existing partial migration diagnosis methods still have the following two problems:
1) Most existing methods use the target domain label information given by the classifier to construct weights that constrain the marginal distribution alignment between the shared classes of the source domain and the target domain. However, existing methods ignore the alignment of state distributions among the subclasses of the shared classes of the source domain and the target domain.
2) Most existing methods use convolutional neural networks as the basic network architecture. The local receptive field structure of a convolutional neural network focuses on the local features of different segments of the monitoring data and cannot effectively capture the feature correlations between different segments.
Disclosure of Invention
Purpose of the invention: in view of the above prior art, a rotating equipment fault diagnosis method based on a double-stage alignment partial migration network is provided for rotating equipment fault classification tasks, improving the fault identification accuracy of existing partial migration diagnosis methods for rotating equipment.
The technical scheme is as follows: 1. A rotating equipment fault diagnosis method based on a double-stage alignment partial migration network, characterized by comprising the following steps:
step 1: collecting data with more fault categories under a certain working condition as source domain data, wherein all source domain data are labeled;
step 2: collecting data with fewer fault categories under other working conditions as target domain data, wherein the target domain data are unlabeled;
step 3: inputting the source domain data and the target domain data simultaneously into the constructed two-stage alignment partial migration network, and, by updating the internal parameters of the network, realizing marginal distribution alignment of the shared classes of the source domain and the target domain as well as state distribution alignment among the subclasses of the shared classes;
step 4: selecting part of the data collected under the same working condition as the target domain data as test data, inputting the test data into the trained two-stage alignment partial migration network, and predicting the category of the test data.
2. The rotating equipment fault diagnosis method based on the two-stage alignment partial migration network according to claim 1, wherein in step 1 the labeled source domain dataset is represented as $D_s=\{(x_i^s, y_i^s)\}_{i=1}^{n_s}$, where $x_i^s$ represents the i-th sample of the source domain, $y_i^s$ represents the fault label of the i-th sample, and $n_s$ represents the number of source domain samples; the label space of the source domain dataset $D_s$ is $Y_s$, and the corresponding marginal state distribution is $P_s$.
3. The fault diagnosis method based on the dual-stage alignment partial migration network according to claim 2, wherein in step 2 the unlabeled target domain dataset is represented as $D_t=\{x_i^t\}_{i=1}^{n_t}$, where $x_i^t$ represents the i-th sample of the target domain and $n_t$ represents the number of target domain samples; the label space of the target domain dataset $D_t$ is $Y_t$, and the corresponding marginal state distribution is $P_t$; the target domain label space is a subset of the source domain label space, i.e. $Y_t \subset Y_s$. The marginal state distributions of the source domain data and the target domain data differ, i.e. $P_s \neq P_t$, and the marginal state distribution of the source domain shared-class data also differs from that of the target domain data, i.e. $P_{s,c} \neq P_t$.
4. The rotating equipment fault diagnosis method based on the two-stage alignment partial migration network according to claim 3, wherein in step 3 the network structure of the two-stage alignment partial migration network comprises three basic units, namely a feature extractor G, a domain discriminator D and a classifier C; the feature extractor G is built on a ViT network, and the domain discriminator D and the classifier C are both three-layer fully connected neural networks;
the constructed ViT network comprises an input processing module, a Transformer encoder module and a classification module; in the input processing module, let the collected monitoring data be a one-dimensional vibration signal $x \in \mathbb{R}^{L}$, where $\mathbb{R}^{L}$ denotes the L-dimensional space. First, the vibration signal is divided equally into Z segments, i.e. Z tokens, each containing S sampling points with $L = Z \times S$, which gives the ViT network input in $\mathbb{R}^{Z \times S}$. Second, the segmented vibration signal is fed into a fully connected neural network with S input nodes and N output nodes to obtain the processed feature $x_f \in \mathbb{R}^{Z \times N}$. Then, the feature $x_f$ is concatenated with a class token with trainable parameters in $\mathbb{R}^{1 \times N}$ to obtain the combined feature $x_{cb} \in \mathbb{R}^{(Z+1) \times N}$. Finally, trainable parameters in $\mathbb{R}^{(Z+1) \times N}$, regarded as the position embedding, are added to the feature $x_{cb}$ to obtain the input feature of the subsequent Transformer encoder module, which also lies in $\mathbb{R}^{(Z+1) \times N}$.
The Transformer encoder module consists of a multi-head self-attention mechanism, residual connections, layer normalization and a fully connected neural network, and uses the multi-head self-attention mechanism to mine global feature information in the monitoring data; the output feature of the Transformer encoder module has the same dimension as its input feature, i.e. it lies in $\mathbb{R}^{(Z+1) \times N}$. In the classification module, the feature at the class token position of the Transformer encoder output, which lies in $\mathbb{R}^{1 \times N}$, is selected and fed into a fully connected neural network to obtain the ViT network output feature.
5. The rotating equipment fault diagnosis method based on the two-stage alignment partial migration network according to claim 4, wherein in step 3 a weighted balance mechanism is used to constrain the feature alignment process of the source domain and the target domain, realizing marginal distribution alignment of the shared classes of the source domain and the target domain; in the weighted balance mechanism, a class weight coefficient is constructed from the target domain label probability distribution given by the classifier and is used to evaluate how likely each source domain class is to be a shared class; the weight coefficient constrains the parameter updating process of the partial migration network: source domain shared-class data receive a larger weight, which strengthens feature transfer between the shared-class data of the source domain and the target domain, and source domain outlier-class data receive a smaller weight, which weakens their influence on feature transfer.
6. The rotating equipment fault diagnosis method based on the two-stage alignment partial migration network according to claim 5, wherein the marginal distribution alignment process specifically comprises the following steps:
first, the label information given by the classifier for all target domain data is accumulated to construct the class weight coefficient $\gamma$:
$$\gamma=\sum_{i=1}^{n_t}\hat{y}_i^t$$
where $\hat{y}_i^t=[\hat{y}_{i,1}^t,\dots,\hat{y}_{i,k}^t,\dots,\hat{y}_{i,K}^t]$ is the target domain label probability distribution given by the classifier softmax layer, $K$ is the number of source domain label classes, and $\hat{y}_{i,k}^t$ is the probability, given by the classifier softmax layer, that the target domain sample belongs to the k-th class;
then, the class weight coefficient is normalized:
$$\gamma_n=\frac{\gamma}{\max_\gamma}$$
where $\max_\gamma=\max(\gamma)=\max([\gamma_1,\dots,\gamma_k,\dots,\gamma_K])$ is the maximum class weight coefficient, $\gamma_k$ is the class weight coefficient of the k-th class, and $\gamma_n$ is the normalized class weight coefficient;
finally, the normalized class weight coefficient is used to constrain the adversarial training process of the partial migration network, realizing marginal distribution alignment between the shared classes of the source domain and the target domain:
Figure BDA0003974406190000045
Figure BDA0003974406190000046
Figure BDA0003974406190000047
where $\theta_g$, $\theta_d$ and $\theta_c$ represent the network parameters of the feature extractor G, the domain discriminator D and the classifier C, respectively; $d_i$ represents the domain label of the i-th sample; $L_y$ and $L_d$ represent the cross-entropy loss functions of the classifier and the domain discriminator; $\lambda$ is a weight parameter that balances the two loss functions. The partial migration network introduces a gradient reversal layer between the feature extractor and the domain discriminator: the gradient of the domain classification loss in the domain discriminator is automatically reversed before it is back-propagated to the feature extractor parameters, so that adversarial training of the network parameters is realized end to end.
7. The rotating equipment fault diagnosis method based on the two-stage alignment partial migration network according to claim 6, wherein in the step 3, the state distribution alignment process specifically includes the following steps:
first, a triplet loss function is used to constrain the parameter updating process of the feature extractor and the classifier, reducing the distance between the label probability distributions of source domain data of the same class, enlarging the distance between the label probability distributions of source domain data of different classes, and gathering the label probability distributions of each source domain class into clusters:
Figure BDA0003974406190000051
Figure BDA0003974406190000052
where $x_a$ represents a randomly selected anchor sample, $x_p$ represents a positive sample with the same label as the anchor sample, $x_n$ represents a negative sample with a label different from the anchor sample, and margin is the preset distance between the positive sample and the negative sample;
second, K-means clustering is applied to the source domain label probability distributions to identify their clustering structure; the internal parameters of the K-means clustering model are updated as follows:
Figure BDA0003974406190000053
where $\theta_{clu}$ represents the internal parameters of the K-means clustering model,
Figure BDA0003974406190000054
represents the k-th class of source domain data, and $U_k$ represents the cluster center corresponding to the label probability distribution of the k-th class of source domain data. In the K-means clustering model, the number of clusters is set equal to the number of source domain label classes, and the internal parameters are adjusted so that the label probability distributions of the different source domain classes gather around their corresponding cluster centers. Further, the proximity criterion is used to determine the membership between the target domain data and the source domain clusters:
Figure BDA0003974406190000055
Figure BDA0003974406190000056
where
Figure BDA0003974406190000057
is the cluster center to which the target domain data belongs, as estimated with the proximity criterion; when the L2 norm between the target domain label probability distribution and the cluster center $U_k$ of the k-th class of source domain data is the smallest, the target domain label probability distribution belongs to the k-th cluster and the class label of the target domain data is k;
finally, the L2 norm is used to pull closer the distance between the target domain label probability distribution and the cluster center of the same source domain class, realizing state distribution alignment between the shared classes of the source domain and the target domain:
Figure BDA0003974406190000061
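The clustering, proximity and alignment steps above can be illustrated with a minimal sketch. It is an assumption-laden illustration rather than the patented formulas, which survive only as image placeholders in this text: the per-class mean of the source label probability vectors stands in for a fitted K-means center $U_k$, every class is assumed to have at least one source sample, and the names `class_centers`, `nearest_center` and `alignment_loss` are hypothetical.

```python
import math

def l2(u, v):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def class_centers(source_probs, source_labels, K):
    """Mean label probability vector per source class.

    Assumption: this class mean is used in place of an iteratively
    fitted K-means center, with one cluster per source class.
    """
    dims = len(source_probs[0])
    sums = [[0.0] * dims for _ in range(K)]
    counts = [0] * K
    for p, y in zip(source_probs, source_labels):
        counts[y] += 1
        for j in range(dims):
            sums[y][j] += p[j]
    if 0 in counts:
        raise ValueError("every source class needs at least one sample")
    return [[s / c for s in row] for row, c in zip(sums, counts)]

def nearest_center(p, centers):
    """Proximity criterion: index k of the center with the smallest L2 distance."""
    return min(range(len(centers)), key=lambda k: l2(p, centers[k]))

def alignment_loss(target_probs, centers):
    """Average L2 distance from each target vector to its assigned center."""
    total = 0.0
    for p in target_probs:
        total += l2(p, centers[nearest_center(p, centers)])
    return total / len(target_probs)

# Toy data: two source classes, label probability vectors of length 2.
source_probs = [[0.9, 0.1], [0.7, 0.3], [0.2, 0.8]]
source_labels = [0, 0, 1]
centers = class_centers(source_probs, source_labels, K=2)
pseudo_labels = [nearest_center(p, centers) for p in [[0.6, 0.4], [0.1, 0.9]]]
```

Minimizing a loss of this kind with respect to the network parameters would move each target domain prediction toward the center of its assigned source class, which is the behavior the text attributes to the L2-norm alignment.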
Beneficial effects: 1) State distribution alignment between the shared classes of the source domain and the target domain is realized with several metric learning techniques. The method uses a triplet loss function to constrain the parameter updating process of the partial migration network so that the label probability distributions of each source domain class gather into clusters; it identifies the clustering structure of the source domain data with K-means clustering and determines the membership between the target domain data and the source domain clusters with the proximity criterion; and it uses the L2 norm to pull closer the distance between the target domain label probability distribution and the cluster center of the same source domain class, realizing state distribution alignment between the shared classes of the source domain and the target domain.
2) Global feature information of the monitoring data is extracted with a Vision Transformer (ViT) network. The ViT network extracts global feature information with a self-attention mechanism, without relying on convolution, and achieves good classification results. The method modifies the input processing module of the original ViT network so that the features of the one-dimensional monitoring data produced by the input processing module meet the input requirements of the subsequent Transformer encoder module. The ViT network is used to fully mine the feature correlations among different segments of the monitoring data and to extract the global feature information of the monitoring data.
Drawings
FIG. 1 is a flow chart of the method of the present invention;
FIG. 2 is a block diagram of a dual stage alignment partial migration network;
FIG. 3 is a schematic diagram of the marginal distribution alignment process and the state distribution alignment process.
Detailed Description
The invention is further explained below with reference to the drawings.
As shown in fig. 1, a rotating equipment fault diagnosis method based on a two-stage alignment partial migration network includes the following steps:
step 1: and collecting data with more fault categories under a certain working condition as source domain data, wherein the source domain data are all labeled data.
The labeled source domain dataset is represented as
Figure BDA0003974406190000062
wherein />
Figure BDA0003974406190000063
Represents the i-th sample of the source domain, +.>
Figure BDA0003974406190000064
Failure tag indicating ith sample, n s Representing the number of source domain samples; source Domain dataset +.>
Figure BDA0003974406190000065
Is Y s The corresponding edge state distribution is P s
Step 2: and collecting data with fewer fault categories under other working conditions as target domain data, wherein the target domain data are unlabeled data.
The unlabeled target domain dataset is represented as
Figure BDA0003974406190000066
wherein />
Figure BDA0003974406190000067
Represents the i-th sample of the target domain, n t Representing the number of target domain samples; target Domain dataset +.>
Figure BDA0003974406190000071
Is Y t The corresponding edge state distribution is P t The method comprises the steps of carrying out a first treatment on the surface of the The target domain label space is a subset of the source domain label space, i.e. +.>
Figure BDA0003974406190000072
The edge state distribution of the source domain data and the target domain data is different, namely P s ≠P t The edge state distribution of the source domain sharing category data and the target domain data also has a difference, namely P s,c ≠P t
Step 3: and simultaneously inputting the source domain data and the target domain data into the constructed double-stage alignment part migration network, and realizing the edge distribution alignment of the sharing category of the source domain and the target domain and the state distribution alignment among all subclasses of the sharing category of the source domain and the target domain by updating the internal parameters of the double-stage alignment part migration network.
As shown in fig. 2, the network structure of the dual-stage alignment partial migration network includes three basic units, namely a feature extractor G, a domain discriminator D and a classifier C; the feature extractor G is constructed by using a ViT (Vision Transformer) network, and the domain discriminator D and the classifier C are all three-layer fully-connected neural networks.
The constructed ViT network comprises an input processing module, a Transformer encoder module and a classification module. In the input processing module, because the collected monitoring data is a one-dimensional vibration signal, the signal must be segmented to meet the input feature requirements of the subsequent Transformer encoder module. Let the collected monitoring data be a one-dimensional vibration signal $x \in \mathbb{R}^{L}$, where $\mathbb{R}^{L}$ denotes the L-dimensional space. First, the vibration signal is divided equally into Z segments, i.e. Z tokens, each containing S sampling points with $L = Z \times S$, which gives the ViT network input in $\mathbb{R}^{Z \times S}$. Second, the segmented vibration signal is fed into a fully connected neural network with S input nodes and N output nodes to obtain the processed feature $x_f \in \mathbb{R}^{Z \times N}$. Then, the feature $x_f$ is concatenated with a class token with trainable parameters in $\mathbb{R}^{1 \times N}$ to obtain the combined feature $x_{cb} \in \mathbb{R}^{(Z+1) \times N}$. Finally, trainable parameters in $\mathbb{R}^{(Z+1) \times N}$, regarded as the position embedding, are added to the feature $x_{cb}$ to obtain the input feature of the subsequent Transformer encoder module, which also lies in $\mathbb{R}^{(Z+1) \times N}$.
The Transformer encoder module and the classification module of the constructed ViT network are the same as those of the conventional ViT network. The Transformer encoder module consists of a multi-head self-attention mechanism, residual connections, layer normalization and a fully connected neural network, and uses the multi-head self-attention mechanism to mine global feature information in the monitoring data; the output feature of the Transformer encoder module has the same dimension as its input feature, i.e. it lies in $\mathbb{R}^{(Z+1) \times N}$. In the classification module, the feature at the class token position of the Transformer encoder output, which lies in $\mathbb{R}^{1 \times N}$, is selected and fed into a fully connected neural network to obtain the ViT network output feature.
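The input processing steps described here (segmenting, linear embedding, class token, position embedding) can be sketched in plain Python. This is only an illustration under stated assumptions: the helper names `segment_signal`, `linear` and `build_encoder_input`, the random initial weights and the toy sizes L = 12, Z = 4, N = 5 are all hypothetical and do not come from the patent.

```python
import random

def segment_signal(x, Z):
    """Split a length-L signal into Z equal segments of S = L // Z points."""
    L = len(x)
    if L % Z != 0:
        raise ValueError("L must equal Z * S")
    S = L // Z
    return [x[i * S:(i + 1) * S] for i in range(Z)]

def linear(tokens, W, b):
    """Fully connected layer applied to each token: S inputs -> N outputs."""
    return [[sum(t[s] * W[s][n] for s in range(len(t))) + b[n]
             for n in range(len(b))] for t in tokens]

def build_encoder_input(x, Z, W, b, cls_token, pos_embed):
    tokens = segment_signal(x, Z)        # Z x S
    x_f = linear(tokens, W, b)           # Z x N
    x_cb = [list(cls_token)] + x_f       # (Z + 1) x N, class token first
    # Add the position embedding element by element.
    return [[v + p for v, p in zip(row, prow)]
            for row, prow in zip(x_cb, pos_embed)]

rng = random.Random(0)
L, Z, N = 12, 4, 5                       # toy sizes, chosen for illustration
S = L // Z
x = [rng.uniform(-1, 1) for _ in range(L)]
W = [[rng.gauss(0, 0.1) for _ in range(N)] for _ in range(S)]
b = [0.0] * N
cls_token = [0.0] * N                    # trainable in a real network
pos_embed = [[rng.gauss(0, 0.02) for _ in range(N)] for _ in range(Z + 1)]
z0 = build_encoder_input(x, Z, W, b, cls_token, pos_embed)
```

In a trained network the weights, the class token and the position embedding would all be learned parameters; here they are fixed values so that the shape bookkeeping is easy to follow.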
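The shape-preserving behavior of the encoder and the selection of the class token feature can be illustrated with a deliberately reduced sketch. It assumes a single attention head with identity query, key and value projections and keeps only the residual connection; layer normalization, multiple heads and the fully connected sub-layer named in the text are omitted, and the function names are hypothetical.

```python
import math

def softmax(v):
    m = max(v)
    e = [math.exp(a - m) for a in v]
    s = sum(e)
    return [a / s for a in e]

def self_attention(z):
    """Scaled dot-product self-attention over tokens z ((Z+1) x N).

    Simplification: queries, keys and values are the tokens themselves.
    """
    d = len(z[0])
    out = []
    for q in z:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in z]
        w = softmax(scores)
        out.append([sum(w[j] * z[j][c] for j in range(len(z)))
                    for c in range(d)])
    return out

def encoder_block(z):
    """Attention followed by a residual connection; output shape equals input shape."""
    att = self_attention(z)
    return [[a + b for a, b in zip(row, arow)] for row, arow in zip(z, att)]

z = [[0.1, 0.2], [0.3, -0.1], [0.0, 0.5]]   # toy (Z+1) x N input, class token first
out = encoder_block(z)
cls_feature = out[0]                         # feature at the class token position
```

Because every token attends to every other token, each output row mixes information from all segments of the signal, which is the global-correlation property the text contrasts with the local receptive field of a convolution.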
As shown in fig. 3, a weighted balance mechanism is used to constrain the feature alignment process of the source domain and the target domain, realizing marginal distribution alignment of the shared classes of the source domain and the target domain. In the weighted balance mechanism, a class weight coefficient is constructed from the target domain label probability distribution given by the classifier and is used to evaluate how likely each source domain class is to be a shared class. The weight coefficient constrains the parameter updating process of the partial migration network: source domain shared-class data receive a larger weight, which strengthens feature transfer between the shared-class data of the source domain and the target domain, and source domain outlier-class data receive a smaller weight, which weakens their influence on feature transfer. The marginal distribution alignment process specifically comprises the following steps:
First, the label information given by the classifier for all target domain data is accumulated to construct the class weight coefficient $\gamma$, which reduces the influence of any single classifier prediction error:
$$\gamma=\sum_{i=1}^{n_t}\hat{y}_i^t$$
where $\hat{y}_i^t=[\hat{y}_{i,1}^t,\dots,\hat{y}_{i,k}^t,\dots,\hat{y}_{i,K}^t]$ is the target domain label probability distribution given by the classifier softmax layer, $K$ is the number of source domain label classes, and $\hat{y}_{i,k}^t$ is the probability, given by the classifier softmax layer, that the target domain sample belongs to the k-th class.
Then, the class weight coefficient is normalized, which amplifies the difference between the shared classes of the source domain and the target domain and the outlier classes of the source domain:
$$\gamma_n=\frac{\gamma}{\max_\gamma}$$
where $\max_\gamma=\max(\gamma)=\max([\gamma_1,\dots,\gamma_k,\dots,\gamma_K])$ is the maximum class weight coefficient, $\gamma_k$ is the class weight coefficient of the k-th class, and $\gamma_n$ is the normalized class weight coefficient.
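A minimal sketch of the class weight computation follows. It assumes that "accumulating" means summing the softmax outputs over all target domain samples; using the mean instead would give the same normalized weights, because the division by the maximum cancels the constant factor. The function name `class_weights` and the toy probabilities are hypothetical.

```python
def class_weights(target_probs):
    """Normalized class weights from target-domain softmax outputs.

    target_probs: list of n_t probability vectors, each of length K.
    Returns gamma_n, where the most likely shared class has weight 1.0.
    """
    K = len(target_probs[0])
    # Accumulate the predicted probability of each class over all target samples.
    gamma = [sum(p[k] for p in target_probs) for k in range(K)]
    g_max = max(gamma)
    # Divide by the maximum so that weights lie in (0, 1].
    return [g / g_max for g in gamma]

# Toy example: three target samples, K = 3 source classes.
# Class 2 is rarely predicted, so it behaves like a source outlier class.
probs = [[0.7, 0.2, 0.1],
         [0.6, 0.3, 0.1],
         [0.2, 0.7, 0.1]]
gamma_n = class_weights(probs)
```

In this toy case the third class receives a weight of about 0.2, so source samples of that class would contribute little to the adversarial alignment.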
Finally, the regularized class weight coefficient constrains the adversarial training process of the partial migration network, which strengthens the feature alignment between the shared-class data of the source and target domains, weakens the influence of source domain outlier-class data on feature migration, and aligns the edge distributions of the classes shared by the source and target domains:
Figure BDA0003974406190000085
Figure BDA0003974406190000086
Figure BDA0003974406190000087
where θ_g, θ_d and θ_c denote the network parameters of the feature extractor G, the domain discriminator D and the classifier C; d_i denotes the domain label of the i-th sample; L_y and L_d denote the cross-entropy loss functions of the classifier and the domain discriminator; and λ is a weight parameter that balances the two loss functions. The partial migration network introduces a gradient reversal layer between the feature extractor and the domain discriminator: the gradient of the domain classification loss in the domain discriminator is automatically reversed before it is back-propagated to the parameters of the feature extractor, so that adversarial training of the network parameters is realized end to end.
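Because the loss equations are available only as images, the sketch below illustrates one common way to write a class-weighted adversarial objective. It may differ in detail from the patent's formula (3). It assumes that source samples are weighted by the regularized coefficient of their own class, that target samples are unweighted, and that the domain labels are 0 for source and 1 for target. All function names and numbers are hypothetical.

```python
import math


def cross_entropy(probs, label):
    """Cross-entropy of one predicted probability vector against an integer label."""
    return -math.log(probs[label])


def weighted_losses(src_class_probs, src_labels, src_domain_probs,
                    tgt_domain_probs, gamma_n):
    """Return (L_y, L_d): weighted classification loss and weighted domain loss."""
    n_s, n_t = len(src_labels), len(tgt_domain_probs)
    # Classification loss on labeled source data, weighted by class.
    l_y = sum(gamma_n[y] * cross_entropy(p, y)
              for p, y in zip(src_class_probs, src_labels)) / n_s
    # Domain loss: source samples (assumed domain label 0) are weighted by class,
    # target samples (assumed domain label 1) are not weighted.
    l_d_src = sum(gamma_n[y] * cross_entropy(p, 0)
                  for p, y in zip(src_domain_probs, src_labels)) / n_s
    l_d_tgt = sum(cross_entropy(p, 1) for p in tgt_domain_probs) / n_t
    return l_y, l_d_src + l_d_tgt


def extractor_objective(l_y, l_d, lam):
    """Objective seen by the feature extractor once the domain gradient is reversed."""
    return l_y - lam * l_d


# Toy case: the second source sample belongs to an outlier class with weight 0.
l_y, l_d = weighted_losses(
    src_class_probs=[[0.5, 0.25, 0.25], [0.1, 0.1, 0.8]],
    src_labels=[0, 2],
    src_domain_probs=[[0.5, 0.5], [0.9, 0.1]],
    tgt_domain_probs=[[0.5, 0.5]],
    gamma_n=[1.0, 1.0, 0.0],
)
objective = extractor_objective(l_y, l_d, lam=1.0)
```

The outlier-class sample contributes nothing to either loss, so it cannot pull the target features toward a class that does not exist in the target domain. In an automatic-differentiation framework the sign flip in `extractor_objective` would normally come from the gradient reversal layer rather than from an explicit subtraction.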
The state distribution alignment process specifically comprises the following steps:
First, a triplet loss function constrains the parameter updating process of the feature extractor and the classifier: it reduces the label probability distribution distance between source domain data of the same class, enlarges the label probability distribution distance between source domain data of different classes, and gathers the label probability distributions of same-class source domain data into clusters:
Figure BDA0003974406190000091
where x_a denotes a randomly selected anchor sample, x_p denotes a positive sample with the same label as the anchor sample, x_n denotes a negative sample with a label different from that of the anchor sample, and margin is the preset distance between positive and negative samples. Guiding the optimization of the partial migration network with the triplet loss function effectively reduces the label probability distribution distance between the anchor sample and same-class positive samples and enlarges the label probability distribution distance between the anchor sample and different-class negative samples.
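As an illustration of the triplet constraint described above, here is a minimal sketch of the standard triplet margin loss applied to label probability vectors. The patent's exact expression is only available as an image, so the use of unsquared Euclidean distances is an assumption, and the helper names and sample vectors are hypothetical.

```python
import math


def l2_distance(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin):
    """Standard triplet margin loss: max(d(a, p) - d(a, n) + margin, 0)."""
    return max(l2_distance(anchor, positive) - l2_distance(anchor, negative) + margin, 0.0)


anchor = [0.8, 0.1, 0.1]    # label probability distribution of the anchor sample
positive = [0.7, 0.2, 0.1]  # same class as the anchor
negative = [0.1, 0.8, 0.1]  # different class from the anchor

loss_small_margin = triplet_loss(anchor, positive, negative, margin=0.5)
loss_large_margin = triplet_loss(anchor, positive, negative, margin=1.0)
```

With the smaller margin the negative sample is already far enough from the anchor and the loss is zero. With the larger margin the loss stays positive until training pushes the negative sample further away or pulls the positive sample closer.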
Second, K-means clustering is applied to the source domain data label probability distributions to identify their clustering relation. The internal parameters of the K-means clustering model are updated as follows:
Figure BDA0003974406190000093
where θ_clu denotes the internal parameters of the K-means clustering model, the quantity shown in Figure BDA0003974406190000094 denotes the k-th class of source domain data, and U_k denotes the cluster center corresponding to the label probability distributions of the k-th class of source domain data. In the K-means clustering model the number of clusters is set equal to the number of source domain data label classes, and the internal parameters are adjusted so that the source domain data label probability distributions of each class gather around the corresponding cluster center. The membership of the target domain data in the clusters of the source domain data is then judged with a proximity criterion:
Figure BDA0003974406190000095
Figure BDA0003974406190000096
where the quantity shown in Figure BDA0003974406190000097 is the cluster center to which the target domain data are estimated to belong under the proximity criterion. When the L2 norm between the target domain data label probability distribution and the cluster center U_k of the k-th class of source domain data is the smallest, the target domain data label probability distribution is considered to belong to the k-th cluster, and the class label of the target domain data is k.
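The proximity criterion can be sketched as follows. For brevity the sketch does not run an iterative K-means update: since the number of clusters equals the number of labeled source classes, it simply uses the per-class mean of the source label probability distributions as a stand-in for each cluster center U_k. That substitution, the function names, and the toy data are assumptions for illustration only.

```python
import math


def class_centers(probs, labels, num_classes):
    """Per-class mean of label probability vectors (stand-in for the centers U_k).

    Assumes every class index in range(num_classes) has at least one sample.
    """
    dim = len(probs[0])
    sums = [[0.0] * dim for _ in range(num_classes)]
    counts = [0] * num_classes
    for p, y in zip(probs, labels):
        counts[y] += 1
        for j, value in enumerate(p):
            sums[y][j] += value
    return [[s / counts[k] for s in sums[k]] for k in range(num_classes)]


def nearest_center(p, centers):
    """Index k of the center with the smallest L2 distance to p."""
    distances = [math.dist(p, c) for c in centers]
    return min(range(len(centers)), key=distances.__getitem__)


source_probs = [[0.9, 0.1], [0.7, 0.3], [0.2, 0.8], [0.0, 1.0]]
source_labels = [0, 0, 1, 1]
centers = class_centers(source_probs, source_labels, num_classes=2)

label_a = nearest_center([0.6, 0.4], centers)
label_b = nearest_center([0.3, 0.7], centers)
```

Each target sample receives the label of the closest source cluster center, which is the pseudo-label that the subsequent alignment step uses.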
Finally, the L2 norm is used to shorten the distance between the target domain label probability distribution and the source domain cluster center of the same class, which aligns the state distributions of the classes shared by the source and target domains:
Figure BDA0003974406190000101
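A minimal sketch of such an alignment term is given below. It assumes that each target sample is matched to its nearest source cluster center and that the L2 distances are averaged over the target samples. The patent's own expression is only available as an image and may, for example, use squared norms or a sum instead. The function name and the numbers are hypothetical.

```python
import math


def alignment_loss(target_probs, centers):
    """Mean L2 distance from each target label distribution to its nearest center."""
    total = 0.0
    for p in target_probs:
        total += min(math.dist(p, c) for c in centers)
    return total / len(target_probs)


centers = [[1.0, 0.0], [0.0, 1.0]]  # source cluster centers, one per class

loss_aligned = alignment_loss([[1.0, 0.0], [0.0, 1.0]], centers)
loss_misaligned = alignment_loss([[0.6, 0.4]], centers)
```

The loss is zero when every target distribution coincides with a source cluster center and grows as target distributions drift away from their assigned centers.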
Step 4: Part of the data collected under the same working condition as the target domain data is selected as test data and input into the trained two-stage alignment partial migration network to predict the class information of the test data.
Bearing fault data at two rotating speeds were collected on a rotating equipment test bench to verify the effectiveness of the method. The test bench consists of a motor, a rotor system, a load block, support bearings and other parts. The test bearing is located in the bearing seat on the right side of the test bench, and an acceleration sensor attached to the front of the bearing seat collects the bearing fault data.
In the fault experiment, the sampling frequency is set to 10 kHz, and bearing fault data at two rotating speeds, 900 r/min and 1500 r/min, are obtained by adjusting the motor speed. The bearing fault data at each rotating speed cover 4 bearing health states: normal (N), inner race fault (IF), ball fault (RF) and outer race fault (OF), with 200 samples per state and a single-sample data length of 10KHz. The experimental working conditions shown in Table 1 are obtained by combining the bearing fault data at each rotating speed in arbitrary groups of two or three states.
Table 1 Bearing fault experimental data under different working conditions
Figure BDA0003974406190000102
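Table 1 is an image in the source, so the individual working conditions are not reproduced here. As a small arithmetic check, and under the assumption (not confirmed by the text) that each working condition corresponds to one choice of two or three of the four health states for the target domain, the following sketch counts those choices. The variable names are hypothetical.

```python
from itertools import combinations

health_states = ["N", "IF", "RF", "OF"]

# Assumed target-domain label sets: every pair and every triple of health states.
two_state_sets = list(combinations(health_states, 2))
three_state_sets = list(combinations(health_states, 3))

num_conditions = len(two_state_sets) + len(three_state_sets)
```

Under this assumption there are six two-state sets and four three-state sets, ten in total, which is consistent with the working conditions being numbered 1-6 and 7-10 in the discussion of the results.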
To verify the effectiveness of the method of the invention and to illustrate the necessity of each of its components, the following 4 comparison methods were constructed.
Comparison method 1 (ViT network + loss function L_1): this method uses the same feature extractor, domain discriminator and classifier network structures as the method of the invention. The network parameters of the feature extractor, domain discriminator and classifier are updated with the loss function L_1 shown in formula (3). The method considers only the edge distribution alignment between the shared-class data of the source and target domains.
Comparison method 2 (ViT network + ordinary adversarial loss): this method uses the same feature extractor, domain discriminator and classifier network structures as the method of the invention. Removing the weight coefficient constraint from the loss function in formula (3) gives the ordinary adversarial loss function, which is used to update the network parameters of the feature extractor, domain discriminator and classifier. The method considers only the edge distribution alignment between the overall data of the source and target domains.
Comparison method 3 (ViT network + cross entropy): this method uses the same feature extractor and classifier network structures as the method of the invention. The network parameters of the feature extractor and classifier are updated with the cross-entropy loss to extract well-separated global feature information from the source domain data. The method is used to verify that the ViT network has good global feature extraction capability.
Comparison method 4 (CNN network + loss function L): this method uses a CNN network as the feature extractor, while the domain discriminator and classifier use the same three-layer fully-connected neural network structure as the method of the invention. The network parameters of the feature extractor, domain discriminator and classifier are updated with the loss function L shown in formula (8). The method is used to verify further that the ViT network extracts global feature information better than the CNN network.
The experimental verification was run on an Intel Core i9-10850K CPU (3.60 GHz) with 32 GB of memory, a single NVIDIA GeForce RTX 3060 graphics processing unit, the Windows 10 operating system and the PyTorch 1.7.1 deep learning framework. When training the method of the invention and the comparison methods, the number of model updates is set to 400 and the batch-processing number is set to 20. The network parameters of the feature extractor, domain discriminator and classifier are updated with the Adam optimization algorithm; the initial learning rate is set to 0.0001 and is reduced to 0.00001 after the model parameters have been updated 200 times. To reduce the randomness of the experimental results, each group of comparison experiments was repeated 5 times during algorithm verification.
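The learning-rate setting described above can be written as a simple step schedule. The sketch assumes that updates are counted from zero and that the lower rate applies from the 201st update onward. The function name and its keyword arguments are hypothetical and do not correspond to any framework API.

```python
def learning_rate(update_index, initial_lr=1e-4, reduced_lr=1e-5, switch_after=200):
    """Learning rate for a zero-based update index under a one-step schedule."""
    return initial_lr if update_index < switch_after else reduced_lr


total_updates = 400
schedule = [learning_rate(i) for i in range(total_updates)]
```

In a training loop, the value for the current update would be written into the optimizer before each parameter update. Deep learning frameworks usually offer a built-in step scheduler that achieves the same effect.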
Table 2 gives the diagnostic accuracy statistics of the method of the invention and of the comparison methods. Because the method of the invention, comparison method 1 and comparison method 2 all impose feature-migration loss constraints on the ViT network, their diagnostic results are clearly better than those of comparison method 3, which only uses the ViT network to extract global feature information from the source domain data.
In working conditions 1-6, the target domain data contain only two classes of labels, so there are more source domain outlier labels than in working conditions 7-10, and the weighted balance mechanism can effectively filter out the influence of the source domain outlier classes on feature migration. In working conditions 7-10, the numbers of source domain and target domain data labels differ little, and the negative migration caused by source domain outlier-class data during feature migration is small. Therefore, in working conditions 1-6 the diagnostic accuracy of comparison method 1 is higher than that of comparison method 2, whereas in working conditions 7-10 the diagnostic accuracy of comparison method 2 is higher than that of comparison method 1.
Compared with comparison method 1, the method of the invention also considers the alignment of the subclass features of the classes shared by the source and target domains: it uses several metric learning measures to gather the features of each source domain class into clusters and effectively shortens the feature distance between the shared classes of the source and target domains. Its diagnostic results are therefore clearly better than those of comparison method 1. Compared with comparison method 2, the method of the invention considers both the overall feature alignment of the shared classes and the feature alignment of each subclass of the shared classes, and its diagnostic accuracy is clearly higher than that of comparison method 2.
The local receptive field structure of the CNN network tends to miss effective features when extracting deep features, which makes the deep features poorly separable; the self-attention mechanism specific to the ViT network can fully mine the global feature information in the monitoring data, so the deep features extracted by the ViT network are highly separable. The diagnostic results of the method of the invention are therefore clearly better than those of the CNN-based comparison method 4. Compared with the diagnostic results of the four comparison methods, the method of the invention has the highest average diagnostic accuracy and the smallest fluctuation, which verifies its effectiveness.
Table 2 Diagnostic accuracy statistics of the different diagnostic methods
Figure BDA0003974406190000121
The foregoing is merely a preferred embodiment of the present invention. It should be noted that those skilled in the art may make modifications and adaptations without departing from the principles of the present invention, and these modifications and adaptations also fall within the scope of protection of the present invention.

Claims (7)

1. A rotating equipment fault diagnosis method based on a double-stage alignment partial migration network is characterized by comprising the following steps:
Step 1: collecting data with more fault categories under a certain working condition as source domain data, wherein the source domain data are all labeled data;
Step 2: collecting data with fewer fault categories under other working conditions as target domain data, wherein the target domain data are unlabeled data;
Step 3: inputting the source domain data and the target domain data simultaneously into the constructed two-stage alignment partial migration network, and, by updating the internal parameters of the two-stage alignment partial migration network, realizing the edge distribution alignment of the classes shared by the source domain and the target domain and the state distribution alignment among the subclasses of the shared classes;
Step 4: selecting part of the data collected under the same working condition as the target domain data as test data, inputting the test data into the trained two-stage alignment partial migration network, and predicting the class information of the test data.
2. The rotating equipment fault diagnosis method based on the two-stage alignment partial migration network according to claim 1, wherein in the step 1, the labeled source domain data set is represented as the expression shown in Figure FDA0003974406180000011, where the quantity shown in Figure FDA0003974406180000012 represents the i-th sample of the source domain, the quantity shown in Figure FDA0003974406180000013 represents the fault label of the i-th sample, and n_s represents the number of source domain samples; the label space of the source domain data set shown in Figure FDA0003974406180000014 is Y_s, and the corresponding edge state distribution is P_s.
3. The rotating equipment fault diagnosis method based on the two-stage alignment partial migration network according to claim 2, wherein in the step 2, the unlabeled target domain data set is represented as the expression shown in Figure FDA0003974406180000015, where the quantity shown in Figure FDA0003974406180000016 represents the i-th sample of the target domain and n_t represents the number of target domain samples; the label space of the target domain data set shown in Figure FDA0003974406180000017 is Y_t, and the corresponding edge state distribution is P_t; the target domain label space is a subset of the source domain label space, as expressed in Figure FDA0003974406180000018; the edge state distributions of the source domain data and the target domain data differ, namely P_s ≠ P_t, and the edge state distributions of the source domain shared-class data and the target domain data also differ, namely P_{s,c} ≠ P_t.
4. The rotating equipment fault diagnosis method based on the two-stage alignment partial migration network according to claim 3, wherein in the step 3, the network structure of the two-stage alignment partial migration network comprises three basic units, namely a feature extractor G, a domain discriminator D and a classifier C; the feature extractor G is constructed with a ViT network, and the domain discriminator D and the classifier C are both three-layer fully-connected neural networks;
the constructed ViT network comprises an input processing module, a Transformer encoder module and a classification module; in the input processing module, the collected monitoring data are taken to be a one-dimensional vibration signal, as shown in Figure FDA0003974406180000019, where the symbol shown in Figure FDA00039744061800000110 represents the L-dimensional space; first, the vibration signal is divided equally into Z segments, namely Z tokens, each containing S sampling points such that L = Z × S, which gives the input data of the ViT network shown in Figure FDA0003974406180000021; second, the segmented vibration signal is fed into a fully-connected neural network with S input nodes and N output nodes to obtain the processed feature shown in Figure FDA0003974406180000022; then, the feature x_f is concatenated with the class token feature with trainable parameters shown in Figure FDA0003974406180000023 to obtain the combined feature shown in Figure FDA0003974406180000024; finally, the trainable parameter shown in Figure FDA0003974406180000025 is treated as the position embedding and added to the combined feature s_cb to obtain the input feature of the subsequent Transformer encoder module shown in Figure FDA0003974406180000026; the Transformer encoder module consists of a multi-head self-attention mechanism, residual connections, layer normalization and a fully-connected neural network, and uses the multi-head self-attention mechanism to mine global feature information in the monitoring data; the output feature of the Transformer encoder module has the same dimensions as its input feature and is expressed as shown in Figure FDA0003974406180000027; in the classification module, the feature at the class token position of the Transformer encoder module output, shown in Figure FDA0003974406180000028, is selected and input into a fully-connected neural network to obtain the ViT network output feature shown in Figure FDA0003974406180000029.
5. The rotating equipment fault diagnosis method based on the two-stage alignment partial migration network according to claim 4, wherein in the step 3, a weighted balance mechanism constrains the feature alignment process between the source domain and the target domain, so that the edge distributions of the classes shared by the source domain and the target domain are aligned; in the weighted balance mechanism, a class weight coefficient is constructed from the target domain data label probability distribution given by the classifier and evaluates how likely each source domain class is to be a shared class; the weight coefficient constrains the parameter updating process of the partial migration network, source domain shared-class data are given larger weights to strengthen the feature migration between the shared-class data of the source and target domains, and source domain outlier-class data are given smaller weights to weaken their influence on feature migration.
6. The rotating equipment fault diagnosis method based on the two-stage alignment partial migration network according to claim 5, wherein the edge distribution alignment process specifically comprises the following steps:
first, all target domain data label information given by the classifier is accumulated to construct the class weight coefficient γ:
Figure FDA00039744061800000210
where the quantity shown in Figure FDA00039744061800000211 is the target domain data label probability distribution given by the softmax layer of the classifier, K is the number of source domain data label classes, and the quantity shown in Figure FDA00039744061800000212 is the probability, given by the softmax layer of the classifier, that the target domain data belong to the k-th class;
then, the class weight coefficient is regularized:
Figure FDA0003974406180000031
where max_γ = max(γ) = max([γ_1, …, γ_k, …, γ_K]) is the maximum value of the class weight coefficient, γ_k is the class weight coefficient of the k-th class, and γ_n is the regularized class weight coefficient;
finally, the regularized class weight coefficient constrains the adversarial training process of the partial migration network, so that the edge distributions of the classes shared by the source and target domains are aligned:
Figure FDA0003974406180000032
where θ_g, θ_d and θ_c denote the network parameters of the feature extractor G, the domain discriminator D and the classifier C; d_i denotes the domain label of the i-th sample; L_y and L_d denote the cross-entropy loss functions of the classifier and the domain discriminator; λ is a weight parameter that balances the two loss functions; the partial migration network introduces a gradient reversal layer between the feature extractor and the domain discriminator, the gradient of the domain classification loss in the domain discriminator is automatically reversed before it is back-propagated to the parameters of the feature extractor, and adversarial training of the network parameters is realized end to end.
7. The rotating equipment fault diagnosis method based on the two-stage alignment partial migration network according to claim 6, wherein in the step 3, the state distribution alignment process specifically comprises the following steps:
first, a triplet loss function constrains the parameter updating process of the feature extractor and the classifier, reducing the label probability distribution distance between source domain data of the same class, enlarging the label probability distribution distance between source domain data of different classes, and gathering the label probability distributions of same-class source domain data into clusters:
Figure FDA0003974406180000033
where x_a denotes a randomly selected anchor sample, x_p denotes a positive sample with the same label as the anchor sample, x_n denotes a negative sample with a label different from that of the anchor sample, and margin is the preset distance between positive and negative samples;
second, K-means clustering is applied to the source domain data label probability distributions to identify their clustering relation; the internal parameters of the K-means clustering model are updated as follows:
Figure FDA0003974406180000041
where θ_clu denotes the internal parameters of the K-means clustering model, the quantity shown in Figure FDA0003974406180000042 denotes the k-th class of source domain data, and U_k denotes the cluster center corresponding to the label probability distributions of the k-th class of source domain data; in the K-means clustering model the number of clusters is set equal to the number of source domain data label classes, and the internal parameters are adjusted so that the source domain data label probability distributions of each class gather around the corresponding cluster center; the membership of the target domain data in the clusters of the source domain data is then judged with a proximity criterion:
Figure FDA0003974406180000043
where the quantity shown in Figure FDA0003974406180000044 is the cluster center to which the target domain data are estimated to belong under the proximity criterion; when the L2 norm between the target domain data label probability distribution and the cluster center U_k of the k-th class of source domain data is the smallest, the target domain data label probability distribution belongs to the k-th cluster, and the class label of the target domain data is k;
finally, the L2 norm is used to shorten the distance between the target domain label probability distribution and the source domain cluster center of the same class, so that the state distributions of the classes shared by the source and target domains are aligned:
Figure FDA0003974406180000045
CN202211531891.4A 2022-12-01 2022-12-01 Rotating equipment fault diagnosis method based on double-stage alignment partial migration network Active CN116150668B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211531891.4A CN116150668B (en) 2022-12-01 2022-12-01 Rotating equipment fault diagnosis method based on double-stage alignment partial migration network

Publications (2)

Publication Number Publication Date
CN116150668A true CN116150668A (en) 2023-05-23
CN116150668B CN116150668B (en) 2023-08-11

Family

ID=86357296

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211531891.4A Active CN116150668B (en) 2022-12-01 2022-12-01 Rotating equipment fault diagnosis method based on double-stage alignment partial migration network

Country Status (1)

Country Link
CN (1) CN116150668B (en)

Also Published As

Publication number Publication date
CN116150668B (en) 2023-08-11

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant