CN116150668A - Rotating equipment fault diagnosis method based on double-stage alignment partial migration network - Google Patents
- Publication number
- CN116150668A CN202211531891.4A
- Authority
- CN
- China
- Prior art keywords
- data
- domain
- source domain
- target domain
- network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/20—Administration of product repair or maintenance
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/04—Manufacturing
Abstract
The invention discloses a rotating equipment fault diagnosis method based on a double-stage alignment partial migration network, intended for rotating equipment fault classification tasks. In terms of network architecture, the partial migration network consists of three basic units: a feature extractor, a domain discriminator and a classifier. The feature extractor is built on a ViT network, and the domain discriminator and the classifier are two independent three-layer fully connected neural networks. In terms of network parameter updating: 1) a weighted balance mechanism constrains the feature alignment process between the source domain and the target domain, aligning the marginal distributions of their shared categories; 2) several metric learning techniques pull closer the feature distances among the subclasses of the shared categories, aligning the state distributions of the shared categories of the source and target domains.
Description
Technical Field
The invention belongs to the field of fault diagnosis of rotating equipment, and particularly relates to a rotating equipment fault diagnosis method based on a double-stage alignment partial migration network.
Background
With the continuous upgrading of the intelligent manufacturing industry and the growing scale of modern heavy equipment, rotating equipment is becoming more complex, intelligent, high-speed and precise. This gradually increases the probability of latent faults and makes maintenance more difficult. In the current industrial Internet era, the multi-source sensor networks that monitor rotating equipment are densely interwoven and the volume of monitoring data keeps growing, so accurately diagnosing possible hidden faults with intelligent fault diagnosis technology is both necessary and urgent.
As a recent development in intelligent fault diagnosis, deep learning can autonomously mine the representative diagnostic information hidden in raw data, directly establish an accurate mapping between raw data and operating state, and largely remove the dependence on manual feature design and engineering diagnostic experience. Researchers have proposed many deep-learning-based intelligent fault diagnosis methods for rotating equipment and have achieved good diagnostic results. However, most of these methods rest on two assumptions: 1) the test data and the training data are independent and identically distributed; 2) the task to be diagnosed has sufficient labeled fault samples. In industrial settings, the rotating speed, load, environmental noise and other operating conditions of rotating equipment change constantly, so the distribution of the monitoring data collected by the sensors changes as well. In addition, rotating equipment degrades slowly and gradually over time, so the difference between monitoring data from the healthy state and from the early, weak-damage state is hard to detect, which makes collecting fault samples and labeling them reliably time-consuming and laborious. Deep-learning-based intelligent fault diagnosis methods therefore cannot be applied directly to rotating equipment in industrial settings.
Intelligent fault diagnosis methods based on transfer learning relax the deep-learning constraint that test data and training data must be independent and identically distributed. By learning knowledge from a previous task and applying it to a new task, transfer learning addresses both the shortage of training data in the new domain and the mismatch between the feature distributions of the training data and the test data. Researchers have proposed many transfer-learning-based intelligent fault diagnosis methods that transfer diagnostic knowledge across operating conditions, sensor positions and machines. However, existing studies all assume that the source domain data and the target domain data share the same label space. In real industrial scenarios, source domain and target domain data with identical label information are hard to find: the source domain data usually covers far more operating states than the target domain data, and the label space of the target domain data is in most cases a subset of that of the source domain data. Transfer-learning-based methods therefore fall short when applied to rotating equipment fault diagnosis in real industrial scenarios.
Partial migration learning limits how much each source domain category contributes to the feature alignment process. This strengthens feature transfer for the data of the categories shared by the source and target domains and weakens the influence of source domain outlier categories, so a diagnostic model can be migrated from a large domain rich in operating state information to a small domain that contains only a few operating states. Existing partial migration diagnosis methods still have the following two problems:
1) Most existing methods build weight information from the target domain label information given by the classifier and use it to constrain the marginal distribution alignment between the shared categories of the source and target domains. However, they ignore the alignment of state distributions among the subclasses of the shared categories.
2) Most existing methods use a convolutional neural network as the basic network architecture. Its local receptive fields focus on the local features of individual segments of the monitoring data and cannot effectively capture the correlations between the features of different segments.
Disclosure of Invention
Purpose of the invention: in view of the prior art, a rotating equipment fault diagnosis method based on a double-stage alignment partial migration network is provided for rotating equipment fault classification tasks, improving the fault identification accuracy of existing partial migration diagnosis methods for rotating equipment.
Technical scheme: 1. A rotating equipment fault diagnosis method based on a double-stage alignment partial migration network, characterized by comprising the following steps:
step 1: collect data covering more fault categories under a given working condition as source domain data; all source domain data are labeled;
step 2: collect data covering fewer fault categories under other working conditions as target domain data; the target domain data are unlabeled;
step 3: input the source domain data and the target domain data simultaneously into the constructed double-stage alignment partial migration network and, by updating the internal parameters of the network, align the marginal distributions of the shared categories of the source and target domains and the state distributions among the subclasses of those shared categories;
step 4: select part of the data collected under the same working condition as the target domain data as test data, input the test data into the trained double-stage alignment partial migration network, and predict the category of the test data.
2. The rotating equipment fault diagnosis method based on the double-stage alignment partial migration network according to claim 1, wherein in step 1 the labeled source domain dataset is expressed as $D_s = \{(x_i^s, y_i^s)\}_{i=1}^{n_s}$, where $x_i^s$ denotes the i-th sample of the source domain, $y_i^s$ denotes the fault label of the i-th sample, and $n_s$ denotes the number of source domain samples; the label space of the source domain dataset $D_s$ is $Y_s$, and the corresponding marginal state distribution is $P_s$.
3. The rotating equipment fault diagnosis method based on the double-stage alignment partial migration network according to claim 2, wherein in step 2 the unlabeled target domain dataset is expressed as $D_t = \{x_i^t\}_{i=1}^{n_t}$, where $x_i^t$ denotes the i-th sample of the target domain and $n_t$ denotes the number of target domain samples; the label space of the target domain dataset $D_t$ is $Y_t$, and the corresponding marginal state distribution is $P_t$; the target domain label space is a subset of the source domain label space, i.e. $Y_t \subseteq Y_s$; the marginal state distributions of the source domain data and the target domain data differ, i.e. $P_s \neq P_t$; and the marginal state distributions of the source domain shared-category data and the target domain data also differ, i.e. $P_{s,c} \neq P_t$.
4. The rotating equipment fault diagnosis method based on the double-stage alignment partial migration network according to claim 3, wherein in step 3 the network structure of the double-stage alignment partial migration network comprises three basic units, namely a feature extractor G, a domain discriminator D and a classifier C; the feature extractor G is built on a ViT network, and the domain discriminator D and the classifier C are both three-layer fully connected neural networks;
the constructed ViT network comprises an input processing module, a Transformer encoder module and a classification module; in the input processing module, let the collected monitoring data be a one-dimensional vibration signal $x \in \mathbb{R}^{L}$, where $\mathbb{R}^{L}$ denotes the L-dimensional space; first, the vibration signal is divided equally into Z segments, i.e. Z tokens, each containing S sampling points with $L = Z \times S$, giving the ViT network input data in $\mathbb{R}^{Z \times S}$; second, the segmented vibration signal is fed into a fully connected neural network with S input nodes and N output nodes to obtain the processed feature $x_f \in \mathbb{R}^{Z \times N}$; then, the feature $x_f$ is concatenated with a class token feature with trainable parameters, in $\mathbb{R}^{1 \times N}$, to obtain the combined feature $x_{cb} \in \mathbb{R}^{(Z+1) \times N}$; finally, trainable parameters in $\mathbb{R}^{(Z+1) \times N}$ are treated as the position embedding and added to the combined feature $x_{cb}$ to obtain the input feature of the subsequent Transformer encoder module, in $\mathbb{R}^{(Z+1) \times N}$;
the Transformer encoder module consists of a multi-head self-attention mechanism, residual connections, layer normalization and a fully connected neural network, and uses the multi-head self-attention mechanism to mine the global feature information in the monitoring data; the output feature of the Transformer encoder module has the same dimensions as its input feature, i.e. it lies in $\mathbb{R}^{(Z+1) \times N}$; in the classification module, the feature at the class token position of the Transformer encoder output, in $\mathbb{R}^{1 \times N}$, is fed into a fully connected neural network to obtain the ViT network output feature.
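The claim that the encoder output keeps the dimensions of its input can be illustrated with a minimal single-head sketch of scaled dot-product self-attention. This is an illustration under stated assumptions, not the patented network: the token count, feature width, random weights and the helper names (`matmul`, `self_attention`, `random_matrix`) are all hypothetical, and multi-head splitting, residual connections and layer normalization are omitted.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over token features x."""
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    scale = math.sqrt(len(q[0]))
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)  # every token is scored against every other token
    weights = [softmax([s / scale for s in row]) for row in scores]
    return matmul(weights, v), weights


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
num_tokens, dim = 5, 4  # hypothetical sizes: Z + 1 = 5 tokens, N = 4 features
x = random_matrix(num_tokens, dim, rng)
w_q, w_k, w_v = (random_matrix(dim, dim, rng) for _ in range(3))
out, attn = self_attention(x, w_q, w_k, w_v)
print(len(out), len(out[0]))  # same shape as the input: (Z + 1) x N
```

Because each row of `attn` mixes information from all tokens, every output token can depend on every signal segment, which is the global correlation that a local convolutional receptive field does not provide directly.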
5. The rotating equipment fault diagnosis method based on the double-stage alignment partial migration network according to claim 4, wherein in step 3 a weighted balance mechanism constrains the feature alignment process between the source domain and the target domain, realizing marginal distribution alignment of their shared categories; in the weighted balance mechanism, class weight coefficients are constructed from the target domain label probability distributions given by the classifier and estimate how likely each source domain category is to be a shared category; the weight coefficients constrain the parameter updating of the partial migration network: source domain shared-category data receive larger weights, which strengthens feature transfer between the shared-category data of the source and target domains, and source domain outlier-category data receive smaller weights, which weakens their influence on feature transfer.
6. The rotating equipment fault diagnosis method based on the double-stage alignment partial migration network according to claim 5, wherein the marginal distribution alignment process specifically comprises the following steps:
first, the label information that the classifier gives for all target domain data is accumulated to construct the class weight coefficient $\gamma$;
here the classifier's softmax layer supplies the label probability distribution of each target domain sample, K is the number of source domain label categories, and the k-th entry of that distribution is the probability that the sample belongs to the k-th category;
then, the class weight coefficients are normalized: $\gamma_n = \gamma / \max_\gamma$,
where $\max_\gamma = \max(\gamma) = \max([\gamma_1, \ldots, \gamma_k, \ldots, \gamma_K])$ is the maximum class weight coefficient, $\gamma_k$ is the class weight coefficient of the k-th category, and $\gamma_n$ is the normalized class weight coefficient;
finally, the normalized class weight coefficients constrain the adversarial training process of the partial migration network, realizing marginal distribution alignment between the shared categories of the source and target domains;
in the corresponding objective, $\theta_g$, $\theta_d$ and $\theta_c$ denote the network parameters of the feature extractor G, the domain discriminator D and the classifier C; $d_i$ denotes the domain label of the i-th sample; $L_y$ and $L_d$ denote the cross-entropy loss functions of the classifier and the domain discriminator; and $\lambda$ is a weight parameter that balances the two loss functions; the partial migration network introduces a gradient reversal layer between the feature extractor and the domain discriminator, so that the gradient of the domain classification loss is automatically reversed before it is back-propagated to the feature extractor parameters, and adversarial training of the network parameters is realized end to end.
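The weight construction in claim 6 can be sketched as follows. The sketch assumes that "accumulating" means averaging the softmax outputs over all target samples; since the result is divided by its maximum afterwards, a plain sum would give the same normalized weights. The function name `class_weights` and the sample probabilities are hypothetical.

```python
def class_weights(target_probs):
    """Average the softmax outputs over target samples, then divide by the maximum.

    target_probs: one probability vector of length K per target domain sample.
    Returns (gamma, gamma_n): raw and max-normalized class weight coefficients.
    """
    n_t = len(target_probs)
    num_classes = len(target_probs[0])
    # Assumption: accumulation is a mean over the n_t target samples.
    gamma = [sum(p[k] for p in target_probs) / n_t for k in range(num_classes)]
    max_gamma = max(gamma)
    gamma_n = [g / max_gamma for g in gamma]
    return gamma, gamma_n


# Hypothetical softmax outputs for 4 target samples over K = 3 source classes.
# Class 2 rarely receives probability mass, so it behaves like an outlier class.
target_probs = [
    [0.70, 0.25, 0.05],
    [0.60, 0.30, 0.10],
    [0.20, 0.75, 0.05],
    [0.30, 0.60, 0.10],
]
gamma, gamma_n = class_weights(target_probs)
print(gamma_n)
```

In this toy case the two classes that the target data actually resembles keep weights near 1, while the third class is pushed toward 0, which is the behavior the weighted balance mechanism relies on.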
7. The rotating equipment fault diagnosis method based on the double-stage alignment partial migration network according to claim 6, wherein in step 3 the state distribution alignment process specifically comprises the following steps:
first, a triplet loss function constrains the parameter updating of the feature extractor and the classifier: it reduces the distance between the label probability distributions of source domain data of the same class, enlarges the distance between those of different classes, and gathers the label probability distributions of each source domain class into a cluster;
here $x_a$ denotes a randomly selected anchor sample, $x_p$ denotes a positive sample with the same label as the anchor sample, $x_n$ denotes a negative sample with a label different from the anchor sample, and margin is the preset distance between the positive and negative samples;
second, K-means clustering processes the label probability distributions of the source domain data and identifies their cluster structure; in the updating of the internal parameters of the K-means clustering model,
$\theta_{clu}$ denotes the internal parameters of the K-means clustering model, the source domain data are grouped by class k, and $U_k$ denotes the cluster center corresponding to the label probability distribution of the k-th class of source domain data; in the K-means clustering model, the number of clusters is set equal to the number of source domain label categories, and the internal parameters are adjusted so that the label probability distributions of the different source domain classes gather around their corresponding cluster centers; further, the membership between the target domain data and the source domain clusters is judged by a proximity criterion;
the proximity criterion estimates the cluster center to which a target domain sample belongs: when the L2 norm between the label probability distribution of the target domain sample and the cluster center $U_k$ of the k-th class of source domain data is smallest, the label probability distribution of that sample belongs to the k-th cluster and its class label is k;
finally, the L2 norm is used to pull the target domain label probability distributions toward the source domain cluster center of the same category, realizing state distribution alignment between the shared categories of the source and target domains.
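The three steps of claim 7 can be sketched on label probability vectors as follows. This is a simplified illustration with several assumptions: the triplet loss uses the common hinge form on L2 distances, the cluster centers are taken as per-class means of the labeled source outputs in place of a fitted K-means model, and all function names and numbers are hypothetical.

```python
import math


def l2_distance(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss: pull the positive closer than the negative by a margin."""
    return max(l2_distance(anchor, positive) - l2_distance(anchor, negative) + margin, 0.0)


def class_centers(source_probs, source_labels, num_classes):
    """Per-class mean probability vector (stand-in for K-means centers U_k)."""
    centers = []
    for k in range(num_classes):
        members = [p for p, y in zip(source_probs, source_labels) if y == k]
        centers.append([sum(col) / len(members) for col in zip(*members)])
    return centers


def nearest_center(prob, centers):
    """Proximity criterion: index k of the center with the smallest L2 distance."""
    return min(range(len(centers)), key=lambda k: l2_distance(prob, centers[k]))


def alignment_loss(target_probs, centers):
    """Mean L2 distance between each target output and its nearest source center."""
    total = sum(l2_distance(p, centers[nearest_center(p, centers)]) for p in target_probs)
    return total / len(target_probs)


# Hypothetical classifier outputs for three source classes and two target samples.
source_probs = [
    [0.90, 0.05, 0.05], [0.80, 0.10, 0.10],
    [0.10, 0.80, 0.10], [0.05, 0.90, 0.05],
    [0.10, 0.10, 0.80], [0.05, 0.05, 0.90],
]
source_labels = [0, 0, 1, 1, 2, 2]
target_probs = [[0.60, 0.30, 0.10], [0.20, 0.70, 0.10]]

centers = class_centers(source_probs, source_labels, num_classes=3)
pseudo_labels = [nearest_center(p, centers) for p in target_probs]
print(pseudo_labels, alignment_loss(target_probs, centers))
```

Minimizing `alignment_loss` with respect to the network parameters would move each target output toward the center of the source class it was assigned to; no target sample is assigned to a class whose center is far from all target outputs, so outlier classes are left out of this second alignment stage.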
Beneficial effects: 1) Several metric learning techniques are used to align the state distributions of the shared categories of the source and target domains. The method uses a triplet loss function to constrain the parameter updating of the partial migration network, gathering the label probability distributions of each source domain class into clusters; uses K-means clustering to identify the cluster structure of the source domain data and a proximity criterion to judge the membership between the target domain data and the source domain clusters; and uses the L2 norm to pull the target domain label probability distributions toward the source domain cluster center of the same category.
2) A Vision Transformer (ViT) network extracts the global feature information of the monitoring data. The ViT network relies on a self-attention mechanism rather than convolution to extract global feature information and achieves a good classification effect. The method modifies the input processing module of the original ViT network so that the features of the one-dimensional monitoring data it produces meet the input requirements of the subsequent Transformer encoder module. The ViT network then fully mines the feature correlations between different segments of the monitoring data and extracts their global feature information.
Drawings
FIG. 1 is a flow chart of the method of the present invention;
FIG. 2 is a block diagram of a dual stage alignment partial migration network;
FIG. 3 is a schematic diagram of the marginal distribution alignment process and the state distribution alignment process.
Detailed Description
The invention is further explained below with reference to the drawings.
As shown in FIG. 1, a rotating equipment fault diagnosis method based on a double-stage alignment partial migration network includes the following steps:
Step 1: collect data covering more fault categories under a given working condition as source domain data; all source domain data are labeled.
The labeled source domain dataset is expressed as $D_s = \{(x_i^s, y_i^s)\}_{i=1}^{n_s}$, where $x_i^s$ denotes the i-th sample of the source domain, $y_i^s$ denotes the fault label of the i-th sample, and $n_s$ denotes the number of source domain samples; the label space of the source domain dataset $D_s$ is $Y_s$, and the corresponding marginal state distribution is $P_s$.
Step 2: collect data covering fewer fault categories under other working conditions as target domain data; the target domain data are unlabeled.
The unlabeled target domain dataset is expressed as $D_t = \{x_i^t\}_{i=1}^{n_t}$, where $x_i^t$ denotes the i-th sample of the target domain and $n_t$ denotes the number of target domain samples; the label space of the target domain dataset $D_t$ is $Y_t$, and the corresponding marginal state distribution is $P_t$. The target domain label space is a subset of the source domain label space, i.e. $Y_t \subseteq Y_s$. The marginal state distributions of the source domain data and the target domain data differ, i.e. $P_s \neq P_t$, and the marginal state distributions of the source domain shared-category data and the target domain data also differ, i.e. $P_{s,c} \neq P_t$.
Step 3: input the source domain data and the target domain data simultaneously into the constructed double-stage alignment partial migration network and, by updating the internal parameters of the network, align the marginal distributions of the shared categories of the source and target domains and the state distributions among the subclasses of those shared categories.
As shown in FIG. 2, the network structure of the double-stage alignment partial migration network includes three basic units, namely a feature extractor G, a domain discriminator D and a classifier C; the feature extractor G is built on a ViT (Vision Transformer) network, and the domain discriminator D and the classifier C are both three-layer fully connected neural networks.
The constructed ViT network comprises an input processing module, a Transformer encoder module and a classification module. In the input processing module, because the collected monitoring data is a one-dimensional vibration signal, the signal must be segmented to meet the input feature requirements of the subsequent Transformer encoder module. Let the collected monitoring data be a one-dimensional vibration signal $x \in \mathbb{R}^{L}$, where $\mathbb{R}^{L}$ denotes the L-dimensional space. First, the vibration signal is divided equally into Z segments, i.e. Z tokens, each containing S sampling points with $L = Z \times S$, giving the ViT network input data in $\mathbb{R}^{Z \times S}$. Second, the segmented vibration signal is fed into a fully connected neural network with S input nodes and N output nodes to obtain the processed feature $x_f \in \mathbb{R}^{Z \times N}$. Then, the feature $x_f$ is concatenated with a class token feature with trainable parameters, in $\mathbb{R}^{1 \times N}$, to obtain the combined feature $x_{cb} \in \mathbb{R}^{(Z+1) \times N}$. Finally, trainable parameters in $\mathbb{R}^{(Z+1) \times N}$ are treated as the position embedding and added to the combined feature $x_{cb}$ to obtain the input feature of the subsequent Transformer encoder module, in $\mathbb{R}^{(Z+1) \times N}$.
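The input processing steps above can be traced with a small shape-level sketch. It assumes the class token is placed in front of the Z segment features (the text only says the two are concatenated), and it uses random values for the trainable parameters; the function names and the sizes L = 12, Z = 3, S = 4, N = 2 are hypothetical.

```python
import random


def split_into_tokens(signal, num_tokens):
    """Cut a 1-D signal of length L into Z equal segments of S points (L = Z * S)."""
    if len(signal) % num_tokens != 0:
        raise ValueError("signal length L must equal Z * S")
    seg = len(signal) // num_tokens
    return [signal[i * seg:(i + 1) * seg] for i in range(num_tokens)]


def linear(tokens, weight, bias):
    """Fully connected layer: tokens (Z x S) times weight (S x N) plus bias (N)."""
    return [
        [sum(t * w for t, w in zip(token, col)) + b for col, b in zip(zip(*weight), bias)]
        for token in tokens
    ]


def build_encoder_input(signal, num_tokens, weight, bias, class_token, pos_embedding):
    tokens = split_into_tokens(signal, num_tokens)  # Z x S
    x_f = linear(tokens, weight, bias)              # Z x N
    x_cb = [class_token] + x_f                      # (Z + 1) x N, class token first (assumed)
    return [[a + b for a, b in zip(row, pos)] for row, pos in zip(x_cb, pos_embedding)]


rng = random.Random(0)
L, Z, N = 12, 3, 2  # hypothetical sizes, so S = 4
S = L // Z
signal = [0.1 * i for i in range(L)]
weight = [[rng.uniform(-1, 1) for _ in range(N)] for _ in range(S)]
bias = [0.0] * N
class_token = [rng.uniform(-1, 1) for _ in range(N)]
pos_embedding = [[rng.uniform(-1, 1) for _ in range(N)] for _ in range(Z + 1)]

encoder_input = build_encoder_input(signal, Z, weight, bias, class_token, pos_embedding)
print(len(encoder_input), len(encoder_input[0]))  # (Z + 1) x N
```

In a trained network `weight`, `bias`, `class_token` and `pos_embedding` would be learned jointly with the rest of the feature extractor rather than sampled at random.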
The Transformer encoder module and the classification module that follow in the constructed ViT network are the same as those in the conventional ViT network. The Transformer encoder module consists of a multi-head self-attention mechanism, residual connections, layer normalization and a fully connected neural network, and uses the multi-head self-attention mechanism to mine the global feature information in the monitoring data. The output feature of the Transformer encoder module has the same dimensions as its input feature, i.e. it lies in $\mathbb{R}^{(Z+1) \times N}$. In the classification module, the feature at the class token position of the Transformer encoder output, in $\mathbb{R}^{1 \times N}$, is fed into a fully connected neural network to obtain the ViT network output feature.
As shown in fig. 3, the source domain and target domain feature alignment process is constrained by using a weighted balance mechanism, so that the source domain and target domain sharing category edge distribution alignment is realized; in a weighted balance mechanism, constructing a class weight coefficient according to the target domain data tag probability distribution given by the classifier, and evaluating the possibility that different classes of a source domain are sharing classes; the method comprises the steps that a weight coefficient is used for restraining a part of migration network parameter updating process, a larger weight is given to source domain sharing type data, and the characteristic migration effect of the source domain sharing type data and target domain sharing type data is enhanced; and (3) giving smaller weight to the source domain outlier category data, and weakening the influence of the source domain outlier category data on the characteristic migration. The edge distribution alignment process specifically comprises the following steps:
First, the label information that the classifier gives for all target domain data is accumulated to construct the class weight coefficient γ, which reduces the influence of any single classifier prediction error. Here γ is computed from the target domain label probability distributions output by the softmax layer of the classifier; K is the number of source domain label classes, and the k-th entry of each distribution is the probability, given by the softmax layer, that a target domain sample belongs to the k-th class.
Then, the class weight coefficient is regularized to amplify the difference between the classes shared by the source and target domains and the source domain outlier classes:

$$\gamma_n = \frac{\gamma}{\max_\gamma}$$

where $\max_\gamma = \max(\gamma) = \max([\gamma_1, \ldots, \gamma_k, \ldots, \gamma_K])$ is the maximum class weight coefficient, $\gamma_k$ is the class weight coefficient of the k-th class, and $\gamma_n$ is the regularized class weight coefficient.
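The two weighting steps can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: "accumulating" is taken to mean averaging the softmax outputs over all target domain samples, and regularization is taken to mean division by the largest entry. The function name `class_weights` and the probability values are hypothetical.

```python
def class_weights(target_probs):
    """Average the softmax outputs over target samples, then divide by the maximum.

    target_probs[i][k] is the predicted probability that target sample i
    belongs to source class k (assumed input layout).
    """
    n = len(target_probs)
    k = len(target_probs[0])
    gamma = [sum(p[j] for p in target_probs) / n for j in range(k)]
    max_gamma = max(gamma)
    return [g / max_gamma for g in gamma]

# Three made-up target predictions over K = 4 source classes.
probs = [
    [0.7, 0.2, 0.05, 0.05],
    [0.6, 0.3, 0.05, 0.05],
    [0.2, 0.7, 0.05, 0.05],
]
gamma_n = class_weights(probs)
```

In this toy input the last two classes receive almost no predicted probability, so their weights are small relative to the first two, which is how likely outlier classes are down-weighted.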
Finally, the regularized class weight coefficient constrains the adversarial training process of the partial migration network through the loss function L_1 of formula (3). This strengthens feature alignment between the shared-class data of the source and target domains, weakens the influence of source domain outlier-class data on feature migration, and aligns the marginal distributions of the classes shared by the two domains. Here θ_g, θ_d, and θ_c denote the network parameters of the feature extractor G, the domain discriminator D, and the classifier C; d_i denotes the domain label of the i-th sample; L_y and L_d denote the cross-entropy loss functions of the classifier and the domain discriminator; and λ is the weight parameter that balances the two loss functions. The partial migration network introduces a gradient reversal layer between the feature extractor and the domain discriminator: the gradient of the domain classification loss is automatically reversed before it is back-propagated to the feature extractor parameters, so the adversarial training of the network parameters is carried out end to end.
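How class weights enter the two cross-entropy terms can be sketched as follows. This is an illustration of the general idea, not a reproduction of formula (3): it assumes source samples carry domain label 1 and target samples domain label 0, that each source term is scaled by the weight of its class, and that target terms are unweighted. The function name `weighted_losses` and all numeric values are hypothetical.

```python
import math

def weighted_losses(src_cls_probs, src_labels, src_dom_probs, tgt_dom_probs, gamma_n):
    """Return (classification loss, domain loss) with class-weighted source terms.

    src_cls_probs[i][k]: classifier softmax output for source sample i
    src_dom_probs[i], tgt_dom_probs[j]: discriminator probability of "source"
    gamma_n[k]: regularized weight of class k
    """
    eps = 1e-12  # guards against log(0)
    n_s, n_t = len(src_labels), len(tgt_dom_probs)
    l_y = sum(-gamma_n[y] * math.log(p[y] + eps)
              for p, y in zip(src_cls_probs, src_labels)) / n_s
    l_d_src = sum(-gamma_n[y] * math.log(d + eps)
                  for d, y in zip(src_dom_probs, src_labels)) / n_s
    l_d_tgt = sum(-math.log(1.0 - d + eps) for d in tgt_dom_probs) / n_t
    return l_y, l_d_src + l_d_tgt

gamma_n = [1.0, 0.8, 0.0, 0.0]          # classes 2 and 3 treated as outliers
src_cls_probs = [[0.5, 0.3, 0.1, 0.1], [0.1, 0.1, 0.7, 0.1]]
src_labels = [0, 2]
src_dom_probs = [0.5, 0.9]
tgt_dom_probs = [0.5]
l_y, l_d = weighted_losses(src_cls_probs, src_labels, src_dom_probs, tgt_dom_probs, gamma_n)
```

With a gradient reversal layer, the discriminator would descend on `l_d` while the feature extractor receives the sign-flipped gradient scaled by λ, so both players are trained in a single backward pass. The second source sample has weight 0 here and contributes to neither loss.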
The state distribution alignment process specifically comprises the following steps:
First, a triplet loss function constrains the parameter update process of the feature extractor and the classifier. It reduces the distance between the label probability distributions of source domain data of the same class, enlarges the distance between the label probability distributions of source domain data of different classes, and thereby gathers the label probability distributions of each source domain class into a cluster. Here x_a denotes a randomly selected anchor sample, x_p a positive sample with the same label as the anchor sample, x_n a negative sample with a different label from the anchor sample, and margin the preset distance between the positive and negative samples. Guiding the optimization of the partial migration network with the triplet loss effectively reduces the label probability distribution distance between the anchor sample and same-class positive samples and enlarges that distance between the anchor sample and different-class negative samples.
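A triplet loss over label probability distributions can be sketched as below. The sketch uses the common hinge form max(d(a, p) − d(a, n) + margin, 0) with Euclidean distance, which is an assumption about the exact formula; the margin value and the example distributions are hypothetical.

```python
import math

def l2(u, v):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss: pull the positive in, push the negative out."""
    return max(l2(anchor, positive) - l2(anchor, negative) + margin, 0.0)

anchor = [1.0, 0.0]
positive = [1.0, 0.0]        # same class as the anchor
far_negative = [0.0, 1.0]    # different class, already well separated
near_negative = [0.9, 0.1]   # different class, still too close

loss_far = triplet_loss(anchor, positive, far_negative)
loss_near = triplet_loss(anchor, positive, near_negative)
```

The loss is zero once the negative is farther from the anchor than the positive by at least the margin, so only triplets that are not yet separated contribute gradient.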
Second, K-means clustering is applied to the source domain label probability distributions to identify their cluster structure. The internal parameters θ_clu of the K-means clustering model are updated on the source domain data of each class k, with U_k denoting the cluster center corresponding to the label probability distributions of the k-th class of source domain data. The number of clusters in the K-means clustering model is set equal to the number of source domain label classes, and the internal parameters are adjusted so that the label probability distributions of each source domain class gather around the corresponding cluster center. The membership of target domain data in the source domain clusters is then judged with a proximity criterion: when the L2 norm between the label probability distribution of a target domain sample and the cluster center U_k of the k-th class of source domain data is the smallest, that distribution is considered to belong to the k-th cluster, and the class label of the target domain sample is taken to be k.
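The proximity criterion can be sketched as a nearest-center lookup. Note the simplification: instead of running iterative K-means, the sketch takes each cluster center to be the mean of the labeled source distributions of that class, which is a stand-in for the fitted K-means centers and assumes every class has at least one source sample. The function names and the example values are hypothetical.

```python
import math

def class_centers(source_probs, source_labels, num_classes):
    """Per-class mean of source label probability distributions (stand-in for K-means centers)."""
    dim = len(source_probs[0])
    sums = [[0.0] * dim for _ in range(num_classes)]
    counts = [0] * num_classes
    for p, y in zip(source_probs, source_labels):
        counts[y] += 1
        for j, v in enumerate(p):
            sums[y][j] += v
    # Assumes counts[k] > 0 for every class k.
    return [[v / counts[k] for v in sums[k]] for k in range(num_classes)]

def assign_cluster(target_prob, centers):
    """Index of the center with the smallest L2 distance to the target distribution."""
    dists = [math.sqrt(sum((a - b) ** 2 for a, b in zip(target_prob, c))) for c in centers]
    return dists.index(min(dists))

source_probs = [[0.9, 0.1], [0.7, 0.3], [0.2, 0.8]]
source_labels = [0, 0, 1]
centers = class_centers(source_probs, source_labels, num_classes=2)
label_a = assign_cluster([0.6, 0.4], centers)
label_b = assign_cluster([0.3, 0.7], centers)
```

The returned index serves as the estimated class label of the unlabeled target sample.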
Finally, the L2 norm is used to shorten the distance between each target domain label probability distribution and the cluster center of the same-class source domain data, which aligns the state distributions of the classes shared by the source and target domains.
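The alignment term can be sketched as the average L2 distance from each target distribution to its nearest source cluster center. Whether the original formula averages, sums, or squares the norm is not recoverable here, so the averaging in this sketch is an assumption; the function name `state_alignment_loss` and the values are hypothetical.

```python
import math

def state_alignment_loss(target_probs, centers):
    """Mean L2 distance between each target distribution and its nearest center."""
    total = 0.0
    for p in target_probs:
        total += min(
            math.sqrt(sum((a - b) ** 2 for a, b in zip(p, c))) for c in centers
        )
    return total / len(target_probs)

centers = [[1.0, 0.0], [0.0, 1.0]]
aligned_loss = state_alignment_loss([[1.0, 0.0], [0.0, 1.0]], centers)
unaligned_loss = state_alignment_loss([[0.6, 0.4]], centers)
```

Minimizing this quantity with respect to the feature extractor and classifier parameters moves target predictions toward the source cluster they are assigned to.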
Step 4: Select part of the data collected under the same working condition as the target domain data as test data, input the test data into the trained two-stage alignment partial migration network, and predict the class of the test data.
Bearing fault data at two rotating speeds were collected on a rotating equipment test rig to verify the effectiveness of the method. The test rig consists of a motor, a rotor system, a load block, support bearings, and other components. The experimental bearing sits in the bearing seat on the right side of the test rig, and an acceleration sensor attached to the front of the bearing seat collects the bearing fault data.
In the fault experiment, the sampling frequency is set to 10 kHz, and bearing fault data at rotating speeds of 900 r/min and 1500 r/min are obtained by adjusting the motor speed. The bearing fault data at each rotating speed cover 4 bearing health states: normal (N), inner race fault (IF), ball fault (RF), and outer race fault (OF), with 200 samples per state and a single-sample data length of 10 kHz. The experimental working conditions shown in Table 1 were obtained by combining the bearing fault data at each rotating speed in groups of two or three classes.
Table 1. Bearing fault experimental data under different working conditions
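The grouping of health states into target domain label sets can be enumerated with a short sketch. It assumes that each working condition uses all four states in the source domain and a subset of two or three states in the target domain; this is consistent with the later statement that working conditions 1-6 contain two target classes and 7-10 contain more, but the ordering below and its mapping to the rows of Table 1 are assumptions.

```python
from itertools import combinations

# Health state abbreviations as given in the description.
states = ["N", "IF", "RF", "OF"]

# All target label sets of size two, followed by all of size three.
target_sets = [c for size in (2, 3) for c in combinations(states, size)]

two_class = [s for s in target_sets if len(s) == 2]
three_class = [s for s in target_sets if len(s) == 3]
```

Choosing two of four states gives 6 subsets and choosing three of four gives 4, for 10 subsets in total.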
To verify the effectiveness of the method of the present invention and to illustrate the necessity of each of its components, the following 4 comparison methods were constructed.
Comparison method 1 (ViT network + loss function L_1): this method uses the same feature extractor, domain discriminator, and classifier network structures as the method of the invention. The network parameters of the feature extractor, domain discriminator, and classifier are updated with the loss function L_1 shown in formula (3). This method considers only marginal distribution alignment between the shared-class data of the source and target domains.
Comparison method 2 (ViT network + ordinary adversarial loss): this method uses the same feature extractor, domain discriminator, and classifier network structures as the method of the invention. Removing the weight coefficient constraint from the loss function in formula (3) yields the ordinary adversarial loss function, which is used to update the network parameters of the feature extractor, domain discriminator, and classifier. This method considers only marginal distribution alignment between the overall data of the source and target domains.
Comparison method 3 (ViT network + cross entropy): this method uses the same feature extractor and classifier network structures as the method of the invention. The network parameters of the feature extractor and classifier are updated with the cross-entropy loss to extract more discriminative global feature information from the source domain data. This method is used to verify that the ViT network has good global feature extraction capability.
Comparison method 4 (CNN network + loss function L): this method uses a CNN network as the feature extractor, while the domain discriminator and classifier are the same three-layer fully connected neural networks used in the method of the invention. The network parameters of the feature extractor, domain discriminator, and classifier are updated with the loss function L shown in formula (8). This method is used to further verify that the ViT network extracts global feature information better than the CNN network.
The experimental verification was run on an Intel Core i9-10850K CPU at 3.60 GHz with 32 GB of memory and a single NVIDIA GeForce RTX 3060 graphics processing unit, under the Windows 10 operating system with the PyTorch 1.7.1 deep learning framework. When training the method of the invention and the comparison methods, the number of model updates is set to 400 and the number of batches to 20. The network parameters of the feature extractor, domain discriminator, and classifier are updated with the Adam optimization algorithm; the initial learning rate is set to 0.0001 and is reduced to 0.00001 after the model parameters have been updated 200 times. To reduce the randomness of the experimental results, each group of comparison experiments was repeated 5 times during algorithm validation.
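The learning rate schedule described above amounts to a single step decay, sketched here as a plain function. The zero-based indexing (updates 0 to 199 use the initial rate, updates 200 to 399 use the reduced rate) is an assumption, and the name `learning_rate` is hypothetical.

```python
def learning_rate(update_idx, initial_lr=1e-4, reduced_lr=1e-5, switch_after=200):
    """Step schedule: initial rate for the first `switch_after` updates, reduced rate afterwards."""
    return initial_lr if update_idx < switch_after else reduced_lr

total_updates = 400
schedule = [learning_rate(i) for i in range(total_updates)]
```

In PyTorch, a comparable step decay could be expressed with `torch.optim.lr_scheduler.MultiStepLR` using `milestones=[200]` and `gamma=0.1`, assuming the scheduler is stepped once per model update.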
Table 2 gives the diagnostic accuracy statistics of the method of the invention and the comparison methods. Because the method of the invention, comparison method 1, and comparison method 2 impose feature-migration loss constraints on the ViT network, their diagnosis results are clearly better than those of comparison method 3, which only uses the ViT network to extract global feature information from the source domain data.
In working conditions 1-6, the target domain data contain only two label classes, so there are more source domain outlier labels than in working conditions 7-10, and the weighted balance mechanism can effectively filter out the influence of the source domain outlier classes on feature migration. In working conditions 7-10, the numbers of source and target domain labels differ little, and the negative transfer caused by source domain outlier-class data during feature migration is small. As a result, comparison method 1 achieves higher diagnostic accuracy than comparison method 2 in working conditions 1-6, whereas comparison method 2 achieves higher diagnostic accuracy than comparison method 1 in working conditions 7-10.
Compared with comparison method 1, the method of the invention also considers the feature alignment of the subclasses within the classes shared by the source and target domains: it uses several metric learning measures to gather the features of each source domain class into a cluster and effectively shortens the feature distance between the shared classes of the two domains. Its diagnosis results are therefore clearly better than those of comparison method 1. Compared with comparison method 2, the method of the invention considers both the overall feature alignment of the shared classes and the feature alignment of each subclass of the shared classes, and its diagnostic accuracy is clearly higher.
The local receptive field structure of the CNN network tends to miss effective features when deep features are extracted, which makes the deep features poorly discriminative; the self-attention mechanism of the ViT network can fully mine the global feature information in the monitoring data, so the deep features extracted by the ViT network are highly discriminative. The diagnosis results of the method of the invention are therefore clearly better than those of the CNN-based comparison method 4. Compared with the four comparison methods, the method of the invention has the highest average diagnostic accuracy and the smallest fluctuation of results, which verifies its effectiveness.
Table 2. Diagnostic accuracy statistics of the different diagnosis methods
The foregoing is merely a preferred embodiment of the present invention. It should be noted that those skilled in the art may make modifications and adaptations without departing from the principles of the present invention, and such modifications and adaptations are intended to fall within the scope of protection of the present invention.
Claims (7)
1. A rotating equipment fault diagnosis method based on a two-stage alignment partial migration network, characterized by comprising the following steps:
step 1: collecting data containing more fault classes under one working condition as source domain data, wherein all source domain data are labeled;
step 2: collecting data containing fewer fault classes under other working conditions as target domain data, wherein all target domain data are unlabeled;
step 3: inputting the source domain data and the target domain data simultaneously into the constructed two-stage alignment partial migration network, and updating the internal parameters of the network to align the marginal distributions of the classes shared by the source and target domains and to align the state distributions among the subclasses of those shared classes;
step 4: selecting part of the data collected under the same working condition as the target domain data as test data, inputting the test data into the trained two-stage alignment partial migration network, and predicting the class of the test data.
2. The rotating equipment fault diagnosis method based on the two-stage alignment partial migration network according to claim 1, wherein in step 1, the labeled source domain data set is expressed as $\mathcal{D}_s=\{(x_i^s,y_i^s)\}_{i=1}^{n_s}$, where $x_i^s$ denotes the i-th sample of the source domain, $y_i^s$ denotes the fault label of the i-th sample, and $n_s$ denotes the number of source domain samples; the label space of the source domain data set $\mathcal{D}_s$ is $Y_s$, and the corresponding marginal state distribution is $P_s$.
3. The rotating equipment fault diagnosis method based on the two-stage alignment partial migration network according to claim 2, wherein in step 2, the unlabeled target domain data set is expressed as $\mathcal{D}_t=\{x_i^t\}_{i=1}^{n_t}$, where $x_i^t$ denotes the i-th sample of the target domain and $n_t$ denotes the number of target domain samples; the label space of the target domain data set $\mathcal{D}_t$ is $Y_t$, and the corresponding marginal state distribution is $P_t$; the target domain label space is a subset of the source domain label space, i.e. $Y_t \subset Y_s$; the marginal state distributions of the source domain data and the target domain data differ, i.e. $P_s \neq P_t$, and the marginal state distribution of the source domain shared-class data also differs from that of the target domain data, i.e. $P_{s,c} \neq P_t$.
4. The rotating equipment fault diagnosis method based on the two-stage alignment partial migration network according to claim 3, wherein in step 3, the network structure of the two-stage alignment partial migration network comprises three basic units: a feature extractor G, a domain discriminator D, and a classifier C; the feature extractor G is built on a ViT network, and the domain discriminator D and the classifier C are both three-layer fully connected neural networks;
the constructed ViT network comprises an input processing module, a Transformer encoder module, and a classification module; in the input processing module, the collected monitoring data are taken to be a one-dimensional vibration signal in L-dimensional space; first, the vibration signal is divided equally into Z segments, i.e. Z tokens, each containing S sampling points with L = Z × S, which gives the input data of the ViT network; second, the segmented vibration signal is fed into a fully connected neural network with S input nodes and N output nodes to obtain the processed feature x_f; then, the feature x_f is concatenated with a class token feature whose parameters are trainable to obtain the combined feature s_cb; finally, a trainable parameter regarded as the position embedding is added to the combined feature s_cb to obtain the input feature of the subsequent Transformer encoder module;
the Transformer encoder module consists of a multi-head self-attention mechanism, residual connections, layer normalization, and a fully connected neural network, and uses the multi-head self-attention mechanism to mine global feature information from the monitoring data; the output features of the Transformer encoder module have the same dimensions as its input features; in the classification module, the feature at the class-token position of the Transformer encoder output is fed into a fully connected neural network to obtain the ViT network output features.
5. The rotating equipment fault diagnosis method based on the two-stage alignment partial migration network according to claim 4, wherein in step 3, a weighted balance mechanism constrains the feature alignment process between the source domain and the target domain, so that the marginal distributions of the classes shared by the two domains are aligned; in the weighted balance mechanism, class weight coefficients are constructed from the target domain label probability distributions given by the classifier to estimate how likely each source domain class is to be a shared class; the weight coefficients constrain the parameter update process of the partial migration network, so that source domain shared-class data receive larger weights, which strengthens feature migration between the shared-class data of the source and target domains, and source domain outlier-class data receive smaller weights, which weakens their influence on feature migration.
6. The rotating equipment fault diagnosis method based on the two-stage alignment partial migration network according to claim 5, wherein the marginal distribution alignment process specifically comprises the following steps:
first, the label information that the classifier gives for all target domain data is accumulated to construct the class weight coefficient γ, where γ is computed from the target domain label probability distributions output by the softmax layer of the classifier, K is the number of source domain label classes, and the k-th entry of each distribution is the probability, given by the softmax layer, that a target domain sample belongs to the k-th class;
then, the class weight coefficient is regularized:

$$\gamma_n = \frac{\gamma}{\max_\gamma}$$

where $\max_\gamma = \max(\gamma) = \max([\gamma_1, \ldots, \gamma_k, \ldots, \gamma_K])$ is the maximum class weight coefficient, $\gamma_k$ is the class weight coefficient of the k-th class, and $\gamma_n$ is the regularized class weight coefficient;
finally, the regularized class weight coefficient constrains the adversarial training process of the partial migration network, so that the marginal distributions of the classes shared by the source and target domains are aligned, where θ_g, θ_d, and θ_c denote the network parameters of the feature extractor G, the domain discriminator D, and the classifier C; d_i denotes the domain label of the i-th sample; L_y and L_d denote the cross-entropy loss functions of the classifier and the domain discriminator; λ is the weight parameter that balances the two loss functions; the partial migration network introduces a gradient reversal layer between the feature extractor and the domain discriminator, the gradient of the domain classification loss is automatically reversed before it is back-propagated to the feature extractor parameters, and the adversarial training of the network parameters is carried out end to end.
7. The rotating equipment fault diagnosis method based on the two-stage alignment partial migration network according to claim 6, wherein in step 3, the state distribution alignment process specifically comprises the following steps:
first, a triplet loss function constrains the parameter update process of the feature extractor and the classifier, reducing the distance between the label probability distributions of source domain data of the same class, enlarging the distance between the label probability distributions of source domain data of different classes, and gathering the label probability distributions of each source domain class into a cluster, where x_a denotes a randomly selected anchor sample, x_p a positive sample with the same label as the anchor sample, x_n a negative sample with a different label from the anchor sample, and margin the preset distance between the positive and negative samples;
second, K-means clustering is applied to the source domain label probability distributions to identify their cluster structure; the internal parameters θ_clu of the K-means clustering model are updated on the source domain data of each class k, with U_k denoting the cluster center corresponding to the label probability distributions of the k-th class of source domain data; the number of clusters in the K-means clustering model is set equal to the number of source domain label classes, and the internal parameters are adjusted so that the label probability distributions of each source domain class gather around the corresponding cluster center; the membership of target domain data in the source domain clusters is then judged with a proximity criterion: when the L2 norm between the label probability distribution of a target domain sample and the cluster center U_k of the k-th class of source domain data is the smallest, that distribution belongs to the k-th cluster, and the class label of the target domain sample is k;
finally, the L2 norm is used to shorten the distance between each target domain label probability distribution and the cluster center of the same-class source domain data, which aligns the state distributions of the classes shared by the source and target domains.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211531891.4A CN116150668B (en) | 2022-12-01 | 2022-12-01 | Rotating equipment fault diagnosis method based on double-stage alignment partial migration network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116150668A (en) | 2023-05-23 |
CN116150668B (en) | 2023-08-11 |
Family
ID=86357296
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211531891.4A Active CN116150668B (en) | 2022-12-01 | 2022-12-01 | Rotating equipment fault diagnosis method based on double-stage alignment partial migration network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116150668B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117312980A (en) * | 2023-08-22 | 2023-12-29 | 中国矿业大学 | Rotary equipment fault diagnosis method based on partial domain adaptation and knowledge distillation |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20170083751A1 (en) * | 2015-09-21 | 2017-03-23 | Mitsubishi Electric Research Laboratories, Inc. | Method for estimating locations of facial landmarks in an image of a face using globally aligned regression |
CN112183581A (en) * | 2020-09-07 | 2021-01-05 | 华南理工大学 | Semi-supervised mechanical fault diagnosis method based on self-adaptive migration neural network |
CN113673397A (en) * | 2021-08-11 | 2021-11-19 | 山东科技大学 | Local area adaptive mechanical fault diagnosis method based on class weighting alignment |
CN113947725A (en) * | 2021-10-26 | 2022-01-18 | 中国矿业大学 | Hyperspectral image classification method based on convolution width migration network |
CN114358123A (en) * | 2021-12-03 | 2022-04-15 | 华南理工大学 | Generalized open set fault diagnosis method based on deep countermeasure migration network |
Non-Patent Citations (1)
Title |
---|
Yongchao Zhang et al.: "MMFNet: Multisensor Data and Multiscale Feature Fusion Model for Intelligent Cross-Domain Machinery Fault Diagnosis", IEEE Transactions on Instrumentation and Measurement, pages 1 - 11 * |
Also Published As
Publication number | Publication date |
---|---|
CN116150668B (en) | 2023-08-11 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111914883B (en) | Spindle bearing state evaluation method and device based on deep fusion network | |
CN114358124B (en) | New fault diagnosis method for rotary machinery based on deep countermeasure convolutional neural network | |
CN111562108A (en) | Rolling bearing intelligent fault diagnosis method based on CNN and FCMC | |
CN114048568A (en) | Rotating machine fault diagnosis method based on multi-source migration fusion contraction framework | |
CN114358123B (en) | Generalized open set fault diagnosis method based on deep countermeasure migration network | |
CN116304820B (en) | Bearing fault type prediction method and system based on multi-source domain transfer learning | |
CN116894187A (en) | Gear box fault diagnosis method based on deep migration learning | |
CN113375941A (en) | Open set fault diagnosis method for high-speed motor train unit bearing | |
CN113673346A (en) | Motor vibration data processing and state recognition method based on multi-scale SE-Resnet | |
CN114676742A (en) | Power grid abnormal electricity utilization detection method based on attention mechanism and residual error network | |
CN114429152A (en) | Rolling bearing fault diagnosis method based on dynamic index antagonism self-adaption | |
CN112763215B (en) | Multi-working-condition online fault diagnosis method based on modular federal deep learning | |
CN116026593A (en) | Cross-working-condition rolling bearing fault targeted migration diagnosis method and system | |
CN116150668B (en) | Rotating equipment fault diagnosis method based on double-stage alignment partial migration network | |
CN116975718A (en) | Rolling bearing cross-domain fault diagnosis method based on self-supervision learning | |
CN115905976A (en) | Method, system and equipment for diagnosing high way Bi-LSTM bearing fault based on attention mechanism | |
CN113609480B (en) | Multipath learning intrusion detection method based on large-scale network flow | |
CN113158537B (en) | Aeroengine gas circuit fault diagnosis method based on LSTM combined attention mechanism | |
CN112598666B (en) | Cable tunnel anomaly detection method based on convolutional neural network | |
Saha et al. | Enhancing bearing fault diagnosis using transfer learning and random forest classification: A comparative study on variable working conditions | |
CN113723592A (en) | Fault diagnosis method based on wind power gear box monitoring system | |
CN117191396A (en) | Gear box fault diagnosis method based on two-stage migration | |
CN114383846B (en) | Bearing composite fault diagnosis method based on fault label information vector | |
CN114139598B (en) | Fault diagnosis method and diagnosis framework based on deep cost sensitive convolution network | |
CN112269778A (en) | Equipment fault diagnosis method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||