CN116910652A - Equipment fault diagnosis method based on federated self-supervised learning - Google Patents
- Publication number: CN116910652A
- Application number: CN202310893683.7A
- Authority: CN (China)
- Prior art keywords: client, data, data set, local, feature extractor
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06F18/241: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/10: Pre-processing; Data cleansing
- G06F18/213: Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
- G06F18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting

All codes fall under G06F18/00 (Pattern recognition) within G (Physics), G06 (Computing; Calculating or Counting), G06F (Electric digital data processing).
Abstract
The invention discloses an equipment fault diagnosis method based on federated self-supervised learning, comprising the following steps. First, a server initializes the weights of a feature extractor and transmits them to each client. Each client then uses a sensor to acquire the signals generated by its local equipment during operation, records them as local vibration data, and obtains an unlabeled dataset and a labeled dataset from them. Next, each client trains its local feature extractor under a federated self-supervised learning framework, obtaining a trained local feature extractor. Each client then trains a classifier under a supervised learning framework to obtain a client classifier, and connects the feature extractor and the client classifier to form a fault diagnosis model. Finally, equipment diagnosis is performed with the client's fault diagnosis model. The invention solves the problem that fault datasets of rotating equipment are small, dispersed, and short of labels, which makes high-accuracy diagnosis models difficult to train.
Description
Technical Field
The invention belongs to the field of fault diagnosis for rotating equipment, relates to equipment fault diagnosis methods in machine learning, deep learning, and time-series classification, and in particular relates to an equipment fault diagnosis method based on federated self-supervised learning.
Background
Rotating equipment, such as aero-engines and gas turbines, is used widely in modern industry and is becoming increasingly complex and sophisticated. Faults such as bearing damage or blade fracture can cause serious accidents and enormous economic losses. Accurately identifying the state of running equipment and intervening promptly when early symptoms of a fault appear is therefore of great significance for improving production efficiency and reducing disaster losses.
At present, fault diagnosis of rotating equipment mostly relies on data-driven methods, which mine the mapping from vibration signals to equipment states from large amounts of data. Common methods include support vector machines, decision trees, and neural networks. Deep neural networks are one of the main current research directions because of their ability to extract features automatically. The method disclosed in the patent 'A method for diagnosing faults of cement production rotating equipment based on machine learning' uses a one-dimensional convolutional neural network and a fully connected neural network to extract vibration features, and then uses ensemble learning to obtain diagnosis results from multiple classifiers. The method disclosed in the patent 'A rotating equipment fault diagnosis method, system and readable storage medium based on a deep residual network' uses a deep residual network to extract fault features from vibration signals.
While existing methods reach a fairly high level of diagnostic accuracy, they train models on large, fully labeled datasets, which is impractical in many settings. Scarcity of annotations is the most common limitation: vibration signals must be labeled by experts with domain knowledge, which is costly, so in practical applications only a small fraction of the data is labeled. Data security is another consideration. Different customers use equipment under different conditions, and to ensure model robustness, data should be collected from as many customers as possible. However, customers may be reluctant to share their data to train a better model, whether out of commercial interest or concern about the risk of data leakage.
Disclosure of Invention
To address the problems and needs described in the background art, the invention provides an equipment fault diagnosis method based on federated self-supervised learning, which can train a fault diagnosis model from multiple dispersed, unshared, small datasets that lack labels, and use it for online diagnosis. The method can train an effective fault diagnosis model even when fault datasets are small, scattered, and short of labels.
The specific technical scheme of the invention comprises the following steps:
S1: a server initializes the weights of a feature extractor and transmits them to each client; each client uses them as the initial weights of its local feature extractor;
S2: each client uses a sensor to acquire the signals generated by its local equipment during operation and records them as local vibration data; the local vibration data is then preprocessed to obtain an unlabeled dataset and a labeled dataset;
S3: under a federated self-supervised learning framework, each client trains its local feature extractor with the unlabeled dataset, obtaining a trained local feature extractor;
S4: under a supervised learning framework, each client trains a classifier with the labeled dataset to obtain a client classifier; within each client, the current feature extractor is connected with the client classifier to form a fault diagnosis model;
S5: the sensor data of the equipment to be diagnosed is preprocessed and then fed into the fault diagnosis model of the corresponding client to obtain the equipment diagnosis result.
In step S1, the feature extractor adopts a convolutional neural network with residual connections.
In S2, each client performs the following steps:
S21: collect the signals generated by the local equipment during operation with an accelerometer and record them as local vibration data;
S22: randomly select a preset proportion of the local vibration data, label the selected data according to the real state of the equipment to obtain an initial labeled dataset, and record the unselected local vibration data as the initial unlabeled dataset;
S23: divide all signals of the initial labeled dataset and the initial unlabeled dataset into multiple segments with a sliding window, obtaining a segmented labeled dataset and a segmented unlabeled dataset;
S24: numerically scale the segmented labeled and unlabeled datasets with the max-min method to obtain the final labeled and unlabeled datasets.
Step S3 is specifically as follows:
S31: in each training round, each client trains its local feature extractor with the unlabeled dataset under the self-supervised learning framework, obtains the local feature extractor weights after the current round of training, and uploads them to the server;
S32: the server aggregates the local feature extractor weights of all clients to obtain the global feature extractor weights and transmits them to the local feature extractors of all clients;
S33: S31-S32 are repeated, updating the global feature extractor weights over multiple rounds until a preset number of rounds is reached; the final global feature extractor weights are transmitted to the local feature extractors of all clients, so that every client obtains a trained local feature extractor.
In S31, each client performs the following steps:
S311: apply two different data augmentation methods to each piece of unlabeled data in the unlabeled dataset, obtaining a corresponding pair of augmented samples;
S312: under the self-supervised learning framework, train the local feature extractor for one round on the augmented samples corresponding to the unlabeled dataset, obtain the local feature extractor weights after the current round of training, and upload them to the server.
In S311, the first augmented sample $\tilde{x}^{(1)}$ is obtained by first adding Gaussian noise to each piece of unlabeled data and then scaling the noisy data; the second augmented sample $\tilde{x}^{(2)}$ is obtained by smoothly warping the intervals between the time steps of each piece of unlabeled data and then applying noise.
During training, the local feature extractor outputs a first feature matrix $z^{(1)}$ and a second feature matrix $z^{(2)}$, one per augmented view. The feature extractor weights are optimized with a gradient descent algorithm. The loss function comprises a first loss function and a second loss function; the first loss function combines a context contrastive term and a temporal contrastive term:

$$\mathrm{loss1} = \alpha\, l_c + \beta\, l_t$$

where N is the batch size; α and β are the first and second hyperparameters; $\mathbb{1}[\cdot]$ is the indicator function; i and j denote the first and second indices of samples in the batch; $l_c$ is the context contrastive loss function and $l_t$ the temporal contrastive loss function; $\hat{z}^{(1)}$ and $\hat{z}^{(2)}$ denote the first and second cropped feature matrices; s1 and s2 denote the first and second starting positions, and e1 and e2 the first and second ending positions; T denotes the length of the time dimension of the feature matrices; $\hat{z}^{(1)}_{i,t}$ and $\hat{z}^{(2)}_{i,t}$ denote the features of sample i at time step t in the first and second cropped feature matrices; $\hat{z}^{(1)}_i$ and $\hat{z}^{(2)}_i$ denote sample i, and $\hat{z}^{(2)}_j$ sample j, of the cropped feature matrices;
the calculation formula of the second loss function loss2 is as follows:
wherein ,first feature matrix extracted by local feature extractor respectively representing kth client kth round>And a first feature matrix->First feature matrix extracted by local feature extractor respectively representing kth client kth round>And a first feature matrix->Representing features extracted by each client using global feature extractor weights received from the server at the time of the r-th round of training.
In S32, the server aggregates the local feature extractor weights of all clients using a weighted average:

$$\theta_G = \sum_{k} \frac{|D_k|}{|D|}\, \theta_k$$

where k is the client index, |D| and |D_k| denote the total data volume across clients and the data volume of the k-th client respectively, and $\theta_G$ and $\theta_k$ denote the global feature extractor weights and the local feature extractor weights of the k-th client.
In S4, each client performs the following steps:
S41: input the labeled dataset into the local feature extractor with the updated weights to obtain a feature-matrix dataset;
S42: take the feature-matrix dataset as the input of a support vector machine and train it to obtain the client classifier.
In step S5, the sensor data of the equipment to be diagnosed is segmented with a sliding window, normalized with the max-min method, and then input into the fault diagnosis model of the corresponding client.
In the method of the invention, federated learning trains a feature extractor that is effective for all clients from scattered client data; during training, self-supervised learning extracts useful knowledge from a large amount of unlabeled data; and supervised learning improves the final performance of the classifier using a small amount of labeled data.
Compared with the prior art, the invention has the following beneficial effects:
1. Model training is performed locally at each client; client data never needs to be uploaded to the server, so clients need not worry about data leakage.
2. The method can learn knowledge from a large amount of unlabeled data, and therefore performs well in current equipment fault diagnosis applications, where labels are generally scarce.
3. The invention adopts contrastive self-supervised learning to train a robust model from small unlabeled datasets, and adopts federated learning to aggregate a fault feature extractor with global knowledge from multiple clients while avoiding sharing of local client data, thereby solving the problem that rotating-equipment fault datasets are small and scattered and lack labels, which makes high-accuracy diagnosis models difficult to train.
Drawings
FIG. 1 is a flow chart showing the steps of the present invention.
FIG. 2 is a schematic diagram of training and use of the model according to the present invention.
FIG. 3 is a schematic diagram of an experimental bench according to an embodiment of the invention.
Fig. 4 shows three client data distribution cases according to an embodiment of the present invention.
Fig. 5 is a confusion matrix of the first client fault diagnosis result in the embodiment of the present invention.
Fig. 6 is a confusion matrix of fault diagnosis results of the second client according to the embodiment of the present invention.
Fig. 7 is a confusion matrix of fault diagnosis results of a third client according to an embodiment of the present invention.
Detailed Description
The invention is further described below using the Case Western Reserve University bearing fault dataset (CWRU) as the data for a specific example.
The test bench of the CWRU dataset comprises a motor, a torque sensor, and a dynamometer (power tester). The motor shaft is supported by the bearing under test; an accelerometer is mounted on the motor and samples the vibration signal 12,000 times per second (12 kHz). A schematic of the test bench is shown in FIG. 3. The bearing has three fault types, namely inner-race damage, outer-race damage, and roller damage, and each fault type is subdivided into three severity levels; together with the normal state, the bearing data therefore cover ten states in total.
FIG. 1 and FIG. 2 illustrate the flow of the invention; combined with the implementation on the CWRU dataset, it specifically includes the following steps:
S1: the server initializes the weights of the feature extractor and transmits them to each client; each client uses them as the initial weights of its local feature extractor;
in S1, the feature extractor employs a convolutional neural network with residual connection (res net), which is a widely used deep learning model. The server and the feature extractor structure of the client remain identical and in order to accommodate time series data, the convolution kernel of Resnet is limited to sliding only along the time dimension.
S2: each client uses a sensor to acquire the signals generated by its local equipment during operation and records them as local vibration data; the local vibration data is then preprocessed to obtain an unlabeled dataset and a labeled dataset;
in S2, each client performs the following steps:
S21: collect the signals generated by the local equipment during operation with an accelerometer and record them as local vibration data. Specifically, as shown in FIG. 3, an accelerometer is mounted on the motor near the bearing; the motor is operated, and at some later time the accelerometer is activated to sample a ten-second signal while the bearing state is recorded.
S22: randomly select a preset proportion of the local vibration data, label the selected data according to the real state of the equipment reflected by the data, and split it at a 1:2 ratio into an initial labeled dataset and an initial test set; the unselected local vibration data are recorded as the initial unlabeled dataset. In this implementation, the preset proportion is set to 30%.
S23: divide all signals of the initial labeled dataset, initial unlabeled dataset, and initial test set into non-overlapping segments with a sliding window of length 1024 and stride 1024, obtaining the segmented labeled dataset, unlabeled dataset, and test set. The CWRU dataset contains vibration data from 161 bearings; this embodiment takes three clients as an example, so the segmented data of the 161 bearings are partitioned into the local datasets of the three clients in a non-IID (non-independent, non-identically-distributed) manner. The data distribution is shown in FIG. 4; the distributions of the three clients differ considerably. The ratio of the data volumes of each client's local labeled dataset, unlabeled dataset, and test set is about 1:7:2.
S24: numerically scale the segmented labeled dataset, unlabeled dataset, and test set with the max-min method to the interval [0,1], obtaining the final labeled dataset, unlabeled dataset, and test set.
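A minimal numpy sketch of S23-S24 follows, assuming per-segment scaling (the patent does not state whether scaling statistics are computed per segment or per dataset):

```python
import numpy as np

def segment(signal, win=1024, stride=1024):
    """Split a 1-D vibration signal into non-overlapping windows
    (length 1024, step 1024, as in S23)."""
    n = (len(signal) - win) // stride + 1
    return np.stack([signal[i*stride : i*stride + win] for i in range(n)])

def min_max_scale(segments):
    """Scale each segment to [0, 1] with the max-min method (S24)."""
    lo = segments.min(axis=1, keepdims=True)
    hi = segments.max(axis=1, keepdims=True)
    return (segments - lo) / (hi - lo + 1e-12)

# Example: one ten-second recording sampled at 12 kHz
raw = np.random.randn(120_000)       # stand-in for accelerometer data
x = min_max_scale(segment(raw))      # (117, 1024) scaled segments
```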
S3: under the federated self-supervised learning framework, each client trains its local feature extractor with the unlabeled dataset, obtaining a trained local feature extractor;
Step S3 specifically comprises the following steps:
S31: in each training round, each client trains its local feature extractor with the unlabeled dataset under the self-supervised learning framework, obtains the local feature extractor weights after the current round of training, and uploads them to the server;
in S31, each client performs the following steps:
S311: apply two different data augmentation methods to each piece of unlabeled data in the unlabeled dataset, obtaining a corresponding pair of augmented samples;
In S311, the first augmented sample $\tilde{x}^{(1)}$ is obtained by first adding Gaussian noise to each piece of unlabeled data and then scaling the noisy data; the second augmented sample $\tilde{x}^{(2)}$ is obtained by smoothly warping the intervals between the time steps of each piece of unlabeled data and then applying noise. A batch of the unlabeled dataset is written as $x_{train} = \{x_1, x_2, \ldots, x_N\}$, where $x \in \mathbb{R}^L$ and N is the batch size. In this embodiment, the data of one batch share the same noise and scaling factor, sampled from a standard Gaussian distribution.
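A sketch of the two augmentations follows, assuming a noise/scale sigma of 0.1 and a 4-knot smooth warp curve; the exact parameterization is not given in the patent:

```python
import numpy as np

def jitter_scale(x, sigma=0.1):
    """First augmented view: add Gaussian noise, then scale. One noise
    array and one scaling factor are shared by the batch in the patent's
    embodiment; here they are drawn per call for simplicity."""
    noise = np.random.normal(0.0, sigma, size=x.shape[-1])
    factor = 1.0 + np.random.normal(0.0, sigma)
    return (x + noise) * factor

def time_warp_jitter(x, sigma=0.1, knots=4):
    """Second augmented view: smoothly warp the intervals between time
    steps (via a smooth random warp-speed curve), then apply noise."""
    L = x.shape[-1]
    anchors = np.linspace(0, L - 1, knots + 2)
    speeds = np.clip(np.random.normal(1.0, sigma, knots + 2), 0.5, 1.5)
    speed = np.interp(np.arange(L), anchors, speeds)  # smooth warp speed
    t_warp = np.cumsum(speed)                          # warped time axis
    t_warp = (t_warp - t_warp[0]) / (t_warp[-1] - t_warp[0]) * (L - 1)
    warped = np.interp(np.arange(L), t_warp, x)        # resample signal
    return warped + np.random.normal(0.0, sigma, L)
```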
S312: under the self-supervised learning framework, train the local feature extractor for one round on the augmented samples corresponding to the unlabeled dataset, obtain the local feature extractor weights after the current round, and upload them to the server. In this embodiment, a ResNet is used to extract the feature representation of the augmented samples. The convolution kernel size of the ResNet is fixed to [3,1] and restricted to slide only along the time dimension; the output dimension of the ResNet, i.e. the length of each feature representation, is set to 64.
During training, the local feature extractor f outputs a first feature matrix $z^{(1)}$ and a second feature matrix $z^{(2)}$, one per augmented view; T denotes the length of the time dimension of a feature matrix, i.e. the number of time steps. The feature extractor weights are optimized with a gradient descent algorithm; this embodiment uses the Adam optimizer with a learning rate of $3 \times 10^{-4}$. The loss function comprises a first loss function and a second loss function. The first loss function combines a context contrastive term and a temporal contrastive term:

$$\mathrm{loss1} = \alpha\, l_c + \beta\, l_t$$

where N is the batch size, and α and β are the first and second hyperparameters, used to adjust the contributions of $l_c$ and $l_t$. $\mathbb{1}[\cdot]$ is the indicator function, equal to 1 when the bracketed condition holds and 0 otherwise; i and j denote the first and second indices of samples in the batch. $l_c$ is the context contrastive loss, which aims to reduce the feature distance between the pair of augmented views output by the feature extractor and to increase the feature distance between each augmented sample and all other samples. $l_t$ is the temporal contrastive loss, which further constrains the output of the feature extractor in terms of time-step similarity: the feature distance at the same time-step position between the two augmented views is reduced as far as possible, while the feature distances at different time-step positions between the views, and between different time steps of the same sample, are enlarged as far as possible. $l_c$ and $l_t$ are contrastive loss terms designed following the NT-Xent loss; they construct supervision information from the context dependencies and the temporal dependencies, respectively, between augmented samples within the batch, helping the feature extractor benefit from the unlabeled data. The first feature matrix $z^{(1)}$ and the second feature matrix $z^{(2)}$ are randomly cropped into shorter segments: the first cropped feature matrix $\hat{z}^{(1)}$ is obtained by cropping $z^{(1)}$ from the first starting position s1 to the first ending position e1, and the second cropped feature matrix $\hat{z}^{(2)}$ by cropping $z^{(2)}$ from the second starting position s2 to the second ending position e2; the two crops overlap over the segment [s2:e1]. $\hat{z}^{(1)}_{i,t}$ and $\hat{z}^{(2)}_{i,t}$ denote the features of sample i at time step t in the first and second cropped feature matrices, and $\hat{z}^{(1)}_i$, $\hat{z}^{(2)}_i$, and $\hat{z}^{(2)}_j$ denote samples i and j of the respective cropped feature matrices.
The second loss function loss2 constrains the distance between the current local feature extractor and the feature extractor of the previous training round, i.e. the global weights received from the server at the beginning of round r. It is computed over the following quantities: $z^{(1)}_{k,r}$ and $z^{(2)}_{k,r}$ denote the first and second feature matrices extracted by the local feature extractor of the k-th client in round r, and $z^{G,(1)}_{k,r}$ and $z^{G,(2)}_{k,r}$ denote the features extracted by client k using the global feature extractor weights received from the server at the beginning of training round r.
In this embodiment, the cropped feature representation that is retained has length 32; the cropping start position is shared within a batch and is sampled from the uniform distribution U[0, 32].
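The formula images for $l_c$, $l_t$, and loss2 are not reproduced in this text. For concreteness, the following NT-Xent-style forms are one plausible instantiation consistent with the definitions above; the temperature $\tau$, the normalization over the overlap length $T'$, and the squared-distance form of loss2 are assumptions, not the patent's exact formulas:

$$
l_t = -\frac{1}{N T'} \sum_{i=1}^{N} \sum_{t=1}^{T'} \log \frac{\exp\big(\hat{z}^{(1)}_{i,t} \cdot \hat{z}^{(2)}_{i,t} / \tau\big)}{\sum_{t'=1}^{T'} \Big( \exp\big(\hat{z}^{(1)}_{i,t} \cdot \hat{z}^{(2)}_{i,t'} / \tau\big) + \mathbb{1}[t' \neq t]\, \exp\big(\hat{z}^{(1)}_{i,t} \cdot \hat{z}^{(1)}_{i,t'} / \tau\big) \Big)}
$$

$$
l_c = -\frac{1}{N T'} \sum_{i=1}^{N} \sum_{t=1}^{T'} \log \frac{\exp\big(\hat{z}^{(1)}_{i,t} \cdot \hat{z}^{(2)}_{i,t} / \tau\big)}{\sum_{j=1}^{N} \Big( \exp\big(\hat{z}^{(1)}_{i,t} \cdot \hat{z}^{(2)}_{j,t} / \tau\big) + \mathbb{1}[j \neq i]\, \exp\big(\hat{z}^{(1)}_{i,t} \cdot \hat{z}^{(1)}_{j,t} / \tau\big) \Big)}
$$

$$
\mathrm{loss2} = \frac{1}{2} \left( \big\| z^{(1)}_{k,r} - z^{G,(1)}_{k,r} \big\|_2^2 + \big\| z^{(2)}_{k,r} - z^{G,(2)}_{k,r} \big\|_2^2 \right)
$$

where $T'$ is the length of the overlapping cropped region; combining the two objectives as $\mathrm{loss} = \mathrm{loss1} + \mathrm{loss2}$ for each client round is likewise an assumption.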
S32: the server aggregates the local feature extractor weights of all clients to obtain the global feature extractor weights and transmits them to the local feature extractors of all clients, where they serve as the initial weights for each client's next round of local training;
In S32, the server aggregates the local feature extractor weights of all clients using a weighted average:

$$\theta_G = \sum_{k} \frac{|D_k|}{|D|}\, \theta_k$$

where k is the client index, |D| and |D_k| denote the total data volume across clients and the data volume of the k-th client respectively, and $\theta_G$ and $\theta_k$ denote the global feature extractor weights and the local feature extractor weights of the k-th client.
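A minimal sketch of this aggregation formula, assuming each client's weights arrive as a PyTorch state dict:

```python
import torch

def aggregate(client_weights, client_sizes):
    """Weighted average of client feature-extractor state dicts:
    theta_G = sum_k (|D_k| / |D|) * theta_k. Integer buffers are
    averaged as floats for simplicity."""
    total = float(sum(client_sizes))
    global_weights = {}
    for name in client_weights[0]:
        global_weights[name] = sum(
            (n / total) * w[name].float()
            for w, n in zip(client_weights, client_sizes)
        )
    return global_weights
```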
S33: S31-S32 are repeated, updating the global feature extractor weights over multiple rounds until a preset number of rounds is reached; the final global feature extractor weights are transmitted to the local feature extractors of all clients, so that every client obtains a trained local feature extractor.
In this embodiment, after all three clients finish feature extractor training and upload their model weights, the server aggregates them into new model weights and sends these back to the clients. The number of training rounds is set to 40. A sketch of this loop follows.
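The sketch below reuses `aggregate` from above; `server_init_weights`, `clients`, `train_one_round`, `dataset_size`, and `load_feature_extractor` are illustrative names (assumptions), not APIs defined by the patent:

```python
# Sketch of the 40-round federated loop of S31-S33.
global_weights = server_init_weights()                     # S1
for r in range(40):                                        # preset rounds
    # S31: each client trains locally and returns its updated weights
    local = [c.train_one_round(global_weights) for c in clients]
    # S32: server aggregates with the weighted average defined above
    global_weights = aggregate(local, [c.dataset_size for c in clients])
# S33: final global weights are sent back to every client
for c in clients:
    c.load_feature_extractor(global_weights)
```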
S4: under a supervised learning framework, each client trains a classifier with the labeled dataset to obtain a client classifier; within each client, the current feature extractor is connected with the client classifier to form a fault diagnosis model;
in S4, each client performs the following steps:
S41: input the labeled dataset into the local feature extractor with the updated weights to obtain a feature-matrix dataset;
S42: take the feature-matrix dataset as the input of a support vector machine and train it to obtain the client classifier.
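A sketch of S41-S42 with scikit-learn follows, assuming the (batch, T, 64) feature matrices are average-pooled over time before the SVM (the pooling step is an assumption):

```python
import torch
from sklearn.svm import SVC

def train_classifier(feature_extractor, X_labeled, y_labeled):
    """S41-S42: pass the labeled set through the frozen feature
    extractor, then fit a support vector machine on the features."""
    feature_extractor.eval()
    with torch.no_grad():
        x = torch.as_tensor(X_labeled, dtype=torch.float32).unsqueeze(1)
        z = feature_extractor(x)        # (n, T, 64) feature matrices
    feats = z.mean(dim=1).numpy()       # pool over time steps (assumption)
    clf = SVC()
    clf.fit(feats, y_labeled)
    return clf
```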
S5: the sensor data of the equipment to be diagnosed is preprocessed and then fed into the fault diagnosis model of the corresponding client to obtain the equipment diagnosis result.
In this embodiment, the test set obtained in step S24 serves as the preprocessed data to be diagnosed. The fault diagnosis model classifies the test-set data to obtain the equipment diagnosis results.
The equipment diagnosis results are compared with the actual results, and accuracy (Acc) and the macro-averaged F1 score (MF1) are selected as evaluation indices to measure model performance. The index results of this example are shown in Table 1.
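The two indices can be computed with scikit-learn, assuming `y_true` and `y_pred` hold the recorded states and the model's diagnoses for one client's test set:

```python
from sklearn.metrics import accuracy_score, f1_score

# Acc and macro-averaged F1 (MF1) for one client's test set
acc = accuracy_score(y_true, y_pred)
mf1 = f1_score(y_true, y_pred, average="macro")
print(f"Acc = {acc:.4f}, MF1 = {mf1:.4f}")
```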
Table 1: model performance evaluation
The evaluation results show that the method diagnoses the equipment successfully and with high accuracy, demonstrating that it is feasible and effective. FIG. 5, FIG. 6, and FIG. 7 show the confusion matrices of the three clients' diagnosis results. Despite the large differences in the clients' data distributions, the method of the invention achieves extremely high diagnostic accuracy.
The above embodiment applies the invention to the Case Western Reserve University bearing fault dataset, but the fault diagnosis method of the invention is not limited to bearings; any similar scheme that collects equipment operating data through sensors and performs equipment fault diagnosis according to the principles and ideas of the invention falls within the protection scope of the invention.
Claims (10)
1. An equipment fault diagnosis method based on federated self-supervised learning, characterized by comprising the following steps:
S1: a server initializes the weights of a feature extractor and transmits them to each client; each client uses them as the initial weights of its local feature extractor;
S2: each client uses a sensor to acquire the signals generated by its local equipment during operation and records them as local vibration data; the local vibration data is then preprocessed to obtain an unlabeled dataset and a labeled dataset;
S3: under a federated self-supervised learning framework, each client trains its local feature extractor with the unlabeled dataset, obtaining a trained local feature extractor;
S4: under a supervised learning framework, each client trains a classifier with the labeled dataset to obtain a client classifier; within each client, the current feature extractor is connected with the client classifier to form a fault diagnosis model;
S5: the sensor data of the equipment to be diagnosed is preprocessed and then fed into the fault diagnosis model of the corresponding client to obtain the equipment diagnosis result.
2. The equipment fault diagnosis method based on federated self-supervised learning as recited in claim 1, wherein in S1, the feature extractor adopts a convolutional neural network with residual connections.
3. The equipment fault diagnosis method based on federated self-supervised learning as recited in claim 1, wherein in S2, each client performs the following steps:
S21: collect the signals generated by the local equipment during operation with an accelerometer and record them as local vibration data;
S22: randomly select a preset proportion of the local vibration data, label the selected data according to the real state of the equipment to obtain an initial labeled dataset, and record the unselected local vibration data as the initial unlabeled dataset;
S23: divide all signals of the initial labeled dataset and the initial unlabeled dataset into multiple segments with a sliding window, obtaining a segmented labeled dataset and a segmented unlabeled dataset;
S24: numerically scale the segmented labeled and unlabeled datasets with the max-min method to obtain the final labeled and unlabeled datasets.
4. The equipment fault diagnosis method based on federated self-supervised learning as recited in claim 1, wherein step S3 is specifically:
S31: in each training round, each client trains its local feature extractor with the unlabeled dataset under the self-supervised learning framework, obtains the local feature extractor weights after the current round of training, and uploads them to the server;
S32: the server aggregates the local feature extractor weights of all clients to obtain the global feature extractor weights and transmits them to the local feature extractors of all clients;
S33: S31-S32 are repeated, updating the global feature extractor weights over multiple rounds until a preset number of rounds is reached; the final global feature extractor weights are transmitted to the local feature extractors of all clients, so that every client obtains a trained local feature extractor.
5. The equipment fault diagnosis method based on federated self-supervised learning as recited in claim 4, wherein in S31, each client performs the following steps:
S311: apply two different data augmentation methods to each piece of unlabeled data in the unlabeled dataset, obtaining a corresponding pair of augmented samples;
S312: under the self-supervised learning framework, train the local feature extractor for one round on the augmented samples corresponding to the unlabeled dataset, obtain the local feature extractor weights after the current round of training, and upload them to the server.
6. The equipment fault diagnosis method based on federated self-supervised learning as recited in claim 5, wherein in S311, the first augmented sample $\tilde{x}^{(1)}$ is obtained by first adding Gaussian noise to each piece of unlabeled data and then scaling the noisy data, and the second augmented sample $\tilde{x}^{(2)}$ is obtained by smoothly warping the intervals between the time steps of each piece of unlabeled data and then applying noise.
7. The equipment fault diagnosis method based on federated self-supervised learning as recited in claim 4, wherein during training the local feature extractor outputs a first feature matrix $z^{(1)}$ and a second feature matrix $z^{(2)}$; the feature extractor weights are optimized using a gradient descent algorithm; the loss function comprises a first loss function and a second loss function, the first loss function combining a context contrastive term and a temporal contrastive term:

$$\mathrm{loss1} = \alpha\, l_c + \beta\, l_t$$

wherein N is the batch size; α and β are respectively the first and second hyperparameters; $\mathbb{1}[\cdot]$ is the indicator function; i and j respectively denote the first and second indices of samples in the batch; $l_c$ is the context contrastive loss function and $l_t$ the temporal contrastive loss function; $\hat{z}^{(1)}$ and $\hat{z}^{(2)}$ respectively denote the first and second cropped feature matrices; s1 and s2 respectively denote the first and second starting positions, and e1 and e2 the first and second ending positions; T denotes the length of the time dimension of the feature matrices; $\hat{z}^{(1)}_{i,t}$ and $\hat{z}^{(2)}_{i,t}$ respectively denote the features of sample i at time step t in the first and second cropped feature matrices; $\hat{z}^{(1)}_i$ and $\hat{z}^{(2)}_i$ denote sample i, and $\hat{z}^{(2)}_j$ sample j, of the cropped feature matrices;

the second loss function loss2 is computed over the following quantities: $z^{(1)}_{k,r}$ and $z^{(2)}_{k,r}$ denote the first and second feature matrices extracted by the local feature extractor of the k-th client in round r, and $z^{G,(1)}_{k,r}$ and $z^{G,(2)}_{k,r}$ denote the features extracted by each client using the global feature extractor weights received from the server in the r-th training round.
8. The equipment fault diagnosis method based on federated self-supervised learning as recited in claim 4, wherein in S32, the server aggregates the local feature extractor weights of all clients using a weighted average:

$$\theta_G = \sum_{k} \frac{|D_k|}{|D|}\, \theta_k$$

where k is the client index, |D| and |D_k| denote the total data volume across clients and the data volume of the k-th client respectively, and $\theta_G$ and $\theta_k$ denote the global feature extractor weights and the local feature extractor weights of the k-th client.
9. The equipment fault diagnosis method based on federated self-supervised learning as recited in claim 1, wherein in S4, each client performs the following steps:
S41: input the labeled dataset into the local feature extractor with the updated weights to obtain a feature-matrix dataset;
S42: take the feature-matrix dataset as the input of a support vector machine and train it to obtain the client classifier.
10. The equipment fault diagnosis method based on federated self-supervised learning as recited in claim 1, wherein in S5, the sensor data of the equipment to be diagnosed is segmented with a sliding window, normalized with the max-min method, and then input into the fault diagnosis model of the corresponding client.
Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
---|---|---|---
CN202310893683.7A | 2023-07-20 | 2023-07-20 | Equipment fault diagnosis method based on federated self-supervised learning
Publications (1)

Publication Number | Publication Date
---|---
CN116910652A | 2023-10-20

Family ID: 88359876

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
---|---|---|---
CN202310893683.7A (pending) | Equipment fault diagnosis method based on federated self-supervised learning | 2023-07-20 | 2023-07-20

Country Status (1)

Country | Link
---|---
CN (1) | CN116910652A (en)
Application Events

- 2023-07-20: CN application CN202310893683.7A filed; publication CN116910652A, status active (Pending)

Cited By (2)

Publication number | Priority date | Publication date | Assignee | Title
---|---|---|---|---
CN117992873A | 2024-03-20 | 2024-05-07 | 合肥工业大学 | Transformer fault classification method and model training method based on heterogeneous federated learning
CN117992873B | 2024-03-20 | 2024-06-11 | 合肥工业大学 | Transformer fault classification method and model training method based on heterogeneous federated learning
Legal Events

Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |