CN115952442B - Global robust weighting-based federal domain generalized fault diagnosis method and system - Google Patents
Global robust weighting-based federated domain generalization fault diagnosis method and system
- Publication number: CN115952442B (application CN202310218371.6A)
- Authority: CN (China)
- Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis)
Abstract
The invention belongs to the field of fault diagnosis and provides a federated domain generalization fault diagnosis method and system based on global robust weighting. Each source-domain client trains the received global model with its local source-domain training dataset and updates the parameters to form a new local model; the source-domain client sends the updated local model, the extracted network features, and the labels to a central server; based on the extracted features, the central server uses the classification results of different classifiers on different features as performance metrics and computes weights from the classification losses to perform model aggregation; the central server then sends the aggregated global model to the target-domain client for fault diagnosis. When the global robust weighting strategy aggregates the local models, the classification network of each local model classifies the features extracted by the feature extraction networks of the other local models, so the weight of each local model during aggregation is directly tied to its classification results.
Description
Technical Field
The invention belongs to the technical field of fault diagnosis, and particularly relates to a federated domain generalization fault diagnosis method and system based on global robust weighting.
Background
The statements in this section merely provide background information related to the present disclosure and may not necessarily constitute prior art.
With the growing volume of industrial data, data-driven fault diagnosis methods have developed rapidly. However, data collected under different working conditions and from different devices may follow different distributions, so merely accumulating massive data does not guarantee an excellent, highly robust model. This is precisely the problem that transfer learning aims to solve.
Transfer learning can address the feature-space discrepancy between training data and test data. Domain adaptation and domain generalization are two commonly used families of transfer learning methods. In domain adaptation, the central idea for resolving domain shift is to align the feature spaces of the source- and target-domain data or to train on them adversarially; this means that some target-domain data must still be available during training. By contrast, domain generalization requires no information about the target domain at all, so a model trained this way can be used even when the target-domain data are invisible. Domain generalization aims to comprehensively exploit the rich information shared by multiple source domains to train a model that generalizes well to unknown test data, which makes it better suited than domain adaptation to fault diagnosis under practical conditions. In recent years, intelligent fault diagnosis methods based on domain generalization have been widely studied. The prior art includes a single-domain generalization network for mechanical fault diagnosis guided by adversarial mutual information, whose feasibility was verified through fault diagnosis experiments, as well as a contrastive domain generalization method that improves the classification accuracy of the trained model by maximizing same-domain information while minimizing cross-domain information.
Another prior scheme combines prior diagnostic knowledge with a deep domain generalization network to learn discriminative, domain-invariant fault features from the source domains, and generalizes the learned knowledge to identify unseen target samples.
Industrial big data provides abundant training samples for data-driven fault diagnosis. However, as concern for user privacy and data security keeps growing, it is no longer feasible to pool the industrial data of different enterprises to train deep learning fault diagnosis models. At present, most domain generalization methods directly aggregate multiple source-domain datasets for training, which carries a risk of data leakage. Federated Learning (FL) instead uses the local datasets scattered across the clients and fuses their feature information through privacy-preserving techniques, completing global model training on a central server in a distributed manner; the local data never leave the clients, so the privacy of the data and of the users is protected to the greatest extent. The prior art includes a federated learning asynchronous update method that can identify the network parameters of clients participating at different times and applies them to the field of fault diagnosis, as well as a discrepancy-based weighted federated averaging scheme that derives a weighting strategy from the distances between the different source domains and the target domain, validated by fault diagnosis experiments.
However, existing work focuses only on improving model performance at the internal clients and neglects the model's ability to generalize to unseen domains outside the federation. This is a key problem hindering the wide deployment of FL models in practical applications.
Disclosure of Invention
To solve the above problems, the invention provides a federated domain generalization fault diagnosis method and system based on global robust weighting. Throughout the process, only the multiple information-rich source-domain datasets are exploited to the greatest extent, and no additional operation is performed on the target-domain dataset. The invention proposes a global robust weighting strategy: the features extracted by the feature extraction networks serve as the information transmission medium, the classification results of different classifiers on different features serve as performance metrics, and the weights are computed from the classification losses. When the central server aggregates the client models, well-performing models receive higher weights while poorly performing models are restrained, which improves the generalization ability and classification accuracy of the aggregated model. Meanwhile, the Maximum Mean Discrepancy (MMD) is introduced as a loss term to reduce the deviation between the source-domain data, further improving model performance.
According to some embodiments, the first aspect of the invention provides a federated domain generalization fault diagnosis method based on global robust weighting, which adopts the following technical scheme.
The federated domain generalization fault diagnosis method based on global robust weighting comprises the following steps:
the central server initializes the global model and sends it to all source-domain clients;
each source-domain client trains the received global model with its local source-domain training dataset and updates the parameters to form a new local model;
each source-domain client sends the updated local model, the extracted network features, and the labels to the central server;
based on the extracted network features, the central server uses the classification results of different classifiers on different features as performance metrics and performs model aggregation with weights computed from the classification losses;
the central server sends the aggregated global model to the target-domain client for fault diagnosis.
According to some embodiments, a second aspect of the present invention provides a federated domain generalization fault diagnosis system based on global robust weighting, which adopts the following technical scheme.
The federated domain generalization fault diagnosis system based on global robust weighting comprises a central server, a plurality of source-domain clients, and a target-domain client, wherein:
the central server initializes a global model and sends it to all source-domain clients;
each source-domain client trains the received global model with its local source-domain training dataset and updates the parameters to form a new local model;
each source-domain client sends the updated local model, the extracted network features, and the labels to the central server;
based on the extracted network features, the central server uses the classification results of different classifiers on different features as performance metrics and performs model aggregation with weights computed from the classification losses;
the central server sends the aggregated global model to the target-domain client for fault diagnosis.
According to some embodiments, a third aspect of the present invention provides a computer-readable storage medium.
A computer-readable storage medium has a computer program stored thereon which, when executed by a processor, performs the steps of the federated domain generalization fault diagnosis method based on global robust weighting described in the first aspect above.
According to some embodiments, a fourth aspect of the invention provides a computer device.
A computer device comprises a memory, a processor, and a computer program stored in the memory and executable on the processor; when executing the program, the processor implements the steps of the federated domain generalization fault diagnosis method based on global robust weighting described in the first aspect above.
Compared with the prior art, the invention has the beneficial effects that:
aiming at the problem of data leakage in the existing cross-domain fault diagnosis method, the invention provides a global robust weighted federal domain generalization intelligent fault diagnosis method suitable for distributed training under different working conditions, and under the conditions that source domain data cannot be communicated and target domain data and labels are invisible, the rich information of a plurality of source domains is fully utilized, a network result is extracted by using characteristics as an information transmission medium, and the classification result of a classifier on the characteristics of different source domains is used as a performance measure. And when the models are aggregated, a global robust weighting strategy is executed to improve the generalization capability of the global model, and meanwhile, the Maximum Mean Difference (MMD) is introduced to limit the distance between source domain features, so that the classification performance of the local model and the global model is further improved.
The invention provides a model global robust weighting strategy implemented at a central server, which is used for improving global model generalization capability. Because the final weight is directly related to the final classification result of each local model, the model with high classification accuracy has a larger weight value, and the global model which is effective for all clients can be obtained more quickly. In addition, the method provided by the invention designs a plurality of tasks on the Paderbern data set and obtains excellent results, further verifies that the method can train a fault diagnosis model with excellent generalization capability, and proves the effectiveness of the proposed global robust weighting strategy.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the invention.
FIG. 1 is a flow chart of a method for diagnosing federal domain generalization faults based on global robustness weighting according to an embodiment of the present invention;
FIG. 2 is a flow chart of a global robust weighting strategy training process according to an embodiment of the present invention;
fig. 3 is a diagram illustrating a network model structure in a source domain client according to an embodiment of the present invention.
Detailed Description
The invention will be further described with reference to the drawings and examples.
It should be noted that the following detailed description is illustrative and is intended to provide further explanation of the invention. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of exemplary embodiments according to the present invention. As used herein, the singular is also intended to include the plural unless the context clearly indicates otherwise, and furthermore, it is to be understood that the terms "comprises" and/or "comprising" when used in this specification are taken to specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof.
Embodiments of the invention and features of the embodiments may be combined with each other without conflict.
Example 1
As shown in fig. 1, this embodiment provides a federated domain generalization fault diagnosis method based on global robust weighting. The embodiment is illustrated by applying the method to a server; it can be understood that the method may also be applied to a terminal, or to a system comprising a terminal and a server and implemented through their interaction. The server may be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDN, big data, and artificial intelligence platforms. The terminal may be, but is not limited to, a smartphone, tablet computer, notebook computer, desktop computer, smart speaker, or smart watch. The terminal and the server may be connected directly or indirectly through wired or wireless communication, which is not limited herein. In this embodiment, the method includes the following steps:
the central server initializes the global model and sends the global model to all source domain clients;
each source domain client trains the received global model by utilizing a local source domain training data set, and updates parameters to form a new local model;
the source domain client sends the updated local model, the extracted network characteristics and the labels to a central server;
the central server uses classification results of different classifiers on different characteristics as performance metrics based on the extracted network characteristics, and performs model aggregation by using classification loss calculation weights;
and the central server sends the aggregated global model to a target domain client for fault diagnosis.
Each source-domain client trains the received global model with its local source-domain training dataset and updates the parameters to form a new local model; specifically:
Step 1: model download. Assume there are N source-domain datasets D_1, ..., D_N; the source-domain clients holding these datasets are denoted C_1, ..., C_N. At the start, each client downloads the global model and uses it as its local model.
Step 2: forward propagation. Each client performs forward propagation, extracts the features F_k of its source-domain dataset, and computes L_cls^k, where L_cls^k denotes the classification loss of local client model k.
Step 3: compute L_mmd. Because each source-domain dataset is visible only to its local model, computing L_mmd requires the features extracted from every source-domain dataset to be passed to the central server. Since the domain generalization problem usually involves more than two source domains, L_mmd is defined as

    L_mmd = (2 / (N(N-1))) * Σ_{s=1}^{N-1} Σ_{t=s+1}^{N} MMD(F_s^T, F_t^T)

where D_s and D_t denote the source-domain datasets on the s-th and t-th clients, F_s^T = {f_1, ..., f_a} and F_t^T = {f_1, ..., f_b} are the features extracted from them by the feature extraction networks during the T-th epoch, N is the total number of source-domain clients, and a and b are the numbers of features in F_s^T and F_t^T.
Step 4: back propagation. L_mmd is downloaded by each client, which computes L_k for the subsequent local model back propagation:

    L_k^T = L_cls^{k,T} + α · L_mmd^T

where α is the trade-off coefficient of L_mmd, L_k^T is the local model loss value of client k, k is the client index, T denotes the T-th epoch, and the L_mmd term constrains the distance between the multiple source domains.
After this computation, back propagation is performed on the local source-domain client model, yielding the parameter-updated local client model θ_k^T.
Based on the extracted network features, the central server uses the classification results of different classifiers on different features as performance metrics and performs model aggregation with weights computed from the classification losses; specifically:
the parameter-updated models, the features, and the corresponding labels are uploaded to the central server to compute the global robust weight of each model;
after the global robust weight of each client is obtained, model aggregation is performed to obtain the aggregated global model.
The global robust weight is computed as follows. First, the cross-client classification loss of each local model is accumulated:

    e_i = Σ_{l=1, l≠i}^{N} L_cls(C_i(F_l), Y_l)

where e_i is the classification loss of the i-th client's local model on the features extracted by the other local clients, Y_l are the true labels of the features F_l, C_p(F_l) is the classification result of the p-th model's classifier on the features F_l, and N is the total number of source-domain clients; each e_i thus involves every other client and measures the performance of model i. The global robust weight w_i of each model is then computed from the losses {e_1, ..., e_N} so that a model with a smaller loss (higher classification accuracy) receives a larger weight, and the weights of all clients sum to 1.
After the global robust weight of each client is obtained, model aggregation yields the aggregated global model:

    θ_global^{T+1} = Σ_{i=1}^{N} w_i · θ_i^T

where w_i is the global robust weight of the i-th client model and θ_i^T is the parameter-updated local client model.
The source-domain client comprises a feature extraction network and a classification network.
The feature extraction network comprises, connected in sequence: a first convolution layer, a first batch normalization layer, a first max-pooling layer, a second convolution layer, a second batch normalization layer, a second max-pooling layer, a third convolution layer, a third batch normalization layer, and a third max-pooling layer.
The classification network comprises, connected in sequence: a flatten layer, a dropout layer, a fully connected layer, a normalization layer, and a fully connected layer.
In a specific embodiment, the method described in this embodiment includes:
A. Federated domain generalization problem definition
Federated Domain Generalization (FDG) is an organic combination of federated learning and domain generalization. Its problem definition therefore shares the characteristics of both, as follows:
Source-domain and target-domain clients: in FDG, a client holding a source-domain dataset is called a source-domain client, and a client holding the target-domain dataset is called a target-domain client.
Domain distribution differences: the data distributions differ both between different source-domain datasets and between the source-domain and target-domain datasets, and the target-domain data and labels are completely invisible throughout training.
Shared label space: it is assumed that the set of possible fault types is the same across all clients, including source-domain and target-domain clients, and that the fault labels are likewise shared among the source-domain clients.
Data and user privacy protection: each client's data are visible only locally; throughout training, only features and model parameters may be exchanged between the clients and the central server, never the raw data.
B. Global robust weighted federated domain generalization
In general, Federated Domain Generalization with Global Robust Weighting (FDG-GRW) introduces a weighting idea into the central server's aggregation of the client models: under the constraints that raw data cannot be exchanged and the target-domain data and labels are invisible, source-domain client models that perform well receive larger weights while worse-performing ones receive smaller weights, maximizing the generalization ability of the aggregated global model. Meanwhile, under the same constraints, the Maximum Mean Discrepancy (MMD) is introduced into the training process to further reduce the distribution differences between the source-domain data. The specific training flow is shown in fig. 2; each step is detailed below.
Step 1: model download. Assume there are N source-domain datasets D_1, ..., D_N; the source-domain clients holding these datasets are denoted C_1, ..., C_N. Each client initially downloads the global model and uses it as its local model.
Step 2: forward propagation. Each client performs forward propagation, extracts the features F_k of its source-domain dataset, and computes the classification loss L_cls^{k,T} of local client model k:

    L_cls^{k,T} = -(1/n) Σ_{j=1}^{n} y_j · log(ŷ_j)

where y_j is the true label of data sample j, ŷ_j is the model's prediction, n is the number of features, T denotes the T-th epoch, and k denotes the k-th client.
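The cross-entropy loss above can be sketched numerically as follows; the function name and the one-hot label encoding are illustrative assumptions, not taken from the patent:

```python
import math

def classification_loss(labels_onehot, predictions, eps=1e-12):
    """Average categorical cross-entropy: -(1/n) * sum_j y_j · log(y_hat_j)."""
    n = len(labels_onehot)
    total = 0.0
    for y, y_hat in zip(labels_onehot, predictions):
        # Sum over classes; eps guards against log(0).
        total += -sum(yc * math.log(max(pc, eps)) for yc, pc in zip(y, y_hat))
    return total / n

# Confident, correct predictions yield a small loss; wrong ones a large loss.
y_true = [[1, 0, 0], [0, 1, 0]]
y_pred = [[0.9, 0.05, 0.05], [0.1, 0.8, 0.1]]
loss = classification_loss(y_true, y_pred)
```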
Step 3: compute L_mmd. To find a feature transformation that minimizes the distribution differences of all source-domain data in the transformed space, and thereby train a model that generalizes better, the Maximum Mean Discrepancy (MMD) is introduced, defined as

    MMD(X, Y) = || (1/m) Σ_{h_1=1}^{m} φ(x_{h_1}) - (1/n) Σ_{h_2=1}^{n} φ(y_{h_2}) ||²_H

where φ(·) is a kernel mapping that maps the input into a reproducing kernel Hilbert space H, x_{h_1} and y_{h_2} are the h_1-th and h_2-th elements of the datasets X and Y, and m and n are the total numbers of elements in X and Y.
Because each source-domain dataset is visible only to its local model, computing L_mmd requires the features extracted from every source-domain dataset to be passed to the central server. Since the domain generalization problem usually involves more than two source domains, L_mmd is defined as

    L_mmd = (2 / (N(N-1))) * Σ_{s=1}^{N-1} Σ_{t=s+1}^{N} MMD(F_s^T, F_t^T)

where D_s and D_t denote the source-domain datasets on the s-th and t-th clients, F_s^T = {f_1, ..., f_a} and F_t^T = {f_1, ..., f_b} are the features extracted from them by the feature extraction networks during the T-th epoch, N is the total number of source-domain clients, and a and b are the numbers of features in F_s^T and F_t^T.
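A minimal numerical sketch of MMD and of the pairwise average over source domains: expanding the squared RKHS norm into kernel means (the "kernel trick") and choosing an RBF kernel with bandwidth gamma are standard choices assumed here for illustration; the patent does not fix the kernel:

```python
import math
from itertools import combinations

def rbf(x, y, gamma=1.0):
    """RBF kernel k(x, y) = exp(-gamma * ||x - y||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def mmd2(X, Y, gamma=1.0):
    """Squared MMD via the kernel trick:
    mean k(x, x') + mean k(y, y') - 2 * mean k(x, y)."""
    kxx = sum(rbf(a, b, gamma) for a in X for b in X) / (len(X) ** 2)
    kyy = sum(rbf(a, b, gamma) for a in Y for b in Y) / (len(Y) ** 2)
    kxy = sum(rbf(a, b, gamma) for a in X for b in Y) / (len(X) * len(Y))
    return kxx + kyy - 2 * kxy

def l_mmd(feature_sets, gamma=1.0):
    """Average squared MMD over all client pairs: 2/(N(N-1)) * sum_{s<t} MMD²."""
    pairs = list(combinations(feature_sets, 2))
    return sum(mmd2(Fs, Ft, gamma) for Fs, Ft in pairs) / len(pairs)

# Identical feature sets give zero MMD; shifted ones give a positive value.
F1 = [[0.0, 0.0], [1.0, 1.0]]
F2 = [[0.1, 0.0], [1.0, 0.9]]
F3 = [[5.0, 5.0], [6.0, 6.0]]
```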
Step 4: back propagation. L_mmd is downloaded by each client, which computes L_k for the subsequent local model back propagation:

    L_k^T = L_cls^{k,T} + α · L_mmd^T

where α is the trade-off coefficient of L_mmd, L_k^T is the local model loss value of client k, k is the client index, T denotes the T-th epoch, and the L_mmd term constrains the distance between the multiple source domains.
After this computation, back propagation is performed on the local source-domain client model, yielding the parameter-updated local client model θ_k^T.
Step 5: compute the weights. The parameter-updated models θ_k^T, the features F_k, and their corresponding labels Y_k are uploaded to the central server to compute the global robust weight of each model. The specific strategy is as follows: first, the classification loss of each local model on the features of the other clients is accumulated as a reference,

    e_i = Σ_{l=1, l≠i}^{N} L_cls(C_i(F_l), Y_l)

where e_i is the classification loss of the i-th client's local model on the features extracted by the other local clients, Y_l are the true labels of the features F_l, C_p(F_l) is the classification result of the p-th model's classifier on the features F_l, and N is the total number of source-domain clients; each e_i thus involves every other client and measures the performance of model i. The global robust weight w_i is then computed from the losses {e_1, ..., e_N} so that a model with a smaller loss (higher classification accuracy) receives a larger weight.
It is easy to verify that the weights of all clients sum to 1.
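The text specifies only that smaller cross-client losses yield larger weights and that the weights sum to 1; one simple rule with both properties is inverse-loss normalization, used here purely as an illustrative stand-in for the patent's weighting formula:

```python
def global_robust_weights(losses):
    """Map per-client classification losses e_i to weights
    w_i = (1 / e_i) / sum_j (1 / e_j).
    A smaller loss (better model) receives a larger weight; weights sum to 1."""
    inv = [1.0 / e for e in losses]
    total = sum(inv)
    return [v / total for v in inv]

# Client 0 classifies the other clients' features best (lowest loss),
# so it dominates the aggregation.
weights = global_robust_weights([0.2, 0.5, 1.0])
```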
Step 6: model aggregation. After the weight of each client is obtained, the following global robust weighting strategy is executed:

    θ_global^{T+1} = Σ_{i=1}^{N} w_i · θ_i^T

where w_i is the global robust weight of the i-th client model.
Step 7: the aggregated global model θ_global^{T+1} is sent to all clients in preparation for the next round of training.
C. Network basic structure
In this work, the model structures of the clients and the central server are identical; model transmission and model aggregation therefore require only the transfer and computation of network parameters, with no change to the network structure. The specific network architecture and layer parameters are shown in fig. 3.
D. General procedure
The overall flow of the proposed FDG-GRW model is as follows:
First, in the training stage, the central server initializes the global model and sends it to all source-domain clients. Each client then trains the received global model with its own local training dataset and updates the parameters to form a new local model. Next, the clients holding source-domain datasets send the updated local models, the extracted features, and the labels to the central server, which computes the weights and performs model aggregation. Finally, when the number of training rounds reaches the set value, the training plan ends. In the test stage, the server sends the global model to the target-domain client for fault diagnosis; the specific steps are shown in fig. 1.
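The train/test flow above can be compressed into a toy skeleton. Every function below is a placeholder stand-in (scalar "models", one gradient step per round, uniform weights instead of the robust weights) meant only to show the round structure, not the actual networks:

```python
def local_train(global_param, target, lr=0.5):
    """Stand-in for client-side training: one gradient step toward the
    client's local optimum (a scalar here, a network in the real method)."""
    return global_param + lr * (target - global_param)

def aggregate(params, weights):
    """Server-side weighted aggregation of the local parameters."""
    return sum(w * p for w, p in zip(weights, params))

def federated_round(global_param, client_targets, weights):
    """One communication round: broadcast, local training, aggregation."""
    local_params = [local_train(global_param, t) for t in client_targets]
    return aggregate(local_params, weights)

# Three source-domain clients; uniform weights stand in for the robust weights.
g = 0.0
for _ in range(20):  # train until the round budget is reached
    g = federated_round(g, [1.0, 2.0, 3.0], [1 / 3, 1 / 3, 1 / 3])
# The global model settles near the weighted centre of the client optima (2.0).
```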
Aiming at the data-leakage problem of existing cross-domain fault diagnosis methods, this embodiment provides a globally robust weighted federated domain generalization intelligent fault diagnosis method suitable for distributed training under different working conditions. Under the constraints that source-domain data cannot be exchanged and target-domain data and labels are invisible, the rich information of multiple source domains is fully exploited: the features produced by the feature extraction network serve as the information transmission medium, and the classification results of the classifiers on the features of different source domains serve as performance metrics. When the models are aggregated, the global robust weighting strategy is executed to improve the generalization ability of the global model; meanwhile, the Maximum Mean Discrepancy (MMD) is introduced to constrain the distance between source-domain features, further improving the classification performance of both the local models and the global model.
To verify the effectiveness of the proposed method, experiments were carried out on the Paderborn University bearing fault dataset. The experimental results show that the method has excellent generalization performance and confirm the effectiveness of the global robust weighting strategy. The experimental procedure is as follows:
A. Comparison methods
1) Differentially weighted federated domain adaptation (FTL): FTL is a federated domain-adaptive weighting method that measures the distance between the source and target domains with the Maximum Mean Discrepancy (MMD) and designs corresponding weights for the different source domains. For FTL, all source-domain and target-domain data participate in training.
2) Multi-source unsupervised domain adaptation (MUDA): MUDA constructs a domain discriminator for each source domain, learns domain-invariant features through domain-adversarial training, and performs the fault diagnosis task on that basis.
3) Federated averaging (FedAvg): FedAvg, a distributed framework, is the starting point of federated learning: it allows multiple clients holding source-domain datasets to train machine learning models without uploading any private data to the central server. In this method, each local client trains a local model, and the central server obtains the global model from the local models through an averaged weighted aggregation. After multiple rounds of training, FedAvg obtains a globally optimized model. In this context, the network model and hyperparameters of FedAvg are identical to those of the method presented herein, except for the weighting strategy.
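FedAvg's aggregation step is a per-parameter (weighted) average of the clients' models, which can be sketched directly on parameter dictionaries; the parameter name below is illustrative:

```python
import numpy as np

def fedavg(state_dicts, weights=None):
    """Plain FedAvg aggregation: a (weighted) per-parameter average of
    the clients' model parameters."""
    n = len(state_dicts)
    weights = np.full(n, 1.0 / n) if weights is None else np.asarray(weights, float)
    keys = state_dicts[0].keys()
    return {k: sum(w * sd[k] for w, sd in zip(weights, state_dicts)) for k in keys}

# Three toy local models whose single parameter tensor holds 0, 1, 2.
locals_ = [{"conv1.w": np.full((2, 2), float(i))} for i in range(3)]
print(fedavg(locals_)["conv1.w"])  # every entry is the mean, 1.0
```

Passing non-uniform `weights` turns the same routine into the weighted aggregation that the proposed global robust weighting strategy performs.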
B. Case 1: bearing failure dataset experiments at Paderbortan university
1) Paderborn dataset: The first dataset used in this section of the experiment is the Paderborn dataset [9]. In this dataset, real bearing damage was produced by accelerated lifetime tests on a laboratory test bench. The bearings used in the experiment are identified by bearing code; see Table 1 for details. The dataset contains bearings in three different states: inner-race fault (IR), outer-race fault (OR), and healthy (H). The datasets of the different clients come from bearings operating at different rotational speeds, radial forces, and load torques; the working conditions of the bearings used in this section are shown in Table 2. It is assumed herein that datasets A, B, C, and D are distributed among four clients and that a model is trained collaboratively with the proposed global robust weighting method using two or three source-domain datasets, without any data aggregation. The trained model is then tested on the target client.
Table 1 Paderborn dataset experimental bearing code number
Table 2 Paderborn dataset under different working conditions
2) Experimental results: The results of the proposed method and of the comparison methods on the Paderborn dataset are shown in Table 3. DWFDG achieves results comparable to, or even better than, FTL. The method completes the domain generalization task well and trains a model with good generalization capability without using the target-domain dataset or its labels at all, which means that the local models trained by this method on the clients holding source-domain datasets can adapt to other domains. The method achieves better results than FedAvg on all tasks, which further demonstrates its effectiveness.
Table 3 Paderborn dataset experimental results
Example two
The embodiment provides a federal domain generalization fault diagnosis system based on global robust weighting, which comprises:
the system comprises a central server, a plurality of source domain clients and target clients;
the central server initializes a global model and sends the global model to all source domain clients;
each source domain client trains the received global model by utilizing a local source domain training data set, and updates parameters to form a new local model;
the source domain client sends the updated local model, the extracted network characteristics and the labels to a central server;
the central server, based on the extracted network features, uses the classification results of the different classifiers on the features of the different source domains as performance metrics and calculates the aggregation weights from the classification losses to perform model aggregation;
the central server sends the aggregated global model to the target domain client for fault diagnosis.
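The server-side weighting step can be sketched as follows: given a matrix of cross-client classification losses (client i's classifier evaluated on features uploaded by client l), models that classify the other clients' features well receive larger aggregation weights. The softmax-over-negative-loss mapping below is an illustrative choice, not the patent's exact formula:

```python
import numpy as np

def robust_weights(cross_losses):
    """Turn a matrix of cross-client classification losses into
    aggregation weights. cross_losses[i, l] is client i's classifier
    loss on features extracted by client l; a lower total loss
    yields a larger weight."""
    total = cross_losses.sum(axis=1)   # per-model total cross-client loss
    z = np.exp(-total)                 # penalize high-loss models
    return z / z.sum()                 # normalize so the weights sum to 1

L = np.array([[0.2, 0.3],    # model 0 classifies both clients' features well
              [1.5, 1.8]])   # model 1 does not
w = robust_weights(L)
print(w)  # model 0 receives the larger weight
```

These weights would then multiply the local model parameters in the aggregation step, so a model that generalizes across source domains dominates the global model.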
The above modules correspond to the steps of the first embodiment; their examples and application scenarios are the same as those of the corresponding steps, but they are not limited to what is disclosed in the first embodiment. It should be noted that the above modules may be implemented as part of a system in a computer system, for example as a set of computer-executable instructions.
The foregoing embodiments are described with different emphases; for details not covered in one embodiment, reference may be made to the related description of another embodiment.
The proposed system may be implemented in other ways. The system embodiments described above are merely illustrative; for example, the division into the above modules is only a division by logical function, and other divisions are possible in practice: multiple modules may be combined or integrated into another system, and some features may be omitted or not performed.
Example III
The present embodiment provides a computer readable storage medium having stored thereon a computer program which when executed by a processor performs the steps in the federal domain generalization fault diagnosis method based on global robustness weighting as described in the above embodiment.
Example IV
The present embodiment provides a computer device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the steps in the global robustness weighted federal domain generalized fault diagnosis method according to the above embodiment when the processor executes the program.
It will be appreciated by those skilled in the art that embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of a hardware embodiment, a software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, magnetic disk storage, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
Those skilled in the art will appreciate that all or part of the flows in the methods of the above embodiments may be implemented by a computer program stored on a computer-readable storage medium which, when executed, may include the flows of the above method embodiments. The storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), a random access memory (RAM), or the like.
While the foregoing description of the embodiments of the present invention has been presented in conjunction with the drawings, it should be understood that it is not intended to limit the scope of the invention, but rather, it is intended to cover all modifications or variations within the scope of the invention as defined by the claims of the present invention.
Claims (8)
1. A federal domain generalization fault diagnosis method based on global robust weighting, characterized by comprising the following steps:
the central server initializes the global model and sends the global model to all source domain clients;
each source domain client trains the received global model by utilizing a local source domain training data set, and updates parameters to form a new local model, specifically:
step 1: model download; assume that there are N source domain datasets, denoted {D_1, ..., D_N}, and that the source clients holding these datasets are denoted {C_1, ..., C_N}; at the beginning, each client downloads the global model and takes it as its local model;
step 2: forward propagation; each client performs forward propagation, extracts the features of its source domain dataset, and calculates the classification loss $\mathcal{L}_{cls}^{k}$, wherein $\mathcal{L}_{cls}^{k}$ represents the classification loss of local client model k;
step 3: calculation of $\mathcal{L}_{MMD}$; since the source domain datasets are visible only to the local models, the features extracted from each source domain dataset need to be transferred to the central server in order to calculate $\mathcal{L}_{MMD}$; considering that the domain generalization problem often involves more than two source domains, $\mathcal{L}_{MMD}$ is defined as:

$$\mathcal{L}_{MMD}^{T}=\frac{2}{N(N-1)}\sum_{s=1}^{N}\sum_{t=s+1}^{N}\left\|\frac{1}{a}\sum_{i=1}^{a}\phi\left(f_{1}^{(i)}\right)-\frac{1}{b}\sum_{j=1}^{b}\phi\left(f_{2}^{(j)}\right)\right\|_{\mathcal{H}}^{2}$$

wherein $D_s$ and $D_t$ respectively represent the source domain datasets on the s-th and t-th clients, $f_1$ and $f_2$ respectively represent the features extracted by the feature extraction network from the source domain datasets on the s-th and t-th clients at the T-th epoch, N represents the total number of source domain clients, and a and b respectively represent the numbers of features $f_1$ and $f_2$;
step 4: back propagation; $\mathcal{L}_{MMD}$ is downloaded by each client, which then calculates the total loss $\mathcal{L}_{k}^{T}$ for the subsequent local model back propagation:

$$\mathcal{L}_{k}^{T}=\mathcal{L}_{cls}^{k}+\alpha\,\mathcal{L}_{MMD}^{T}$$

wherein $\alpha$ is a balance coefficient, $\mathcal{L}_{k}^{T}$ is the local model loss value of client k, k is the client serial number, T denotes the T-th epoch, and the function of $\mathcal{L}_{MMD}$ is to constrain the distance between the multiple source domains;
after the calculation is completed, back propagation is performed on the local source domain client model, obtaining a local client model with updated parameters;
the source domain client sends the updated local model, the extracted network characteristics and the labels to a central server;
the central server, based on the extracted network features, uses the classification results of the different classifiers on the features of the different source domains as performance metrics and calculates the aggregation weights from the classification losses to perform model aggregation;
the central server sends the aggregated global model to the target domain client for fault diagnosis.
2. The global robust weighting based federal domain generalization fault diagnosis method according to claim 1, wherein the central server uses classification results of different classifiers on different features as performance metrics based on extracted network features, and performs model aggregation by using classification loss calculation weights, specifically:
uploading the model, the characteristics and the corresponding labels after updating the parameters to a central server so as to calculate the global robust weight of each model;
and after the global robust weight of each client is obtained, model aggregation is carried out to obtain an aggregated global model.
3. The federal domain generalization fault diagnosis method based on global robustness weighting according to claim 2, wherein the global robustness weighting is computed from cross-client classification losses as follows: the classification loss of the local model of the i-th client on the features extracted by the l-th local client is

$$\mathcal{L}_{i,l}=-\frac{1}{n_{f}}\sum_{q=1}^{n_{f}}\hat{y}_{q}\log C_{i}\!\left(f_{l,q}\right)$$

wherein $n_f$ is the number of features, $\mathcal{L}_{i,l}$ is the classification loss of the local model of the i-th client on the features extracted by the l-th local client, and all $\mathcal{L}_{i,l}$ terms are associated with each client in order to weight the performance metric of each model; when there are only two clients, the losses are computed over those two clients alone; $\hat{y}_{q}$ is the true label of the feature $f_{l,q}$, $C_{i}(\cdot)$ is the classification result of the classifier of the i-th model on the feature, and N is the total number of source domain clients; the global robust weight of each client is obtained from these classification losses.
4. The global robust weighting-based federal domain generalization fault diagnosis method according to claim 2, wherein after the global robust weight $w_i$ of each client is obtained, model aggregation is performed to obtain the aggregated global model, specifically, the parameters of the aggregated global model are the weighted sum of the local model parameters:

$$\theta_{global}^{T+1}=\sum_{i=1}^{N}w_{i}\,\theta_{i}^{T}$$

wherein $\theta_{i}^{T}$ denotes the parameters of the local model of the i-th client at the T-th epoch and $w_{i}$ is its global robust weight.
5. The global robustness weighted federal domain generalization fault diagnosis method according to claim 1, wherein the source domain client comprises a feature extraction network and a classification network;
the feature extraction network comprises a first convolution layer, a first group normalization layer, a first max-pooling layer, a second convolution layer, a second group normalization layer, a second max-pooling layer, a third convolution layer, a third group normalization layer and a third max-pooling layer, connected in sequence;
the classification network comprises a flattening layer, a dropout layer, a fully connected layer, a normalization layer and a fully connected layer, connected in sequence.
6. A federal domain generalization fault diagnosis system based on global robust weighting, characterized by comprising:
the system comprises a central server, a plurality of source domain clients and target clients;
the central server initializes a global model and sends the global model to all source domain clients;
each source domain client trains the received global model by utilizing a local source domain training data set, and updates parameters to form a new local model, specifically:
step 1: model download; assume that there are N source domain datasets, denoted {D_1, ..., D_N}, and that the source clients holding these datasets are denoted {C_1, ..., C_N}; at the beginning, each client downloads the global model and takes it as its local model;
step 2: forward propagation; each client performs forward propagation, extracts the features of its source domain dataset, and calculates the classification loss $\mathcal{L}_{cls}^{k}$, wherein $\mathcal{L}_{cls}^{k}$ represents the classification loss of local client model k;
step 3: calculation of $\mathcal{L}_{MMD}$; since the source domain datasets are visible only to the local models, the features extracted from each source domain dataset need to be transferred to the central server in order to calculate $\mathcal{L}_{MMD}$; considering that the domain generalization problem often involves more than two source domains, $\mathcal{L}_{MMD}$ is defined as:

$$\mathcal{L}_{MMD}^{T}=\frac{2}{N(N-1)}\sum_{s=1}^{N}\sum_{t=s+1}^{N}\left\|\frac{1}{a}\sum_{i=1}^{a}\phi\left(f_{1}^{(i)}\right)-\frac{1}{b}\sum_{j=1}^{b}\phi\left(f_{2}^{(j)}\right)\right\|_{\mathcal{H}}^{2}$$

wherein $D_s$ and $D_t$ respectively represent the source domain datasets on the s-th and t-th clients, $f_1$ and $f_2$ respectively represent the features extracted by the feature extraction network from the source domain datasets on the s-th and t-th clients at the T-th epoch, N represents the total number of source domain clients, and a and b respectively represent the numbers of features $f_1$ and $f_2$;
step 4: back propagation; $\mathcal{L}_{MMD}$ is downloaded by each client, which then calculates the total loss $\mathcal{L}_{k}^{T}$ for the subsequent local model back propagation:

$$\mathcal{L}_{k}^{T}=\mathcal{L}_{cls}^{k}+\alpha\,\mathcal{L}_{MMD}^{T}$$

wherein $\alpha$ is a balance coefficient, $\mathcal{L}_{k}^{T}$ is the local model loss value of client k, k is the client serial number, T denotes the T-th epoch, and the function of $\mathcal{L}_{MMD}$ is to constrain the distance between the multiple source domains;
after the calculation is completed, back propagation is performed on the local source domain client model, obtaining a local client model with updated parameters;
the source domain client sends the updated local model, the extracted network characteristics and the labels to a central server;
the central server, based on the extracted network features, uses the classification results of the different classifiers on the features of the different source domains as performance metrics and calculates the aggregation weights from the classification losses to perform model aggregation;
the central server sends the aggregated global model to the target domain client for fault diagnosis.
7. A computer readable storage medium, having stored thereon a computer program, which when executed by a processor, implements the steps of the federal domain generalization fault diagnosis method based on global robustness weighting according to any of claims 1 to 5.
8. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the steps of the global robustness weighted federal domain generalization fault diagnosis method according to any of claims 1-5 when the program is executed by the processor.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310218371.6A CN115952442B (en) | 2023-03-09 | 2023-03-09 | Global robust weighting-based federal domain generalized fault diagnosis method and system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115952442A CN115952442A (en) | 2023-04-11 |
CN115952442B true CN115952442B (en) | 2023-06-13 |
Family
ID=85891172
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310218371.6A Active CN115952442B (en) | 2023-03-09 | 2023-03-09 | Global robust weighting-based federal domain generalized fault diagnosis method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115952442B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116992336B (en) * | 2023-09-04 | 2024-02-13 | 南京理工大学 | Bearing fault diagnosis method based on federal local migration learning |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114818996A (en) * | 2022-06-28 | 2022-07-29 | 山东大学 | Method and system for diagnosing mechanical fault based on federal domain generalization |
Family Cites Families (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20070028220A1 (en) * | 2004-10-15 | 2007-02-01 | Xerox Corporation | Fault detection and root cause identification in complex systems |
US11663528B2 (en) * | 2020-06-30 | 2023-05-30 | Intuit Inc. | Training an ensemble of machine learning models for classification prediction using probabilities and ensemble confidence |
CN112798280B (en) * | 2021-02-05 | 2022-01-04 | 山东大学 | Rolling bearing fault diagnosis method and system |
EP4120653A1 (en) * | 2021-07-15 | 2023-01-18 | EXFO Inc. | Communication network performance and fault analysis using learning models with model interpretation |
CN114254700A (en) * | 2021-12-06 | 2022-03-29 | 中国海洋大学 | TBM hob fault diagnosis model construction method based on federal learning |
CN114399055A (en) * | 2021-12-28 | 2022-04-26 | 重庆大学 | Domain generalization method based on federal learning |
CN115239989A (en) * | 2022-06-28 | 2022-10-25 | 浙江工业大学 | Distributed fault diagnosis method for oil immersed transformer based on data privacy protection |
CN115438714A (en) * | 2022-08-01 | 2022-12-06 | 华南理工大学 | Clustering federal learning driven mechanical fault diagnosis method, device and medium |
CN115560983A (en) * | 2022-09-30 | 2023-01-03 | 哈尔滨理工大学 | Rolling bearing fault diagnosis method and system under different working conditions based on federal feature transfer learning |
CN115328691B (en) * | 2022-10-14 | 2023-03-03 | 山东大学 | Fault diagnosis method, system, storage medium and equipment based on model difference |
CN115525038A (en) * | 2022-10-26 | 2022-12-27 | 河北工业大学 | Equipment fault diagnosis method based on federal hierarchical optimization learning |
CN115905978A (en) * | 2022-11-18 | 2023-04-04 | 安徽工业大学 | Fault diagnosis method and system based on layered federal learning |
CN115773562A (en) * | 2022-11-24 | 2023-03-10 | 杭州经纬信息技术股份有限公司 | Unified heating ventilation air-conditioning system fault detection method based on federal learning |
CN115731424B (en) * | 2022-12-03 | 2023-10-31 | 北京邮电大学 | Image classification model training method and system based on enhanced federal domain generalization |
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114818996A (en) * | 2022-06-28 | 2022-07-29 | 山东大学 | Method and system for diagnosing mechanical fault based on federal domain generalization |
Non-Patent Citations (2)
Title |
---|
Federated learning for machinery fault diagnosis with dynamic validation and self-supervision; W. Zhang et al.; Knowledge-Based Systems; pp. 1-15 *
Research on system-level fault diagnosis methods based on inertial/integrated navigation; Xu Jingshuo et al.; Ship Electronic Engineering; Vol. 41, No. 7; pp. 148-151 *
Also Published As
Publication number | Publication date |
---|---|
CN115952442A (en) | 2023-04-11 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Li et al. | A review of applications in federated learning | |
Hu et al. | MHAT: An efficient model-heterogenous aggregation training scheme for federated learning | |
Banabilah et al. | Federated learning review: Fundamentals, enabling technologies, and future applications | |
Yazdinejad et al. | Federated learning for drone authentication | |
Yang et al. | Robust federated learning with noisy labels | |
Ismail et al. | A hybrid model of self-organizing maps (SOM) and least square support vector machine (LSSVM) for time-series forecasting | |
Lee et al. | Digestive neural networks: A novel defense strategy against inference attacks in federated learning | |
Li et al. | Lotteryfl: Empower edge intelligence with personalized and communication-efficient federated learning | |
CN110347932B (en) | Cross-network user alignment method based on deep learning | |
Jin et al. | Accelerated federated learning with decoupled adaptive optimization | |
CN115952442B (en) | Global robust weighting-based federal domain generalized fault diagnosis method and system | |
CN114818996B (en) | Method and system for diagnosing mechanical fault based on federal domain generalization | |
WO2022012668A1 (en) | Training set processing method and apparatus | |
Gupta et al. | Learner’s dilemma: IoT devices training strategies in collaborative deep learning | |
CN115328691A (en) | Fault diagnosis method, system, storage medium and equipment based on model difference | |
Shen et al. | Leveraging cross-network information for graph sparsification in influence maximization | |
Ma et al. | Adaptive distillation for decentralized learning from heterogeneous clients | |
CN117999562A (en) | Method and system for quantifying client contribution in federal learning | |
CN113313266B (en) | Federal learning model training method based on two-stage clustering and storage device | |
Ghosh et al. | A tutorial on different classification techniques for remotely sensed imagery datasets | |
Pustozerova et al. | Training effective neural networks on structured data with federated learning | |
Usmanova et al. | Federated continual learning through distillation in pervasive computing | |
CN111275562A (en) | Dynamic community discovery method based on recursive convolutional neural network and self-encoder | |
Xiong et al. | Anomaly network traffic detection based on deep transfer learning | |
CN115862751A (en) | Quantum chemistry property calculation method for updating polymerization attention mechanism based on edge features |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||