CN115905978A - Fault diagnosis method and system based on layered federal learning - Google Patents
- Publication number: CN115905978A
- Application number: CN202211446890.XA
- Authority: CN (China)
- Prior art keywords: model, client, feature, feature extractor, local
- Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Landscapes
- Computer And Data Communications (AREA)
Abstract
The invention provides a fault diagnosis method and system based on layered federated learning, and relates to the field of fault detection. The client local model is layered into a feature extractor model for extracting features common to the clients' data, a feature classifier model for extracting the private features of each client's data, and a feature reconstructor model for restoring the extracted common features to the original data. During training, only the parameters of the feature extractor model are trained jointly by the clients and the server side, while the feature classifier model and the feature reconstructor model are trained locally at each client. The trained feature extractor model and the client's private feature classifier model together form the client's local prediction model. The method and system address the technical problem of low model prediction accuracy caused by heterogeneous client data.
Description
Technical Field
The invention relates to the technical field of fault detection, and in particular to a fault diagnosis method and system based on layered federated learning.
Background
In the field of equipment fault diagnosis, traditional data-driven machine learning methods can distinguish different types of equipment faults and show excellent discrimination performance. However, most machine learning methods rely on a large amount of high-quality training data, that is, a large number of equipment fault samples. In real industrial scenarios, it is difficult for a single enterprise to obtain sufficient fault data. The intuitive idea is for multiple enterprises to learn jointly. However, the operating data of enterprise equipment reflects, to some extent, the production capacity and other private information of the enterprise, and uploading all equipment operating data to a central server incurs a large communication overhead. Privacy protection and communication cost are therefore attracting more and more attention, and have become key considerations for enterprises participating in joint learning. Federated learning is a special distributed machine learning method in which the private data of each enterprise never leaves its local storage center. Only model parameters are communicated between the enterprises (clients) and the central server, and these parameters can be further protected by techniques such as compression mechanisms, secure multi-party computation and differential privacy, so that user privacy is protected to a great extent.
The federated learning framework is increasingly becoming the framework with which different enterprises learn jointly. In theory, the traditional federated learning workflow is as follows: (1) the central server shares an initial model with each client; (2) each client trains the model on its local data and sends the trained weights back to the central server; (3) the central server updates its global model with the weights from each client; (4) this process is repeated until the global model converges; (5) the trained global model is sent to each client; (6) each client uses the global model as its startup model, which can then be used to diagnose faults in incoming data.
Although the federated learning framework provides a way for different enterprises to diagnose equipment faults jointly, it faces a major problem: data heterogeneity. In centralized learning, each client uploads its local data set to the central server, and the data of all clients form a single training set. In federated learning, each client does not upload its local data set but uses it for local model training, so the local data sets of the clients are naturally heterogeneous because the clients operate in different environments. Data heterogeneity degrades the performance of the client models, yet most current methods do not take the data heterogeneity problem under the federated learning framework into account.
Neither the federated-learning-based fault diagnosis method disclosed in patent CN114662618B nor the federated-learning-based fault diagnosis method for smart meters disclosed in patent CN111537945B considers the problem of non-independent and identically distributed (non-IID) data during federated learning, and because the data of different enterprises are non-IID, the classification accuracy of the model is affected. Therefore, a federated-learning-based fault diagnosis method for heterogeneous data is needed to improve fault prediction accuracy when client data are heterogeneous.
Disclosure of Invention
The invention aims to provide a fault diagnosis method and system based on layered federated learning that can effectively solve the technical problem of low prediction accuracy in federated equipment fault diagnosis when client data are heterogeneous, provide a new approach for subsequent research on personalized federated learning and related engineering applications, and allow federated learning to be applied to more practical scenarios.
In order to achieve the above purpose, the invention provides the following technical solution: a fault diagnosis method based on layered federated learning, applied to a client, comprising the following steps:
for any client, building a local model of the client in a layered manner; the local model comprises a feature extractor model, a feature classifier model and a feature reconstructor model, the feature extractor model and the feature classifier model are combined to form a client classification model, and the feature extractor model and the feature reconstructor model are combined to form a client reconstruction model;
for any client, receiving the global feature extractor model parameters and the client update rounds broadcast by the server side;
training the corresponding client classification model and client reconstruction model according to the local data of the client, the received global feature extractor model parameters and the client update rounds;
uploading the feature extractor model parameters updated by the local model training of each client to the server side, so that the server side performs weight aggregation to obtain and broadcast the updated global feature extractor model parameters;
repeatedly executing the local training update process of the client until the feature extractor models of all clients converge or the accuracy preset by the server side for the global feature extractor model is reached;
and taking the client classification model trained by each client as that client's local prediction model to perform fault diagnosis on incoming data.
Furthermore, when the client classification model and the client reconstruction model are updated and trained locally, the client classification model is optimized with the classification loss and the client reconstruction model is optimized with the reconstruction loss, so that the feature extractor of the client is optimized twice.
Furthermore, a federated learning system is set up in which K clients exist as network nodes, and each client k has a corresponding data set $D_k = \{(x_i, y_i)\}_{i=1}^{N_k}$, where $N_k$ denotes the amount of data in the data set; the K clients jointly train a deep learning model, and the data sets of different clients are heterogeneous; that is, for any $i \neq j$, the relationship $P_i \neq P_j$ holds between the data distributions of clients i and j.
The fault classification task of the client is defined to comprise M classes; cross-entropy loss is adopted for the fault classification task, and mean square error loss is adopted for the reconstruction task;
then, for client k, the classification loss $L_{cls}^{k}$ and the reconstruction loss $L_{rec}^{k}$ are, in order:
$$L_{cls}^{k} = -\frac{1}{N_k}\sum_{i=1}^{N_k}\sum_{m=1}^{M} \mathbb{1}(y_i = m)\,\log p_m, \qquad L_{rec}^{k} = \frac{1}{N_k}\sum_{i=1}^{N_k} \lVert x_i - \hat{x}_i \rVert^2$$
where $\mathbb{1}(y_i = m)$ indicates whether the class $y_i$ is the same as class m (1 if the same, 0 if different); $p_m$ denotes the probability, given by the softmax function, that the sample is predicted as the m-th class; and $\hat{x}_i$ denotes the reconstruction of $x_i$ output by the client reconstruction model.
Further, at the t-th update round of client k, the model parameters of the feature extractor model, the feature classifier model and the feature reconstructor model are each updated with their own learning rate, where $\eta_F$, $\eta_C$ and $\eta_R$ respectively denote the learning rates of the feature extractor model, the feature classifier model and the feature reconstructor model during optimization.
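The update formulas are not reproduced in this text. As a sketch only, assuming plain gradient descent and hypothetical symbols $\theta_{F,k}^{t}$, $\theta_{C,k}^{t}$ and $\theta_{R,k}^{t}$ for the feature extractor, feature classifier and feature reconstructor parameters of client k at round t, and $L_{cls}^{k}$ and $L_{rec}^{k}$ for its classification and reconstruction losses, one reading consistent with the feature extractor being optimized by both losses is:

```latex
\begin{aligned}
\theta_{F,k}^{t+1} &= \theta_{F,k}^{t} - \eta_F \nabla_{\theta_F}\left(L_{cls}^{k} + L_{rec}^{k}\right) \\
\theta_{C,k}^{t+1} &= \theta_{C,k}^{t} - \eta_C \nabla_{\theta_C} L_{cls}^{k} \\
\theta_{R,k}^{t+1} &= \theta_{R,k}^{t} - \eta_R \nabla_{\theta_R} L_{rec}^{k}
\end{aligned}
```

Whether the two gradient contributions to the feature extractor are applied jointly, as written here, or in two successive steps is not stated; both readings fit the description.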
The invention also provides a fault diagnosis method based on layered federated learning, applied to a server side, comprising the following steps:
initializing the global feature extractor model parameters, and broadcasting the initialized global feature extractor model parameters and the client update rounds;
receiving the feature extractor model parameters uploaded by each client after the client's local model has been trained and updated on local data, wherein the local model of each client comprises three parts built in layers, namely a feature extractor model, a feature classifier model and a feature reconstructor model; the feature extractor model and the feature classifier model are combined to form a client classification model, and the feature extractor model and the feature reconstructor model are combined to form a client reconstruction model;
performing weight aggregation on the feature extractor model parameters uploaded by each client using the federated averaging algorithm, and updating and broadcasting the global feature extractor model parameters, so that each client repeatedly executes its local model training update process according to the updated global feature extractor model parameters until the feature extractor models of all clients converge or the accuracy preset by the server side for the global feature extractor model is reached; each client then uses its trained client classification model to perform fault diagnosis on incoming data.
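The aggregation step can be illustrated with a minimal sketch. It assumes the common form of federated averaging in which each client's feature extractor parameters are weighted by its local sample count; the text only says "weight aggregation", so the weighting, the function name `fedavg` and the dictionary layout of the parameters are all assumptions made for illustration.

```python
from typing import Dict, List

# Hypothetical layout: parameter name -> flat list of values.
Params = Dict[str, List[float]]


def fedavg(client_params: List[Params], num_samples: List[int]) -> Params:
    """Average feature extractor parameters, weighted by local sample counts."""
    if not client_params or len(client_params) != len(num_samples):
        raise ValueError("need exactly one sample count per client")
    total = sum(num_samples)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    aggregated: Params = {}
    for name in client_params[0]:
        length = len(client_params[0][name])
        # Weighted mean of each coordinate across all clients.
        aggregated[name] = [
            sum(n * p[name][i] for p, n in zip(client_params, num_samples)) / total
            for i in range(length)
        ]
    return aggregated


# Two toy clients; only feature extractor weights are ever passed in.
global_params = fedavg(
    [{"conv1.weight": [1.0, 2.0]}, {"conv1.weight": [3.0, 6.0]}],
    num_samples=[1, 3],
)
print(global_params)
```

Because the classifier and reconstructor parameters never appear in `client_params`, they stay private to each client, which matches the division of roles described above.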
Another technical solution of the present invention is to provide a fault diagnosis system based on hierarchical federated learning, including:
the building module, used for building a local model of the client in layers for any client; the local model comprises a feature extractor model, a feature classifier model and a feature reconstructor model, the feature extractor model and the feature classifier model are combined to form a client classification model, and the feature extractor model and the feature reconstructor model are combined to form a client reconstruction model;
the receiving module, used for receiving, for any client, the global feature extractor model parameters and the client update rounds broadcast by the server side;
the training update module, used for training the corresponding client classification model and client reconstruction model according to the local data of the client, the received global feature extractor model parameters and the client update rounds;
the uploading module, used for uploading the feature extractor model parameters updated by the local model training of each client to the server side, so that the server side performs weight aggregation to obtain and broadcast the updated global feature extractor model parameters;
the loop module, used for repeatedly executing the local training update process of the client until the feature extractor models of all clients converge or the accuracy preset by the server side for the global feature extractor model is reached;
and the fault diagnosis module, used for performing fault diagnosis on incoming data by taking the client classification model trained by each client as that client's local prediction model.
Further, when the training update module trains the client classification model and the client reconstruction model, the client classification model is optimized with the classification loss and the client reconstruction model is optimized with the reconstruction loss, so that the feature extractor of the client is optimized twice.
Further, the system further comprises:
a setting module, used for setting up a federated learning system in which K clients exist as network nodes, and each client k has a corresponding data set $D_k = \{(x_i, y_i)\}_{i=1}^{N_k}$, where $N_k$ denotes the amount of data in the data set; the K clients jointly train a deep learning model, and the data sets of different clients are heterogeneous; that is, for any $i \neq j$, the relationship $P_i \neq P_j$ holds between the data distributions of clients i and j;
a definition calculation module, used for defining the fault classification task of the client to comprise M classes, adopting cross-entropy loss for the fault classification task and mean square error loss for the reconstruction task;
then, for client k, the classification loss $L_{cls}^{k}$ and the reconstruction loss $L_{rec}^{k}$ are calculated, in order, as:
$$L_{cls}^{k} = -\frac{1}{N_k}\sum_{i=1}^{N_k}\sum_{m=1}^{M} \mathbb{1}(y_i = m)\,\log p_m, \qquad L_{rec}^{k} = \frac{1}{N_k}\sum_{i=1}^{N_k} \lVert x_i - \hat{x}_i \rVert^2$$
where $\mathbb{1}(y_i = m)$ indicates whether the class $y_i$ is the same as class m (1 if the same, 0 if different); $p_m$ denotes the probability, given by the softmax function, that the sample is predicted as the m-th class; and $\hat{x}_i$ denotes the reconstruction of $x_i$ output by the client reconstruction model.
Further, the training update module optimizes the client classification model with the classification loss and the client reconstruction model with the reconstruction loss. At the t-th update round of client k, the model parameters of the feature extractor model, the feature classifier model and the feature reconstructor model are each updated with their own learning rate, where $\eta_F$, $\eta_C$ and $\eta_R$ respectively denote the learning rates of the feature extractor model, the feature classifier model and the feature reconstructor model during optimization.
The invention further provides an electronic device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor; when the processor executes the computer program, the fault diagnosis method based on hierarchical federated learning described above is implemented.
According to the technical scheme, the technical scheme of the invention has the following beneficial effects:
The invention discloses a fault diagnosis method and system based on layered federated learning. The method implements the training process between the server side and the clients. Specifically, the local model of each client is divided into three layers: a feature extractor, a classifier and a reconstructor; the server side shares the initial feature extractor model parameters with each client; each client trains its model on its local data and sends the trained feature extractor model weights back to the server side; the server side updates its global feature extractor model parameters with the weights from each client; this process is repeated until the accuracy of each client's feature extractor model on the test set converges or the accuracy preset by the server side for the global feature extractor model is reached; the trained global feature extractor model parameters are sent to each client; and each client uses the global feature extractor together with its own trained classifier as its local prediction model to perform fault diagnosis on incoming data.
The method and system avoid performing federated learning on the entire client model, as is done traditionally, and instead use the feature extractor as the structure optimized in federated learning. This effectively addresses the data heterogeneity problem of the federated learning system, significantly improves model test accuracy, and allows federated learning to be applied to more practical scenarios.
It should be understood that all combinations of the foregoing concepts and additional concepts described in greater detail below can be considered as part of the inventive subject matter of this disclosure unless such concepts are mutually inconsistent.
The foregoing and other aspects, embodiments and features of the present teachings can be more fully understood from the following description taken in conjunction with the accompanying drawings. Additional aspects of the present invention, such as features and/or advantages of exemplary embodiments, will be apparent from the description which follows, or may be learned by practice of specific embodiments in accordance with the teachings of the present invention.
Drawings
The drawings are not intended to be drawn to scale. In the drawings, each identical or nearly identical component that is illustrated in various figures may be represented by a like numeral. For purposes of clarity, not every component may be labeled in every drawing. Embodiments of various aspects of the present invention will now be described, by way of example, with reference to the accompanying drawings, in which:
FIG. 1 is a network architecture of a federated learning system in an embodiment of the present invention;
FIG. 2 is a general flowchart of an apparatus fault diagnosis method based on hierarchical federated learning according to an embodiment of the present invention;
FIG. 3 is a detailed flowchart of an apparatus fault diagnosis method based on hierarchical federated learning according to an embodiment of the present invention;
FIG. 4 is a specific structure of the client local model disclosed in the embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the drawings of the embodiments of the present invention. It is to be understood that the embodiments described are only a few embodiments of the present invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the described embodiments of the invention without any inventive step, are within the scope of protection of the invention. Unless defined otherwise, technical or scientific terms used herein shall have the ordinary meaning as understood by one of ordinary skill in the art to which this invention belongs.
The use of "first," "second," and similar terms in the description and claims of the present application do not denote any order, quantity, or importance, but rather the terms are used to distinguish one element from another. Similarly, the singular forms "a," "an," and "the" do not denote a limitation of quantity, but rather denote the presence of at least one, unless the context clearly dictates otherwise. The terms "comprises" or "comprising," and the like, mean that the elements or components listed in the preceding list of elements or components include the features, integers, steps, operations, elements and/or components listed in the following list of elements or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components and/or groups thereof.
In the prior art, the data heterogeneity problem in federated learning means that the prediction accuracy of client models in a federated learning system is not high and does not meet the requirements of industrial application scenarios. To address this technical problem, the invention provides a fault diagnosis method and system based on layered federated learning, which effectively solves the data heterogeneity problem of the federated learning system and improves the prediction accuracy of the model.
The fault diagnosis method and system based on hierarchical federated learning of the present invention are further described in detail with reference to the embodiments shown in the drawings.
The federated learning system shown in FIG. 1 includes a central server and a plurality of clients connected to the central server; the fault diagnosis method based on hierarchical federated learning provided by the invention is applied to this system.
To implement fault diagnosis under heterogeneous data in the above system, the method must address the following challenges: (1) how to extract similar features across client data when the client data are heterogeneous; (2) how to ensure the validity of the similar features extracted at each client; (3) how to determine the client model structure that federated learning optimizes. Therefore, the invention builds the local model of the client in layers, then obtains the common features of all clients by training and updating only part of the local model through federated learning, and finally achieves fault diagnosis under heterogeneous data.
As shown in FIG. 2, the general flow of the fault diagnosis method based on hierarchical federated learning disclosed in the embodiment is, in order: building the client local model in layers; initializing the global feature extractor model parameters; local training at the clients; aggregation at the server side, with training repeated in a loop until the client models converge or the global model reaches the preset accuracy; and finally fault diagnosis using the classification model trained by each client. From the start of the process, a layered local model is built at each client, and only the part of the local model related to the features common to all clients takes part in the cyclic training and updating with the server side, which solves the data heterogeneity problem of the federated learning system and improves the fault diagnosis accuracy of the model at each client.
Referring to fig. 3, when the fault diagnosis method based on hierarchical federated learning disclosed in the embodiment is applied to a client, the method specifically includes the following steps:
Step S102, building a local model of the client in layers for any client; the local model comprises a feature extractor model, a feature classifier model and a feature reconstructor model, the feature extractor model and the feature classifier model are combined to form a client classification model, and the feature extractor model and the feature reconstructor model are combined to form a client reconstruction model;
The main task of the feature extractor model is to extract the features common to the data of all clients, and the main task of the feature classifier model is to extract the personalized features of each client's data. To decouple the feature extractor model from the feature classifier model, a feature reconstructor model is introduced: when the feature extractor model extracts the data features of the client, the feature reconstructor model reconstructs those features back into the original data, which strengthens the association between the extracted data features and the original data. In the federated learning framework of this solution, the feature extractor model is the structure targeted for optimization.
In a specific implementation, the structures of the feature extractor model, the feature reconstructor model and the feature classifier model are shown in FIG. 4. Before model training, the fault data are first divided into one-dimensional data of size 1024 × 1.
The feature extractor model comprises two one-dimensional convolutional layers, two one-dimensional pooling layers and two ReLU activation functions. Specifically, the 1024 × 1 fault data are first output as 256 × 32 data features through a one-dimensional convolutional layer and a ReLU activation function (the ReLU activation function does not change the feature dimensions), then as 128 × 32 data features through a one-dimensional pooling layer, then as 128 × 64 data features through a one-dimensional convolutional layer and a ReLU activation function, and finally as 64 × 64 data features through a one-dimensional pooling layer.
The feature reconstructor model comprises two one-dimensional convolutional layers, two one-dimensional pooling layers, two LeakyReLU activation functions and a Reshape layer. Specifically, the 64 × 64 data features output by the feature extractor model are output as 32 × 128 data features through a one-dimensional convolutional layer and a LeakyReLU activation function (the LeakyReLU activation function does not change the feature dimensions), then as 16 × 128 data features through a one-dimensional pooling layer, then as 8 × 256 data features through a one-dimensional convolutional layer and a LeakyReLU activation function, then as 4 × 256 data features through a one-dimensional pooling layer, and are finally converted to the dimensions of the initial fault data by the Reshape layer.
The feature classifier model comprises one one-dimensional convolutional layer, one one-dimensional pooling layer, two linear layers, two ReLU activation functions and a Flatten layer. Specifically, the 64 × 64 data features output by the feature extractor model are output as 32 × 64 data features through the one-dimensional convolutional layer and a ReLU activation function, then as 16 × 64 data features through the one-dimensional pooling layer, are flattened into 1024 data features by the Flatten layer, are output as 128 data features through a linear layer and a ReLU activation function, and are finally output as 10 data features through a linear layer for classification.
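The layer sizes stated above can be checked with a short shape trace. The text gives only the output sizes, so the kernel sizes, strides and paddings below are assumptions chosen solely to reproduce those sizes, and the helper names are hypothetical; this is not the patent's actual layer configuration.

```python
from typing import List, Tuple

Shape = Tuple[int, int]  # (sequence length, channels)


def conv1d_len(length: int, kernel: int, stride: int, padding: int) -> int:
    """Output length of a 1-D convolution (standard floor formula)."""
    return (length + 2 * padding - kernel) // stride + 1


def pool1d_len(length: int, size: int = 2) -> int:
    """Output length of non-overlapping 1-D pooling with window `size`."""
    return length // size


def trace(start: Shape, layers: list) -> List[Shape]:
    """Apply ("conv", out_channels, kernel, stride, padding) or ("pool",) steps."""
    length, channels = start
    shapes = [start]
    for layer in layers:
        if layer[0] == "conv":
            _, channels, kernel, stride, padding = layer
            length = conv1d_len(length, kernel, stride, padding)
        else:
            length = pool1d_len(length)
        shapes.append((length, channels))
    return shapes


# Assumed hyperparameters (not given in the text).
EXTRACTOR = [("conv", 32, 8, 4, 2), ("pool",), ("conv", 64, 3, 1, 1), ("pool",)]
RECONSTRUCTOR = [("conv", 128, 3, 2, 1), ("pool",), ("conv", 256, 3, 2, 1), ("pool",)]
CLASSIFIER = [("conv", 64, 3, 2, 1), ("pool",)]

extractor_shapes = trace((1024, 1), EXTRACTOR)
reconstructor_shapes = trace(extractor_shapes[-1], RECONSTRUCTOR)
classifier_shapes = trace(extractor_shapes[-1], CLASSIFIER)

print(extractor_shapes)      # ends at (64, 64)
print(reconstructor_shapes)  # ends at (4, 256): 4 * 256 = 1024 values to reshape
print(classifier_shapes)     # ends at (16, 64): 16 * 64 = 1024 values to flatten
```

The trace also confirms two consistency points in the description: the reconstructor's final 4 × 256 output holds exactly the 1024 values needed to reshape back to 1024 × 1, and the classifier's 16 × 64 output flattens to the stated 1024 features.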
For challenge (1), the scheme builds a feature extractor model at each client to extract similar features across client data, and builds a feature classifier model to extract the private features of the client data and complete the classification task. For challenge (2), a feature reconstructor model is built at each client, which alleviates the coupling between the client's feature extractor model and feature classifier model, strengthens the link between the data features extracted by the feature extractor model and the original data, and improves the validity of the features. For challenge (3), the traditional approach of optimizing the entire client model is clearly unsuitable for this scheme, so the feature extractor model of the client is chosen as the client model structure optimized by federated learning. In summary, the local model built at the client addresses the data heterogeneity problem of the federated learning system and improves the test accuracy of the local prediction model.
Step S104, for any client, receiving the global feature extractor model parameters and the client update rounds broadcast by the server side; initially, the global feature extractor model parameters are initialized by the server side and then broadcast.
Step S106, training the corresponding client classification model and client reconstruction model according to the local data of the client, the received global feature extractor model parameters and the client update rounds; that is, to keep the network complete, the feature extractor model is not trained alone: the feature extractor model, the feature classifier model and the feature reconstructor model are all trained locally.
Step S108, uploading the feature extractor model parameters updated by the local model training of each client to a server, so that the server performs weight aggregation to obtain and broadcast updated global feature extractor model parameters;
step S110, the local training updating process of the client is repeatedly executed until all the characteristic extractor models in the client converge or the precision of a global characteristic extractor model preset by the server is reached;
the two steps realize the reciprocating process that the client sends the trained feature extractor model parameters to the central server and then receives the global feature extractor model parameters fed back by the central server.
And step S112, taking the client classification models which are trained correspondingly by the clients as local prediction models of the clients, and carrying out fault diagnosis on the incoming data.
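The client-side steps S104 to S108 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: parameters are represented as plain dictionaries of float lists, and `LocalModel`, `client_round` and `toy_step` are hypothetical names. The training pass itself is left as a caller-supplied function.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Params = Dict[str, List[float]]


@dataclass
class LocalModel:
    """Three-part client model; only `extractor` is ever shared with the server."""
    extractor: Params
    classifier: Params = field(default_factory=dict)
    reconstructor: Params = field(default_factory=dict)


def client_round(model: LocalModel, global_extractor: Params, update_rounds: int,
                 local_step: Callable[[LocalModel], None]) -> Params:
    # S104: overwrite the local extractor with the broadcast global parameters.
    model.extractor = {k: list(v) for k, v in global_extractor.items()}
    # S106: train all three parts locally for the broadcast number of rounds.
    for _ in range(update_rounds):
        local_step(model)
    # S108: upload a copy of the extractor parameters only.
    return {k: list(v) for k, v in model.extractor.items()}


def toy_step(m: LocalModel) -> None:
    """Hypothetical stand-in for one real local training pass."""
    m.extractor["w"] = [w - 0.1 for w in m.extractor["w"]]
    m.classifier["w"] = [w + 1.0 for w in m.classifier["w"]]


model = LocalModel(extractor={"w": [0.0]}, classifier={"w": [0.0]})
uploaded = client_round(model, {"w": [1.0]}, update_rounds=5, local_step=toy_step)
```

The classifier and reconstructor parameters never leave the client; only the extractor parameters are overwritten by the broadcast and returned for upload.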
Another embodiment of the present invention provides a fault diagnosis method based on hierarchical federated learning, applied to a server and including:
initializing global feature extractor model parameters, and broadcasting the initialized global feature extractor model parameters and the client update rounds; optionally, in an embodiment, the number of update rounds of each client is fixed and set to 5;
receiving the feature extractor model parameters uploaded by each client, which the local model of each client has updated by training on local data, wherein the local model of each client comprises three parts built in layers, namely a feature extractor model, a feature classifier model and a feature reconstructor model; the feature extractor model and the feature classifier model are combined to form a client classification model, and the feature extractor model and the feature reconstructor model are combined to form a client reconstruction model;
performing weight aggregation on the feature extractor model parameters uploaded by the clients using the federated averaging algorithm, and updating and broadcasting the global feature extractor model parameters, so that each client repeats its local model training and update process with the updated global feature extractor model parameters until the feature extractor models of all clients converge or the accuracy preset by the server for the global feature extractor model is reached; each client then uses its trained client classification model to perform fault diagnosis on incoming data.
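The server-side weight aggregation can be sketched as below. The text names the federated averaging algorithm but does not give the weights; weighting each client by the size of its data set is the usual choice and is an assumption here. The function name `fedavg` and the dictionary-of-lists parameter layout are hypothetical.

```python
from typing import Dict, List, Sequence

Params = Dict[str, List[float]]


def fedavg(client_params: Sequence[Params], client_sizes: Sequence[int]) -> Params:
    """Weighted average of extractor parameters, weights proportional to data size."""
    if not client_params or len(client_params) != len(client_sizes):
        raise ValueError("need one data-set size per client")
    total = float(sum(client_sizes))
    aggregated: Params = {}
    for name in client_params[0]:
        length = len(client_params[0][name])
        aggregated[name] = [
            sum(n * p[name][i] for p, n in zip(client_params, client_sizes)) / total
            for i in range(length)
        ]
    return aggregated


# Two clients upload extractor parameters; the second holds three times as much data.
uploads = [{"w": [1.0, 2.0]}, {"w": [3.0, 6.0]}]
global_params = fedavg(uploads, client_sizes=[1, 3])
print(global_params)  # {'w': [2.5, 5.0]}
```

Because only feature extractor parameters are uploaded, the aggregation never touches the classifier or reconstructor of any client.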
Optionally, as shown in Fig. 1, the client sends its locally trained feature extractor model parameters to the central server, and the central server records each client's information and upload time.
During federated learning, to relieve the influence of data heterogeneity on the model, the fault diagnosis method applied to the client or the server no longer performs federated learning on the whole client model, as is done conventionally. Instead, only the feature extractor model is used as the structure optimized in federated learning, which addresses the data heterogeneity of the federated learning system and focuses on the features common to the data of all clients. Furthermore, the method first trains the feature extractor model on the training set by federated learning and adjusts the hyper-parameters of the model on the validation set until the test accuracy of the model on the test set converges; finally, the trained feature extractor model is combined with each client's local feature classifier model to form that client's local prediction model for application, which improves the prediction accuracy.
As an optional implementation, when the client classification model and the client reconstruction model undergo local update training, the client classification model is optimized with the classification loss and the client reconstruction model is optimized with the reconstruction loss, so that the feature extractor of the client is optimized twice.
For example, in the federated learning system shown in Fig. 1, the network nodes include K clients, each with its own data set, where $N_k$ denotes the amount of data in the data set of client k. The K clients jointly train a deep learning model, and the data sets of different clients are heterogeneous; that is, for any i ≠ j, the data set of client i differs from that of client j.
The fault classification task of a client is defined to comprise M classes; cross-entropy loss is used for the fault classification task, and mean-square-error loss is used for the reconstruction task.
For client k, the classification loss and the reconstruction loss are then defined in turn,
where the indicator term is 1 if the label $y_i$ is the same as class m and 0 otherwise, and $p_m$ denotes the probability with which the softmax function predicts the m-th class.
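As a rough illustration of the two losses named above, the sketch below computes a mean cross-entropy over a batch of softmax predictions and a mean squared error between inputs and their reconstructions. The exact normalisation used in the patent's formulas is not reproduced in this text, so the averaging shown here is an assumption, and the function names are hypothetical.

```python
import math
from typing import Sequence


def softmax(logits: Sequence[float]) -> list:
    shift = max(logits)                        # subtract the max for numerical stability
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classification_loss(batch_logits, labels) -> float:
    """Mean cross-entropy: -(1/N) * sum_i sum_m 1[y_i == m] * log p_m."""
    losses = [-math.log(softmax(logits)[y]) for logits, y in zip(batch_logits, labels)]
    return sum(losses) / len(losses)


def reconstruction_loss(batch_x, batch_x_hat) -> float:
    """Mean squared error between inputs and their reconstructions (assumed averaging)."""
    sq = [(a - b) ** 2 for x, x_hat in zip(batch_x, batch_x_hat) for a, b in zip(x, x_hat)]
    return sum(sq) / len(sq)


# A uniform prediction over M = 4 classes gives a cross-entropy of ln 4.
ce = classification_loss([[0.0, 0.0, 0.0, 0.0]], [2])
mse = reconstruction_loss([[1.0, 2.0]], [[1.0, 4.0]])
```

Because the indicator term selects only the true class, the inner sum over m reduces to the negative log-probability of the labelled class, which is what `classification_loss` evaluates.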
The model parameters of client k in the t-th round of updating are defined for the feature extractor model, the feature classifier model and the feature reconstructor model respectively, and each model has a corresponding update formula,
where $\eta_F$, $\eta_C$ and $\eta_R$ denote the learning rates of the feature extractor model, the feature classifier model and the feature reconstructor model during optimization.
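One plausible form of the update, given only as a sketch, is plain gradient descent applied once per loss. Here $\theta_F^k$, $\theta_C^k$ and $\theta_R^k$ are assumed symbols for the extractor, classifier and reconstructor parameters of client k, and $\mathcal{L}_{cls}^k$ and $\mathcal{L}_{rec}^k$ are assumed symbols for its classification and reconstruction losses; the patent's own notation and optimizer may differ.

```latex
\begin{aligned}
\theta_F^{k} &\leftarrow \theta_F^{k} - \eta_F \nabla_{\theta_F}\mathcal{L}_{cls}^{k}, &
\theta_C^{k} &\leftarrow \theta_C^{k} - \eta_C \nabla_{\theta_C}\mathcal{L}_{cls}^{k}, \\
\theta_F^{k} &\leftarrow \theta_F^{k} - \eta_F \nabla_{\theta_F}\mathcal{L}_{rec}^{k}, &
\theta_R^{k} &\leftarrow \theta_R^{k} - \eta_R \nabla_{\theta_R}\mathcal{L}_{rec}^{k}.
\end{aligned}
```

Written this way, the feature extractor receives one update from each loss, which matches the statement that it is optimized twice, while the classifier and the reconstructor are each updated by their own loss only.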
In addition, whether the test accuracy of the feature extractor model has converged on the test set is calculated as the ratio $n_{correct} / n_{testing}$,
where $n_{testing}$ denotes the number of samples participating in the model test, and $n_{correct}$ denotes the number of those samples whose test result is consistent with the true result.
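A small sketch of the accuracy calculation and of one possible convergence check follows. The accuracy is the ratio of correct to tested samples as described above; the stopping rule (all of the last few changes within a tolerance) and the parameters `tol` and `patience` are hypothetical, since the text does not say how convergence is judged.

```python
def accuracy(n_correct: int, n_testing: int) -> float:
    """Fraction of tested samples whose predicted class matches the true class."""
    if n_testing <= 0:
        raise ValueError("n_testing must be positive")
    return n_correct / n_testing


def has_converged(history, tol=1e-3, patience=3) -> bool:
    """True when the last `patience` accuracy changes are all within `tol` (assumed rule)."""
    if len(history) <= patience:
        return False
    recent = history[-(patience + 1):]
    return all(abs(b - a) <= tol for a, b in zip(recent, recent[1:]))


# Test accuracy after each communication round on a 200-sample test set.
history = [accuracy(c, 200) for c in (120, 160, 181, 181, 181, 181)]
print(has_converged(history[:4]), has_converged(history))
```

In a real run, the same check would be combined with the accuracy threshold preset by the server, so that training stops when either condition is met.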
This embodiment further provides an electronic device that includes a memory, a processor, and a computer program stored in the memory and executable on the processor; when the computer program is executed by the processor, the fault diagnosis method based on hierarchical federated learning described above is implemented.
The computer program may be run on a processor or stored in a computer-readable storage medium, which may include permanent and non-permanent, removable and non-removable media, and may be implemented by any method or technology for storage of information. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device. As defined herein, a storage medium does not include a transitory computer-readable medium such as a modulated data signal or a carrier wave.
These computer programs may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks, and corresponding steps may be implemented by different modules.
For example, the present embodiment provides a fault diagnosis system based on hierarchical federated learning, the system including:
a building module, used for building, for any client, a local model of the client in layers; the local model comprises a feature extractor model, a feature classifier model and a feature reconstructor model, the feature extractor model and the feature classifier model are combined to form a client classification model, and the feature extractor model and the feature reconstructor model are combined to form a client reconstruction model;
a receiving module, used for receiving, for any client, the global feature extractor model parameters and the client update rounds broadcast by the server;
a training update module, used for training the corresponding client classification model and client reconstruction model according to the local data of the client, the received global feature extractor model parameters and the client update rounds;
an uploading module, used for uploading the feature extractor model parameters updated by the local model training of each client to the server, so that the server performs weight aggregation to obtain and broadcast updated global feature extractor model parameters;
a circulation module, used for repeatedly executing the local training and update process of the client until the feature extractor models of all clients converge or the accuracy preset by the server for the global feature extractor model is reached;
and a fault diagnosis module, used for performing fault diagnosis on incoming data, with the client classification model trained by each client serving as that client's local prediction model.
The system implements the steps of the fault diagnosis method based on hierarchical federated learning disclosed in the above embodiment; these steps have already been described and are not repeated here.
For example, when training the client classification model and the client reconstruction model, the training update module optimizes the client classification model with the classification loss and the client reconstruction model with the reconstruction loss, so that the feature extractor of the client is optimized twice.
For another example, the system further comprises:
a setting module, used for setting a federated learning system whose network nodes include K clients, each with its own data set, where $N_k$ denotes the amount of data in the data set of client k; the K clients jointly train a deep learning model, and the data sets of different clients are heterogeneous; that is, for any i ≠ j, the data set of client i differs from that of client j;
a definition calculation module, used for defining the fault classification task of a client to comprise M classes, with cross-entropy loss used for the fault classification task and mean-square-error loss used for the reconstruction task;
for client k, the classification loss and the reconstruction loss are then calculated in turn,
where the indicator term is 1 if the label $y_i$ is the same as class m and 0 otherwise, and $p_m$ denotes the probability with which the softmax function predicts the m-th class.
For another example, based on the definitions and calculation formulas of the setting module and the definition calculation module, the training update module optimizes the client classification model with the classification loss and the client reconstruction model with the reconstruction loss, and its update process is as follows:
the model parameters of client k in the t-th round of updating are defined for the feature extractor model, the feature classifier model and the feature reconstructor model respectively, and each model has a corresponding update formula,
where $\eta_F$, $\eta_C$ and $\eta_R$ denote the learning rates of the feature extractor model, the feature classifier model and the feature reconstructor model during optimization.
On the one hand, the local model of the client is designed in layers: it is divided into a feature extractor model for extracting the features common to client data, a feature classifier model for extracting the private features of the client data, and a feature reconstructor model for restoring the extracted common features to the original data. On the other hand, of the parameters trained at the client, only the feature extractor model is uploaded to the server, while the feature classifier model and the feature reconstructor model are trained only locally. This makes it possible to extract similar features among client data under client data heterogeneity and to ensure the effectiveness of the similar features extracted at each client, addresses the technical problem of low prediction accuracy in federated equipment fault diagnosis under client data heterogeneity, provides a new approach for subsequent research on personalized federated learning and related engineering applications, and allows federated learning to be applied to more practical scenarios.
Although the present invention has been described with reference to the preferred embodiments, it is not intended to be limited thereto. Those skilled in the art can make various changes and modifications without departing from the spirit and scope of the invention. Therefore, the protection scope of the present invention should be defined by the appended claims.
Claims (10)
1. A fault diagnosis method based on hierarchical federated learning, characterized in that the method is applied to a client and comprises the following steps:
for any client, building a local model of the client in a layered manner; the local model comprises a feature extractor model, a feature classifier model and a feature reconstructor model, the feature extractor model and the feature classifier model are combined to form a client classification model, and the feature extractor model and the feature reconstructor model are combined to form a client reconstruction model;
receiving global feature extractor model parameters and client updating turns broadcasted by a server side for any client side;
training a corresponding client classification model and a corresponding client reconstruction model according to local data of the client, received global feature extractor model parameters and a client updating turn;
uploading the feature extractor model parameters updated by the local model training of each client to the server, so that the server performs weight aggregation to obtain and broadcast updated global feature extractor model parameters;
repeatedly executing the local training and update process of the client until the feature extractor models of all clients converge or the accuracy preset by the server for the global feature extractor model is reached;
and using the client classification model trained by each client as that client's local prediction model to perform fault diagnosis on incoming data.
2. The fault diagnosis method based on hierarchical federated learning as claimed in claim 1, wherein, during local update training of the client classification model and the client reconstruction model, the client classification model is optimized with a classification loss and the client reconstruction model is optimized with a reconstruction loss, so that the feature extractor of the client is optimized twice.
3. The fault diagnosis method based on hierarchical federated learning as claimed in claim 2, wherein
a federated learning system is set whose network nodes include K clients, each with its own data set, where $N_k$ denotes the amount of data in the data set of client k; the K clients jointly train a deep learning model, and the data sets of different clients are heterogeneous; that is, for any i ≠ j, the data set of client i differs from that of client j;
the fault classification task of a client is defined to comprise M classes, cross-entropy loss is used for the fault classification task, and mean-square-error loss is used for the reconstruction task;
for client k, the classification loss and the reconstruction loss are then defined in turn,
where the indicator term is 1 if the label $y_i$ is the same as class m and 0 otherwise, and $p_m$ denotes the probability with which the softmax function predicts the m-th class.
4. The fault diagnosis method based on hierarchical federated learning as claimed in claim 3, wherein the model parameters of client k in the t-th round of updating are defined for the feature extractor model, the feature classifier model and the feature reconstructor model respectively, and each model has a corresponding update formula,
where $\eta_F$, $\eta_C$ and $\eta_R$ denote the learning rates of the feature extractor model, the feature classifier model and the feature reconstructor model during optimization.
5. A fault diagnosis method based on hierarchical federated learning, characterized in that the method is applied to a server and comprises the following steps:
initializing global feature extractor model parameters, broadcasting the initialized global feature extractor model parameters and client updating turns;
receiving feature extractor model parameters which are uploaded by each client and updated by local models of the clients according to local data training, wherein the local models of the clients comprise three parts, namely a feature extractor model, a feature classifier model and a feature reconstructor model which are built in a layered mode, and the feature extractor model and the feature classifier model are combined to form a client classification model, and the feature extractor model and the feature reconstructor model are combined to form a client reconstruction model;
performing weight aggregation on the feature extractor model parameters uploaded by the clients using the federated averaging algorithm, and updating and broadcasting the global feature extractor model parameters, so that each client repeats its local model training and update process with the updated global feature extractor model parameters until the feature extractor models of all clients converge or the accuracy preset by the server for the global feature extractor model is reached; each client then uses its trained client classification model to perform fault diagnosis on incoming data.
6. A fault diagnosis system based on hierarchical federated learning, comprising:
the building module is used for building a local model of the client by layers for any client; the local model comprises a feature extractor model, a feature classifier model and a feature reconstructor model, the feature extractor model and the feature classifier model are combined to form a client classification model, and the feature extractor model and the feature reconstructor model are combined to form a client reconstruction model;
the receiving module is used for receiving the global feature extractor model parameters and the client updating turns broadcasted by the server to any client;
the training update module is used for training the corresponding client classification model and client reconstruction model according to the local data of the client, the received global feature extractor model parameters and the client update rounds;
the uploading module is used for uploading the feature extractor model parameters updated by the local model training of each client to the server, so that the server performs weight aggregation to obtain and broadcast updated global feature extractor model parameters;
the circulation module is used for repeatedly executing the local training and update process of the client until the feature extractor models of all clients converge or the accuracy preset by the server for the global feature extractor model is reached;
and the fault diagnosis module is used for diagnosing the fault of the incoming data by taking the client classification model which is trained correspondingly by each client as a local prediction model.
7. The fault diagnosis system based on hierarchical federated learning as claimed in claim 6, wherein, when training the client classification model and the client reconstruction model, the training update module optimizes the client classification model with a classification loss and optimizes the client reconstruction model with a reconstruction loss, so that the feature extractor of the client is optimized twice.
8. The system of claim 7, further comprising:
a setting module, used for setting a federated learning system whose network nodes include K clients, each with its own data set, where $N_k$ denotes the amount of data in the data set of client k; the K clients jointly train a deep learning model, and the data sets of different clients are heterogeneous; that is, for any i ≠ j, the data set of client i differs from that of client j;
a definition calculation module, used for defining the fault classification task of a client to comprise M classes, with cross-entropy loss used for the fault classification task and mean-square-error loss used for the reconstruction task;
for client k, the classification loss and the reconstruction loss are then calculated in turn,
where the indicator term is 1 if the label $y_i$ is the same as class m and 0 otherwise, and $p_m$ denotes the probability with which the softmax function predicts the m-th class.
9. The system of claim 8, wherein the training update module optimizes the client classification model with the classification loss and optimizes the client reconstruction model with the reconstruction loss, and the update process comprises:
defining the model parameters of client k in the t-th round of updating for the feature extractor model, the feature classifier model and the feature reconstructor model respectively, each model having a corresponding update formula,
where $\eta_F$, $\eta_C$ and $\eta_R$ denote the learning rates of the feature extractor model, the feature classifier model and the feature reconstructor model during optimization.
10. An electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the computer program, implements the hierarchical federated learning-based fault diagnosis method of any one of claims 1-4.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211446890.XA CN115905978A (en) | 2022-11-18 | 2022-11-18 | Fault diagnosis method and system based on layered federal learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211446890.XA CN115905978A (en) | 2022-11-18 | 2022-11-18 | Fault diagnosis method and system based on layered federal learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115905978A true CN115905978A (en) | 2023-04-04 |
Family
ID=86484257
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211446890.XA Pending CN115905978A (en) | 2022-11-18 | 2022-11-18 | Fault diagnosis method and system based on layered federal learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115905978A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115952442A (en) * | 2023-03-09 | 2023-04-11 | 山东大学 | Global robust weighting-based federal domain generalized fault diagnosis method and system |
CN117992873A (en) * | 2024-03-20 | 2024-05-07 | 合肥工业大学 | Transformer fault classification method and model training method based on heterogeneous federal learning |
CN117992873B (en) * | 2024-03-20 | 2024-06-11 | 合肥工业大学 | Transformer fault classification method and model training method based on heterogeneous federal learning |
CN118296329A (en) * | 2024-06-06 | 2024-07-05 | 贵州大学 | Federal element learning fault diagnosis method for non-independent identical distribution condition |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN115905978A (en) | Fault diagnosis method and system based on layered federal learning | |
CN113961759B (en) | Abnormality detection method based on attribute map representation learning | |
CN115081532B (en) | Federal continuous learning training method based on memory replay and differential privacy | |
CN113194493B (en) | Wireless network data missing attribute recovery method and device based on graph neural network | |
CN110071798B (en) | Equivalent key obtaining method and device and computer readable storage medium | |
CN110674925B (en) | No-reference VR video quality evaluation method based on 3D convolutional neural network | |
CN115688913A (en) | Cloud-side collaborative personalized federal learning method, system, equipment and medium | |
CN115563650A (en) | Privacy protection system for realizing medical data based on federal learning | |
CN114091667A (en) | Federal mutual learning model training method oriented to non-independent same distribution data | |
CN113518007A (en) | Multi-internet-of-things equipment heterogeneous model efficient mutual learning method based on federal learning | |
CN116205383B (en) | Static dynamic collaborative graph convolution traffic prediction method based on meta learning | |
Jin et al. | Deep learning for seasonal precipitation prediction over China | |
CN109949217A (en) | Video super-resolution method for reconstructing based on residual error study and implicit motion compensation | |
CN115271101A (en) | Personalized federal learning method based on graph convolution hyper-network | |
CN114064627A (en) | Knowledge graph link completion method and system for multiple relations | |
CN107357858B (en) | Network reconstruction method based on geographic position | |
CN114998107A (en) | Image blind super-resolution network model, method, equipment and storage medium | |
Zhang et al. | Sonar image quality evaluation using deep neural network | |
CN113850399A (en) | Prediction confidence sequence-based federal learning member inference method | |
CN113541986B (en) | Fault prediction method and device for 5G slice and computing equipment | |
CN108769674A (en) | A kind of video estimation method based on adaptive stratification motion modeling | |
CN117217328A (en) | Constraint factor-based federal learning client selection method | |
Li et al. | Towards communication-efficient digital twin via AI-powered transmission and reconstruction | |
CN117746172A (en) | Heterogeneous model polymerization method and system based on domain difference perception distillation | |
CN115829029A (en) | Channel attention-based self-distillation implementation method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||