CN112560991A - Personalized federal learning method based on hybrid expert model - Google Patents
Personalized federal learning method based on hybrid expert model Download PDFInfo
- Publication number
- CN112560991A CN112560991A CN202011567011.XA CN202011567011A CN112560991A CN 112560991 A CN112560991 A CN 112560991A CN 202011567011 A CN202011567011 A CN 202011567011A CN 112560991 A CN112560991 A CN 112560991A
- Authority
- CN
- China
- Prior art keywords
- model
- parameters
- federal learning
- global
- personalized
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Links
- 238000000034 method Methods 0.000 title claims abstract description 28
- 238000000605 extraction Methods 0.000 claims abstract description 34
- 230000004913 activation Effects 0.000 claims description 14
- 230000006870 function Effects 0.000 claims description 6
- 230000004931 aggregating effect Effects 0.000 claims description 5
- 238000011478 gradient descent method Methods 0.000 claims description 3
- 230000007547 defect Effects 0.000 abstract description 2
- 239000010410 layer Substances 0.000 description 89
- 230000002776 aggregation Effects 0.000 description 4
- 238000004220 aggregation Methods 0.000 description 4
- 238000010586 diagram Methods 0.000 description 2
- 230000000694 effects Effects 0.000 description 2
- 230000004048 modification Effects 0.000 description 2
- 238000012986 modification Methods 0.000 description 2
- 230000006978 adaptation Effects 0.000 description 1
- 230000003044 adaptive effect Effects 0.000 description 1
- 238000013528 artificial neural network Methods 0.000 description 1
- 230000009286 beneficial effect Effects 0.000 description 1
- 238000013135 deep learning Methods 0.000 description 1
- 230000006872 improvement Effects 0.000 description 1
- 230000014759 maintenance of location Effects 0.000 description 1
- 238000003062 neural network model Methods 0.000 description 1
- 238000005457 optimization Methods 0.000 description 1
- 230000008569 process Effects 0.000 description 1
- 239000002356 single layer Substances 0.000 description 1
Images
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/047—Probabilistic or stochastic networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Computing Systems (AREA)
- Software Systems (AREA)
- Molecular Biology (AREA)
- Computational Linguistics (AREA)
- Biophysics (AREA)
- Biomedical Technology (AREA)
- Mathematical Physics (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Probability & Statistics with Applications (AREA)
- Image Analysis (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention provides an individualized federal learning method based on a mixed expert model, aiming at overcoming the defect that a private model in a large-scale stateless mobile federal environment is difficult to realize full training: all clients participate in the training of the global model together by adding federal learning to obtain a global model parameter thetaG(ii) a Each client downloads theta from the server respectivelyGInitializing a feature extraction layer and an individual classification layer in the client by using the parameter, and obtaining an individual classification layer parameter by individualizing by using a fixed base layer method; at the moment, the client i has theta comprising a feature extraction layer parameter and a global classification layer parameterGAnd the parameters of the individual classification layer, the feature extraction layer, the global classification layer and the individual classification layer are initialized by utilizing the three, the gating model is trained together, and the parameters of the gating model are obtainedAnd finally, the client side obtains parameters of the feature extraction layer, the global classification layer, the individual classification layer and the gating model, and the individual federal learning is completed.
Description
Technical Field
The invention relates to the field of federal learning, in particular to a personalized federal learning method based on a mixed expert model.
Background
In the deep learning field, the quantity and quality of training data largely determine the training effect of the deep neural network model. Under the premise of considering user privacy, user data can not be collected to a data center, but an effective model is difficult to obtain through training only by island data of each client. The federated learning of data retention in the local client is an effective solution, and the federated learning obtains model updates by using data and computing resources dispersed in each client, and obtains a global model by aggregating the model updates of each client, thereby training to obtain a global model which makes full use of the global data while ensuring the privacy of user data. In personalized federal learning, each client obtains an independent personalized model suitable for local data distribution based on a global model, and the personalized model is more in line with an optimization target considered from the perspective of the client. However, personalization is achieved, meanwhile, global knowledge is forgotten by a client, and how to balance the relationship between the two is a research hotspot of personalized federal learning at the present stage.
A hybrid expert Model (MOE) is an integrated learning method that effectively utilizes multiple learners, decomposing complex tasks into individual expert models, and then using a gating model to combine the experts. The privacy federation learning of the field self-adaptation is inspired by an MOE framework, a local private model and a model participating in global updating are used as local and global experts, and the output ratio of the local private model and the model is obtained by a gating model, so that the effect of the local and global field self-adaptation is achieved. For example, a model training method based on domain adaptation and federal learning is proposed in publication No. CN111738440A (published japanese 2020-10-02).
However, private adaptive privacy federated learning requires all clients to participate synchronously, and at the same time, all clients need to keep intermediate states including local gating models and private model states in the federated training process, which makes it limited to federated learning between institutions, and it is difficult to achieve sufficient training for large-scale stateless mobile federated environment private models. Because the gating model is a single-layer linear neural network, the gating model cannot directly utilize high-dimensional input data to effectively balance the private model and the global model in a federal training task based on high-dimensional data such as images.
Disclosure of Invention
The invention provides a personalized federal learning method based on a mixed expert model, aiming at overcoming the defect that the large-scale stateless mobile federal environment private model in the prior art is difficult to realize full training.
In order to solve the technical problems, the technical scheme of the invention is as follows:
the personalized federal learning method based on the hybrid expert model comprises the following steps:
s1: all clients participate in the training of the global model together by adding federal learning to obtain a global model parameter thetaG;
S2: each client i downloads the global model parameter theta from the server respectivelyGInitializing parameters of a personality classification layer, a feature extraction layer and a global classification layer in the client i by using the parameters, and performing random initialization on a gating model;
s3: the client i carries out personalized federal learning, local data are input into the feature extraction layer to obtain an activation value, then the activation value is respectively input into the global classification layer, the personalized classification layer and the gating model, and the personalized classification layer and the gating model are respectively input into the global classification layer, the personalized classification layer and the gating model according to the global model parameter thetaGPerforming personalized federal learning, training to obtain personalized classification layer parametersAnd gating model parameters
S4: judging whether a preset training round number is reached, if so, obtaining an individual classification layer parameterAnd gating model parametersThe client i finishes personalized federal learning; if not, the step S3 is executed.
In the technical scheme, the number of the clients is i, the clients respectively hold the private local data, and the clients receive the global model parameters issued by the server and are applied to personalized federal learning. The client training model comprises an individual classification layer, a gating model and a global model, the global model comprises a global classification layer and a feature extraction layer, specifically, the output end of the feature extraction layer is respectively connected with the input ends of the individual classification layer, the gating model and the global classification layer, the output ends of the individual classification layer, the gating model and the global classification layer are respectively connected with the input end of an aggregation weighting layer, and the output end of the aggregation weighting layer is the output end of the client training model.
Preferably, the step of S1 further includes the steps of: the client i sends a global model parameter theta according to the serverG(θE,θC,G) Local training is carried out by utilizing a gradient descent method to obtain an updated global model parameter thetaG(θE,θC,G) And uploading to a server; wherein theta isEFor the parameters of the feature extraction layer, θC,GParameters of the global classification layer.
Preferably, in step S3, the global model parameter θ is received according to the clientG(θE,θC,G) By fixing the parameter theta of the feature extraction layerETo classify the layer parameters for the personalityFine tuning and personalized federal learning.
Preferably, the specific steps of the individual classification layer for performing the individual federal learning according to the global model parameters are as follows:
s3.1: initializing setting personality classification layer parametersSetting local fine-tuning hyper-parameters;
S3.2:extracting mini-batch (x, y) epsilon D from local data of client iiInputting the data into a feature extraction layer to obtain an activation value AE,x;
S3.3: will activate the value AE,xInputting output predictive tags in a personality classification layer
S3.4: computing predictive labelsCross entropy loss1 with real label y, using back propagation algorithm to obtain individual classification layer parameterGradient of (2)
S3.5: classifying layer parameters according to personalityGradient of (2)Classifying the parameters of the hierarchy for the personalityUpdating, judging whether the preset number of training rounds is reached, and if not, skipping to execute the step S3.2; if yes, outputting the parameters of the individual classification layerAnd completing personalized federal learning of the personalized classification layer.
Preferably, the expression formula of the individual classification layer for performing the individual federal learning according to the global model parameters is as follows:
AE,x=ME(x,θE)
in the formula, ME(. cndot.) represents a feature extraction layer,representing a personality classification level; CEL (·) represents a cross-entropy loss function; alpha represents the learning rate of the individual classification layer for individual federal learning; diRepresenting a client local data set.
Preferably, in step S3, the global model parameter θ is used as a basisGBy fixing the parameter theta of the feature extraction layerEAnd personality classification level parametersTo gate control model parametersFine tuning and personalized federal learning.
Preferably, the steps of the gated model for personalized federal learning according to global model parameters are as follows:
s3.7: extracting mini-batch (x, y) epsilon D from local data of client iiInputting the data into a feature extraction layer to obtain an activation value AE,x;
S3.8: will activate the value AE,xRespectively inputting the individual classification layer, the global classification layer and the gating model, aggregating the output results to complete forward propagation, and obtaining an output prediction label
S3.9: computing predictive labelsCross entropy loss with real label y 2, using back propagation algorithm to obtain gating model parameterGradient of (2)
S3.10: according to gated model parametersGradient of (2)To gate control model parametersUpdating, judging whether the preset number of training rounds is reached, and if not, skipping to execute the step S3.7; if yes, outputting the gating model parametersAnd completing personalized federal learning of the gating model.
Preferably, the gating model performs personalized federal learning according to global model parameters by using the following expression formula:
AE,x=ME(x,θE)
in the formula, gateout represents the ratio of the global model obtained by inputting the activation value into the gating model and the output of the individual classification layer, and the value range is [0,1 ]];MC(. to) represents a global classification layer; beta represents the learning rate of the gated model for personalized federal learning.
Preferably, the hyper-parameters comprise a learning rate and a number of training rounds.
Preferably, the personalized federal learning method further comprises the steps of: the client i takes the tasks to be classified as input data x' and calculates the prediction probability prob of each classification for judging the personalized federal learning result; the expression formula is as follows:
AE,x′=ME(x′,θE)
prob=softmax(y′)
in the formula, y 'represents a prediction label corresponding to the output of the input data x'; softmax (·) denotes a softmax function.
Compared with the prior art, the technical scheme of the invention has the beneficial effects that: according to global model parameters obtained through federal learning, parameters of the individual classification layer and the gating model are finely adjusted, the gating model is trained independently by using the global model and the individual classification layer, the individual classification layer and the global model are mixed, and the global knowledge is kept while the individuation capability is improved; the invention takes the individual classification layer and the global model as local and global experts to form a mixed expert model, then uses the gating model to combine the experts, and adopts the output of the characteristic extraction layer as the input of the gating model, so that the gating model can more effectively divide input data.
Drawings
Fig. 1 is a flowchart of the personalized federal learning method based on a hybrid expert model according to embodiment 1.
Fig. 2 is a flowchart of the personalized federal learning method based on a hybrid expert model according to embodiment 1.
Fig. 3 is a schematic structural diagram of the client training model in embodiment 1.
Detailed Description
The drawings are for illustrative purposes only and are not to be construed as limiting the patent;
for the purpose of better illustrating the embodiments, certain features of the drawings may be omitted, enlarged or reduced, and do not represent the size of an actual product;
it will be understood by those skilled in the art that certain well-known structures in the drawings and descriptions thereof may be omitted.
The technical solution of the present invention is further described below with reference to the accompanying drawings and examples.
Example 1
The embodiment provides an individualized federal learning method based on a hybrid expert model, which is a flow chart of the individualized federal learning method based on the hybrid expert model according to the embodiment, as shown in fig. 1-2.
The personalized federal learning method based on the hybrid expert model provided by the embodiment comprises the following steps:
s1: all clients participate in the training of the global model together by adding federal learning to obtain a global model parameter thetaG;
S2: each client i downloads the global model parameter theta from the server respectivelyGAnd initializing parameters of a personality classification layer, a feature extraction layer and a global classification layer in the client i by using the parameters, and performing random initialization by using a gating model.
Further, the step of S1 includes the steps of:
the client i sends a global model parameter theta according to the serverG(θE,θC,G) Local training is carried out by utilizing a gradient descent method to obtain an updated global model parameter thetaG(θE,θC,G) And uploading to a server; and repeating the steps until the federal training is completed. Wherein theta isEFor the parameters of the feature extraction layer, θC,GParameters of the global classification layer.
S3: the client i carries out personalized federal learning and inputs local data x into the feature extraction layer MEIn (1), the activation value A is obtainedE,xThen, the activation values are respectively input into a global classification layer, an individual classification layer and a gate control model, and the individual classification layer and the gate control model are respectively based on a global model parameter thetaGPerforming personalized federal learning, training to obtain personalized classification layer parametersAnd gating model parametersWhere i represents the client number.
When the individual classification layer is subjected to individual federal learning, the global model parameter theta received by the client i is used for learningG(θE,θC,G) By fixing the parameter theta of the feature extraction layerETo classify the layer parameters for the personalityFine tuning and personalized federal learning. The method comprises the following specific steps:
s3.1: initializing setting personality classification layer parametersSetting local fine-tuning hyper-parameters;
s3.2: extracting mini-batch (x, y) epsilon D from local data of client iiInputting the data into a feature extraction layer to obtain an activation value AE,x(ii) a The expression formula is as follows:
AE,x=ME(x,θE);
s3.3: will activate the value AE,xInputting into the individual classification layer, outputting to obtain the prediction labelThe expression formula is as follows:
s3.4: computing predictive labelsCross entropy loss1 with real label y, using back propagation algorithm to obtain individual classification layer parameterGradient of (2)The expression formula is as follows:
s3.5: classifying layer parameters according to personalityGradient of (2)Classifying the parameters of the hierarchy for the personalityUpdating, wherein the expression formula is as follows:
then judging whether the preset number of training rounds is reached, if not, skipping to execute the step S3.2; if yes, outputting the parameters of the individual classification layerAnd completing personalized federal learning of the personalized classification layer.
In the formula, ME(. cndot.) represents a feature extraction layer,representing a personality classification level; CEL (·) represents a cross-entropy loss function; alpha represents the learning rate of the individual classification layer for individual federal learning; diRepresenting a client local data set.
Wherein, when carrying out personalized federal learning on the gating model, the global model parameter theta is used for learningGBy fixing the parameter theta of the feature extraction layerEAnd personality classification level parametersTo gate control model parametersFine tuning and personalized federal learning. The specific steps ofThe following were used:
s3.7: extracting mini-batch (x, y) epsilon D from local data of client iiInputting the data into a feature extraction layer to obtain an activation value AE,x(ii) a The expression formula is as follows:
AE,x=ME(x,θE);
s3.8: will activate the value AE,xRespectively inputting individual classification layer, global classification layer and gating model, aggregating output results to complete forward propagation, and outputting to obtain prediction labelThe expression formula is as follows:
s3.9: computing predictive labelsCross entropy loss with real label y 2, using back propagation algorithm to obtain gating model parameterGradient of (2)The expression formula is as follows:
s3.10: according to gated model parametersGradient of (2)To gate control model parametersUpdating, wherein the expression formula is as follows:
then judging whether the preset number of training rounds is reached, if not, skipping to execute the step S3.7; if yes, outputting the gating model parametersAnd completing personalized federal learning of the gating model.
In addition, in this step, the super-parameters preset in the individual classification layer or the gating model include the learning rate and the number of training rounds of the individual federal learning.
S4: judging whether a preset training round number is reached, if so, completing personalized federal learning by the client; if not, the step S2 is executed.
Further, the personalized federal learning method further comprises the following steps: the client i takes the tasks to be classified as input data x' and calculates the prediction probability prob of each classification for judging the personalized federal learning result; the expression formula is as follows:
AE,x′=ME(x′,θE)
prob=softmax(y′)
in the formula, y 'represents a prediction label corresponding to the output of the input data x'; softmax (·) denotes a softmax function.
In this embodiment, the number of the clients is multiple, the clients respectively hold the private local data, and the clients receive the global model parameters issued by the server and apply the global model parameters to personalized federal learning. The server in this embodiment is a single federal server, and is responsible for federal learning task allocation, coordinating available clients, receiving global model parameters uploaded from the clients, aggregating the global model parameters to obtain latest global model parameters, and then issuing the latest global model parameters to the clients again.
In this embodiment, the training model of the client includes an individual classification layer, a gating model and a global model, the global model includes a global classification layer and a feature extraction layer, specifically, an output end of the feature extraction layer is connected to input ends of the individual classification layer, the gating model and the global classification layer, output ends of the individual classification layer, the gating model and the global classification layer are connected to an input end of an aggregation weighting layer, and an output end of the aggregation weighting layer is an output end of the training model of the client. Fig. 3 is a schematic structural diagram of the client training model of this embodiment.
In this embodiment, the personality classification layer and the global model are used as local and global experts to form a hybrid expert model, and then a gating model is used to combine the experts. Furthermore, in the embodiment, the output of the feature extraction layer is used as the input of the gating model, so that the gating model can more effectively divide the input data.
In this embodiment, after the client and the server complete federal learning, the client further finely adjusts parameters of the individual classification layer and the gating model, and then separately trains the gating model by using the global model and the individual classification layer, so as to realize mixing of the individual classification layer and the global model, and maintain global knowledge while improving individuation ability.
The same or similar reference numerals correspond to the same or similar parts;
the terms describing positional relationships in the drawings are for illustrative purposes only and are not to be construed as limiting the patent;
it should be understood that the above-described embodiments of the present invention are merely examples for clearly illustrating the present invention, and are not intended to limit the embodiments of the present invention. Other variations and modifications will be apparent to persons skilled in the art in light of the above description. And are neither required nor exhaustive of all embodiments. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the claims of the present invention.
Claims (10)
1. The personalized federal learning method based on the hybrid expert model is characterized by comprising the following steps of:
s1: all clients participate in the training of the global model together by adding federal learning to obtain a global model parameter thetaG;
S2: each client downloads the global model parameter theta from the serverGInitializing parameters of a personality classification layer, a feature extraction layer and a global classification layer in the client i by using the parameters, and performing random initialization on a gating model;
s3: the client i carries out personalized federal learning, local data are input into a feature extraction layer to obtain an activation value, then the activation value is respectively input into a global classification layer, a personalized classification layer and a gating model, and the personalized classification layer and the gating model are respectively input into the global classification layer, the personalized classification layer and the gating model according to a global model parameter thetaGPerforming personalized federal learning, training to obtain personalized classification layer parametersAnd gating model parameters
2. The personalized federal learning method as claimed in claim 1, wherein the step S1 further comprises the steps of: the client i sends a global model parameter theta according to the serverG(θE,θC,G) Local training is carried out by utilizing a gradient descent method to obtain an updated global model parameter thetaG(θE,θC,G) And uploading to a server; wherein theta isEFor the parameters of the feature extraction layer, θC,GParameters of the global classification layer.
3. The personalized federal learning method as claimed in claim 2, wherein in the step S3, the global model parameter θ is received from the client iG(θE,θC,G) By fixing the parameter theta of the feature extraction layerETo classify the layer parameters for the personalityFine tuning and personalized federal learning.
4. The personalized federal learning method as claimed in claim 3, wherein the personalized classification layer performs personalized federal learning according to global model parameters by the following specific steps:
s3.1: initializing setting personality classification layer parametersSetting local fine-tuning hyper-parameters;
s3.2: extracting mini-batch (x, y) epsilon D from local data of client iiInputting the data into a feature extraction layer to obtain an activation value AE,x;
S3.3: will activate the value AE,xInputting into the individual classification layer, outputting to obtain the prediction label
S3.4: computing the predictive labelCross entropy loss1 with real label y, using back propagation algorithm to obtain individual classification layer parameterGradient of (2)
S3.5: classifying layer parameters according to the personalityGradient of (2)Classifying the parameters of the hierarchy for the personalityUpdating, judging whether the preset number of training rounds is reached, and if not, skipping to execute the step S3.2; if yes, outputting the parameters of the individual classification layerPersonalized federation to complete a hierarchy of personality classificationsAnd (5) learning.
5. The personalized federal learning method as claimed in claim 4, wherein the expression formula of the personalized federal learning by the personality classification layer according to the global model parameters is as follows:
AE,x=ME(x,θE)
in the formula, ME(. cndot.) represents a feature extraction layer,representing a personality classification level; CEL (·) represents a cross-entropy loss function; alpha represents the learning rate of the individual classification layer for individual federal learning; diRepresenting a client local data set.
6. The personalized federal learning method as claimed in claim 5, wherein in the step S3, the global model parameter θ is used as a basisGBy fixing the parameter theta of the feature extraction layerEAnd personality classification level parametersTo gate control model parametersFine tuning and personalized federal learning.
7. The personalized federal learning method as claimed in claim 6, wherein the gated model performs personalized federal learning according to global model parameters by the following steps:
s3.7: extracting mini-batch (x, y) epsilon D from local data of client iiInputting the data into a feature extraction layer to obtain an activation value AE,x;
S3.8: will activate the value AE,xRespectively inputting individual classification layer, global classification layer and gating model, aggregating output results to complete forward propagation, and outputting to obtain prediction label
S3.9: computing the predictive labelCross entropy loss with real label y 2, using back propagation algorithm to obtain gating model parameterGradient of (2)
S3.10: according to the gating model parametersGradient of (2)To gate control model parametersUpdating, judging whether the preset number of training rounds is reached, and if not, skipping to execute the step S3.7; if yes, outputting the gating model parametersAnd completing personalized federal learning of the gating model.
8. The personalized federal learning method as claimed in claim 7, wherein the gating model performs personalized federal learning according to global model parameters by using the following expression formula:
AE,x=ME(x,θE)
wherein gateout represents an activation valueThe value range of the ratio of the global model obtained by inputting the gating model to the output of the individual classification layer is [0,1 ]];MC(. to) represents a global classification layer; beta represents the learning rate of the gated model for personalized federal learning.
9. The personalized federal learning method as claimed in claim 4 or 7, wherein the hyper-parameters include learning rate and training round number.
10. The personalized federal learning method as claimed in claim 8, further comprising the steps of: the client i takes the task to be classified as input data x' and calculates the prediction probability prob of each classification, and the expression formula is as follows:
AE,x′=ME(x′,θE)
gateout=MGate(AE,x′,θGatei)
prob=softmax(y′)
in the formula, y 'represents a prediction label corresponding to the output of the input data x'; softmax (·) denotes a softmax function.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011567011.XA CN112560991B (en) | 2020-12-25 | 2020-12-25 | Personalized federal learning method based on mixed expert model |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011567011.XA CN112560991B (en) | 2020-12-25 | 2020-12-25 | Personalized federal learning method based on mixed expert model |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112560991A true CN112560991A (en) | 2021-03-26 |
CN112560991B CN112560991B (en) | 2023-07-07 |
Family
ID=75033035
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011567011.XA Active CN112560991B (en) | 2020-12-25 | 2020-12-25 | Personalized federal learning method based on mixed expert model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112560991B (en) |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113095238A (en) * | 2021-04-15 | 2021-07-09 | 山东省人工智能研究院 | Personalized electrocardiosignal monitoring method based on federal learning |
CN113537509A (en) * | 2021-06-28 | 2021-10-22 | 南方科技大学 | Collaborative model training method and device |
CN113688862A (en) * | 2021-07-09 | 2021-11-23 | 深圳大学 | Brain image classification method based on semi-supervised federal learning and terminal equipment |
CN114357067A (en) * | 2021-12-15 | 2022-04-15 | 华南理工大学 | Personalized federal meta-learning method for data isomerism |
CN114429195A (en) * | 2022-01-21 | 2022-05-03 | 清华大学 | Performance optimization method and device for hybrid expert model training |
CN114818996A (en) * | 2022-06-28 | 2022-07-29 | 山东大学 | Method and system for diagnosing mechanical fault based on federal domain generalization |
CN118410851A (en) * | 2024-07-03 | 2024-07-30 | 浪潮电子信息产业股份有限公司 | Mixed expert model routing network optimization method, product, device and medium |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111275207A (en) * | 2020-02-10 | 2020-06-12 | 深圳前海微众银行股份有限公司 | Semi-supervision-based horizontal federal learning optimization method, equipment and storage medium |
CN111291897A (en) * | 2020-02-10 | 2020-06-16 | 深圳前海微众银行股份有限公司 | Semi-supervision-based horizontal federal learning optimization method, equipment and storage medium |
CN111310938A (en) * | 2020-02-10 | 2020-06-19 | 深圳前海微众银行股份有限公司 | Semi-supervision-based horizontal federal learning optimization method, equipment and storage medium |
-
2020
- 2020-12-25 CN CN202011567011.XA patent/CN112560991B/en active Active
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111275207A (en) * | 2020-02-10 | 2020-06-12 | 深圳前海微众银行股份有限公司 | Semi-supervision-based horizontal federal learning optimization method, equipment and storage medium |
CN111291897A (en) * | 2020-02-10 | 2020-06-16 | 深圳前海微众银行股份有限公司 | Semi-supervision-based horizontal federal learning optimization method, equipment and storage medium |
CN111310938A (en) * | 2020-02-10 | 2020-06-19 | 深圳前海微众银行股份有限公司 | Semi-supervision-based horizontal federal learning optimization method, equipment and storage medium |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113095238A (en) * | 2021-04-15 | 2021-07-09 | 山东省人工智能研究院 | Personalized electrocardiosignal monitoring method based on federal learning |
CN113095238B (en) * | 2021-04-15 | 2021-12-28 | 山东省人工智能研究院 | Personalized electrocardiosignal monitoring method based on federal learning |
CN113537509A (en) * | 2021-06-28 | 2021-10-22 | 南方科技大学 | Collaborative model training method and device |
CN113688862A (en) * | 2021-07-09 | 2021-11-23 | 深圳大学 | Brain image classification method based on semi-supervised federal learning and terminal equipment |
CN113688862B (en) * | 2021-07-09 | 2023-07-04 | 深圳大学 | Brain image classification method based on semi-supervised federal learning and terminal equipment |
CN114357067A (en) * | 2021-12-15 | 2022-04-15 | 华南理工大学 | Personalized federal meta-learning method for data isomerism |
CN114357067B (en) * | 2021-12-15 | 2024-06-25 | 华南理工大学 | Personalized federal element learning method aiming at data isomerism |
CN114429195A (en) * | 2022-01-21 | 2022-05-03 | 清华大学 | Performance optimization method and device for hybrid expert model training |
CN114429195B (en) * | 2022-01-21 | 2024-07-19 | 清华大学 | Performance optimization method and device for training mixed expert model |
CN114818996A (en) * | 2022-06-28 | 2022-07-29 | 山东大学 | Method and system for diagnosing mechanical fault based on federal domain generalization |
CN118410851A (en) * | 2024-07-03 | 2024-07-30 | 浪潮电子信息产业股份有限公司 | Mixed expert model routing network optimization method, product, device and medium |
Also Published As
Publication number | Publication date |
---|---|
CN112560991B (en) | 2023-07-07 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112560991A (en) | Personalized federal learning method based on hybrid expert model | |
Zhang et al. | A novel federated learning scheme for generative adversarial networks | |
Petzka et al. | On the regularization of wasserstein gans | |
CN106779087B (en) | A kind of general-purpose machinery learning data analysis platform | |
Liu et al. | Ensemble learning via negative correlation | |
Cao et al. | PerFED-GAN: Personalized federated learning via generative adversarial networks | |
CN106067042B (en) | Polarization SAR classification method based on semi-supervised depth sparseness filtering network | |
CN113191484A (en) | Federal learning client intelligent selection method and system based on deep reinforcement learning | |
CN109983480A (en) | Use cluster loss training neural network | |
CN109754068A (en) | Transfer learning method and terminal device based on deep learning pre-training model | |
CN107145860B (en) | Classification of Polarimetric SAR Image method based on spatial information and deep learning | |
CN109829049A (en) | The method for solving video question-answering task using the progressive space-time attention network of knowledge base | |
CN115907001B (en) | Knowledge distillation-based federal graph learning method and automatic driving method | |
US20220318412A1 (en) | Privacy-aware pruning in machine learning | |
CN116523079A (en) | Reinforced learning-based federal learning optimization method and system | |
Jin et al. | Image generation method based on improved condition GAN | |
CN116348881A (en) | Combined mixing model | |
Cai et al. | Multi-granularity weighted federated learning in heterogeneous mobile edge computing systems | |
Hihn et al. | Bounded rational decision-making with adaptive neural network priors | |
CN106250928A (en) | Parallel logic homing method based on Graphics Processing Unit and system | |
CN116719607A (en) | Model updating method and system based on federal learning | |
CN113449867B (en) | Deep reinforcement learning multi-agent cooperation method based on knowledge distillation | |
Tao et al. | Communication efficient federated learning via channel-wise dynamic pruning | |
Zhang et al. | FedCR: Personalized federated learning based on across-client common representation with conditional mutual information regularization | |
Nawaz et al. | K-DUMBs IoRT: Knowledge Driven Unified Model Block Sharing in the Internet of Robotic Things |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |