CN112560991A - Personalized federated learning method based on a mixture-of-experts model - Google Patents

Personalized federated learning method based on a mixture-of-experts model

Info

Publication number
CN112560991A
Authority
CN
China
Prior art keywords
model
parameters
federated learning
global
personalized
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011567011.XA
Other languages
Chinese (zh)
Other versions
CN112560991B (en)
Inventor
郭斌彬
肖丹阳
吴维刚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sun Yat Sen University
Original Assignee
Sun Yat Sen University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sun Yat Sen University filed Critical Sun Yat Sen University
Priority to CN202011567011.XA
Publication of CN112560991A
Application granted
Publication of CN112560991B
Legal status: Active

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/214: Pattern recognition; analysing; design or setup of recognition systems or techniques; generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/045: Computing arrangements based on biological models; neural networks; architecture, e.g. interconnection topology; combinations of networks
    • G06N3/047: Computing arrangements based on biological models; neural networks; architecture, e.g. interconnection topology; probabilistic or stochastic networks
    • G06N3/084: Computing arrangements based on biological models; neural networks; learning methods; backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Image Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a personalized federated learning method based on a mixture-of-experts model, aiming at overcoming the difficulty of fully training private models in a large-scale, stateless mobile federated environment. All clients jointly train a global model through federated learning, yielding global model parameters θ_G. Each client downloads θ_G from the server, uses it to initialize its feature extraction layer and personalized classification layer, and obtains the personalized classification layer parameters by fine-tuning with the base layers fixed. At this point client i holds θ_G, comprising the feature extraction layer and global classification layer parameters, together with the personalized classification layer parameters θ^i_{C,P}; initialized with these three, the feature extraction layer, global classification layer and personalized classification layer jointly train the gating model, yielding the gating model parameters θ^i_{Gate}. The client finally holds the parameters of the feature extraction layer, global classification layer, personalized classification layer and gating model, completing personalized federated learning.

Description

Personalized federated learning method based on a mixture-of-experts model
Technical Field
The invention relates to the field of federated learning, and in particular to a personalized federated learning method based on a mixture-of-experts model.
Background
In deep learning, the quantity and quality of training data largely determine how well a deep neural network can be trained. When user privacy must be preserved, user data cannot be collected into a data center, yet it is difficult to train an effective model on the isolated data of a single client. Federated learning, which keeps data on the local client, is an effective solution: it computes model updates using the data and computing resources dispersed across clients and aggregates these updates into a global model, thereby exploiting the global data while protecting user privacy. In personalized federated learning, each client derives from the global model an independent personalized model suited to its local data distribution, which better matches the optimization objective seen from the client's perspective. However, as a client personalizes, it tends to forget global knowledge; how to balance the two is a current research focus of personalized federated learning.
A mixture-of-experts (MoE) model is an ensemble learning method that effectively exploits multiple learners: a complex task is decomposed across individual expert models, and a gating model then combines the experts. Domain-adaptive private federated learning is inspired by the MoE framework: a local private model and the model participating in global updates serve as the local and global experts, and a gating model determines the ratio in which their outputs are mixed, achieving local-global domain adaptation. For example, publication CN111738440A (published 2020-10-02) proposes a model training method based on domain adaptation and federated learning.
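As general background (the display below is the standard MoE formulation, not a formula taken from the patent text): with K experts f_k and a gating function g,

    \hat{y}(x) = \sum_{k=1}^{K} g_k(x)\, f_k(x), \qquad g_k(x) \ge 0, \qquad \sum_{k=1}^{K} g_k(x) = 1.

In the method described below K = 2: the global and personalized classification layers are the two experts, so the gate reduces to a single scalar gateout ∈ [0, 1] weighting the pair.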
However, domain-adaptive private federated learning requires all clients to participate synchronously, and every client must retain intermediate state, including its local gating model and private model, throughout federated training. This restricts the approach to cross-silo federated learning between institutions and makes it difficult to fully train private models in a large-scale, stateless mobile federated environment. Moreover, because the gating model is a single-layer linear neural network, it cannot effectively balance the private and global models directly from high-dimensional inputs in federated training tasks based on high-dimensional data such as images.
Disclosure of Invention
The invention provides a personalized federated learning method based on a mixture-of-experts model, aiming at overcoming the prior-art difficulty of fully training private models in a large-scale, stateless mobile federated environment.
In order to solve the technical problems, the technical scheme of the invention is as follows:
the personalized federal learning method based on the hybrid expert model comprises the following steps:
s1: all clients participate in the training of the global model together by adding federal learning to obtain a global model parameter thetaG
S2: each client i downloads the global model parameter theta from the server respectivelyGInitializing parameters of a personality classification layer, a feature extraction layer and a global classification layer in the client i by using the parameters, and performing random initialization on a gating model;
s3: the client i carries out personalized federal learning, local data are input into the feature extraction layer to obtain an activation value, then the activation value is respectively input into the global classification layer, the personalized classification layer and the gating model, and the personalized classification layer and the gating model are respectively input into the global classification layer, the personalized classification layer and the gating model according to the global model parameter thetaGPerforming personalized federal learning, training to obtain personalized classification layer parameters
Figure BDA0002860951700000021
And gating model parameters
Figure BDA0002860951700000022
S4: judging whether a preset training round number is reached, if so, obtaining an individual classification layer parameter
Figure BDA0002860951700000023
And gating model parameters
Figure BDA0002860951700000024
The client i finishes personalized federal learning; if not, the step S3 is executed.
In this technical scheme there are multiple clients, indexed by i, each holding its own private local data; each client receives the global model parameters issued by the server and applies them to personalized federated learning. The client training model comprises a personalized classification layer, a gating model and a global model, where the global model consists of a global classification layer and a feature extraction layer. Specifically, the output of the feature extraction layer is connected to the inputs of the personalized classification layer, the gating model and the global classification layer; the outputs of these three are connected to the input of an aggregation weighting layer, whose output is the output of the client training model.
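As an illustration of this architecture, the following is a minimal PyTorch sketch of the client training model. It is a sketch under stated assumptions rather than the patent's implementation: the single-linear-layer feature extractor and the sizes in_dim, feat_dim and n_classes are illustrative, and a sigmoid gate is one simple way to realize a scalar gateout in [0, 1].

    import torch
    import torch.nn as nn

    class ClientModel(nn.Module):
        # Feature extraction layer M_E feeding three heads: the global
        # classification layer M_C, the personalized classification layer
        # M_{C,P}, and the gating model M_Gate, whose outputs are combined
        # by the aggregation weighting layer.
        def __init__(self, in_dim=784, feat_dim=128, n_classes=10):  # illustrative sizes
            super().__init__()
            self.feature = nn.Sequential(nn.Linear(in_dim, feat_dim), nn.ReLU())  # theta_E
            self.global_head = nn.Linear(feat_dim, n_classes)                     # theta_{C,G}
            self.personal_head = nn.Linear(feat_dim, n_classes)                   # theta^i_{C,P}
            self.gate = nn.Sequential(nn.Linear(feat_dim, 1), nn.Sigmoid())       # theta^i_Gate

        def forward(self, x):
            a = self.feature(x)            # activation value A_{E,x}
            g = self.gate(a)               # gateout in [0, 1]
            # aggregation weighting layer: mix the global and personalized experts
            return g * self.global_head(a) + (1 - g) * self.personal_head(a)

Feeding the gate the extracted features A_{E,x} rather than the raw input x is the design point stressed above: the single-layer gate then operates on a low-dimensional representation instead of high-dimensional data such as images.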
Preferably, step S1 further comprises the following: client i performs local training by gradient descent starting from the global model parameters θ_G = (θ_E, θ_{C,G}) issued by the server, obtains the updated global model parameters θ_G = (θ_E, θ_{C,G}), and uploads them to the server, where θ_E denotes the feature extraction layer parameters and θ_{C,G} the global classification layer parameters.
Preferably, in step S3, based on the global model parameters θ_G = (θ_E, θ_{C,G}) received by the client, the personalized classification layer parameters θ^i_{C,P} are fine-tuned for personalized learning while the feature extraction layer parameters θ_E are kept fixed.
Preferably, the specific steps by which the personalized classification layer performs personalized learning from the global model parameters are as follows:
S3.1: initialize the personalized classification layer parameters θ^i_{C,P} and set the local fine-tuning hyperparameters;
S3.2: draw a mini-batch (x, y) ∈ D_i from the local data of client i and feed it into the feature extraction layer to obtain the activation value A_{E,x};
S3.3: feed the activation value A_{E,x} into the personalized classification layer to output the predicted label ŷ_P;
S3.4: compute the cross-entropy loss1 between the predicted label ŷ_P and the true label y, and obtain the gradient ∇θ^i_{C,P} of the personalized classification layer parameters by backpropagation;
S3.5: update the personalized classification layer parameters θ^i_{C,P} according to the gradient ∇θ^i_{C,P}; judge whether the preset number of training rounds has been reached; if not, return to step S3.2; if so, output the personalized classification layer parameters θ^i_{C,P}, completing the personalized learning of the personalized classification layer.
Preferably, the personalized classification layer performs personalized learning from the global model parameters according to the following expressions:

A_{E,x} = M_E(x, θ_E)
ŷ_P = M_{C,P}(A_{E,x}, θ^i_{C,P})
loss1 = CEL(ŷ_P, y)
∇θ^i_{C,P} = ∂loss1 / ∂θ^i_{C,P}
θ^i_{C,P} ← θ^i_{C,P} − α ∇θ^i_{C,P}

where M_E(·) denotes the feature extraction layer, M_{C,P}(·) the personalized classification layer, CEL(·) the cross-entropy loss function, α the learning rate of the personalized classification layer's personalized learning, and D_i the local data set of the client.
Preferably, in step S3, based on the global model parameters θ_G, the gating model parameters θ^i_{Gate} are fine-tuned for personalized learning while the feature extraction layer parameters θ_E and the personalized classification layer parameters θ^i_{C,P} are kept fixed.
Preferably, the steps by which the gating model performs personalized learning from the global model parameters are as follows:
S3.6: randomly initialize the gating model parameters θ^i_{Gate} and set the local fine-tuning hyperparameters;
S3.7: draw a mini-batch (x, y) ∈ D_i from the local data of client i and feed it into the feature extraction layer to obtain the activation value A_{E,x};
S3.8: feed the activation value A_{E,x} into the personalized classification layer, the global classification layer and the gating model respectively, and aggregate their outputs to complete the forward pass, obtaining the predicted label ŷ;
S3.9: compute the cross-entropy loss2 between the predicted label ŷ and the true label y, and obtain the gradient ∇θ^i_{Gate} of the gating model parameters by backpropagation;
S3.10: update the gating model parameters θ^i_{Gate} according to the gradient ∇θ^i_{Gate}; judge whether the preset number of training rounds has been reached; if not, return to step S3.7; if so, output the gating model parameters θ^i_{Gate}, completing the personalized learning of the gating model.
Preferably, the gating model performs personalized learning from the global model parameters according to the following expressions:

A_{E,x} = M_E(x, θ_E)
gateout = M_{Gate}(A_{E,x}, θ^i_{Gate})
ŷ = gateout · M_C(A_{E,x}, θ_{C,G}) + (1 − gateout) · M_{C,P}(A_{E,x}, θ^i_{C,P})
loss2 = CEL(ŷ, y)
∇θ^i_{Gate} = ∂loss2 / ∂θ^i_{Gate}
θ^i_{Gate} ← θ^i_{Gate} − β ∇θ^i_{Gate}

where gateout is the mixing ratio, produced by feeding the activation value into the gating model, between the global model output and the personalized classification layer output, with value range [0, 1]; M_Gate(·) denotes the gating model, M_C(·) the global classification layer, and β the learning rate of the gating model's personalized learning.
Preferably, the hyper-parameters comprise a learning rate and a number of training rounds.
Preferably, the personalized federated learning method further comprises the following step: client i takes the task to be classified as input data x′ and computes the predicted probability prob of each class, used to evaluate the personalized learning result, according to the following expressions:

A_{E,x′} = M_E(x′, θ_E)
gateout = M_{Gate}(A_{E,x′}, θ^i_{Gate})
y′ = gateout · M_C(A_{E,x′}, θ_{C,G}) + (1 − gateout) · M_{C,P}(A_{E,x′}, θ^i_{C,P})
prob = softmax(y′)

where y′ denotes the predicted label output for the input data x′ and softmax(·) denotes the softmax function.
Compared with the prior art, the technical scheme of the invention has the following beneficial effects. The parameters of the personalized classification layer and the gating model are fine-tuned from the global model parameters obtained by federated learning, and the gating model is then trained separately using the global model and the personalized classification layer, mixing the two so that personalization ability is improved while global knowledge is retained. The invention treats the personalized classification layer and the global model as local and global experts forming a mixture-of-experts model, combines the experts with a gating model, and uses the output of the feature extraction layer as the input of the gating model, so that the gating model can partition the input data more effectively.
Drawings
Fig. 1 is a flowchart of the personalized federated learning method based on the mixture-of-experts model according to Embodiment 1.
Fig. 2 is a flowchart of the personalized federated learning method based on the mixture-of-experts model according to Embodiment 1.
Fig. 3 is a schematic structural diagram of the client training model in Embodiment 1.
Detailed Description
The drawings are for illustrative purposes only and are not to be construed as limiting the patent;
for the purpose of better illustrating the embodiments, certain features of the drawings may be omitted, enlarged or reduced, and do not represent the size of an actual product;
it will be understood by those skilled in the art that certain well-known structures in the drawings and descriptions thereof may be omitted.
The technical solution of the present invention is further described below with reference to the accompanying drawings and examples.
Embodiment 1
This embodiment provides a personalized federated learning method based on a mixture-of-experts model; flowcharts of the method are shown in Figs. 1-2.
The personalized federated learning method based on the mixture-of-experts model provided by this embodiment comprises the following steps:
S1: all clients jointly train the global model through federated learning to obtain the global model parameters θ_G.
S2: each client i downloads the global model parameters θ_G from the server, uses them to initialize the parameters of the personalized classification layer, the feature extraction layer and the global classification layer in client i, and randomly initializes the gating model.
Further, step S1 comprises the following: client i performs local training by gradient descent starting from the global model parameters θ_G = (θ_E, θ_{C,G}) issued by the server, obtains the updated global model parameters θ_G = (θ_E, θ_{C,G}), and uploads them to the server; this is repeated until federated training is complete. Here θ_E denotes the feature extraction layer parameters and θ_{C,G} the global classification layer parameters. A sketch of one such round is given below.
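The patent text does not fix the server's aggregation rule, so the sketch below assumes a FedAvg-style average over the uploaded parameters; it reuses the ClientModel sketch given earlier, and the function names and hyperparameters (local_update, fedavg, lr, epochs) are illustrative.

    def local_update(model, loader, lr=0.01, epochs=1):
        # Client side of S1: gradient descent on the global model only,
        # i.e. the feature extraction layer and the global classification
        # layer; the personalized head and the gate take no part in S1.
        params = list(model.feature.parameters()) + list(model.global_head.parameters())
        opt = torch.optim.SGD(params, lr=lr)
        loss_fn = nn.CrossEntropyLoss()
        for _ in range(epochs):
            for x, y in loader:
                opt.zero_grad()
                loss_fn(model.global_head(model.feature(x)), y).backward()
                opt.step()
        # upload theta_G = (theta_E, theta_{C,G})
        return {k: v.detach().clone() for k, v in model.state_dict().items()
                if k.startswith(('feature', 'global_head'))}

    def fedavg(client_states):
        # Server side of S1: aggregate the uploaded global model parameters
        # into the latest theta_G (unweighted average for simplicity).
        return {k: torch.stack([s[k] for s in client_states]).mean(0)
                for k in client_states[0]}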
S3: client i performs personalized learning. The local data x are fed into the feature extraction layer M_E to obtain the activation value A_{E,x}, which is then fed into the global classification layer, the personalized classification layer and the gating model respectively; the personalized classification layer and the gating model each perform personalized learning based on the global model parameters θ_G, training to obtain the personalized classification layer parameters θ^i_{C,P} and the gating model parameters θ^i_{Gate}, where i denotes the client index.
For the personalized learning of the personalized classification layer, the global model parameters θ_G = (θ_E, θ_{C,G}) received by client i are used: the personalized classification layer parameters θ^i_{C,P} are fine-tuned while the feature extraction layer parameters θ_E are kept fixed. The specific steps are as follows:
S3.1: initialize the personalized classification layer parameters θ^i_{C,P} and set the local fine-tuning hyperparameters;
S3.2: draw a mini-batch (x, y) ∈ D_i from the local data of client i and feed it into the feature extraction layer to obtain the activation value A_{E,x}; the expression is:
A_{E,x} = M_E(x, θ_E)
S3.3: feed the activation value A_{E,x} into the personalized classification layer and output the predicted label ŷ_P; the expression is:
ŷ_P = M_{C,P}(A_{E,x}, θ^i_{C,P})
S3.4: compute the cross-entropy loss1 between the predicted label ŷ_P and the true label y, and obtain the gradient ∇θ^i_{C,P} of the personalized classification layer parameters by backpropagation; the expressions are:
loss1 = CEL(ŷ_P, y)
∇θ^i_{C,P} = ∂loss1 / ∂θ^i_{C,P}
S3.5: update the personalized classification layer parameters θ^i_{C,P} according to the gradient ∇θ^i_{C,P}; the expression is:
θ^i_{C,P} ← θ^i_{C,P} − α ∇θ^i_{C,P}
Then judge whether the preset number of training rounds has been reached; if not, return to step S3.2; if so, output the personalized classification layer parameters θ^i_{C,P}, completing the personalized learning of the personalized classification layer.
In these expressions, M_E(·) denotes the feature extraction layer, M_{C,P}(·) the personalized classification layer, CEL(·) the cross-entropy loss function, α the learning rate of the personalized classification layer's personalized learning, and D_i the local data set of client i.
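Under the same assumptions as before (PyTorch, the ClientModel sketch above), steps S3.1-S3.5 can be sketched as follows; α and the round count are illustrative, and initializing the personalized head from θ_{C,G} reflects step S2.

    def finetune_personal_head(model, loader, alpha=0.01, rounds=5):
        # S3.1: initialize theta^i_{C,P} (from theta_{C,G}, per S2) and
        # set the local fine-tuning hyperparameters.
        model.personal_head.load_state_dict(model.global_head.state_dict())
        opt = torch.optim.SGD(model.personal_head.parameters(), lr=alpha)
        loss_fn = nn.CrossEntropyLoss()
        for _ in range(rounds):
            for x, y in loader:                      # S3.2: mini-batch (x, y) in D_i
                with torch.no_grad():
                    a = model.feature(x)             # A_{E,x}; theta_E stays fixed
                y_hat = model.personal_head(a)       # S3.3: predicted label
                loss1 = loss_fn(y_hat, y)            # S3.4: CEL(y_hat, y)
                opt.zero_grad()
                loss1.backward()                     # gradient of theta^i_{C,P}
                opt.step()                           # S3.5: theta <- theta - alpha * grad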
For the personalized learning of the gating model, the global model parameters θ_G are used: the gating model parameters θ^i_{Gate} are fine-tuned while the feature extraction layer parameters θ_E and the personalized classification layer parameters θ^i_{C,P} are kept fixed. The specific steps are as follows:
S3.6: randomly initialize the gating model parameters θ^i_{Gate} and set the local fine-tuning hyperparameters;
S3.7: draw a mini-batch (x, y) ∈ D_i from the local data of client i and feed it into the feature extraction layer to obtain the activation value A_{E,x}; the expression is:
A_{E,x} = M_E(x, θ_E)
S3.8: feed the activation value A_{E,x} into the personalized classification layer, the global classification layer and the gating model respectively, and aggregate their outputs to complete the forward pass, obtaining the predicted label ŷ; the expressions are:
gateout = M_{Gate}(A_{E,x}, θ^i_{Gate})
ŷ = gateout · M_C(A_{E,x}, θ_{C,G}) + (1 − gateout) · M_{C,P}(A_{E,x}, θ^i_{C,P})
S3.9: compute the cross-entropy loss2 between the predicted label ŷ and the true label y, and obtain the gradient ∇θ^i_{Gate} of the gating model parameters by backpropagation; the expressions are:
loss2 = CEL(ŷ, y)
∇θ^i_{Gate} = ∂loss2 / ∂θ^i_{Gate}
S3.10: update the gating model parameters θ^i_{Gate} according to the gradient ∇θ^i_{Gate}; the expression is:
θ^i_{Gate} ← θ^i_{Gate} − β ∇θ^i_{Gate}
Then judge whether the preset number of training rounds has been reached; if not, return to step S3.7; if so, output the gating model parameters θ^i_{Gate}, completing the personalized learning of the gating model.
Here gateout is the mixing ratio, obtained by feeding the activation value into the gating model, between the global model output and the personalized classification layer output, with value range [0, 1]; M_Gate(·) denotes the gating model, M_C(·) the global classification layer, and β the learning rate of the gating model's personalized learning.
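A matching sketch of steps S3.6-S3.10 under the same assumptions: both experts and the extractor are frozen, and only the gate parameters θ^i_{Gate} receive gradients through the mixed prediction; β and the round count are illustrative.

    def train_gate(model, loader, beta=0.01, rounds=5):
        # S3.6: the gate was randomly initialized in S2; set the local
        # fine-tuning hyperparameters.
        opt = torch.optim.SGD(model.gate.parameters(), lr=beta)
        loss_fn = nn.CrossEntropyLoss()
        for _ in range(rounds):
            for x, y in loader:                      # S3.7: mini-batch (x, y) in D_i
                with torch.no_grad():
                    a = model.feature(x)             # A_{E,x}
                    out_g = model.global_head(a)     # frozen global expert
                    out_p = model.personal_head(a)   # frozen personalized expert
                gateout = model.gate(a)              # mixing ratio in [0, 1]
                y_hat = gateout * out_g + (1 - gateout) * out_p   # S3.8: forward pass
                loss2 = loss_fn(y_hat, y)            # S3.9: CEL(y_hat, y)
                opt.zero_grad()
                loss2.backward()                     # gradient of theta^i_Gate only
                opt.step()                           # S3.10: theta <- theta - beta * grad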
In addition, the hyperparameters preset for the personalized classification layer or the gating model in this step include the learning rate and the number of training rounds of the personalized learning.
S4: judging whether a preset training round number is reached, if so, completing personalized federal learning by the client; if not, the step S2 is executed.
Further, the personalized federated learning method also comprises the following step: client i takes the task to be classified as input data x′ and computes the predicted probability prob of each class, used to evaluate the personalized learning result, according to the following expressions:
A_{E,x′} = M_E(x′, θ_E)
gateout = M_{Gate}(A_{E,x′}, θ^i_{Gate})
y′ = gateout · M_C(A_{E,x′}, θ_{C,G}) + (1 − gateout) · M_{C,P}(A_{E,x′}, θ^i_{C,P})
prob = softmax(y′)
where y′ denotes the predicted label output for the input data x′ and softmax(·) denotes the softmax function.
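Continuing the same sketch, the inference step then reads:

    def predict(model, x_new):
        # Mix the two experts with the trained gate and return the
        # per-class probabilities prob = softmax(y').
        model.eval()
        with torch.no_grad():
            a = model.feature(x_new)                 # A_{E,x'}
            gateout = model.gate(a)
            y_prime = (gateout * model.global_head(a)
                       + (1 - gateout) * model.personal_head(a))
            return torch.softmax(y_prime, dim=-1)    # prob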
In this embodiment there are multiple clients, each holding its own private local data; each client receives the global model parameters issued by the server and applies them to personalized federated learning. The server in this embodiment is a single federated server responsible for assigning the federated learning task, coordinating the available clients, receiving the global model parameters uploaded by the clients, aggregating them into the latest global model parameters, and issuing these back to the clients.
In this embodiment, the client training model comprises a personalized classification layer, a gating model and a global model, where the global model consists of a global classification layer and a feature extraction layer. Specifically, the output of the feature extraction layer is connected to the inputs of the personalized classification layer, the gating model and the global classification layer; the outputs of these three are connected to the input of an aggregation weighting layer, whose output is the output of the client training model. Fig. 3 is a schematic structural diagram of the client training model of this embodiment.
In this embodiment, the personalized classification layer and the global model serve as the local and global experts forming a mixture-of-experts model, and a gating model is used to combine the experts. Furthermore, the output of the feature extraction layer is used as the input of the gating model, so that the gating model can partition the input data more effectively.
In this embodiment, after the client and the server complete federated learning, the client further fine-tunes the parameters of the personalized classification layer and the gating model, and then trains the gating model separately using the global model and the personalized classification layer, realizing the mixing of the personalized classification layer and the global model and retaining global knowledge while improving personalization ability.
The same or similar reference numerals correspond to the same or similar parts;
the terms describing positional relationships in the drawings are for illustrative purposes only and are not to be construed as limiting the patent;
it should be understood that the above-described embodiments of the present invention are merely examples for clearly illustrating the present invention, and are not intended to limit the embodiments of the present invention. Other variations and modifications will be apparent to persons skilled in the art in light of the above description. And are neither required nor exhaustive of all embodiments. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the claims of the present invention.

Claims (10)

1. A personalized federated learning method based on a mixture-of-experts model, characterized by comprising the following steps:
S1: all clients jointly train the global model through federated learning to obtain the global model parameters θ_G;
S2: each client downloads the global model parameters θ_G from the server, uses them to initialize the parameters of the personalized classification layer, the feature extraction layer and the global classification layer in client i, and randomly initializes the gating model;
S3: client i performs personalized learning: the local data are fed into the feature extraction layer to obtain an activation value, which is then fed into the global classification layer, the personalized classification layer and the gating model respectively; the personalized classification layer and the gating model each perform personalized learning based on the global model parameters θ_G, training to obtain the personalized classification layer parameters θ^i_{C,P} and the gating model parameters θ^i_{Gate};
S4: judging whether the preset number of training rounds has been reached; if so, the personalized classification layer parameters θ^i_{C,P} and the gating model parameters θ^i_{Gate} are obtained and client i has completed personalized federated learning; if not, returning to step S3.
2. The personalized federated learning method as claimed in claim 1, wherein step S1 further comprises: client i performs local training by gradient descent starting from the global model parameters θ_G = (θ_E, θ_{C,G}) issued by the server, obtains the updated global model parameters θ_G = (θ_E, θ_{C,G}), and uploads them to the server, where θ_E denotes the feature extraction layer parameters and θ_{C,G} the global classification layer parameters.
3. The personalized federated learning method as claimed in claim 2, wherein in step S3, based on the global model parameters θ_G = (θ_E, θ_{C,G}) received by client i, the personalized classification layer parameters θ^i_{C,P} are fine-tuned for personalized learning while the feature extraction layer parameters θ_E are kept fixed.
4. The personalized federated learning method as claimed in claim 3, wherein the personalized classification layer performs personalized learning from the global model parameters by the following specific steps:
S3.1: initializing the personalized classification layer parameters θ^i_{C,P} and setting the local fine-tuning hyperparameters;
S3.2: drawing a mini-batch (x, y) ∈ D_i from the local data of client i and feeding it into the feature extraction layer to obtain the activation value A_{E,x};
S3.3: feeding the activation value A_{E,x} into the personalized classification layer to output the predicted label ŷ_P;
S3.4: computing the cross-entropy loss1 between the predicted label ŷ_P and the true label y, and obtaining the gradient ∇θ^i_{C,P} of the personalized classification layer parameters by backpropagation;
S3.5: updating the personalized classification layer parameters θ^i_{C,P} according to the gradient ∇θ^i_{C,P}; judging whether the preset number of training rounds has been reached; if not, returning to step S3.2; if so, outputting the personalized classification layer parameters θ^i_{C,P}, completing the personalized learning of the personalized classification layer.
5. The personalized federated learning method as claimed in claim 4, wherein the personalized classification layer performs personalized learning from the global model parameters according to the following expressions:

A_{E,x} = M_E(x, θ_E)
ŷ_P = M_{C,P}(A_{E,x}, θ^i_{C,P})
loss1 = CEL(ŷ_P, y)
∇θ^i_{C,P} = ∂loss1 / ∂θ^i_{C,P}
θ^i_{C,P} ← θ^i_{C,P} − α ∇θ^i_{C,P}

where M_E(·) denotes the feature extraction layer, M_{C,P}(·) the personalized classification layer, CEL(·) the cross-entropy loss function, α the learning rate of the personalized classification layer's personalized learning, and D_i the local data set of the client.
6. The personalized federated learning method as claimed in claim 5, wherein in step S3, based on the global model parameters θ_G, the gating model parameters θ^i_{Gate} are fine-tuned for personalized learning while the feature extraction layer parameters θ_E and the personalized classification layer parameters θ^i_{C,P} are kept fixed.
7. The personalized federated learning method as claimed in claim 6, wherein the gating model performs personalized learning from the global model parameters by the following steps:
S3.6: randomly initializing the gating model parameters θ^i_{Gate} and setting the local fine-tuning hyperparameters;
S3.7: drawing a mini-batch (x, y) ∈ D_i from the local data of client i and feeding it into the feature extraction layer to obtain the activation value A_{E,x};
S3.8: feeding the activation value A_{E,x} into the personalized classification layer, the global classification layer and the gating model respectively, and aggregating their outputs to complete the forward pass, obtaining the predicted label ŷ;
S3.9: computing the cross-entropy loss2 between the predicted label ŷ and the true label y, and obtaining the gradient ∇θ^i_{Gate} of the gating model parameters by backpropagation;
S3.10: updating the gating model parameters θ^i_{Gate} according to the gradient ∇θ^i_{Gate}; judging whether the preset number of training rounds has been reached; if not, returning to step S3.7; if so, outputting the gating model parameters θ^i_{Gate}, completing the personalized learning of the gating model.
8. The personalized federated learning method as claimed in claim 7, wherein the gating model performs personalized learning from the global model parameters according to the following expressions:

A_{E,x} = M_E(x, θ_E)
gateout = M_{Gate}(A_{E,x}, θ^i_{Gate})
ŷ = gateout · M_C(A_{E,x}, θ_{C,G}) + (1 − gateout) · M_{C,P}(A_{E,x}, θ^i_{C,P})
loss2 = CEL(ŷ, y)
∇θ^i_{Gate} = ∂loss2 / ∂θ^i_{Gate}
θ^i_{Gate} ← θ^i_{Gate} − β ∇θ^i_{Gate}

where gateout is the mixing ratio, produced by feeding the activation value into the gating model, between the global model output and the personalized classification layer output, with value range [0, 1]; M_Gate(·) denotes the gating model, M_C(·) the global classification layer, and β the learning rate of the gating model's personalized learning.
9. The personalized federated learning method as claimed in claim 4 or 7, wherein the hyperparameters comprise a learning rate and a number of training rounds.
10. The personalized federated learning method as claimed in claim 8, further comprising the following step: client i takes the task to be classified as input data x′ and computes the predicted probability prob of each class according to the following expressions:

A_{E,x′} = M_E(x′, θ_E)
gateout = M_{Gate}(A_{E,x′}, θ^i_{Gate})
y′ = gateout · M_C(A_{E,x′}, θ_{C,G}) + (1 − gateout) · M_{C,P}(A_{E,x′}, θ^i_{C,P})
prob = softmax(y′)

where y′ denotes the predicted label output for the input data x′ and softmax(·) denotes the softmax function.
CN202011567011.XA | 2020-12-25 | 2020-12-25 | Personalized federated learning method based on a mixture-of-experts model | Active | granted as CN112560991B (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202011567011.XA | 2020-12-25 | 2020-12-25 | Personalized federated learning method based on a mixture-of-experts model (granted as CN112560991B)

Publications (2)

Publication Number | Publication Date
CN112560991A | 2021-03-26
CN112560991B | 2023-07-07

Family ID: 75033035

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN202011567011.XA | Active; granted as CN112560991B | 2020-12-25 | 2020-12-25

Country Status (1)

Country | Link
CN | CN112560991B (en)


Citations (3)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN111275207A * | 2020-02-10 | 2020-06-12 | Shenzhen Qianhai WeBank Co., Ltd. | Semi-supervision-based horizontal federated learning optimization method, device and storage medium
CN111291897A * | 2020-02-10 | 2020-06-16 | Shenzhen Qianhai WeBank Co., Ltd. | Semi-supervision-based horizontal federated learning optimization method, device and storage medium
CN111310938A * | 2020-02-10 | 2020-06-19 | Shenzhen Qianhai WeBank Co., Ltd. | Semi-supervision-based horizontal federated learning optimization method, device and storage medium


Cited By (11)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN113095238A * | 2021-04-15 | 2021-07-09 | Shandong Institute of Artificial Intelligence | Personalized electrocardiosignal monitoring method based on federated learning
CN113095238B * | 2021-04-15 | 2021-12-28 | Shandong Institute of Artificial Intelligence | Personalized electrocardiosignal monitoring method based on federated learning
CN113537509A * | 2021-06-28 | 2021-10-22 | Southern University of Science and Technology | Collaborative model training method and device
CN113688862A * | 2021-07-09 | 2021-11-23 | Shenzhen University | Brain image classification method based on semi-supervised federated learning, and terminal device
CN113688862B * | 2021-07-09 | 2023-07-04 | Shenzhen University | Brain image classification method based on semi-supervised federated learning, and terminal device
CN114357067A * | 2021-12-15 | 2022-04-15 | South China University of Technology | Personalized federated meta-learning method for data heterogeneity
CN114357067B * | 2021-12-15 | 2024-06-25 | South China University of Technology | Personalized federated meta-learning method for data heterogeneity
CN114429195A * | 2022-01-21 | 2022-05-03 | Tsinghua University | Performance optimization method and device for mixture-of-experts model training
CN114429195B * | 2022-01-21 | 2024-07-19 | Tsinghua University | Performance optimization method and device for mixture-of-experts model training
CN114818996A * | 2022-06-28 | 2022-07-29 | Shandong University | Mechanical fault diagnosis method and system based on federated domain generalization
CN118410851A * | 2024-07-03 | 2024-07-30 | Inspur Electronic Information Industry Co., Ltd. | Mixture-of-experts model routing network optimization method, product, device and medium

Also Published As

Publication number Publication date
CN112560991B (en) 2023-07-07

Similar Documents

Publication Publication Date Title
CN112560991A Personalized federated learning method based on a mixture-of-experts model
Zhang et al. A novel federated learning scheme for generative adversarial networks
Petzka et al. On the regularization of wasserstein gans
CN106779087B A general-purpose machine learning data analysis platform
Liu et al. Ensemble learning via negative correlation
Cao et al. PerFED-GAN: Personalized federated learning via generative adversarial networks
CN106067042B (en) Polarization SAR classification method based on semi-supervised depth sparseness filtering network
CN113191484A (en) Federal learning client intelligent selection method and system based on deep reinforcement learning
CN109983480A (en) Use cluster loss training neural network
CN109754068A (en) Transfer learning method and terminal device based on deep learning pre-training model
CN107145860B (en) Classification of Polarimetric SAR Image method based on spatial information and deep learning
CN109829049A (en) The method for solving video question-answering task using the progressive space-time attention network of knowledge base
CN115907001B (en) Knowledge distillation-based federal graph learning method and automatic driving method
US20220318412A1 (en) Privacy-aware pruning in machine learning
CN116523079A (en) Reinforced learning-based federal learning optimization method and system
Jin et al. Image generation method based on improved condition GAN
CN116348881A (en) Combined mixing model
Cai et al. Multi-granularity weighted federated learning in heterogeneous mobile edge computing systems
Hihn et al. Bounded rational decision-making with adaptive neural network priors
CN106250928A (en) Parallel logic homing method based on Graphics Processing Unit and system
CN116719607A (en) Model updating method and system based on federal learning
CN113449867B (en) Deep reinforcement learning multi-agent cooperation method based on knowledge distillation
Tao et al. Communication efficient federated learning via channel-wise dynamic pruning
Zhang et al. FedCR: Personalized federated learning based on across-client common representation with conditional mutual information regularization
Nawaz et al. K-DUMBs IoRT: Knowledge Driven Unified Model Block Sharing in the Internet of Robotic Things

Legal Events

Code | Title
PB01 | Publication
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant