CN117010484A - Personalized federated learning generalization method, device and application based on attention mechanism - Google Patents
Personalized federated learning generalization method, device and application based on attention mechanism
- Publication number
- CN117010484A (application number CN202311277193.0A)
- Authority
- CN
- China
- Prior art keywords
- parameters
- personalized
- client
- attention mechanism
- sharing
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/098—Distributed learning, e.g. federated learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Computing Systems (AREA)
- Software Systems (AREA)
- Molecular Biology (AREA)
- Computational Linguistics (AREA)
- Biophysics (AREA)
- Biomedical Technology (AREA)
- Mathematical Physics (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Information Transfer Between Computers (AREA)
- Machine Translation (AREA)
Abstract
The application relates to a personalized federated learning generalization method, device and application based on an attention mechanism, comprising the following steps: initializing the shared parameters of the global model, sending them to pre-connected clients, receiving the shared parameters and personalized parameters of each client after local training, and updating the server's shared parameters based on the clients' shared parameters; then sending the personalized parameters of the existing clients and the server's shared parameters to an untrained new client, and generating the new client's personalized parameters with a super network based on the attention mechanism. The new client trains on its local data to update the super network's parameters rather than the local model parameters: the shared part is left unchanged, and the new client's personalized parameters are produced by the learned super network. When constructing the new client's super network, the super network refers to the personalized parameters of all existing models simultaneously, thereby introducing the correlation information among the clients' personalized parameters and improving the final effect.
Description
Technical Field
The application relates to the technical field of artificial intelligence, and in particular to a personalized federated learning generalization method, device and application based on an attention mechanism.
Background
Federated learning trains a general model under the premise of data silos (that is, data is neither exchanged among clients nor uploaded to the server) by sharing the parameters or gradients trained on each client's data, thereby protecting the clients' data privacy. Personalized federated learning is a common federated learning approach: it keeps personalized model parameters for each client to adapt to that client's data distribution and thereby improve the effect of the local model.
Personalized federated learning involves an important issue, namely how to guarantee the generalization of models. In particular, when a client is newly added, especially one with little trainable data, the effect of the new client is often difficult to guarantee. The reason is that when data is scarce, directly training all parameters of the local model easily leads to overfitting and degrades the model's effect.
Chinese patent publication No. CN115600686A discloses a federated learning system based on personalized Transformers, which trains a personalized model for a new client by setting up a super network at the server and distributing randomly initialized embedded vectors to the newly added client for training with its local data. However, the randomly initialized trainable embedded vectors do not converge easily, and the model structure of each client lacks flexibility, being applicable only to local models with an attention layer, such as a Transformer.
Disclosure of Invention
The application aims to overcome the defects of the prior art and provide a personalized federated learning generalization method, device and application based on an attention mechanism, which improve the convergence of a new client by alleviating overfitting and improve the training effect.
The aim of the application can be achieved by the following technical scheme:
The application provides a personalized federated learning generalization method based on an attention mechanism, applied to a server and comprising the following steps:
initializing the shared parameters of the global model, sending the shared parameters to at least one client with which a connection has been pre-established, receiving and storing the shared parameters and personalized parameters of each client after local training, and updating the shared parameters of the server based on the shared parameters of each client, these steps being executed a plurality of times until a termination condition is reached;
and sending the personalized parameters of all existing clients and the shared parameters of the server to an untrained new client, generating the personalized parameters at the new client with a super network based on the attention mechanism, training the super network on the new client's local data, and thereby completing the local update of the new client's personalized parameters.
As a preferable technical scheme, the termination condition is that the communication round reaches a preset value.
As a preferable technical scheme, the input of the super network is the personalized parameters of each existing client, and the output is the personalized parameters of the new client.
As a preferred technical solution, the super network based on the attention mechanism includes:
a fully connected layer for generating hidden vectors;
a plurality of normalization layers and a plurality of self-attention layers arranged between the normalization layers for generating personalized parameters of the new client according to the hidden vectors.
As a preferable technical scheme, the new client adopts the shared parameters of the server as its shared parameters.
As a preferable technical scheme, the method further comprises the following steps:
receiving the shared parameters and personalized parameters of a plurality of clients, including the new client, after parameter initialization, and updating the shared parameters of the server in a weighted manner based on the clients' shared parameters.
As a preferable technical scheme, the shared parameters of the server are updated through weighted aggregation based on the shared parameters of the respective clients.
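For concreteness, a weighted aggregation of the form commonly used in federated learning is consistent with this description; writing n_i for the amount of training data at client i and θ_s^i for the shared parameters uploaded by client i (notation introduced here for illustration, not taken from the patent text), the update is

θ_s ← Σ_{i=1..N} (n_i / Σ_{j=1..N} n_j) · θ_s^i.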
In another aspect of the present application, there is provided a personalized federated learning generalization method based on an attention mechanism, applied to an untrained new client and comprising the following steps:
receiving the personalized parameters of a plurality of locally trained clients and the globally aggregated shared parameters of the server;
training on local data to update the parameters of the super network based on the attention mechanism, generating the personalized parameters of the new client with the trained super network from the personalized parameters of the plurality of locally trained clients, and taking the globally aggregated shared parameters of the server as the shared parameters of the new client;
uploading the updated personalized parameters and the updated shared parameters to the server.
In another aspect of the present application, there is provided an electronic device, comprising: one or more processors and a memory, the memory having stored therein one or more programs comprising instructions for performing the personalized federated learning generalization method based on an attention mechanism described above.
The application further provides an application of the personalized federated learning generalization method based on the attention mechanism to an Internet of Vehicles comprising a server and at least one vehicle-mounted terminal; the method is applied to the server, the server is deployed with a global model, each vehicle-mounted terminal is deployed with a local model comprising shared parameters and personalized parameters, and the vehicle-mounted terminal further comprises a super network for generating the personalized parameters when it joins the Internet of Vehicles.
Compared with the prior art, the application has the following advantages:
(1) The convergence of new-client training is improved and the training effect is improved: compared with schemes in which a common global average model is used as the initialization model of a new client and local training is then performed directly, the application generates the personalized parameters of the new client with a super network based on an attention mechanism, which ensures fast convergence of the new client's model, avoids the overfitting caused by the lack of data in local training, and retains the generalization capability that the global model derives from its wide data coverage. Unlike existing schemes that distribute an embedded vector to each client for training, the input to the super network is the personalized parameters of the already-trained clients, so training converges easily.
(2) The method is suitable for scenarios with diverse client model structures and has strong applicability: unlike some existing schemes that restrict the clients to a particular network structure, the local model structure of each client is not limited and may for example be a CNN, a Transformer or another structure; the personalized layer in the network structure serves as the output of the super network, so local training at the client is more flexible and is not constrained by computing conditions and the like.
Drawings
FIG. 1 is a flowchart of a federal learning generalization method applied to a server in an embodiment;
FIG. 2 is a schematic diagram of a super network in an embodiment;
FIG. 3 is a flow chart of a federal learning generalization method applied to new clients in an embodiment;
FIG. 4 is a flowchart of a parameter update process of an existing client in an embodiment;
fig. 5 is a schematic structural diagram of an electronic device in an embodiment.
Detailed Description
The following description of the embodiments of the present application will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the present application without making any inventive effort, shall fall within the scope of the present application.
The features of the following examples and embodiments may be combined with each other without any conflict.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It should also be noted that the terms "comprises", "comprising", or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in a process, method, article or apparatus that comprises the element.
Example 1
In order to solve, or partially solve, the prior-art problem that the model effect of a newly added client in federated learning is difficult to guarantee, this embodiment provides a personalized federated learning generalization method based on an attention mechanism, applied to a server. The method builds on an attention mechanism over model-weight similarity: a super network based on the attention mechanism over model-parameter correlation is used at the new client, and the models of the original clients are aggregated and trained to obtain the model parameters of the new client.
In this embodiment, there are N clients, one server, and K communication rounds.
Referring to fig. 1, the method comprises the steps of:
S1: randomly initialize the shared parameters θ_s of the server's global model and the personalized parameters {θ_p^1, θ_p^2, …, θ_p^N} of the clients;
S2: the server sends the initialized shared parameters θ_s to each client;
S3: each client i receives the shared parameters θ_s and updates its local parameters (both the shared part θ_s^i and the personalized part θ_p^i) on its local data, yielding {θ_s^1, θ_s^2, …, θ_s^N} and {θ_p^1, θ_p^2, …, θ_p^N};
S4: each client uploads its updated local parameters θ_s^i and θ_p^i to the server;
S5: the server receives the parameters uploaded by the clients and performs weighted aggregation of {θ_s^1, θ_s^2, …, θ_s^N} according to each client's amount of training data to obtain a new θ_s; return to step S2 until the number of rounds reaches the preset communication round K;
S6: when a new client N+1 joins the training, the server sends its stored shared parameters θ_s and the personalized parameters {θ_p^1, θ_p^2, …, θ_p^N} to the new client;
S7: a super network based on an attention mechanism is constructed at the new client to generate the local model parameters, and the super network is trained on the local data to obtain the parameters of the local model;
S8: the parameters of the local model are sent to the server, which receives the model parameters of each client and performs weighted aggregation according to each client's amount of training data.
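For illustration only, the following is a minimal Python sketch of the server-side loop in steps S1–S8. The function and attribute names (train_locally, weighted_aggregate, the dict-of-tensors parameter format) are assumptions introduced for the example and are not taken from the patent text.

```python
import copy

def weighted_aggregate(shared_list, data_sizes):
    """Step S5: weighted average of the clients' shared parameters,
    weighted by each client's amount of training data."""
    total = sum(data_sizes)
    agg = {k: 0.0 for k in shared_list[0]}
    for shared, n in zip(shared_list, data_sizes):
        for k, v in shared.items():
            agg[k] = agg[k] + (n / total) * v
    return agg

def run_server(clients, init_shared, init_personalized, rounds_k):
    theta_s = copy.deepcopy(init_shared)          # S1: shared parameters of the global model
    personalized = list(init_personalized)        # S1: personalized parameters, one per client
    for _ in range(rounds_k):                     # repeat until the communication round reaches K
        shared_updates, data_sizes = [], []
        for i, client in enumerate(clients):      # S2: broadcast theta_s to every client
            theta_s_i, theta_p_i, n_i = client.train_locally(theta_s)  # S3/S4: local update and upload
            shared_updates.append(theta_s_i)
            personalized[i] = theta_p_i
            data_sizes.append(n_i)
        theta_s = weighted_aggregate(shared_updates, data_sizes)       # S5: weighted aggregation
    # theta_s and personalized are stored and later sent to a new client (step S6)
    return theta_s, personalized
```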
Referring to fig. 2, which shows a schematic structural diagram of the attention-based super network: the input of the model is the personalized parameters {θ_p^1, θ_p^2, …, θ_p^N} of the existing clients, and the output is the personalized parameters θ_p^(N+1) of the new client N+1. The model comprises a fully connected layer, normalization layer 1, self-attention layer 1, normalization layer 2, self-attention layer 2 and normalization layer 3 connected in sequence. The fully connected layer generates, from the personalized parameters of the existing clients, a number of hidden vectors matching the number of existing clients. It is emphasized that the kind and number of layers in this embodiment may vary; for example, a structure with several groups of normalization layers and self-attention layers may be used.
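A minimal PyTorch sketch of this structure is given below, under assumptions the text does not fix: each client's personalized parameters are flattened into a vector of dimension param_dim, the attention output is mean-pooled over the existing clients, and a final linear layer maps back to param_dim (the pooling and output projection are illustrative choices, not taken from the patent).

```python
import torch
import torch.nn as nn

class AttentionHyperNetwork(nn.Module):
    def __init__(self, param_dim: int, hidden_dim: int = 256, num_heads: int = 4):
        super().__init__()
        self.fc = nn.Linear(param_dim, hidden_dim)    # fully connected layer -> hidden vectors
        self.norm1 = nn.LayerNorm(hidden_dim)         # normalization layer 1
        self.attn1 = nn.MultiheadAttention(hidden_dim, num_heads, batch_first=True)  # self-attention layer 1
        self.norm2 = nn.LayerNorm(hidden_dim)         # normalization layer 2
        self.attn2 = nn.MultiheadAttention(hidden_dim, num_heads, batch_first=True)  # self-attention layer 2
        self.norm3 = nn.LayerNorm(hidden_dim)         # normalization layer 3
        self.out = nn.Linear(hidden_dim, param_dim)   # assumed output projection back to parameter space

    def forward(self, client_params: torch.Tensor) -> torch.Tensor:
        # client_params: (N, param_dim), one flattened personalized-parameter vector per existing client
        h = self.fc(client_params).unsqueeze(0)       # (1, N, hidden_dim), batch of one
        h = self.norm1(h)
        h, _ = self.attn1(h, h, h)                    # attend across the N existing clients
        h = self.norm2(h)
        h, _ = self.attn2(h, h, h)
        h = self.norm3(h)
        pooled = h.mean(dim=1)                        # aggregate over the N clients (assumption)
        return self.out(pooled).squeeze(0)            # (param_dim,): personalized parameters of the new client
```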
Referring to fig. 4, the parameter updating process of the existing client includes the following steps:
S1: receive and update the shared parameters θ_s;
S2: update the local parameters (both the shared part θ_s^i and the personalized part θ_p^i) on the local data;
S3: upload the updated local parameters θ_s^i and θ_p^i to the server.
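As a rough sketch of this client-side update, assuming a PyTorch model whose parameters are split into a `shared` sub-module and a `personalized` sub-module (names chosen for the example, not prescribed by the text):

```python
import torch

def client_local_update(model, theta_s, local_loader, epochs=1, lr=0.01):
    model.shared.load_state_dict(theta_s)                    # S1: receive and update the shared parameters
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)   # S2: train shared + personalized parts locally
    loss_fn = torch.nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in local_loader:
            optimizer.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()
            optimizer.step()
    # S3: return the updated local parameters for upload to the server
    return model.shared.state_dict(), model.personalized.state_dict()
```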
The method takes the relations between clients into account; in particular it introduces an attention mechanism, and the input of one and the same super network is the personalized parameters of the several original clients, from which the personalized parameters of the new client are generated.
To illustrate the advantages of the method, a federated learning server-side update algorithm is provided below as a comparative example, which specifically comprises the following steps:
step1, randomly initializing parameters of a global model;
Step2, sending global model parameters to each client;
step3, the client receives the global parameters and updates the local parameters;
Step4, the server receives the parameters of each client, performs weighted aggregation according to each client's amount of training data, and jumps to Step2 until the number of rounds reaches the preset communication round K.
Therefore, compared with using a common global average model as the initialization model of a new client and then directly performing local training, the application uses the attention-based super network to ensure fast convergence of the new client's model, avoid the overfitting caused by the lack of data in local training, and retain the generalization capability that the global model derives from its wide data coverage. The reason is that if the initialized client model is directly trained on its complete parameters, the client's overall model moves towards a local optimum biased to the local data distribution; when the local data is scarce, this optimum is far from the global optimum, which harms the effect of the local model. With the present method, however, the input of the super network is the models of the other clients, so the output model is constrained by the global training result; the overfitting phenomenon is greatly mitigated and the convergence of the new client is still guaranteed.
In a specific application scenario, for an Internet of Vehicles comprising a server and at least one vehicle-mounted terminal, the above personalized federated learning generalization method is applied to the server; the server is deployed with a global model, each vehicle-mounted terminal is deployed with a local model comprising shared parameters and personalized parameters, and the vehicle-mounted terminal further comprises a super network for generating the personalized parameters when it joins the Internet of Vehicles.
When the super network of the new client is constructed, it refers to the personalized parameters of all existing models simultaneously, so the correlation information among the clients' personalized parameters is introduced and the final effect is improved. This differs from previous solutions, which do not consider the correlation between models during training.
Example 2
Based on embodiment 1 and referring to fig. 3, the present embodiment provides a personalized federated learning generalization method based on an attention mechanism, applied to a new (i.e. untrained) client; the method comprises the following steps:
S1: receive the personalized parameters {θ_p^1, θ_p^2, …, θ_p^N} of the existing client models and the weighted-aggregated shared parameters θ_s of the server;
S2: train on the local data to update the super-network parameters and thereby obtain the personalized-layer parameters of the local model, while the other layers use the globally averaged shared parameters;
S3: the super network based on the attention mechanism is constructed with the personalized parameters {θ_p^1, θ_p^2, …, θ_p^N} of the existing client models as its input and the personalized-layer parameters θ_p^(N+1) of the local model as its output;
S4: upload the updated parameters to the server.
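A minimal sketch of this procedure, reusing the AttentionHyperNetwork sketch from embodiment 1, might look as follows. The helper forward_with_personalized (a functional forward pass that uses the generated personalized-layer parameters, so that gradients flow back into the super network) and the flattened-parameter format are assumptions made for the example.

```python
import torch

def new_client_update(local_model, hyper_net, existing_personalized, theta_s,
                      local_loader, epochs=1, lr=1e-3):
    # S1: the other layers of the local model use the globally aggregated shared parameters
    local_model.shared.load_state_dict(theta_s)
    # Stack the existing clients' flattened personalized parameters: shape (N, param_dim)
    params_in = torch.stack([p.detach() for p in existing_personalized])
    # S2: only the super network's own parameters are optimized
    optimizer = torch.optim.Adam(hyper_net.parameters(), lr=lr)
    loss_fn = torch.nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in local_loader:
            optimizer.zero_grad()
            theta_p_new = hyper_net(params_in)                              # S3: generate personalized-layer parameters
            logits = local_model.forward_with_personalized(x, theta_p_new)  # assumed functional forward pass
            loss = loss_fn(logits, y)
            loss.backward()                                                 # gradients reach hyper_net via theta_p_new
            optimizer.step()
    # S4: parameters to upload to the server
    return local_model.shared.state_dict(), hyper_net(params_in).detach()
```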
To illustrate the advantages of the present method, a federated learning client-side update algorithm is provided below as a comparative example, which specifically comprises the following steps:
Step31, receiving the global model parameters as the local model parameters, while the personalized layer retains its original parameters;
step32, training and updating the local model according to the local data to obtain updated local model parameters;
step33, transmitting the updated local parameters except the personalized layer to the server.
The application uses the super network based on the attention mechanism to ensure the rapid convergence of the new client model, avoid the overfitting caused by the lack of data in local training, and retain the generalization capability that the global model derives from its wide data coverage.
Example 3
The present embodiment provides an electronic device, comprising: one or more processors and a memory, the memory having stored therein one or more programs comprising computer program instructions for performing the personalized federated learning generalization method based on an attention mechanism as described in embodiment 1 or embodiment 2.
The method or apparatus set forth in the above embodiments may be implemented in particular by a computer chip or entity, or by a product having some function. One typical implementation is a computer. In particular, the computer may be, for example, a personal computer, a laptop computer, a cellular telephone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
Referring to fig. 5, a schematic structural diagram of the electronic device is shown; it comprises a processor, an internal bus, a network interface, a memory and a non-volatile memory, and may include hardware required by other services. The non-volatile memory stores instructions for executing the personalized federated learning generalization method of embodiment 1 or embodiment 2; the processor reads the corresponding computer program from the non-volatile memory into the memory and then runs it to implement the method described with reference to fig. 1. Of course, other implementations, such as logic devices or combinations of hardware and software, are not excluded by the present application; that is, the execution subject of the processing flows is not limited to logic units and may also be hardware or logic devices.
The memory may include volatile memory in a computer-readable medium, random Access Memory (RAM) and/or nonvolatile memory, such as Read Only Memory (ROM) or flash memory (flash RAM). Memory is an example of computer-readable media.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transmission media), such as modulated data signals and carrier waves.
Example 4
The present embodiment provides a computer-readable storage medium comprising one or more programs for execution by one or more processors of an electronic device, the one or more programs comprising computer program instructions for performing the personalized federated learning generalization method based on an attention mechanism as described in embodiment 1 or embodiment 2.
When executing the personalized federated learning generalization method of embodiment 1, the computer program instructions perform:
S1: randomly initialize the shared parameters θ_s of the server's global model and the personalized parameters {θ_p^1, θ_p^2, …, θ_p^N} of the clients;
S2: the server sends the initialized shared parameters θ_s to each client;
S3: receive the local parameters {θ_s^1, θ_s^2, …, θ_s^N} and {θ_p^1, θ_p^2, …, θ_p^N} updated by the clients;
S4: the server receives the parameters uploaded by each client and performs weighted aggregation of {θ_s^1, θ_s^2, …, θ_s^N} according to each client's amount of training data to obtain a new θ_s; return to step S2 until the number of rounds reaches the preset communication round K;
S5: when a new client N+1 joins the training, send the shared parameters θ_s and the personalized parameters {θ_p^1, θ_p^2, …, θ_p^N} stored at the server to the new client;
S6: a super network based on an attention mechanism is constructed at the new client to generate the local model parameters and is trained on local data to obtain the parameters of the local model; the server receives the parameters of the local model and performs weighted aggregation according to each client's amount of training data based on the model parameters of each client.
When executing the personalized federated learning generalization method of embodiment 2, the computer program instructions perform:
S1: receive the personalized parameters {θ_p^1, θ_p^2, …, θ_p^N} of the existing client models and the weighted-aggregated shared parameters θ_s of the server;
S2: train on the local data to update the super-network parameters and thereby obtain the personalized-layer parameters of the local model, while the other layers use the globally averaged shared parameters;
S3: the super network based on the attention mechanism is constructed with the personalized parameters {θ_p^1, θ_p^2, …, θ_p^N} of the existing client models as its input and the personalized-layer parameters θ_p^(N+1) of the local model as its output;
S4: upload the updated parameters to the server.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
Example 5
This embodiment provides a personalized federated learning generalization system based on an attention mechanism, which comprises N clients and one server.
Wherein the server is used for executing the following procedures:
S1: randomly initialize the shared parameters θ_s of the server's global model and the personalized parameters {θ_p^1, θ_p^2, …, θ_p^N} of the clients;
S2: the server sends the initialized shared parameters θ_s to each client;
S3: receive the local parameters {θ_s^1, θ_s^2, …, θ_s^N} and {θ_p^1, θ_p^2, …, θ_p^N} updated by the clients;
S4: the server receives the parameters uploaded by each client and performs weighted aggregation of {θ_s^1, θ_s^2, …, θ_s^N} according to each client's amount of training data to obtain a new θ_s; return to step S2 until the number of rounds reaches the preset communication round K.
when the system has new entriesClient terminalThe server is further configured to perform:
S5: when the new client N+1 joins the training, send the shared parameters θ_s and the personalized parameters {θ_p^1, θ_p^2, …, θ_p^N} stored at the server to the new client;
S6: a super network based on an attention mechanism is constructed at the new client to generate the local model parameters and is trained on local data to obtain the parameters of the local model; the server receives the parameters of the local model and performs weighted aggregation according to each client's amount of training data based on the model parameters of each client.
The client is used for executing the following processes:
S1: receive the personalized parameters {θ_p^1, θ_p^2, …, θ_p^N} of the existing client models and the weighted-aggregated shared parameters θ_s of the server;
S2: train on the local data to update the super-network parameters and thereby obtain the personalized-layer parameters of the local model, while the other layers use the globally averaged shared parameters;
S3: the super network based on the attention mechanism is constructed with the personalized parameters {θ_p^1, θ_p^2, …, θ_p^N} of the existing client models as its input and the personalized-layer parameters θ_p^(N+1) of the local model as its output;
S4: upload the updated parameters to the server.
While the application has been described with reference to certain preferred embodiments, it will be understood by those skilled in the art that various changes and equivalent substitutions may be made without departing from the scope of the application. Therefore, the protection scope of the application is defined by the claims.
Claims (10)
1. A personalized federated learning generalization method based on an attention mechanism, characterized by being applied to a server and comprising the following steps:
initializing the shared parameters of the global model, sending the shared parameters to at least one client with which a connection has been pre-established, receiving and storing the shared parameters and personalized parameters of each client after local training, and updating the shared parameters of the server based on the shared parameters of each client, these steps being executed a plurality of times until a termination condition is reached;
and sending the personalized parameters of all existing clients and the shared parameters of the server to an untrained new client, generating the personalized parameters at the new client with a super network based on the attention mechanism, training the super network on the new client's local data, and thereby completing the local update of the new client's personalized parameters.
2. The personalized federated learning generalization method based on an attention mechanism according to claim 1, wherein the termination condition is that the communication round reaches a preset value.
3. The personalized federated learning generalization method based on an attention mechanism according to claim 1, wherein the input of the super network is the personalized parameters of each existing client, and the output is the personalized parameters of the new client.
4. The personalized federated learning generalization method based on an attention mechanism according to claim 1, wherein the super network based on the attention mechanism comprises:
a fully connected layer for generating hidden vectors;
a plurality of normalization layers and a plurality of self-attention layers arranged between the normalization layers for generating personalized parameters of the new client according to the hidden vectors.
5. The personalized federated learning generalization method based on an attention mechanism according to claim 1, wherein the new client adopts the shared parameters of the server as its shared parameters.
6. The personalized federated learning generalization method based on an attention mechanism according to claim 1, further comprising the following steps:
receiving the shared parameters and personalized parameters of a plurality of clients, including the new client, after parameter initialization, and updating the shared parameters of the server in a weighted manner based on the clients' shared parameters.
7. The personalized federated learning generalization method based on an attention mechanism according to claim 1, wherein the shared parameters of the server are updated through weighted aggregation based on the shared parameters of the respective clients.
8. A personalized federated learning generalization method based on an attention mechanism, characterized by being applied to an untrained new client and comprising the following steps:
receiving the personalized parameters of a plurality of locally trained clients and the globally aggregated shared parameters of the server;
training on local data to update the parameters of the super network based on the attention mechanism, generating the personalized parameters of the new client with the trained super network from the personalized parameters of the plurality of locally trained clients, and taking the globally aggregated shared parameters of the server as the shared parameters of the new client;
uploading the updated personalized parameters and the updated shared parameters to the server.
9. An application of the personalized federated learning generalization method based on an attention mechanism according to any one of claims 1 to 8, wherein, for an Internet of Vehicles comprising a server and at least one vehicle-mounted terminal, the personalized federated learning generalization method is applied to the server, the server is deployed with a global model, the vehicle-mounted terminal is deployed with a local model comprising shared parameters and personalized parameters, and the vehicle-mounted terminal further comprises a super network for generating the personalized parameters when it joins the Internet of Vehicles.
10. An electronic device, comprising: one or more processors and a memory, the memory having stored therein one or more programs, the one or more programs comprising instructions for performing the personalized federated learning generalization method based on an attention mechanism according to any one of claims 1 to 8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311277193.0A CN117010484B (en) | 2023-10-07 | 2023-10-07 | Personalized federal learning generalization method, device and application based on attention mechanism |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311277193.0A CN117010484B (en) | 2023-10-07 | 2023-10-07 | Personalized federal learning generalization method, device and application based on attention mechanism |
Publications (2)
Publication Number | Publication Date |
---|---|
CN117010484A true CN117010484A (en) | 2023-11-07 |
CN117010484B CN117010484B (en) | 2024-01-26 |
Family
ID=88562183
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311277193.0A Active CN117010484B (en) | 2023-10-07 | 2023-10-07 | Personalized federal learning generalization method, device and application based on attention mechanism |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117010484B (en) |
Patent Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2021115480A1 (en) * | 2020-06-30 | 2021-06-17 | 平安科技(深圳)有限公司 | Federated learning method, device, equipment, and storage medium |
CN112329940A (en) * | 2020-11-02 | 2021-02-05 | 北京邮电大学 | Personalized model training method and system combining federal learning and user portrait |
WO2023284387A1 (en) * | 2021-07-15 | 2023-01-19 | 卡奥斯工业智能研究院(青岛)有限公司 | Model training method, apparatus, and system based on federated learning, and device and medium |
CN113297396A (en) * | 2021-07-21 | 2021-08-24 | 支付宝(杭州)信息技术有限公司 | Method, device and equipment for updating model parameters based on federal learning |
CN115169575A (en) * | 2022-06-23 | 2022-10-11 | 深圳前海环融联易信息科技服务有限公司 | Personalized federal learning method, electronic device and computer readable storage medium |
CN115086399A (en) * | 2022-07-28 | 2022-09-20 | 深圳前海环融联易信息科技服务有限公司 | Federal learning method and device based on hyper network and computer equipment |
CN115840900A (en) * | 2022-09-16 | 2023-03-24 | 河海大学 | Personalized federal learning method and system based on self-adaptive clustering layering |
CN115600686A (en) * | 2022-10-18 | 2023-01-13 | 上海科技大学(Cn) | Personalized Transformer-based federal learning model training method and federal learning system |
CN116227623A (en) * | 2023-01-29 | 2023-06-06 | 深圳前海环融联易信息科技服务有限公司 | Federal learning method, federal learning device, federal learning computer device, and federal learning storage medium |
Non-Patent Citations (1)
Title |
---|
GAO Yujia et al.: "基于注意力增强元学习网络的个性化联邦学习方法" [Personalized federated learning method based on attention-enhanced meta-learning network], 计算机研究与发展 (Journal of Computer Research and Development) *
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117892805A (en) * | 2024-03-18 | 2024-04-16 | 清华大学 | Personalized federal learning method based on supernetwork and hierarchy collaborative graph aggregation |
CN117892805B (en) * | 2024-03-18 | 2024-05-28 | 清华大学 | Personalized federal learning method based on supernetwork and hierarchy collaborative graph aggregation |
Also Published As
Publication number | Publication date |
---|---|
CN117010484B (en) | 2024-01-26 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN117010484B (en) | Personalized federal learning generalization method, device and application based on attention mechanism | |
CN113297396B (en) | Method, device and equipment for updating model parameters based on federal learning | |
CN110942154A (en) | Data processing method, device, equipment and storage medium based on federal learning | |
CN110210514B (en) | Generative confrontation network training method, image completion method, device and storage medium | |
CN113240127B (en) | Training method and device based on federal learning, electronic equipment and storage medium | |
CN112084017B (en) | Memory management method and device, electronic equipment and storage medium | |
CN110874650B (en) | Alliance learning method, device and system fusing public domain data and private data | |
CN113361618A (en) | Industrial data joint modeling method and system based on federal learning | |
CN116911403B (en) | Federal learning server and client integrated training method and related equipment | |
CN113469206B (en) | Method, device, equipment and storage medium for acquiring artificial intelligent model | |
CN112990299A (en) | Depth map acquisition method based on multi-scale features, electronic device and storage medium | |
CN116486493A (en) | Living body detection method, device and equipment | |
CN113808157B (en) | Image processing method and device and computer equipment | |
CN116629381A (en) | Federal migration learning method and device, storage medium and electronic equipment | |
CN114638998A (en) | Model updating method, device, system and equipment | |
Izumi et al. | Distributed Hybrid Controllers for Multi‐Agent Mass Games By A Variable Number of Player Agents | |
CN110443746B (en) | Picture processing method and device based on generation countermeasure network and electronic equipment | |
CN112817898A (en) | Data transmission method, processor, chip and electronic equipment | |
CN114091649A (en) | Data processing method, device and equipment | |
CN116611536B (en) | Model training method and device, electronic equipment and storage medium | |
CN110390354B (en) | Prediction method and device for defense capability of deep network | |
CN115760563A (en) | Image super-resolution model training method and device and computer-readable storage medium | |
CN117729555B (en) | Air base station deployment method, cooperative system and related equipment | |
CN112863497B (en) | Method and device for speech recognition, electronic equipment and computer readable storage medium | |
CN118799419A (en) | Training method and device for simulated user image generation model |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||