CN117010484B - Personalized federated learning generalization method, device and application based on attention mechanism - Google Patents

Personalized federated learning generalization method, device and application based on attention mechanism

Info

Publication number
CN117010484B
CN117010484B
Authority
CN
China
Prior art keywords
parameters
personalized
client
attention mechanism
sharing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311277193.0A
Other languages
Chinese (zh)
Other versions
CN117010484A (en)
Inventor
张璐 (Zhang Lu)
杨耀 (Yang Yao)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Lab
Original Assignee
Zhejiang Lab
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Lab
Priority to CN202311277193.0A
Publication of CN117010484A
Application granted
Publication of CN117010484B
Legal status: Active


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/098 Distributed learning, e.g. federated learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Information Transfer Between Computers (AREA)
  • Machine Translation (AREA)

Abstract

The invention relates to a personalized federated learning generalization method, device and application based on an attention mechanism, comprising the following steps: initializing the shared parameters of the global model, sending the shared parameters to pre-connected clients, receiving the shared parameters and personalized parameters of each client after local training, and updating the server's shared parameters based on each client's shared parameters; then sending the personalized parameters of the existing clients and the server's shared parameters to an untrained new client, and generating personalized parameters at the new client with an attention-based hypernetwork. The new client trains on local data to update the hypernetwork's parameters rather than the local model's parameters directly: the shared-parameter part is left unchanged, and the new client's personalized parameters are generated through hypernetwork learning. When the new client's hypernetwork is constructed, it attends to the personalized parameters of all existing models simultaneously, introducing correlation information among the clients' personalized parameters and improving the final result.

Description

Personalized federated learning generalization method, device and application based on attention mechanism
Technical Field
The invention relates to the technical field of artificial intelligence, and in particular to a personalized federated learning generalization method, device and application based on an attention mechanism.
Background
Federated learning trains a common model under the constraint of data silos (i.e., data is not exchanged between clients or uploaded to the server) by sharing the parameters or gradients trained on each client's data, thereby protecting client data privacy. Personalized federated learning is a common variant of federated learning that retains personalized model parameters for each client's distinct data distribution, adapting to that distribution and improving the local model's performance.
An important issue in personalized federated learning is how to guarantee model generalization. In particular, when a new client joins, especially one with little trainable data, the new client's model performance is often hard to guarantee: with scarce data, directly training all of the local model's parameters easily leads to overfitting, degrading the model.
Chinese patent publication No. CN115600686A discloses a federated learning system based on personalized Transformers, which trains a personalized model for a new client by setting up a hypernetwork at the server and distributing randomly initialized embedding vectors to the newly added client for training with local data. However, randomly initialized trainable embedding vectors do not converge easily, and the model structure of each client lacks flexibility: the approach only applies to local models with an attention layer, such as a Transformer.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provide a personalized federated learning generalization method, device and application based on an attention mechanism, which improve the convergence of a new client by alleviating overfitting and improve the training effect.
The aim of the invention can be achieved by the following technical scheme:
The invention provides a personalized federated learning generalization method based on an attention mechanism, applied to a server, comprising the following steps:
initializing the shared parameters of the global model, sending the shared parameters to at least one pre-connected client, receiving and storing the shared parameters and personalized parameters of each client after local training, and updating the server's shared parameters based on each client's shared parameters; these steps are executed repeatedly until a termination condition is reached;
sending the personalized parameters of all existing clients and the server's shared parameters to an untrained new client, generating personalized parameters at the new client with an attention-based hypernetwork, and training the hypernetwork on the new client's local data, completing the local update of the new client's personalized parameters.
As a preferred technical solution, the termination condition is that the number of communication rounds reaches a preset value.
As a preferred technical solution, the input of the hypernetwork is the personalized parameters of each existing client, and the output is the personalized parameters of the new client.
As a preferred technical solution, the attention-based hypernetwork comprises:
a fully connected layer for generating hidden vectors;
a plurality of normalization layers, with self-attention layers arranged between them, for generating the personalized parameters of the new client from the hidden vectors.
As a preferred technical solution, the new client adopts the server's shared parameters as its own shared parameters.
As a preferred technical solution, the method further comprises the following step:
receiving the shared parameters and personalized parameters of a plurality of clients, including the new client whose parameters have been initialized, and updating the server's shared parameters based on a weighted combination of the clients' shared parameters.
As a preferred technical solution, the server's shared parameters are updated by weighted aggregation of the clients' shared parameters.
In another aspect of the present invention, there is provided a personalized federated learning generalization method based on an attention mechanism, applied to an untrained new client, comprising the following steps:
receiving the personalized parameters of a plurality of locally trained clients and the server's shared parameters obtained by weighted aggregation;
training on local data to update the parameters of the attention-based hypernetwork, generating the new client's personalized parameters with the trained hypernetwork from the personalized parameters of the locally trained clients, and adopting the server's aggregated shared parameters as the new client's shared parameters;
uploading the updated personalized parameters and shared parameters to the server.
In another aspect of the present invention, there is provided an electronic device comprising one or more processors and a memory, the memory storing one or more programs comprising instructions for performing the personalized federated learning generalization method based on an attention mechanism described above.
The invention further provides an application of the personalized federated learning generalization method based on the attention mechanism, applied to a server and at least one vehicle-mounted terminal in an Internet of Vehicles comprising the server, wherein the server is deployed with a global model and each vehicle-mounted terminal is deployed with a local model comprising shared parameters and personalized parameters; the vehicle-mounted terminal further comprises a hypernetwork for generating the personalized parameters when joining the vehicle network.
Compared with the prior art, the invention has the following advantages:
(1) The convergence of new-client training is improved, along with the training effect: compared with schemes that initialize a new client's model from an ordinary global average model and then train locally, the invention generates the new client's personalized parameters with an attention-based hypernetwork. This ensures rapid convergence of the new client's model, avoids the overfitting caused by data scarcity in local training, and preserves the generalization ability the global model gains from its broad data coverage. Unlike the existing scheme of distributing a trainable embedding vector to each client, the hypernetwork's training input is the personalized parameters of the already-trained clients, so it converges easily.
(2) The method suits scenarios with diverse client model structures and has strong applicability: unlike some existing schemes that restrict clients to a particular network structure, the invention does not constrain each client's local model structure, which may be, for example, a CNN, a Transformer, or another architecture. A personalized layer in the network serves as the hypernetwork's output, so the clients' local training can be more flexible and is not limited by computational conditions and the like.
Drawings
FIG. 1 is a flowchart of a federal learning generalization method applied to a server in an embodiment;
FIG. 2 is a schematic diagram of a super network in an embodiment;
FIG. 3 is a flow chart of a federal learning generalization method applied to new clients in an embodiment;
FIG. 4 is a flowchart of a parameter update process of an existing client in an embodiment;
FIG. 5 is a schematic structural diagram of an electronic device in an embodiment.
Detailed Description
The following description of the embodiments of the present invention is made clearly and completely with reference to the accompanying drawings; evidently, the described embodiments are some, but not all, embodiments of the invention. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without inventive effort shall fall within the scope of the present invention.
The features of the following examples and embodiments may be combined with each other without any conflict.
It will be appreciated by those skilled in the art that embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element preceded by the phrase "comprising a …" does not exclude the presence of additional identical elements in the process, method, article, or apparatus that comprises the element.
Embodiment 1
To solve, or partially solve, the prior-art problem that the model performance of a newly added client in federated learning is hard to guarantee, this embodiment provides a personalized federated learning generalization method based on an attention mechanism, applied to a server. The method uses, at the new client, a hypernetwork whose attention mechanism captures the correlation (weight similarity) among model parameters, aggregating the models of the original clients and training on them to obtain the new client's model parameters.
In this embodiment, there are N clients, 1 server, and K communication rounds.
Referring to fig. 1, the method comprises the steps of:
S1: randomly initialize the shared parameters θ_g of the server's global model and the personalized parameters {v_1, v_2, …, v_N} of the clients;
S2: the server sends the initialized parameters θ_g to each client;
S3: each client i receives and adopts the shared parameters θ_g, then updates its local parameters (the shared parameters θ_i and the personalized parameters v_i) based on its local data, obtaining {θ_1, θ_2, …, θ_N} and {v_1, v_2, …, v_N};
S4: each client uploads its updated local parameters {θ_1, θ_2, …, θ_N} and {v_1, v_2, …, v_N} to the server;
S5: the server receives the parameters uploaded by each client and performs weighted aggregation of {θ_1, θ_2, …, θ_N} according to each client's amount of training data, obtaining a new θ_g; jump back to step S2 until the number of cycles reaches the preset communication round count K;
S6: when a new client N+1 joins the training, the server sends the stored shared parameters θ_g and personalized parameters {v_1, v_2, …, v_N} to the new client;
S7: an attention-based hypernetwork is constructed at the new client to generate the local model parameters, and the hypernetwork is trained with local data to obtain the parameters of the local model;
S8: the parameters of the local model are transmitted to the server, which receives the model parameters of each client and performs weighted aggregation according to each client's amount of training data.
Referring to fig. 2, a schematic diagram of the structure of the attention-based hypernetwork is shown. The model's input is the personalized parameters {v_1, v_2, …, v_N} of the existing clients, and its output is the personalized parameters v_{N+1} of the new client N+1. The model comprises, connected in sequence: a fully connected layer, normalization layer 1, self-attention layer 1, normalization layer 2, self-attention layer 2, and normalization layer 3. The fully connected layer generates, from the existing clients' personalized parameters, a set of hidden vectors matching the number of existing clients. It should be emphasized that the kind and number of layers may vary in this embodiment; for example, additional pairs of normalization and self-attention layers may be stacked.
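A minimal PyTorch sketch of this structure follows, assuming each client's personalized parameters are flattened into a d_in-dimensional vector. The mean-pooling and the final output projection are assumptions added so that a single parameter vector is produced; fig. 2 itself only specifies the fully connected, normalization, and self-attention layers:

```python
import torch
import torch.nn as nn

class AttentionHyperNetwork(nn.Module):
    """FC -> Norm 1 -> Self-Attn 1 -> Norm 2 -> Self-Attn 2 -> Norm 3 (fig. 2).
    Input:  personalized parameters of the N existing clients, shape (N, d_in).
    Output: generated personalized parameters v_{N+1}, shape (d_out,)."""

    def __init__(self, d_in, d_hidden, d_out, n_heads=4):
        super().__init__()
        self.fc = nn.Linear(d_in, d_hidden)      # one hidden vector per client
        self.norm1 = nn.LayerNorm(d_hidden)
        self.attn1 = nn.MultiheadAttention(d_hidden, n_heads, batch_first=True)
        self.norm2 = nn.LayerNorm(d_hidden)
        self.attn2 = nn.MultiheadAttention(d_hidden, n_heads, batch_first=True)
        self.norm3 = nn.LayerNorm(d_hidden)
        self.out = nn.Linear(d_hidden, d_out)    # assumed projection to v_{N+1}

    def forward(self, v_existing):               # v_existing: (N, d_in)
        h = self.fc(v_existing).unsqueeze(0)     # (1, N, d_hidden)
        h = self.norm1(h)
        a, _ = self.attn1(h, h, h)               # attention across the N clients
        h = self.norm2(a)
        a, _ = self.attn2(h, h, h)
        h = self.norm3(a)
        return self.out(h.mean(dim=1)).squeeze(0)  # assumed mean-pool over clients

# Usage: hyper = AttentionHyperNetwork(d_in=1024, d_hidden=256, d_out=1024)
#        v_new = hyper(torch.stack([v1, v2, v3]))   # three existing clients
```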
Referring to fig. 4, the parameter updating process of the existing client includes the following steps:
S1: receive and adopt the shared parameters θ_g;
S2: update the local parameters (the shared parameters θ_i and the personalized parameters v_i) based on the local data;
S3: upload the updated local parameters θ_i and v_i to the server.
The method takes the relations among the clients into account; specifically, it introduces an attention mechanism, and the input of the shared hypernetwork is the personalized parameters of the multiple original clients, from which the new client's personalized parameters are generated.
To illustrate the advantages of the method, a conventional federated learning server-side update algorithm is provided below as a comparative example, comprising the following steps:
Step 1: randomly initialize the parameters of the global model;
Step 2: send the global model parameters to each client;
Step 3: each client receives the global parameters and updates its local parameters;
Step 4: the server receives each client's parameters, performs weighted aggregation according to each client's amount of training data, and jumps back to Step 2 until the number of cycles reaches the preset communication round count K.
Therefore, compared with initializing a new client's model from an ordinary global average model and then directly training locally, the invention's attention-based hypernetwork ensures rapid convergence of the new client's model, avoids the overfitting caused by data scarcity in local training, and preserves the generalization ability the global model gains from broad data coverage. The reason is that if the initialized client model is trained directly over its complete parameter set, the client's overall model drifts toward a local optimum biased to the local data distribution; when local data is scarce, that optimum lies far from the global optimum, hurting the local model's performance. In contrast, because the hypernetwork's input is the models of the other clients, the output model is constrained by the global training results, which greatly mitigates overfitting while still guaranteeing the new client's convergence.
In a specific application scenario, for an Internet of Vehicles comprising a server and at least one vehicle-mounted terminal, the described personalized federated learning generalization method is applied to the server. The server is deployed with a global model; each vehicle-mounted terminal is deployed with a local model comprising shared parameters and personalized parameters, and further comprises a hypernetwork for generating the personalized parameters when joining the vehicle network.
When the new client's hypernetwork is constructed, it attends to the personalized parameters of all existing models simultaneously, so the correlation information among the clients' personalized parameters can be introduced and the final result improved. This differs from previous schemes, which do not consider the correlation between models during training.
Embodiment 2
Based on embodiment 1 and referring to fig. 3, this embodiment provides a personalized federated learning generalization method based on an attention mechanism, applied to a new (i.e., untrained) client, comprising the following steps:
S1: receive the personalized parameters {v_1, v_2, …, v_N} of the existing client models and the server's shared parameters θ_g obtained by weighted aggregation;
S2: construct an attention-based hypernetwork whose input is the personalized parameters {v_1, v_2, …, v_N} of the client models and whose output is the local model's personalized-layer parameters v_{N+1};
S3: train on local data to update the hypernetwork's parameters, obtaining the local model's personalized-layer parameters; the other layers use the globally averaged shared parameters θ_g;
S4: upload the new parameters to the server.
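A sketch of this new-client procedure follows. It assumes the personalized layer is the local model's final linear layer (named head here) and uses torch.func.functional_call so the loss stays differentiable with respect to the hypernetwork's parameters; note that only the hypernetwork is optimized, never the local model directly:

```python
import torch
import torch.nn.functional as F
from torch.func import functional_call

def new_client_update(hyper, local_model, v_existing, theta_g, loader,
                      epochs=1, lr=0.01):
    """Fig. 3 sketch: generate the personalized-layer parameters v_{N+1} with
    the hypernetwork (S2) and train only the hypernetwork on local data (S3)."""
    v_stack = torch.stack(v_existing)              # (N, d_in), one row per client
    opt = torch.optim.SGD(hyper.parameters(), lr=lr)   # hypernetwork params only
    n_out, n_in = local_model.head.weight.shape
    for _ in range(epochs):
        for x, y in loader:
            v_new = hyper(v_stack)                 # generate v_{N+1}
            params = dict(theta_g)                 # frozen global shared layers
            params["head.weight"] = v_new[: n_out * n_in].view(n_out, n_in)
            params["head.bias"] = v_new[n_out * n_in :]
            # functional_call applies v_new without mutating local_model, so
            # gradients from the loss flow back into the hypernetwork
            loss = F.cross_entropy(functional_call(local_model, params, (x,)), y)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return hyper(v_stack).detach(), theta_g        # S4: upload v_{N+1} and theta_g
```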
To illustrate the advantages of the present method, a conventional federated learning client-side update algorithm is provided below as a comparative example, comprising the following steps:
Step 31: receive the global model parameters as the local model parameters, with the personalized layer retaining its original parameters;
Step 32: train and update the local model on the local data to obtain the updated local model parameters;
Step 33: transmit the updated local parameters, other than the personalized layer, to the server.
The invention uses an attention-based hypernetwork to ensure rapid convergence of the new client's model, to avoid the overfitting caused by data scarcity in local training, and to preserve the generalization ability the global model gains from broad data coverage.
Embodiment 3
This embodiment provides an electronic device comprising one or more processors and a memory, the memory storing one or more programs comprising computer program instructions for performing the personalized federated learning generalization method based on an attention mechanism as described in embodiment 1 or embodiment 2.
The method or apparatus set forth in the above embodiments may be implemented by a computer chip or entity, or by a product having a certain function. A typical implementation is a computer. Specifically, the computer may be, for example, a personal computer, a laptop computer, a cellular telephone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
Referring to fig. 5, the schematic structural diagram of the electronic device includes a processor, an internal bus, a network interface, memory, and nonvolatile storage, and may also include hardware required by other services. The nonvolatile storage holds instructions for executing the personalized federated learning generalization method of embodiment 1 or embodiment 2; the processor reads the corresponding computer program from the nonvolatile storage into memory and then runs it to implement the method described above. Of course, other implementations, such as logic devices or combinations of hardware and software, are not excluded; that is, the execution subject of the processing flows is not limited to logic units and may also be hardware or logic devices.
The memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or nonvolatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transmission media) such as modulated data signals and carrier waves.
Embodiment 4
This embodiment provides a computer-readable storage medium comprising one or more programs for execution by one or more processors of an electronic device, the one or more programs comprising computer program instructions for performing the personalized federated learning generalization method based on an attention mechanism as described in embodiment 1 or embodiment 2.
When executing the personalized federated learning generalization method of embodiment 1, the computer program instructions perform:
S1: randomly initialize the shared parameters θ_g of the server's global model and the personalized parameters {v_1, v_2, …, v_N} of the clients;
S2: the server sends the initialized parameters θ_g to each client;
S3: receive the local parameters {θ_1, θ_2, …, θ_N} and {v_1, v_2, …, v_N} updated by the clients;
S4: the server receives the parameters uploaded by each client and performs weighted aggregation of {θ_1, θ_2, …, θ_N} according to each client's amount of training data, obtaining a new θ_g; jump back to step S2 until the number of cycles reaches the preset communication round count K;
S5: when a new client N+1 joins the training, send the shared parameters θ_g and personalized parameters {v_1, v_2, …, v_N} stored in the server to the new client;
S6: an attention-based hypernetwork is constructed at the new client to generate the local model parameters, and the hypernetwork is trained with local data to obtain the parameters of the local model; the server receives the local model parameters and performs weighted aggregation of the clients' model parameters according to each client's amount of training data.
When executing the personalized federated learning generalization method of embodiment 2, the computer program instructions perform:
S1: receive the personalized parameters {v_1, v_2, …, v_N} of the existing client models and the server's shared parameters θ_g obtained by weighted aggregation;
S2: construct an attention-based hypernetwork whose input is the personalized parameters {v_1, v_2, …, v_N} of the client models and whose output is the local model's personalized-layer parameters v_{N+1};
S3: train on local data to update the hypernetwork's parameters, obtaining the local model's personalized-layer parameters; the other layers use the globally averaged shared parameters θ_g;
S4: upload the new parameters to the server.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
Embodiment 5
This embodiment provides a personalized federated learning generalization system based on an attention mechanism, comprising N clients and 1 server.
The server is configured to execute the following process:
S1: randomly initialize the shared parameters θ_g of the server's global model and the personalized parameters {v_1, v_2, …, v_N} of the clients;
S2: the server sends the initialized parameters θ_g to each client;
S3: receive the local parameters {θ_1, θ_2, …, θ_N} and {v_1, v_2, …, v_N} updated by the clients;
S4: the server receives the parameters uploaded by each client and performs weighted aggregation of {θ_1, θ_2, …, θ_N} according to each client's amount of training data, obtaining a new θ_g; jump back to step S2 until the number of cycles reaches the preset communication round count K.
When a new client N+1 joins the system, the server is further configured to perform:
S5: when the new client N+1 joins the training, send the shared parameters θ_g and personalized parameters {v_1, v_2, …, v_N} stored in the server to the new client;
S6: an attention-based hypernetwork is constructed at the new client to generate the local model parameters, and the hypernetwork is trained with local data to obtain the parameters of the local model; the server receives the local model parameters and performs weighted aggregation of the clients' model parameters according to each client's amount of training data.
The client is configured to execute the following process:
S1: receive the personalized parameters {v_1, v_2, …, v_N} of the existing client models and the server's shared parameters θ_g obtained by weighted aggregation;
S2: construct an attention-based hypernetwork whose input is the personalized parameters {v_1, v_2, …, v_N} of the client models and whose output is the local model's personalized-layer parameters v_{N+1};
S3: train on local data to update the hypernetwork's parameters, obtaining the local model's personalized-layer parameters; the other layers use the globally averaged shared parameters θ_g;
S4: upload the new parameters to the server.
While the invention has been described with reference to certain preferred embodiments, it will be understood by those skilled in the art that various changes and equivalent substitutions may be made without departing from the scope of the invention. Therefore, the protection scope of the invention is defined by the appended claims.

Claims (9)

1. A personalized federated learning generalization method based on an attention mechanism, characterized in that it is applied to a server and comprises the following steps:
initializing the shared parameters of the global model, sending the shared parameters to at least one pre-connected client, receiving and storing the shared parameters and personalized parameters of each client after local training, and updating the server's shared parameters based on each client's shared parameters, these steps being executed repeatedly until a termination condition is reached;
sending the personalized parameters of each existing client and the server's shared parameters to an untrained new client, generating personalized parameters at the new client with an attention-based hypernetwork deployed at the new client, and training the hypernetwork based on the new client's local data to complete the local update of the new client's personalized parameters,
wherein the input of the hypernetwork is the personalized parameters of each existing client and the output is the personalized parameters of the new client.
2. The personalized federated learning generalization method based on an attention mechanism according to claim 1, wherein the termination condition is that the number of communication rounds reaches a preset value.
3. The personalized federated learning generalization method based on an attention mechanism according to claim 1, wherein the attention-based hypernetwork comprises:
a fully connected layer for generating hidden vectors;
a plurality of normalization layers, with self-attention layers arranged between them, for generating the personalized parameters of the new client from the hidden vectors.
4. The personalized federated learning generalization method based on an attention mechanism according to claim 1, wherein the new client adopts the server's shared parameters as its shared parameters.
5. The personalized federated learning generalization method based on an attention mechanism according to claim 1, further comprising the following step:
receiving the shared parameters and personalized parameters of a plurality of clients, including the new client whose parameters have been initialized, and updating the server's shared parameters based on a weighted combination of the clients' shared parameters.
6. The personalized federated learning generalization method based on an attention mechanism according to claim 1, wherein the server's shared parameters are updated by weighted aggregation of the clients' shared parameters.
7. A personalized federated learning generalization method based on an attention mechanism, characterized in that it is applied to an untrained new client and comprises the following steps:
receiving the personalized parameters of a plurality of locally trained clients and the server's shared parameters obtained by weighted aggregation;
training on local data to update the parameters of the attention-based hypernetwork, generating the new client's personalized parameters with the trained hypernetwork from the personalized parameters of the locally trained clients, and adopting the server's aggregated shared parameters as the new client's shared parameters;
uploading the updated personalized parameters and shared parameters to the server;
wherein the attention-based hypernetwork is deployed at the new client, its input is the personalized parameters of each existing client, and its output is the personalized parameters of the new client.
8. An application of the personalized federated learning generalization method based on an attention mechanism according to any one of claims 1 to 7, wherein, for an Internet of Vehicles comprising a server, the personalized federated learning generalization method is applied to the server and at least one vehicle-mounted terminal; the server is deployed with a global model, the vehicle-mounted terminal is deployed with a local model comprising shared parameters and personalized parameters, and the vehicle-mounted terminal further comprises a hypernetwork for generating the personalized parameters when joining the vehicle network.
9. An electronic device, comprising: one or more processors and a memory, the memory storing one or more programs, the one or more programs comprising instructions for performing the personalized federated learning generalization method based on an attention mechanism according to any one of claims 1-7.
CN202311277193.0A 2023-10-07 2023-10-07 Personalized federated learning generalization method, device and application based on attention mechanism Active CN117010484B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311277193.0A CN117010484B (en) Personalized federated learning generalization method, device and application based on attention mechanism

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311277193.0A CN117010484B (en) Personalized federated learning generalization method, device and application based on attention mechanism

Publications (2)

Publication Number Publication Date
CN117010484A CN117010484A (en) 2023-11-07
CN117010484B 2024-01-26

Family

ID=88562183

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311277193.0A Active CN117010484B (en) 2023-10-07 2023-10-07 Personalized federated learning generalization method, device and application based on attention mechanism

Country Status (1)

Country Link
CN (1) CN117010484B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117892805B (en) * 2024-03-18 2024-05-28 清华大学 Personalized federal learning method based on supernetwork and hierarchy collaborative graph aggregation


Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021115480A1 (en) * 2020-06-30 2021-06-17 平安科技(深圳)有限公司 Federated learning method, device, equipment, and storage medium
CN112329940A (en) * 2020-11-02 2021-02-05 Personalized model training method and system combining federated learning and user portrait
WO2023284387A1 (en) * 2021-07-15 2023-01-19 Model training method, apparatus, and system based on federated learning, and device and medium
CN113297396A (en) * 2021-07-21 2021-08-24 Method, device and equipment for updating model parameters based on federated learning
CN115169575A (en) * 2022-06-23 2022-10-11 Personalized federated learning method, electronic device and computer readable storage medium
CN115086399A (en) * 2022-07-28 2022-09-20 Federated learning method and device based on hypernetwork and computer equipment
CN115840900A (en) * 2022-09-16 2023-03-24 Personalized federated learning method and system based on adaptive hierarchical clustering
CN115600686A (en) * 2022-10-18 2023-01-13 Personalized Transformer-based federated learning model training method and federated learning system
CN116227623A (en) * 2023-01-29 2023-06-06 Federated learning method, federated learning device, federated learning computer device, and federated learning storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Personalized Federated Learning Method Based on Attention-Enhanced Meta-Learning Network; Gao Yujia et al.; Journal of Computer Research and Development; full text *

Also Published As

Publication number Publication date
CN117010484A (en) 2023-11-07

Similar Documents

Publication Publication Date Title
CN117010484B (en) Personalized federal learning generalization method, device and application based on attention mechanism
CN110942154B (en) Data processing method, device, equipment and storage medium based on federal learning
CN113297396B (en) Method, device and equipment for updating model parameters based on federal learning
CN112765677B (en) Federal learning method, device and system based on blockchain
CN110210514B (en) Generative confrontation network training method, image completion method, device and storage medium
CN110874637A (en) Multi-target fusion learning method, device and system based on privacy data protection
CN110874650B (en) Alliance learning method, device and system fusing public domain data and private data
CN114945044B (en) Method, device and equipment for constructing digital twin platform based on federal learning
EP3855388A1 (en) Image processing device and operation method thereof
WO2018050045A1 (en) Animation clip splicing method, and information sending method and device
CN114492746A (en) Federal learning acceleration method based on model segmentation
CN113673446A (en) Image recognition method and device, electronic equipment and computer readable medium
CN110489955B (en) Image processing, device, computing device and medium applied to electronic equipment
CN116486493A (en) Living body detection method, device and equipment
CN113808157B (en) Image processing method and device and computer equipment
CN116233844A (en) Physical layer equipment identity authentication method and system based on channel prediction
CN114638998A (en) Model updating method, device, system and equipment
Izumi et al. Distributed Hybrid Controllers for Multi‐Agent Mass Games By A Variable Number of Player Agents
CN116911403B (en) Federal learning server and client integrated training method and related equipment
CN112817898A (en) Data transmission method, processor, chip and electronic equipment
CN112990299A (en) Depth map acquisition method based on multi-scale features, electronic device and storage medium
CN115760563A (en) Image super-resolution model training method and device and computer-readable storage medium
CN116610868B (en) Sample labeling method, end-edge cloud cooperative training method and device
CN116611536B (en) Model training method and device, electronic equipment and storage medium
CN110390354B (en) Prediction method and device for defense capability of deep network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant