CN115115066A - Contrastive Learning-Based Federated Learning Personalization Approach - Google Patents


Info

Publication number
CN115115066A
CN115115066A CN202210956833.XA
Authority
CN
China
Prior art keywords
model
client
local
training
learning
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210956833.XA
Other languages
Chinese (zh)
Other versions
CN115115066B (en)
Inventor
陈晋音
刘涛
李荣昌
李明俊
宣琦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University of Technology ZJUT
Original Assignee
Zhejiang University of Technology ZJUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University of Technology ZJUT filed Critical Zhejiang University of Technology ZJUT
Priority to CN202210956833.XA
Publication of CN115115066A
Application granted
Publication of CN115115066B
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00: Machine learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Image Analysis (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a contrastive learning-based federated learning personalization method, which achieves personalization by balancing a global model trained on the whole data set against a local model trained on a local subset. Local updates are corrected through a consistency loss between the representations learned by the current local model and the representations learned by the global model, thereby personalizing federated learning. The method simultaneously computes the representations of the global model, of the model trained on local data, and of the local model currently being trained, and then regularizes the distances among these three representations through a personalization parameter; that is, contrastive learning is carried out at the model level, so that the global model is corrected to fit the tasks on different clients. The method can adapt to the different tasks of each client and meets the requirement of personalized customization in federated learning.

Description

基于对比学习的联邦学习个性化方法 Contrastive Learning-Based Federated Learning Personalization Approach

技术领域 Technical Field

本发明属于面向联邦学习模型个性化领域，尤其涉及一种基于对比学习的联邦学习个性化方法。The invention belongs to the field of federated learning model personalization, and in particular relates to a contrastive learning-based federated learning personalization method.

背景技术 Background Art

联合学习因为数据孤岛问题应运而生，其目标是协同训练数据，这些数据是由许多远程设备或本地客户端生成的，并因为隐私问题而无法共享。联邦学习使多方可以联合学习一个机器学习模型，而无需交换各自的本地数据。在每一轮联邦学习训练中，更新后的各方本地模型被传输到服务器，服务器进一步聚合本地模型以更新全局模型。在学习过程中不交换原始数据。联邦学习已经成为一个重要的机器学习领域，吸引了很多研究兴趣。此外，它还被应用于许多领域，如医学成像、智能家具等领域。Federated learning arose in response to the data-silo problem: its goal is to train collaboratively on data that is generated by many remote devices or local clients and cannot be shared due to privacy concerns. Federated learning enables multiple parties to jointly learn a machine learning model without exchanging their respective local data. In each round of federated learning training, each party's updated local model is transmitted to the server, which aggregates the local models to update the global model; raw data is never exchanged during learning. Federated learning has become an important area of machine learning and has attracted considerable research interest. It has also been applied in many fields, such as medical imaging and smart homes.

由于联邦学习侧重于学习所有客户端数据从而获得高质量的全局模型，因而无法捕获所有客户端的个人信息。对于那些数据质量较高、贡献度较大的客户端来说，全局模型能够很好的适应该客户端的任务，但对于贡献度较小的客户端来说，全局模型未必具有很好的适应性。当各个客户端更新其本地模型时，其本地目标可能与全局目标相差甚远。因此联邦学习在现实部署时面临着数据分布异构的问题，训练得到的全局模型未必适合在所有参与训练的客户端上进行工作。Since federated learning focuses on learning from all clients' data to obtain a high-quality global model, it cannot capture every client's personal information. For clients with high-quality data and large contributions, the global model adapts well to that client's task, but for clients with smaller contributions the global model may not adapt well. When each client updates its local model, its local objective may be far from the global objective. Federated learning therefore faces heterogeneous data distributions in real-world deployment, and the trained global model may not be suitable for every client that participated in training.

联邦学习的一个关键挑战是处理跨各方的本地数据分布的异构性。联邦学习训练得到的全局模型往往会忽略低贡献度客户端的个人信息，导致全局模型具有较低的泛化能力。为了应对这一挑战，一种简单并且有效的方法是在设备或者模型上进行个性化处理。对于广播得到的全局模型，客户端再进行对应的个性化训练，以确保最终的模型能够适应任务。A key challenge of federated learning is handling the heterogeneity of local data distributions across parties. The global model trained by federated learning often ignores the personal information of low-contribution clients, resulting in low generalization ability. To address this challenge, a simple and effective approach is to perform personalization on the device or the model: given the broadcast global model, each client then performs corresponding personalized training to ensure that the final model fits its task.

发明内容 SUMMARY OF THE INVENTION

本发明的目的在于针对现有技术的不足，提供一种基于对比学习的联邦学习个性化方法。本发明可以保证全局模型能够根据客户端的数据分布进行修正。The purpose of the present invention is to provide a contrastive learning-based federated learning personalization method that addresses the deficiencies of the prior art. The present invention ensures that the global model can be corrected according to each client's data distribution.

本发明的目的是通过以下技术方案来实现的:一种基于对比学习的联邦学习个性化方法,包括以下步骤:The object of the present invention is achieved through the following technical solutions: a method for personalized federated learning based on contrastive learning, comprising the following steps:

(1)本地训练出独立模型:(1) Train an independent model locally:

(1.1)本地客户端根据其持有的数据训练出独立模型,该独立模型仅包含单个客户端的数据信息,适应该客户端的任务;(1.1) The local client trains an independent model according to the data it holds, and the independent model only contains the data information of a single client and adapts to the task of the client;

(1.2)搭建对比学习网络,使用对比学习网络从独立模型中提取出表征;(1.2) Build a contrastive learning network, and use the contrastive learning network to extract representations from independent models;

(2)客户端参与联邦学习训练:(2) The client participates in federated learning training:

(2.1)客户端开始正常的联邦学习训练,根据服务器发布的全局模型以及本地数据得到本地更新;(2.1) The client starts normal federated learning training, and gets local updates according to the global model and local data published by the server;

(2.2)训练完成后,每个客户端将本地更新的本地模型,上传给服务器;(2.2) After the training is completed, each client uploads the locally updated local model to the server;

(2.3)服务器聚合得到新的全局模型并进行全局广播,客户端接收到新的全局模型后从中提取出全局表征;(2.3) The server aggregates to obtain a new global model and broadcasts it globally, and the client extracts the global representation from the new global model after receiving it;

(3)客户端进行个性化修正训练:(3) The client performs personalized correction training:

(3.1)在客户端接收到最新的全局模型之后,客户端开启个性化修正训练,在训练阶段分别得到三个表征:独立模型表征,全局模型表征以及正在被更新的本地模型表征;(3.1) After the client receives the latest global model, the client starts personalized correction training, and obtains three representations in the training phase: independent model representation, global model representation and the local model representation being updated;

(3.2)通过步骤(3.1)得到的三个表征,计算得到模型一致性损失,与训练损失一起更新本地模型;(3.2) Calculate the model consistency loss through the three representations obtained in step (3.1), and update the local model together with the training loss;

(3.3)重复步骤(3.1)~(3.2),进行迭代修正训练,最终得到具有个性化的本地模型。(3.3) Repeat steps (3.1) to (3.2), perform iterative correction training, and finally obtain a personalized local model.

进一步地,步骤(1.1)包括:Further, step (1.1) includes:

共有N个客户端，表示为P_1,...,P_N，客户端P_i拥有本地数据集D_i，对于本地单独的训练来说，独立模型的训练目标如下：There are N clients in total, denoted P_1,...,P_N; client P_i holds the local dataset D_i. For standalone local training, the independent model's training objective is:

$$\min_{w_{ind}^{i}} L_i\left(w_{ind}^{i}\right)=\mathbb{E}_{(x,y)\sim D_i}\left[\ell_i\left(w_{ind}^{i};x,y\right)\right]$$

其中，L_i(w_ind^i)为P_i的经验损失；w_ind^i表示客户端P_i的独立模型权重；x表示D_i中的一个训练数据，y表示x对应的标签；ℓ_i表示客户端P_i本地的损失函数；E表示期望。where L_i(w_ind^i) is P_i's empirical loss; w_ind^i denotes client P_i's independent-model weights; x is a training sample in D_i and y is its label; ℓ_i is client P_i's local loss function; and E denotes expectation.

进一步地,步骤(1.2)包括:Further, step (1.2) includes:

独立模型权值为w_ind^i，用R_ind(·)表示独立模型输出层之前的网络；将训练完成的独立模型权重放入对比学习网络中，便得到独立模型的表征z_ind = R_ind(x)。Let the independent model's weights be w_ind^i, and let R_ind(·) denote the independent model's network up to (but excluding) the output layer; loading the trained independent-model weights into the contrastive learning network yields the independent model's representation z_ind = R_ind(x).

进一步地,步骤(2.1)包括:Further, step (2.1) includes:

对于本地客户端来说,联邦学习本地模型的训练目标如下:For the local client, the training objectives of the federated learning local model are as follows:

$$\min_{w_i} L_{i,t}\left(w_i\right)=\mathbb{E}_{(x,y)\sim D_i}\left[\ell_{i,t}\left(w_i;x,y\right)\right]$$

其中，L_i,t(w_i)为P_i第t轮的本地经验损失；w_i表示客户端P_i的本地模型权重；x表示D_i中的一个训练数据，y表示x对应的标签；ℓ_i,t表示客户端P_i第t轮本地的损失函数；E表示期望。where L_{i,t}(w_i) is client P_i's local empirical loss in round t; w_i denotes client P_i's local model weights; x is a training sample in D_i and y is its label; ℓ_{i,t} is client P_i's local loss function in round t; and E denotes expectation.

进一步地,步骤(2.3)包括:Further, step (2.3) includes:

服务器接收到客户端的模型更新，采用FedAvg聚合算法形成新的全局模型；FedAvg计算客户端的局部模型更新的平均值作为全局模型更新，其中每个客户端根据其训练示例的数量进行加权；使用D=∪_{i∈[N]}D_i表示所有联邦训练的数据集，第t轮全局模型的训练目标表示为：The server receives the clients' model updates and forms a new global model with the FedAvg aggregation algorithm; FedAvg takes the average of the clients' local model updates as the global model update, with each client weighted by its number of training examples. Writing D = ∪_{i∈[N]} D_i for the union of all federated training data, the round-t training objective of the global model is:

$$L_{g,t}\left(w_g\right)=\sum_{i=1}^{N}\frac{\left|D_i\right|}{\left|D\right|}L_{i,t}\left(w_i\right)$$

其中，L_i,t表示P_i第t轮的经验损失；L_g,t表示第t轮的全局模型的损失；w_g为全局模型的权重；|·|表示数据集的样本数。where L_{i,t} is P_i's empirical loss in round t; L_{g,t} is the global model's loss in round t; w_g is the global model's weights; and |·| denotes the number of samples in a dataset.

在聚合完成得到新的全局模型w_g^t后，服务器将全局模型w_g^t进行广播，发送至各个客户端；客户端根据最新的全局模型计算得到全局模型的表征z_glob = R_g(x)。After aggregation yields the new global model w_g^t, the server broadcasts w_g^t to every client; each client then computes the global model's representation z_glob = R_g(x) from this latest global model.

进一步地,步骤(3.1)包括:Further, step (3.1) includes:

个性化修正训练的损失由两部分组成，第一部分由损失函数ℓ_i(w_i;x,y)计算，第二部分为模型对比损失函数ℓ_con。对于每一个输入x以及本地模型w_i，从独立模型中提取出表征z_ind，从全局模型中提取出表征z_glob，以及从正在训练的本地模型提取出表征z，则模型对比损失函数定义为：The loss of personalized correction training consists of two parts: the first is computed by the loss function ℓ_i(w_i; x, y), and the second is the model-contrastive loss ℓ_con. For each input x and local model w_i, a representation z_ind is extracted from the independent model, a representation z_glob from the global model, and a representation z from the local model being trained; the model-contrastive loss is then defined as:

$$\ell_{con}=-\log\frac{\exp\left(\mathrm{sim}\left(z,z_{glob}\right)/\tau\right)}{\exp\left(\mathrm{sim}\left(z,z_{glob}\right)/\tau\right)+\exp\left(\mathrm{sim}\left(z,z_{ind}\right)/\tau\right)}$$

其中，sim(·,·)表示余弦相似度，τ为温度参数。where sim(·,·) denotes cosine similarity and τ is a temperature parameter.

进一步地,步骤(3.1)中:Further, in step (3.1):

服务器根据是否收到客户端的个性化申请,决定对应客户端是否进行个性化修正训练,当服务器接收到客户端的个性化申请后,判断该客户端需要进行个性化修正训练,之后客户端开启个性化修正训练;倘若未收到客户端的个性化申请则跳转到步骤(2.1)进行下一轮联邦学习训练。The server decides whether to perform personalized correction training for the corresponding client according to whether it receives the personalized application from the client. When the server receives the personalized application from the client, it determines that the client needs to undergo personalized correction training, and then the client starts personalized training. Correct the training; if the personalized application from the client is not received, jump to step (2.1) for the next round of federated learning training.

进一步地,步骤(3.2)包括:Further, step (3.2) includes:

更新的本地模型的目标是:The goals of the updated local model are:

$$\min_{w_i}\;\mathbb{E}_{(x,y)\sim D_i}\left[\ell_i\left(w_i;x,y\right)+\mu\,\ell_{con}\left(w_i;w_g,w_{ind}^{i};x\right)\right]$$

其中，μ为控制模型对比损失权重的个性化超参数，由客户端设定；w_g表示全局模型的权重，w_i表示本地模型的权重，w_ind^i表示独立模型的权重，x表示D_i中的一个训练数据，y表示x对应的标签。where μ is a client-set personalization hyperparameter controlling the weight of the model-contrastive loss; w_g denotes the global model's weights, w_i the local model's weights, and w_ind^i the independent model's weights; x is a training sample in D_i and y is its label.

本发明的有益效果如下:The beneficial effects of the present invention are as follows:

(1)本发明修正训练与正常的联邦训练独立,个性化过程不会对全局模型造成影响,即不会影响全局模型的收敛过程;(1) The revised training of the present invention is independent of the normal federated training, and the individualization process will not affect the global model, that is, the convergence process of the global model will not be affected;

(2)本发明通过模型层面的对比学习计算表征,能够高效映射出特征;(2) The present invention can efficiently map the features through the comparative learning and calculation representation at the model level;

(3)本发明通过一致性损失能够控制模型的个性化程度,能够进行细粒度定制。(3) The present invention can control the degree of personalization of the model through consistency loss, and can perform fine-grained customization.

附图说明 Description of the Drawings

图1为本发明方法的基于对比学习的联邦学习个性化方法的示意图。FIG. 1 is a schematic diagram of a federated learning personalization method based on contrastive learning of the method of the present invention.

图2为本发明方法的基于对比学习的联邦学习个性化方法的具体流程图。FIG. 2 is a specific flow chart of the method of the present invention for the individualized method of federated learning based on contrastive learning.

具体实施方式 Detailed Description of Embodiments

下面结合说明书附图对本发明的具体实施方式作进一步详细描述。The specific embodiments of the present invention will be further described in detail below with reference to the accompanying drawings.

本发明通过权衡在整个数据集上训练的全局模型，与在本地子集上训练的本地模型，来实现个性化。基于这个观点，本发明一种基于对比学习的联邦学习个性化方法，通过当前本地模型学习到的表征与全局模型学习到的表征的一致性损失，来修正本地更新，进而实现联邦学习的个性化。具体来说，该方法将同时计算全局模型的表征、本地数据训练得到模型的表征，以及当前模型训练的本地模型的表征，然后通过个性化参数，规范这三个表征之间的距离，即在模型层面实现对比学习，以此来修正全局模型适应不同客户端上的任务。The present invention achieves personalization by balancing a global model trained on the entire dataset against a local model trained on a local subset. On this basis, the present invention provides a contrastive learning-based federated learning personalization method that corrects local updates through a consistency loss between the representations learned by the current local model and those learned by the global model, thereby personalizing federated learning. Specifically, the method simultaneously computes the representation of the global model, the representation of the model trained on local data, and the representation of the local model currently being trained, and then regularizes the distances among these three representations through a personalization parameter, i.e., contrastive learning at the model level, so as to correct the global model to fit the tasks on different clients.

如图1所示,本发明具体包括以下步骤:As shown in Figure 1, the present invention specifically comprises the following steps:

(1)本地训练出独立模型。(1) An independent model is trained locally.

联邦学习中的每个客户端拥有不同的训练数据,这些数据是存储在本地,无法进行传输的。在开始联邦学习训练之前,本发明要求每个客户端使用本地数据训练出一个独立模型,作为后续个性化的表征之一。由于客户端之间数据的异质性,可能会存在客户端缺少数据,无法训练出一个良好的独立模型的情况。但这并不影响后续的个性化定制,因为个性化定制的目标是为了让全局模型适应某个具体的任务,独立模型起到的是导向作用。Each client in federated learning has different training data, which is stored locally and cannot be transmitted. Before starting federated learning training, the present invention requires each client to use local data to train an independent model as one of the subsequent personalized representations. Due to the heterogeneity of data between clients, there may be cases where clients lack data and cannot train a good independent model. However, this does not affect subsequent personalized customization, because the goal of personalized customization is to adapt the global model to a specific task, and the independent model plays a guiding role.

因此,步骤(1)包括如下子步骤:Therefore, step (1) includes the following sub-steps:

(1.1)本地客户端根据其持有的数据,训练出独立模型,独立模型权重表示为wind,该独立模型仅包含单个客户端的数据信息,适应该客户端的任务。(1.1) The local client trains an independent model according to the data it holds. The weight of the independent model is expressed as wind . The independent model only contains the data information of a single client and adapts to the task of the client.

假设联邦学习中共有N个客户端，表示为P_1,...,P_N，客户端P_i拥有本地数据集D_i，对于本地单独的训练来说，独立模型的训练目标如下：Suppose there are N clients in federated learning, denoted P_1,...,P_N, and client P_i owns the local dataset D_i. For standalone local training, the independent model's training objective is:

$$\min_{w_{ind}^{i}} L_i\left(w_{ind}^{i}\right)=\mathbb{E}_{(x,y)\sim D_i}\left[\ell_i\left(w_{ind}^{i};x,y\right)\right]$$

其中，L_i(w_ind^i)为P_i的经验损失；w_ind^i表示客户端P_i的独立模型权重；x表示D_i中的一个训练数据，y表示x对应的标签；ℓ_i表示客户端P_i本地常规的损失函数；E表示期望。where L_i(w_ind^i) is P_i's empirical loss; w_ind^i denotes client P_i's independent-model weights; x is a training sample in D_i and y is its label; ℓ_i is client P_i's ordinary local loss function; and E denotes expectation.
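步骤（1.1）可以示意性地实现如下。The objective above, training an independent model by minimizing the empirical loss on local data, can be sketched as follows. The logistic-regression model, the synthetic two-cluster dataset, and all hyperparameters are illustrative assumptions for this sketch, not the patent's actual network or data.

```python
# Sketch of step (1.1): each client trains an independent model on its own
# local data before federated training begins. Minimal numpy-only example:
# logistic regression trained by gradient descent on a toy client dataset.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_independent_model(X, y, lr=0.5, epochs=200):
    """Minimize the empirical loss E_{(x,y)~D_i}[l_i(w; x, y)] on local data."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        p = sigmoid(X @ w)
        grad = X.T @ (p - y) / len(y)  # gradient of the mean cross-entropy
        w -= lr * grad
    return w

# Toy local dataset D_i for one client: two well-separated clusters.
X = np.vstack([rng.normal(-2, 1, (50, 2)), rng.normal(2, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)

w_ind = train_independent_model(X, y)           # the independent model w_ind
acc = ((sigmoid(X @ w_ind) > 0.5) == y).mean()  # fit on the client's own task
```

Because the independent model sees only this client's data, it captures that client's task and later serves as one anchor representation for personalization.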

(1.2)搭建对比学习网络,使用对比学习网络从独立模型中提取出表征。(1.2) Build a contrastive learning network and use the contrastive learning network to extract representations from independent models.

与典型的对比学习框架SimCLR类似，该对比学习网络有三个组成部分：基础编码器、神经网络投影头和输出层。基础编码器用于从输入中提取表示向量，神经网络投影头用来将表示向量映射到具有固定维数的空间。最后，输出层将用于产生每个类的预测值。与上述类似，假设独立模型权值为w_ind^i，用F_ind(·)表示独立模型整个网络，用R_ind(·)表示独立模型输出层之前的网络。将训练完成的独立模型权重放入对比学习网络中，便可得到独立模型的表征z_ind = R_ind(x)，其中输入x表示一个训练数据。Similar to the typical contrastive learning framework SimCLR, the contrastive learning network has three components: a base encoder, a neural-network projection head, and an output layer. The base encoder extracts a representation vector from the input, and the projection head maps that vector into a space of fixed dimension. Finally, the output layer produces the prediction score for each class. As above, let the independent model's weights be w_ind^i, let F_ind(·) denote the independent model's full network, and let R_ind(·) denote the network before the output layer. Loading the trained independent-model weights into the contrastive learning network then yields the independent model's representation z_ind = R_ind(x), where the input x is a training sample.
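上述三段式网络结构可以示意如下。The three-part structure described above (base encoder, projection head, output layer, with the representation read out before the output layer) can be sketched as below; the layer sizes, activations, and random initialization are assumptions made for illustration only.

```python
# Sketch of step (1.2): a SimCLR-style network. The representation z = R(x)
# used for contrastive learning is everything before the output layer.
import numpy as np

rng = np.random.default_rng(1)

class ContrastiveNet:
    def __init__(self, in_dim=32, hidden=64, proj_dim=16, n_classes=10):
        self.W_enc = rng.normal(0, 0.1, (in_dim, hidden))      # base encoder
        self.W_proj = rng.normal(0, 0.1, (hidden, proj_dim))   # projection head
        self.W_out = rng.normal(0, 0.1, (proj_dim, n_classes)) # output layer

    def representation(self, x):
        """R(x): the network before the output layer."""
        h = np.tanh(x @ self.W_enc)        # encoder extracts features
        return np.tanh(h @ self.W_proj)    # projected to a fixed dimension

    def predict(self, x):
        return self.representation(x) @ self.W_out  # per-class scores

net = ContrastiveNet()
x = rng.normal(size=32)
z_ind = net.representation(x)  # representation fed to the contrastive loss
scores = net.predict(x)
```

The projection head guarantees every model (independent, global, local) emits representations of the same fixed dimension, so their pairwise distances are comparable.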

(2)客户端参与联邦学习训练。(2) The client participates in federated learning training.

通过本地单独训练出独立模型并获得表征后，将开始正式的联邦训练流程。为了保证全局模型能够进行收敛，本发明将全局模型的训练与个性化模型修正训练分离开，彼此维护两个任务。即在客户端进行个性化训练之前需要向服务器发送报告，服务器在接收到个性化申请后，对该客户端进行标记，确保该客户端不参与全局模型聚合。同样，客户端在结束该轮的联邦学习训练后，才会进行修正训练，以此来保证客户端的个性化不会对全局模型操作影响。因此在步骤(2)中，只涉及正常的联邦学习训练流程。After the independent model has been trained locally and its representation obtained, the formal federated training process begins. To guarantee that the global model can converge, the present invention separates global-model training from personalized correction training, maintaining the two as independent tasks. That is, before a client performs personalized training it must send a report to the server; upon receiving the personalization application, the server marks that client to ensure it does not take part in global model aggregation. Likewise, a client performs correction training only after finishing that round of federated learning training, guaranteeing that client personalization does not affect the global model. Step (2) therefore involves only the normal federated learning training procedure.

因此,步骤(2)包括如下子步骤:Therefore, step (2) includes the following substeps:

(2.1)客户端开始正常的联邦学习训练,根据服务器发布的全局模型以及本地数据得到本地更新。(2.1) The client starts the normal federated learning training, and gets local updates according to the global model and local data published by the server.

服务器初始化全局模型,并将全局模型广播至各个客户端。客户端接收到全局模型,利用本地数据并使用随机梯度下降执行一次或者多次迭代来更新本地模型。The server initializes the global model and broadcasts the global model to various clients. The client receives the global model, utilizes the local data and performs one or more iterations using stochastic gradient descent to update the local model.

对于本地客户端来说,联邦学习本地模型的训练目标如下:For the local client, the training objectives of the federated learning local model are as follows:

$$\min_{w_i} L_{i,t}\left(w_i\right)=\mathbb{E}_{(x,y)\sim D_i}\left[\ell_{i,t}\left(w_i;x,y\right)\right]$$

其中，L_i,t(w_i)为P_i第t轮的本地经验损失；w_i表示客户端P_i的本地模型权重；x表示D_i中的一个训练数据，y表示x对应的标签；ℓ_i,t表示客户端P_i第t轮本地常规的损失函数；E表示期望。本地训练完成后，客户端将本地模型更新上传给服务器。where L_{i,t}(w_i) is client P_i's local empirical loss in round t; w_i denotes client P_i's local model weights; x is a training sample in D_i and y is its label; ℓ_{i,t} is client P_i's ordinary local loss function in round t; and E denotes expectation. After local training finishes, the client uploads its local model update to the server.
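步骤（2.1）的本地更新过程可以示意如下。The local update above, initializing from the broadcast global model and running stochastic gradient descent on local data, can be sketched as follows. Linear regression with squared error stands in for the client's actual model and loss, and all hyperparameters are illustrative assumptions.

```python
# Sketch of step (2.1): in each round the client starts from the broadcast
# global model and performs SGD iterations on its own local data, then
# uploads the resulting local model to the server.
import numpy as np

rng = np.random.default_rng(2)

def local_update(w_global, X, y, lr=0.2, steps=100, batch=16):
    w = w_global.copy()  # initialize from the broadcast global model
    for _ in range(steps):
        idx = rng.choice(len(y), size=batch, replace=False)
        Xb, yb = X[idx], y[idx]
        grad = 2 * Xb.T @ (Xb @ w - yb) / batch  # grad of mean squared error
        w -= lr * grad
    return w  # this local update is what gets uploaded to the server

# Toy client data generated by a known linear model (noise-free for clarity).
w_true = np.array([1.0, -2.0])
X = rng.normal(size=(200, 2))
y = X @ w_true

w_global = np.zeros(2)                 # global model broadcast by the server
w_local = local_update(w_global, X, y) # client's local update for this round
```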

(2.2)训练完成后,每个客户端发送其本地模型,将本地更新上传给服务器,确保服务器能够接收到不具备个性化的本地更新。(2.2) After training is completed, each client sends its local model and uploads local updates to the server to ensure that the server can receive local updates without personalization.

(2.3)服务器聚合得到新的全局模型并进行全局广播,客户端接收到新的全局模型后从中提取出全局表征。(2.3) The server aggregates the new global model and broadcasts it globally, and the client extracts the global representation from the new global model after receiving it.

服务器接收到客户端的模型更新，采用FedAvg聚合算法形成新的全局模型。FedAvg计算客户端的局部模型更新的平均值作为全局模型更新，其中每个客户端根据其训练示例的数量进行加权。使用D=∪_{i∈[N]}D_i表示所有联邦训练的数据集，第t轮全局模型的训练目标可以表示为：The server receives the clients' model updates and forms a new global model with the FedAvg aggregation algorithm. FedAvg takes the average of the clients' local model updates as the global model update, with each client weighted by its number of training examples. Writing D = ∪_{i∈[N]} D_i for the union of all federated training data, the round-t training objective of the global model can be expressed as:

$$L_{g,t}\left(w_g\right)=\sum_{i=1}^{N}\frac{\left|D_i\right|}{\left|D\right|}L_{i,t}\left(w_i\right)$$

其中，L_i,t表示P_i第t轮的经验损失；L_g,t表示全局模型的损失；w_g为全局模型的权重；|·|表示数据集的样本数。where L_{i,t} is P_i's empirical loss in round t; L_{g,t} is the global model's loss; w_g is the global model's weights; and |·| denotes the number of samples in a dataset.

在聚合完成得到新的全局模型w_g^t后，服务器将全局模型w_g^t进行广播，发送至各个客户端。客户端根据最新的全局模型可以计算得到全局模型的表征z_glob = R_g(x)，其中x表示一个训练数据。After aggregation yields the new global model w_g^t, the server broadcasts w_g^t to every client. Each client can then compute the global model's representation z_glob = R_g(x) from this latest global model, where x is a training sample.
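FedAvg聚合步骤可以示意如下。The FedAvg aggregation step described above, a weighted average of client models with weights |D_i|/|D|, can be sketched as follows; the client weight vectors and sample counts are made-up illustrative values.

```python
# Sketch of step (2.3): FedAvg aggregation on the server. Each client model
# is weighted by its share of the total training examples, |D_i| / |D|.
import numpy as np

def fedavg(local_weights, n_samples):
    """w_g = sum_i (|D_i| / |D|) * w_i."""
    total = sum(n_samples)
    return sum((n / total) * w for w, n in zip(local_weights, n_samples))

# Three clients holding different amounts of local data.
w1 = np.array([1.0, 0.0])
w2 = np.array([0.0, 1.0])
w3 = np.array([1.0, 1.0])
w_global = fedavg([w1, w2, w3], n_samples=[100, 100, 200])
# Client 3 holds half the data, so it pulls the average toward [1, 1].
```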

(3)客户端进行个性化修正训练。(3) The client performs personalized correction training.

客户端在获得最新的全局模型后,将进行本地模型的个性化修正训练。个性化修正训练的目标是修正本地模型表征与全局模型表征以及独立模型表征之间的距离。例如客户端想让本地模型适应本地任务,则其修正训练的目标为减小本地模型学习到的表征与全局模型学习到的表征之间的距离,增加本地模型学习到的表征与独立模型学习到的表征之间的距离。After the client obtains the latest global model, it will perform personalized correction training of the local model. The goal of personalized correction training is to correct the distance between local model representations and global model representations as well as independent model representations. For example, if the client wants the local model to adapt to the local task, the goal of its correction training is to reduce the distance between the representation learned by the local model and the representation learned by the global model, and increase the representation learned by the local model and the independent model. the distance between the representations.

因此,如图2所示,步骤(3)包括如下子步骤:Therefore, as shown in Figure 2, step (3) includes the following sub-steps:

(3.1)客户端接收到最新的全局模型,并向服务器发送个性化训练的申请。服务器根据是否收到客户端的个性化申请,决定对应客户端是否进行个性化修正训练,当服务器接收到客户端的个性化申请后,判断该客户端需要进行个性化修正训练,之后客户端开启个性化修正训练。倘若未收到客户端的个性化申请则跳转到步骤(2.1)进行下一轮联邦学习训练。其中,在个性化修正训练阶段分别得到三个表征:独立模型表征、全局模型表征以及正在被更新的本地模型表征。(3.1) The client receives the latest global model and sends an application for personalized training to the server. The server decides whether to perform personalized correction training for the corresponding client according to whether it receives the personalized application from the client. When the server receives the personalized application from the client, it determines that the client needs to undergo personalized correction training, and then the client starts personalized training. Corrected training. If the personalized application from the client is not received, jump to step (2.1) for the next round of federated learning training. Among them, three representations are obtained in the personalized correction training stage: independent model representation, global model representation, and local model representation being updated.

修正训练的损失由两部分组成，第一部分是由常规的损失函数ℓ_i(w_i;x,y)计算，第二部分为模型对比损失函数ℓ_con。对于每一个输入x以及本地模型w_i，可以从独立模型中提取出表征z_ind，从全局模型中提取出表征z_glob，以及从正在训练的本地模型提取出表征z，则模型对比损失函数可以定义为：The loss for correction training consists of two parts: the first is computed by the ordinary loss function ℓ_i(w_i; x, y), and the second is the model-contrastive loss ℓ_con. For each input x and local model w_i, a representation z_ind can be extracted from the independent model, a representation z_glob from the global model, and a representation z from the local model being trained; the model-contrastive loss can then be defined as:

$$\ell_{con}=-\log\frac{\exp\left(\mathrm{sim}\left(z,z_{glob}\right)/\tau\right)}{\exp\left(\mathrm{sim}\left(z,z_{glob}\right)/\tau\right)+\exp\left(\mathrm{sim}\left(z,z_{ind}\right)/\tau\right)}$$

其中，sim(·,·)表示余弦相似度，τ为温度参数。where sim(·,·) denotes cosine similarity and τ is a temperature parameter.
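模型对比损失的计算可以示意如下。A model-contrastive loss over the three representations can be sketched as below. Since the patent's formula appears only as an image, this sketch assumes a standard softmax-style contrastive form consistent with the text's description (treating the global representation as the positive pair and the independent-model representation as the negative pair); the temperature tau and all vector values are assumptions.

```python
# Sketch of the model-contrastive loss in step (3.1): given the local
# representation z, the global representation z_glob, and the independent-
# model representation z_ind, the loss is small when z is close to z_glob
# and far from z_ind.
import numpy as np

def cos_sim(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

def model_contrastive_loss(z, z_glob, z_ind, tau=0.5):
    pos = np.exp(cos_sim(z, z_glob) / tau)  # similarity to the global model
    neg = np.exp(cos_sim(z, z_ind) / tau)   # similarity to the independent model
    return -np.log(pos / (pos + neg))

z_glob = np.array([1.0, 0.0])
z_ind = np.array([0.0, 1.0])

# A local representation near the global one incurs a smaller loss than one
# near the independent model's representation.
near_global = model_contrastive_loss(np.array([0.9, 0.1]), z_glob, z_ind)
near_indep = model_contrastive_loss(np.array([0.1, 0.9]), z_glob, z_ind)
```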

(3.2)通过三个表征计算得到模型一致性损失,与传统的训练损失一起更新本地模型,其目标是最小化下式:(3.2) The model consistency loss is calculated through three representations, and the local model is updated together with the traditional training loss. The goal is to minimize the following formula:

$$\min_{w_i}\;\mathbb{E}_{(x,y)\sim D_i}\left[\ell_i\left(w_i;x,y\right)+\mu\,\ell_{con}\left(w_i;w_g,w_{ind}^{i};x\right)\right]$$

其中，μ为控制模型对比损失权重的个性化超参数，控制本地模型的个性化程度，由客户端设定。w_g表示全局模型的权重，w_i表示本地模型的权重，w_ind^i表示独立模型的权重，x表示D_i中的一个训练数据，y表示x对应的标签。where μ is the personalization hyperparameter weighting the model-contrastive loss and controlling the local model's degree of personalization, set by the client; w_g denotes the global model's weights, w_i the local model's weights, and w_ind^i the independent model's weights; x is a training sample in D_i and y is its label.
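一次个性化修正更新可以示意如下。A single correction update against the combined objective, task loss plus mu times the contrastive loss, can be sketched as follows. To stay dependency-free, the sketch optimizes the representation directly as a stand-in for the model weights, uses finite-difference gradients, and assumes illustrative values for mu, tau, and all vectors.

```python
# Sketch of step (3.2): one gradient-descent step on
# total = task_loss + mu * contrastive_loss, with the client-set mu
# controlling the degree of personalization.
import numpy as np

def cos_sim(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)

def objective(z, z_glob, z_ind, target, mu, tau=0.5):
    task = np.sum((z - target) ** 2)          # stand-in for the task loss
    pos = np.exp(cos_sim(z, z_glob) / tau)
    neg = np.exp(cos_sim(z, z_ind) / tau)
    con = -np.log(pos / (pos + neg))          # model-contrastive loss
    return task + mu * con

def correction_step(z, z_glob, z_ind, target, mu, lr=0.05, eps=1e-5):
    grad = np.zeros_like(z)
    for k in range(len(z)):                   # finite-difference gradient
        d = np.zeros_like(z)
        d[k] = eps
        grad[k] = (objective(z + d, z_glob, z_ind, target, mu)
                   - objective(z - d, z_glob, z_ind, target, mu)) / (2 * eps)
    return z - lr * grad

z = np.array([0.5, 0.5])
z_glob = np.array([1.0, 0.0])
z_ind = np.array([0.0, 1.0])
target = np.array([0.6, 0.4])

before = objective(z, z_glob, z_ind, target, mu=5.0)
after = objective(correction_step(z, z_glob, z_ind, target, mu=5.0),
                  z_glob, z_ind, target, mu=5.0)
```

A larger mu weights representation consistency more heavily, giving the client fine-grained control over how strongly the local model is corrected.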

(3.3)重复步骤(3.1)~(3.2)，通过多次迭代修正训练，直至修正训练的总体损失更新值小于0.0001或达到预设次数时结束训练，最终得到具有个性化的联邦学习本地模型。(3.3) Repeat steps (3.1) to (3.2), iterating the correction training until the overall correction-training loss changes by less than 0.0001 between iterations or a preset number of iterations is reached; training then ends, yielding a personalized federated-learning local model.
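步骤（3.3）中的停止准则可以示意如下。The stopping rule in step (3.3) can be sketched as follows; the geometrically decaying loss trace is a made-up example used only to exercise the rule.

```python
# Sketch of step (3.3): iterate correction training until the loss update
# falls below 1e-4 or a preset round cap is reached.
def run_correction(loss_fn, max_rounds=100, tol=1e-4):
    prev = loss_fn(0)
    for t in range(1, max_rounds + 1):
        cur = loss_fn(t)
        if abs(prev - cur) < tol:
            return t, cur        # converged: loss update below the threshold
        prev = cur
    return max_rounds, prev      # stopped at the preset round cap

# Illustrative loss trace decaying geometrically toward 1.0.
rounds, final = run_correction(lambda t: 1.0 + 0.5 ** t)
```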

需要注意的是,本发明中的修正训练与联邦学习训练是相分离的,全局模型的更新也是和传统联邦学习方法一样的,因此不会对正常的联邦学习训练产生干扰。对于本地模型来说,是否进行修正训练其实都不影响全局模型的更新,只影响本地模型的个性化程度。It should be noted that the correction training in the present invention is separated from the federated learning training, and the update of the global model is the same as the traditional federated learning method, so it will not interfere with the normal federated learning training. For the local model, whether to perform revision training does not affect the update of the global model, but only affects the degree of personalization of the local model.

本发明一种实施例，在CIFAR-10十分类数据集的实验中相较于联邦平均学习算法，通过本发明修正的模型测试精度平均提高了3.1%（±0.4%），其中μ设定为5。In one embodiment of the present invention, in experiments on the ten-class CIFAR-10 dataset, the model corrected by the present invention improved test accuracy by 3.1% (±0.4%) on average compared with the federated averaging (FedAvg) algorithm, with μ set to 5.

联邦学习的目标是学习所有参与训练的数据特征，但由于数据异质性的存在，客户端上一些较为隐蔽的数据特征往往会被忽略，导致全局模型在各个客户端上的表现具有差异性。为了解决这一问题，本发明一种基于对比学习的联邦学习个性化方法，通过对比学习技术，在正常的本地训练后，加入了修正训练这一步骤。通过计算独立模型、全局模型以及正在被更新的本地模型的表征，以此对本地模型进行修正，保证其能够适应客户端不同的任务，达到联邦学习个性化定制的要求。The goal of federated learning is to learn the features of all data participating in training, but because of data heterogeneity, some of the less salient data features on a client are often ignored, so the global model performs differently across clients. To solve this problem, the present invention provides a contrastive learning-based federated learning personalization method that, by means of contrastive learning, adds a correction-training step after normal local training. By computing the representations of the independent model, the global model, and the local model being updated, the local model is corrected so that it can adapt to each client's different tasks and meet the requirement of personalized customization in federated learning.

The content described in the embodiments of this specification merely enumerates implementation forms of the inventive concept; the protection scope of the present invention shall not be regarded as limited to the specific forms stated in the embodiments, and it also extends to equivalent technical means that a person skilled in the art can conceive based on the inventive concept.

Claims (8)

1. A contrastive learning-based federated learning personalization method, characterized by comprising the following steps:
(1) training an independent model locally:
(1.1) the local client trains an independent model on the data it holds; the independent model contains the data information of a single client only and is adapted to that client's tasks;
(1.2) building a contrastive learning network and using it to extract a representation from the independent model;
(2) the client participates in federated learning training:
(2.1) the client starts normal federated learning training and computes a local update from the global model issued by the server and its local data;
(2.2) after training finishes, each client uploads its locally updated local model to the server;
(2.3) the server aggregates the uploads into a new global model and broadcasts it globally; after receiving the new global model, each client extracts a global representation from it;
(3) the client performs personalized correction training:
(3.1) after receiving the latest global model, the client starts personalized correction training and obtains three representations in the training stage: the independent model representation, the global model representation, and the representation of the local model being updated;
(3.2) computing a model consistency loss from the three representations obtained in step (3.1) and using it, together with the training loss, to update the local model;
(3.3) repeating steps (3.1) to (3.2) for iterative correction training, finally obtaining the personalized local model.
2. The contrastive learning-based federated learning personalization method of claim 1, wherein step (1.1) comprises:
there are $N$ clients, denoted $P_1,\dots,P_N$; client $P_i$ holds a local dataset $D_i$ for local individual training, and the training objective of the independent model is:

$$\min_{w_i^{ind}} L_i\big(w_i^{ind}\big) = \mathbb{E}_{(x,y)\sim D_i}\big[\ell_i\big(w_i^{ind}; x, y\big)\big]$$

where $L_i\big(w_i^{ind}\big)$ is $P_i$'s empirical loss; $w_i^{ind}$ denotes client $P_i$'s independent model weights; $x$ denotes a sample of $D_i$ and $y$ the label corresponding to $x$; $\ell_i$ denotes client $P_i$'s local loss function; $\mathbb{E}$ denotes expectation.
3. The contrastive learning-based federated learning personalization method of claim 1, wherein step (1.2) comprises:
the independent model weights are $w_i^{ind}$; $R_{w_i^{ind}}(\cdot)$ denotes the network before the independent model's output layer; feeding the trained independent model weights into the contrastive learning network yields the independent model's representation $z^{ind} = R_{w_i^{ind}}(x)$.
4. The contrastive learning-based federated learning personalization method of claim 1, wherein step (2.1) comprises:
for the local client, the training objective of the federated learning local model is:

$$\min_{w_i} L_{i,t}(w_i) = \mathbb{E}_{(x,y)\sim D_i}\big[\ell_{i,t}(w_i; x, y)\big]$$

where $L_{i,t}(w_i)$ is $P_i$'s local empirical loss in round $t$; $w_i$ denotes client $P_i$'s local model weights; $x$ denotes a sample of $D_i$ and $y$ the label corresponding to $x$; $\ell_{i,t}$ denotes client $P_i$'s local loss function in round $t$; $\mathbb{E}$ denotes expectation.
5. The contrastive learning-based federated learning personalization method of claim 1, wherein step (2.3) comprises:
the server receives the clients' model updates and forms a new global model using the FedAvg aggregation algorithm; FedAvg takes the average of the clients' local model updates as the global model update, with each client weighted by its number of training examples; with $D = \cup_{i\in[N]} D_i$ denoting the union of all federated training datasets, the round-$t$ global model training objective is:

$$\min_{w_g} L_{g,t}(w_g) = \sum_{i=1}^{N} \frac{|D_i|}{|D|}\, L_{i,t}(w_i)$$

where $L_{i,t}$ denotes $P_i$'s round-$t$ empirical loss; $L_{g,t}$ denotes the loss of the round-$t$ global model; $w_g$ is the global model weights; $|\cdot|$ denotes set cardinality;
after aggregation completes, the new global model $w_g^t$ is obtained; the server then broadcasts the global model $w_g^t$ to each client; the client computes the global model's representation $z^{glob} = R_{w_g^t}(x)$ from the latest global model.
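The example-count-weighted FedAvg aggregation described in this claim can be sketched as follows; the `fedavg` helper and the toy numbers are illustrative, not from the patent:

```python
import numpy as np

def fedavg(local_weights, num_examples):
    """New global weights = sum_i (|D_i| / |D|) * w_i: the clients' local
    weights averaged, each client weighted by its number of training examples."""
    total = sum(num_examples)
    return sum((n / total) * w for w, n in zip(local_weights, num_examples))

# Three clients with different dataset sizes |D_i|.
w_locals = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([1.0, 1.0])]
sizes = [10, 30, 60]
w_global = fedavg(w_locals, sizes)  # -> array([0.7, 0.9])
```

The client holding 60 of the 100 total examples contributes weight 0.6 to the average, which is why the aggregate leans toward its update.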
6. The contrastive learning-based federated learning personalization method of claim 1, wherein step (3.1) comprises:
the loss of the personalized correction training consists of two parts: the first part is the training loss function $\ell_{sup}(w_i; x, y)$; the second part is a model-contrastive loss function, denoted $\ell_{con}$; for each input $x$ and the local model $w_i$, the representation $z^{ind} = R_{w_i^{ind}}(x)$ is extracted from the independent model, the representation $z^{glob} = R_{w_g^t}(x)$ is extracted from the global model, and the representation $z = R_{w_i}(x)$ is extracted from the local model being trained; the model-contrastive loss function is then defined as:

$$\ell_{con} = -\log \frac{\exp\!\big(\mathrm{sim}(z, z^{ind})/\tau\big)}{\exp\!\big(\mathrm{sim}(z, z^{ind})/\tau\big) + \exp\!\big(\mathrm{sim}(z, z^{glob})/\tau\big)}$$

where $\mathrm{sim}(\cdot,\cdot)$ denotes cosine similarity and $\tau$ is a temperature parameter.
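For illustration, a model-contrastive loss over the three representations can be sketched as below. Since the claim's original formula is an equation image, this sketch assumes a MOON-style InfoNCE form: cosine similarity `sim`, a temperature `tau`, and the independent-model representation as the positive pair (with the global representation as the negative) are all assumptions, not the patent's confirmed definition:

```python
import numpy as np

def cosine_sim(a, b):
    # sim(a, b) = a.b / (|a||b|), the usual cosine similarity
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def model_contrastive_loss(z, z_ind, z_glob, tau=0.5):
    # Assumed InfoNCE form: pull the trained local representation z toward
    # the independent model's z_ind, away from the global model's z_glob.
    pos = np.exp(cosine_sim(z, z_ind) / tau)
    neg = np.exp(cosine_sim(z, z_glob) / tau)
    return float(-np.log(pos / (pos + neg)))

z      = np.array([1.0, 0.0])   # representation of the local model being trained
z_ind  = np.array([1.0, 0.1])   # close to z -> small loss
z_glob = np.array([0.0, 1.0])   # orthogonal to z
loss = model_contrastive_loss(z, z_ind, z_glob)
```

The loss shrinks as `z` aligns with `z_ind` and grows as it aligns with `z_glob`, which is the direction of the personalization pressure described in the specification.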
7. The contrastive learning-based federated learning personalization method of claim 1, wherein in step (3.1):
the server determines whether a client performs personalized correction training according to whether it has received that client's personalization application; upon receiving a client's personalization application, the server judges that the client needs personalized correction training, which then begins; if no personalization application is received from the client, the method skips to step (2.1) for the next round of federated learning training.
8. The contrastive learning-based federated learning personalization method of claim 1, wherein step (3.2) comprises:
the objective of the updated local model is:

$$\min_{w_i}\ \ell_{sup}(w_i; x, y) + \mu\, \ell_{con}\big(w_i; w_g, w_i^{ind}; x\big)$$

where $\mu$ is a personalization hyperparameter controlling the weight of the model-contrastive loss, set by the client; $w_g$ denotes the global model's weights, $w_i$ denotes the local model's weights, and $w_i^{ind}$ denotes the independent model's weights; $x$ denotes a sample of $D_i$ and $y$ the label corresponding to $x$.
CN202210956833.XA 2022-08-10 2022-08-10 Federated learning personalization method based on contrastive learning Active CN115115066B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210956833.XA CN115115066B (en) 2022-08-10 2022-08-10 Federated learning personalization method based on contrastive learning


Publications (2)

Publication Number Publication Date
CN115115066A true CN115115066A (en) 2022-09-27
CN115115066B CN115115066B (en) 2025-02-11

Family

ID=83335386

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210956833.XA Active CN115115066B (en) 2022-08-10 2022-08-10 Federated learning personalization method based on contrastive learning

Country Status (1)

Country Link
CN (1) CN115115066B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113344131A (en) * 2021-06-30 2021-09-03 商汤国际私人有限公司 Network training method and device, electronic equipment and storage medium
CN113762524A (en) * 2020-06-02 2021-12-07 三星电子株式会社 Federal learning system and method and client device
US20220114500A1 (en) * 2021-12-22 2022-04-14 Intel Corporation Mechanism for poison detection in a federated learning system
CN114529012A (en) * 2022-02-18 2022-05-24 厦门大学 Double-stage-based personalized federal learning method




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant