CN115130683A

CN115130683A - An asynchronous federated learning method and system based on a multi-agent model

Info

Publication number: CN115130683A
Application number: CN202210842680.6A
Authority: CN
Inventors: 余国先; 刘礼亮; 王峻; 郭伟
Original assignee: Shandong University
Current assignee: Shandong University
Priority date: 2022-07-18
Filing date: 2022-07-18
Publication date: 2022-09-30
Anticipated expiration: 2042-07-18
Also published as: CN115130683B

Abstract

The invention relates to the technical field of asynchronous federated learning, and provides an asynchronous federated learning method and system based on a multi-agent model, including: randomly selecting several pre-training clients in each group of clients, and obtaining each pre-training client The decision result of whether to participate in the training and uploading of the model; receive the local model obtained from the training of each group of clients participating in the training of the model and the uploaded client to obtain the group model; perform weighted aggregation on the group model to obtain the global model. It can not only solve the long waiting delay problem in synchronous federated learning, but also solve the communication bottleneck problem in fully semi-asynchronous federated learning.

Description

An asynchronous federated learning method and system based on a multi-agent model

技术领域technical field

本发明属于异步联邦学习技术领域，尤其涉及一种基于多代理模型的异步联邦学习方法及系统。The invention belongs to the technical field of asynchronous federated learning, and in particular relates to an asynchronous federated learning method and system based on a multi-agent model.

背景技术Background technique

本部分的陈述仅仅是提供了与本发明相关的背景技术信息，不必然构成在先技术。The statements in this section merely provide background information related to the present invention and do not necessarily constitute prior art.

联邦学习算法涉及数以亿计的远程设备在其设备生成的数据上进行本地训练，并在作为聚合器的中央服务器的协调下集体训练一个全局的、共享的模型。Federated learning algorithms involve hundreds of millions of remote devices training locally on data generated by their devices and collectively training a global, shared model under the coordination of a central server acting as an aggregator.

联邦平均算法的提出是联邦学习的一个开端，它打破了传统的集中式和分布式训练的固有模式，通过只传递模型梯度参数来保护数据的隐私。实验表明，这种方式能够实现灵活高效的通信，降低通信成本。但由于联邦平均算法运行时的高度理想环境需求，在现实异构客户端场景中也是存在着诸多问题，最典型的问题就是在每一轮训练中，训练较快的客户端需要等待最慢的客户端完成训练才能上传聚合更新全局模型，整个联邦学习的训练效率是由最慢的客户端决定的，这将大大的降低模型的训练效率，延长训练时间。The proposal of federated average algorithm is the beginning of federated learning, which breaks the inherent mode of traditional centralized and distributed training, and protects the privacy of data by only passing model gradient parameters. Experiments show that this method can realize flexible and efficient communication and reduce communication costs. However, due to the highly ideal environment requirements when the federated averaging algorithm runs, there are also many problems in real heterogeneous client scenarios. The most typical problem is that in each round of training, the client that trains faster needs to wait for the slowest client. The client can upload the aggregated update global model only after completing the training. The training efficiency of the entire federated learning is determined by the slowest client, which will greatly reduce the training efficiency of the model and prolong the training time.

对于像联邦平均这种同步联邦学习，是在一个高度理想的场景中进行实验的，而在现实场景中由于设备的异构性和网络的不可靠性，不可避免的会出现一些掉队者(滞后的设备或者退出训练的设备)，所以在实际场景中更多的是使用一种异步的联邦训练方式，其中服务器无需等待滞后设备进行聚合。For synchronous federated learning like federated averaging, the experiment is carried out in a highly ideal scenario, while in real scenarios, due to the heterogeneity of devices and the unreliability of the network, there will inevitably be some laggards (lag) equipment or equipment that exits training), so in practical scenarios, an asynchronous federated training method is used, in which the server does not need to wait for the lagging equipment to aggregate.

目前存在的一种指数加权平均的异步联邦学习算法，不同客户端利用自己的本地数据集进行各自的本地模型训练，一旦有客户端完成训练，立刻将模型参数发送到中心服务器，然后中心服务器立刻聚合模型参数，而不需等待其他任何边缘设备。它的核心思想在于，越晚上传的本地模型给予它的权重越低，这样一种方式能够自适应的在收敛速度和方差减少之间进行权衡。但是它还是无法解决完全异步联邦学习中的固有问题，即本地客户端频繁的与中心服务器进行通信所造成的的通信瓶颈问题。There is currently an exponentially weighted average asynchronous federated learning algorithm. Different clients use their own local data sets to train their own local models. Once a client completes the training, it immediately sends the model parameters to the central server, and then the central server immediately Aggregate model parameters without waiting for any other edge device. Its core idea is that the later the local model is transmitted, the lower the weight is given to it, which can adaptively balance the convergence speed and variance reduction. However, it still cannot solve the inherent problem in fully asynchronous federated learning, that is, the communication bottleneck caused by the frequent communication between the local client and the central server.

针对同步联邦学习中存在的训练等待延迟问题和完全异步联邦学习中的通信瓶颈问题，也有方法提出一种半异步的方式来中和存在的这两项巨大的问题，但是面对现实生活中更加复杂的异构客户端场景，还没有一种很好的方法可以在既能不损失模型精度的情况下，又能最大化的减少通信开销与训练延迟。For the training waiting delay problem in synchronous federated learning and the communication bottleneck problem in fully asynchronous federated learning, there is also a method to propose a semi-asynchronous method to neutralize these two huge problems, but in real life, it is more difficult to In complex heterogeneous client scenarios, there is no good method to minimize communication overhead and training delay without losing model accuracy.

发明内容SUMMARY OF THE INVENTION

为了解决上述背景技术中存在的技术问题，本发明提供一种基于多代理模型的异步联邦学习方法及系统，不仅能够解决同步联邦学习中的长时间的等待延迟问题，同时能够解决完全半异步联邦学习中的通信瓶颈问题。In order to solve the technical problems existing in the above background technology, the present invention provides an asynchronous federated learning method and system based on a multi-agent model, which can not only solve the long waiting delay problem in synchronous federated learning, but also solve the problem of complete semi-asynchronous federation Communication bottlenecks in learning.

为了实现上述目的，本发明采用如下技术方案：In order to achieve the above object, the present invention adopts the following technical solutions:

本发明的第一个方面提供一种基于多代理模型的异步联邦学习方法，其包括：A first aspect of the present invention provides an asynchronous federated learning method based on a multi-agent model, which includes:

在每组客户端中随机选择若干个预训练客户端，并获取每个预训练客户端是否参与模型的训练与上传的决策结果；Randomly select several pre-training clients in each group of clients, and obtain the decision result of whether each pre-training client participates in model training and uploading;

接收每组客户端中参与模型的训练与上传的客户端训练得到的本地模型，得到组模型；Receive the local model obtained from the training of the model and the uploaded client in each group of clients, and obtain the group model;

对组模型进行加权聚合得到全局模型。A weighted aggregation of the group models results in a global model.

进一步地，所述决策结果的获取方法为：Further, the acquisition method of the decision result is:

获取每个预训练客户端的状态；Get the status of each pretrained client;

将每个预训练客户端的状态输入到强化学习代理网络，得到每个预训练客户端是否参与模型的训练与上传的决策结果。The state of each pre-training client is input to the reinforcement learning agent network, and the decision result of whether each pre-training client participates in the training and uploading of the model is obtained.

进一步地，所述预训练客户端的状态包括：训练轮次索引t、客户端上的数据量大小、到第t轮为止客户端参与本地模型更新与上传的次数、到第t轮为止客户端所在的组模型的更新次数、所有预训练客户端的通信开销和所有预训练客户端的训练延迟。Further, the state of the pre-training client includes: the training round index t, the amount of data on the client, the number of times the client has participated in local model update and uploading by the t round, and the location of the client by the t round. The number of updates to the group model, the communication overhead for all pretrained clients, and the training latency for all pretrained clients.

进一步地，所述强化学习代理网络以最大化累积回报为目标。Further, the reinforcement learning agent network aims to maximize cumulative return.

进一步地，对组模型进行加权聚合时，每个组模型的权重与每一组的组模型的更新次数相关。Further, when the group models are weighted and aggregated, the weight of each group model is related to the number of updates of the group model of each group.

本发明的第二个方面提供一种基于多代理模型的异步联邦学习系统，其包括：A second aspect of the present invention provides an asynchronous federated learning system based on a multi-agent model, comprising:

客户端智能选择模块，其被配置为：在每组客户端中随机选择若干个预训练客户端，并获取每个预训练客户端是否参与模型的训练与上传的决策结果；a client intelligent selection module, which is configured to: randomly select several pre-training clients in each group of clients, and obtain the decision result of whether each pre-training client participates in the training and uploading of the model;

组内同步训练模块，其被配置为：接收每组客户端中参与模型的训练与上传的客户端训练得到的本地模型，得到组模型；An intra-group synchronous training module, which is configured to: receive a local model obtained from the training of the participating models in each group of clients and the uploaded client-side training to obtain a group model;

组间异步训练模块，其被配置为：对组模型进行加权聚合得到全局模型。An asynchronous training module between groups, which is configured to: perform weighted aggregation on the group model to obtain a global model.

进一步地，所述客户端智能选择模块，具体被配置为：Further, the client intelligent selection module is specifically configured as:

本发明的第三个方面提供一种计算机可读存储介质，其上存储有计算机程序，该程序被处理器执行时实现如上述所述的一种基于多代理模型的异步联邦学习方法中的步骤。A third aspect of the present invention provides a computer-readable storage medium on which a computer program is stored, and when the program is executed by a processor, implements the steps in the above-mentioned asynchronous federated learning method based on a multi-agent model .

本发明的第四个方面提供一种计算机设备，包括存储器、处理器及存储在存储器上并可在处理器上运行的计算机程序，所述处理器执行所述程序时实现如上述所述的一种基于多代理模型的异步联邦学习方法中的步骤。A fourth aspect of the present invention provides a computer device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, the processor implementing the above-mentioned one when executing the program Steps in an asynchronous federated learning approach based on a multi-agent model.

与现有技术相比，本发明的有益效果是：Compared with the prior art, the beneficial effects of the present invention are:

本发明提供了一种基于多代理模型的异步联邦学习方法，其引入多智能体强化学习执行高效的客户端智能选择模式，用来替代以往联邦学习方法中的随机选择策略，不仅优化了模型精度，并且大大的提升了模型训练效率，比其他较为先进的半异步联邦学习方法在训练达到指定精度时花费更少的通信开销与训练延迟，可以应用于现实生活中的普遍广泛的异构客户端机器学习模型训练场景。The invention provides an asynchronous federated learning method based on a multi-agent model, which introduces a multi-agent reinforcement learning to execute an efficient client intelligent selection mode, which is used to replace the random selection strategy in the previous federated learning method, and not only optimizes the model accuracy , and greatly improves the efficiency of model training. Compared with other more advanced semi-asynchronous federated learning methods, when the training reaches the specified accuracy, it costs less communication overhead and training delay, and can be applied to a wide range of heterogeneous clients in real life. Machine learning model training scenarios.

本发明提供了一种基于多代理模型的异步联邦学习方法，其每一个客户端有一个强化学习代理，它将根据自己的观测来决定是否参与本轮模型的训练与上传聚合，能够解决较为复杂的异构客户端场景中的机器学习模型训练问题，不仅能够解决同步联邦学习中的长时间的等待延迟问题，同时能够解决完全半异步联邦学习中的通信瓶颈问题。The present invention provides an asynchronous federated learning method based on a multi-agent model. Each client has a reinforcement learning agent, which will decide whether to participate in the training and uploading and aggregation of the current round of models according to its own observations, which can solve complex problems. The machine learning model training problem in the heterogeneous client scenario can not only solve the long waiting delay problem in synchronous federated learning, but also solve the communication bottleneck problem in fully semi-asynchronous federated learning.

附图说明Description of drawings

构成本发明的一部分的说明书附图用来提供对本发明的进一步理解，本发明的示意性实施例及其说明用于解释本发明，并不构成对本发明的不当限定。The accompanying drawings forming a part of the present invention are used to provide further understanding of the present invention, and the exemplary embodiments of the present invention and their descriptions are used to explain the present invention, and do not constitute an improper limitation of the present invention.

图1是本发明实施例一的一种基于多代理模型的异步联邦学习方法的整体流程图；1 is an overall flow chart of an asynchronous federated learning method based on a multi-agent model according to Embodiment 1 of the present invention;

图2是本发明实施例一的F-MNIST数据集上复杂场景下的精度变化图；Fig. 2 is the accuracy change diagram under the complex scene on the F-MNIST data set of Embodiment 1 of the present invention;

图3(a)是本发明实施例一的F-MNIST数据集上复杂场景下的通信开销图；Fig. 3 (a) is the communication overhead diagram under the complex scene on the F-MNIST data set of Embodiment 1 of the present invention;

图3(b)是本发明实施例一的F-MNIST数据集上复杂场景下的训练延迟图；Fig. 3 (b) is the training delay diagram under the complex scene on the F-MNIST data set according to the first embodiment of the present invention;

图4是本发明实施例一的CIFAR-10数据集上复杂场景下的精度变化图；FIG. 4 is a graph of the accuracy change in a complex scene on the CIFAR-10 dataset according to Embodiment 1 of the present invention;

图5(a)是本发明实施例一的CIFAR-10数据集上复杂场景下的通信开销图；Figure 5(a) is a communication overhead diagram in a complex scenario on the CIFAR-10 data set according to Embodiment 1 of the present invention;

图5(b)是本发明实施例一的CIFAR-10数据集上复杂场景下的训练延迟图。Fig. 5(b) is a training delay diagram in a complex scene on the CIFAR-10 dataset according to Embodiment 1 of the present invention.

具体实施方式Detailed ways

下面结合附图与实施例对本发明作进一步说明。The present invention will be further described below with reference to the accompanying drawings and embodiments.

应该指出，以下详细说明都是例示性的，旨在对本发明提供进一步的说明。除非另有指明，本文使用的所有技术和科学术语具有与本发明所属技术领域的普通技术人员通常理解的相同含义。It should be noted that the following detailed description is exemplary and intended to provide further explanation of the invention. Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.

需要注意的是，这里所使用的术语仅是为了描述具体实施方式，而非意图限制根据本发明的示例性实施方式。如在这里所使用的，除非上下文另外明确指出，否则单数形式也意图包括复数形式，此外，还应当理解的是，当在本说明书中使用术语“包含”和/或“包括”时，其指明存在特征、步骤、操作、器件、组件和/或它们的组合。It should be noted that the terminology used herein is for the purpose of describing specific embodiments only, and is not intended to limit the exemplary embodiments according to the present invention. As used herein, unless the context clearly dictates otherwise, the singular is intended to include the plural as well, furthermore, it is to be understood that when the terms "comprising" and/or "including" are used in this specification, it indicates that There are features, steps, operations, devices, components and/or combinations thereof.

术语解释：Terminology Explanation:

联邦学习算法的目标函数：The objective function of the federated learning algorithm:

其中，

是客户端k的本地经验损失，l_i(x_i,y_i；w)是数据样本{x_i,y_i}对应的损失函数值，w为需要训练的机器学习模型；K是客户端总数，D_k(k∈{1,...,K})表示存储在本地客户端k上的数据样本，n_k＝|D_k|表示客户端k上的数据样本数量；

为存储在K个客户端上的总数据样本数量；假设对于对于任何k≠k'，

in,

is the local experience loss of client k, _{li (x i , y i} _; _w ) is the loss function value corresponding to the data sample {x _i , y _i }, w is the machine learning model to be trained; K is the total number of clients , D _k (k∈{1,...,K}) represents the data samples stored on the local client k, n _k = |D _k | represents the number of data samples on the client k;

is the total number of data samples stored on K clients; suppose that for any k≠k',

联邦学习算法的最终目标：找到一个模型w_*来最小化目标函数：The ultimate goal of the federated learning algorithm: find a model w _* that minimizes the objective function:

w_*＝arg min f(w)w _* = arg min f(w)

联邦平均算法：一种在非凸设置下用同步更新方式解决联邦学习算法的最终目标中定义的优化问题的常用方法。这种方法通过在每一轮以一定的概率随机抽样客户端子集来进行模型训练，每个本地客户端使用一个优化器(比如随机梯度下降)使用自己的数据进行若干次次本地迭代。Federated Average Algorithm: A common method for solving the optimization problem defined in the ultimate goal of a federated learning algorithm with simultaneous updates in a nonconvex setting. This approach trains the model by randomly sampling a subset of clients with a certain probability at each round, each local client using an optimizer (such as stochastic gradient descent) for several local iterations with its own data.

指数加权平均的异步联邦学习算法中的全局模型的更新方式：The update method of the global model in the exponentially weighted average asynchronous federated learning algorithm:

α_t←α×s(t-τ)α _t ←α×s(t-τ)

w^t←(1-α_t)w^t-1+α_tw^new w ^t ←(1-α _t )w ^t-1 +α _t w ^new

其中，τ为最快的客户端上传更新全局模型时的轮次索引，t为当前轮次索引，α∈(0,1)为混合系数，α_t为当前轮次t动态更新后的系数，w^t-1为上一轮次训练得到的陈旧模型，w^new为当前训练得到的新的模型，w^t为加权更新得到的用于下一次训练的新的模型，s(·)为模型陈旧度函数，可以取值

Among them, τ is the round index when the fastest client uploads and updates the global model, t is the current round index, α∈(0,1) is the mixing coefficient, α _t is the dynamically updated coefficient of the current round t, w ^t-1 is the old model obtained from the previous round of training, w ^new is the new model obtained by the current training, w ^t is the new model obtained by the weighted update for the next training, s( ) is the old model degree function, which can take values

实施例一Example 1

本实施例提供了一种基于多代理模型的异步联邦学习方法，如图1所示，具体包括以下步骤：This embodiment provides an asynchronous federated learning method based on a multi-agent model, as shown in Figure 1, which specifically includes the following steps:

步骤1、客户端智能选择阶段。在每轮(第t轮)训练时，在每组客户端group_m中随机选择若干个(|P|个)预训练客户端，获取每个预训练客户端的状态，将每个预训练客户端的状态输入到强化学习代理网络，得到每个预训练客户端是否参与模型的训练与上传的决策结果。Step 1. Client intelligent selection stage. During each round (t-th round) of training, randomly select several (|P|) pre-training clients in each group of clients group _m , obtain the status of each pre-training client, and compare the status of each pre-training client The state is input to the reinforcement learning agent network, and the decision result of whether each pre-training client participates in the training and uploading of the model is obtained.

假设现在有一个联邦学习任务将所有客户端分成M组，针对其中一个组group_m进行形式化描述，其他组类似。group_m在每一次训练轮t中随机选择|P|个预训练客户端，然后|P|个客户端分别根据自身的强化学习代理决策是否参与本轮模型的训练与上传，决策后将有|P'|个客户端进行本地同步联邦训练与模型上传更新。客户端n与中心服务器每次通信(上传或下载模型)的通信开销为一个固定值B_n，客户端n的响应时间CP_n用该客户端进行一次本地训练所经历的全局轮次数量来表示，CP_n值越大，表示该客户端进行一次本地训练的时间越长，响应越慢，客户端n的训练延迟

组模型的更新需要等待组内最慢的客户端完成本地训练。Suppose there is now a federated learning task that divides all clients into M groups, and a formal description is made for one of the groups, group _m , and the other groups are similar. group _m randomly selects |P| pre-training clients in each training round t, and then |P| clients decide whether to participate in the training and uploading of the model according to their own reinforcement learning agents. After the decision, there will be | P'| clients perform local synchronous federated training and model upload updates. The communication overhead of each communication (uploading or downloading model) between client n and the central server is a fixed value B _n , and the response time CP _n of client n is represented by the number of global rounds experienced by the client for one local training , the larger the value of CP _n , the longer the local training time of the client, the slower the response, and the training delay of client n.

The update of the group model needs to wait for the slowest client in the group to finish local training.

步骤101、被选中进行预训练的客户端得到各自当前的状态：Step 101: The clients selected for pre-training obtain their respective current states:

每一个强化学习代理n的状态空间由六个部分组成，分别是：当前训练轮次索引t、对应的客户端n上的数据量大小|D_n|、到第t轮为止客户端n参与本地模型更新与上传的次数

到第t轮为止客户端n所在的group_m组模型更新次数为

所有预训练客户端的通信开销B^t＝{B_j|j∈P}、所有预训练客户端的训练延迟

The state space of each reinforcement learning agent n consists of six parts: the index t of the current training round, the size of the data on the corresponding client n |D _n |, and the client n participating in the local Number of model updates and uploads

Up to the t round, the number of model updates of group _m where client n belongs is:

Communication overhead of all pre-training clients B ^t = {B _j |j∈P}, training delay of all pre-training clients

步骤102、将当前状态输入到强化学习代理网络得到决策结果：Step 102: Input the current state into the reinforcement learning agent network to obtain the decision result:

其中，1表示该客户端参与本轮模型的训练与上传，0表示不参与。Among them, 1 indicates that the client participates in the training and uploading of the model in this round, and 0 indicates that it does not participate.

其中，强化学习代理网络以最大化累积回报为目标，具体的：Among them, the reinforcement learning agent network aims to maximize the cumulative return, specifically:

组内完成训练后上传组模型更新全局模型然后计算奖励：After completing the training in the group, upload the group model to update the global model and then calculate the reward:

其中，u为一个大于1的常数，将根据实验条件选取合适的值，acc_t表示在第t轮选取合适的客户端训练完成后更新得到的全局模型在测试集上的精度，acc_last表示最新全局模型精度，

表示第t轮智能选择的所有客户端的通信开销之和，

表示第t轮智能选择的所有客户端的训练延迟之和；Among them, u is a constant greater than 1, and an appropriate value will be selected according to the experimental conditions, acc _t represents the accuracy of the global model updated after the t-th round of selecting a suitable client for training, and acc _last represents the latest global model accuracy,

represents the sum of the communication overheads of all clients intelligently selected in the t-th round,

represents the sum of the training delays of all clients intelligently selected in round t;

强化学习代理网络将被训练从而最大化累积回报R的期望，累计回报R的描述如下：The reinforcement learning agent network will be trained to maximize the expected cumulative return R, which is described as follows:

其中，E为全局模型的总更新次数，γ∈(0,1]是未来奖励的折扣因子。where E is the total number of updates of the global model, and γ∈(0,1] is the discount factor for future rewards.

步骤2、组内同步训练阶段。接收每组客户端中参与模型的训练与上传的客户端训练得到的本地模型，得到组模型。Step 2: Intra-group synchronous training phase. Receive the local model obtained from the training of the model and the uploaded client in each group of clients to obtain the group model.

一个组内决策进行模型训练与上传的所有客户端将进行同步的联邦训练得到组模型：An intra-group decision for model training and uploading of all clients will perform synchronous federated training to get the group model:

其中，P_t'、|P_t'|、n_k、N_c、η、

分别表示在第t轮group_m智能选择的客户端子集、智能选择的客户端数量、客户端k上的数据量、P_t'中的所有数据量、学习率、客户端k的本地经验损失的梯度。Among them, P _t ', |P _t '|, n _k , N _c , η,

represent the subset of clients intelligently selected by group _m in round t, the number of intelligently selected clients, the amount of data on client k, the amount of all data in P _t ', the learning rate, and the local experience loss of client k, respectively gradient.

步骤3、组间异步训练阶段。对组模型进行加权聚合得到全局模型。对组模型进行加权聚合时，每个组模型的权重与每一组的组模型的更新次数相关。Step 3, the asynchronous training phase between groups. A weighted aggregation of the group models results in a global model. When weighted aggregation of group models is performed, the weight of each group model is related to the number of updates of the group model for each group.

假设将所有客户端分成了M组，到目前为止，每一组的更新次数分别为T₁，T₂,…,T_M，所有组的更新总数为T₁+T₂+...+T_M＝T，则加权聚合得到全局模型的描述如下所示：Assuming that all clients are divided into M groups, so far, the update times of each group are T ₁ , T ₂ ,..., T _M respectively, and the total number of updates for all groups is T ₁ +T ₂ +...+T _M = T, then the description of the global model obtained by weighted aggregation is as follows:

其中，

为group_m对应的权重，

根据这个公式相对较慢的组M+1-m的值更小，T_(M+1-m)值更大，从而被分配一个更大的权重值。in,

is the weight corresponding to group _m ,

According to this formula, the relatively slow group M+1-m has a smaller value and a larger T _(M+1-m) value, and thus is assigned a larger weight value.

本发明为了能够解决同步联邦学习中的长时间的等待延迟问题，同时能够解决完全半异步联邦学习中的通信瓶颈问题，本实施例提出了一种基于多代理模型的异步联邦学习方法(简记为MAAFL)，与其他半异步方法不同的是，本发明引入多智能体强化学习执行高效的组内客户端智能选择，每一个客户端有一个强化学习代理，它将根据自己的观测来决定是否参与本轮模型的训练与上传聚合；该发明通过在异步场景中结合多智能体强化学习执行高效的客户端智能选择策略，不仅能优化模型精度，并且大大的提升了模型训练效率，比其他较为先进的半异步联邦学习方法在训练达到指定精度时花费更少的通信开销与训练延迟。In order to solve the long waiting delay problem in the synchronous federated learning and the communication bottleneck problem in the fully semi-asynchronous federated learning, the present embodiment proposes an asynchronous federated learning method based on a multi-agent model (abbreviation is MAAFL), different from other semi-asynchronous methods, the present invention introduces multi-agent reinforcement learning to perform efficient intelligent selection of intra-group clients, each client has a reinforcement learning agent, which will decide whether to Participate in this round of model training and upload aggregation; this invention combines multi-agent reinforcement learning to execute efficient client-side intelligent selection strategies in asynchronous scenarios, which not only optimizes model accuracy, but also greatly improves model training efficiency, which is more efficient than others. Advanced semi-asynchronous federated learning methods spend less communication overhead and training latency when training to a specified accuracy.

本实施例将MAAFL和同步的联邦平均方法(简记为FedAvg)以及完全异步的异步联邦方法(简记为FedAsync)与分层的半异步方法(简记为FedAT)进行对比实验。This embodiment compares MAAFL with the synchronous federated averaging method (abbreviated as FedAvg), the fully asynchronous asynchronous federated method (abbreviated as FedAsync) and the layered semi-asynchronous method (abbreviated as FedAT).

同步的联邦平均方法(简记为FedAvg)是一种基线联邦学习方法。在每一轮中，随机抽取所有客户端的一定比例进行训练，服务器将从所选客户端接收到的权重进行平均聚合。Simultaneous federated averaging method (abbreviated as FedAvg) is a baseline federated learning method. In each round, a certain percentage of all clients are randomly selected for training, and the server averagely aggregates the weights received from the selected clients.

完全异步的异步联邦方法(简记为FedAsync)使用加权平均更新服务器全局模型的一种基线异步联邦学习方法。与之前的同步联邦学习方法不同，所有客户端同时进行训练，当服务器从任意客户端接收到权重时，立即将这些权重与当前全局模型权值加权平均得到最新的全局模型，然后与当前所有可用的客户端通信进行训练Fully Asynchronous Asynchronous Federated Approach (FedAsync for short) is a baseline asynchronous federated learning method that uses weighted average to update the global model of the server. Different from the previous synchronous federated learning method, all clients are trained at the same time. When the server receives weights from any client, it immediately weights these weights with the weights of the current global model to obtain the latest global model, which is then compared with all currently available weights. client communication for training

分层的半异步方法(简记为FedAT)一种半异步联邦学习方法，结合了同步层内训练和跨层异步训练。层内采用随机选择策略选取部分客户端进行同步联邦训练，层与层之间使用异步的方式与中心服务器通信更新全局模型。Hierarchical Semi-Asynchronous Method (FedAT for short) is a semi-asynchronous federated learning method that combines synchronous intra-layer training and cross-layer asynchronous training. In the layer, a random selection strategy is used to select some clients for synchronous federation training, and the layers communicate with the central server in an asynchronous manner to update the global model.

为了验证基于多代理模型的异步联邦学习方法的有效性，实验比较了FedAvg，FedAsync，FedAT，MAAFL训练的模型精度以及达到指定精度所花费的通信开销与训练延迟。In order to verify the effectiveness of the asynchronous federated learning method based on the multi-agent model, the experiments compare the model accuracy trained by FedAvg, FedAsync, FedAT, and MAAFL, as well as the communication overhead and training delay to achieve the specified accuracy.

在实验中，使用了三种数据集，分别是MNIST，Fashion-MNIST，CIFAR-10,MNIST是一个包含60,000个样本的训练集和10,000个样本的测试集的手写识别体数据集。每个示例都是一个28×28维度的灰度图像，与来自10个类的标签相关联；Fashion-MNIST是一个包含60,000个样本的训练集和10,000个样本的测试集的服饰图片数据集。每个示例都是一个28×28维度的灰度图像，与来自10个类的标签相关联；CIFAR-10数据集包含60000幅32×32维度的彩色图像，分为10个类，每类6000幅，有5000幅用于训练和1000幅用于测试。In the experiments, three datasets are used, namely MNIST, Fashion-MNIST, CIFAR-10, and MNIST is a handwriting recognition dataset containing a training set of 60,000 samples and a test set of 10,000 samples. Each example is a 28 × 28 dimensional grayscale image associated with labels from 10 classes; Fashion-MNIST is a dataset of clothing images with a training set of 60,000 samples and a test set of 10,000 samples. Each example is a 28×28 dimensional grayscale image associated with labels from 10 classes; the CIFAR-10 dataset contains 60,000 32×32 dimensional color images divided into 10 classes of 6000 There are 5000 images for training and 1000 images for testing.

实验分别在这三个不同数据集上对模型进行评估，都采用非独立同分布的数据划分方式。具体来说，首先根据类别标签将每个数据集的所有数据划分成200个碎片，在10个类别标签中，每个标签对应的所有数据分别划分到20个碎片，为每个本地客户端分配2个属于不同标签类别的碎片。这样划分是为了保证在100个本地客户端中，每个客户端上只有两个类别标签的数据，并且每个客户端上的数据大小不相同，从而保证了训练是在非独立同分布的数据环境中进行。Experiments evaluate the models on these three different datasets, all using non-IID data partitioning. Specifically, firstly, all data of each dataset is divided into 200 fragments according to the category labels. Among the 10 category labels, all the data corresponding to each label are divided into 20 fragments, which are allocated to each local client. 2 shards belonging to different tag categories. This division is to ensure that among the 100 local clients, each client has only data of two category labels, and the data size on each client is different, thus ensuring that the training is on non-IID data carried out in the environment.

实验将每个客户的本地数据随机分成80％的训练集和20％的测试集，对于组内同步训练，采用与联邦平均相同的抽样方法，MAAFL先随机选择一部分客户端作为预训练客户端，之后每个预训练客户端分别执行智能决策来决定是否参与本轮的训练，使用随机梯度下降(SGD)作为局部优化器。对于所有数据集，本地模型的训练配置参数统一：学习率为0.01，本地迭代次数为3，批量大小为10，对于所有算法，随机选择的预训练客户端数量设置为20。In the experiment, the local data of each client is randomly divided into 80% training set and 20% test set. For intra-group synchronous training, the same sampling method as the federated average is used. MAAFL randomly selects a part of clients as pre-training clients. Afterwards, each pre-training client performs an intelligent decision to decide whether to participate in this round of training, using stochastic gradient descent (SGD) as a local optimizer. For all datasets, the training configuration parameters for the local model are uniform: the learning rate is 0.01, the number of local iterations is 3, the batch size is 10, and the number of randomly selected pre-training clients is set to 20 for all algorithms.

实验模拟了不同的性能组，首先将所有的客户端平均划分为5组，然后为每组的客户端分别随机分配响应时间1～3轮，3～5轮，5～7轮，7～9轮，9～11轮。此外，为了模拟不稳定的网络连接，对于运行的所有测试，随机选择20个“不稳定”客户端，这些客户端在训练过程中会以一定的小概率退出训练。一旦客户退出，就不会再回来重新加入联邦训练。The experiment simulates different performance groups. First, all clients are equally divided into 5 groups, and then the clients in each group are randomly assigned response times of 1 to 3 rounds, 3 to 5 rounds, 5 to 7 rounds, and 7 to 9 rounds. rounds, 9 to 11 rounds. Furthermore, to simulate unstable network connections, for all tests run, 20 "unstable" clients are randomly selected that drop out of training with some small probability during training. Once a client quits, it never comes back to rejoin federation training.

实验中每个本地客户端与中央服务器每次通信(上传或下载模型)的通信开销是固定的，为了模拟实际场景中本地客户端的异构性，给每个客户端在固定范围内随机分配一个固定的通信开销值，不同的客户端具有不同的通信开销。总的通信开销为所有的客户端在整个训练过程中与中央服务器的通信开销总和。In the experiment, the communication overhead of each communication between each local client and the central server (uploading or downloading the model) is fixed. In order to simulate the heterogeneity of local clients in the actual scene, each client is randomly assigned a fixed range. Fixed communication cost value, different clients have different communication cost. The total communication overhead is the sum of the communication overhead between all clients and the central server during the entire training process.

实验中由于不同本地客户端的计算能力等因素的差异，从而带来每一轮本地训练模型的延迟上传。在一轮中被智能选择进行训练的本地客户端中，快的客户端需要等待慢的客户端，比如响应时间为2轮的客户端需要等待响应时间为5轮的客户端，延迟为3个全局轮次。总训练延迟就是整个训练过程中每个客户端的训练延迟的累加。In the experiment, due to the differences in the computing power of different local clients and other factors, there is a delay in uploading each round of local training models. Among the local clients that are intelligently selected for training in one round, fast clients need to wait for slow clients. For example, a client with a response time of 2 rounds needs to wait for a client with a response time of 5 rounds and a delay of 3 rounds. Global rounds. The total training latency is the accumulation of the training latency of each client during the entire training process.

在三个数据集上分别评估算法MAAFL和三个对比方法，每个算法在每个数据集上运行三次，每一次都运行至模型收敛得到最优全局模型精度，将三次最优结果平均得到最优全局模型平均精度，并且计算得到三次结果的标准差得到结果如表1所示。The algorithm MAAFL and the three comparison methods were evaluated on three datasets. Each algorithm was run three times on each dataset, each time running until the model converged to obtain the best global model accuracy, and the three best results were averaged to obtain the best global model accuracy. The average precision of the best global model, and the standard deviation of the three results obtained by calculating the results are shown in Table 1.

表1、不同算法在三个数据集上的精度表现Table 1. Accuracy performance of different algorithms on three datasets

在三个数据集上达到指定精度所花费的通信开销与训练延迟如表2、表3和表4所示。The communication overhead and training latency spent to achieve the specified accuracy on the three datasets are shown in Tables 2, 3 and 4.

表2、不同算法在MNIST数据集上达到指定精度所花费的通信开销和训练延迟Table 2. Communication overhead and training delay for different algorithms to achieve the specified accuracy on the MNIST dataset

表3、不同算法在F-MNIST数据集上达到指定精度所花费的通信开销和训练延迟Table 3. Communication overhead and training delay for different algorithms to achieve the specified accuracy on the F-MNIST dataset

表4、不同算法在CIFAR-10数据集上达到指定精度所花费的通信开销和训练延迟Table 4. Communication overhead and training delay for different algorithms to achieve the specified accuracy on the CIFAR-10 dataset

实验进一步模拟了更加复杂的场景，将最慢客户端的响应时间设置为落后21轮，得到的实验结果如图2、图3(a)、图3(b)、图4、图5(a)和图5(b)所示。The experiment further simulates a more complex scenario, setting the response time of the slowest client to be 21 rounds behind, and the experimental results obtained are shown in Figure 2, Figure 3(a), Figure 3(b), Figure 4, Figure 5(a) and shown in Figure 5(b).

从这些实验结果可以看出，在一般场景下虽然MAAFL方法整体性能比不过同为半异步方式的FedAT，但是它也有自身的优点，并且在更加复杂的场景下MAAFL在精度和训练效率上均超过FedAT。实验结果的亮点总结如下：From these experimental results, it can be seen that in general scenarios, although the overall performance of the MAAFL method is inferior to that of FedAT, which is also a semi-asynchronous method, it also has its own advantages, and in more complex scenarios, MAAFL surpasses both in accuracy and training efficiency. FedAT. The highlights of the experimental results are summarized as follows:

(1)相比于FedAvg，MAAFL能够很好地解决FedAvg中存在的训练延迟问题，这在现实异构场景中具有非常重要的意义，避免了客户端之间的长时间等待过程；(1) Compared with FedAvg, MAAFL can well solve the training delay problem in FedAvg, which is of great significance in realistic heterogeneous scenarios and avoids the long waiting process between clients;

(2)相比于FedAsync，MAAFL避免了通信瓶颈问题，解决了完全异步方法中频繁的与中心服务器通信所造成的通信瓶颈问题；(2) Compared with FedAsync, MAAFL avoids the communication bottleneck problem and solves the communication bottleneck problem caused by frequent communication with the central server in the completely asynchronous method;

(3)与同为半异步方法的FedAT相比，MAAFL在复杂场景中更能凸显其优势，不仅在精度是超过FedAT，而且在两个较复杂的数据集上达到指定精度时比FedAT平均最高减少44％的训练延迟以及36％的通信开销。(3) Compared with FedAT, which is also a semi-asynchronous method, MAAFL can highlight its advantages in complex scenes, not only in accuracy exceeding FedAT, but also in two more complex datasets when the specified accuracy is achieved. The average is the highest than FedAT Reduce training latency by 44% and communication overhead by 36%.

本实施例首先将所有参加联邦训练的客户端根据它们的响应时间分组，组内进行同步的联邦训练得到组模型，组与组之间通过异步加权的方式更新组模型，并且引入了多智能体强化学习，每一个客户端有一个强化学习代理，它们将根据自己的观测来决定是否参与本轮模型的训练。本发明能够解决较为复杂的异构客户端场景中的机器学习模型训练问题，不仅能够解决同步联邦学习中的长时间的等待延迟问题，同时能够解决完全半异步联邦学习中的通信瓶颈问题，此外，本发明引入多智能体强化学习执行高效的客户端智能选择模式用来替代以往联邦学习方法中的随机选择策略，不仅优化了模型精度，并且大大的提升了模型训练效率，比其他较为先进的半异步联邦学习方法在训练达到指定精度时花费更少的通信开销与训练延迟，可以应用于现实生活中的普遍广泛的异构客户端机器学习模型训练场景。In this embodiment, all clients participating in federated training are firstly grouped according to their response time, and the federated training is performed synchronously within the group to obtain a group model, and the group model is updated by asynchronous weighting between groups, and multi-agent In reinforcement learning, each client has a reinforcement learning agent, and they will decide whether to participate in the current round of model training based on their own observations. The invention can solve the problem of machine learning model training in a relatively complex heterogeneous client scene, not only can solve the long waiting delay problem in the synchronous federated learning, but also can solve the communication bottleneck problem in the completely semi-asynchronous federated learning, in addition , the present invention introduces a multi-agent reinforcement learning efficient client-side intelligent selection mode to replace the random selection strategy in the previous federated learning method, which not only optimizes the model accuracy, but also greatly improves the model training efficiency. The semi-asynchronous federated learning method spends less communication overhead and training delay when training reaches the specified accuracy, and can be applied to a wide range of heterogeneous client machine learning model training scenarios in real life.

实施例二Embodiment 2

本实施例提供了一种基于多代理模型的异步联邦学习系统，其具体包括如下模块：This embodiment provides an asynchronous federated learning system based on a multi-agent model, which specifically includes the following modules:

其中，客户端智能选择模块，具体被配置为：Among them, the client intelligent selection module is specifically configured as:

其中，对组模型进行加权聚合时，每个组模型的权重与每一组的组模型的更新次数相关。Wherein, when weighted aggregation is performed on the group models, the weight of each group model is related to the number of updates of the group model of each group.

此处需要说明的是，本实施例中的各个模块与实施例一中的各个步骤一一对应，其具体实施过程相同，此处不再累述。It should be noted here that each module in this embodiment corresponds to each step in Embodiment 1 one by one, and the specific implementation process thereof is the same, which is not repeated here.

实施例三Embodiment 3

本实施例提供了一种计算机可读存储介质，其上存储有计算机程序，该程序被处理器执行时实现如上述实施例一所述的一种基于多代理模型的异步联邦学习方法中的步骤。This embodiment provides a computer-readable storage medium on which a computer program is stored, and when the program is executed by a processor, implements the steps in the multi-agent model-based asynchronous federated learning method described in the first embodiment above .

实施例四Embodiment 4

本实施例提供了一种计算机设备，包括存储器、处理器及存储在存储器上并可在处理器上运行的计算机程序，所述处理器执行所述程序时实现如上述实施例一所述的一种基于多代理模型的异步联邦学习方法中的步骤。This embodiment provides a computer device, including a memory, a processor, and a computer program stored in the memory and running on the processor, where the processor implements the one described in the first embodiment when the processor executes the program. Steps in an asynchronous federated learning approach based on a multi-agent model.

本领域内的技术人员应明白，本发明的实施例可提供为方法、系统、或计算机程序产品。因此，本发明可采用硬件实施例、软件实施例、或结合软件和硬件方面的实施例的形式。而且，本发明可采用在一个或多个其中包含有计算机可用程序代码的计算机可用存储介质(包括但不限于磁盘存储器和光学存储器等)上实施的计算机程序产品的形式。As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the invention may take the form of a hardware embodiment, a software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media having computer-usable program code embodied therein, including but not limited to disk storage, optical storage, and the like.

本发明是参照根据本发明实施例的方法、设备(系统)、和计算机程序产品的流程图和/或方框图来描述的。应理解可由计算机程序指令实现流程图和/或方框图中的每一流程和/或方框、以及流程图和/或方框图中的流程和/或方框的结合。可提供这些计算机程序指令到通用计算机、专用计算机、嵌入式处理机或其他可编程数据处理设备的处理器以产生一个机器，使得通过计算机或其他可编程数据处理设备的处理器执行的指令产生用于实现在流程图一个流程或多个流程和/或方框图一个方框或多个方框中指定的功能的装置。The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each process and/or block in the flowchart illustrations and/or block diagrams, and combinations of processes and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general purpose computer, special purpose computer, embedded processor or other programmable data processing device to produce a machine such that the instructions executed by the processor of the computer or other programmable data processing device produce Means for implementing the functions specified in a flow or flow of a flowchart and/or a block or blocks of a block diagram.

这些计算机程序指令也可存储在能引导计算机或其他可编程数据处理设备以特定方式工作的计算机可读存储器中，使得存储在该计算机可读存储器中的指令产生包括指令装置的制造品，该指令装置实现在流程图一个流程或多个流程和/或方框图一个方框或多个方框中指定的功能。These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory result in an article of manufacture comprising instruction means, the instructions The apparatus implements the functions specified in the flow or flow of the flowcharts and/or the block or blocks of the block diagrams.

这些计算机程序指令也可装载到计算机或其他可编程数据处理设备上，使得在计算机或其他可编程设备上执行一系列操作步骤以产生计算机实现的处理，从而在计算机或其他可编程设备上执行的指令提供用于实现在流程图一个流程或多个流程和/或方框图一个方框或多个方框中指定的功能的步骤。These computer program instructions can also be loaded on a computer or other programmable data processing device to cause a series of operational steps to be performed on the computer or other programmable device to produce a computer-implemented process such that The instructions provide steps for implementing the functions specified in the flow or blocks of the flowcharts and/or the block or blocks of the block diagrams.

本领域普通技术人员可以理解实现上述实施例方法中的全部或部分流程，是可以通过计算机程序来指令相关的硬件来完成，所述的程序可存储于一计算机可读取存储介质中，该程序在执行时，可包括如上述各方法的实施例的流程。其中，所述的存储介质可为磁碟、光盘、只读存储记忆体(Read-Only Memory，ROM)或随机存储记忆体(RandomAccessMemory，RAM)等。Those of ordinary skill in the art can understand that all or part of the processes in the methods of the above embodiments can be implemented by instructing relevant hardware through a computer program, and the program can be stored in a computer-readable storage medium. During execution, the processes of the embodiments of the above-mentioned methods may be included. The storage medium may be a magnetic disk, an optical disk, a read-only memory (Read-Only Memory, ROM), or a random access memory (Random Access Memory, RAM) or the like.

以上所述仅为本发明的优选实施例而已，并不用于限制本发明，对于本领域的技术人员来说，本发明可以有各种更改和变化。凡在本发明的精神和原则之内，所作的任何修改、等同替换、改进等，均应包含在本发明的保护范围之内。The above descriptions are only preferred embodiments of the present invention, and are not intended to limit the present invention. For those skilled in the art, the present invention may have various modifications and changes. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention shall be included within the protection scope of the present invention.

Claims

1. an asynchronous federated learning method based on a multi-agent model is characterized in that, comprising:

Randomly select several pre-training clients in each group of clients, and obtain the decision result of whether each pre-training client participates in model training and uploading;

Receive the local model obtained from the training of the model and the uploaded client in each group of clients, and obtain the group model;

A weighted aggregation of the group models results in a global model.

2. a kind of asynchronous federated learning method based on multi-agent model as claimed in claim 1 is characterized in that, the acquisition method of described decision result is:

Get the status of each pretrained client;

The state of each pre-training client is input to the reinforcement learning agent network, and the decision result of whether each pre-training client participates in the training and uploading of the model is obtained.

3. The asynchronous federated learning method based on a multi-agent model according to claim 2, wherein the state of the pre-training client comprises: the training round index t, the data size on the client, the The number of times the client participated in local model updates and uploads until the t round, the number of updates of the group model the client was in until the t round, the communication overhead of all pre-trained clients, and the training delay of all pre-trained clients.

4 . The asynchronous federated learning method based on a multi-agent model according to claim 2 , wherein the reinforcement learning agent network aims at maximizing cumulative returns. 5 .

5. A kind of asynchronous federated learning method based on multi-agent model as claimed in claim 1, is characterized in that, when group model is weighted and aggregated, the weight of each group model is related to the update times of the group model of each group .

6. An asynchronous federated learning system based on a multi-agent model, comprising:

a client intelligent selection module, which is configured to: randomly select several pre-training clients in each group of clients, and obtain the decision result of whether each pre-training client participates in the training and uploading of the model;

An intra-group synchronous training module, which is configured to: receive a local model obtained from the training of the participating models in each group of clients and the uploaded client-side training to obtain a group model;

An asynchronous training module between groups, which is configured to: perform weighted aggregation on the group model to obtain a global model.

7. A kind of asynchronous federated learning system based on multi-agent model as claimed in claim 6, it is characterised in that the client intelligent selection module is specifically configured as:

Get the status of each pretrained client;

8. A multi-agent model-based asynchronous federated learning system as claimed in claim 6, characterized in that, when the group model is weighted and aggregated, the weight of each group model is related to the number of updates of the group model of each group .

9. A computer-readable storage medium on which a computer program is stored, characterized in that, when the program is executed by a processor, a multi-agent model-based asynchronous system as claimed in any one of claims 1-5 is implemented Steps in a federated learning approach.

10. A computer device, comprising a memory, a processor and a computer program stored on the memory and running on the processor, wherein the processor implements any of claims 1-5 when the processor executes the program. Steps in an asynchronous federated learning method based on a multi-agent model described in a paper.