CN114819181A - Multi-objective federated learning evolution method based on improved NSGA-III - Google Patents
Multi-objective federated learning evolution method based on improved NSGA-III
- Publication number: CN114819181A
- Application number: CN202210396629.7A
- Authority: CN (China)
- Prior art keywords: iii, objective, federated learning, nsga, model
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06N20/00 — Machine learning
- G06N3/045 — Neural networks; combinations of networks
- G06N3/048 — Neural networks; activation functions
- G06N3/08 — Neural networks; learning methods
- G06N3/086 — Learning methods using evolutionary algorithms, e.g. genetic algorithms or genetic programming
Abstract
The invention belongs to the field of artificial intelligence and discloses a multi-objective federated learning evolution method based on an improved NSGA-III, comprising the following steps: acquiring learning data; constructing a federated learning multi-objective optimization model; performing fast greedy initialization and multi-objective evaluation; performing non-dominated sorting; iterating, where each iteration applies selection, crossover, and mutation operators to produce an offspring population Q_t, evaluates the offspring by federated learning training to compute the three objectives of each individual, merges the parent and offspring populations into R_t = Q_t + P_t, performs non-dominated sorting on R_t, and selects P_{t+1} by reference points; and finding the Pareto optimal solutions and outputting the labeling results corresponding to them. Compared with classical algorithms, the invention obtains better Pareto solutions, reducing the communication cost and the variance of the global model accuracy distribution while maintaining the global model accuracy.
Description
Technical Field
The invention belongs to the technical field of artificial intelligence and in particular relates to a multi-objective federated learning evolution method based on an improved NSGA-III.
Background
The rapid development of artificial intelligence has brought great convenience to society, but it has also introduced hidden dangers such as data silos and privacy leakage. Traditional centralized machine learning must gather scattered data together for training, yet in many fields data are difficult to aggregate; for example, hospitals can rarely share data, creating a serious "data silo" problem. In addition, privacy leakage has become prominent, public awareness of privacy protection has grown, and countries around the world have introduced privacy protection laws and regulations.
Federated learning has therefore emerged as a viable solution to data silos and privacy leakage. It can train a good global model while keeping the data local to the participants. In each round, every participant downloads the current global model from the server, trains it on local data, and uploads the trained local model back to the server, which aggregates the models into an updated global model; after multiple rounds of iteration, a global model with good performance is obtained.
However, traditional federated learning still faces the challenges of high communication cost and structural heterogeneity. Parameter transmission between the server and the participants consumes substantial communication resources; meanwhile, because participants differ in computing and storage capability and in network environment, they may go offline during training or lose transmitted model parameters, which harms the efficiency, accuracy, and fairness of federated learning. Much of the literature addresses communication cost or structural heterogeneity in isolation, but few works consider these problems jointly.
To address these problems, the invention jointly considers communication cost and structural heterogeneity and studies the trade-off among model effectiveness, fairness, and communication cost. Federated learning is first defined as a three-objective optimization model that simultaneously maximizes the global model accuracy and minimizes both the variance of the global model accuracy distribution and the communication cost. Building on the training characteristics of federated learning, the initialization of the third-generation non-dominated sorting genetic algorithm (NSGA-III) is improved, yielding a fast-greedy-initialization NSGA-III for multi-objective federated learning (FNSGA-III). Experimental results show that FNSGA-III can balance the three objectives, effectively reducing the communication cost and the variance of the participants' accuracy distribution, so that accuracy is distributed more evenly across participants, without seriously degrading the overall performance of the FL model. The main contributions of the invention are:
(1) To the best of the authors' knowledge, this is the first work to jointly consider maximizing the global model accuracy, minimizing the variance of the global model accuracy distribution, and minimizing the communication cost, and to construct a federated learning multi-objective optimization model accordingly.
(2) The FNSGA-III algorithm is proposed. To make NSGA-III converge quickly to high-quality solutions and better suit the solution of the federated learning multi-objective optimization model, an initial-solution construction algorithm based on fast greedy initialization is proposed, and binary and real-valued encoding and decoding strategies are introduced to accelerate the evolution of NSGA-III.
(3) Experiments on the MNIST dataset verify that the Pareto solutions obtained by FNSGA-III are better than those of NSGA-III: the hypervolume (HV) of FNSGA-III reaches up to 127.55% of that of NSGA-III, and its running time in the best case is 73.96% of NSGA-III's. FNSGA-III is also compared with the classical evolutionary algorithms NSGA-II and SPEA2, and the results show that its Pareto solutions are of higher quality. Finally, selected Pareto solutions are used in federated learning experiments; they effectively reduce the communication cost and the variance of the global model accuracy distribution while maintaining the global model accuracy.
Federated learning has received extensive attention in recent years. McMahan first proposed the concept of federated learning and the federated averaging algorithm (FedAvg) in 2016, which is of great practical significance for privacy-preserving machine learning under data silos. Research on federated learning continues to deepen, but challenges such as high communication cost and structural heterogeneity remain.
To make federated learning applicable to massive data, its communication overhead must be reduced. The FedAvg algorithm of McMahan et al. reduces the number of global communication rounds by increasing the amount of local training per round, thereby improving communication efficiency. Other researchers reduce the transmitted volume by shrinking the parameters uploaded by participants. Chen et al. proposed a layered asynchronous update algorithm that divides deep-neural-network parameters into shallow and deep layers: in the early global communication rounds only shallow parameters are exchanged between the local participants and the server, and the deep parameters of the global model are transmitted and aggregated only in the last few rounds. This reduces communication overhead by shrinking the transmitted parameters and lowering the update frequency of the deep parameters, at the cost of some model accuracy. Zhu et al. introduced the sparse evolutionary training (SET) algorithm into federated learning; SET controls the connection sparsity between fully connected layers of the neural network through a sparsity parameter, reducing the size of the transmitted model parameters and thus effectively lowering the communication cost.
Besides communication cost, structural heterogeneity is another major obstacle to optimizing federated learning. Because participants differ in computing and storage capability and network environment, they may go offline during training or lose transmitted model parameters. To strengthen the robustness of federated learning, researchers have studied structural heterogeneity from several angles. Hao et al. designed a secure aggregation protocol that lets participants drop out at any time as long as enough participants remain for the federated update, improving the fault tolerance and robustness of the system. Others have studied how to allocate heterogeneous device resources reasonably; Kang et al. accounted for differences in participant cost to incentivize more high-quality participants to join federated training. Li et al. used the variance of global model accuracy as a fairness measure and designed the q-FFL (q-Fair Federated Learning) optimization algorithm, which increases the aggregation weight of high-loss participants; experiments show that it raises the accuracy of low-accuracy participants, balances the performance distribution across participants, and promotes fair resource allocation in federated learning.
The above studies target either communication cost or structural heterogeneity alone and optimize federated learning to different degrees and for different goals, but practical federated learning usually imposes requirements on model accuracy, fairness, and communication cost simultaneously. To balance multiple objectives within the federated learning framework, some researchers have combined intelligent optimization algorithms with federated learning. Zhu et al. defined federated learning as a bi-objective optimization problem, minimizing the model test error rate and the communication cost, and used the NSGA-II (non-dominated sorting genetic algorithm II) algorithm to optimize the neural network structure parameters of federated learning; compared with the standard FedAvg algorithm, the evolved Pareto solutions improve model performance and communication efficiency to a certain extent, but that work ignores situations such as communication instability and unbalanced accuracy distribution across participants caused by structural heterogeneity, and NSGA-II scales poorly to federated learning models with more objectives. Basheer et al. used particle swarm optimization to tune the number of hidden layers, the number of neurons, and the number of global communication rounds of the neural network in federated learning, but the optimization was single-objective and did not jointly consider the other goals of federated learning.
Summary of the Invention
In view of the above problems, the invention jointly considers communication cost and structural heterogeneity and introduces fairness as an optimization objective, in order to explore the multi-objective trade-off among model accuracy, fairness, and communication cost in federated learning. The communication environment is set to be unstable in the experiments, strengthening the robustness of the algorithm.
The multi-objective federated learning evolution method based on the improved NSGA-III proposed by the invention is applied to a server and multiple participants and comprises the following steps:
acquiring learning data, the learning data being used for labeling;
constructing a federated learning multi-objective optimization model, the multi-objective optimization model comprising three objectives: maximizing the global model accuracy, minimizing the variance of the global model accuracy distribution, and minimizing the communication cost;
performing fast greedy initialization of the population P_{t=0} and multi-objective evaluation using the FedAvg algorithm;
performing non-dominated sorting on P_t;
iterating, with each iteration performing the following operations: applying selection, crossover, and mutation operators to produce Q_t; performing federated learning training evaluation on the offspring population and computing the three objectives of each individual; merging the parent and offspring populations into R_t = Q_t + P_t; performing non-dominated sorting on R_t and selecting P_{t+1} by reference points;
finding the Pareto optimal solutions and outputting the labeling results corresponding to the Pareto optimal solutions.
Further, in the federated learning evolution process, the loss function of the k-th participant, which owns dataset D_k, is:

$$L_k(w) = \frac{1}{n_k} \sum_{i \in D_k} l_i(w)$$

The global goal of the federated learning evolution method is to minimize the following global loss function L(w):

$$\min_{w} L(w) = \sum_{k=1}^{K} \frac{n_k}{n} L_k(w)$$

where k is the index of a participant, L_k(w) is the loss function of the k-th participant, l_i(w) is the loss on data sample i, n_k = |D_k| is the size of participant k's dataset D_k, and n is the total number of data samples across the K participants.
Further, in each training round of the federated learning process, each participant receives the global model w_t from the server and trains it on local data, obtaining an updated local model $w_{t+1}^k$; the participant then sends the updated local model to the server, which aggregates the models according to certain rules into a new global model w_{t+1} for the next round of iterative training, where the subscript t denotes the communication round of federated learning.
Further, the three-objective optimization model of federated learning evolution is:

$$\min F(v) = \big(f_1(v),\, f_2(v),\, f_3(v)\big), \quad v = \{Conv, kc, ks, L, N, \varepsilon, \eta, C\}$$

where F(v) is the objective function of the three-objective optimization model. The model has three minimization objectives: the global model test error rate f_1, the variance of the global model accuracy distribution f_2, and the communication cost f_3. Conv is the number of convolutional layers, kc the number of convolution kernels, ks the kernel size, L the number of fully connected layers, N the number of neurons per fully connected layer, η the learning rate, and ε the connectivity parameter of the neural network. The number of connections between two fully connected layers is determined by ε, with the total number of connections $n = \varepsilon(n_k + n_{k-1})$, where n_k and n_{k-1} are the numbers of neurons in layers k and k-1, respectively.
Further, for objective f_1, the global model test error rate is E = 1 - A, where A is the average test accuracy of the global model, $A = \frac{1}{K}\sum_{k=1}^{K} a_k$, and {a_1, a_2, ..., a_K} are the accuracies of the individual participants.
Further, objective f_2 is the variance of the global model accuracy distribution, $f_2 = \frac{1}{K}\sum_{k=1}^{K}(a_k - A)^2$.
Further, objective f_3 can be expressed as $f_3 = \frac{K C \sigma}{K} = C\,\sigma$, where K is the total number of participants, C is the fraction of participants taking part in each round, and σ is the size of the model parameters.
Further, the process of training and evaluating the offspring population by federated learning is implemented by a FedAvg algorithm based on static SET and specifically comprises the following steps:
i is an individual of the population in the FNSGA-III algorithm and P is the population size; after decoding individual i, the relevant federated learning neural network hyperparameters, the connectivity of the neural network, and the per-round participation fraction C_i of the participants are obtained;
initializing a static SET topology with the connectivity parameter ε_i and using the static SET topology as the global model in the algorithm;
in each training round, training the local data using mini-batch stochastic gradient descent;
after a certain number of rounds, computing the three objectives: the global model test error rate, the variance of the global model accuracy distribution, and the communication cost.
Further, the number of convolutional layers, the number of convolution kernels, the kernel size, the number of fully connected layers, and the number of neurons per fully connected layer of the MLP and CNN, together with the SET parameter ε, are binary-encoded, while the learning rate η and the per-round participation fraction C are real-value encoded.
Further, the learning data is the MNIST dataset.
The beneficial effects of the invention are as follows:
The invention proposes the FNSGA-III algorithm to solve the multi-objective federated learning model and verifies it experimentally under unstable communication. We first construct the three-objective model of federated learning, setting the optimization objectives to minimizing the global model test error rate, the communication cost, and the variance of the global model accuracy distribution, with the hyperparameters of the neural network and the federated learning parameters as decision variables. The NSGA-III algorithm is introduced to solve the federated learning multi-objective model and its initialization is changed; the experimental results show that the improved FNSGA-III algorithm outperforms the original NSGA-III. Moreover, compared with the baseline federated averaging algorithm, the Pareto optimal solutions obtained by FNSGA-III effectively improve the global model accuracy and reduce the variance of the global model accuracy distribution and the communication cost.
Brief Description of the Drawings
Fig. 1: the federated learning training process;
Fig. 2: an example of chromosome encoding for the MLP model of the invention;
Fig. 3: an example of chromosome encoding for the CNN model of the invention;
Fig. 4: the flowchart of the algorithm of the invention;
Fig. 5: experimental comparison of the invention and NSGA-III with MLP on IID data;
Fig. 6: experimental comparison of the invention and NSGA-III with MLP on non-IID data;
Fig. 7: experimental comparison of the invention and NSGA-III with CNN on IID data;
Fig. 8: experimental comparison of the invention and NSGA-III with CNN on non-IID data;
Fig. 9: Pareto optimal solutions of the invention, NSGA-II, and SPEA2.
Detailed Description
The invention is further described below with reference to the accompanying drawings, without limiting the invention in any way; any transformation or substitution based on the teaching of the invention falls within the protection scope of the invention.
To achieve this purpose, the invention adopts the technical solution comprising the steps described above. To make the technical solutions and beneficial effects of the invention clearer, the invention is further described below with practical examples. It should be understood that the specific embodiments described here are only intended to explain the invention and not to limit it.
Embodiment
Federated learning is a privacy-preserving machine learning technique that enables participants to train a global model jointly without uploading local private data to a server. Suppose there are K participants holding data {D_1, D_2, ..., D_K}; traditional centralized learning gathers all the data together and trains the model on D = D_1 ∪ D_2 ∪ ... ∪ D_K.
In the federated learning process, the loss function of the k-th participant, which owns dataset D_k, is:

$$L_k(w) = \frac{1}{n_k} \sum_{i \in D_k} l_i(w) \tag{1}$$

The global goal of federated learning is to minimize the global loss function L(w):

$$\min_{w} L(w) = \sum_{k=1}^{K} \frac{n_k}{n} L_k(w) \tag{2}$$

In equations (1) and (2), k is the index of a participant, L_k(w) is the loss function of the k-th participant, l_i(w) is the loss on data sample i, n_k = |D_k| is the size of participant k's dataset D_k, and n is the total number of data samples across the K participants. Federated learning optimizes the global loss function L(w) by minimizing the weighted average of the participant loss functions L_k(w). Federated learning is a collaborative process, as shown in Fig. 1.
In each training round, each participant receives the global model w_t from the server and trains it on local data, obtaining an updated local model $w_{t+1}^k$; the participant then sends the updated local model to the server, which aggregates the models according to certain rules into a new global model w_{t+1} for the next round of iterative training. The subscript t denotes the communication round of federated learning.
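To make the aggregation rule concrete, the following minimal Python sketch implements one FedAvg-style communication round under the weighting of equation (2); the flat-array weight representation and the `local_update` callback are illustrative assumptions, not the patent's implementation.

```python
import numpy as np

def fedavg_round(global_w, clients, local_update):
    """One communication round: clients is a list of (data_k, n_k) pairs;
    local_update trains a copy of the global weights on one client's data."""
    n_total = sum(n_k for _, n_k in clients)
    new_w = np.zeros_like(global_w)
    for data_k, n_k in clients:
        w_k = local_update(global_w.copy(), data_k)  # local model w_{t+1}^k
        new_w += (n_k / n_total) * w_k               # weight by dataset size n_k/n
    return new_w                                     # new global model w_{t+1}
```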
Third-Generation Non-dominated Sorting Genetic Algorithm (NSGA-III)

There are many multi-objective evolutionary optimization algorithms based on genetic algorithms and Pareto optimality, such as the second-generation non-dominated sorting genetic algorithm (NSGA-II), the multi-objective evolutionary algorithm based on decomposition (MOEA/D), SPEA2 (Strength Pareto Evolutionary Algorithm 2), and PAES (Pareto archived evolution strategy). NSGA-II is a powerful and robust multi-objective evolutionary algorithm suitable for problems with two or three objectives. For more than three objectives, newer evolutionary algorithms can be used, such as the reference-point-based third-generation non-dominated sorting genetic algorithm (NSGA-III), which outperforms NSGA-II on optimization problems with four or more objectives. The invention defines federated learning as a three-objective optimization problem; to keep the algorithm scalable in the number of objectives, for example when the federated learning objectives are extended to four or more, the invention adopts the NSGA-III algorithm, whose basic steps can be outlined as follows:
Step 1: Initialize the reference points and the parent population P_t (of size N); perform non-dominated sorting on the individuals of the population, normalize the individuals, and associate them with the reference points.
Step 2: Apply selection, crossover, and mutation operators to P_t to create an offspring population Q_t of the same size as the parent population P_t.
Step 3: Merge P_t and Q_t into a new population R_t of size 2N. Perform non-dominated sorting on the merged population, partitioning it into non-dominated fronts (F_1, F_2, ..., F_s), then normalize the individuals and associate them with the reference points.
Step 4: Select N solutions from the sorted population R_t to form the next-generation parent population P_{t+1}. If the number of solutions in the selected non-dominated fronts exceeds N, solutions in the last front under consideration are chosen by the reference-point-based selection method.
Step 5: Go to Step 2 and repeat the whole process until the preset stopping condition is met, then output the Pareto optimal solutions.
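For readers who want to experiment with these steps, the sketch below runs reference-point-based NSGA-III through the pymoo library (assuming pymoo >= 0.6); the DTLZ2 benchmark merely stands in for the far more expensive federated learning evaluation used in the invention.

```python
from pymoo.algorithms.moo.nsga3 import NSGA3
from pymoo.util.ref_dirs import get_reference_directions
from pymoo.problems import get_problem
from pymoo.optimize import minimize

# Das-Dennis reference directions for a 3-objective problem
ref_dirs = get_reference_directions("das-dennis", 3, n_partitions=12)
algorithm = NSGA3(pop_size=20, ref_dirs=ref_dirs)
res = minimize(get_problem("dtlz2"), algorithm, ("n_gen", 20), seed=1)
print(res.F)  # objective vectors of the final non-dominated set
```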
The invention constructs the three-objective optimization model of federated learning on the basis of a typical multi-objective optimization model and elaborates its objectives, decision variables, and variable encoding. The three-objective optimization model of the federated learning of the invention is as follows.
(1) Objective function:

$$\min F(v) = \big(f_1(v),\, f_2(v),\, f_3(v)\big)$$

F(v) is the objective function of the model. The model has three minimization objectives: the global model test error rate f_1, the variance of the global model accuracy distribution f_2, and the communication cost f_3.
In the invention, the three federated learning objectives are evaluated as follows. The FedAvg algorithm combined with the SET algorithm is trained for a certain number of communication rounds, and the trained global model w is tested to obtain the accuracies {a_1, a_2, ..., a_K} of the individual participants. The average test accuracy of the global model is $A = \frac{1}{K}\sum_{k=1}^{K} a_k$, from which the objective f_1, the global model test error rate, is computed as E = 1 - A.
The objective f_2 is the variance of the global model accuracy distribution, $f_2 = \frac{1}{K}\sum_{k=1}^{K}(a_k - A)^2$. Variance can be regarded as a measure of fairness; taking fairness as an optimization objective helps avoid situations where the average accuracy is high but individual participants have no accuracy guarantee. Reducing the variance of the global model accuracy distribution across participants makes the post-aggregation accuracy distribution more uniform and fair.
The objective f_3 is the average communication cost of the participants. Assuming that the communication cost of each participant depends only on the size of the model parameters it transmits, and since the neural network structure used by the invention does not change during federated training, the model parameter size σ is the same for all participants and stays constant; f_3 can thus be expressed as $f_3 = \frac{K C \sigma}{K} = C\,\sigma$, where K is the total number of participants, C is the fraction of participants taking part in each round, and σ is the model parameter size.
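As a hedged illustration of the three objectives just defined, the following sketch computes f_1, f_2, and f_3 from the per-participant test accuracies; all names are illustrative rather than the patent's own code.

```python
import numpy as np

def three_objectives(acc, C, sigma):
    """acc: test accuracies {a_1, ..., a_K}; C: per-round participation
    fraction; sigma: size of the transmitted model parameters."""
    A = float(np.mean(acc))
    f1 = 1.0 - A              # global model test error rate, E = 1 - A
    f2 = float(np.var(acc))   # variance of the accuracy distribution (fairness)
    f3 = C * sigma            # average per-participant communication cost
    return f1, f2, f3

print(three_objectives([0.91, 0.88, 0.95], C=0.5, sigma=199210))
```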
(2) Model decision variables and their constraints
Since federated learning is a collaborative training process for machine learning models, the decision variables and constraints of the invention are the parameters to be optimized and their ranges, denoted by v. The parameters to be optimized comprise three parts: the neural network hyperparameters of federated learning, the connectivity parameter ε of the neural network, and the per-round participation fraction C.
The invention selects a multilayer perceptron (MLP) and a convolutional neural network (CNN) as the neural networks. The MLP hyperparameters comprise the number of hidden layers L, the number of neurons N per hidden layer, and the learning rate η; the CNN hyperparameters comprise the number of convolutional layers Conv, the number of convolution kernels kc, the kernel size ks, the number of fully connected layers L, the number of fully connected neurons N, and the learning rate η. Thus v = {Conv, kc, ks, L, N, ε, η, C}; the value ranges of the variables are set in the experimental section.
The connectivity parameter ε of the neural network draws on the static SET variant of the SET algorithm proposed by Mocanu: a sparse weight matrix between two fully connected layers is first initialized from an Erdős–Rényi random graph, after which the topology of the network is kept fixed. The number of connections between the two layers is determined by the parameter ε, with the total number of connections $n = \varepsilon(n_k + n_{k-1})$, where n_k and n_{k-1} are the numbers of neurons in layers k and k-1, respectively. In the invention, the static SET algorithm is applied to the fully connected layers of both the MLP and the CNN.
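A minimal sketch of the static-SET idea, under the assumption that an Erdős–Rényi-style mask with roughly n = ε(n_k + n_{k-1}) expected connections is drawn once and then kept fixed; this mirrors the description above rather than Mocanu's exact code.

```python
import numpy as np

def static_set_mask(n_prev, n_cur, eps, seed=0):
    """Fixed sparse connectivity between two fully connected layers."""
    p = eps * (n_prev + n_cur) / (n_prev * n_cur)  # expected density
    rng = np.random.default_rng(seed)
    return rng.random((n_prev, n_cur)) < min(p, 1.0)

mask = static_set_mask(200, 200, eps=20)
print(mask.sum())  # roughly eps * (200 + 200) = 8000 connections
# During training, the dense weight matrix is multiplied elementwise by the
# fixed mask, so only the masked-in connections ever carry weights.
```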
(3) Encoding of the decision variables
We use the FNSGA-III algorithm to optimize the neural network hyperparameters of federated learning, the connectivity parameter ε, and the per-round participation fraction C. Chromosomes are the objects the algorithm operates on; the invention has two types of decision variables to encode, integers and reals, where all integers use binary encoding and all reals use real-valued encoding. Accordingly, the numbers of convolutional layers, convolution kernels, kernel sizes, fully connected layers, and neurons per fully connected layer of the MLP and CNN, together with the SET parameter ε, are binary-encoded, while the learning rate η and the per-round participation fraction C are real-value encoded. Encoding examples for the MLP and CNN are shown in Fig. 2 and Fig. 3.
During decoding, binary values are automatically incremented by 1; for example, 000000 decodes to 1, and in the MLP example N_1 encoded as 000111 decodes to N_1 = 8. For convenience, the CNN kernel size is chosen only between 3 and 5 and the convolution output size is always kept unchanged, so the CNN structure adds only a single pooling layer at the end.
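The decoding convention can be illustrated with a short sketch; the gene names and field widths here are assumptions for illustration.

```python
def decode_binary(bits: str) -> int:
    return int(bits, 2) + 1        # "000000" -> 1, "000111" -> 8

# Binary genes for integer variables, real-valued genes for eta and C
chromosome_bin = {"L": "01", "N1": "000111", "eps": "0010011"}
chromosome_real = {"eta": 0.05, "C": 0.4}

decoded = {name: decode_binary(bits) for name, bits in chromosome_bin.items()}
print(decoded)                     # {'L': 2, 'N1': 8, 'eps': 20}
```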
To speed up the search, the invention replaces the random initialization in NSGA-III with a fast greedy initialization adapted to federated learning and denotes the improved algorithm FNSGA-III. The fast greedy initialization process is briefly described as follows:
(1) Randomly generate an initial solution set of l times the population size;
(2) After randomly dividing all participants into groups of equal size, run the federated learning training-evaluation of the initial solutions in parallel within each group. The number of participants per round, the local training epochs, and the global communication rounds are all reduced, so the three objectives after federated training can be obtained quickly and all initial solutions evaluated;
(3) For each of the three objectives, select the best one-population-size subset of solutions;
(4) After merging the subsets and removing duplicate solutions, randomly select the specified population size of solutions from the remainder. A sketch of these steps follows.
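The sketch below is a hedged Python rendering of the four steps above; `sample_random` and `cheap_eval` (the reduced-cost federated evaluation of step (2)) are assumed helper functions.

```python
import random

def fast_greedy_init(sample_random, cheap_eval, N, l=3):
    """Oversample l*N candidates, keep the best N per objective, dedupe,
    then draw the final population of size N."""
    cands = [sample_random() for _ in range(l * N)]
    objs = [cheap_eval(x) for x in cands]           # (f1, f2, f3) per candidate
    keep = []
    for m in range(3):                              # best N on each objective
        keep += sorted(range(len(cands)), key=lambda i: objs[i][m])[:N]
    pool = [cands[i] for i in dict.fromkeys(keep)]  # remove duplicate picks
    return random.sample(pool, min(N, len(pool)))
```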
FNSGA-III Algorithm Flow
For the three-objective optimization model of federated learning, the invention uses the FNSGA-III algorithm to obtain a set of Pareto optimal solutions; the algorithm flowchart is shown in Fig. 4.
FNSGA-III first uses the fast greedy initialization to generate an initial population of size N, i.e., the first parent generation, and encodes the corresponding variables into binary and real-valued chromosomes. Each iteration uses binary tournaments to select two parent individuals that produce two offspring; the crossover and mutation operators are single-point crossover and bit-flip mutation on the binary chromosomes, and simulated binary crossover (SBX) and polynomial mutation on the real-valued chromosomes. This process is repeated until N offspring individuals have been produced.
The offspring population is then evaluated by federated learning training, computing the three objectives of each individual. The parent and offspring populations are merged and non-dominated sorting is applied to the merged population, from which N individuals are selected as the next-generation parent population. These steps are repeated until the iteration stopping condition is met. Finally, a set of Pareto optimal solutions is obtained and analyzed in depth.
The specific evaluation process of federated learning combines the static SET algorithm with the FedAvg algorithm. Pseudocode for the federated learning evaluation process under FNSGA-III is shown in Algorithm 1.
In Algorithm 1, i is an individual of the population in the FNSGA-III algorithm and P is the population size. After decoding individual i, the relevant federated learning neural network hyperparameters, the connectivity of the neural network, and the per-round participation fraction C_i are obtained. The static SET topology is first initialized with the connectivity parameter ε_i and used as the global model of the algorithm; in each training round, local data are trained with mini-batch stochastic gradient descent (mini-batch SGD). After a certain number of rounds, the three objectives are computed: the global model test error rate, the variance of the global model accuracy distribution, and the communication cost.
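Since Algorithm 1 itself is not reproduced in this text, the following is a hedged reconstruction of the per-individual evaluation loop it describes, reusing the `fedavg_round` and `three_objectives` sketches above; every other helper (`decode`, `init_static_set_model`, `sample_clients`, `minibatch_sgd`, `test`, `model_size`) is an assumed placeholder.

```python
def evaluate_individual(ind, clients, rounds=5):
    cfg = decode(ind)                    # hyperparameters, eps_i, C_i
    w = init_static_set_model(cfg)       # fixed sparse global model (static SET)
    for _ in range(rounds):              # global communication rounds
        selected = sample_clients(clients, frac=cfg["C"])
        w = fedavg_round(w, selected, minibatch_sgd)  # see earlier sketch
    acc = [test(w, c) for c in clients]  # per-participant test accuracies
    return three_objectives(acc, cfg["C"], model_size(w))  # (f1, f2, f3)
```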
This section introduces the experimental setup of the invention, mainly comprising: (1) the experimental environment and dataset; (2) the neural network parameters and sparse connectivity parameters used in the experiments; (3) the federated learning parameters and the data partitioning scheme; (4) the FNSGA-III parameters.
The experimental environment is an Ubuntu system with an Intel(R) Core(TM) i9-9900KF CPU @ 3.60 GHz × 16. Every experiment is trained and tested on the MNIST dataset, which consists of 28×28-pixel handwritten digit images, with 60,000 training images and 10,000 test images.
We chose the MLP and CNN as the neural network models for federated training and empirically set the standard MLP and CNN parameters of the invention. The MLP has 2 hidden layers of 200 neurons each (199,210 parameters) and uses the ReLU activation function. The CNN has two 5×5 convolutional layers (the first with 32 channels, the second with 64), followed by a 2×2 max-pooling layer, a 128-neuron fully connected layer with ReLU activation, and finally a 10-class softmax output layer (1,659,146 parameters). For both the MLP and the CNN, the learning rate η of mini-batch SGD is 0.05 and the batch size B is 10. The static SET algorithm is applied to the fully connected layers of the MLP and CNN, with the network sparsity parameter set to ε = 20. These settings constitute the standard neural network structures in the experiments of the invention.
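As a concreteness check, the stated parameter count of 1,659,146 is consistent with the following PyTorch sketch of the standard CNN, with padding chosen to keep the 28×28 convolution output unchanged and a single 2×2 max-pool at the end, as described above; this is an illustrative reconstruction rather than the patent's code.

```python
import torch.nn as nn
import torch.nn.functional as F

class MnistCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 32, kernel_size=5, padding=2)   # 28x28 kept
        self.conv2 = nn.Conv2d(32, 64, kernel_size=5, padding=2)
        self.pool = nn.MaxPool2d(2)                 # 28x28 -> 14x14
        self.fc1 = nn.Linear(64 * 14 * 14, 128)
        self.fc2 = nn.Linear(128, 10)               # 10-class output

    def forward(self, x):                           # x: (B, 1, 28, 28)
        x = F.relu(self.conv1(x))
        x = F.relu(self.conv2(x))
        x = self.pool(x).flatten(1)
        return self.fc2(F.relu(self.fc1(x)))        # logits; softmax in loss
```

Summing the per-layer parameters (832 + 51,264 + 1,605,760 + 1,290) reproduces the 1,659,146 figure.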
In federated learning, we set the total number of participants K to 100 and the participation fraction C to 1, i.e., 100 × 1 participants per communication round. For participant local model training, the number of epochs is set to 5. Since data size and distribution usually differ across participants, we study two realistic scenarios. In the first, independent and identically distributed (IID) setting, the MNIST data are shuffled and each of the 100 participants receives 600 samples. In the second, non-IID setting, the data are first sorted by digit label, divided evenly into 200 shards of 300 samples, and each of the 100 participants is assigned two shards, so each participant holds only two labels and the same number of samples. Because the invention assumes an unstable federated communication environment in which transmitted model parameters can be lost, the loss rate is set to Drop = 30%.
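A sketch of the non-IID partition just described (sort by label, cut 200 shards of 300 samples, give each of the 100 clients two shards); the shuffling seed is an arbitrary choice.

```python
import numpy as np

def partition_noniid(labels, n_clients=100, shards_per_client=2, seed=0):
    order = np.argsort(labels)                      # indices sorted by digit
    shards = np.array_split(order, n_clients * shards_per_client)  # 200 x 300
    ids = np.random.default_rng(seed).permutation(len(shards))
    return [np.concatenate([shards[s] for s in
                            ids[i * shards_per_client:(i + 1) * shards_per_client]])
            for i in range(n_clients)]              # each client: <= 2 labels
```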
Next, we set the FNSGA-III parameters: the population size is 20 and the algorithm is run for 20 generations. The selection operator is a binary tournament; the binary chromosomes use single-point crossover with probability 0.9 and bit-flip mutation with probability 0.1, while the real-valued chromosomes use simulated binary crossover with probability 0.9 and n_c = 2, and polynomial mutation with probability 0.1 and n_m = 20.
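Plain-NumPy sketches of the variation operators listed above: single-point crossover and bit-flip mutation for the binary chromosomes, and SBX (eta = n_c = 2) for the real-valued ones. The per-operator application probabilities (0.9 / 0.1) would be applied around these calls; these are textbook formulations offered for illustration.

```python
import numpy as np
rng = np.random.default_rng(1)

def single_point_crossover(a: str, b: str):
    cut = int(rng.integers(1, len(a)))              # crossover point
    return a[:cut] + b[cut:], b[:cut] + a[cut:]

def bit_flip(bits: str, p=0.1):
    return "".join(("1" if c == "0" else "0") if rng.random() < p else c
                   for c in bits)

def sbx(x1: float, x2: float, eta=2.0):             # simulated binary crossover
    u = rng.random()
    beta = (2 * u) ** (1 / (eta + 1)) if u <= 0.5 \
        else (1 / (2 * (1 - u))) ** (1 / (eta + 1))
    return (0.5 * ((1 + beta) * x1 + (1 - beta) * x2),
            0.5 * ((1 - beta) * x1 + (1 + beta) * x2))
```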
The invention uses the FNSGA-III algorithm to optimize the relevant parameters of federated learning so as to balance the global model test error rate, the variance of the global model accuracy distribution, and the communication cost. A comparative experiment between FNSGA-III and NSGA-III is conducted first to examine the effectiveness of the proposed method. The parameter settings for the experimental analysis of the multi-objective evolutionary algorithms are listed in Table 1.
Table 1. Parameter settings of the multi-objective federated learning evolutionary algorithm
We set the population size to 20, the number of generations to 20, and the number of communication rounds in each individual's federated learning evaluation to 5. The participation fraction C ranges from 0.1 to 1, guaranteeing that some participants always join training, and the learning rate ranges from 0.01 to 0.2, since too high a learning rate hampers convergence.
For the neural network parameters, the maximum number of MLP hidden layers is 4, with at most 256 neurons per layer. For the CNN, the maximum number of convolutional layers is 3, the maximum number of kernel channels is 64, the maximum number of fully connected layers is 3 with at most 256 neurons per layer, and the kernel size is 3 or 5. The maximum value of the network sparsity parameter is set to 128.
The final Pareto solutions evolved by the FNSGA-III and NSGA-III algorithms for the MLP and CNN on IID and non-IID data are shown in Figs. 5-8, where each point represents one solution corresponding to a specific set of structure parameters in federated learning. Light points denote the Pareto optimal solutions obtained by the FNSGA-III algorithm of the invention; dark points denote those obtained by the randomly initialized NSGA-III algorithm.
The number of Pareto solutions obtained by FNSGA-III is more stable than that obtained by NSGA-III; NSGA-III yields few Pareto solutions under the CNN, for example only 3 under CNN IID. In Figures 5–8 the light-colored solutions dominate the dark-colored ones, i.e., the Pareto solutions of the present invention dominate those of the NSGA-III algorithm, except that under CNN IID one dark solution dominates a light one, although the number of solutions there is very small. The dominance of FNSGA-III is more pronounced under IID, where the distance between the FNSGA-III and NSGA-III fronts is large, while under non-IID the gap between the two algorithms' solutions is smaller. We also find that FNSGA-III converges more toward the knee region, where solutions have small values on every objective and hence higher quality; because its Pareto solutions concentrate at the knee, the uniformity of the FNSGA-III front under the MLP is inferior to that of the NSGA-III front.
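The dominance relation used in these comparisons is ordinary Pareto dominance for minimization. As a small sketch (our own helper, not part of the claims):

```python
import numpy as np

def dominates(a, b):
    """True if objective vector a Pareto-dominates b: a is no worse on every
    objective (test error, accuracy variance, communication cost, all
    minimized) and strictly better on at least one."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    return bool(np.all(a <= b) and np.any(a < b))
```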
In addition, Table 2 reports evaluation metrics for the final Pareto solutions of FNSGA-III and NSGA-III, analyzing the two algorithms' solutions along several dimensions.
The minimum value attained on each single objective reflects the extreme of that objective function and thus the optimization ability of the algorithm. Table 2 shows that the minimum global model test error rate, minimum variance, and minimum communication cost obtained by FNSGA-III are almost all smaller than those of NSGA-III; for example, the minimum global model test error rate of FNSGA-III is lower than that of NSGA-III, with a clear gap of 4.89% under CNN non-IID.
The number of Pareto non-dominated solutions obtained by the FNSGA-III algorithm is stable and, except for MLP IID, exceeds that of NSGA-III. On average, both algorithms find more solutions under the MLP than under the CNN, and the solution count of NSGA-III varies sharply, being markedly larger for the MLP than for the CNN. FNSGA-III is therefore more robust than NSGA-III in terms of the number of solutions.
The hypervolume (HV) is a comprehensive indicator for evaluating a Pareto set: it computes the total hypervolume of the hypercubes formed between the non-dominated solutions and a reference point, and in general a larger HV value indicates a higher-quality Pareto set. As Table 2 shows, the HV value of the FNSGA-III algorithm is consistently better than that of NSGA-III, indicating higher quality.
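Exact HV computation in three objectives is somewhat involved; a simple Monte Carlo estimate (our own sketch, adequate for sanity checks but not necessarily the method behind the reported figures) can be written as follows.

```python
import numpy as np

def hypervolume_mc(front, ref, n_samples=100_000, seed=0):
    """Monte Carlo estimate of the hypervolume dominated by `front`
    (all objectives minimized) with respect to reference point `ref`."""
    rng = np.random.default_rng(seed)
    front = np.asarray(front, float)
    ref = np.asarray(ref, float)
    lo = front.min(axis=0)              # dominated region lies inside [lo, ref]
    pts = rng.uniform(lo, ref, size=(n_samples, ref.size))
    # a sample is covered if at least one front member dominates it
    covered = (front[None, :, :] <= pts[:, None, :]).all(-1).any(-1)
    return covered.mean() * np.prod(ref - lo)
```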
Table 2. Metric analysis of the FNSGA-III and NSGA-III algorithms
The coverage C(A, B) is the proportion of solutions in set B that are dominated by at least one solution in set A, and measures the degree of overlap between two solution sets; the larger C(A, B), the better the quality of set A relative to set B. In Table 2, F in C(F, N) denotes the FNSGA-III solution set and N the NSGA-III solution set. The C(F, N) values are almost all larger than the C(N, F) values; under CNN IID, C(F, N) = 50% is smaller than C(N, F) = 53%, but the gap is very small. Overall, in terms of coverage C, the solutions of FNSGA-III are better than those of the NSGA-III algorithm.
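Consistent with this definition, a direct sketch of the metric (our own helper) is:

```python
import numpy as np

def coverage(A, B):
    """C(A, B): fraction of solutions in B that are dominated by at least
    one solution in A; A and B are lists of objective vectors (minimized)."""
    def dominates(a, b):
        a, b = np.asarray(a, float), np.asarray(b, float)
        return bool(np.all(a <= b) and np.any(a < b))
    return sum(any(dominates(a, b) for a in A) for b in B) / len(B)
```

With F and N as the two fronts, the table entries correspond to `coverage(F, N)` and `coverage(N, F)`.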
In terms of running time, the CNN consumes more time than the MLP, non-IID more than IID, and NSGA-III more than FNSGA-III; statistically, FNSGA-III thus has the best time performance. From the above analysis of single-objective minima, number of non-dominated solutions, the HV indicator, the C indicator, and time performance, we conclude that the Pareto-optimal solutions obtained by the proposed FNSGA-III algorithm are of higher quality than those of the NSGA-III algorithm.
In addition, we compare the proposed FNSGA-III algorithm with the evolutionary multi-objective algorithms NSGA-II and SPEA2, evaluating the resulting Pareto-optimal solutions on the comprehensive HV indicator, the number of Pareto solutions, the coverage C, the running time, and the single-objective optima. For this comparison we choose MLP non-IID, the setting in which the Pareto fronts of FNSGA-III and NSGA-III were closest in the experiments above. The experimental results are shown in Table 3 and Figure 6.
Table 3. Metric analysis of the FNSGA-III algorithm and other evolutionary algorithms under MLP non-IID
A brief analysis of Table 3 shows that the Pareto solutions of FNSGA-III concentrate more at the knee and dominate those of NSGA-II and SPEA2, i.e., FNSGA-III is better on each of the three objectives. All coverage values are computed relative to FNSGA-III. FNSGA-III outperforms NSGA-II and SPEA2 on the HV value, the number of Pareto solutions, the coverage, and the minima on the three objectives. In terms of time, SPEA2 has the shortest running time, but the running time of FNSGA-III is close to it. In summary, by replacing random initialization with fast greedy initialization, the proposed FNSGA-III algorithm improves running efficiency while being essentially superior to the NSGA-III, NSGA-II, and SPEA2 evolutionary algorithms, and the Pareto set it obtains is of higher quality.
Because the number of communication rounds used during FNSGA-III's federated learning evaluation is very small, the federated learning performance of the solutions is not fully explored there. Owing to limited computational resources, we select only MLP non-IID, the setting with the worst accuracy, for an enhanced experiment. From the Pareto-optimal set obtained by the FNSGA-III algorithm for MLP non-IID we choose 4 solutions: 2 with very small global test error rates and 2 knee solutions. These 4 solutions are trained with federated learning for 150 communication rounds and compared with the standard FedAvg algorithm. Besides increasing the number of communication rounds, each solution is validated under both IID and non-IID data, to study whether non-dominated solutions obtained on the IID dataset remain effective on the non-IID dataset, and vice versa. All validation results are listed in Table 4.
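A hypothetical harness for this extended evaluation might look as follows; `local_train` and the initial weights are caller-supplied stubs (not specified in the text), and only the round count, participation fraction, drop rate, and plain averaging follow the description above.

```python
import numpy as np

rng = np.random.default_rng(0)

def run_rounds(init_weights, local_train, partitions,
               rounds=150, C=1.0, drop=0.3):
    """Run federated training for the given number of communication rounds.

    `local_train(weights, part)` is a caller-supplied stub that trains a
    copy of the global weights on one client's data partition and returns
    the updated per-layer weight arrays. Each upload is lost with
    probability `drop` to model the unstable channel."""
    global_w = [np.copy(w) for w in init_weights]
    for _ in range(rounds):
        m = max(1, int(C * len(partitions)))
        chosen = rng.choice(len(partitions), size=m, replace=False)
        updates = [local_train(global_w, partitions[k])
                   for k in chosen if rng.random() >= drop]
        if updates:  # plain FedAvg average (equal-size shards)
            global_w = [np.mean(layer, axis=0) for layer in zip(*updates)]
    return global_w
```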
Table 4. Experimental data of the MLP non-IID solutions obtained by the FNSGA-III algorithm
From the results in Table 4, the following observations can be made about the evolution of the four selected MLP non-IID Pareto solutions. In the experiments under the non-IID data distribution, solution 4 has a sparsity parameter of 3; although its accuracy curve is stable over the iterations, it is clearly lower than in the other cases, which may indicate that excessively sparsifying the neural network harms model accuracy. Solutions 1 and 2 outperform standard federated learning in communication cost, accuracy, and variance, and their iteration curves are stable, verifying that high-quality solutions exist in the Pareto set obtained by the proposed FNSGA-III algorithm. In the experiments under the IID data distribution, only the iteration curve of solution 4 is inferior to standard federated learning. In short, the solutions obtained under MLP non-IID are effective under non-IID and still perform well when extended to IID.
The beneficial effects of the present invention are as follows:
The present invention proposes the FNSGA-III algorithm to solve the multi-objective federated learning model problem and verifies it experimentally under unstable communication. We first construct a three-objective model of federated learning, with the optimization objectives of minimizing the global model test error rate, the communication cost, and the variance of the global model accuracy distribution, and with the hyperparameters of the neural network and the federated learning parameters as decision variables. The NSGA-III algorithm is introduced to solve the multi-objective federated learning model, and its initialization is modified; the experimental results show that the improved FNSGA-III algorithm outperforms the original NSGA-III algorithm. Moreover, compared with the baseline federated averaging algorithm, the Pareto-optimal solutions obtained with FNSGA-III effectively improve the global model accuracy and reduce both the variance of the global model accuracy distribution and the communication cost.
As used herein, the word "preferred" means serving as an example, instance, or illustration. Any aspect or design described herein as "preferred" is not necessarily to be construed as more advantageous than other aspects or designs; rather, use of the word "preferred" is intended to present concepts in a concrete manner. The term "or" as used in this application is intended to mean an inclusive "or" rather than an exclusive "or"; that is, unless specified otherwise or clear from context, "X uses A or B" naturally includes any of the inclusive permutations: if X uses A, X uses B, or X uses both A and B, then "X uses A or B" is satisfied in any of the foregoing cases.
Furthermore, although the present disclosure has been shown and described with respect to one or more implementations, equivalent variations and modifications will occur to those skilled in the art upon reading and understanding this specification and the accompanying drawings. The present disclosure includes all such modifications and variations and is limited only by the scope of the appended claims. In particular, with regard to the various functions performed by the components described above (such as elements), the terms used to describe such components are intended to correspond, unless otherwise indicated, to any component that performs the specified function of the described component (i.e., that is functionally equivalent), even if not structurally equivalent to the disclosed structure that performs the function in the exemplary implementations of the disclosure illustrated herein. In addition, although a particular feature of the disclosure may have been disclosed with respect to only one of several implementations, such a feature may be combined with one or more other features of the other implementations as may be desired and advantageous for a given or particular application. Moreover, to the extent that the terms "includes", "has", "contains", or variants thereof are used in the detailed description or the claims, such terms are intended to be inclusive in a manner similar to the term "comprising".
The functional units in the embodiments of the present invention may be integrated into one processing module, or each unit may exist physically on its own, or two or more units may be integrated into one module. The integrated module may be implemented in the form of hardware or in the form of a software functional module. If the integrated module is implemented as a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium. The storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disk, or the like. Each of the above devices or systems may execute the storage method of the corresponding method embodiment.
In summary, the above embodiment is one implementation of the present invention, but the implementations of the present invention are not limited by that embodiment; any other change, modification, substitution, combination, or simplification that departs from the spirit and principle of the present invention shall be an equivalent replacement and shall fall within the protection scope of the present invention.