CN114819181A - Multi-objective federated learning evolution method based on improved NSGA-III - Google Patents
Multi-objective federated learning evolution method based on improved NSGA-III
- Publication number: CN114819181A
- Application number: CN202210396629.7A
- Authority: CN (China)
- Prior art keywords: iii, objective, federated learning, nsga, model
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06N20/00 — Machine learning
- G06N3/045 — Neural networks; combinations of networks
- G06N3/048 — Neural networks; activation functions
- G06N3/08 — Neural networks; learning methods
- G06N3/086 — Learning methods using evolutionary algorithms, e.g. genetic algorithms or genetic programming
Abstract
The invention belongs to the field of artificial intelligence and discloses a multi-objective federated learning evolution method based on an improved NSGA-III, comprising the following steps: acquiring learning data; constructing a federated learning multi-objective optimization model; performing fast greedy initialization and multi-objective evaluation; performing non-dominated sorting; iterating, where each iteration applies selection, crossover, and mutation operators to produce an offspring population Q_t, evaluates the offspring by federated learning training to compute the three objectives of each individual, merges the parent and offspring populations into R_t = Q_t + P_t, performs non-dominated sorting on R_t, and selects P_{t+1} by reference points; and finding the Pareto optimal solutions and outputting the labeling results corresponding to them. Compared with classical algorithms, the invention obtains better Pareto solutions, reducing the communication cost and the variance of the global model accuracy distribution while maintaining the global model accuracy.
Description
Technical Field
The invention belongs to the technical field of artificial intelligence and in particular relates to a multi-objective federated learning evolution method based on an improved NSGA-III.
Background
The rapid development of artificial intelligence has brought great convenience to society, but it has also introduced hidden dangers such as data silos and privacy leakage. Traditional centralized machine learning must gather scattered data together for training, yet in many fields data are difficult to aggregate; for example, hospitals can rarely share data, creating a serious "data silo" problem. In addition, privacy leakage has become prominent, public awareness of privacy protection has grown, and countries around the world have introduced privacy protection laws and regulations.
Federated learning has therefore emerged as a viable solution to data silos and privacy leakage. It can train a good global model while keeping the data local to the participants. In each round, every participant downloads the current global model from the server, trains it on local data, and uploads the trained local model back to the server, which aggregates the models into an updated global model; after multiple rounds of iteration, a global model with good performance is obtained.
However, traditional federated learning still faces the challenges of high communication cost and structural heterogeneity. Parameter transmission between the server and the participants consumes substantial communication resources; meanwhile, because participants differ in computing and storage capability and in network environment, they may go offline during training or lose transmitted model parameters, which harms the efficiency, accuracy, and fairness of federated learning. Much of the literature addresses communication cost or structural heterogeneity in isolation, but few works consider these problems jointly.
To address these problems, the invention jointly considers communication cost and structural heterogeneity and studies the trade-off among model effectiveness, fairness, and communication cost. Federated learning is first defined as a three-objective optimization model that simultaneously maximizes the global model accuracy and minimizes both the variance of the global model accuracy distribution and the communication cost. Building on the training characteristics of federated learning, the initialization of the third-generation non-dominated sorting genetic algorithm (NSGA-III) is improved, yielding a fast-greedy-initialization NSGA-III for multi-objective federated learning (FNSGA-III). Experimental results show that FNSGA-III can balance the three objectives, effectively reducing the communication cost and the variance of the participants' accuracy distribution, so that accuracy is distributed more evenly across participants, without seriously degrading the overall performance of the FL model. The main contributions of the invention are:
(1) To the best of the authors' knowledge, this is the first work to jointly consider maximizing the global model accuracy, minimizing the variance of the global model accuracy distribution, and minimizing the communication cost, and to construct a federated learning multi-objective optimization model accordingly.
(2) The FNSGA-III algorithm is proposed. To make NSGA-III converge quickly to high-quality solutions and better suit the solution of the federated learning multi-objective optimization model, an initial-solution construction algorithm based on fast greedy initialization is proposed, and binary and real-valued encoding and decoding strategies are introduced to accelerate the evolution of NSGA-III.
(3) Experiments on the MNIST dataset verify that the Pareto solutions obtained by FNSGA-III are better than those of NSGA-III: the hypervolume (HV) of FNSGA-III reaches up to 127.55% of that of NSGA-III, and its running time in the best case is 73.96% of NSGA-III's. FNSGA-III is also compared with the classical evolutionary algorithms NSGA-II and SPEA2, and the results show that its Pareto solutions are of higher quality. Finally, selected Pareto solutions are used in federated learning experiments; they effectively reduce the communication cost and the variance of the global model accuracy distribution while maintaining the global model accuracy.
Federated learning has received extensive attention in recent years. McMahan first proposed the concept of federated learning and the federated averaging algorithm (FedAvg) in 2016, which is of great practical significance for privacy-preserving machine learning under data silos. Research on federated learning continues to deepen, but challenges such as high communication cost and structural heterogeneity remain.
To make federated learning applicable to massive data, its communication overhead must be reduced. The FedAvg algorithm of McMahan et al. reduces the number of global communication rounds by increasing the amount of local training per round, thereby improving communication efficiency. Other researchers reduce the transmitted volume by shrinking the parameters uploaded by participants. Chen et al. proposed a layered asynchronous update algorithm that divides deep-neural-network parameters into shallow and deep layers: in the early global communication rounds only shallow parameters are exchanged between the local participants and the server, and the deep parameters of the global model are transmitted and aggregated only in the last few rounds. This reduces communication overhead by shrinking the transmitted parameters and lowering the update frequency of the deep parameters, at the cost of some model accuracy. Zhu et al. introduced the sparse evolutionary training (SET) algorithm into federated learning; SET controls the connection sparsity between fully connected layers of the neural network through a sparsity parameter, reducing the size of the transmitted model parameters and thus effectively lowering the communication cost.
Besides communication cost, structural heterogeneity is another major obstacle to optimizing federated learning. Because participants differ in computing and storage capability and network environment, they may go offline during training or lose transmitted model parameters. To strengthen the robustness of federated learning, researchers have studied structural heterogeneity from several angles. Hao et al. designed a secure aggregation protocol that lets participants drop out at any time as long as enough participants remain for the federated update, improving the fault tolerance and robustness of the system. Others have studied how to allocate heterogeneous device resources reasonably; Kang et al. accounted for differences in participant cost to incentivize more high-quality participants to join federated training. Li et al. used the variance of global model accuracy as a fairness measure and designed the q-FFL (q-Fair Federated Learning) optimization algorithm, which increases the aggregation weight of high-loss participants; experiments show that it raises the accuracy of low-accuracy participants, balances the performance distribution across participants, and promotes fair resource allocation in federated learning.
The above studies target either communication cost or structural heterogeneity alone and optimize federated learning to different degrees and for different goals, but practical federated learning usually imposes requirements on model accuracy, fairness, and communication cost simultaneously. To balance multiple objectives within the federated learning framework, some researchers have combined intelligent optimization algorithms with federated learning. Zhu et al. defined federated learning as a bi-objective optimization problem, minimizing the model test error rate and the communication cost, and used the NSGA-II (non-dominated sorting genetic algorithm II) algorithm to optimize the neural network structure parameters of federated learning; compared with the standard FedAvg algorithm, the evolved Pareto solutions improve model performance and communication efficiency to a certain extent, but that work ignores situations such as communication instability and unbalanced accuracy distribution across participants caused by structural heterogeneity, and NSGA-II scales poorly to federated learning models with more objectives. Basheer et al. used particle swarm optimization to tune the number of hidden layers, the number of neurons, and the number of global communication rounds of the neural network in federated learning, but the optimization was single-objective and did not jointly consider the other goals of federated learning.
Summary of the Invention
In view of the above problems, the invention jointly considers communication cost and structural heterogeneity and introduces fairness as an optimization objective, in order to explore the multi-objective trade-off among model accuracy, fairness, and communication cost in federated learning. The communication environment is set to be unstable in the experiments, strengthening the robustness of the algorithm.
The multi-objective federated learning evolution method based on the improved NSGA-III proposed by the invention is applied to a server and multiple participants and comprises the following steps:
acquiring learning data, the learning data being used for labeling;
constructing a federated learning multi-objective optimization model, the multi-objective optimization model comprising three objectives: maximizing the global model accuracy, minimizing the variance of the global model accuracy distribution, and minimizing the communication cost;
performing fast greedy initialization of the population P_{t=0} and multi-objective evaluation using the FedAvg algorithm;
performing non-dominated sorting on P_t;
iterating, with each iteration performing the following operations: applying selection, crossover, and mutation operators to produce Q_t; performing federated learning training evaluation on the offspring population and computing the three objectives of each individual; merging the parent and offspring populations into R_t = Q_t + P_t; performing non-dominated sorting on R_t and selecting P_{t+1} by reference points;
finding the Pareto optimal solutions and outputting the labeling results corresponding to the Pareto optimal solutions.
Further, in the federated learning evolution process, the loss function of the k-th participant, which owns dataset D_k, is:

$$L_k(w) = \frac{1}{n_k} \sum_{i \in D_k} l_i(w)$$

The global goal of the federated learning evolution method is to minimize the following global loss function L(w):

$$\min_{w} L(w) = \sum_{k=1}^{K} \frac{n_k}{n} L_k(w)$$

where k is the index of a participant, L_k(w) is the loss function of the k-th participant, l_i(w) is the loss on data sample i, n_k = |D_k| is the size of participant k's dataset D_k, and n is the total number of data samples across the K participants.
Further, in each training round of the federated learning process, each participant receives the global model w_t from the server and trains it on local data, obtaining an updated local model $w_{t+1}^k$; the participant then sends the updated local model to the server, which aggregates the models according to certain rules into a new global model w_{t+1} for the next round of iterative training, where the subscript t denotes the communication round of federated learning.
Further, the three-objective optimization model of federated learning evolution is:

$$\min F(v) = \big(f_1(v),\, f_2(v),\, f_3(v)\big), \quad v = \{Conv, kc, ks, L, N, \varepsilon, \eta, C\}$$

where F(v) is the objective function of the three-objective optimization model. The model has three minimization objectives: the global model test error rate f_1, the variance of the global model accuracy distribution f_2, and the communication cost f_3. Conv is the number of convolutional layers, kc the number of convolution kernels, ks the kernel size, L the number of fully connected layers, N the number of neurons per fully connected layer, η the learning rate, and ε the connectivity parameter of the neural network. The number of connections between two fully connected layers is determined by ε, with the total number of connections $n = \varepsilon(n_k + n_{k-1})$, where n_k and n_{k-1} are the numbers of neurons in layers k and k-1, respectively.
Further, for objective f_1, the global model test error rate is E = 1 - A, where A is the average test accuracy of the global model, $A = \frac{1}{K}\sum_{k=1}^{K} a_k$, and {a_1, a_2, ..., a_K} are the accuracies of the individual participants.
Further, objective f_2 is the variance of the global model accuracy distribution, $f_2 = \frac{1}{K}\sum_{k=1}^{K}(a_k - A)^2$.
Further, objective f_3 can be expressed as $f_3 = \frac{K C \sigma}{K} = C\,\sigma$, where K is the total number of participants, C is the fraction of participants taking part in each round, and σ is the size of the model parameters.
Further, the process of training and evaluating the offspring population by federated learning is implemented by a FedAvg algorithm based on static SET and specifically comprises the following steps:
i is an individual of the population in the FNSGA-III algorithm and P is the population size; after decoding individual i, the relevant federated learning neural network hyperparameters, the connectivity of the neural network, and the per-round participation fraction C_i of the participants are obtained;
initializing a static SET topology with the connectivity parameter ε_i and using the static SET topology as the global model in the algorithm;
in each training round, training the local data using mini-batch stochastic gradient descent;
after a certain number of rounds, computing the three objectives: the global model test error rate, the variance of the global model accuracy distribution, and the communication cost.
Further, the number of convolutional layers, the number of convolution kernels, the kernel size, the number of fully connected layers, and the number of neurons per fully connected layer of the MLP and CNN, together with the SET parameter ε, are binary-encoded, while the learning rate η and the per-round participation fraction C are real-value encoded.
Further, the learning data is the MNIST dataset.
The beneficial effects of the invention are as follows:
The invention proposes the FNSGA-III algorithm to solve the multi-objective federated learning model and verifies it experimentally under unstable communication. We first construct the three-objective model of federated learning, setting the optimization objectives to minimizing the global model test error rate, the communication cost, and the variance of the global model accuracy distribution, with the hyperparameters of the neural network and the federated learning parameters as decision variables. The NSGA-III algorithm is introduced to solve the federated learning multi-objective model and its initialization is changed; the experimental results show that the improved FNSGA-III algorithm outperforms the original NSGA-III. Moreover, compared with the baseline federated averaging algorithm, the Pareto optimal solutions obtained by FNSGA-III effectively improve the global model accuracy and reduce the variance of the global model accuracy distribution and the communication cost.
Brief Description of the Drawings
Fig. 1: the federated learning training process;
Fig. 2: an example of chromosome encoding for the MLP model of the invention;
Fig. 3: an example of chromosome encoding for the CNN model of the invention;
Fig. 4: the flowchart of the algorithm of the invention;
Fig. 5: experimental comparison of the invention and NSGA-III with MLP on IID data;
Fig. 6: experimental comparison of the invention and NSGA-III with MLP on non-IID data;
Fig. 7: experimental comparison of the invention and NSGA-III with CNN on IID data;
Fig. 8: experimental comparison of the invention and NSGA-III with CNN on non-IID data;
Fig. 9: Pareto optimal solutions of the invention, NSGA-II, and SPEA2.
Detailed Description
The invention is further described below with reference to the accompanying drawings, without limiting the invention in any way; any transformation or substitution based on the teaching of the invention falls within the protection scope of the invention.
To achieve this purpose, the invention adopts the technical solution comprising the steps described above. To make the technical solutions and beneficial effects of the invention clearer, the invention is further described below with practical examples. It should be understood that the specific embodiments described here are only intended to explain the invention and not to limit it.
Embodiment
Federated learning is a privacy-preserving machine learning technique that enables participants to train a global model jointly without uploading local private data to a server. Suppose there are K participants holding data {D_1, D_2, ..., D_K}; traditional centralized learning gathers all the data together and trains the model on D = D_1 ∪ D_2 ∪ ... ∪ D_K.
In the federated learning process, the loss function of the k-th participant, which owns dataset D_k, is:

$$L_k(w) = \frac{1}{n_k} \sum_{i \in D_k} l_i(w) \tag{1}$$

The global goal of federated learning is to minimize the global loss function L(w):

$$\min_{w} L(w) = \sum_{k=1}^{K} \frac{n_k}{n} L_k(w) \tag{2}$$

In equations (1) and (2), k is the index of a participant, L_k(w) is the loss function of the k-th participant, l_i(w) is the loss on data sample i, n_k = |D_k| is the size of participant k's dataset D_k, and n is the total number of data samples across the K participants. Federated learning optimizes the global loss function L(w) by minimizing the weighted average of the participant loss functions L_k(w). Federated learning is a collaborative process, as shown in Fig. 1.
In each training round, each participant receives the global model w_t from the server and trains it on local data, obtaining an updated local model $w_{t+1}^k$; the participant then sends the updated local model to the server, which aggregates the models according to certain rules into a new global model w_{t+1} for the next round of iterative training. The subscript t denotes the communication round of federated learning.
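To make the aggregation rule concrete, the following minimal Python sketch implements one FedAvg-style communication round under the weighting of equation (2); the flat-array weight representation and the `local_update` callback are illustrative assumptions, not the patent's implementation.

```python
import numpy as np

def fedavg_round(global_w, clients, local_update):
    """One communication round: clients is a list of (data_k, n_k) pairs;
    local_update trains a copy of the global weights on one client's data."""
    n_total = sum(n_k for _, n_k in clients)
    new_w = np.zeros_like(global_w)
    for data_k, n_k in clients:
        w_k = local_update(global_w.copy(), data_k)  # local model w_{t+1}^k
        new_w += (n_k / n_total) * w_k               # weight by dataset size n_k/n
    return new_w                                     # new global model w_{t+1}
```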
Third-Generation Non-dominated Sorting Genetic Algorithm (NSGA-III)

There are many multi-objective evolutionary optimization algorithms based on genetic algorithms and Pareto optimality, such as the second-generation non-dominated sorting genetic algorithm (NSGA-II), the multi-objective evolutionary algorithm based on decomposition (MOEA/D), SPEA2 (Strength Pareto Evolutionary Algorithm 2), and PAES (Pareto archived evolution strategy). NSGA-II is a powerful and robust multi-objective evolutionary algorithm suitable for problems with two or three objectives. For more than three objectives, newer evolutionary algorithms can be used, such as the reference-point-based third-generation non-dominated sorting genetic algorithm (NSGA-III), which outperforms NSGA-II on optimization problems with four or more objectives. The invention defines federated learning as a three-objective optimization problem; to keep the algorithm scalable in the number of objectives, for example when the federated learning objectives are extended to four or more, the invention adopts the NSGA-III algorithm, whose basic steps can be outlined as follows:
Step 1: Initialize the reference points and the parent population P_t (of size N); perform non-dominated sorting on the individuals of the population, normalize the individuals, and associate them with the reference points.
Step 2: Apply selection, crossover, and mutation operators to P_t to create an offspring population Q_t of the same size as the parent population P_t.
Step 3: Merge P_t and Q_t into a new population R_t of size 2N. Perform non-dominated sorting on the merged population, partitioning it into non-dominated fronts (F_1, F_2, ..., F_s), then normalize the individuals and associate them with the reference points.
Step 4: Select N solutions from the sorted population R_t to form the next-generation parent population P_{t+1}. If the number of solutions in the selected non-dominated fronts exceeds N, solutions in the last front under consideration are chosen by the reference-point-based selection method.
Step 5: Go to Step 2 and repeat the whole process until the preset stopping condition is met, then output the Pareto optimal solutions.
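For readers who want to experiment with these steps, the sketch below runs reference-point-based NSGA-III through the pymoo library (assuming pymoo >= 0.6); the DTLZ2 benchmark merely stands in for the far more expensive federated learning evaluation used in the invention.

```python
from pymoo.algorithms.moo.nsga3 import NSGA3
from pymoo.util.ref_dirs import get_reference_directions
from pymoo.problems import get_problem
from pymoo.optimize import minimize

# Das-Dennis reference directions for a 3-objective problem
ref_dirs = get_reference_directions("das-dennis", 3, n_partitions=12)
algorithm = NSGA3(pop_size=20, ref_dirs=ref_dirs)
res = minimize(get_problem("dtlz2"), algorithm, ("n_gen", 20), seed=1)
print(res.F)  # objective vectors of the final non-dominated set
```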
The invention constructs the three-objective optimization model of federated learning on the basis of a typical multi-objective optimization model and elaborates its objectives, decision variables, and variable encoding. The three-objective optimization model of the federated learning of the invention is as follows.
(1) Objective function:

$$\min F(v) = \big(f_1(v),\, f_2(v),\, f_3(v)\big)$$

F(v) is the objective function of the model. The model has three minimization objectives: the global model test error rate f_1, the variance of the global model accuracy distribution f_2, and the communication cost f_3.
In the invention, the three federated learning objectives are evaluated as follows. The FedAvg algorithm combined with the SET algorithm is trained for a certain number of communication rounds, and the trained global model w is tested to obtain the accuracies {a_1, a_2, ..., a_K} of the individual participants. The average test accuracy of the global model is $A = \frac{1}{K}\sum_{k=1}^{K} a_k$, from which the objective f_1, the global model test error rate, is computed as E = 1 - A.
The objective f_2 is the variance of the global model accuracy distribution, $f_2 = \frac{1}{K}\sum_{k=1}^{K}(a_k - A)^2$. Variance can be regarded as a measure of fairness; taking fairness as an optimization objective helps avoid situations where the average accuracy is high but individual participants have no accuracy guarantee. Reducing the variance of the global model accuracy distribution across participants makes the post-aggregation accuracy distribution more uniform and fair.
The objective f_3 is the average communication cost of the participants. Assuming that the communication cost of each participant depends only on the size of the model parameters it transmits, and since the neural network structure used by the invention does not change during federated training, the model parameter size σ is the same for all participants and stays constant; f_3 can thus be expressed as $f_3 = \frac{K C \sigma}{K} = C\,\sigma$, where K is the total number of participants, C is the fraction of participants taking part in each round, and σ is the model parameter size.
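As a hedged illustration of the three objectives just defined, the following sketch computes f_1, f_2, and f_3 from the per-participant test accuracies; all names are illustrative rather than the patent's own code.

```python
import numpy as np

def three_objectives(acc, C, sigma):
    """acc: test accuracies {a_1, ..., a_K}; C: per-round participation
    fraction; sigma: size of the transmitted model parameters."""
    A = float(np.mean(acc))
    f1 = 1.0 - A              # global model test error rate, E = 1 - A
    f2 = float(np.var(acc))   # variance of the accuracy distribution (fairness)
    f3 = C * sigma            # average per-participant communication cost
    return f1, f2, f3

print(three_objectives([0.91, 0.88, 0.95], C=0.5, sigma=199210))
```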
(2) Model decision variables and their constraints
Since federated learning is a collaborative training process for machine learning models, the decision variables and constraints of the invention are the parameters to be optimized and their ranges, denoted by v. The parameters to be optimized comprise three parts: the neural network hyperparameters of federated learning, the connectivity parameter ε of the neural network, and the per-round participation fraction C.
The invention selects a multilayer perceptron (MLP) and a convolutional neural network (CNN) as the neural networks. The MLP hyperparameters comprise the number of hidden layers L, the number of neurons N per hidden layer, and the learning rate η; the CNN hyperparameters comprise the number of convolutional layers Conv, the number of convolution kernels kc, the kernel size ks, the number of fully connected layers L, the number of fully connected neurons N, and the learning rate η. Thus v = {Conv, kc, ks, L, N, ε, η, C}; the value ranges of the variables are set in the experimental section.
The connectivity parameter ε of the neural network draws on the static SET variant of the SET algorithm proposed by Mocanu: a sparse weight matrix between two fully connected layers is first initialized from an Erdős–Rényi random graph, after which the topology of the network is kept fixed. The number of connections between the two layers is determined by the parameter ε, with the total number of connections $n = \varepsilon(n_k + n_{k-1})$, where n_k and n_{k-1} are the numbers of neurons in layers k and k-1, respectively. In the invention, the static SET algorithm is applied to the fully connected layers of both the MLP and the CNN.
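A minimal sketch of the static-SET idea, under the assumption that an Erdős–Rényi-style mask with roughly n = ε(n_k + n_{k-1}) expected connections is drawn once and then kept fixed; this mirrors the description above rather than Mocanu's exact code.

```python
import numpy as np

def static_set_mask(n_prev, n_cur, eps, seed=0):
    """Fixed sparse connectivity between two fully connected layers."""
    p = eps * (n_prev + n_cur) / (n_prev * n_cur)  # expected density
    rng = np.random.default_rng(seed)
    return rng.random((n_prev, n_cur)) < min(p, 1.0)

mask = static_set_mask(200, 200, eps=20)
print(mask.sum())  # roughly eps * (200 + 200) = 8000 connections
# During training, the dense weight matrix is multiplied elementwise by the
# fixed mask, so only the masked-in connections ever carry weights.
```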
(3) Encoding of the decision variables
We use the FNSGA-III algorithm to optimize the neural network hyperparameters of federated learning, the connectivity parameter ε, and the per-round participation fraction C. Chromosomes are the objects the algorithm operates on; the invention has two types of decision variables to encode, integers and reals, where all integers use binary encoding and all reals use real-valued encoding. Accordingly, the numbers of convolutional layers, convolution kernels, kernel sizes, fully connected layers, and neurons per fully connected layer of the MLP and CNN, together with the SET parameter ε, are binary-encoded, while the learning rate η and the per-round participation fraction C are real-value encoded. Encoding examples for the MLP and CNN are shown in Fig. 2 and Fig. 3.
During decoding, binary values are automatically incremented by 1; for example, 000000 decodes to 1, and in the MLP example N_1 encoded as 000111 decodes to N_1 = 8. For convenience, the CNN kernel size is chosen only between 3 and 5 and the convolution output size is always kept unchanged, so the CNN structure adds only a single pooling layer at the end.
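The decoding convention can be illustrated with a short sketch; the gene names and field widths here are assumptions for illustration.

```python
def decode_binary(bits: str) -> int:
    return int(bits, 2) + 1        # "000000" -> 1, "000111" -> 8

# Binary genes for integer variables, real-valued genes for eta and C
chromosome_bin = {"L": "01", "N1": "000111", "eps": "0010011"}
chromosome_real = {"eta": 0.05, "C": 0.4}

decoded = {name: decode_binary(bits) for name, bits in chromosome_bin.items()}
print(decoded)                     # {'L': 2, 'N1': 8, 'eps': 20}
```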
To speed up the search, the invention replaces the random initialization in NSGA-III with a fast greedy initialization adapted to federated learning and denotes the improved algorithm FNSGA-III. The fast greedy initialization process is briefly described as follows:
(1) Randomly generate an initial solution set of l times the population size;
(2) After randomly dividing all participants into groups of equal size, run the federated learning training-evaluation of the initial solutions in parallel within each group. The number of participants per round, the local training epochs, and the global communication rounds are all reduced, so the three objectives after federated training can be obtained quickly and all initial solutions evaluated;
(3) For each of the three objectives, select the best one-population-size subset of solutions;
(4) After merging the subsets and removing duplicate solutions, randomly select the specified population size of solutions from the remainder. A sketch of these steps follows.
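The sketch below is a hedged Python rendering of the four steps above; `sample_random` and `cheap_eval` (the reduced-cost federated evaluation of step (2)) are assumed helper functions.

```python
import random

def fast_greedy_init(sample_random, cheap_eval, N, l=3):
    """Oversample l*N candidates, keep the best N per objective, dedupe,
    then draw the final population of size N."""
    cands = [sample_random() for _ in range(l * N)]
    objs = [cheap_eval(x) for x in cands]           # (f1, f2, f3) per candidate
    keep = []
    for m in range(3):                              # best N on each objective
        keep += sorted(range(len(cands)), key=lambda i: objs[i][m])[:N]
    pool = [cands[i] for i in dict.fromkeys(keep)]  # remove duplicate picks
    return random.sample(pool, min(N, len(pool)))
```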
FNSGA-III Algorithm Flow
For the three-objective optimization model of federated learning, the invention uses the FNSGA-III algorithm to obtain a set of Pareto optimal solutions; the algorithm flowchart is shown in Fig. 4.
FNSGA-III first uses the fast greedy initialization to generate an initial population of size N, i.e., the first parent generation, and encodes the corresponding variables into binary and real-valued chromosomes. Each iteration uses binary tournaments to select two parent individuals that produce two offspring; the crossover and mutation operators are single-point crossover and bit-flip mutation on the binary chromosomes, and simulated binary crossover (SBX) and polynomial mutation on the real-valued chromosomes. This process is repeated until N offspring individuals have been produced.
The offspring population is then evaluated by federated learning training, computing the three objectives of each individual. The parent and offspring populations are merged and non-dominated sorting is applied to the merged population, from which N individuals are selected as the next-generation parent population. These steps are repeated until the iteration stopping condition is met. Finally, a set of Pareto optimal solutions is obtained and analyzed in depth.
The specific evaluation process of federated learning combines the static SET algorithm with the FedAvg algorithm. Pseudocode for the federated learning evaluation process under FNSGA-III is shown in Algorithm 1.
In Algorithm 1, i is an individual of the population in the FNSGA-III algorithm and P is the population size. After decoding individual i, the relevant federated learning neural network hyperparameters, the connectivity of the neural network, and the per-round participation fraction C_i are obtained. The static SET topology is first initialized with the connectivity parameter ε_i and used as the global model of the algorithm; in each training round, local data are trained with mini-batch stochastic gradient descent (mini-batch SGD). After a certain number of rounds, the three objectives are computed: the global model test error rate, the variance of the global model accuracy distribution, and the communication cost.
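Since Algorithm 1 itself is not reproduced in this text, the following is a hedged reconstruction of the per-individual evaluation loop it describes, reusing the `fedavg_round` and `three_objectives` sketches above; every other helper (`decode`, `init_static_set_model`, `sample_clients`, `minibatch_sgd`, `test`, `model_size`) is an assumed placeholder.

```python
def evaluate_individual(ind, clients, rounds=5):
    cfg = decode(ind)                    # hyperparameters, eps_i, C_i
    w = init_static_set_model(cfg)       # fixed sparse global model (static SET)
    for _ in range(rounds):              # global communication rounds
        selected = sample_clients(clients, frac=cfg["C"])
        w = fedavg_round(w, selected, minibatch_sgd)  # see earlier sketch
    acc = [test(w, c) for c in clients]  # per-participant test accuracies
    return three_objectives(acc, cfg["C"], model_size(w))  # (f1, f2, f3)
```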
This section introduces the experimental setup of the invention, mainly comprising: (1) the experimental environment and dataset; (2) the neural network parameters and sparse connectivity parameters used in the experiments; (3) the federated learning parameters and the data partitioning scheme; (4) the FNSGA-III parameters.
The experimental environment is an Ubuntu system with an Intel(R) Core(TM) i9-9900KF CPU @ 3.60 GHz × 16. Every experiment is trained and tested on the MNIST dataset, which consists of 28×28-pixel handwritten digit images, with 60,000 training images and 10,000 test images.
We chose the MLP and CNN as the neural network models for federated training and empirically set the standard MLP and CNN parameters of the invention. The MLP has 2 hidden layers of 200 neurons each (199,210 parameters) and uses the ReLU activation function. The CNN has two 5×5 convolutional layers (the first with 32 channels, the second with 64), followed by a 2×2 max-pooling layer, a 128-neuron fully connected layer with ReLU activation, and finally a 10-class softmax output layer (1,659,146 parameters). For both the MLP and the CNN, the learning rate η of mini-batch SGD is 0.05 and the batch size B is 10. The static SET algorithm is applied to the fully connected layers of the MLP and CNN, with the network sparsity parameter set to ε = 20. These settings constitute the standard neural network structures in the experiments of the invention.
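As a concreteness check, the stated parameter count of 1,659,146 is consistent with the following PyTorch sketch of the standard CNN, with padding chosen to keep the 28×28 convolution output unchanged and a single 2×2 max-pool at the end, as described above; this is an illustrative reconstruction rather than the patent's code.

```python
import torch.nn as nn
import torch.nn.functional as F

class MnistCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 32, kernel_size=5, padding=2)   # 28x28 kept
        self.conv2 = nn.Conv2d(32, 64, kernel_size=5, padding=2)
        self.pool = nn.MaxPool2d(2)                 # 28x28 -> 14x14
        self.fc1 = nn.Linear(64 * 14 * 14, 128)
        self.fc2 = nn.Linear(128, 10)               # 10-class output

    def forward(self, x):                           # x: (B, 1, 28, 28)
        x = F.relu(self.conv1(x))
        x = F.relu(self.conv2(x))
        x = self.pool(x).flatten(1)
        return self.fc2(F.relu(self.fc1(x)))        # logits; softmax in loss
```

Summing the per-layer parameters (832 + 51,264 + 1,605,760 + 1,290) reproduces the 1,659,146 figure.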
In federated learning, we set the total number of participants K to 100 and the participation fraction C to 1, i.e., 100 × 1 participants per communication round. For participant local model training, the number of epochs is set to 5. Since data size and distribution usually differ across participants, we study two realistic scenarios. In the first, independent and identically distributed (IID) setting, the MNIST data are shuffled and each of the 100 participants receives 600 samples. In the second, non-IID setting, the data are first sorted by digit label, divided evenly into 200 shards of 300 samples, and each of the 100 participants is assigned two shards, so each participant holds only two labels and the same number of samples. Because the invention assumes an unstable federated communication environment in which transmitted model parameters can be lost, the loss rate is set to Drop = 30%.
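A sketch of the non-IID partition just described (sort by label, cut 200 shards of 300 samples, give each of the 100 clients two shards); the shuffling seed is an arbitrary choice.

```python
import numpy as np

def partition_noniid(labels, n_clients=100, shards_per_client=2, seed=0):
    order = np.argsort(labels)                      # indices sorted by digit
    shards = np.array_split(order, n_clients * shards_per_client)  # 200 x 300
    ids = np.random.default_rng(seed).permutation(len(shards))
    return [np.concatenate([shards[s] for s in
                            ids[i * shards_per_client:(i + 1) * shards_per_client]])
            for i in range(n_clients)]              # each client: <= 2 labels
```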
Next, we set the FNSGA-III parameters: the population size is 20 and the algorithm is run for 20 generations. The selection operator is a binary tournament; the binary chromosomes use single-point crossover with probability 0.9 and bit-flip mutation with probability 0.1, while the real-valued chromosomes use simulated binary crossover with probability 0.9 and n_c = 2, and polynomial mutation with probability 0.1 and n_m = 20.
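Plain-NumPy sketches of the variation operators listed above: single-point crossover and bit-flip mutation for the binary chromosomes, and SBX (eta = n_c = 2) for the real-valued ones. The per-operator application probabilities (0.9 / 0.1) would be applied around these calls; these are textbook formulations offered for illustration.

```python
import numpy as np
rng = np.random.default_rng(1)

def single_point_crossover(a: str, b: str):
    cut = int(rng.integers(1, len(a)))              # crossover point
    return a[:cut] + b[cut:], b[:cut] + a[cut:]

def bit_flip(bits: str, p=0.1):
    return "".join(("1" if c == "0" else "0") if rng.random() < p else c
                   for c in bits)

def sbx(x1: float, x2: float, eta=2.0):             # simulated binary crossover
    u = rng.random()
    beta = (2 * u) ** (1 / (eta + 1)) if u <= 0.5 \
        else (1 / (2 * (1 - u))) ** (1 / (eta + 1))
    return (0.5 * ((1 + beta) * x1 + (1 - beta) * x2),
            0.5 * ((1 - beta) * x1 + (1 + beta) * x2))
```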
The invention uses the FNSGA-III algorithm to optimize the relevant parameters of federated learning so as to balance the global model test error rate, the variance of the global model accuracy distribution, and the communication cost. A comparative experiment between FNSGA-III and NSGA-III is conducted first to examine the effectiveness of the proposed method. The parameter settings for the experimental analysis of the multi-objective evolutionary algorithms are listed in Table 1.
Table 1. Parameter settings of the multi-objective federated learning evolutionary algorithm
We set the population size to 20, the number of generations to 20, and the number of communication rounds in each individual's federated learning evaluation to 5. The participation fraction C ranges from 0.1 to 1, guaranteeing that some participants always join training, and the learning rate ranges from 0.01 to 0.2, since too high a learning rate hampers convergence.
For the neural network parameters, the maximum number of MLP hidden layers is 4, with at most 256 neurons per layer. For the CNN, the maximum number of convolutional layers is 3, the maximum number of kernel channels is 64, the maximum number of fully connected layers is 3 with at most 256 neurons per layer, and the kernel size is 3 or 5. The maximum value of the network sparsity parameter is set to 128.
The final Pareto solutions evolved by the FNSGA-III and NSGA-III algorithms for the MLP and CNN on IID and non-IID data are shown in Figs. 5-8, where each point represents one solution corresponding to a specific set of structure parameters in federated learning. Light points denote the Pareto optimal solutions obtained by the FNSGA-III algorithm of the invention; dark points denote those obtained by the randomly initialized NSGA-III algorithm.
The number of Pareto solutions obtained by FNSGA-III is more stable than that obtained by NSGA-III; NSGA-III yields few Pareto solutions under the CNN, for example only 3 under CNN IID. In Figures 5–8 the light-colored solutions dominate the dark-colored ones, i.e., the Pareto solutions of the present invention dominate those of the NSGA-III algorithm, except that under CNN IID one dark solution dominates a light one, although the number of solutions there is very small. The dominance of FNSGA-III is more pronounced under IID, where the distance between the FNSGA-III and NSGA-III fronts is large, while under non-IID the gap between the two algorithms' solutions is smaller. We also find that FNSGA-III converges more toward the knee region, where solutions have small values on every objective and hence higher quality; because its Pareto solutions concentrate at the knee, the uniformity of the FNSGA-III front under the MLP is inferior to that of the NSGA-III front.
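The dominance relation used in these comparisons is ordinary Pareto dominance for minimization. As a small sketch (our own helper, not part of the claims):

```python
import numpy as np

def dominates(a, b):
    """True if objective vector a Pareto-dominates b: a is no worse on every
    objective (test error, accuracy variance, communication cost, all
    minimized) and strictly better on at least one."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    return bool(np.all(a <= b) and np.any(a < b))
```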
In addition, Table 2 reports evaluation metrics for the final Pareto solutions of FNSGA-III and NSGA-III, analyzing the two algorithms' solutions along several dimensions.
The minimum value attained on each single objective reflects the extreme of that objective function and thus the optimization ability of the algorithm. Table 2 shows that the minimum global model test error rate, minimum variance, and minimum communication cost obtained by FNSGA-III are almost all smaller than those of NSGA-III; for example, the minimum global model test error rate of FNSGA-III is lower than that of NSGA-III, with a clear gap of 4.89% under CNN non-IID.
The number of Pareto non-dominated solutions obtained by the FNSGA-III algorithm is stable and, except for MLP IID, exceeds that of NSGA-III. On average, both algorithms find more solutions under the MLP than under the CNN, and the solution count of NSGA-III varies sharply, being markedly larger for the MLP than for the CNN. FNSGA-III is therefore more robust than NSGA-III in terms of the number of solutions.
The hypervolume (HV) is a comprehensive indicator for evaluating a Pareto set: it computes the total hypervolume of the hypercubes formed between the non-dominated solutions and a reference point, and in general a larger HV value indicates a higher-quality Pareto set. As Table 2 shows, the HV value of the FNSGA-III algorithm is consistently better than that of NSGA-III, indicating higher quality.
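Exact HV computation in three objectives is somewhat involved; a simple Monte Carlo estimate (our own sketch, adequate for sanity checks but not necessarily the method behind the reported figures) can be written as follows.

```python
import numpy as np

def hypervolume_mc(front, ref, n_samples=100_000, seed=0):
    """Monte Carlo estimate of the hypervolume dominated by `front`
    (all objectives minimized) with respect to reference point `ref`."""
    rng = np.random.default_rng(seed)
    front = np.asarray(front, float)
    ref = np.asarray(ref, float)
    lo = front.min(axis=0)              # dominated region lies inside [lo, ref]
    pts = rng.uniform(lo, ref, size=(n_samples, ref.size))
    # a sample is covered if at least one front member dominates it
    covered = (front[None, :, :] <= pts[:, None, :]).all(-1).any(-1)
    return covered.mean() * np.prod(ref - lo)
```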
Table 2. Metric analysis of the FNSGA-III and NSGA-III algorithms
The coverage C(A, B) is the proportion of solutions in set B that are dominated by at least one solution in set A, and measures the degree of overlap between two solution sets; the larger C(A, B), the better the quality of set A relative to set B. In Table 2, F in C(F, N) denotes the FNSGA-III solution set and N the NSGA-III solution set. The C(F, N) values are almost all larger than the C(N, F) values; under CNN IID, C(F, N) = 50% is smaller than C(N, F) = 53%, but the gap is very small. Overall, in terms of coverage C, the solutions of FNSGA-III are better than those of the NSGA-III algorithm.
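Consistent with this definition, a direct sketch of the metric (our own helper) is:

```python
import numpy as np

def coverage(A, B):
    """C(A, B): fraction of solutions in B that are dominated by at least
    one solution in A; A and B are lists of objective vectors (minimized)."""
    def dominates(a, b):
        a, b = np.asarray(a, float), np.asarray(b, float)
        return bool(np.all(a <= b) and np.any(a < b))
    return sum(any(dominates(a, b) for a in A) for b in B) / len(B)
```

With F and N as the two fronts, the table entries correspond to `coverage(F, N)` and `coverage(N, F)`.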
In terms of running time, the CNN consumes more time than the MLP, non-IID more than IID, and NSGA-III more than FNSGA-III; statistically, FNSGA-III thus has the best time performance. From the above analysis of single-objective minima, number of non-dominated solutions, the HV indicator, the C indicator, and time performance, we conclude that the Pareto-optimal solutions obtained by the proposed FNSGA-III algorithm are of higher quality than those of the NSGA-III algorithm.
In addition, we compare the proposed FNSGA-III algorithm with the evolutionary multi-objective algorithms NSGA-II and SPEA2, evaluating the resulting Pareto-optimal solutions on the comprehensive HV indicator, the number of Pareto solutions, the coverage C, the running time, and the single-objective optima. For this comparison we choose MLP non-IID, the setting in which the Pareto fronts of FNSGA-III and NSGA-III were closest in the experiments above. The experimental results are shown in Table 3 and Figure 6.
Table 3. Metric analysis of the FNSGA-III algorithm and other evolutionary algorithms under MLP non-IID
A brief analysis of Table 3 shows that the Pareto solutions of FNSGA-III concentrate more at the knee and dominate those of NSGA-II and SPEA2, i.e., FNSGA-III is better on each of the three objectives. All coverage values are computed relative to FNSGA-III. FNSGA-III outperforms NSGA-II and SPEA2 on the HV value, the number of Pareto solutions, the coverage, and the minima on the three objectives. In terms of time, SPEA2 has the shortest running time, but the running time of FNSGA-III is close to it. In summary, by replacing random initialization with fast greedy initialization, the proposed FNSGA-III algorithm improves running efficiency while being essentially superior to the NSGA-III, NSGA-II, and SPEA2 evolutionary algorithms, and the Pareto set it obtains is of higher quality.
Because the number of communication rounds used during FNSGA-III's federated learning evaluation is very small, the federated learning performance of the solutions is not fully explored there. Owing to limited computational resources, we select only MLP non-IID, the setting with the worst accuracy, for an enhanced experiment. From the Pareto-optimal set obtained by the FNSGA-III algorithm for MLP non-IID we choose 4 solutions: 2 with very small global test error rates and 2 knee solutions. These 4 solutions are trained with federated learning for 150 communication rounds and compared with the standard FedAvg algorithm. Besides increasing the number of communication rounds, each solution is validated under both IID and non-IID data, to study whether non-dominated solutions obtained on the IID dataset remain effective on the non-IID dataset, and vice versa. All validation results are listed in Table 4.
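A hypothetical harness for this extended evaluation might look as follows; `local_train` and the initial weights are caller-supplied stubs (not specified in the text), and only the round count, participation fraction, drop rate, and plain averaging follow the description above.

```python
import numpy as np

rng = np.random.default_rng(0)

def run_rounds(init_weights, local_train, partitions,
               rounds=150, C=1.0, drop=0.3):
    """Run federated training for the given number of communication rounds.

    `local_train(weights, part)` is a caller-supplied stub that trains a
    copy of the global weights on one client's data partition and returns
    the updated per-layer weight arrays. Each upload is lost with
    probability `drop` to model the unstable channel."""
    global_w = [np.copy(w) for w in init_weights]
    for _ in range(rounds):
        m = max(1, int(C * len(partitions)))
        chosen = rng.choice(len(partitions), size=m, replace=False)
        updates = [local_train(global_w, partitions[k])
                   for k in chosen if rng.random() >= drop]
        if updates:  # plain FedAvg average (equal-size shards)
            global_w = [np.mean(layer, axis=0) for layer in zip(*updates)]
    return global_w
```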
Table 4. Experimental data of the MLP non-IID solutions obtained by the FNSGA-III algorithm
From the results in Table 4, the following observations can be made about the evolution of the four selected MLP non-IID Pareto solutions. In the experiments under the non-IID data distribution, solution 4 has a sparsity parameter of 3; although its accuracy curve is stable over the iterations, it is clearly lower than in the other cases, which may indicate that excessively sparsifying the neural network harms model accuracy. Solutions 1 and 2 outperform standard federated learning in communication cost, accuracy, and variance, and their iteration curves are stable, verifying that high-quality solutions exist in the Pareto set obtained by the proposed FNSGA-III algorithm. In the experiments under the IID data distribution, only the iteration curve of solution 4 is inferior to standard federated learning. In short, the solutions obtained under MLP non-IID are effective under non-IID and still perform well when extended to IID.
The beneficial effects of the present invention are as follows:
The present invention proposes the FNSGA-III algorithm to solve the multi-objective federated learning model problem and verifies it experimentally under unstable communication. We first construct a three-objective model of federated learning, with the optimization objectives of minimizing the global model test error rate, the communication cost, and the variance of the global model accuracy distribution, and with the hyperparameters of the neural network and the federated learning parameters as decision variables. The NSGA-III algorithm is introduced to solve the multi-objective federated learning model, and its initialization is modified; the experimental results show that the improved FNSGA-III algorithm outperforms the original NSGA-III algorithm. Moreover, compared with the baseline federated averaging algorithm, the Pareto-optimal solutions obtained with FNSGA-III effectively improve the global model accuracy and reduce both the variance of the global model accuracy distribution and the communication cost.
As used herein, the word "preferred" means serving as an example, instance, or illustration. Any aspect or design described herein as "preferred" is not necessarily to be construed as more advantageous than other aspects or designs; rather, use of the word "preferred" is intended to present concepts in a concrete manner. The term "or" as used in this application is intended to mean an inclusive "or" rather than an exclusive "or"; that is, unless specified otherwise or clear from context, "X uses A or B" naturally includes any of the inclusive permutations: if X uses A, X uses B, or X uses both A and B, then "X uses A or B" is satisfied in any of the foregoing cases.
Furthermore, although the present disclosure has been shown and described with respect to one or more implementations, equivalent variations and modifications will occur to those skilled in the art upon reading and understanding this specification and the accompanying drawings. The present disclosure includes all such modifications and variations and is limited only by the scope of the appended claims. In particular, with regard to the various functions performed by the components described above (such as elements), the terms used to describe such components are intended to correspond, unless otherwise indicated, to any component that performs the specified function of the described component (i.e., that is functionally equivalent), even if not structurally equivalent to the disclosed structure that performs the function in the exemplary implementations of the disclosure illustrated herein. In addition, although a particular feature of the disclosure may have been disclosed with respect to only one of several implementations, such a feature may be combined with one or more other features of the other implementations as may be desired and advantageous for a given or particular application. Moreover, to the extent that the terms "includes", "has", "contains", or variants thereof are used in the detailed description or the claims, such terms are intended to be inclusive in a manner similar to the term "comprising".
The functional units in the embodiments of the present invention may be integrated into one processing module, or each unit may exist physically on its own, or two or more units may be integrated into one module. The integrated module may be implemented in the form of hardware or in the form of a software functional module. If the integrated module is implemented as a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium. The storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disk, or the like. Each of the above devices or systems may execute the storage method of the corresponding method embodiment.
In summary, the above embodiment is one implementation of the present invention, but the implementations of the present invention are not limited by that embodiment; any other change, modification, substitution, combination, or simplification that departs from the spirit and principle of the present invention shall be an equivalent replacement and shall fall within the protection scope of the present invention.