CN114492746A - Federated learning acceleration method based on model segmentation - Google Patents
Federated learning acceleration method based on model segmentation
- Publication number: CN114492746A (application CN202210057437.3A)
- Authority: CN (China)
- Prior art keywords: training, participants, model, global, module
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/211—Selection of the most significant subset of features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Evolutionary Computation (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computational Linguistics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Evolutionary Biology (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Information Transfer Between Computers (AREA)
Abstract
The invention discloses a federated learning acceleration method based on model segmentation, belonging to the fields of the Internet of Things and machine vision. First, because the selection of high-quality participants is critical to the training efficiency of federated learning, a computation-offloading strategy is designed for high-quality but resource-constrained Internet of Things devices using the idea of model segmentation, jointly considering network-bandwidth variation and global training time; this reduces the global training time and improves training efficiency. Second, the federated learning paradigm is adopted to protect data security, and distributed user data is used to improve inference performance. Finally, the global model aggregation strategy of federated learning is optimized: an aggregation scheme that combines communication after multiple local iterations with model compression further reduces the transmitted content and the communication load, thereby accelerating federated learning.
Description
Technical Field
The invention belongs to the fields of the Internet of Things and machine vision, and particularly relates to a federated learning acceleration method based on model segmentation.
Background
Federated Learning (FL) is an emerging foundational artificial-intelligence technology, proposed by Google in 2016, originally intended to solve the problem of updating local models on Android mobile-phone terminals.
In recent years, researchers in the field have studied it intensively. For example: (1) in 2016, Jakub et al. proposed a federated learning acceleration algorithm based on synchronous parameter updating; for the synchronous updating strategy, it mainly exploits the fault tolerance of model aggregation to appropriately reduce the communication frequency and thus the communication overhead, using common techniques such as enlarging the communication interval, reducing the transmitted content, asymmetric push and pull, and pipelining computation and transmission; however, resource-constrained IoT devices can hardly, if at all, execute large training tasks. (2) Compared with the synchronous updating strategy, the asynchronous updating strategy can greatly improve efficiency, but it introduces staleness among the local model parameters from different participants, which degrades convergence during training. (3) In 2019, Neel Guha et al. proposed a one-shot-communication improvement to federated learning, a federated learning acceleration algorithm based on model ensembling in which the whole training process needs only one round of communication to construct the global model; however, since the quality of the local models of different participants may differ greatly, the best global federated model may need to consider only the local models of a subset of the participants rather than all of them, so quickly selecting this particularly important subset is an urgent problem.
Among the above studies, the federated learning acceleration algorithm based on synchronous updating is sensitive to communication and computing resources, and as the computational load of the training task grows, the traditional federated learning algorithm cannot be applied in real Internet of Things scenarios; the acceleration algorithm based on asynchronous parameter updating suffers from staleness, leading to poor convergence during training; and the acceleration algorithm based on model ensembling must consider how to quickly select the particularly important subset of participants. Moreover, the prior art contains no federated learning acceleration method designed around model segmentation.
Disclosure of Invention
To solve the above problems, the invention provides a federated learning acceleration method based on model segmentation, which accelerates federated learning by means of model segmentation and model compression.
The technical scheme of the invention is as follows:
a federated learning acceleration method based on model segmentation, specifically comprising the following steps:
S1, after the training data are input, the server randomly selects K participants (K ≤ N) from the N participants for local training; the first two rounds are conventional federated learning, with no offloading strategy executed;
S2, computing the change of the global loss function over the first two rounds and the training time of each participant, sorting the participants by their global-loss-function change in descending order, and selecting the top-ranked participants for training;
S3, jointly considering training time and network bandwidth, dynamically regrouping the participants after each round of training: participants with low bandwidth are placed in separate groups, participants with similar characteristics are placed in the same group, and each group executes the same offloading strategy;
S4, jointly considering the participants' training time and computing capability together with the grouping of step S3, judging whether resources are constrained; if so, proceeding to step S5 to minimize the global training time; otherwise, proceeding directly to step S7 for uploading after multiple iterations;
S5, with the goal of minimizing the global training time, treating each layer of the deep neural network as a computation unit and generating candidate split points between the layers;
S6, selecting the optimal split point from the candidate split points, dynamically offloading the computation task to a trusted edge server, training the resource-constrained participants and the edge server cooperatively, and having the edge server upload the parameter information to the server;
S7, performing multiple iterations: from the third round of federated learning onward, all participants upload parameters only after multiple local training iterations;
S8, from the third round of federated learning onward, adopting a strategy of aggregating after multiple iterations, and computing the change of the gradient mean of each layer of the model after the iterations finish;
S9, sorting the changes in descending order, where a larger change indicates higher sensitivity; layers with high sensitivity are uploaded and layers with low sensitivity are not, yielding the optimal global model.
Further, the method comprises five modules: a participant selection module, a dynamic network-awareness module, a computation-task offloading decision module, a local training strategy module, and a dynamic model aggregation module;
the participant selection module is responsible for selecting high-quality participants according to the change of the global loss function;
the dynamic network-awareness module is responsible for jointly considering training time and network bandwidth and grouping all participants;
the computation-task offloading decision module is responsible for combining the grouping with the goal of global-training-time minimization and selecting the optimal offloading point;
the local training strategy module is responsible for uploading after multiple iterations, with the number of iterations set dynamically according to the change of the global loss function;
the dynamic model aggregation module is responsible for aggregating after multiple rounds of iteration, computing and ranking the layer sensitivities, and selecting the high-sensitivity layers for uploading.
Further, in step S3, the similar characteristics are network bandwidth and CPU frequency; the computing capability in step S4 is the CPU frequency.
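As a hedged illustration of the dynamic grouping in step S3 (the function name, bucket granularity, and thresholds are assumptions, not from the patent), participants can be bucketed by bandwidth and CPU frequency so that each group shares one offloading strategy:

```python
# Illustrative sketch of step S3: place low-bandwidth participants in a
# separate group, then group the rest by similar (bandwidth, CPU frequency).
# The thresholds and bucket sizes below are illustrative assumptions.

def group_participants(stats, low_bw=1.0, bw_step=5.0, cpu_step=0.5):
    """stats: dict id -> (bandwidth_mbps, cpu_ghz). Returns dict key -> ids."""
    groups = {}
    for pid, (bw, cpu) in stats.items():
        if bw < low_bw:
            key = "low-bandwidth"           # isolated group, per step S3
        else:
            key = (int(bw // bw_step), int(cpu // cpu_step))
        groups.setdefault(key, []).append(pid)
    return groups

g = group_participants({"a": (0.5, 1.0), "b": (12.0, 2.1), "c": (13.0, 2.3)})
# "a" is isolated as low-bandwidth; "b" and "c" have similar features
```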
Further, in step S6, the candidate split points are points with small data volume and small computation load.
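The split-point choice of steps S5–S6 can be sketched as below; the per-layer cost and activation-size inputs are assumed given by profiling, and all names are hypothetical:

```python
# Hedged sketch of steps S5-S6: for a cut after layer i, the device runs
# layers 0..i, uploads the layer-i activation, and the edge server runs
# the rest; pick the cut minimizing the estimated time.
def best_split(device_cost, edge_cost, act_size, bandwidth):
    """device_cost/edge_cost: per-layer compute times; act_size: per-layer
    output sizes; bandwidth: device-to-edge link rate. Returns (index, time)."""
    n = len(device_cost)
    best_i, best_t = 0, float("inf")
    for i in range(n):
        t = (sum(device_cost[:i + 1])       # on-device computation
             + act_size[i] / bandwidth      # transfer of intermediate data
             + sum(edge_cost[i + 1:]))      # offloaded computation
        if t < best_t:
            best_i, best_t = i, t
    return best_i, best_t
```

Layers with small outputs (e.g. pooling layers) naturally become good candidate cuts under this objective, matching the "small data volume, small computation load" criterion.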
Further, in step S7, the initial number of iterations is set to 10 and is then set dynamically according to the change of the gradient mean in the previous round.
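One possible form of this adaptive rule is sketched below; the patent fixes only the initial value of 10, so the adjustment step and thresholds here are illustrative assumptions:

```python
# Hedged sketch of the adaptive local-iteration rule in step S7.
# The thresholds (0.1, 0.01) and step size (2) are assumptions,
# not specified in the patent.
def next_iteration_count(current, grad_mean_change, high=0.1, low=0.01):
    """More local iterations while gradients still change a lot;
    fewer (communicate sooner) once the model nears convergence."""
    if grad_mean_change > high:
        return current + 2          # far from convergence: iterate more
    if grad_mean_change < low:
        return max(1, current - 2)  # nearly converged: upload sooner
    return current

iters = 10  # initial value, as specified in step S7
```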
The invention has the following beneficial technical effects:
By selecting high-quality participants and adopting the federated learning paradigm, the invention protects data privacy, uses distributed user data to improve inference performance, and accelerates federated learning. Part of the computation of resource-constrained participants is offloaded to a trusted edge server through model segmentation, and the dynamic network-awareness module adapts to bandwidth changes between the participants and the server to make the optimal offloading decision, improving model accuracy and speeding up training. The aggregation strategy that combines communication after multiple local iterations with uploading parameters selected by layer sensitivity saves the communication and computing resources consumed by traditional federated learning, such as power, bandwidth, and memory.
Drawings
FIG. 1 is a schematic structural diagram of a federated learning acceleration method based on model segmentation according to the present invention;
FIG. 2 is a flow chart of a federated learning acceleration method based on model segmentation according to the present invention.
Detailed Description
The invention is described in further detail below with reference to the figures and specific embodiments:
the invention provides a federal learning acceleration method based on model segmentation based on federal learning, model segmentation and model compression technologies. First, to improve the training efficiency of federal learning, it is extremely critical to choose high quality participants. Aiming at high-quality and resource-limited Internet of things (IoT) equipment, two aspects of network bandwidth change and training time are comprehensively considered, and a calculation task unloading strategy is designed by utilizing the idea of model segmentation, so that the overall training time is reduced, and the training efficiency is improved; secondly, federated learning is adopted to protect data safety, and distributed user data is utilized to improve reasoning performance; and finally, optimizing a global model aggregation strategy of the federal learning, and further reducing transmission content, reducing communication pressure and achieving the purpose of accelerating the federal learning by an aggregation mode combining multi-round iteration re-communication and model compression.
As shown in fig. 1, a federated learning acceleration method based on model segmentation comprises five modules: a participant selection module, a dynamic network-awareness module, a computation-task offloading decision module, a local training strategy module, and a dynamic model aggregation module;
the participant selection module is responsible for selecting high-quality participants according to the change of the global loss function;
the dynamic network-awareness module is responsible for jointly considering training time and network bandwidth and grouping all participants;
the computation-task offloading decision module is responsible for combining the grouping with the goal of global-training-time minimization and selecting the optimal offloading point;
the local training strategy module is responsible for uploading after multiple iterations, with the number of iterations set dynamically according to the change of the global loss function;
the dynamic model aggregation module is responsible for aggregating after multiple rounds of iteration, computing and ranking the layer sensitivities, and selecting the high-sensitivity layers for uploading.
As shown in fig. 2, the federated learning acceleration method based on model segmentation specifically includes the following steps:
S1, after the training data are input, the server randomly selects K participants (K ≤ N) from the N participants for local training; the first two rounds are conventional federated learning, with no offloading strategy executed;
S2, computing the change of the global loss function over the first two rounds and the training time of each participant, sorting the participants by their global-loss-function change in descending order, and selecting the top-ranked participants for training;
S3, jointly considering training time and network bandwidth, dynamically regrouping the participants after each round of training: participants with low bandwidth are placed in separate groups, participants with similar characteristics (network bandwidth and CPU frequency) are placed in the same group, and each group executes the same offloading strategy;
S4, jointly considering the participants' training time and computing capability (CPU frequency) together with the grouping of step S3, judging whether resources are constrained; if so, proceeding to step S5 to minimize the global training time; otherwise, proceeding directly to step S7 for uploading after multiple iterations;
S5, with the goal of minimizing the global training time, treating each layer of the deep neural network as a computation unit and generating candidate split points between the layers;
S6, selecting the optimal split point from the candidate split points (layers with small data volume and small computation load, such as a pooling layer), dynamically offloading the computation task to a trusted edge server, training the resource-constrained participants and the edge server cooperatively, and having the edge server upload the parameter information to the server;
S7, performing multiple iterations: from the third round of federated learning onward, all participants upload parameters only after multiple local training iterations, rather than uploading once per training pass; the initial number of iterations is set to 10 and is then set dynamically according to the change of the gradient mean in the previous round;
S8, from the third round of federated learning onward, adopting a strategy of aggregating after multiple iterations, and computing the change of the gradient mean of each layer of the model after the iterations finish;
S9, sorting the changes in descending order, where a larger change indicates a more sensitive layer; layers with high sensitivity are uploaded and layers with low sensitivity are not, yielding the optimal global model.
The algorithm pseudocode of the federated learning acceleration method based on model segmentation is given in the original filing as a figure.
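Since the pseudocode figure is not reproduced in this text, the following hedged Python sketch illustrates the layer-sensitivity aggregation of steps S8–S9 on a toy per-layer model; every name and the threshold value are assumptions, not taken from the patent:

```python
# Hedged sketch of steps S8-S9: aggregate only the layers whose mean
# update magnitude ("sensitivity") exceeds a threshold; low-sensitivity
# layers are not uploaded and keep their previous global value,
# reducing the transmitted content (model compression).

def run_round(weights, updates_per_layer, threshold):
    """weights: dict layer -> global value; updates_per_layer: dict
    layer -> list of participant updates for that layer."""
    new_weights = dict(weights)
    for layer, updates in updates_per_layer.items():
        mean_update = sum(updates) / len(updates)
        if abs(mean_update) > threshold:       # high sensitivity: upload
            new_weights[layer] = weights[layer] + mean_update
        # else: layer skipped to save bandwidth
    return new_weights

w = {"conv1": 1.0, "fc": 2.0}
w2 = run_round(w, {"conv1": [0.2, 0.4], "fc": [0.001, -0.001]}, threshold=0.05)
# "conv1" changed substantially and is aggregated; "fc" is kept as-is
```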
according to the invention, high-quality participants are selected, a federal learning paradigm is adopted, so that data privacy is protected, the reasoning performance is improved by using distributed user data, and the federal learning efficiency is accelerated; meanwhile, in order to further improve the accuracy of the model, the algorithm can unload part of the calculation of the resource-limited participants to a trusted edge server through a model segmentation method, and can adapt to the bandwidth change of the participants and the server through a dynamic network sensing module to make an optimal unloading strategy, so that the model accuracy is improved, and the training efficiency is accelerated; and the aggregation strategy combining multi-round iterative re-communication and uploading parameters selected according to the layer sensitivity saves the communication and calculation resource consumption of the traditional federal learning, such as electric quantity, bandwidth, memory and the like.
It should be understood that the above description does not limit the invention, and the invention is not limited to the above examples; changes, modifications, additions, or substitutions made by those skilled in the art within the spirit and scope of the invention also fall within its protection scope.
Claims (5)
1. A federated learning acceleration method based on model segmentation, characterized by comprising the following steps:
S1, after the training data are input, the server randomly selects K participants (K ≤ N) from the N participants for local training; the first two rounds are conventional federated learning, with no offloading strategy executed;
S2, computing the change of the global loss function over the first two rounds and the training time of each participant, sorting the participants by their global-loss-function change in descending order, and selecting the top-ranked participants for training;
S3, jointly considering training time and network bandwidth, dynamically regrouping the participants after each round of training: participants with low bandwidth are placed in separate groups, participants with similar characteristics are placed in the same group, and each group executes the same offloading strategy;
S4, jointly considering the participants' training time and computing capability together with the grouping of step S3, judging whether resources are constrained; if so, proceeding to step S5 to minimize the global training time; otherwise, proceeding directly to step S7 for uploading after multiple iterations;
S5, with the goal of minimizing the global training time, treating each layer of the deep neural network as a computation unit and generating candidate split points between the layers;
S6, selecting the optimal split point from the candidate split points, dynamically offloading the computation task to a trusted edge server, training the resource-constrained participants and the edge server cooperatively, and having the edge server upload the parameter information to the server;
S7, performing multiple iterations: from the third round of federated learning onward, all participants upload parameters only after multiple local training iterations;
S8, from the third round of federated learning onward, adopting a strategy of aggregating after multiple iterations, and computing the change of the gradient mean of each layer of the model after the iterations finish;
S9, sorting the changes in descending order, where a larger change indicates higher sensitivity; layers with high sensitivity are uploaded and layers with low sensitivity are not, yielding the optimal global model.
2. The federated learning acceleration method based on model segmentation according to claim 1, characterized by comprising five modules: a participant selection module, a dynamic network-awareness module, a computation-task offloading decision module, a local training strategy module, and a dynamic model aggregation module;
the participant selection module is responsible for selecting high-quality participants according to the change of the global loss function;
the dynamic network-awareness module is responsible for jointly considering training time and network bandwidth and grouping all participants;
the computation-task offloading decision module is responsible for combining the grouping with the goal of global-training-time minimization and selecting the optimal offloading point;
the local training strategy module is responsible for uploading after multiple iterations, with the number of iterations set dynamically according to the change of the global loss function;
the dynamic model aggregation module is responsible for aggregating after multiple rounds of iteration, computing and ranking the layer sensitivities, and selecting the high-sensitivity layers for uploading.
3. The federated learning acceleration method based on model segmentation according to claim 1, characterized in that in step S3 the similar characteristics are network bandwidth and CPU frequency, and the computing capability in step S4 is the CPU frequency.
4. The federated learning acceleration method based on model segmentation according to claim 1, characterized in that in step S6 the candidate split points are points with small data volume and small computation load.
5. The federated learning acceleration method based on model segmentation according to claim 1, characterized in that in step S7 the initial number of iterations is set to 10 and is then set dynamically according to the change of the gradient mean in the previous round.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210057437.3A CN114492746A (en) | 2022-01-19 | 2022-01-19 | Federated learning acceleration method based on model segmentation
Publications (1)
Publication Number | Publication Date |
---|---|
CN114492746A (en) | 2022-05-13
Family
ID=81471776
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210057437.3A Pending CN114492746A (en) | 2022-01-19 | 2022-01-19 | Federal learning acceleration method based on model segmentation |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114492746A (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title
---|---|---|---|---
CN115150288A (en) * | 2022-05-17 | 2022-10-04 | 浙江大学 | Distributed communication system and method
CN115150288B (en) * | 2022-05-17 | 2023-08-04 | 浙江大学 | Distributed communication system and method
CN115329990A (en) * | 2022-10-13 | 2022-11-11 | 合肥本源物联网科技有限公司 | Asynchronous federated learning acceleration method based on model segmentation in an edge computing scenario
CN115329990B (en) * | 2022-10-13 | 2023-01-20 | 合肥本源物联网科技有限公司 | Asynchronous federated learning acceleration method based on model segmentation in an edge computing scenario
CN118094640A (en) * | 2024-04-28 | 2024-05-28 | 南京汉卫公共卫生研究院有限公司 | Data security transmission monitoring system and method based on AI federated learning
CN118094640B (en) * | 2024-04-28 | 2024-06-25 | 南京汉卫公共卫生研究院有限公司 | Data security transmission monitoring system and method based on AI federated learning
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN114492746A (en) | Federated learning acceleration method based on model segmentation | |
CN112181666B (en) | Equipment assessment and federal learning importance aggregation method based on edge intelligence | |
CN109948029B (en) | Neural network self-adaptive depth Hash image searching method | |
CN109711532B (en) | Acceleration method for realizing sparse convolutional neural network inference aiming at hardware | |
CN108304921B (en) | Convolutional neural network training method and image processing method and device | |
CN110458084B (en) | Face age estimation method based on inverted residual error network | |
CN111695696A (en) | Method and device for model training based on federal learning | |
CN111738427B (en) | Operation circuit of neural network | |
CN111224905B (en) | Multi-user detection method based on convolution residual error network in large-scale Internet of things | |
CN111898698B (en) | Object processing method and device, storage medium and electronic equipment | |
CN113238867A (en) | Federated learning method based on network unloading | |
CN113691594B (en) | Method for solving data imbalance problem in federal learning based on second derivative | |
CN114327889A (en) | Model training node selection method for layered federated edge learning | |
CN114580636A (en) | Neural network lightweight deployment method based on three-target joint optimization | |
CN116958534A (en) | Image processing method, training method of image processing model and related device | |
CN110110852B (en) | Method for transplanting deep learning network to FPAG platform | |
CN112001386A (en) | License plate character recognition method, system, medium and terminal | |
CN113821270B (en) | Task unloading sequence prediction method, decision method, electronic device and storage medium | |
CN112446487A (en) | Method, device, system and storage medium for training and applying neural network model | |
CN115150288B (en) | Distributed communication system and method | |
CN116229199A (en) | Target detection method based on model light weight | |
US11934954B2 (en) | Pure integer quantization method for lightweight neural network (LNN) | |
CN113033653B (en) | Edge-cloud cooperative deep neural network model training method | |
CN111461144A (en) | Method for accelerating convolutional neural network | |
CN112001492A (en) | Mixed flow type acceleration framework and acceleration method for binary weight Densenet model |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||