CN114492746A - Federated learning acceleration method based on model segmentation - Google Patents
Federated learning acceleration method based on model segmentation
- Publication number: CN114492746A (application CN202210057437.3A)
- Authority: CN (China)
- Prior art keywords: training, participants, model, global, module
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/211—Selection of the most significant subset of features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Evolutionary Computation (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computational Linguistics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Evolutionary Biology (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Information Transfer Between Computers (AREA)
Abstract
The invention discloses a federated learning acceleration method based on model segmentation, belonging to the fields of the Internet of Things and machine vision. First, because the selection of high-quality participants is critical to the training efficiency of federated learning, a computation-offloading strategy is designed for high-quality but resource-constrained Internet of Things devices using the idea of model segmentation, jointly considering network-bandwidth variation and global training time; this reduces the global training time and improves training efficiency. Second, the federated learning paradigm is adopted to protect data security, and distributed user data is used to improve inference performance. Finally, the global model aggregation strategy of federated learning is optimized: an aggregation scheme that combines communication after multiple local iterations with model compression further reduces the transmitted content and the communication load, thereby accelerating federated learning.
Description
Technical Field
The invention belongs to the fields of the Internet of Things and machine vision, and particularly relates to a federated learning acceleration method based on model segmentation.
Background
Federated Learning (FL) is an emerging foundational artificial-intelligence technology, proposed by Google in 2016, originally intended to solve the problem of updating local models on Android mobile-phone terminals.
In recent years, researchers in the field have studied it intensively. For example: (1) in 2016, Jakub et al. proposed a federated learning acceleration algorithm based on synchronous parameter updating; for the synchronous updating strategy, it mainly exploits the fault tolerance of model aggregation to appropriately reduce the communication frequency and thus the communication overhead, using common techniques such as enlarging the communication interval, reducing the transmitted content, asymmetric push and pull, and pipelining computation and transmission; however, resource-constrained IoT devices can hardly, if at all, execute large training tasks. (2) Compared with the synchronous updating strategy, the asynchronous updating strategy can greatly improve efficiency, but it introduces staleness among the local model parameters from different participants, which degrades convergence during training. (3) In 2019, Neel Guha et al. proposed a one-shot-communication improvement to federated learning, a federated learning acceleration algorithm based on model ensembling in which the whole training process needs only one round of communication to construct the global model; however, since the quality of the local models of different participants may differ greatly, the best global federated model may need to consider only the local models of a subset of the participants rather than all of them, so quickly selecting this particularly important subset is an urgent problem.
Among the above studies, the federated learning acceleration algorithm based on synchronous updating is sensitive to communication and computing resources, and as the computational load of the training task grows, the traditional federated learning algorithm cannot be applied in real Internet of Things scenarios; the acceleration algorithm based on asynchronous parameter updating suffers from staleness, leading to poor convergence during training; and the acceleration algorithm based on model ensembling must consider how to quickly select the particularly important subset of participants. Moreover, the prior art contains no federated learning acceleration method designed around model segmentation.
Disclosure of Invention
To solve the above problems, the invention provides a federated learning acceleration method based on model segmentation, which accelerates federated learning by means of model segmentation and model compression.
The technical scheme of the invention is as follows:
a federated learning acceleration method based on model segmentation, specifically comprising the following steps:
S1, after the training data are input, the server randomly selects K participants (K ≤ N) from the N participants for local training; the first two rounds are conventional federated learning, with no offloading strategy executed;
S2, computing the change of the global loss function over the first two rounds and the training time of each participant, sorting the participants by their global-loss-function change in descending order, and selecting the top-ranked participants for training;
S3, jointly considering training time and network bandwidth, dynamically regrouping the participants after each round of training: participants with low bandwidth are placed in separate groups, participants with similar characteristics are placed in the same group, and each group executes the same offloading strategy;
S4, jointly considering the participants' training time and computing capability together with the grouping of step S3, judging whether resources are constrained; if so, proceeding to step S5 to minimize the global training time; otherwise, proceeding directly to step S7 for uploading after multiple iterations;
S5, with the goal of minimizing the global training time, treating each layer of the deep neural network as a computation unit and generating candidate split points between the layers;
S6, selecting the optimal split point from the candidate split points, dynamically offloading the computation task to a trusted edge server, training the resource-constrained participants and the edge server cooperatively, and having the edge server upload the parameter information to the server;
S7, performing multiple iterations: from the third round of federated learning onward, all participants upload parameters only after multiple local training iterations;
S8, from the third round of federated learning onward, adopting a strategy of aggregating after multiple iterations, and computing the change of the gradient mean of each layer of the model after the iterations finish;
S9, sorting the changes in descending order, where a larger change indicates higher sensitivity; layers with high sensitivity are uploaded and layers with low sensitivity are not, yielding the optimal global model.
Further, the method comprises five modules: a participant selection module, a dynamic network-awareness module, a computation-task offloading decision module, a local training strategy module, and a dynamic model aggregation module;
the participant selection module is responsible for selecting high-quality participants according to the change of the global loss function;
the dynamic network-awareness module is responsible for jointly considering training time and network bandwidth and grouping all participants;
the computation-task offloading decision module is responsible for combining the grouping with the goal of global-training-time minimization and selecting the optimal offloading point;
the local training strategy module is responsible for uploading after multiple iterations, with the number of iterations set dynamically according to the change of the global loss function;
the dynamic model aggregation module is responsible for aggregating after multiple rounds of iteration, computing and ranking the layer sensitivities, and selecting the high-sensitivity layers for uploading.
Further, in step S3, the similar characteristics are network bandwidth and CPU frequency; the computing capability in step S4 is the CPU frequency.
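As a hedged illustration of the dynamic grouping in step S3 (the function name, bucket granularity, and thresholds are assumptions, not from the patent), participants can be bucketed by bandwidth and CPU frequency so that each group shares one offloading strategy:

```python
# Illustrative sketch of step S3: place low-bandwidth participants in a
# separate group, then group the rest by similar (bandwidth, CPU frequency).
# The thresholds and bucket sizes below are illustrative assumptions.

def group_participants(stats, low_bw=1.0, bw_step=5.0, cpu_step=0.5):
    """stats: dict id -> (bandwidth_mbps, cpu_ghz). Returns dict key -> ids."""
    groups = {}
    for pid, (bw, cpu) in stats.items():
        if bw < low_bw:
            key = "low-bandwidth"           # isolated group, per step S3
        else:
            key = (int(bw // bw_step), int(cpu // cpu_step))
        groups.setdefault(key, []).append(pid)
    return groups

g = group_participants({"a": (0.5, 1.0), "b": (12.0, 2.1), "c": (13.0, 2.3)})
# "a" is isolated as low-bandwidth; "b" and "c" have similar features
```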
Further, in step S6, the candidate split points are points with small data volume and small computation load.
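The split-point choice of steps S5–S6 can be sketched as below; the per-layer cost and activation-size inputs are assumed given by profiling, and all names are hypothetical:

```python
# Hedged sketch of steps S5-S6: for a cut after layer i, the device runs
# layers 0..i, uploads the layer-i activation, and the edge server runs
# the rest; pick the cut minimizing the estimated time.
def best_split(device_cost, edge_cost, act_size, bandwidth):
    """device_cost/edge_cost: per-layer compute times; act_size: per-layer
    output sizes; bandwidth: device-to-edge link rate. Returns (index, time)."""
    n = len(device_cost)
    best_i, best_t = 0, float("inf")
    for i in range(n):
        t = (sum(device_cost[:i + 1])       # on-device computation
             + act_size[i] / bandwidth      # transfer of intermediate data
             + sum(edge_cost[i + 1:]))      # offloaded computation
        if t < best_t:
            best_i, best_t = i, t
    return best_i, best_t
```

Layers with small outputs (e.g. pooling layers) naturally become good candidate cuts under this objective, matching the "small data volume, small computation load" criterion.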
Further, in step S7, the initial number of iterations is set to 10 and is then set dynamically according to the change of the gradient mean in the previous round.
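One possible form of this adaptive rule is sketched below; the patent fixes only the initial value of 10, so the adjustment step and thresholds here are illustrative assumptions:

```python
# Hedged sketch of the adaptive local-iteration rule in step S7.
# The thresholds (0.1, 0.01) and step size (2) are assumptions,
# not specified in the patent.
def next_iteration_count(current, grad_mean_change, high=0.1, low=0.01):
    """More local iterations while gradients still change a lot;
    fewer (communicate sooner) once the model nears convergence."""
    if grad_mean_change > high:
        return current + 2          # far from convergence: iterate more
    if grad_mean_change < low:
        return max(1, current - 2)  # nearly converged: upload sooner
    return current

iters = 10  # initial value, as specified in step S7
```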
The invention has the following beneficial technical effects:
By selecting high-quality participants and adopting the federated learning paradigm, the invention protects data privacy, uses distributed user data to improve inference performance, and accelerates federated learning. Part of the computation of resource-constrained participants is offloaded to a trusted edge server through model segmentation, and the dynamic network-awareness module adapts to bandwidth changes between the participants and the server to make the optimal offloading decision, improving model accuracy and speeding up training. The aggregation strategy that combines communication after multiple local iterations with uploading parameters selected by layer sensitivity saves the communication and computing resources consumed by traditional federated learning, such as power, bandwidth, and memory.
Drawings
FIG. 1 is a schematic structural diagram of a federated learning acceleration method based on model segmentation according to the present invention;
FIG. 2 is a flow chart of a federated learning acceleration method based on model segmentation according to the present invention.
Detailed Description
The invention is described in further detail below with reference to the figures and specific embodiments:
the invention provides a federal learning acceleration method based on model segmentation based on federal learning, model segmentation and model compression technologies. First, to improve the training efficiency of federal learning, it is extremely critical to choose high quality participants. Aiming at high-quality and resource-limited Internet of things (IoT) equipment, two aspects of network bandwidth change and training time are comprehensively considered, and a calculation task unloading strategy is designed by utilizing the idea of model segmentation, so that the overall training time is reduced, and the training efficiency is improved; secondly, federated learning is adopted to protect data safety, and distributed user data is utilized to improve reasoning performance; and finally, optimizing a global model aggregation strategy of the federal learning, and further reducing transmission content, reducing communication pressure and achieving the purpose of accelerating the federal learning by an aggregation mode combining multi-round iteration re-communication and model compression.
As shown in fig. 1, a federated learning acceleration method based on model segmentation comprises five modules: a participant selection module, a dynamic network-awareness module, a computation-task offloading decision module, a local training strategy module, and a dynamic model aggregation module;
the participant selection module is responsible for selecting high-quality participants according to the change of the global loss function;
the dynamic network-awareness module is responsible for jointly considering training time and network bandwidth and grouping all participants;
the computation-task offloading decision module is responsible for combining the grouping with the goal of global-training-time minimization and selecting the optimal offloading point;
the local training strategy module is responsible for uploading after multiple iterations, with the number of iterations set dynamically according to the change of the global loss function;
the dynamic model aggregation module is responsible for aggregating after multiple rounds of iteration, computing and ranking the layer sensitivities, and selecting the high-sensitivity layers for uploading.
As shown in fig. 2, the federated learning acceleration method based on model segmentation specifically includes the following steps:
S1, after the training data are input, the server randomly selects K participants (K ≤ N) from the N participants for local training; the first two rounds are conventional federated learning, with no offloading strategy executed;
S2, computing the change of the global loss function over the first two rounds and the training time of each participant, sorting the participants by their global-loss-function change in descending order, and selecting the top-ranked participants for training;
S3, jointly considering training time and network bandwidth, dynamically regrouping the participants after each round of training: participants with low bandwidth are placed in separate groups, participants with similar characteristics (network bandwidth and CPU frequency) are placed in the same group, and each group executes the same offloading strategy;
S4, jointly considering the participants' training time and computing capability (CPU frequency) together with the grouping of step S3, judging whether resources are constrained; if so, proceeding to step S5 to minimize the global training time; otherwise, proceeding directly to step S7 for uploading after multiple iterations;
S5, with the goal of minimizing the global training time, treating each layer of the deep neural network as a computation unit and generating candidate split points between the layers;
S6, selecting the optimal split point from the candidate split points (layers with small data volume and small computation load, such as a pooling layer), dynamically offloading the computation task to a trusted edge server, training the resource-constrained participants and the edge server cooperatively, and having the edge server upload the parameter information to the server;
S7, performing multiple iterations: from the third round of federated learning onward, all participants upload parameters only after multiple local training iterations, rather than uploading once per training pass; the initial number of iterations is set to 10 and is then set dynamically according to the change of the gradient mean in the previous round;
S8, from the third round of federated learning onward, adopting a strategy of aggregating after multiple iterations, and computing the change of the gradient mean of each layer of the model after the iterations finish;
S9, sorting the changes in descending order, where a larger change indicates a more sensitive layer; layers with high sensitivity are uploaded and layers with low sensitivity are not, yielding the optimal global model.
The algorithm pseudocode of the federated learning acceleration method based on model segmentation is given in the original filing as a figure.
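Since the pseudocode figure is not reproduced in this text, the following hedged Python sketch illustrates the layer-sensitivity aggregation of steps S8–S9 on a toy per-layer model; every name and the threshold value are assumptions, not taken from the patent:

```python
# Hedged sketch of steps S8-S9: aggregate only the layers whose mean
# update magnitude ("sensitivity") exceeds a threshold; low-sensitivity
# layers are not uploaded and keep their previous global value,
# reducing the transmitted content (model compression).

def run_round(weights, updates_per_layer, threshold):
    """weights: dict layer -> global value; updates_per_layer: dict
    layer -> list of participant updates for that layer."""
    new_weights = dict(weights)
    for layer, updates in updates_per_layer.items():
        mean_update = sum(updates) / len(updates)
        if abs(mean_update) > threshold:       # high sensitivity: upload
            new_weights[layer] = weights[layer] + mean_update
        # else: layer skipped to save bandwidth
    return new_weights

w = {"conv1": 1.0, "fc": 2.0}
w2 = run_round(w, {"conv1": [0.2, 0.4], "fc": [0.001, -0.001]}, threshold=0.05)
# "conv1" changed substantially and is aggregated; "fc" is kept as-is
```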
according to the invention, high-quality participants are selected, a federal learning paradigm is adopted, so that data privacy is protected, the reasoning performance is improved by using distributed user data, and the federal learning efficiency is accelerated; meanwhile, in order to further improve the accuracy of the model, the algorithm can unload part of the calculation of the resource-limited participants to a trusted edge server through a model segmentation method, and can adapt to the bandwidth change of the participants and the server through a dynamic network sensing module to make an optimal unloading strategy, so that the model accuracy is improved, and the training efficiency is accelerated; and the aggregation strategy combining multi-round iterative re-communication and uploading parameters selected according to the layer sensitivity saves the communication and calculation resource consumption of the traditional federal learning, such as electric quantity, bandwidth, memory and the like.
It should be understood that the above description does not limit the invention, and the invention is not limited to the above examples; changes, modifications, additions, or substitutions made by those skilled in the art within the spirit and scope of the invention also fall within its protection scope.
Claims (5)
1. A federated learning acceleration method based on model segmentation, characterized by comprising the following steps:
S1, after the training data are input, the server randomly selects K participants (K ≤ N) from the N participants for local training; the first two rounds are conventional federated learning, with no offloading strategy executed;
S2, computing the change of the global loss function over the first two rounds and the training time of each participant, sorting the participants by their global-loss-function change in descending order, and selecting the top-ranked participants for training;
S3, jointly considering training time and network bandwidth, dynamically regrouping the participants after each round of training: participants with low bandwidth are placed in separate groups, participants with similar characteristics are placed in the same group, and each group executes the same offloading strategy;
S4, jointly considering the participants' training time and computing capability together with the grouping of step S3, judging whether resources are constrained; if so, proceeding to step S5 to minimize the global training time; otherwise, proceeding directly to step S7 for uploading after multiple iterations;
S5, with the goal of minimizing the global training time, treating each layer of the deep neural network as a computation unit and generating candidate split points between the layers;
S6, selecting the optimal split point from the candidate split points, dynamically offloading the computation task to a trusted edge server, training the resource-constrained participants and the edge server cooperatively, and having the edge server upload the parameter information to the server;
S7, performing multiple iterations: from the third round of federated learning onward, all participants upload parameters only after multiple local training iterations;
S8, from the third round of federated learning onward, adopting a strategy of aggregating after multiple iterations, and computing the change of the gradient mean of each layer of the model after the iterations finish;
S9, sorting the changes in descending order, where a larger change indicates higher sensitivity; layers with high sensitivity are uploaded and layers with low sensitivity are not, yielding the optimal global model.
2. The federated learning acceleration method based on model segmentation according to claim 1, characterized by comprising five modules: a participant selection module, a dynamic network-awareness module, a computation-task offloading decision module, a local training strategy module, and a dynamic model aggregation module;
the participant selection module is responsible for selecting high-quality participants according to the change of the global loss function;
the dynamic network-awareness module is responsible for jointly considering training time and network bandwidth and grouping all participants;
the computation-task offloading decision module is responsible for combining the grouping with the goal of global-training-time minimization and selecting the optimal offloading point;
the local training strategy module is responsible for uploading after multiple iterations, with the number of iterations set dynamically according to the change of the global loss function;
the dynamic model aggregation module is responsible for aggregating after multiple rounds of iteration, computing and ranking the layer sensitivities, and selecting the high-sensitivity layers for uploading.
3. The federated learning acceleration method based on model segmentation according to claim 1, characterized in that in step S3 the similar characteristics are network bandwidth and CPU frequency, and the computing capability in step S4 is the CPU frequency.
4. The federated learning acceleration method based on model segmentation according to claim 1, characterized in that in step S6 the candidate split points are points with small data volume and small computation load.
5. The federated learning acceleration method based on model segmentation according to claim 1, characterized in that in step S7 the initial number of iterations is set to 10 and is then set dynamically according to the change of the gradient mean in the previous round.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210057437.3A CN114492746A (en) | 2022-01-19 | 2022-01-19 | Federated learning acceleration method based on model segmentation
Publications (1)
Publication Number | Publication Date |
---|---|
CN114492746A (en) | 2022-05-13
Family
ID=81471776
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210057437.3A Pending CN114492746A (en) | 2022-01-19 | 2022-01-19 | Federal learning acceleration method based on model segmentation |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114492746A (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title
---|---|---|---|---
CN115150288A (en) * | 2022-05-17 | 2022-10-04 | 浙江大学 | Distributed communication system and method
CN115150288B (en) * | 2022-05-17 | 2023-08-04 | 浙江大学 | Distributed communication system and method
CN115329990A (en) * | 2022-10-13 | 2022-11-11 | 合肥本源物联网科技有限公司 | Asynchronous federated learning acceleration method based on model segmentation in an edge computing scenario
CN115329990B (en) * | 2022-10-13 | 2023-01-20 | 合肥本源物联网科技有限公司 | Asynchronous federated learning acceleration method based on model segmentation in an edge computing scenario
CN118094640A (en) * | 2024-04-28 | 2024-05-28 | 南京汉卫公共卫生研究院有限公司 | Data security transmission monitoring system and method based on AI federated learning
CN118094640B (en) * | 2024-04-28 | 2024-06-25 | 南京汉卫公共卫生研究院有限公司 | Data security transmission monitoring system and method based on AI federated learning
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN114492746A (en) | Federated learning acceleration method based on model segmentation | |
CN112181666B (en) | Equipment assessment and federal learning importance aggregation method based on edge intelligence | |
CN109948029B (en) | Neural network self-adaptive depth Hash image searching method | |
CN109711532B (en) | Acceleration method for realizing sparse convolutional neural network inference aiming at hardware | |
CN108304921B (en) | Convolutional neural network training method and image processing method and device | |
CN110458084B (en) | Face age estimation method based on inverted residual error network | |
CN111695696A (en) | Method and device for model training based on federal learning | |
CN111738427B (en) | Operation circuit of neural network | |
CN111224905B (en) | Multi-user detection method based on convolution residual error network in large-scale Internet of things | |
CN111898698B (en) | Object processing method and device, storage medium and electronic equipment | |
CN113238867A (en) | Federated learning method based on network unloading | |
CN113691594B (en) | Method for solving data imbalance problem in federal learning based on second derivative | |
CN114327889A (en) | Model training node selection method for layered federated edge learning | |
CN114580636A (en) | Neural network lightweight deployment method based on three-target joint optimization | |
CN116958534A (en) | Image processing method, training method of image processing model and related device | |
CN110110852B (en) | Method for transplanting deep learning network to FPAG platform | |
CN112001386A (en) | License plate character recognition method, system, medium and terminal | |
CN113821270B (en) | Task unloading sequence prediction method, decision method, electronic device and storage medium | |
CN112446487A (en) | Method, device, system and storage medium for training and applying neural network model | |
CN115150288B (en) | Distributed communication system and method | |
CN116229199A (en) | Target detection method based on model light weight | |
US11934954B2 (en) | Pure integer quantization method for lightweight neural network (LNN) | |
CN113033653B (en) | Edge-cloud cooperative deep neural network model training method | |
CN111461144A (en) | Method for accelerating convolutional neural network | |
CN112001492A (en) | Mixed flow type acceleration framework and acceleration method for binary weight Densenet model |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||