CN114492746A - Federal learning acceleration method based on model segmentation - Google Patents

Federated learning acceleration method based on model segmentation

Info

Publication number
CN114492746A
CN114492746A
Authority
CN
China
Prior art keywords
training
participants
model
global
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210057437.3A
Other languages
Chinese (zh)
Inventor
曹绍华
陈辉
陈舒
张汉卿
张卫山
吴春雷
Current Assignee
China University of Petroleum East China
Original Assignee
China University of Petroleum East China
Priority date
Filing date
Publication date
Application filed by China University of Petroleum East China filed Critical China University of Petroleum East China
Priority to CN202210057437.3A
Publication of CN114492746A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • G06N 3/08: Learning methods
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/211: Selection of the most significant subset of features
    • G06F 18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computational Linguistics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The invention discloses a federated learning acceleration method based on model segmentation, belonging to the fields of the Internet of Things and machine vision. First, because selecting high-quality participants is critical to the training efficiency of federated learning, the method targets high-quality but resource-limited Internet of Things devices: it jointly considers network-bandwidth variation and global training time and uses the idea of model segmentation to design a computation-offloading strategy, thereby reducing the global training time and improving training efficiency. Second, the federated learning paradigm is adopted to protect data security, and distributed user data is used to improve inference performance. Finally, the global model aggregation strategy of federated learning is optimized: an aggregation scheme that combines multi-iteration-then-communicate updates with model compression further reduces the transmitted content and the communication pressure, achieving the goal of accelerating federated learning.

Description

Federated learning acceleration method based on model segmentation
Technical Field
The invention belongs to the fields of the Internet of Things and machine vision, and specifically relates to a federated learning acceleration method based on model segmentation.
Background
Federated Learning is a novel foundational artificial-intelligence technique proposed by Google in 2016, originally to solve the problem of updating local models for Android mobile-phone end users.
In recent years, researchers in the field have studied it intensively. For example: (1) In 2016, Jakub et al. proposed a federated learning acceleration algorithm based on synchronous parameter updates; for a synchronous update strategy, it mainly exploits the fault tolerance of model aggregation to reduce the communication frequency appropriately and thus lower the communication overhead. Common techniques include increasing the communication interval, reducing the transmitted content, asymmetric push and pull, and pipelining computation and transmission. However, resource-constrained IoT devices still find it difficult or impossible to perform large training tasks. (2) Compared with a synchronous update strategy, an asynchronous update strategy can greatly improve efficiency, but it introduces staleness among the local model parameters from different participants, leading to poor convergence during training. (3) In 2019, Neel Guha et al. proposed a single-round-communication federated learning scheme, an acceleration algorithm based on model ensembling, in which the whole training process builds the global model with only one round of communication. However, because the local model quality of different participants may differ greatly, the optimal global federated model may need to consider only a subset of the local models rather than all of them, so quickly selecting these particularly important participants is an urgent problem.
Among these approaches, acceleration based on synchronous updates is relatively sensitive to communication and computing resources, and as the computation demanded by training tasks grows, the traditional federated learning algorithm cannot be applied in real Internet of Things scenarios; acceleration based on asynchronous parameter updates suffers from staleness, leading to poor convergence during training; and acceleration based on model ensembling must still solve how to quickly select the particularly important participants. Meanwhile, the prior art contains no federated learning acceleration method designed around model segmentation.
Disclosure of Invention
To solve the above problems, the invention provides a federated learning acceleration method based on model segmentation, which accelerates federated learning by combining the ideas of model segmentation and model compression.
The technical scheme of the invention is as follows:
a federated learning acceleration method based on model segmentation specifically comprises the following steps:
s1, after training data are input, a server randomly selects K participants (K is less than or equal to N) from N participants to carry out local training, and first, two rounds of traditional federal learning are carried out without executing any unloading strategy;
s2, calculating the global loss function variable quantity of the previous two rounds and the training time of each participant, sequencing the global loss function variable quantities of the previous two rounds from large to small, and selecting the participant with the highest rank for training;
s3, comprehensively considering factors of training time and network bandwidth, dynamically grouping participants after each round of training, grouping the participants with low bandwidth into additional groups, grouping the participants with similar characteristics into one group, and executing the same unloading strategy by the same group;
s4, comprehensively considering the training time and the computing capacity of the participants, judging whether resources are limited or not by combining the grouping condition of the step S3, and if the resources are limited, entering the step S5 to minimize the overall training time; otherwise, directly entering step S7 for multiple iterations of uploading;
s5, according to the target of global training time minimization, taking each layer of the deep neural network as a calculation unit, and generating division points among the layers;
s6, selecting an optimal segmentation point from the candidate segmentation points, dynamically unloading a calculation task to a credible edge server, cooperatively training a resource-limited participant and the edge server, and uploading parameter information to the server by the edge server;
s7, performing multiple rounds of iteration, starting from the third round of federal learning, and uploading parameters after all participants carry out local training for multiple iterations;
s8, from the beginning of the third round of federal learning, adopting a strategy of re-aggregation after multiple iterations, and calculating the gradient mean value variation of each layer of the model after the multiple iterations are finished;
s9, sorting the variation from large to small, wherein the larger the variation is, the higher the sensitivity is, selecting a layer with high sensitivity for uploading, and for a layer with low sensitivity, the layer is not uploaded, so that the optimal global model is obtained.
Further, the method comprises five modules: a participant selection module, a dynamic network perception module, a computation-offloading decision module, a local training strategy module and a dynamic model aggregation module;
the participant selection module is responsible for selecting high-quality participants according to the change in the global loss function;
the dynamic network perception module is responsible for jointly considering training time and network bandwidth and grouping all participants;
the computation-offloading decision module is responsible for combining the grouping with the goal of minimizing global training time and selecting the optimal offloading point;
the local training strategy module is responsible for uploading after multiple local iterations, with the iteration count set dynamically according to the change in the global loss function;
the dynamic model aggregation module is responsible for aggregating after multiple iterations, computing and ranking layer sensitivities, and selecting the high-sensitivity layers for uploading.
Further, in step S3, the similar characteristics are network bandwidth and CPU frequency; the computing capability in step S4 is the CPU frequency.
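The grouping in step S3 can be sketched as coarse bucketing on these two characteristics. The bucket boundaries and the low-bandwidth cutoff below are assumptions for illustration, not values from the patent:

```python
import math

def group_participants(participants, low_bw=1e6):
    """participants: list of (pid, bandwidth_bps, cpu_hz) tuples.
    Low-bandwidth participants get their own group; the rest are grouped by
    order of magnitude of bandwidth and CPU frequency, so each group can
    share one offloading strategy."""
    groups = {}
    for pid, bw, cpu in participants:
        if bw < low_bw:                                   # low bandwidth: separate group
            key = ("low_bw",)
        else:                                             # coarse similarity bucket
            key = (round(math.log10(bw)), round(math.log10(cpu)))
        groups.setdefault(key, []).append(pid)
    return groups

devices = [(0, 5.0e5, 2.0e9), (1, 2.0e7, 1.8e9),
           (2, 3.0e7, 2.2e9), (3, 4.0e5, 1.0e9)]
groups = group_participants(devices)
```

Participants 0 and 3 fall below the bandwidth cutoff and are isolated, while 1 and 2 share a bucket and hence an offloading strategy.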
Further, in step S6, the candidate partition points are points where both the data volume and the computation amount are small.
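Selecting the optimal partition point (steps S5 and S6) can be illustrated with a simple cost model: time on the device, plus time to transfer the intermediate activations, plus time on the edge server. The cost model and the layer profile below are assumptions consistent with the stated goal of minimizing global training time, not the patent's exact formula:

```python
def best_split(layers, input_bytes, device_speed, edge_speed, bandwidth):
    """layers: list of (flops, out_bytes), in network order.
    Returns (s, t): layers [0:s] run on the device, [s:] on the edge server,
    and t is the estimated per-round time for that split."""
    best_s, best_t = None, float("inf")
    n = len(layers)
    for s in range(n + 1):
        dev = sum(f for f, _ in layers[:s]) / device_speed
        edge = sum(f for f, _ in layers[s:]) / edge_speed
        if s == n:          # nothing offloaded: no transfer needed
            xfer = 0.0
        elif s == 0:        # full offload: ship the raw input instead
            xfer = input_bytes / bandwidth
        else:               # ship the activations of the last on-device layer
            xfer = layers[s - 1][1] / bandwidth
        t = dev + xfer + edge
        if t < best_t:
            best_s, best_t = s, t
    return best_s, best_t

# Illustrative profile: conv, pooling, conv, fully-connected.
layers = [(1.0e9, 1.0e6), (1.0e8, 1.0e4), (1.0e9, 1.0e6), (1.0e8, 4.0e3)]
s, t = best_split(layers, input_bytes=6.0e5,
                  device_speed=1e9, edge_speed=1e10, bandwidth=1e5)
```

With this profile the split lands right after the pooling layer, matching the remark that layers with small output data and small computation make good candidates.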
Further, in step S7, the initial iteration count is set to 10 and is thereafter set dynamically according to the mean-gradient change of the previous round.
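The patent fixes only the initial value (10) and says the count then follows the previous round's mean-gradient change; the concrete update rule below, including all thresholds and step sizes, is purely an assumed sketch:

```python
def next_local_iters(prev_iters, grad_delta, lo=0.01, hi=0.10,
                     min_iters=5, max_iters=20):
    """One plausible rule: while gradients still change a lot, do more local
    iterations before uploading; once they settle, communicate sooner."""
    if grad_delta > hi:
        return min(prev_iters + 2, max_iters)
    if grad_delta < lo:
        return max(prev_iters - 2, min_iters)
    return prev_iters

iters = 10                                   # initial value fixed by step S7
iters = next_local_iters(iters, grad_delta=0.25)
```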
The invention has the following beneficial technical effects:
selecting high-quality participants and adopting the federated learning paradigm protects data privacy, improves inference performance with distributed user data, and accelerates federated learning; offloading part of the computation of resource-limited participants to a trusted edge server through model segmentation, with the dynamic network perception module adapting the offloading strategy to bandwidth changes between the participants and the server, improves model accuracy and speeds up training; and the aggregation strategy that combines multi-iteration-then-communicate updates with layer-sensitivity-based parameter selection saves the communication and computing resources consumed by traditional federated learning, such as power, bandwidth and memory.
Drawings
FIG. 1 is a schematic structural diagram of a federated learning acceleration method based on model segmentation according to the present invention;
FIG. 2 is a flow chart of a federated learning acceleration method based on model segmentation according to the present invention.
Detailed Description
The invention is described in further detail below with reference to the figures and specific embodiments:
Based on federated learning, model segmentation and model compression techniques, the invention provides a federated learning acceleration method based on model segmentation. First, to improve the training efficiency of federated learning, selecting high-quality participants is extremely critical. For high-quality but resource-limited Internet of Things (IoT) devices, the method jointly considers network-bandwidth variation and training time and uses the idea of model segmentation to design a computation-offloading strategy, thereby reducing the global training time and improving training efficiency. Second, federated learning is adopted to protect data security, and distributed user data is used to improve inference performance. Finally, the global model aggregation strategy of federated learning is optimized: an aggregation scheme that combines multi-iteration-then-communicate updates with model compression further reduces the transmitted content and the communication pressure, achieving the goal of accelerating federated learning.
As shown in fig. 1, a federated learning acceleration method based on model segmentation includes five modules: a participant selection module, a dynamic network perception module, a computation-offloading decision module, a local training strategy module and a dynamic model aggregation module;
the participant selection module is responsible for selecting high-quality participants according to the change in the global loss function;
the dynamic network perception module is responsible for jointly considering training time and network bandwidth and grouping all participants;
the computation-offloading decision module is responsible for combining the grouping with the goal of minimizing global training time and selecting the optimal offloading point;
the local training strategy module is responsible for uploading after multiple local iterations, with the iteration count set dynamically according to the change in the global loss function;
the dynamic model aggregation module is responsible for aggregating after multiple iterations, computing and ranking layer sensitivities, and selecting the high-sensitivity layers for uploading.
As shown in fig. 2, the federated learning acceleration method based on model segmentation specifically comprises the following steps:
S1, after the training data is input, the server randomly selects K of the N participants (K ≤ N) for local training; the first two rounds are conventional federated learning with no offloading strategy executed;
S2, computing the change in the global loss function over the first two rounds and the training time of each participant, sorting the loss changes in descending order, and selecting the top-ranked participants for training;
S3, jointly considering training time and network bandwidth, dynamically regrouping the participants after each round of training: low-bandwidth participants are placed in separate groups, participants with similar characteristics (network bandwidth and CPU frequency) are grouped together, and every member of a group executes the same offloading strategy;
S4, jointly considering the training time and computing capability (CPU frequency) of the participants together with the grouping of step S3, judging whether resources are limited; if so, entering step S5 to minimize the global training time; otherwise, entering step S7 directly for multi-iteration uploading;
S5, according to the goal of minimizing the global training time, taking each layer of the deep neural network as a computation unit and generating candidate partition points between the layers;
S6, selecting the optimal partition point from the candidates (layers with small data volume and small computation, such as a pooling layer), dynamically offloading the computation task to a trusted edge server, training the resource-limited participant and the edge server cooperatively, and having the edge server upload the parameter information to the server;
S7, performing multiple iterations: from the third round of federated learning onward, every participant uploads parameters only after multiple local training iterations, instead of uploading after every training pass; the iteration count is initialized to 10 and thereafter set dynamically according to the mean-gradient change of the previous round;
S8, from the third round of federated learning onward, adopting the aggregate-after-multiple-iterations strategy and, after the iterations finish, computing the change in the mean gradient of each layer of the model;
S9, sorting the changes in descending order (the larger the change, the more sensitive the layer), selecting the high-sensitivity layers for uploading and not uploading the low-sensitivity layers, thereby obtaining the optimal global model.
The pseudocode of the model-segmentation-based federated learning acceleration algorithm is as follows:
(The pseudocode appears in the original publication only as image figures BDA0003476938980000041 and BDA0003476938980000051 and is not reproduced here.)
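Since the pseudocode survives only as figure references, the overall loop can be reconstructed, hedged, from steps S1 through S9; every helper name below is a placeholder, not an identifier from the patent:

```python
# Pseudocode reconstruction of steps S1-S9 (placeholder helper names).
def federated_training(server, clients, rounds, k):
    selected = server.random_sample(clients, k)            # S1: K of N at random
    for rnd in range(rounds):
        if rnd >= 2:                                       # first two rounds: plain FL
            selected = server.top_by_loss_delta(selected)  # S2: rank participants
            server.regroup(selected)                       # S3: bandwidth/CPU groups
        for c in selected:
            if rnd >= 2 and c.resource_limited():          # S4: offload decision
                split = c.choose_split_point()             # S5-S6: best partition
                c.offload_tail_to_edge(split)
            iters = 1 if rnd < 2 else c.local_iters        # S7: multi-iteration
            c.train_locally(iters)
            if rnd >= 2:
                c.upload(c.sensitive_layers())             # S8-S9: compressed upload
            else:
                c.upload_all()
        server.aggregate()
    return server.global_model
```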
By selecting high-quality participants and adopting the federated learning paradigm, the invention protects data privacy, improves inference performance with distributed user data, and accelerates federated learning. To further improve model accuracy, the algorithm offloads part of the computation of resource-limited participants to a trusted edge server through model segmentation, and the dynamic network perception module adapts the offloading strategy to bandwidth changes between the participants and the server, improving model accuracy and speeding up training. Finally, the aggregation strategy that combines multi-iteration-then-communicate updates with layer-sensitivity-based parameter selection saves the communication and computing resources consumed by traditional federated learning, such as power, bandwidth and memory.
It is to be understood that the above description is not intended to limit the present invention, and the present invention is not limited to the above examples, and those skilled in the art may make modifications, alterations, additions or substitutions within the spirit and scope of the present invention.

Claims (5)

1. A federated learning acceleration method based on model segmentation, characterized by comprising the following steps:
S1, after the training data is input, a server randomly selects K of N participants (K ≤ N) for local training; the first two rounds are conventional federated learning with no offloading strategy executed;
S2, computing the change in the global loss function over the first two rounds and the training time of each participant, sorting the loss changes in descending order, and selecting the top-ranked participants for training;
S3, jointly considering training time and network bandwidth, dynamically regrouping the participants after each round of training, placing low-bandwidth participants in separate groups, grouping participants with similar characteristics together, and having every member of a group execute the same offloading strategy;
S4, jointly considering the training time and computing capability of the participants together with the grouping of step S3, judging whether resources are limited; if so, entering step S5 to minimize the global training time; otherwise, entering step S7 directly for multi-iteration uploading;
S5, according to the goal of minimizing the global training time, taking each layer of the deep neural network as a computation unit and generating candidate partition points between the layers;
S6, selecting the optimal partition point from the candidate partition points, dynamically offloading the computation task to a trusted edge server, training the resource-limited participant and the edge server cooperatively, and having the edge server upload the parameter information to the server;
S7, performing multiple iterations: from the third round of federated learning onward, all participants upload parameters only after multiple local training iterations;
S8, from the third round of federated learning onward, adopting the aggregate-after-multiple-iterations strategy and, after the iterations finish, computing the change in the mean gradient of each layer of the model;
S9, sorting the changes in descending order (a larger change means higher sensitivity), selecting high-sensitivity layers for uploading and not uploading low-sensitivity layers, thereby obtaining the optimal global model.
2. The model-segmentation-based federated learning acceleration method of claim 1, characterized by comprising five modules: a participant selection module, a dynamic network perception module, a computation-offloading decision module, a local training strategy module and a dynamic model aggregation module;
the participant selection module is responsible for selecting high-quality participants according to the change in the global loss function;
the dynamic network perception module is responsible for jointly considering training time and network bandwidth and grouping all participants;
the computation-offloading decision module is responsible for combining the grouping with the goal of minimizing global training time and selecting the optimal offloading point;
the local training strategy module is responsible for uploading after multiple local iterations, with the iteration count set dynamically according to the change in the global loss function;
the dynamic model aggregation module is responsible for aggregating after multiple iterations, computing and ranking layer sensitivities, and selecting the high-sensitivity layers for uploading.
3. The model-segmentation-based federated learning acceleration method of claim 1, characterized in that in step S3 the similar characteristics are network bandwidth and CPU frequency, and the computing capability in step S4 is the CPU frequency.
4. The model-segmentation-based federated learning acceleration method of claim 1, characterized in that in step S6 the candidate partition points are points where both the data volume and the computation amount are small.
5. The model-segmentation-based federated learning acceleration method of claim 1, characterized in that in step S7 the initial iteration count is set to 10 and is thereafter set dynamically according to the mean-gradient change of the previous round.
CN202210057437.3A 2022-01-19 2022-01-19 Federal learning acceleration method based on model segmentation Pending CN114492746A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210057437.3A CN114492746A (en) 2022-01-19 2022-01-19 Federal learning acceleration method based on model segmentation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210057437.3A CN114492746A (en) 2022-01-19 2022-01-19 Federal learning acceleration method based on model segmentation

Publications (1)

Publication Number Publication Date
CN114492746A 2022-05-13

Family

ID=81471776

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210057437.3A Pending CN114492746A (en) 2022-01-19 2022-01-19 Federal learning acceleration method based on model segmentation

Country Status (1)

Country Link
CN (1) CN114492746A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115150288A (en) * 2022-05-17 2022-10-04 浙江大学 Distributed communication system and method
CN115150288B (en) * 2022-05-17 2023-08-04 浙江大学 Distributed communication system and method
CN115329990A (en) * 2022-10-13 2022-11-11 合肥本源物联网科技有限公司 Asynchronous federated learning acceleration method based on model segmentation under edge calculation scene
CN115329990B (en) * 2022-10-13 2023-01-20 合肥本源物联网科技有限公司 Asynchronous federated learning acceleration method based on model segmentation under edge computing scene
CN118094640A (en) * 2024-04-28 2024-05-28 南京汉卫公共卫生研究院有限公司 Data security transmission monitoring system and method based on AI federal learning
CN118094640B (en) * 2024-04-28 2024-06-25 南京汉卫公共卫生研究院有限公司 Data security transmission monitoring system and method based on AI federal learning

Similar Documents

Publication Publication Date Title
CN114492746A (en) Federal learning acceleration method based on model segmentation
CN112181666B (en) Equipment assessment and federal learning importance aggregation method based on edge intelligence
CN109948029B (en) Neural network self-adaptive depth Hash image searching method
CN109711532B (en) Acceleration method for realizing sparse convolutional neural network inference aiming at hardware
CN108304921B (en) Convolutional neural network training method and image processing method and device
CN110458084B (en) Face age estimation method based on inverted residual error network
CN111695696A (en) Method and device for model training based on federal learning
CN111738427B (en) Operation circuit of neural network
CN111224905B (en) Multi-user detection method based on convolution residual error network in large-scale Internet of things
CN111898698B (en) Object processing method and device, storage medium and electronic equipment
CN113238867A (en) Federated learning method based on network unloading
CN113691594B (en) Method for solving data imbalance problem in federal learning based on second derivative
CN114327889A (en) Model training node selection method for layered federated edge learning
CN114580636A (en) Neural network lightweight deployment method based on three-target joint optimization
CN116958534A (en) Image processing method, training method of image processing model and related device
CN110110852B (en) Method for transplanting deep learning network to FPAG platform
CN112001386A (en) License plate character recognition method, system, medium and terminal
CN113821270B (en) Task unloading sequence prediction method, decision method, electronic device and storage medium
CN112446487A (en) Method, device, system and storage medium for training and applying neural network model
CN115150288B (en) Distributed communication system and method
CN116229199A (en) Target detection method based on model light weight
US11934954B2 (en) Pure integer quantization method for lightweight neural network (LNN)
CN113033653B (en) Edge-cloud cooperative deep neural network model training method
CN111461144A (en) Method for accelerating convolutional neural network
CN112001492A (en) Mixed flow type acceleration framework and acceleration method for binary weight Densenet model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination