CN116245194A - Asynchronous federal learning method, device, system and storage medium - Google Patents

Asynchronous federal learning method, device, system and storage medium

Info

Publication number
CN116245194A
Authority
CN
China
Prior art keywords
model
local
global model
global
jth
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202211686530.7A
Other languages
Chinese (zh)
Inventor
刘吉
霍超
窦德景
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202211686530.7A priority Critical patent/CN116245194A/en
Publication of CN116245194A publication Critical patent/CN116245194A/en
Withdrawn legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 - Machine learning
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00 - Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The present disclosure provides an asynchronous federal learning method, apparatus, system, and storage medium, relating to the field of computer technology, and in particular to artificial intelligence technologies such as big data and machine learning. The specific implementation scheme is as follows: in response to receiving the jth local model sent by the ith device, acquire the version of the jth local model and the version of the gth global model local to the server; determine the version difference value between the jth local model and the gth global model based on the two versions; when the version difference value satisfies a preset condition, determine the weights corresponding to the jth local model and the gth global model respectively; and aggregate the jth local model and the gth global model based on these weights to obtain the (g+1)th global model. The disclosed scheme can improve the accuracy of the model.

Description

Asynchronous federal learning method, device, system and storage medium
Technical Field
The disclosure relates to the field of computer technology, and in particular to the technical field of artificial intelligence such as big data and machine learning.
Background
Federal learning (Federated Learning, FL) is an emerging machine learning paradigm in which models are trained on multiple distributed devices without uploading raw data; a server is responsible for aggregating the models, effectively utilizing scattered data and computing power while protecting the user's data security as much as possible.
Asynchronous FL allows the server to aggregate uploaded local models without waiting for lagging devices, thereby improving efficiency. However, this mechanism may suffer from stale uploaded models and from low accuracy on non-independent and identically distributed (Non-IID) data. For example, a device may upload a model that was updated based on an old global model after the global model has already been updated multiple times. A simple aggregation of such an uploaded model may then drag the global model back to a previous state, resulting in poor accuracy of the global model obtained by server-side aggregation.
Disclosure of Invention
The present disclosure provides a method, apparatus, system, and storage medium for asynchronous federal learning.
According to a first aspect of the present disclosure, there is provided a method for asynchronous federal learning, applied to a server side, including:
in response to receiving the jth local model sent by the ith device, acquiring the version of the jth local model and the version of the gth global model local to the server, where the gth global model is the global model of the latest version local to the server when the jth local model is received, i is an integer not less than 1, j is an integer not less than 1, and g is an integer not less than 1;
determining a version difference value of the jth local model and the g global model based on the version of the jth local model and the version of the g global model;
under the condition that the version difference value meets the preset condition, determining the weight corresponding to each of the jth local model and the g global model;
and aggregating the jth local model and the g global model based on the weights respectively corresponding to the jth local model and the g global model to obtain the (g+1) global model.
According to a second aspect of the present disclosure, there is provided a method for asynchronous federal learning, applied to a device side, including:
the j-th local model is sent to a server, wherein the j-th local model is a local model obtained by the i-th device after the j-th local training is finished, j is an integer not less than 1, and i is an integer not less than 1;
and receiving the (g+1)th global model sent by the server, wherein the (g+1)th global model is a global model obtained by the server aggregating the jth local model and the gth global model when the version difference value between the jth local model and the gth global model satisfies a preset condition, the gth global model is the global model of the latest version local to the server when the jth local model is received, and g is an integer not less than 1.
According to a third aspect of the present disclosure, there is provided an asynchronous federation learning method applied to an asynchronous federation learning system, including:
the ith device sends a trained jth local model to a server, wherein i is an integer not less than 1, and j is an integer not less than 1;
the server responds to receiving a jth local model sent by an ith device, and obtains a version of the jth local model and a version of a server local g global model; the g global model is the global model of the latest version of the server local when the j local model is received, and g is an integer not less than 1; determining a version difference value of the jth local model and the g global model based on the version of the jth local model and the version of the g global model; under the condition that the version difference value meets the preset condition, determining the weight corresponding to each of the jth local model and the g global model; and aggregating the jth local model and the g global model based on the weights respectively corresponding to the jth local model and the g global model to obtain the (g+1) global model.
According to a fourth aspect of the present disclosure, there is provided an asynchronous federal learning apparatus, applied to a server side, including:
the first acquisition module is used for responding to the received j-th local model sent by the i-th device and acquiring the version of the j-th local model and the version of the g-th global model of the server; the g global model is the global model of the local latest version of the server when the j local model is received, i is an integer not less than 1, j is an integer not less than 1, and g is an integer not less than 1;
the first determining module is used for determining a version difference value of the jth local model and the g global model based on the version of the jth local model and the version of the g global model;
the second determining module is used for determining the weight corresponding to each of the jth local model and the g global model under the condition that the version difference value meets the preset condition;
and the aggregation module is used for aggregating the jth local model and the g global model based on the weights corresponding to the jth local model and the g global model respectively to obtain the (g+1) global model.
According to a fifth aspect of the present disclosure, there is provided an asynchronous federal learning apparatus, applied to a device side, including:
The second sending module is used for sending a jth local model to the server, wherein the jth local model is a local model obtained by the ith device after the jth round of local training is finished, j is an integer not less than 1, and i is an integer not less than 1;
the second receiving module is configured to receive the (g+1)th global model sent by the server, wherein the (g+1)th global model is a global model obtained by the server aggregating the jth local model and the gth global model when the version difference value between the jth local model and the gth global model satisfies a preset condition, the gth global model is the global model of the latest version local to the server when the jth local model is received, and g is an integer not less than 1.
According to a sixth aspect of the present disclosure, there is provided an asynchronous federal learning system comprising:
m devices, which are used for sending the trained local model to a server, wherein m is an integer greater than 2;
the server is used for responding to the received j local model sent by the i device and obtaining the version of the j local model and the version of the local g global model; determining a version difference value of the jth local model and the g global model based on the version of the jth local model and the version of the g global model; under the condition that the version difference value meets the preset condition, determining the weight corresponding to each of the jth local model and the g global model; based on the weights corresponding to the jth local model and the g global model, aggregating the jth local model and the g global model to obtain a (g+1) global model; the g global model is the global model of the latest local version when the server receives the j local model sent by the i device; i is an integer of 0 or more and m or less, j is an integer of 1 or more, and g is an integer of 1 or more.
According to a seventh aspect of the present disclosure, there is provided an electronic device comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of the embodiments of the present disclosure.
According to an eighth aspect of the present disclosure, there is provided a non-transitory computer readable storage medium storing computer instructions for causing the computer to perform a method according to any one of the embodiments of the present disclosure.
According to a ninth aspect of the present disclosure, there is provided a computer program product comprising a computer program which, when executed by a processor, implements a method according to any of the embodiments of the present disclosure.
According to the scheme disclosed by the invention, the accuracy of the model can be improved.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the disclosure, nor is it intended to be used to limit the scope of the disclosure. Other features of the present disclosure will become apparent from the following specification.
Drawings
In the drawings, the same reference numerals refer to the same or similar parts or elements throughout the several views unless otherwise specified. The figures are not necessarily drawn to scale. It is appreciated that these drawings depict only some embodiments according to the disclosure and are not therefore to be considered limiting of its scope.
FIG. 1 is an asynchronous federal learning framework diagram according to an embodiment of the present disclosure;
FIG. 2 is a first flow diagram of an asynchronous federal learning method according to an embodiment of the present disclosure;
FIG. 3 is a general flow diagram of asynchronous federal learning according to an embodiment of the present disclosure;
FIG. 4 is a second flow diagram of an asynchronous federal learning method according to an embodiment of the present disclosure;
FIG. 5 is a schematic diagram of a framework of a local model obtained in an asynchronous federal learning method according to an embodiment of the present disclosure;
FIG. 6 is a second schematic diagram of a framework for obtaining a local model in an asynchronous federal learning method according to an embodiment of the present disclosure;
FIG. 7 is a flow diagram III of an asynchronous federal learning method according to an embodiment of the present disclosure;
FIG. 8 is a schematic diagram of the composition of an asynchronous Federal learning device according to an embodiment of the present disclosure;
FIG. 9 is a schematic diagram II of the composition of an asynchronous Federal learning device according to an embodiment of the present disclosure;
FIG. 10 is a schematic diagram of the composition of an asynchronous federal learning system according to an embodiment of the present disclosure;
FIG. 11 is a schematic diagram of a scenario of an asynchronous federation learning method according to an embodiment of the present disclosure;
fig. 12 is a block diagram of an electronic device for implementing an asynchronous federal learning method according to an embodiment of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below in conjunction with the accompanying drawings, which include various details of the embodiments of the present disclosure to facilitate understanding, and should be considered as merely exemplary. Accordingly, one of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
The terms "first", "second", "third" and the like in the description, the claims and the above drawings are used to distinguish between similar objects and are not necessarily used to describe a particular order or sequence. Furthermore, the terms "comprise" and "have", as well as any variations thereof, are intended to cover a non-exclusive inclusion, so that a process, method, system, article or apparatus that comprises a series of steps or elements is not necessarily limited to those steps or elements explicitly listed, but may include other steps or elements not explicitly listed or inherent to such process, method, article or apparatus.
In the FL paradigm, devices are typically highly heterogeneous in terms of computing and communication capabilities as well as data. Some devices can complete local training and model updates in a short time, while others take longer to do so and may even be unable to upload the model due to insufficient bandwidth or high latency; this corresponds to system heterogeneity. In addition, the data on each device may be Non-IID data, which corresponds to data heterogeneity. Data heterogeneity leads to different local objectives and to client drift, which reduces the accuracy of the global model in FL.
To address system heterogeneity, asynchronous FL performs global model aggregation without waiting for all devices: aggregation can be carried out as soon as a model is uploaded from any device or once multiple models have been buffered. However, a stale uploaded model may drag the global model back to a previous state, which may significantly reduce the accuracy of the global model.
In order to at least partially solve one or more of the above problems and other potential problems, the present disclosure proposes an asynchronous federal learning method that dynamically adjusts the weight (importance) of a local model during server-side aggregation according to the state of the local model trained by the device, thereby retaining important information from the device while reducing the influence of staleness.
For a better understanding of the present solution, an asynchronous system architecture will be described.
Consider an FL scenario composed of a powerful server and m devices, where the device set is denoted M = {1, 2, ..., m} and the m devices cooperatively train a global model. Each device i stores a local data set D_i = {(x_{i,d}, y_{i,d})} with |D_i| data samples, where x_{i,d} is the d-th s-dimensional input data vector and y_{i,d} is its label. The whole data set is D = ∪_{i∈M} D_i, with |D| = Σ_{i∈M} |D_i|. The training procedure in FL then targets the objective
min_w F(w) = Σ_{i∈M} (|D_i| / |D|) · F_i(w),
where w represents the global model, F_i(w) is the local loss function, defined as F_i(w) = (1/|D_i|) Σ_d f(w; x_{i,d}, y_{i,d}), and f(w; x_{i,d}, y_{i,d}) is the loss function measuring the error of the model parameters w on the data sample {x_{i,d}, y_{i,d}}.
To solve this objective, we propose an asynchronous FL framework, as shown in FIG. 1. The server triggers local training on m' of the m devices. The training process consists of a number of global epochs. At the beginning of training, the version of the global model is 0; after each global epoch, the version of the global model is incremented by 1. Each global epoch consists of 7 steps. First, the server triggers m' (m' ≤ m) devices and broadcasts the global model w_o to each device in step (1). Each device then performs local training using its local data set in step (2). During local training, device i may request a new global model from the server (step (3)) to reduce the staleness of the local training, since the global model may be updated concurrently. The server then broadcasts the global model w_g to the devices in step (4). After receiving the new global model, the device aggregates the global model and its latest local model into a new model in step (5) and continues local training with the new model; however, when the received global model is the same as w_o, step (5) may be skipped. After local training is completed, device i uploads its local model to the server in step (6). Finally, in step (7), the server aggregates the latest global model w_j with the uploaded local model of device i. When aggregating the global model w_j with an uploaded local model that was trained from the global model of version o, the staleness of the local model is calculated as τ_i = j - o + 1. When the staleness τ_i is large, the local model may contain legacy information from the old global model, which may drag the global model back toward a previous version, corresponding to poor accuracy.
To improve the accuracy of federal learning, the present disclosure proposes dynamically adjusting the importance of each uploaded local model according to its staleness and gradient, so as to achieve higher accuracy.
The staleness is the version difference between the server's global model and the device's local model.
Fig. 2 is a flow diagram of an asynchronous federation learning method that may be applied to servers in an asynchronous federation learning system, according to an embodiment of the present disclosure. In some possible implementations, the asynchronous federal learning method may also be implemented by way of a processor invoking computer readable instructions stored in memory. As shown in fig. 2, the asynchronous federal learning method includes:
S201: in response to receiving the jth local model sent by the ith device, acquiring the version of the jth local model and the version of the gth global model local to the server, where the gth global model is the global model of the latest version local to the server when the jth local model is received, i is an integer not less than 1, j is an integer not less than 1, and g is an integer not less than 1;
s202: determining a version difference value of the jth local model and the g global model based on the version of the jth local model and the version of the g global model;
s203: under the condition that the version difference value meets a preset condition, determining weights corresponding to the jth local model and the g global model respectively;
s204: and aggregating the jth local model and the g global model based on the weights respectively corresponding to the jth local model and the g global model to obtain a (g+1) global model.
In the embodiment of the disclosure, a server triggers m devices, m is an integer not less than 2, and sends an initial global model to the m devices, and at this time, the global model version can be recorded as 0; after each global epoch (epoch), the version of the global model is incremented by 1. Each device receives an initial global model issued by the server, trains the initial global model based on the local data to obtain an initial local model, at this time, the local model version can be recorded as 0, and after each local epoch, the local model version is increased by 1.
In the embodiment of the disclosure, if the version of the jth local model is denoted as j1 and the version of the g global model is denoted as g1, the difference value between the jth local model and the g global model may be denoted as j1-g1, g1-j1, or |g1-j1|. The specific selection of the difference value representation form can be set according to the requirements.
In the embodiment of the disclosure, if the version of the jth local model is denoted as j1 and the version of the g global model is denoted as g1, then j1=j, g1=g, or j1=j-1, g1=g-1. The specific counting method can be set according to the requirement.
In the embodiment of the disclosure, a server responds to a jth local model sent by an ith device, aggregates the jth local model and a g global model based on the jth local model to obtain a (g+1) th global model, and broadcasts the (g+1) th global model to m devices; the ith device trains the (g+1) th global model based on the local data to obtain the (j+1) th local model, and uploads the (j+1) th local model to the server.
In an embodiment of the present disclosure, the preset condition may include: the version difference value does not reach a preset threshold. Here, the preset threshold may be set or adjusted according to accuracy or speed requirements. Denote the preset threshold as tv1; if the version of the gth global model is g1 and the version of the jth local model is j1, the version difference value is g1-j1, and if g1-j1 < tv1, the gth global model and the jth local model are aggregated to obtain the (g+1)th global model. Taking tv1=10, g1=8 and j1=6 as an example: the current latest global model local to the server is the 8th global model with version 8, and the 6th local model transmitted by device i=5 has version 6, so the version difference value is 8-6=2; since 2 < 10, the 8th global model and the 6th local model are aggregated to obtain the 9th global model. Similarly, taking tv1=10, g1=12 and j1=1 as an example: the current latest global model local to the server is the 12th global model with version 12, and the 1st local model transmitted by device i=8 has version 1, so the version difference value is 12-1=11; since 11 > 10, the 1st local model transmitted by device i=8 is discarded, that is, no aggregation of the 12th global model with the 1st local model is performed. It should be noted that if, at the next moment after receiving the 1st local model transmitted by device i=8, the server receives the 4th local model transmitted by device i=9 with version 4, the version difference value is 12-4=8; since 8 < 10, the 12th global model and the 4th local model are aggregated to obtain the 13th global model, and the 13th global model is issued to the m devices.
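For illustration only, a minimal Python sketch of this staleness check is given below; it assumes simple integer version bookkeeping, and the function and variable names are illustrative rather than taken from the disclosed implementation.

def should_aggregate(global_version: int, local_version: int, preset_threshold: int) -> bool:
    # Aggregate only when the version difference value has not reached the preset threshold.
    version_difference = global_version - local_version
    return version_difference < preset_threshold

# Examples matching the text above (tv1 = 10):
assert should_aggregate(8, 6, 10)        # difference 2 < 10, so aggregate
assert not should_aggregate(12, 1, 10)   # difference 11 >= 10, so discard the local model

The per-device thresholds described below can be obtained by simply making preset_threshold a function of the device's priority.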
In an embodiment of the present disclosure, the preset conditions may include: the version difference value does not reach a preset threshold value, and the difference value between the version of the local model adopted by the g global model and the version of the j local model obtained through aggregation is smaller than a target threshold value.
Here, the local model used when aggregating to obtain the gth global model may come from the (j-1)th local model transmitted by device i or from the kth local model transmitted by another device. Denote the preset threshold as tv1 and the target threshold as tv2; let the version of the gth global model be g1, the version of the jth local model be j1, and the version of the kth local model be k1. The version difference value is g1-j1, and the difference between the version of the jth local model and the version of the kth local model used when aggregating the gth global model is j1-k1. If g1-j1 < tv1 and j1-k1 < tv2, the gth global model and the jth local model are aggregated to obtain the (g+1)th global model. Therefore, by adding the condition on the difference between the version of the local model used in the previous aggregation and the version of the newly received local model, the importance of the received local model can be further judged, which further improves the accuracy of asynchronous federal learning.
It should be noted that the foregoing preset conditions are merely exemplary, and are not intended to limit all possible conditions included in the preset conditions, but are not intended to be exhaustive.
In practical application, priority can be set for each device, and different preset thresholds are set for the local models returned by different devices according to the priority of each device. For example, the preset threshold corresponding to the device with the higher priority is higher than the preset threshold corresponding to the device with the lower priority. Therefore, the importance of the local models of different devices can be further adjusted by setting different preset thresholds for the local models returned by different devices, so that the accuracy of asynchronous federal learning is further improved, and the diversity of the asynchronous federal learning can be further enriched.
In an embodiment of the present disclosure, obtaining a version of a jth local model includes: and determining the version of the jth local model based on the related information of the jth local model sent by the ith device. Wherein the relevant information of the jth local model includes: the version of the jth local model, the device number i, and the data of the jth local model are merely exemplary, and are not intended to limit all possible information included in the related information, but are not intended to be exhaustive.
According to the above technical solution, when the version difference value between the global model and the local model satisfies the preset condition, the weight (importance) of the local model during server-side aggregation is dynamically adjusted. This retains important information from the device while reducing the influence of staleness; by dynamically adjusting the importance of the model, the loss is minimized, improving the accuracy of asynchronous federal learning and hence the accuracy of the global model obtained by server-side aggregation.
In some embodiments, S203 may include: and determining weights corresponding to the jth local model and the g global model respectively in response to detecting that the version difference value does not reach the preset threshold.
In an embodiment of the present disclosure, determining weights corresponding to the jth local model and the jth global model respectively includes: and determining the weights corresponding to the jth local model and the g global model according to the magnitude of the version difference value. The larger the version difference value is, the smaller the weight ratio of the local model is; the smaller the version difference value, the larger the weight of the local model, but the weight of the g global model is larger than the weight of the j local model.
In an embodiment of the present disclosure, determining weights corresponding to the jth local model and the jth global model respectively includes: and searching a preset weight ratio comparison table according to the magnitude of the version difference value, and determining the weights corresponding to the jth local model and the g global model respectively. The comparison table comprises a corresponding relation between the version difference value and the local model weight ratio. The present disclosure does not limit the manner in which the weight-to-ratio lookup table is obtained. For example, the weight ratio comparison table may be a comparison table determined according to a priori values, or may be a comparison table set according to accuracy or speed requirements.
The foregoing is merely exemplary, and is not intended to limit the manner in which all possible determinations of the weights corresponding to each of the jth local model and the g global model may be included, but is not intended to be exhaustive.
According to the above technical solution, the weights of the global model and the local model in server-side aggregation are dynamically adjusted according to the staleness, thereby improving the accuracy of the aggregated global model.
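As a purely illustrative sketch of such staleness-dependent weighting and of the aggregation in S204, one possible choice is shown below; the decay function, the hyper-parameter value and the NumPy representation of the models are assumptions, not the formulas given later in this disclosure.

import numpy as np

def staleness_weight(version_difference: int, mu: float = 0.4) -> float:
    # Illustrative weight of the local model: it decays as the version gap grows and
    # always stays below 0.5, so the global model keeps the larger weight.
    return mu / (version_difference + 1.0)

def aggregate(global_model: np.ndarray, local_model: np.ndarray, version_difference: int) -> np.ndarray:
    # S204: weighted aggregation of the jth local model and the gth global model.
    alpha = staleness_weight(version_difference)
    return alpha * local_model + (1.0 - alpha) * global_model

A lookup table such as the one mentioned above could replace staleness_weight without changing the aggregation step.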
In the disclosed embodiment, when the server receives the local model w_i^o that device i trained from the global model of version o, the server updates the current global model w_j according to equation (1):
w_{j+1} = α_i^t · w_i^o + (1 - α_i^t) · w_j,    (1)
where w_{j+1} is the next global model after w_j, α_i^t is the weight of the local model w_i^o, and (1 - α_i^t) is the weight of the global model w_j.
The global model update method affects the accuracy of the global model obtained in each round of aggregation. Therefore, the training objective is reformulated as a bi-level optimization problem: the outer level selects A = {α_1, α_2, ..., α_m}, the set of values corresponding to the importance of the models uploaded by the devices, so as to minimize the global loss, while the inner level is the minimization of the local loss functions.
The weight α_i^t is expressed by a dynamic polynomial function of the staleness j - o + 1 (formula (2)), where μ_α is a hyper-parameter, j represents the version of the current global model, o corresponds to the version of the global model received by the ith device prior to local training, and two control parameters govern the ith device's local training at the t-th time. These parameters are dynamically adjusted according to formula (3), in which each control parameter is updated using a correspondingly dynamically adjusted learning rate and the corresponding partial derivative of the loss function.
Algorithm 1 is the model aggregation algorithm in the server: a single server thread triggers m' devices to train locally, and w_i^o denotes the local model uploaded by the i-th device.
[Pseudo code of Algorithm 1]
Next, the pseudo code of Algorithm 1 is explained line by line.
Lines 1-6: a single thread periodically triggers the device to perform parallel local training.
Line 8: The server receives the local model w_i^o sent by the ith device.
Line 9: the server obtains information of the local model of the ith device and verifies whether the model uploaded by the ith device is within a outdated limit.
Line 10: If the local model of the ith device is not within the staleness limit, the server ignores it.
Lines 12-13: The server updates the control parameters according to formula (3) and calculates the weight α_i^t based on formula (2).
Line 14: the server updates the global model.
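Based on the line-by-line description above, a hedged Python sketch of the server-side loop of Algorithm 1 might look as follows; the threading layout, the device interface (start_local_training and the uploads queue), the NumPy representation of models and the simple 1/staleness decay standing in for formulas (2) and (3) are all assumptions made for illustration.

import queue
import threading
import time

class AsyncFLServer:
    def __init__(self, initial_model, devices, staleness_limit, mu_alpha=0.5):
        self.global_model = initial_model        # current global model (NumPy array)
        self.global_version = 0                  # current global version
        self.devices = devices
        self.staleness_limit = staleness_limit   # predefined staleness threshold tau
        self.mu_alpha = mu_alpha
        self.uploads = queue.Queue()             # (device_id, local_model, base_version o)

    def trigger_loop(self, period, rounds):
        # Lines 1-6: a single thread periodically triggers devices to train in parallel.
        def run():
            for _ in range(rounds):
                for device in self.devices:
                    device.start_local_training(self.global_model, self.global_version)
                time.sleep(period)
        threading.Thread(target=run, daemon=True).start()

    def aggregate_forever(self):
        while True:
            # Line 8: receive a local model uploaded by device i.
            device_id, local_model, base_version = self.uploads.get()
            # Line 9: check whether the uploaded model is within the staleness limit.
            staleness = self.global_version - base_version + 1
            if staleness > self.staleness_limit:
                continue                         # Line 10: ignore an over-stale local model.
            # Lines 12-13: update the control parameters (formula (3)) and compute the
            # local-model weight alpha (formula (2)); a plain 1/staleness decay is used
            # here only because the exact formulas are not reproduced in this text.
            alpha = self.mu_alpha / staleness
            # Line 14: update the global model as in equation (1).
            self.global_model = alpha * local_model + (1.0 - alpha) * self.global_model
            self.global_version += 1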
Algorithm 1 thus addresses the problem of local model staleness in asynchronous federal learning.
According to the above technical solution, the weights and aggregation control parameters of the global model and the local model in server-side aggregation are dynamically adjusted according to the staleness and the gradient, thereby improving the accuracy of the aggregated global model.
In some embodiments, the asynchronous federal learning method may include:
s205: and discarding the j-th local model under the condition that the version difference value does not meet the preset condition.
In the embodiment of the present disclosure, the server, in response to receiving the jth local model sent by the ith device, obtains information of the jth local model. As shown in FIG. 3, if the version difference value between the jth local model and the gth global model does not reach the preset threshold, S203-S204 are executed; if the version difference value reaches the preset threshold, S205 is executed, that is, S203-S204 are no longer executed.
In the disclosed embodiments, convergence can be guaranteed by limiting staleness; therefore, two mechanisms are considered to ensure a staleness bound. The first mechanism is that the server discards a local model uploaded by the device side when its staleness exceeds a predefined threshold τ. With this mechanism, the inefficiency caused by keeping powerful devices idle can be avoided. The second mechanism is to force fast devices to wait for slow devices; however, unlike a static blocking mechanism, whether a selected device is allowed to continue training is decided dynamically based on a prediction of the maximum staleness of all devices. For example, at a given time, suppose the server selects m' devices for the next epoch, the version of the global model is v, and a set of devices is still performing local training. Then, for each selected device i, we predict the time t_i at which device i can return its updated local model based on its configured computing and communication capabilities. We can further find the oldest global-model version v_old among the devices that are still training and the devices triggered at time t_i. When v + 1 - v_old < τ, device i can be triggered to train; otherwise, device i needs to wait until the above condition is satisfied.
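A minimal sketch of this dynamic triggering decision is given below; the way t_i is predicted and the version bookkeeping are assumed, and the names are illustrative only.

def can_trigger(current_version: int,
                in_training_versions: list,
                predicted_base_version: int,
                tau: int) -> bool:
    # current_version:        version v of the global model at the moment of selection
    # in_training_versions:   base versions of devices whose local training is still running
    # predicted_base_version: base version the candidate device would train from at time t_i
    # tau:                    predefined staleness threshold
    v_old = min(in_training_versions + [predicted_base_version])
    return current_version + 1 - v_old < tau

When can_trigger returns False, the candidate device simply waits until the condition holds.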
In the embodiment of the present disclosure, staleness is embodied as the version difference between the global model and the local model. When the version difference is too large, i.e. the staleness is high, a local model of a lower version can adversely influence the global model during server-side aggregation and drag the global model back toward a lower version.
According to the above technical solution, for a local model uploaded by any one or more devices, or for buffered local models, the server can obtain the version of the uploaded local model in a timely manner, so that the influence caused by staleness of the device-side local model parameters is effectively reduced and the accuracy of the global model obtained by server-side aggregation is improved.
In some embodiments, the asynchronous federal learning method may further comprise:
s206: receiving a global model updating request sent by an ith device, wherein the global model updating request is sent before the jth local model is generated;
S207: and transmitting a kth global model to the ith device based on the global model updating request, wherein the kth global model is the global model of the latest version locally of the server when the global model updating request is received, and k is more than 0 and less than or equal to g.
In the embodiment of the disclosure, a server receives a global model update request sent by an ith device, and the server responds to the global model update request of the ith device and sends a global model of a local latest version, namely a kth global model, to the ith device when the server receives the global model update request. That is, the server, after obtaining the global model by aggregation, actively broadcasts the latest global model to each device, and when receiving the global model updating request actively sent by the device, issues the local latest global model to the device end sending the global model updating request.
Therefore, the server side can send the local latest global model to the equipment side sending the global model updating request according to the global model updating request actively sent by the equipment, so that the equipment side can train according to the latest global model, and the accuracy of the local training of the equipment is improved.
Fig. 4 is a second flow chart of an asynchronous federation learning method according to an embodiment of the present disclosure, which may be applied to a device side in an asynchronous federation learning system. In some possible implementations, the asynchronous federal learning-based approach may also be implemented by way of a processor invoking computer readable instructions stored in memory. As shown in fig. 4, the asynchronous federal learning method includes:
s401: the method comprises the steps of sending a jth local model to a server, wherein the jth local model is a local model obtained by an ith device after the completion of a jth round of local training, j is an integer not smaller than 1, and i is an integer not smaller than 1;
s402: and receiving a (g+1) th global model sent by the server, wherein the (g+1) th global model is a global model obtained by aggregating the (j) th local model and the (g) th global model based on the condition that the version difference value of the (j) th local model and the (g) th global model meets a preset condition, the (g) th global model is a global model of the latest version of the local server when the (j) th local model is received, and g is an integer not less than 1.
In the disclosed embodiment, when the ith device is triggered to perform local training, the ith device receives a global model w_o from the server and uses w_o as its initial local model. The ith device then performs local training, during which it updates the local model with the stochastic gradient descent (SGD) method on its local data set D_i as defined in equation (4):
w_i^{o,l} = w_i^{o,l-1} - η_i · ∇F_i(w_i^{o,l-1}; ξ_{l-1}),    (4)
where o is the version of the global model, l represents the number of local epochs, η_i is the learning rate of the ith device, and ∇F_i(w_i^{o,l-1}; ξ_{l-1}) is the gradient computed on an unbiased mini-batch ξ_{l-1} sampled from D_i.
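A minimal sketch of this local SGD update, assuming the model is a NumPy array and that grad_fn computes the unbiased mini-batch gradient of the local loss (both assumptions for illustration), is:

import numpy as np

def local_sgd_epoch(w, dataset, grad_fn, learning_rate, batch_size, rng):
    # One pass over the local data set with the update of equation (4):
    # w <- w - eta_i * gradient on an unbiased mini-batch.
    order = rng.permutation(len(dataset))
    for start in range(0, len(order), batch_size):
        minibatch = [dataset[k] for k in order[start:start + batch_size]]
        w = w - learning_rate * grad_fn(w, minibatch)
    return w

# rng = np.random.default_rng(0) would provide the sampler assumed above.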
In order to reduce the version gap between the local model and the global model, a newer global model is aggregated with the local model during the device's local training, since the global model may be updated frequently while the local model is being trained. Aggregating the newer global model from the server on the ith device can thus effectively reduce the gap between the local model and the global model. However, it is non-trivial for the device side to determine the weights for aggregating the new global model.
When a new global model w_g is received, the ith device aggregates it with its current local model w_i^{o,l-1}, where l is the number of the local epoch, using equation (5):
w_i^{o,l-1} ← β_i^t · w_g + (1 - β_i^t) · w_i^{o,l-1},    (5)
where β_i^t is the weight of the new global model on the ith device in the t-th local training. Equation (5) differs from equation (1) in that the received new global model corresponds to a higher global version. The weight β_i^t is calculated by equation (6) as a dynamic function parameterized by a hyper-parameter μ_β and by control parameters that are dynamically adjusted according to equation (7), each with its own learning rate.
Algorithm 2 is the model update algorithm on the device: the device generates and sends a global model update request, obtains the global model transmitted by the server, and updates its local model based on that global model.
[Pseudo code of Algorithm 2]
Next, the pseudo code of Algorithm 2 is explained line by line.
Line 1: the ith device generates a local epoch number l for sending a request for a new global model *
Line 5: In the l*-th local epoch, the ith device sends the request to the server.
Line 6: The ith device may block and wait for the new global model, or wait for it in parallel in a separate thread.
Line 8: the device receives a new global model.
Line 9: The weight β_i^t is updated using equation (6).
Line 10: The local model w_i^{o,l-1} is updated using equation (5).
Line 11: The control parameters are updated using equation (7).
Line 13: the local model is updated using equation (4).
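Based on the line-by-line description above, a hedged Python sketch of the device-side procedure of Algorithm 2 might look as follows; the server interface (request_latest_global_model and upload), grad_fn, the mini-batch size and the simple decay standing in for formulas (5)-(7) are assumptions made for illustration only.

import random
import numpy as np

def device_local_training(server, device_id, dataset, grad_fn,
                          global_model, global_version,
                          num_local_epochs, learning_rate, mu_beta=0.4):
    # Line 1: choose the local epoch l* at which a new global model will be requested.
    l_star = random.randint(1, num_local_epochs)
    w = np.copy(global_model)            # start from the received global model w_o
    base_version = global_version
    for l in range(1, num_local_epochs + 1):
        if l == l_star:
            # Line 5: send the request for a new global model to the server.
            new_model, new_version = server.request_latest_global_model(device_id)
            if new_version > base_version:
                # Lines 9-11: update the weight of the new global model (formula (6)),
                # aggregate it with the current local model (formula (5)) and adjust the
                # control parameters (formula (7)); a plain version-gap decay is assumed.
                beta = mu_beta / (new_version - base_version + 1)
                w = beta * new_model + (1.0 - beta) * w
        # Line 13: one local update of equation (4) on an unbiased mini-batch.
        batch = random.sample(range(len(dataset)), k=min(32, len(dataset)))
        w = w - learning_rate * grad_fn(w, [dataset[k] for k in batch])
    # Step (6) of the framework: upload the trained local model to the server.
    server.upload(device_id, w, base_version)
    return w

The blocking or threaded wait of line 6 is collapsed here into a synchronous call for brevity.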
Algorithm 2 thus addresses the problem of local model staleness in asynchronous federal learning.
In this way, the control parameters of local training and the weights of the global model and the local model are dynamically adjusted according to the staleness and the gradient, thereby improving the accuracy of the model.
In some embodiments, the asynchronous federal learning method may further comprise:
s403: before sending the jth local model to a server, sending a global model updating request to the server;
s404: receiving a kth global model returned by the server based on the global model updating request, wherein the kth global model is a global model of the latest version of the server when the global model updating request is received, and k is more than 0 and less than or equal to g;
s405: training the kth global model based on the local data set to obtain the jth local model.
Here, S403 to S405 are performed before S401.
Therefore, the equipment end can actively send a global model updating request to the server according to the requirements, and further train according to the global model obtained based on the global model updating request, so that the accuracy of the generated local model is improved.
In some embodiments, S405 includes:
s4051: the local target local model and the kth global model are aggregated to obtain a target global model, wherein the target local model is a local model obtained in the jth round of training process;
s4052: and training the target global model by using the local data set to obtain the j local model.
Here, the jth round training is used for training to obtain the jth local model.
In the embodiment of the disclosure, comparing a kth global model with a target local model, and if the kth global model is different from the target local model, performing an aggregation operation; if the kth global model is the same as the target local model, not performing aggregation operation, and taking the target local model or the kth global model as the target global model.
Therefore, not only can the important information in the equipment be reserved, but also the influence of the over-time of the local model can be reduced, and the accuracy of the jth local model is improved.
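A minimal sketch of this comparison-then-aggregation step (S4051) is given below; the array representation and the fixed mixing weight are illustrative assumptions.

import numpy as np

def build_target_global_model(target_local_model, kth_global_model, beta=0.3):
    # If the kth global model equals the target local model, skip aggregation and use it as-is.
    if np.array_equal(kth_global_model, target_local_model):
        return np.copy(target_local_model)
    # Otherwise aggregate the two models to obtain the target global model.
    return beta * kth_global_model + (1.0 - beta) * target_local_model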
In some embodiments, S405 further comprises:
s4053: and training a local latest global model of the ith device before the j-th training is started by utilizing the local data set to obtain the target local model.
Here, S4053 is performed before S4051.
Therefore, through the mode of waiting while training, not only can the important information in the equipment be reserved, but also the local training speed of the model can be improved.
In the embodiment of the present disclosure, the ith device sends a global model update request to the server and, before receiving the kth global model sent by the server, keeps waiting for it; after the ith device receives the kth global model sent by the server, it performs local training based on the kth global model and its local data to obtain the jth local model.
In this way, the device pauses local training and waits for the latest global model returned in response to the global model update request, then performs local training based on it, which improves the accuracy of model learning.
FIG. 5 is a schematic diagram of a framework for obtaining a local model in an asynchronous federal learning method according to an embodiment of the present disclosure, as shown in FIG. 5, in some embodiments, the local latest global model of an ith device is trained using a local data set before the jth round of training begins to obtain the target local model; the local target local model and the kth global model are aggregated to obtain a target global model, wherein the target local model is a local model obtained in the jth round of training process; and training the target global model by using the local data set to obtain the j local model.
FIG. 6 is a schematic diagram II of a framework for obtaining a local model in an asynchronous federal learning method according to embodiments of the present disclosure, as shown in FIG. 6, in some embodiments, waiting continues until a kth global model sent by a server is received; and after receiving the kth global model sent by the server, training the kth global model based on the local data set to obtain the jth local model.
It should be understood that the frame diagrams shown in fig. 5 and 6 are merely exemplary and not limiting, and that they are scalable, and that various obvious changes and/or substitutions may be made by one skilled in the art based on the examples of fig. 5 and 6, and the resulting technical solutions still fall within the scope of the disclosure of the embodiments of the present disclosure.
Fig. 7 is a flowchart three of an asynchronous federation learning method according to an embodiment of the present disclosure, which may be applied to a server side and a device side in an asynchronous federation learning system. In some possible implementations, the asynchronous federal learning-based approach may also be implemented by way of a processor invoking computer readable instructions stored in memory. As shown in fig. 7, the asynchronous federal learning method includes:
The server, in response to receiving the jth local model sent by the ith device, obtains the version of the jth local model and the version of the gth global model local to the server, where the gth global model is the global model of the latest version local to the server when the jth local model is received, i is an integer not less than 1, j is an integer not less than 1, and g is an integer not less than 1; determines the version difference value between the jth local model and the gth global model based on the version of the jth local model and the version of the gth global model; determines, when the version difference value satisfies the preset condition, the weights corresponding to the jth local model and the gth global model respectively; and aggregates the jth local model and the gth global model based on these weights to obtain the (g+1)th global model;
and the ith equipment responds to receiving the (g+1) th global model sent by the server, and carries out local training based on the (g+1) th global model to obtain the (j+1) th local model.
In the embodiment of the present disclosure, consider an input method vendor that wants to avoid directly acquiring users' original keyboard and voice data. Traditional distributed machine learning cannot cope with the system heterogeneity in this scenario (for example, differences in the performance, charging status and network status of users' mobile phones) or with the data heterogeneity (the training data local to each user is Non-IID). Therefore, by applying FL, the keyboard and speech input models are improved without uploading the users' raw data.
Taking an asynchronous federal learning system consisting of 1 central server and 100 mobile phones as an example, the servers and the mobile phones are connected by using the Internet, and the actual application of the algorithm 1 and the algorithm 2 is introduced:
the server side algorithm is shown in algorithm 1:
lines 1-6: the server periodically triggers some idle handsets in parallel.
Line 8: and the server side in each global epoch acquires a local model from any mobile phone i.
Lines 12-14: dynamically adjusting global model aggregation parameters by using a formula (3), and performing global model aggregation.
The algorithm of the mobile phone end is shown in algorithm 2:
Line 1: After the ith mobile phone is triggered by the server, it determines the time slot l* in the local training process at which it will send a request to acquire a new global model.
Line 2: and acquiring the latest global model of the server side.
Line 6: Local epochs are trained using stochastic gradient descent; in time slot l*, the new global model is obtained.
Lines 9-11: if the global model is updated, the local model aggregation parameters are dynamically adjusted using the formula and local model aggregation is performed.
Line 13: the local model is updated.
According to the above technical solution, the model aggregation weights and parameters are dynamically adjusted according to the staleness and the gradient, which effectively reduces the influence caused by staleness in the asynchronous federal learning system and improves the accuracy of the model.
It should be understood that the schematic diagram shown in fig. 7 is merely exemplary and not limiting, and that it is scalable, and that a person skilled in the art may make various obvious changes and/or substitutions based on the example of fig. 7, while still falling within the scope of the disclosure of the embodiments of the present disclosure.
The embodiment of the disclosure provides an asynchronous federal learning device, which is applied to a server, as shown in fig. 8, and may include: a first obtaining module 801, configured to obtain, in response to receiving a jth local model sent by an ith device, a version of the jth local model and a version of a server local jth global model; the g global model is the global model of the local latest version of the server when the j local model is received, i is an integer not less than 1, j is an integer not less than 1, and g is an integer not less than 1; a first determining module 802, configured to determine a version difference value between the jth local model and the g global model based on the version of the jth local model and the version of the g global model; a second determining module 803, configured to determine weights corresponding to the jth local model and the g global model when the version difference value meets a preset condition; the aggregation module 804 is configured to aggregate the jth local model and the g global model based on the weights corresponding to the jth local model and the g global model, to obtain the (g+1) th global model.
In some embodiments, the second determining module 803 includes: and the determining submodule is used for determining the weight corresponding to each of the jth local model and the g global model in response to the fact that the version difference value does not reach the preset threshold.
In some embodiments, the apparatus may further comprise: a second obtaining module 805 (not shown in fig. 8) configured to obtain the aggregation control parameter of the (g+1) th round; the aggregation module 804 is further configured to aggregate the jth local model with the g global model based on the weights corresponding to the jth local model and the g global model, and the aggregation control parameters of the (g+1) th round, to obtain the (g+1) th global model.
In some embodiments, the apparatus may further comprise: a discarding module 806 (not shown in fig. 8) is configured to discard the jth local model if the version difference value does not satisfy the preset condition.
In some embodiments, the apparatus may further comprise: a first receiving module 807 (not shown in fig. 8) configured to receive a global model update request sent by an ith device, the global model update request being sent before generating a jth local model; a first sending module 808 (not shown in fig. 8) configured to send, to the ith device, a kth global model based on the request for updating the global model, where the kth global model is a global model of a latest version local to the server when the request for updating the global model is received, and k is greater than 0 and less than or equal to g.
It should be understood by those skilled in the art that the functions of each processing module in the asynchronous federal learning apparatus according to the embodiments of the present disclosure may be understood with reference to the foregoing description of the asynchronous federal learning method applied to the server side, and each processing module in the asynchronous federal learning apparatus according to the embodiments of the present disclosure may be implemented by an analog circuit implementing the functions described in the embodiments of the present disclosure, or may be implemented by running software that performs the functions described in the embodiments of the present disclosure on an electronic device.
The asynchronous federal learning device can improve the accuracy of the global model aggregated by the server.
The embodiment of the disclosure provides an asynchronous federal learning device, which is applied to a device end, as shown in fig. 9, and may include: a second sending module 901, configured to send a jth local model to a server, where the jth local model is a local model obtained by the ith device after the jth local training is completed, j is an integer not less than 1, and i is an integer not less than 1; and a second receiving module 902, configured to receive a (g+1) th global model sent by the server, where the (g+1) th global model is a global model obtained by the server by aggregating the jth local model and the g global model in the case that the version difference value between the jth local model and the g global model meets the preset condition, the g global model is the global model of the local latest version of the server when the jth local model is received, and g is an integer not less than 1.
In some embodiments, the apparatus may further comprise: a third sending module 903 (not shown in fig. 9) configured to send a global model update request to the server before sending the jth local model to the server; a third receiving module 904 (not shown in fig. 9), configured to receive a kth global model returned by the server based on the global model update request, where the kth global model is a global model of a latest version local to the server when the global model update request is received, and k is greater than 0 and less than or equal to g; a third obtaining module 905 (not shown in fig. 9) is configured to train the kth global model based on the local data set to obtain the jth local model.
In some embodiments, the third acquisition module 905 (not shown in fig. 9) may include: the aggregation sub-module is used for aggregating the local target local model and the kth global model to obtain a target global model, wherein the target local model is a local model obtained in the jth round of training process; and the first training sub-module is used for training the target global model by utilizing the local data set to obtain a j local model.
In some embodiments, the third acquisition module 905 (not shown in fig. 9) may further comprise: and the second training sub-module is used for training the local latest global model of the ith device before the jth training is started by utilizing the local data set to obtain the target local model.
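On the device side, one local round built from the sub-modules above could look like the sketch below; PyTorch is used only for illustration, and the mixing factor mu, the optimizer settings and every function name are assumptions rather than the disclosed procedure.

```python
import torch

def local_round(model: torch.nn.Module, kth_global_params: dict,
                target_local_params: dict, local_loader,
                mu: float = 0.5, epochs: int = 1, lr: float = 0.01) -> dict:
    # Aggregation sub-module: mix the local target model with the k-th global model
    # to obtain the target global model (both are assumed to be full state dicts).
    mixed = {name: mu * target_local_params[name] + (1.0 - mu) * kth_global_params[name]
             for name in kth_global_params}
    model.load_state_dict(mixed)

    # First training sub-module: train the target global model on the local data set.
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = torch.nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in local_loader:
            optimizer.zero_grad()
            loss_fn(model(x), y).backward()
            optimizer.step()

    # The resulting parameters are the j-th local model sent back to the server.
    return {k: v.detach().clone() for k, v in model.state_dict().items()}
```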
It should be understood by those skilled in the art that the functions of each processing module in the asynchronous federal learning device according to the embodiments of the present disclosure may be understood with reference to the foregoing description of the asynchronous federal learning method, and each processing module in the asynchronous federal learning device applied to the device side according to the embodiments of the present disclosure may be implemented by an analog circuit implementing the functions described in the embodiments of the present disclosure, or may be implemented by running software that performs the functions described in the embodiments of the present disclosure on an electronic device.
The asynchronous federal learning device can improve the accuracy and efficiency of local training on the device.
An embodiment of the present disclosure provides an asynchronous federal learning system, as shown in fig. 10, comprising: m devices, which are used for sending the trained local model to a server, wherein m is an integer greater than 2; the server is used for, in response to receiving the jth local model sent by the ith device, obtaining the version of the jth local model and the version of the local g global model; determining a version difference value of the jth local model and the g global model based on the version of the jth local model and the version of the g global model; under the condition that the version difference value meets the preset condition, determining the weight corresponding to each of the jth local model and the g global model; based on the weights corresponding to the jth local model and the g global model, aggregating the jth local model and the g global model to obtain a (g+1) global model; the g global model is the global model of the latest local version when the server receives the jth local model sent by the ith device; i is an integer of 0 or more and m or less, j is an integer of 1 or more, and g is an integer of 1 or more.
The asynchronous federal learning system of the embodiment of the disclosure can improve the accuracy of the model obtained by aggregation.
The embodiment of the disclosure also provides a scene schematic diagram of asynchronous federal learning, as shown in fig. 11, an electronic device such as a cloud server receives a trained local model sent by each terminal; each time the electronic device receives the local model, the following processing is executed:
in response to receiving a jth local model sent by a terminal i, determining a version difference value of the jth local model and the g global model based on the version of the jth local model and the version of the local current latest-version global model (namely, the g global model); under the condition that the version difference value meets the preset condition, determining the weight corresponding to each of the jth local model and the g global model; based on the weights corresponding to the jth local model and the g global model, aggregating the jth local model and the g global model to obtain a (g+1) global model; and broadcasting the (g+1) global model to each terminal.
The terminal i then carries out local training according to the received (g+1) global model to obtain the (j+1) local model.
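To make the interaction of fig. 11 concrete, the single-process stand-in below wires the assumed AsyncFLServer sketch given earlier to a toy terminal; the Terminal class, its scalar model and the random arrival order are illustrative assumptions only.

```python
import random

class Terminal:
    """Toy terminal i: 'local training' nudges the received model toward a private target."""
    def __init__(self, target: float):
        self.target = target
        self.latest_global = None

    def train_on(self, global_params: dict) -> dict:
        # Stand-in for local training on the terminal's own data set.
        return {k: v + 0.1 * (self.target - v) for k, v in global_params.items()}

    def receive_broadcast(self, global_params: dict):
        self.latest_global = global_params       # the broadcast (g+1)-th global model

def simulate(server, terminals, rounds: int = 200):
    for _ in range(rounds):
        t = random.choice(terminals)              # some terminal finishes its j-th local round
        version, global_params = server.on_update_request()
        local_params = t.train_on(global_params)  # the j-th local model, based on version k
        new_global = server.on_local_model(version, local_params)
        if new_global is not None:
            for other in terminals:
                other.receive_broadcast(new_global)
```

In this toy loop each terminal trains and reports immediately, so the version difference is always zero; in a real deployment the terminals train concurrently, the fetched version can lag behind the server's latest global model, and that is exactly when the staleness check and the dynamic weights above take effect.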
The number of the terminals and the electronic devices is not limited, and a plurality of terminals and a plurality of electronic devices can be included in practical application.
It should be understood that the scene diagram shown in fig. 11 is merely illustrative and not restrictive, and that various obvious changes and/or substitutions may be made by one skilled in the art based on the example of fig. 11, and the resulting technical solutions still fall within the scope of the disclosure of the embodiments of the present disclosure.
In the technical scheme of the disclosure, the acquisition, storage, application and the like of the related user personal information all conform to the provisions of relevant laws and regulations, and do not violate public order and good customs.
According to embodiments of the present disclosure, the present disclosure also provides an electronic device, a readable storage medium and a computer program product.
Fig. 12 shows a schematic block diagram of an example electronic device 1200 that can be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile apparatuses, such as personal digital assistants, cellular telephones, smartphones, wearable devices, and other similar computing apparatuses. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 12, the device 1200 includes a computing unit 1201 that can perform various appropriate actions and processes according to a computer program stored in a Read-Only Memory (ROM) 1202 or a computer program loaded from a storage unit 1208 into a Random Access Memory (RAM) 1203. In the RAM 1203, various programs and data required for the operation of the device 1200 may also be stored. The computing unit 1201, the ROM 1202, and the RAM 1203 are connected to each other via a bus 1204. An Input/Output (I/O) interface 1205 is also connected to the bus 1204.
Various components in device 1200 are connected to I/O interface 1205, including: an input unit 1206 such as a keyboard, mouse, etc.; an output unit 1207 such as various types of displays, speakers, and the like; a storage unit 1208 such as a magnetic disk, an optical disk, or the like; and a communication unit 1209, such as a network card, modem, wireless communication transceiver, etc. The communication unit 1209 allows the device 1200 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunications networks.
The computing unit 1201 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of computing unit 1201 include, but are not limited to, a central processing unit (Central Processing Unit, CPU), a graphics processing unit (Graphics Processing Unit, GPU), various dedicated artificial intelligence (Artificial Intelligence, AI) computing chips, various computing units running machine learning model algorithms, digital signal processors (Digital Signal Processor, DSP), and any suitable processors, controllers, microcontrollers, etc. The computing unit 1201 performs the various methods and processes described above, such as the asynchronous federal learning method. For example, in some embodiments, the asynchronous federal learning method may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as storage unit 1208. In some embodiments, part or all of the computer program may be loaded and/or installed onto device 1200 via ROM 1202 and/or communication unit 1209. When a computer program is loaded into RAM 1203 and executed by computing unit 1201, one or more steps of the asynchronous federal learning method described above may be performed. Alternatively, in other embodiments, the computing unit 1201 may be configured to perform the asynchronous federal learning method by any other suitable means (e.g., by means of firmware).
Various implementations of the systems and techniques described above can be implemented in digital electronic circuitry, integrated circuitry, field programmable gate arrays (Field Programmable Gate Array, FPGA), application specific integrated circuits (Application Specific Integrated Circuit, ASIC), application specific standard products (Application Specific Standard Product, ASSP), systems on chip (System On Chip, SOC), complex programmable logic devices (Complex Programmable Logic Device, CPLD), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: being implemented in one or more computer programs that may be executed and/or interpreted on a programmable system including at least one programmable processor, which may be a special purpose or general purpose programmable processor that may receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for carrying out methods of the present disclosure may be written in any combination of one or more programming languages. These program code may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowchart and/or block diagram to be implemented. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a random access memory, a read-only memory, an erasable programmable read-only memory (EPROM), a flash memory, an optical fiber, a portable compact disc read-only memory (Compact Disk Read-Only Memory, CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a Cathode Ray Tube (CRT) or Liquid Crystal Display (LCD) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a background component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such background, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local area network (Local Area Network, LAN), wide area network (Wide Area Network, WAN) and the internet.
The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, a server of a distributed system, or a server incorporating a blockchain. It should be appreciated that various forms of the flows shown above may be used to reorder, add, or delete steps. For example, the steps recited in the present disclosure may be performed in parallel, sequentially, or in a different order, provided that the desired results of the disclosed aspects are achieved, and are not limited herein.
The above detailed description should not be taken as limiting the scope of the present disclosure. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions, improvements, etc. that are within the principles of the present disclosure are intended to be included within the scope of the present disclosure.

Claims (23)

1. An asynchronous federation learning method is applied to a server side and comprises the following steps:
responding to receiving a jth local model sent by an ith device, and acquiring a version of the jth local model and a version of a server local g global model; the g global model is the global model of the latest version of the server local when the jth local model is received, i is an integer not less than 1, j is an integer not less than 1, and g is an integer not less than 1;
determining a version difference value of the jth local model and the g global model based on the version of the jth local model and the version of the g global model;
under the condition that the version difference value meets a preset condition, determining weights corresponding to the jth local model and the g global model respectively;
And aggregating the jth local model and the g global model based on the weights respectively corresponding to the jth local model and the g global model to obtain a (g+1) global model.
2. The method according to claim 1, wherein the determining weights of the jth local model and the g global model when the version difference value meets a preset condition includes:
and in response to detecting that the version difference value does not reach a preset threshold, determining weights corresponding to the jth local model and the g global model respectively.
3. The method of claim 1, further comprising:
acquiring aggregation control parameters of the (g+1) th round;
the aggregating the jth local model and the g global model based on the weights corresponding to the jth local model and the g global model to obtain a (g+1) global model includes:
and aggregating the jth local model and the g global model based on the weights respectively corresponding to the jth local model and the g global model and the aggregation control parameters of the (g+1) th round to obtain the (g+1) th global model.
4. The method of claim 1, further comprising:
and discarding the jth local model under the condition that the version difference value does not meet the preset condition.
5. The method of claim 1, further comprising:
receiving a global model updating request sent by the ith device, wherein the global model updating request is sent before the jth local model is generated;
and transmitting a kth global model to the ith device based on the global model updating request, wherein the kth global model is the global model of the latest version of the server local when the global model updating request is received, and k is a positive integer not more than g.
6. An asynchronous federal learning method applied to a device side comprises the following steps:
the method comprises the steps of sending a jth local model to a server, wherein the jth local model is a local model obtained by an ith device after the jth local training is finished, j is an integer not smaller than 1, and i is an integer not smaller than 1;
receiving a (g+1) global model sent by the server, wherein the (g+1) global model is a global model obtained by the server by aggregating the jth local model and the g global model under the condition that the version difference value of the jth local model and the g global model meets a preset condition, the g global model is the global model of the latest version of the server local when the jth local model is received, and g is an integer not smaller than 1.
7. The method of claim 6, further comprising:
before sending the jth local model to the server, sending a global model updating request to the server;
receiving a kth global model returned by the server based on the global model updating request, wherein the kth global model is a global model of the latest version of the local server when the global model updating request is received, and k is a positive integer not more than g;
and training the kth global model based on a local data set to obtain the jth local model.
8. The method of claim 7, wherein the training the kth global model based on the local dataset to obtain the jth local model comprises:
the local target local model and the kth global model are aggregated to obtain a target global model, wherein the target local model is a local model obtained in the jth round of training process;
and training the target global model by using the local data set to obtain the jth local model.
9. The method of claim 8, wherein the training the kth global model based on the local dataset results in the jth local model, further comprising:
And training a local latest global model of the ith device before the j-th training is started by using the local data set to obtain the target local model.
10. An asynchronous federation learning method applied to an asynchronous federation learning system, comprising:
the ith device sends a trained jth local model to a server, wherein i is an integer not less than 1, and j is an integer not less than 1;
the server responds to receiving a jth local model sent by an ith device, and obtains a version of the jth local model and a version of a server local g global model; the g global model is the global model of the latest version of the server local when the jth local model is received, and g is an integer not less than 1; determining a version difference value of the jth local model and the g global model based on the version of the jth local model and the version of the g global model; under the condition that the version difference value meets a preset condition, determining weights corresponding to the jth local model and the g global model respectively; and aggregating the jth local model and the g global model based on the weights respectively corresponding to the jth local model and the g global model to obtain a (g+1) global model.
11. An asynchronous federal learning device, applied to a server side, comprising:
the system comprises a first acquisition module, a second acquisition module and a third acquisition module, wherein the first acquisition module is used for acquiring a version of a jth local model and a version of a server local g global model in response to receiving the jth local model sent by an ith device; the g global model is the global model of the latest version of the server local when the jth local model is received, i is an integer not less than 1, j is an integer not less than 1, and g is an integer not less than 1;
the first determining module is used for determining a version difference value of the jth local model and the g global model based on the version of the jth local model and the version of the g global model;
the second determining module is used for determining the weights corresponding to the jth local model and the g global model respectively under the condition that the version difference value meets a preset condition;
and the aggregation module is used for aggregating the jth local model and the g global model based on the weights respectively corresponding to the jth local model and the g global model to obtain a (g+1) global model.
12. The apparatus of claim 11, wherein the second determination module comprises:
And the determining submodule is used for determining the weight corresponding to each of the jth local model and the g global model in response to the fact that the version difference value does not reach a preset threshold.
13. The apparatus of claim 11, further comprising:
the second acquisition module is used for acquiring the aggregation control parameters of the (g+1) th round;
the aggregation module is further configured to aggregate the jth local model with the g global model based on weights corresponding to the jth local model and the g global model, and aggregation control parameters of the (g+1) -th round, to obtain the (g+1) -th global model.
14. The apparatus of claim 11, further comprising:
and the discarding module is used for discarding the jth local model under the condition that the version difference value does not meet the preset condition.
15. The apparatus of claim 11, further comprising:
the first receiving module is used for receiving a global model updating request sent by the ith device, wherein the global model updating request is sent before the jth local model is generated;
the first sending module is configured to send, to the ith device, a kth global model based on the global model update request, where the kth global model is a global model of a latest version of the server local when the global model update request is received, and k is a positive integer not greater than g.
16. An asynchronous federal learning device, applied to a device side, comprising:
the second sending module is used for sending a jth local model to the server, wherein the jth local model is a local model obtained by the ith device after the jth local training is finished, j is an integer not less than 1, and i is an integer not less than 1;
the second receiving module is configured to receive a (g+1) th global model sent by the server, where the (g+1) th global model is a global model obtained by the server by aggregating the jth local model and the g global model in the case that a version difference value between the jth local model and the g global model meets a preset condition, the g global model is the global model of the local latest version of the server when the jth local model is received, and g is an integer not less than 1.
17. The apparatus of claim 16, further comprising:
a third sending module, configured to send a global model update request to the server before sending the jth local model to the server;
the third receiving module is used for receiving a kth global model returned by the server based on the global model updating request, wherein the kth global model is a global model of the latest version of the server local when the global model updating request is received, and k is a positive integer not more than g;
And the third acquisition module is used for training the kth global model based on the local data set to obtain the jth local model.
18. The apparatus of claim 17, wherein the third acquisition module comprises:
the aggregation sub-module is used for aggregating the local target local model and the kth global model to obtain a target global model, wherein the target local model is a local model obtained in the jth round training process;
and the first training submodule is used for training the target global model by utilizing the local data set to obtain the jth local model.
19. The apparatus of claim 18, wherein the third acquisition module further comprises:
and the second training sub-module is used for training the local latest global model of the ith equipment before the j-th training is started by utilizing the local data set to obtain the target local model.
20. An asynchronous federal learning system, comprising:
m devices, which are used for sending the trained local model to a server, wherein m is an integer greater than 2;
the server is used for, in response to receiving the jth local model sent by the ith device, obtaining the version of the jth local model and the version of the local g global model; determining a version difference value of the jth local model and the g global model based on the version of the jth local model and the version of the g global model; under the condition that the version difference value meets a preset condition, determining weights corresponding to the jth local model and the g global model respectively; based on the weight corresponding to each of the jth local model and the g global model, aggregating the jth local model and the g global model to obtain a (g+1) th global model; the g global model is the global model of the latest local version when the server receives the jth local model sent by the ith device; i is an integer of 0 or more and m or less, j is an integer of 1 or more, and g is an integer of 1 or more.
21. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-10.
22. A non-transitory computer readable storage medium storing computer instructions for causing the computer to perform the method of any one of claims 1-10.
23. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any of claims 1-10.
CN202211686530.7A 2022-12-27 2022-12-27 Asynchronous federal learning method, device, system and storage medium Withdrawn CN116245194A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211686530.7A CN116245194A (en) 2022-12-27 2022-12-27 Asynchronous federal learning method, device, system and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211686530.7A CN116245194A (en) 2022-12-27 2022-12-27 Asynchronous federal learning method, device, system and storage medium

Publications (1)

Publication Number Publication Date
CN116245194A true CN116245194A (en) 2023-06-09

Family

ID=86625172

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211686530.7A Withdrawn CN116245194A (en) 2022-12-27 2022-12-27 Asynchronous federal learning method, device, system and storage medium

Country Status (1)

Country Link
CN (1) CN116245194A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117936080A (en) * 2024-03-22 2024-04-26 中国人民解放军总医院 Solid malignant tumor clinical auxiliary decision-making method and system based on federal large model
CN117936080B (en) * 2024-03-22 2024-06-04 中国人民解放军总医院 Solid malignant tumor clinical auxiliary decision-making method and system based on federal large model


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20230609