CN116050540B - Self-adaptive federal edge learning method based on joint bi-dimensional user scheduling - Google Patents

Self-adaptive federal edge learning method based on joint bi-dimensional user scheduling Download PDF

Info

Publication number
CN116050540B
CN116050540B (application CN202310050202.6A)
Authority
CN
China
Prior art keywords
edge
batch
evaluation efficiency
model
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310050202.6A
Other languages
Chinese (zh)
Other versions
CN116050540A (en)
Inventor
潘春雨
张九川
李学华
姚媛媛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Information Science and Technology University
Original Assignee
Beijing Information Science and Technology University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Information Science and Technology University filed Critical Beijing Information Science and Technology University
Priority to CN202310050202.6A priority Critical patent/CN116050540B/en
Publication of CN116050540A publication Critical patent/CN116050540A/en
Application granted granted Critical
Publication of CN116050540B publication Critical patent/CN116050540B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 Multiprogramming arrangements
    • G06F 9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5061 Partitioning or combining of resources
    • G06F 9/5072 Grid computing
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The application provides a self-adaptive federal edge learning method based on joint bi-dimensional user scheduling, which comprises the following steps: based on the loss function and the training period, acquiring the evaluation efficiency of model training; acquiring batch data based on the evaluation efficiency, and acquiring a trained initial model based on the batch data; and screening the initial model to obtain a final trained model. The application can further improve the accuracy and efficiency of the federal learning method.

Description

Self-adaptive federal edge learning method based on joint bi-dimensional user scheduling
Technical Field
The application belongs to the technical field of artificial intelligence, and particularly relates to a self-adaptive federal edge learning method based on joint bi-dimensional user scheduling.
Background
With the development of mobile communications and the Internet of Things, the volume of data generated by devices such as smartphones and IoT sensors is growing explosively. Machine learning models require large and rich data sets for training. On the one hand, traditional centralized machine learning algorithms need to upload large amounts of data to a central node, and such large-scale data transmission leads to long transmission times and congestion. On the other hand, traditional distributed machine learning algorithms require the training data set to be uploaded centrally, evenly partitioned, and then redistributed to multiple worker nodes, which easily causes privacy leakage.
The proposal of federated edge learning (Federated Edge Learning, FEL) provides a solution to the above problems. In FEL, model training is performed on the edge devices, coordinated by a multi-access edge computing center server. FEL achieves iterative updating in two steps: 1) Local model training: each intelligent edge device trains a local model on its local data set and then uploads the model parameters to the central server. 2) Global model aggregation: the central server aggregates the uploaded local model parameters into a global model, updates the global model, and then sends the updated model back to the intelligent edge devices to start a new iteration. Compared with traditional centralized and distributed machine learning algorithms, the iterative training process of FEL does not require intelligent edge devices to upload local data, so FEL has greater potential for protecting data privacy.
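The two-step iteration can be pictured with a minimal sketch, given below. It assumes a FedAvg-style weighted parameter average for the aggregation step and a toy linear model with plain mini-batch SGD for local training; all names, the toy model, and the aggregation rule are illustrative assumptions, since the document itself only describes the two steps abstractly.

```python
import numpy as np

def local_update(w, X, y, lr=0.05, steps=10, batch=16):
    """Step 1: local training on one edge device (toy linear model, mini-batch SGD)."""
    for _ in range(steps):
        idx = np.random.choice(len(X), size=min(batch, len(X)), replace=False)
        grad = X[idx].T @ (X[idx] @ w - y[idx]) / len(idx)   # MSE gradient
        w = w - lr * grad
    return w

def aggregate(local_ws, sizes):
    """Step 2: global aggregation as a data-size-weighted average (FedAvg-style assumption)."""
    sizes = np.asarray(sizes, dtype=float)
    return np.average(np.stack(local_ws), axis=0, weights=sizes / sizes.sum())

# One toy FEL run over K simulated edge devices.
rng = np.random.default_rng(0)
d, K = 5, 4
datasets = [(rng.normal(size=(50, d)), rng.normal(size=50)) for _ in range(K)]
w_global = np.zeros(d)
for round_ in range(3):                                    # global iterations
    local_ws = [local_update(w_global.copy(), X, y) for X, y in datasets]
    w_global = aggregate(local_ws, [len(X) for X, _ in datasets])
```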
However, the limited computing power of intelligent edge devices and the heterogeneity and imbalance of local data sets pose significant challenges to the convergence speed and accuracy of the global model. In recent years, related work has studied how to optimize the iterative process of FEL. Most prior studies use stochastic gradient descent (Stochastic Gradient Descent, SGD) for local model training. Some of the literature improves the redundancy rate in federated learning through information coding design, in order to reduce the impact of devices with longer local model training times.
However, none of the above studies takes into account the differences in training completion time caused by the computing power of smart devices and the heterogeneity of data sets, and waiting for all edge devices to complete local model training delays the global model aggregation process. Furthermore, because the data collected by a device depends on its local environment and its own properties, local data sets are often large and unevenly distributed across devices, so device scheduling must take the data attributes of different devices into account. Therefore, a method that jointly considers device computing power and data-set distribution characteristics needs to be designed to improve the model training accuracy and convergence speed of the algorithm.
In the (full-batch) gradient descent algorithm, each iteration must compute the gradient over the entire data set; when the data set contains many samples, each iteration consumes a great deal of time and computing resources. The update formula is as follows:
w_{t+1} = w_t - η∇f(w_t)
where w_t denotes the model parameters at the t-th iteration, ∇f(w_t) denotes the gradient of the loss function at w_t, and η denotes the learning rate, i.e., the distance the loss function moves in the negative gradient direction during each descent step.
In stochastic gradient descent, only one sample is selected at a time to compute a stochastic gradient, so the time per gradient update is greatly reduced. The update formula is as follows:
w_{t+1} = w_t - η∇f_i(w_t)
where f_i is the loss on the single sample i drawn at random at the t-th iteration.
however, the random gradient of one sample does not represent the gradient of the entire data set, so that the random gradient descent method does not run in the negative direction of the full gradient for each iteration, and the convergence process is relatively jittery. Since the random gradient of a single sample and the full gradient of all samples differ significantly, the number of iterations required for convergence is greatly increased using the random gradient descent algorithm.
The mini-batch gradient descent algorithm is the trade-off between gradient descent and stochastic gradient descent. It updates the model parameters with the gradient of a subset of samples at each iteration, and the update formula is as follows:
w_{t+1} = w_t - η∇f_{ξ_t}(w_t)
where ξ_t denotes the random batch of samples selected at the t-th iteration; assuming a batch size of m, this gives:
w_{t+1} = w_t - (η/m) Σ_{i∈ξ_t} ∇f_i(w_t)
however, the batch size employed by each iteration of the conventional small batch gradient descent algorithm needs to be configured before training begins and remains unchanged throughout the training process. Along with the progress of the local model training process, the model precision is gradually improved, and the self-adaptive selection of the batch size according to the model precision is beneficial to improving the convergence rate.
On the other hand, in active learning, a model can be trained with less data when the selected samples are diverse and feature-rich. Therefore, in FEL, active learning can be referenced to select diverse data for training; when a device holds non-independent and identically distributed (non-IID) data, selecting data with higher diversity can improve both the convergence speed and the convergence accuracy.
At present, the local model accuracy and local model training time of existing federated edge learning have a great influence on the global model aggregation and model updating process, so the batch size used for gradient descent needs to be adjusted automatically during local model training, which improves model accuracy while accelerating algorithm convergence;
the existing federal edge learning does not consider the difference of training completion time caused by the computing power of intelligent equipment and the isomerism of data sets, and waiting for all edge equipment to complete local model training delays the global model aggregation process; the data collected by a device is typically bulky depending on the local environment and the device's own properties, and the data is not independently co-distributed. Therefore, the application provides a two-dimensional user scheduling strategy based on task completion time and data self attribute aiming at the non-independent and same distribution characteristic of user data, and the application further improves the precision and convergence speed of the global model while reducing the waiting time.
Disclosure of Invention
In order to solve the technical problems, the application provides a self-adaptive federal edge learning method based on joint bi-dimensional user scheduling, which can further improve the accuracy and efficiency of the federal edge learning method.
In order to achieve the above object, the present application provides an adaptive federal edge learning method based on joint bi-dimensional user scheduling, including:
s1, acquiring evaluation efficiency of model training based on a loss function and a training period;
s2, acquiring batch data based on the evaluation efficiency, and acquiring a trained initial model based on the batch data;
s3, screening the initial model, and putting the screened model back into the S1 for repeated iteration until a plurality of iteration processes are completed, so as to obtain a final trained model.
Optionally, obtaining the evaluation efficiency of model training includes:
acquiring, based on the loss function, the loss of the current iteration and the loss variation relative to several iterations earlier;
and acquiring the evaluation efficiency based on the loss variation and the training period.
Optionally, the loss variation is:
Δloss=f(x-n)-f(x)
where Δloss is the loss variation, f(x-n) is the loss value from n iterations earlier, and f(x) is the loss value of the current iteration.
Optionally, the evaluation efficiency is:
e = Δloss / t
where e is the evaluation efficiency, Δloss is the loss variation, and t is the training period.
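As a minimal sketch (the function name, the history window n, and the use of a plain list for the loss history are assumptions for illustration), the evaluation efficiency can be computed from the recorded losses as follows:

```python
def evaluation_efficiency(loss_history, n, t):
    """e = Δloss / t, with Δloss = f(x-n) - f(x): the loss drop over the
    last n iterations, normalized by the constant training period t."""
    if len(loss_history) <= n:
        return 0.0               # not enough history yet
    delta_loss = loss_history[-1 - n] - loss_history[-1]
    return delta_loss / t
```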
Optionally, acquiring the batch data includes:
based on the evaluation efficiency, presetting a triggering condition of batch switching;
and randomly partitioning the local data into batches of different sizes and storing them in a list; selecting the smallest batch in the list for the first iteration; calculating the evaluation efficiency after each iteration ends; and, when the acquired evaluation efficiency meets a trigger condition for batch switching, switching to the batch of the corresponding preset value as the batch data.
Optionally, the triggering condition includes: the first trigger condition, the second trigger condition and the third trigger condition;
the first triggering condition is as follows: the n-th evaluation efficiency is smaller than the (n-1)-th evaluation efficiency;
the second triggering condition is as follows: the current evaluation efficiency is lower than the historical evaluation efficiency;
the third triggering condition is as follows: the current evaluation efficiency is negative.
Optionally, switching to the batch of the preset value as the batch data includes:
when the acquired evaluation efficiency meets the first trigger condition, switching to a batch with a first preset value as the batch data;
when the acquired evaluation efficiency meets the second trigger condition, switching to a batch with a second preset value as the batch data;
when the acquired evaluation efficiency meets the third trigger condition, switching to a batch with a third preset value as the batch data;
the first preset value is larger than the second preset value, and the third preset value is larger than the first preset value.
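A minimal sketch of how these three trigger conditions might map onto the preset batch values follows; the function name, the argument names, and the order in which the conditions are checked are assumptions for illustration, not prescribed by the application.

```python
def choose_preset(e_curr, e_prev, e_hist_max, preset1, preset2, preset3):
    """Return the preset batch value to switch to, or None to keep the
    current batch; preset3 > preset1 > preset2, as required above."""
    if e_curr < 0:                               # third condition: cannot converge
        return preset3
    if e_curr < e_hist_max:                      # second condition: below historical efficiency
        return preset2
    if e_prev is not None and e_curr < e_prev:   # first condition: below previous efficiency
        return preset1
    return None
```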
Optionally, screening the initial model includes:
when the initial model is derived from an edge device whose computing capacity is lower than the preset computing capacity, rejecting the initial model;
carrying out diversity analysis on the models remaining after the rejection, and removing the device corresponding to a model when the model's diversity index is lower than a threshold value; otherwise, retaining it.
Optionally, when the initial model originates from an edge device with a lower computing power than a preset computing power, rejecting the initial model includes:
obtaining the locally trained parameters of the initial model from a subset of edge devices with heterogeneous computing capabilities, and setting a longest-time threshold for each device according to its performance;
comparing the local training time of the initial model for each device i in the subset of edge devices with the longest-time threshold specified for device i; if the local training time is not greater than the longest-time threshold specified for device i, retaining device i in the device subset; if the local training time is greater than the longest-time threshold specified for device i, removing device i from the device subset; the devices in the edge-device subset that meet the threshold requirement are updated into a new subset M_1.
Optionally, the performing diversity analysis on the remaining models after the culling includes:
traversing the new subset M_1, computing the diversity index g of each device, and storing it in a diversity index array G; then arranging the diversity indexes in G from large to small and screening the array G from large to small according to the diversity constraint; if the diversity index g of device i in array G is within the diversity constraint, retaining device i in the new subset M_1; if the diversity index g of device i in array G is outside the diversity constraint, removing device i from the new subset M_1; finally, outputting the updated device subset M_2 and using this user scheduling setting in federated learning for the current iteration.
Compared with the prior art, the application has the following advantages and technical effects:
according to the method, based on the evaluation efficiency, the batch data are acquired, and the batch data are determined more accurately, so that the accuracy and the efficiency of the dynamic balance model are improved, and the accuracy and the efficiency of the federal learning algorithm are further improved.
The method and the device can obtain the final trained model and more accurately determine the models used for federated learning aggregation, further improving the accuracy and efficiency of the federated learning algorithm.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this specification, illustrate embodiments of the application and together with the description serve to explain the application. In the drawings:
FIG. 1 is a schematic flow chart of the adaptive federal edge learning method of the present application.
Detailed Description
It should be noted that, without conflict, the embodiments of the present application and features of the embodiments may be combined with each other. The application will be described in detail below with reference to the drawings in connection with embodiments.
It should be noted that the steps illustrated in the flowcharts of the figures may be performed in a computer system such as a set of computer executable instructions, and that although a logical order is illustrated in the flowcharts, in some cases the steps illustrated or described may be performed in an order other than that illustrated herein.
As shown in fig. 1, the present application provides an adaptive federal edge learning method based on joint bi-dimensional user scheduling, which includes:
s1, acquiring evaluation efficiency of model training based on a loss function and a training period;
s2, acquiring batch data based on the evaluation efficiency, and acquiring a trained initial model based on the batch data;
s3, screening the initial model, putting the screened model back into the S1 for repeated iteration until a plurality of iteration processes are completed, namely when the success rate of the training result reaches a certain height, wherein the screened model is the model after final training.
Further, obtaining the evaluation efficiency of model training includes:
acquiring, based on the loss function, the loss of the current iteration and the loss variation relative to several iterations earlier;
and acquiring the evaluation efficiency based on the loss variation and the training period.
Further, obtaining the batch data includes:
based on the evaluation efficiency, presetting a triggering condition of batch switching;
and randomly partitioning the local data into batches of different sizes and storing them in a list; selecting the smallest batch in the list for the first iteration; calculating the evaluation efficiency after each iteration ends; and, when the acquired evaluation efficiency meets a trigger condition for batch switching, switching to the batch of the corresponding preset value as the batch data.
Further, the triggering condition includes: the first trigger condition, the second trigger condition and the third trigger condition;
the first triggering condition is as follows: the n-th evaluation efficiency is smaller than the (n-1)-th evaluation efficiency;
the second triggering condition is as follows: the current evaluation efficiency is lower than the historical evaluation efficiency;
the third triggering condition is as follows: the current evaluation efficiency is negative.
Further, switching to a batch of a preset value as the batch data includes:
when the acquired evaluation efficiency meets the first trigger condition, switching to a batch with a first preset value as the batch data;
when the acquired evaluation efficiency meets the second trigger condition, switching to a batch with a second preset value as the batch data;
when the acquired evaluation efficiency meets the third trigger condition, switching to a batch with a third preset value as the batch data;
the first preset value is larger than the second preset value, and the third preset value is larger than the first preset value.
Further, screening the initial model includes:
when the initial model is derived from an edge device whose computing capacity is lower than the preset computing capacity, rejecting the initial model;
carrying out diversity analysis on the models remaining after the rejection, and removing the device corresponding to a model when the model's diversity index is lower than a threshold value; otherwise, retaining it.
Further, when the initial model is derived from an edge device with a lower computing power than a preset computing power, rejecting the initial model includes:
obtaining the locally trained parameters of the initial model from a subset of edge devices with heterogeneous computing capabilities, and setting a longest-time threshold for each device according to its performance;
comparing the local training time of the initial model for each device i in the subset of edge devices with the longest-time threshold specified for device i; if the local training time is not greater than the longest-time threshold specified for device i, retaining device i in the device subset; if the local training time is greater than the longest-time threshold specified for device i, removing device i from the device subset; the devices in the edge-device subset that meet the threshold requirement are updated into a new subset M_1.
Further, the diversity analysis of the models remaining after the rejection comprises:
traversing the new subset M_1, computing the diversity index g of each device, and storing it in a diversity index array G; then arranging the diversity indexes in G from large to small and screening the array G from large to small according to the diversity constraint; if the diversity index g of device i in array G is within the diversity constraint, retaining device i in the new subset M_1; if the diversity index g of device i in array G is outside the diversity constraint, removing device i from the new subset M_1; finally, outputting the updated device subset M_2 and using this user scheduling setting in federated learning for the current iteration.
Examples
Traditional centralized machine learning algorithms require large amounts of data to be uploaded to a central node, and large-scale data transmission results in long transmission times and congestion. In addition, traditional distributed machine learning algorithms require the training data set to be uploaded centrally, evenly partitioned, and then redistributed to multiple worker nodes, which easily causes privacy leakage. The adaptive dynamic-batch gradient descent algorithm combined with the two-dimensional user scheduling strategy provides a solution to these problems. For example, in the context of the industrial internet, traditional factories need to be converted into intelligent factories, but because a factory's production and manufacturing data are commercial secrets, there are high requirements on data privacy and security. Therefore, the algorithm of this embodiment trains the factory's intelligent production model on the factory's local data, keeps the confidential data on the factory's local server, and only transmits the intelligent production model to the cloud server, which greatly reduces the risk of data leakage. In addition, because the data of a single intelligent factory is limited in volume and uniform in structure, the accuracy of the intelligent production model would suffer greatly during training; the algorithm of this embodiment therefore uploads the intelligent production models trained by several intelligent factories of the same type to a cloud server for federated learning aggregation, which greatly improves the accuracy of the intelligent production model in each individual factory. The algorithm of this embodiment can therefore ensure both security and efficiency in practical applications.
As shown in fig. 1, the embodiment provides an adaptive federal edge learning method based on joint bi-dimensional user scheduling, which specifically includes the steps of:
1. adaptive dynamic batch gradient descent algorithm
Edge device: the evaluation efficiency is determined according to the loss function and the running time, where the loss function reflects the accuracy of the model on historical batch data and the model is obtained by fitting samples to labels (if a simplified formula y = Ax + B is assumed, x is the sample, y is the label, and A and B are the model parameters); the batch data (batch size) is determined based on the historical batch data, the evaluation efficiency, and the historical evaluation efficiency, and the batch data is used to determine the dynamically updated model. For example, the local data of an intelligent factory includes sample data and labels, so the factory can train its intelligent production model locally, with the intelligent factory acting as the edge device.
Step (1): loss prediction: based on the sublinear convergence behavior of the gradient descent algorithm, calculate through the loss function the variation Δloss between the loss of the current iteration and the loss from n iterations earlier.
Δloss=f(x-n)-f(x)。
Where Δloss is the loss variation, f() is the loss function, f(x) is the loss value of the current iteration, and f(x-n) is the loss value from n iterations earlier; the formula therefore gives the change in loss over the last n iterations.
Step (2): efficiency evaluation: the training period t is a constant time threshold used for evaluating efficiency and can be set as needed. The algorithm efficiency e = Δloss / t is used to evaluate the model training effect of the previous n iterations under the same batch, and is determined by the loss variation and the training period.
Step (3): the dynamic-fitting gradient descent algorithm determines the trigger conditions for batch switching through the efficiency evaluation parameter. The local data are randomly partitioned into batches of different sizes and stored in a list L, and the algorithm selects the smallest batch to start the first iteration. The algorithm efficiency e is calculated after each iteration ends. When the n-th algorithm efficiency e is smaller than the (n-1)-th algorithm efficiency e, a switch to a larger batch as the batch data is triggered. To avoid local optima, the algorithm allows switching back to a smaller batch as the batch data when the current algorithm efficiency is lower than the historical efficiency. When the current algorithm efficiency is negative, the current batch evidently cannot make the algorithm converge normally, so the algorithm switches to a larger batch as the batch data and also prevents that batch from being visited again in subsequent training, preventing the algorithm from oscillating.
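The loop below is a minimal sketch of this dynamic-batch procedure. The function and variable names, the toy local trainer, the candidate batch sizes, and the exact order in which the three switching conditions are checked are all assumptions for illustration, not the application's prescribed implementation.

```python
def evaluation_efficiency(loss_history, n, t):
    """e = Δloss / t over the last n iterations (0.0 until enough history)."""
    if len(loss_history) <= n:
        return 0.0
    return (loss_history[-1 - n] - loss_history[-1]) / t

def adaptive_batch_training(L, train_one_iter, n=5, t=1.0, rounds=100):
    """L: list of candidate batch sizes (ascending); train_one_iter(b) -> loss."""
    banned = set()                      # batches that produced negative efficiency
    idx = 0                             # start from the smallest batch in L
    losses, best_e, prev_e = [], float("-inf"), None
    for _ in range(rounds):
        losses.append(train_one_iter(L[idx]))
        e = evaluation_efficiency(losses, n, t)
        if e < 0:                       # cannot converge: larger batch, never revisit this one
            banned.add(L[idx])
            idx = next((j for j in range(idx + 1, len(L)) if L[j] not in banned), idx)
        elif e < best_e:                # below historical efficiency: switch back to a smaller batch
            idx = next((j for j in range(idx - 1, -1, -1) if L[j] not in banned), idx)
        elif prev_e is not None and e < prev_e:   # below previous efficiency: larger batch
            idx = next((j for j in range(idx + 1, len(L)) if L[j] not in banned), idx)
        best_e, prev_e = max(best_e, e), e
    return losses

# Toy usage: a fake local trainer whose loss decays with the iteration count.
state = {"k": 0}
def fake_train(batch_size):
    state["k"] += 1
    return 1.0 / (state["k"] * batch_size ** 0.5)

history = adaptive_batch_training([16, 32, 64, 128], fake_train)
```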
2. Two-dimensional user scheduling policy
The central server receives models from multiple edge devices; the central server rejects the models of some edge devices, and the remaining models are used for federated learning aggregation. For example, intelligent factories of the same type upload their respective trained intelligent production models to the cloud, and after federated learning aggregation the models are delivered back to the intelligent factories for a new round of federated edge learning.
The elimination method used by the central server comprises the following two steps:
the method comprises the following steps: the difference of edge equipment is reduced, and the purpose is to improve the speed and reduce the calculation time. When the model is derived from the edge equipment with weaker computing power, rejecting the model; otherwise, the reservation is made.
The main process is as follows: the algorithm first obtains the local-model training parameters from a subset M of edge devices with heterogeneous computing power, and sets the longest-time threshold array T specified by the user scheduling strategy according to the performance of each device. The local model training time of each device i in M is then compared with the longest-time threshold specified for device i. If the local training time is less than or equal to the threshold, device i is retained in the device subset M; if the local training time is greater than the threshold, device i is removed from the device subset M. The devices in subset M that meet the threshold requirement are updated into a new subset M_1.
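A minimal sketch of this first screening step, assuming the per-device training times and thresholds have already been measured and stored in dictionaries (the names and numbers are illustrative):

```python
def filter_by_training_time(M, train_time, T):
    """Keep device i only if its local training time is within the
    longest-time threshold T[i] set by the user scheduling strategy."""
    return [i for i in M if train_time[i] <= T[i]]

# Toy usage with assumed measurements (seconds).
M = ["dev0", "dev1", "dev2"]
train_time = {"dev0": 3.1, "dev1": 9.4, "dev2": 2.0}
T = {"dev0": 5.0, "dev1": 5.0, "dev2": 5.0}
M1 = filter_by_training_time(M, train_time, T)   # -> ["dev0", "dev2"]
```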
The local model and the data set are stored on the device. For example, if device i is rejected in federated edge learning, the local model on device i does not participate in the federated aggregation; rejecting device i therefore means its model need not be considered separately.
Method 2: improve the diversity of the edge-device data sets, in order to improve accuracy. When the diversity index of a model is lower than a threshold, the model is rejected; otherwise, it is retained. The diversity index may optionally be the Gini-Simpson index or the Shannon entropy index; the present scheme is not limited to either.
The main process is to traverse the subset M_1, compute the diversity index g of the data set on each device, and store it in the diversity index array G. The diversity indexes in G are then arranged from large to small, and the array G is screened from large to small according to the diversity constraint. If the diversity index g of device i in array G is within the diversity constraint, device i is retained in the device subset M_1; if the diversity index g of device i in array G is outside the diversity constraint, device i is removed from the device subset M_1. Finally, the updated device subset M_2 is output, and this user scheduling setting is used in federated learning for the current iteration.
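A minimal sketch of this second screening step follows; here the "diversity constraint" is read as keeping the top-k devices by diversity index, which is an assumption on my part since the document does not spell the constraint out:

```python
def filter_by_diversity(devices, diversity_index, k):
    """Keep the k devices with the largest data-set diversity index."""
    ranked = sorted(devices, key=lambda i: diversity_index[i], reverse=True)
    return ranked[:k]

# Toy usage on the subset M1 produced by the first screening step.
M1 = ["dev0", "dev2", "dev3"]
g = {"dev0": 0.61, "dev2": 0.42, "dev3": 0.55}    # assumed diversity values
M2 = filter_by_diversity(M1, g, k=2)               # -> ["dev0", "dev3"]
```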
By performing diversity analysis on the data set of each device, the device corresponding to a model is rejected when the model's diversity index is lower than the threshold; otherwise, it is retained.
The formula of the Gini-Simpson index is:
g = 1 - Σ_{c=1}^{C} p_c²
where C is the total number of categories and p_c is the probability of category c.
The formula of the Shannon entropy index is:
g = - Σ_{c=1}^{C} p_c log p_c
where C is the total number of categories and p_c is the probability of category c.
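As a small illustration, both indices can be computed from the label distribution of a device's local data set; this is only a sketch, and the function names are illustrative:

```python
import math
from collections import Counter

def class_probabilities(labels):
    counts = Counter(labels)
    return [n / len(labels) for n in counts.values()]

def gini_simpson(labels):
    """Gini-Simpson index: 1 - sum_c p_c^2 (larger = more diverse)."""
    return 1.0 - sum(p * p for p in class_probabilities(labels))

def shannon_entropy(labels):
    """Shannon entropy index: -sum_c p_c * log(p_c) (larger = more diverse)."""
    return -sum(p * math.log(p) for p in class_probabilities(labels))

print(gini_simpson([0, 0, 1, 2, 2, 2]))     # ~0.611
print(shannon_entropy([0, 0, 1, 2, 2, 2]))  # ~1.011
```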
In summary, the main technical scheme of this embodiment is as follows:
1. Adaptive dynamic batch gradient descent algorithm: edge device: the evaluation efficiency is determined according to the loss function and the running time, where the loss function reflects the accuracy of the model on historical batch data and the model is obtained by fitting samples to labels; the batch data (batch size) is determined from the historical batch data, the evaluation efficiency, and the historical evaluation efficiency, and the batch data is used to determine the model.
2. Reduce the disparity among edge devices, in order to improve speed and reduce computation time. When a model comes from an edge device with weaker computing power, the model is rejected; otherwise, it is retained.
3. Improve the diversity of the edge-device data sets, in order to improve accuracy. When the diversity index of a model is lower than a threshold, the model is rejected; otherwise, it is retained. Optionally, the diversity index may be the Gini-Simpson index or the Shannon entropy index; this embodiment is not limited to either.
The beneficial effects of this embodiment are:
1. The batch data is determined more accurately, so that model accuracy and training efficiency can be dynamically balanced and further improved.
2. The models used for federated learning aggregation are determined more accurately, and the accuracy and efficiency of the federated learning algorithm can be further improved.
The present application is not limited to the above-mentioned embodiments, and any changes or substitutions that can be easily understood by those skilled in the art within the technical scope of the present application are intended to be included in the scope of the present application. Therefore, the protection scope of the present application should be subject to the protection scope of the claims.

Claims (4)

1. The self-adaptive federal edge learning method based on the joint bi-dimensional user scheduling is characterized by comprising the following steps of:
s1, acquiring evaluation efficiency of model training based on a loss function and a training period;
s2, acquiring batch data based on the evaluation efficiency, and acquiring a trained initial model based on the batch data;
the obtaining of the batch data includes:
based on the evaluation efficiency, presetting a triggering condition of batch switching;
randomly partitioning local data into batches of different sizes and storing them in a list, selecting the smallest batch in the list to start the first iteration, calculating the evaluation efficiency after each iteration ends, and switching to a batch with a preset value as the batch data when the acquired evaluation efficiency meets a triggering condition for batch switching;
the triggering conditions include: the first trigger condition, the second trigger condition and the third trigger condition;
the first triggering condition is as follows: the n-th evaluation efficiency is smaller than the (n-1)-th evaluation efficiency;
the second triggering condition is as follows: the current evaluation efficiency is lower than the historical evaluation efficiency;
the third triggering condition is as follows: the current evaluation efficiency is negative;
the batch switched to the preset value is used as the batch data and comprises the following steps:
when the acquired evaluation efficiency meets the first trigger condition, switching to a batch with a first preset value as the batch data;
when the acquired evaluation efficiency meets the second trigger condition, switching to a batch with a second preset value as the batch data;
when the acquired evaluation efficiency meets the third trigger condition, switching to a batch with a third preset value as the batch data;
the first preset value is larger than the second preset value, and the third preset value is larger than the first preset value;
s3, screening the initial model, and putting the screened model back into the S1 for repeated iteration until a plurality of iteration processes are completed, so as to obtain a final trained model;
screening the initial model includes:
when the initial model is derived from an edge device whose computing capacity is lower than the preset computing capacity, rejecting the initial model;
carrying out diversity analysis on the models remaining after the rejection, and removing the edge device corresponding to a model when the model's diversity index is lower than a threshold value; otherwise, retaining it;
when the initial model is derived from the edge equipment with lower than preset computing power, rejecting the initial model comprises:
obtaining the locally trained parameters of the initial model from a subset of edge devices with heterogeneous computing capabilities, and setting a longest time threshold for each edge device according to its performance;
comparing the local training time of said initial model for each edge device i in the subset of edge devices with the longest time threshold specified for edge device i; if the local training time is not greater than the longest time threshold specified for edge device i, retaining edge device i in the subset of edge devices; if the local training time is greater than the longest time threshold specified for edge device i, removing edge device i from the subset of edge devices; the edge devices in the subset of edge devices that meet the threshold requirement are updated into a new subset M_1;
The diversity analysis of the models remaining after the rejection comprises the following steps:
traversing the new subset M_1, obtaining the diversity index g of each edge device, and storing it in a diversity index array G; then arranging the diversity indexes in G from large to small and screening the array G from large to small according to the diversity constraint; if the diversity index g of edge device i in array G is within the diversity constraint, retaining edge device i in the new subset M_1; if the diversity index g of edge device i in array G is outside the diversity constraint, removing edge device i from the new subset M_1; finally, outputting the updated edge device subset M_2 and using this user scheduling setting in federal learning for the current iteration.
2. The adaptive federal edge learning method based on joint bi-dimensional user scheduling of claim 1, wherein obtaining the evaluation efficiency of model training comprises:
acquiring, based on the loss function, the loss of the current iteration and the loss variation relative to several iterations earlier;
and acquiring the evaluation efficiency based on the loss variation and the training period.
3. The adaptive federal edge learning method based on joint bi-dimensional user scheduling according to claim 2, wherein the loss variation is:
Δloss = f(x-n) - f(x)
wherein Δloss is the loss variation, f(x-n) is the loss value from n iterations earlier, and f(x) is the loss value of the current iteration.
4. The adaptive federal edge learning method based on joint bi-dimensional user scheduling according to claim 1, wherein the evaluation efficiency is:
e = Δloss / t
wherein e is the evaluation efficiency, Δloss is the loss variation, and t is the training period.
CN202310050202.6A 2023-02-01 2023-02-01 Self-adaptive federal edge learning method based on joint bi-dimensional user scheduling Active CN116050540B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310050202.6A CN116050540B (en) 2023-02-01 2023-02-01 Self-adaptive federal edge learning method based on joint bi-dimensional user scheduling

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310050202.6A CN116050540B (en) 2023-02-01 2023-02-01 Self-adaptive federal edge learning method based on joint bi-dimensional user scheduling

Publications (2)

Publication Number Publication Date
CN116050540A CN116050540A (en) 2023-05-02
CN116050540B true CN116050540B (en) 2023-09-22

Family

ID=86113053

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310050202.6A Active CN116050540B (en) 2023-02-01 2023-02-01 Self-adaptive federal edge learning method based on joint bi-dimensional user scheduling

Country Status (1)

Country Link
CN (1) CN116050540B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116610960B (en) * 2023-07-20 2023-10-13 北京万界数据科技有限责任公司 Monitoring management system for artificial intelligence training parameters
CN116701478B (en) * 2023-08-02 2023-11-24 蘑菇车联信息科技有限公司 Course angle determining method, course angle determining device, computer equipment and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111401552A (en) * 2020-03-11 2020-07-10 浙江大学 Federal learning method and system based on batch size adjustment and gradient compression rate adjustment
CN112181666A (en) * 2020-10-26 2021-01-05 华侨大学 Method, system, equipment and readable storage medium for equipment evaluation and federal learning importance aggregation based on edge intelligence
CN114327889A (en) * 2021-12-27 2022-04-12 吉林大学 Model training node selection method for layered federated edge learning
CN115221955A (en) * 2022-07-15 2022-10-21 东北大学 Multi-depth neural network parameter fusion system and method based on sample difference analysis

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114981795A (en) * 2020-01-14 2022-08-30 Oppo广东移动通信有限公司 Resource scheduling method and device and readable storage medium
CN114217933A (en) * 2021-12-27 2022-03-22 北京百度网讯科技有限公司 Multi-task scheduling method, device, equipment and storage medium

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111401552A (en) * 2020-03-11 2020-07-10 浙江大学 Federal learning method and system based on batch size adjustment and gradient compression rate adjustment
CN112181666A (en) * 2020-10-26 2021-01-05 华侨大学 Method, system, equipment and readable storage medium for equipment evaluation and federal learning importance aggregation based on edge intelligence
CN114327889A (en) * 2021-12-27 2022-04-12 吉林大学 Model training node selection method for layered federated edge learning
CN115221955A (en) * 2022-07-15 2022-10-21 东北大学 Multi-depth neural network parameter fusion system and method based on sample difference analysis

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Tianyi Zhou et al., "Multi-server federated edge learning for low power consumption wireless resource allocation based on user QoE," Journal of Communications and Networks, vol. 23, no. 6, pp. 463-472 *
Zhou Tianyi et al., "Low-power bandwidth allocation and user scheduling for federated edge learning," Journal of Beijing Information Science & Technology University, vol. 37, no. 1, pp. 27-33 *

Also Published As

Publication number Publication date
CN116050540A (en) 2023-05-02

Similar Documents

Publication Publication Date Title
CN116050540B (en) Self-adaptive federal edge learning method based on joint bi-dimensional user scheduling
US20210133534A1 (en) Cloud task scheduling method based on phagocytosis-based hybrid particle swarm optimization and genetic algorithm
Liu et al. A reinforcement learning-based resource allocation scheme for cloud robotics
CN113191503B (en) Decentralized distributed learning method and system for non-shared data
CN108989098B (en) Time delay optimization-oriented scientific workflow data layout method in hybrid cloud environment
CN106250381A (en) The row sequence optimized for input/output in list data
CN115934333A (en) Historical data perception-based cloud computing resource scheduling method and system
CN115374853A (en) Asynchronous federal learning method and system based on T-Step polymerization algorithm
CN112329820A (en) Method and device for sampling unbalanced data under federal learning
CN116362329A (en) Cluster federation learning method and device integrating parameter optimization
CN114650228A (en) Federal learning scheduling method based on computation unloading in heterogeneous network
CN113887748B (en) Online federal learning task allocation method and device, and federal learning method and system
CN112149990A (en) Fuzzy supply and demand matching method based on prediction
CN112990420A (en) Pruning method for convolutional neural network model
CN110275868A (en) A kind of multi-modal pretreated method of manufaturing data in intelligent plant
CN112232401A (en) Data classification method based on differential privacy and random gradient descent
CN114401192B (en) Multi-SDN controller cooperative training method
CN105187488A (en) Method for realizing MAS (Multi Agent System) load balancing based on genetic algorithm
CN104507150A (en) Method for clustering virtual resources in baseband pooling
CN111290853B (en) Cloud data center scheduling method based on self-adaptive improved genetic algorithm
CN116257361B (en) Unmanned aerial vehicle-assisted fault-prone mobile edge computing resource scheduling optimization method
CN112286689A (en) Cooperative shunting and storing method suitable for block chain workload certification
CN105765569B (en) A kind of data distributing method, loading machine and storage system
CN112306641B (en) Training method for virtual machine migration model
CN117221122B (en) Asynchronous layered joint learning training method based on bandwidth pre-allocation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant