CN114519417A - Model training method, device, equipment and medium for edge equipment - Google Patents

Model training method, device, equipment and medium for edge equipment

Info

Publication number
CN114519417A
Authority
CN
China
Prior art keywords
model
firework
client
explosion
sparks
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210067014.XA
Other languages
Chinese (zh)
Inventor
杜翠凤
蒋仕宝
滕少华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Jiesai Communication Planning And Design Institute Co ltd
Guangdong University of Technology
GCI Science and Technology Co Ltd
Original Assignee
Guangzhou Jiesai Communication Planning And Design Institute Co ltd
Guangdong University of Technology
GCI Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Jiesai Communication Planning And Design Institute Co ltd, Guangdong University of Technology, GCI Science and Technology Co Ltd filed Critical Guangzhou Jiesai Communication Planning And Design Institute Co ltd
Priority to CN202210067014.XA priority Critical patent/CN114519417A/en
Publication of CN114519417A publication Critical patent/CN114519417A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/086Learning methods using evolutionary algorithms, e.g. genetic algorithms or genetic programming

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Physiology (AREA)
  • Computer And Data Communications (AREA)

Abstract

The invention provides a model training method, device, equipment and medium for edge equipment. The method comprises the following steps: based on an improved fireworks algorithm, selecting an optimal group of clients and sending model parameters to each client in the group, so that each client trains its corresponding local model with the model parameters it receives; obtaining the trained local models and their corresponding model parameters from the clients, and aggregating the local models and parameters to obtain a trained global model; and returning to the client training step when a first preset iteration termination condition is not met. With embodiments of the invention, the combination of clients participating in each round of training can be preferentially selected by the improved fireworks algorithm, and the global model of the cloud platform server is split into a plurality of local models by the splitting method, thereby realizing efficient training of a large model in a resource-limited edge network.

Description

Model training method, device, equipment and medium for edge equipment
Technical Field
The invention relates to the technical field of deep learning, and in particular to a model training method, device, equipment and medium for edge equipment.
Background
With the advent of the Internet of Things, large amounts of data are generated at the edge of the network. Training Deep Neural Networks (DNNs) on such large volumes of data can greatly improve their usability. However, the traditional centralized DNN training approach must collect large amounts of raw data from network nodes, upload the collected data to the cloud, and train the whole model there. This places severe demands on the computing power of the cloud and limits the scale of the model; moreover, because all the data are gathered on the cloud side, it also places high requirements on the cloud's data encryption capability.
Disclosure of Invention
The invention provides a model training method, a model training device, equipment and a medium for edge equipment, which can improve the model training precision of the edge equipment.
In order to achieve the above object, in a first aspect, an embodiment of the present invention provides a method for training a model of an edge device, where the method is applied to a server, and the method includes:
obtaining model parameters of a model to be trained;
based on an improved firework algorithm, selecting a group of optimal clients, and correspondingly sending the model parameters to each client in the group according to a preset splitting mode, so that each client trains a local model corresponding to the client according to the corresponding model parameters;
acquiring a trained local model and model parameters corresponding to the trained local model sent by a client, and aggregating the local model and the corresponding parameters to obtain a trained global model;
and returning to the improved firework algorithm when the first preset iteration termination condition is not met, selecting a group of optimal clients, and correspondingly sending the model parameters to each client in the group according to a preset splitting mode, so that each client trains the local model corresponding to the client according to the corresponding model parameters.
As an optional embodiment, the selecting a group of optimal clients based on the improved firework algorithm, and correspondingly sending the model parameters to each client in the group according to a preset splitting manner, so that each client trains a local model corresponding to the client according to the corresponding model parameters, includes:
generating a group of fireworks with a specified population size based on a preset chaotic mapping method, and calculating the fitness value of each firework;
calculating the number of explosion sparks of each firework according to the fitness value of each firework, and limiting the number of the explosion sparks;
calculating the explosion radius of each firework according to the fitness value of each firework, and detecting the minimum value among the explosion radii;
generating explosion sparks and Gaussian mutation sparks according to the number of the explosion sparks and the explosion radius;
obtaining a new candidate population according to the fireworks, the explosion sparks and the Gaussian mutation sparks, and obtaining a current optimal value according to the candidate population;
judging whether a second preset iteration termination condition is met, if so, stopping iteration, and correspondingly sending the split model parameters to each client in the group according to a preset splitting mode;
if not, obtaining a new firework according to a preset selection rule, and returning to the step of calculating the number of explosion sparks and the explosion radius of each firework according to the fitness value of each firework.
As an alternative embodiment, the formula for calculating the number of explosion sparks of each firework according to the fitness value of each firework is as follows:
$$S_i = \hat{m}\,\frac{y_{\max} - F(w_i) + \varepsilon}{\sum_{j=1}^{N}\bigl(y_{\max} - F(w_j)\bigr) + \varepsilon}$$

where S_i is the number of explosion sparks of firework i, F(w_i) is the fitness value of firework i, y_min = min(F(w_i)), y_max = max(F(w_i)), m̂ is a constant that controls the total number of explosion sparks, and ε is a small constant that ensures the denominator is not 0.
As an alternative embodiment, the calculation formula for calculating the explosion radius of each firework according to the fitness value of each firework is as follows:
$$R_i = R\,e^{-\mathrm{Gen}/\mathrm{MaxGen}}\cdot\frac{F(w_i) - y_{\min} + \varepsilon}{\sum_{j=1}^{N}\bigl(F(w_j) - y_{\min}\bigr) + \varepsilon}$$

where R_i is the explosion radius of firework i, R is the initial maximum explosion radius, Gen is the current iteration number, and MaxGen is the maximum iteration number.
As an optional embodiment, after the obtaining the trained local model and the model parameters corresponding to the trained local model sent by the client, and aggregating the local model and the parameters corresponding to the local model to obtain the trained global model, the method further includes:
and calculating the global loss value of the training according to the global model, and taking the global loss value as the adaptive value of the improved firework algorithm.
The embodiment of the invention provides a model training device of edge equipment, which comprises:
the model parameter acquisition module is used for acquiring model parameters of the model to be trained;
the model parameter sending module is used for selecting a group of optimal clients based on an improved firework algorithm and correspondingly sending the model parameters to each client in the group according to a preset splitting mode so that each client trains a local model corresponding to the client according to the corresponding model parameters;
the global model aggregation module is used for acquiring a trained local model and model parameters corresponding to the trained local model sent by the client, and aggregating the local model and the corresponding parameters to obtain a trained global model;
and the model training iteration module is used for returning to the improved firework algorithm when the first preset iteration termination condition is judged not to be met, selecting a group of optimal clients, and correspondingly sending the model parameters to each client in the group according to a preset splitting mode so as to enable each client to train the local model corresponding to the client according to the corresponding model parameters.
As an optional embodiment, the model parameter sending module selects a group of optimal clients based on an improved firework algorithm, and correspondingly sends the model parameters to each client in the group according to a preset splitting manner, so that each client trains a local model corresponding to the client according to the corresponding model parameters, including:
generating a group of fireworks with a specified population size based on a preset chaotic mapping method, and calculating the fitness value of each firework;
calculating the number of explosion sparks of each firework according to the fitness value of each firework, and limiting the number of the explosion sparks;
calculating the explosion radius of each firework according to the fitness value of each firework, and detecting the minimum value among the explosion radii;
generating explosion sparks and Gaussian mutation sparks according to the number of the explosion sparks and the explosion radius;
obtaining a new candidate population according to the fireworks, the explosion sparks and the Gaussian mutation sparks, and obtaining a current optimal value according to the candidate population;
judging whether a second preset iteration termination condition is met, if so, stopping iteration, and correspondingly sending the split model parameters to each client in the group according to a preset splitting mode;
if not, obtaining a new firework according to a preset selection rule, and returning to the step of calculating the number of explosion sparks and the explosion radius of each firework according to the fitness value of each firework.
An embodiment of the present invention provides a terminal device, which includes a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor, where the processor implements the model training method for an edge device according to any one of the above embodiments when executing the computer program.
The embodiment of the present invention provides a computer-readable storage medium, where the computer-readable storage medium includes a stored computer program, and when the computer program runs, a device where the computer-readable storage medium is located is controlled to execute the model training method according to any one of the above embodiments.
Compared with the prior art, the model training method, device, equipment and medium for edge equipment provided by the embodiments of the invention preferentially select, at each round of training, the combination of clients participating in training based on an improved fireworks algorithm, thereby guaranteeing the training precision of the final model. In addition, the global model of the cloud platform server is split into a plurality of local models by a splitting method, the local models are distributed to different clients for training, and the training results are sent to the server for aggregation. This realizes efficient training of a large model in a resource-limited edge network and further improves the model training precision of the edge equipment.
Drawings
Fig. 1 is a schematic flowchart of a model training method for an edge device according to an embodiment of the present invention;
FIG. 2 is a schematic structural diagram of a model training apparatus for edge devices according to an embodiment of the present invention;
fig. 3 is a schematic structural diagram of a terminal device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In a first aspect, an embodiment of the present invention provides a method for training a model of an edge device, where the method is applied to a server, and fig. 1 is a schematic flow chart of the method for training a model of an edge device according to the embodiment of the present invention, where the method includes steps S11 to S14:
and S11, obtaining model parameters of the model to be trained.
Specifically, the server defines the structure of the model to be trained by all clients, determines the hyper-parameters of the local and global models, such as the optimizer and the learning rate of the neural network, and broadcasts the model and parameters to the clients. In the first round the broadcast model is untrained; in subsequent rounds the server broadcasts the global model obtained by aggregating the local models trained by the clients.
S12, selecting a group of optimal clients based on an improved firework algorithm, and correspondingly sending the model parameters to each client in the group according to a preset splitting mode, so that each client trains a local model corresponding to the client according to the corresponding model parameters.
S13, obtaining the trained local model and the corresponding model parameters sent by the client, and aggregating the local model and the corresponding parameters to obtain the trained global model.
It can be understood that the client trains the local models locally, and each client shares its own training model with the server after training.
It should be noted that training may be asynchronous: once a client finishes training, it can report its model to the server individually, without waiting for all local models to finish training.
And S14, when the first preset iteration termination condition is judged not to be met, returning to the improved firework algorithm, selecting a group of optimal clients, and correspondingly sending the model parameters to each client in the group according to a preset splitting mode, so that each client trains the local model corresponding to the client according to the corresponding model parameters.
It can be understood that the server aggregates the local models trained by the clients to construct a global model and calculates the global loss, which serves as the fitness function of the fireworks algorithm for judging whether the termination condition is met. If the condition is met, the next round of training proceeds; if not, the flow returns to step S12 to iterate again and select a better client combination.
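For concreteness, the server-side loop of steps S11 to S14 can be sketched in Python as follows. This is a minimal sketch only, not the claimed implementation: the helpers select_clients_fwa, split_parameters, aggregate and global_loss, and the train_local method, are hypothetical stand-ins for the operations described above.

```python
def train_global_model(global_params, all_clients, max_rounds, loss_threshold,
                       select_clients_fwa, split_parameters, aggregate, global_loss):
    """Sketch of the server-side loop of steps S11-S14 (illustrative only)."""
    for _ in range(max_rounds):
        # S12: select an optimal client combination with the improved fireworks
        # algorithm (the global loss serves as its fitness function), then
        # split the global model parameters among the selected clients.
        clients = select_clients_fwa(all_clients, fitness=global_loss)
        shards = split_parameters(global_params, n_parts=len(clients))

        # Each client trains its local model on its assigned parameter shard;
        # in practice the uploads may arrive asynchronously.
        local_models = [c.train_local(shard) for c, shard in zip(clients, shards)]

        # S13: aggregate the trained local models into a new global model.
        global_params = aggregate(local_models)

        # S14: stop once the first preset iteration termination condition
        # (here, a global-loss threshold) is met; otherwise reselect clients.
        if global_loss(global_params) < loss_threshold:
            break
    return global_params
```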
It can be further understood that, in the prior art, with the development of distributed strategies, the whole model is split, the local models are handed to clients for separate training, the local model parameters are then periodically gathered at a centralized node, and the global model is updated through the federated averaging algorithm. Local training and parameter aggregation are carried out over a number of federated epochs to obtain a global DNN model with higher precision. However, this distributed training mode leaves open how to select the clients and how to allocate resources during training so as to guarantee the accuracy of the global model.
To this end, the invention proposes a new strategy for cooperative training on edge devices based on an improved fireworks algorithm. Compared with the prior art, the model training method for edge equipment provided by the embodiment of the invention preferentially selects, at each round of training, the combination of clients participating in training based on the improved fireworks algorithm, guaranteeing the training precision of the final model. In addition, the global model of the cloud platform server is split into a plurality of local models by the splitting method, the local models are distributed to different clients for training, and the training results are sent to the server for aggregation, realizing efficient training of a large model in a resource-limited edge network and further improving the model training precision of the edge equipment.
As an alternative embodiment, the step S12 includes:
s121, generating a group of fireworks with a specified population size based on a preset chaotic mapping method, and calculating the fitness value of each firework;
s122, calculating the number of explosion sparks of each firework according to the fitness value of each firework, and limiting the number of the explosion sparks;
s123, calculating the explosion radius of each firework according to the fitness value of each firework, and detecting the minimum value among the explosion radii;
s124, generating explosion sparks and Gaussian mutation sparks according to the number of the explosion sparks and the explosion radius;
s125, obtaining a new candidate population according to the fireworks, the explosion sparks and the Gaussian mutation sparks, and obtaining a current optimal value according to the candidate population;
s126, judging whether a second preset iteration termination condition is met, if so, stopping iteration, and correspondingly sending the split model parameters to each client in the group according to a preset splitting mode;
and S126, if not, obtaining new fireworks according to a preset selection rule, and returning to the step of calculating the number of explosion sparks and the explosion radius of each firework according to the fitness value of each firework.
Specifically, an elitism-plus-random selection strategy is used, which avoids a large amount of computational overhead. Here, the best firework found so far is kept as a new individual for the next iteration, and the other fireworks are selected at random from the population.
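A minimal sketch of this selection rule, assuming the fitness is a loss to be minimized (the function name and signature are illustrative, not from the patent):

```python
import numpy as np

def select_next_generation(candidates, fitness, pop_size, rng=None):
    """Elitism-plus-random selection: keep the best individual, fill the
    rest of the next population by uniform random choice (a sketch)."""
    rng = np.random.default_rng() if rng is None else rng
    best = int(np.argmin(fitness))  # minimization: lowest global loss wins
    rest = [i for i in range(len(candidates)) if i != best]
    picked = rng.choice(rest, size=pop_size - 1, replace=False)
    return [candidates[best]] + [candidates[int(i)] for i in picked]
```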
It is worth noting that the edge devices perform collaborative training and upload the final result to the cloud platform, and the cloud platform shares the model updates with the server so as to aggregate and construct the global model, thereby providing an optimal training result for model training based on edge-device cooperation. In addition, the improved fireworks algorithm adds chaotic mapping at initialization, which makes the initial population distribution more even and diverse, and adds limits on the minimum explosion radius and on the number of explosion sparks in the explosion stage, which improves the search capability of the algorithm and prevents it from getting trapped in local optima.
As an optional embodiment, for example, the step S121 specifically includes:
A chaotic mapping equation is used to generate a group of chaotic variable sequences C_x = {cx_1, cx_2, …, cx_N} of the same length as the population size N, where cx_i = {cx_i1, cx_i2, …, cx_id, …, cx_iD}, N is the size of the population and D is the dimension of a firework. Each chaotic variable cx_id is then mapped from (0,1) onto (L_d, U_d) according to the following formula, i.e. the traversal range of the chaotic motion is extended from (0,1) to (L_d, U_d):

$$x_{id} = L_d + cx_{id}\,(U_d - L_d)$$

This generates an initial population of fireworks x = {x_1, x_2, …, x_N} of size N, where x_i = {x_i1, x_i2, …, x_id, …, x_iD} and (L_d, U_d) is the variable optimization interval of the optimization problem. In the present invention, all clients are encoded; for example, assuming there are 10 clients available for allocation, the ten clients are encoded as {'0', '1', '2', …, '9'}, and in this case the variable interval of the optimization problem is the encoding interval of the clients.
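A minimal sketch of this initialization step, assuming the logistic map as the chaotic generator (the patent only specifies "a preset chaotic mapping method", so the particular map, the seed c0 and the population size are illustrative assumptions):

```python
import numpy as np

def chaotic_init(N, D, lower, upper, mu=4.0, c0=0.31):
    """Generate N fireworks of dimension D from a logistic-map chaotic
    sequence, then map each variable from (0, 1) onto (L_d, U_d)."""
    cx = np.empty((N, D))
    c = c0                              # seed chosen away from fixed points
    for i in range(N):
        for d in range(D):
            c = mu * c * (1.0 - c)      # logistic map keeps c in (0, 1)
            cx[i, d] = c
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    return lower + cx * (upper - lower) # x_id = L_d + cx_id * (U_d - L_d)

# Example: a population of 20 fireworks over 10 clients coded '0'..'9'
fireworks = chaotic_init(N=20, D=10, lower=0.0, upper=9.0)
```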
As an optional embodiment, for example, the step S122 specifically includes:
The model of the server is split and delivered to the clients to train the local models. After training, each local model and its parameters are transmitted to the server for aggregation. After aggregation, the loss of this round of training is calculated and used as the fitness value of the fireworks: the global loss of the ith firework in an iteration of the fireworks algorithm, i.e. its fitness, can be defined as F(w_i), where w_i represents the parameters of the global model.
It should be noted that the global model contains a number of parameters to be trained, for example 100 parameters. If every local model had to train all 100 parameters, the accuracy of the 100 trained parameters would not be high, because each local model holds too little data. The 100 parameters are therefore divided into 10 parts: 1-10, 11-20, …, 91-100. The first part (parameters 1-10) is sent to the clients coded 1-100 for local model training, and those local models mainly optimize parameters 1-10; the second part is sent to the clients coded 101-200 for local model training; and so on, so that each part of the parameters is sent to different clients for training. The model splitting is performed on the global server, and each client then optimizes its corresponding parameters and uploads them to the global model.
Further, in the fireworks algorithm, the fireworks and the sparks generated by explosions together constitute the whole fireworks population. Each firework explodes and produces sparks within a certain amplitude. To balance the exploitation and exploration capabilities of the algorithm, each firework produces a different number of sparks. Generally, a firework with poor fitness has a large explosion radius and generates fewer sparks, while a firework with better fitness has a small explosion radius and generates more sparks. The number of explosion sparks of the ith firework, i.e. the number of explosion sparks of each firework according to its fitness value, is calculated by the following formula (1):
$$S_i = \hat{m}\,\frac{y_{\max} - F(w_i) + \varepsilon}{\sum_{j=1}^{N}\bigl(y_{\max} - F(w_j)\bigr) + \varepsilon} \tag{1}$$

where S_i is the number of explosion sparks of firework i, F(w_i) is the fitness value of firework i, y_min = min(F(w_i)), y_max = max(F(w_i)), m̂ is a constant used to control the number of explosion sparks, and ε is a very small constant used to ensure that the denominator is not 0.
Further, in order to prevent fireworks with good fitness from generating too many sparks and fireworks with poor fitness from generating too few sparks, the number of sparks is limited by formula (2):

$$S_i = \begin{cases} \operatorname{round}(a\,\hat{m}) & \text{if } S_i < a\,\hat{m} \\ \operatorname{round}(b\,\hat{m}) & \text{if } S_i > b\,\hat{m} \\ \operatorname{round}(S_i) & \text{otherwise} \end{cases} \tag{2}$$

where a and b, with 0 < a < b < 1, are two constant parameters that limit the population size.
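A minimal sketch of equations (1) and (2) as reconstructed above, treating the fitness F(w_i) as a loss to be minimized; the values of m_hat, a and b are illustrative:

```python
import numpy as np

def spark_counts(fitness, m_hat=50, a=0.1, b=0.9, eps=1e-12):
    """Eq. (1): better (lower) fitness earns more explosion sparks.
    Eq. (2): counts are then clipped into [a*m_hat, b*m_hat]."""
    fitness = np.asarray(fitness, dtype=float)
    y_max = fitness.max()
    s = m_hat * (y_max - fitness + eps) / (np.sum(y_max - fitness) + eps)
    s = np.clip(s, a * m_hat, b * m_hat)   # limit the number of sparks
    return np.round(s).astype(int)

print(spark_counts([0.8, 0.5, 0.2]))  # the fittest firework gets the most sparks
```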
As an optional embodiment, for example, the step S123 specifically includes:
In the traditional fireworks algorithm the maximum explosion radius is set as a constant, which weakens the fine search capability of the algorithm in the later stage and lowers its solution precision. If the maximum explosion radius instead follows a non-linearly decreasing trend overall, the algorithm favors global search in the early stage and local search in the later stage, achieving an adaptive acceleration effect. In the invention, the maximum explosion radius is set as a simulated-annealing factor. Let R be the initial maximum explosion radius, Gen the current iteration number, and MaxGen the maximum iteration number; the maximum explosion radius of the ith firework is then defined as follows:
$$R_{\max}^{\mathrm{Gen}} = R\,e^{-\mathrm{Gen}/\mathrm{MaxGen}} \tag{3}$$
The explosion radius of a firework, with the maximum explosion radius acting as the simulated-annealing factor, can then be defined as:
$$R_i = R_{\max}^{\mathrm{Gen}}\cdot\frac{F(w_i) - y_{\min} + \varepsilon}{\sum_{j=1}^{N}\bigl(F(w_j) - y_{\min}\bigr) + \varepsilon} \tag{4}$$

where R_i is the explosion radius of firework i, R is the initial maximum explosion radius, Gen is the current iteration number, and MaxGen is the maximum iteration number.
Further, a firework with near-optimal fitness would otherwise be assigned an explosion radius close to zero, so that its explosion sparks would be located at (almost) the same position as the firework itself. To avoid this, a minimum-radius check strategy is performed. For each dimension k, the explosion radius is limited as follows:
$$R_{ik} = \begin{cases} A_{\min,k} & \text{if } R_{ik} < A_{\min,k} \\ R_{ik} & \text{otherwise} \end{cases} \tag{5}$$

Here, during each iteration, A_min,k can be obtained by the non-linear decreasing function of equation (6):

$$A_{\min,k}(t) = A_{\mathrm{init}} - \frac{A_{\mathrm{init}} - A_{\mathrm{final}}}{\mathrm{eval}_{\max}}\sqrt{(2\,\mathrm{eval}_{\max} - t)\,t} \tag{6}$$
where A_init and A_final are respectively the initial and final minimum amplitudes, eval_max is the maximum number of function evaluations, and t represents the current number of evaluations.
It is worth noting that by adopting the fireworks algorithm with the adaptive explosion radius, the algorithm favors global search in the early stage and local search in the later stage, which improves its search capability and accelerates the algorithm.
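A minimal sketch of equations (3) to (6) as reconstructed above. The exponential decay in equation (3) is an assumed form (the original formula image is not recoverable); the rest follows the definitions in the text:

```python
import numpy as np

def explosion_radii(fitness, R, gen, max_gen, a_init, a_final, t, eval_max,
                    eps=1e-12):
    """Adaptive explosion radii with a decaying maximum (eqs. 3-4) and a
    nonlinearly decreasing minimum-amplitude check (eqs. 5-6)."""
    fitness = np.asarray(fitness, dtype=float)
    y_min = fitness.min()
    r_max = R * np.exp(-gen / max_gen)          # eq. (3): assumed decay form
    # Eq. (4): fireworks with better (lower) fitness explode in smaller radii.
    radii = r_max * (fitness - y_min + eps) / (np.sum(fitness - y_min) + eps)
    # Eq. (6): the minimum amplitude decreases nonlinearly with evaluations t.
    a_min = a_init - (a_init - a_final) / eval_max * np.sqrt((2.0 * eval_max - t) * t)
    return np.maximum(radii, a_min)             # eq. (5): minimum-radius check
```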
As an optional embodiment, for example, the step S124 specifically includes:
to maintain the diversity of the population, m (m) ≦ N fireworks in the population were randomly selected and mutated in the k dimension. To avoid wasting search resources on test functions whose optimal value is not at the origin, a new gaussian mutation is used, defined as follows:
$$x_{ik} = x_{ik} + g\,(x_{Bk} - x_{ik}) \tag{7}$$

where x_Bk is the position of the best firework/explosion spark found so far, and g ~ N(0, 1).
When the position of the new spark exceeds the search range of dimension k, the new spark will be mapped to another position of the current fireworks algorithm search range as follows:
$$x_{ik} = x_{LB,k} + \mathrm{rand}\cdot(x_{UB,k} - x_{LB,k}) \tag{8}$$

where x_LB,k and x_UB,k are respectively the lower and upper bounds of the search space in dimension k.
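A minimal sketch of the Gaussian mutation of equation (7) together with the out-of-range mapping rule of equation (8); the function name and the vectorized form are illustrative:

```python
import numpy as np

def gaussian_mutation_spark(x, x_best, lb, ub, rng=None):
    """Mutate firework x toward the best individual x_best (eq. 7) and remap
    out-of-range dimensions uniformly into [lb, ub] (eq. 8)."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    x_best = np.asarray(x_best, dtype=float)
    lb = np.broadcast_to(np.asarray(lb, dtype=float), x.shape)
    ub = np.broadcast_to(np.asarray(ub, dtype=float), x.shape)
    g = rng.standard_normal(x.shape)        # g ~ N(0, 1)
    spark = x + g * (x_best - x)            # eq. (7): pull toward the best
    out = (spark < lb) | (spark > ub)       # dimensions outside the range
    spark[out] = lb[out] + rng.random(int(out.sum())) * (ub[out] - lb[out])  # eq. (8)
    return spark
```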
As an alternative embodiment, after the step S13, the method further includes:
and calculating the global loss value of the training according to the global model, and taking the global loss value as the adaptive value of the improved firework algorithm.
Referring to fig. 2, a schematic structural diagram of a model training apparatus for edge equipment according to an embodiment of the present invention, the embodiment of the present invention provides a model training apparatus for edge equipment, comprising:
a model parameter obtaining module 21, configured to obtain model parameters of a model to be trained;
the model parameter sending module 22 is configured to select a group of optimal clients based on an improved firework algorithm, and correspondingly send the model parameters to each client in the group according to a preset splitting manner, so that each client trains a local model corresponding to the client according to the corresponding model parameters;
the global model aggregation module 23 is configured to obtain a trained local model sent by a client and model parameters corresponding to the trained local model, and aggregate the local model and the corresponding parameters to obtain a trained global model;
and the model training iteration module 24 is configured to, when it is determined that the first preset iteration termination condition is not met, return to the improved firework algorithm, select a group of optimal clients, and correspondingly send the model parameters to each client in the group according to a preset splitting manner, so that each client trains a local model corresponding to the client according to the corresponding model parameters.
Compared with the prior art, the model training apparatus for edge equipment provided by the embodiment of the invention preferentially selects, at each round of training, the combination of clients participating in training based on an improved fireworks algorithm, thereby guaranteeing the training precision of the final model. In addition, the global model of the cloud platform server is split into a plurality of local models by a splitting method, the local models are distributed to different clients for training, and the training results are sent to the server for aggregation. This realizes efficient training of a large model in a resource-limited edge network and further improves the model training precision of the edge equipment.
As an alternative embodiment, the model parameter sending module 22 is specifically configured to perform the following:
generating a group of fireworks with a specified population size based on a preset chaotic mapping method, and calculating the fitness value of each firework;
calculating the number of explosion sparks of each firework according to the fitness value of each firework, and limiting the number of the explosion sparks;
calculating the explosion radius of each firework according to the fitness value of each firework, and detecting the minimum value among the explosion radii;
generating explosion sparks and Gaussian mutation sparks according to the number of the explosion sparks and the explosion radius;
obtaining a new candidate population according to the fireworks, the explosion sparks and the Gaussian mutation sparks, and obtaining a current optimal value according to the candidate population;
judging whether a second preset iteration termination condition is met, if so, stopping iteration, and correspondingly sending the split model parameters to each client in the group according to a preset splitting mode;
if not, obtaining a new firework according to a preset selection rule, and returning to the step of calculating the number of explosion sparks and the explosion radius of each firework according to the fitness value of each firework.
As an optional embodiment, for example, the generating a group of fireworks with a specified population size based on a preset chaotic mapping method, and calculating a fitness value of each of the fireworks specifically includes:
A chaotic mapping equation is used to generate a group of chaotic variable sequences C_x = {cx_1, cx_2, …, cx_N} of the same length as the population size N, where cx_i = {cx_i1, cx_i2, …, cx_id, …, cx_iD}, N is the size of the population and D is the dimension of a firework. Each chaotic variable cx_id is then mapped from (0,1) onto (L_d, U_d) according to the following equation (9), i.e. the traversal range of the chaotic motion is extended from (0,1) to (L_d, U_d):

$$x_{id} = L_d + cx_{id}\,(U_d - L_d) \tag{9}$$

This generates an initial population of fireworks x = {x_1, x_2, …, x_N} of size N, where x_i = {x_i1, x_i2, …, x_id, …, x_iD} and (L_d, U_d) is the variable optimization interval of the optimization problem. In the present invention, all clients are encoded; for example, assuming there are 10 clients available for allocation, the ten clients are encoded as {'0', '1', '2', …, '9'}, and in this case the variable interval of the optimization problem is the encoding interval of the clients.
As an alternative embodiment, for example, the calculating the number of explosion sparks of each firework according to the fitness value of each firework, and limiting the number of explosion sparks specifically includes:
The model of the server is split and delivered to the clients to train the local models. After training, each local model and its parameters are transmitted to the server for aggregation. After aggregation, the loss of this round of training is calculated and used as the fitness value of the fireworks: the global loss of the ith firework in an iteration of the fireworks algorithm, i.e. its fitness, can be defined as F(w_i), where w_i represents the parameters of the global model.
The global model contains a number of parameters to be trained, for example 100 parameters. If every local model had to train all 100 parameters, the accuracy of the 100 trained parameters would not be high, because each local model holds too little data. The 100 parameters are therefore divided into 10 parts: 1-10, 11-20, …, 91-100. The first part (parameters 1-10) is sent to the clients coded 1-100 for local model training, and those local models mainly optimize parameters 1-10; the second part is sent to the clients coded 101-200 for local model training; and so on, so that each part of the parameters is trained on different clients. The model splitting is performed on the global server, and each client then optimizes its corresponding parameters and uploads them to the global model.
Further, in the fireworks algorithm, the fireworks and the sparks generated by explosions together constitute the whole fireworks population. Each firework explodes and produces sparks within a certain amplitude. To balance the exploitation and exploration capabilities of the algorithm, each firework produces a different number of sparks. Generally, a firework with poor fitness has a large explosion radius and generates fewer sparks, while a firework with better fitness has a small explosion radius and generates more sparks. The number of explosion sparks of the ith firework, calculated according to the fitness value of each firework, is given by the following formula (10):
$$S_i = \hat{m}\,\frac{y_{\max} - F(w_i) + \varepsilon}{\sum_{j=1}^{N}\bigl(y_{\max} - F(w_j)\bigr) + \varepsilon} \tag{10}$$

where S_i is the number of explosion sparks of firework i, F(w_i) is the fitness value of firework i, y_min = min(F(w_i)), y_max = max(F(w_i)), m̂ is a constant used to control the number of explosion sparks, and ε is a very small constant used to ensure that the denominator is not 0.
Further, in order to prevent fireworks with good fitness from generating too many sparks and fireworks with poor fitness from generating too few sparks, the number of sparks is limited by formula (11):

$$S_i = \begin{cases} \operatorname{round}(a\,\hat{m}) & \text{if } S_i < a\,\hat{m} \\ \operatorname{round}(b\,\hat{m}) & \text{if } S_i > b\,\hat{m} \\ \operatorname{round}(S_i) & \text{otherwise} \end{cases} \tag{11}$$

where a and b, with 0 < a < b < 1, are two constant parameters that limit the population size.
As an optional embodiment, for example, the calculating the explosion radius of each firework according to the fitness value of each firework and detecting the minimum value of the explosion radii specifically includes:
In the traditional fireworks algorithm the maximum explosion radius is set as a constant, which weakens the fine search capability of the algorithm in the later stage and lowers its solution precision. If the maximum explosion radius instead follows a non-linearly decreasing trend overall, the algorithm favors global search in the early stage and local search in the later stage, achieving an adaptive acceleration effect. In the invention, the maximum explosion radius is set as a simulated-annealing factor. Let R be the initial maximum explosion radius, Gen the current iteration number, and MaxGen the maximum iteration number; the maximum explosion radius of the ith firework is then defined as follows:
$$R_{\max}^{\mathrm{Gen}} = R\,e^{-\mathrm{Gen}/\mathrm{MaxGen}} \tag{12}$$
The explosion radius of a firework, with the maximum explosion radius acting as the simulated-annealing factor, can then be defined as:
$$R_i = R_{\max}^{\mathrm{Gen}}\cdot\frac{F(w_i) - y_{\min} + \varepsilon}{\sum_{j=1}^{N}\bigl(F(w_j) - y_{\min}\bigr) + \varepsilon} \tag{13}$$

where R_i is the explosion radius of firework i, R is the initial maximum explosion radius, Gen is the current iteration number, and MaxGen is the maximum iteration number.
Further, a firework with near-optimal fitness would otherwise be assigned an explosion radius close to zero, so that its explosion sparks would be located at (almost) the same position as the firework itself. To avoid this, a minimum-radius check strategy is performed. For each dimension k, the explosion radius is limited as follows:
$$R_{ik} = \begin{cases} A_{\min,k} & \text{if } R_{ik} < A_{\min,k} \\ R_{ik} & \text{otherwise} \end{cases} \tag{14}$$

Here, during each iteration, A_min,k can be obtained by the following non-linear decreasing function:

$$A_{\min,k}(t) = A_{\mathrm{init}} - \frac{A_{\mathrm{init}} - A_{\mathrm{final}}}{\mathrm{eval}_{\max}}\sqrt{(2\,\mathrm{eval}_{\max} - t)\,t}$$
where A_init and A_final are respectively the initial and final minimum amplitudes, eval_max is the maximum number of function evaluations, and t represents the current number of evaluations.
It is worth noting that by adopting the fireworks algorithm with the adaptive explosion radius, the algorithm favors global search in the early stage and local search in the later stage, which improves its search capability and accelerates the algorithm.
As an alternative embodiment, for example, the generating of the explosion sparks and the Gaussian mutation sparks according to the number of the explosion sparks and the explosion radius specifically includes:
To maintain the diversity of the population, m (m ≤ N) fireworks in the population are randomly selected and mutated in dimension k. To avoid wasting search resources on test functions whose optimal value is not at the origin, a new Gaussian mutation is used, defined as follows:
$$x_{ik} = x_{ik} + g\,(x_{Bk} - x_{ik}) \tag{15}$$

where x_Bk is the position of the best firework/explosion spark found so far, and g ~ N(0, 1).
When the position of the new spark exceeds the search range of dimension k, the new spark will be mapped to another position of the current fireworks algorithm search range as follows:
$$x_{ik} = x_{LB,k} + \mathrm{rand}\cdot(x_{UB,k} - x_{LB,k}) \tag{16}$$

where x_LB,k and x_UB,k are respectively the lower and upper bounds of the search space in dimension k.
As an alternative embodiment, after the global model aggregation module 23, the apparatus is further configured to:
and calculating the global loss value of the training according to the global model, and taking the global loss value as the adaptive value of the improved firework algorithm.
In addition, it should be noted that specific implementation schemes and advantageous effects of the embodiments of the model training apparatus for edge devices provided in the embodiments of the present invention are the same as those of the embodiments of the model training method for edge devices provided in the embodiments of the present invention, and are not described herein again.
An embodiment of the present invention provides a terminal device, and referring to fig. 3, the terminal device is a schematic structural diagram provided in the embodiment of the present invention. The terminal device 3 of this embodiment includes: a processor 30, a memory 31 and a computer program stored in said memory 31 and executable on said processor 30. The processor 30, when executing the computer program, implements the method for model training of edge devices according to any of the embodiments of the first aspect. Alternatively, the processor 30 implements the functions of the modules in the above device embodiments when executing the computer program.
Illustratively, the computer program may be divided into one or more modules, which are stored in the memory 31 and executed by the processor 30 to accomplish the present invention. The one or more modules may be a series of computer program instruction segments capable of performing specific functions, which are used to describe the execution process of the computer program in the terminal device 3.
The terminal device 3 may be a desktop computer, a notebook, a palm computer, a cloud server, or other computing devices. The terminal device 3 may include, but is not limited to, a processor 30 and a memory 31. It will be appreciated by those skilled in the art that the schematic diagram is merely an example of a terminal device, and does not constitute a limitation of the terminal device, and may include more or less components than those shown, or combine some components, or different components, for example, the terminal device 3 may further include an input-output device, a network access device, a bus, etc.
The Processor 30 may be a Central Processing Unit (CPU), other general purpose Processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other Programmable logic device, discrete Gate or transistor logic device, discrete hardware component, or the like. The general purpose processor may be a microprocessor or the processor may be any conventional processor or the like, and the processor 30 is the control center of the terminal device 3 and connects the various parts of the whole terminal device 3 by various interfaces and lines.
The memory 31 may be used for storing the computer programs and/or modules, and the processor 30 implements various functions of the terminal device 3 by running or executing the computer programs and/or modules stored in the memory 31 and calling data stored in the memory 31. The memory 31 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required by at least one function (such as a sound playing function, an image playing function, etc.), and the like; the data storage area may store data (such as audio data, a phonebook, etc.) created according to the use of the device, and the like. In addition, the memory 31 may include a high-speed random access memory, and may also include a non-volatile memory, such as a hard disk, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), at least one magnetic disk storage device, a Flash memory device, or other non-volatile solid-state storage device.
Wherein, the module integrated by the terminal device 3 can be stored in a computer readable storage medium if it is implemented in the form of software functional unit and sold or used as a stand-alone product. Based on such understanding, all or part of the flow in the method according to the above embodiments may be implemented by a computer program, which may be stored in a computer readable storage medium and used by the processor 30 to implement the steps of the above embodiments of the method. Wherein the computer program comprises computer program code, which may be in the form of source code, object code, an executable file or some intermediate form, etc. The computer-readable medium may include: any entity or device capable of carrying the computer program code, recording medium, usb disk, removable hard disk, magnetic disk, optical disk, computer Memory, Read-Only Memory (ROM), Random Access Memory (RAM), electrical carrier wave signals, telecommunications signals, software distribution medium, and the like. It should be noted that the computer readable medium may contain content that is subject to appropriate increase or decrease as required by legislation and patent practice in jurisdictions, for example, in some jurisdictions, computer readable media does not include electrical carrier signals and telecommunications signals as is required by legislation and patent practice.
It should be noted that the above-described device embodiments are merely illustrative, where the units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. In addition, in the drawings of the embodiment of the apparatus provided by the present invention, the connection relationship between the modules indicates that there is a communication connection between them, and may be specifically implemented as one or more communication buses or signal lines. One of ordinary skill in the art can understand and implement it without inventive effort.
The embodiment of the present invention provides a computer-readable storage medium, where the computer-readable storage medium includes a stored computer program, where when the computer program runs, the apparatus where the computer-readable storage medium is located is controlled to execute the above-mentioned model training method for edge equipment.
Those skilled in the art will appreciate that the modules in the devices in the embodiments may be adaptively changed and arranged in one or more devices different from the embodiments. The modules or units in the embodiments may be combined into one module or unit, and furthermore, they may be divided into a plurality of sub-modules or sub-units. All of the features disclosed in this specification (including any accompanying claims, abstract and drawings), and all of the processes or elements of any method or apparatus so disclosed, may be combined in any combination, except combinations where at least some of such features and/or processes or elements are mutually exclusive. Each feature disclosed in this specification (including any accompanying claims, abstract and drawings) may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise.
It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements.
Furthermore, those skilled in the art will appreciate that while some embodiments described herein include some features included in other embodiments, rather than other features, combinations of features of different embodiments are meant to be within the scope of the invention and form different embodiments. For example, in the claims of the present invention, any of the claimed embodiments may be used in any combination.
While the foregoing is directed to the preferred embodiment of the present invention, it will be understood by those skilled in the art that various changes and modifications may be made without departing from the spirit and scope of the invention.

Claims (9)

1. A model training method of an edge device is applied to a server and comprises the following steps:
obtaining model parameters of a model to be trained;
based on an improved firework algorithm, selecting a group of optimal clients, and correspondingly sending the model parameters to each client in the group according to a preset splitting mode, so that each client trains a local model corresponding to the client according to the corresponding model parameters;
acquiring a trained local model and model parameters corresponding to the trained local model sent by a client, and aggregating the local model and the corresponding parameters to obtain a trained global model;
and returning to the improved firework algorithm when the first preset iteration termination condition is not met, selecting a group of optimal clients, and correspondingly sending the model parameters to each client in the group according to a preset splitting mode, so that each client trains the local model corresponding to the client according to the corresponding model parameters.
2. The edge device model training method according to claim 1, wherein the selecting an optimal set of clients based on the improved firework algorithm and sending the model parameters to each client in the set according to a preset splitting manner, so that each client trains a local model corresponding to the client according to the corresponding model parameters, comprises:
generating a group of fireworks with a specified population size based on a preset chaotic mapping method, and calculating the fitness value of each firework;
calculating the number of explosion sparks of each firework according to the fitness value of each firework, and limiting the number of the explosion sparks;
calculating the explosion radius of each firework according to the fitness value of each firework, and detecting the minimum value among the explosion radii;
generating explosion sparks and Gaussian mutation sparks according to the number of the explosion sparks and the explosion radius;
obtaining a new candidate population according to the fireworks, the explosion sparks and the Gaussian mutation sparks, and obtaining a current optimal value according to the candidate population;
judging whether a second preset iteration termination condition is met, if so, stopping iteration, and correspondingly sending the split model parameters to each client in the group according to a preset splitting mode;
if not, obtaining a new firework according to a preset selection rule, and returning to the step of calculating the number of explosion sparks and the explosion radius of each firework according to the fitness value of each firework.
3. The model training method of the edge device as claimed in claim 2, wherein the calculation formula for calculating the number of explosion sparks of each firework according to the fitness value of each firework is as follows:
$$S_i = \hat{m}\,\frac{y_{\max} - F(w_i) + \varepsilon}{\sum_{j=1}^{N}\bigl(y_{\max} - F(w_j)\bigr) + \varepsilon}$$

where S_i is the number of explosion sparks of firework i, F(w_i) is the fitness value of firework i, y_min = min(F(w_i)), y_max = max(F(w_i)), m̂ is a constant that controls the number of explosion sparks, and ε is a small constant that ensures the denominator is not 0.
4. The model training method of the edge device as claimed in claim 2, wherein the calculation formula for calculating the explosion radius of each firework according to the fitness value of each firework is as follows:
$$R_i = R\,e^{-\mathrm{Gen}/\mathrm{MaxGen}}\cdot\frac{F(w_i) - y_{\min} + \varepsilon}{\sum_{j=1}^{N}\bigl(F(w_j) - y_{\min}\bigr) + \varepsilon}$$

where R_i is the explosion radius of firework i, R is the initial maximum explosion radius, Gen is the current iteration number, and MaxGen is the maximum iteration number.
5. The method for training the model of the edge device according to claim 2, wherein after the obtaining the trained local model and the model parameters corresponding to the trained local model sent by the client and aggregating the local model and the parameters corresponding to the local model to obtain the trained global model, the method further comprises:
and calculating the global loss value of the training according to the global model, and taking the global loss value as the adaptive value of the improved firework algorithm.
6. A model training apparatus for an edge device, comprising:
the model parameter acquisition module is used for acquiring model parameters of the model to be trained;
the model parameter sending module is used for selecting a group of optimal clients based on an improved firework algorithm and correspondingly sending the model parameters to each client in the group according to a preset splitting mode so that each client trains a local model corresponding to the client according to the corresponding model parameters;
the global model aggregation module is used for acquiring a trained local model and model parameters corresponding to the trained local model sent by the client, and aggregating the local model and the corresponding parameters to obtain a trained global model;
and the model training iteration module is used for returning to the improved firework algorithm when the first preset iteration termination condition is judged not to be met, selecting a group of optimal clients, and correspondingly sending the model parameters to each client in the group according to a preset splitting mode so as to enable each client to train the local model corresponding to the client according to the corresponding model parameters.
7. The model training device for the edge device according to claim 6, wherein the model parameter sending module selects a group of optimal clients based on an improved firework algorithm, and correspondingly sends the model parameters to each client in the group according to a preset splitting manner, so that each client trains the local model corresponding to the client according to the corresponding model parameters, and the model training device comprises:
generating a group of fireworks with a specified population size based on a preset chaotic mapping method, and calculating the fitness value of each firework;
calculating the number of explosion sparks of each firework according to the fitness value of each firework, and limiting the number of the explosion sparks;
calculating the explosion radius of each firework according to the fitness value of each firework, and detecting the minimum value among the explosion radii;
generating explosion sparks and Gaussian mutation sparks according to the number of the explosion sparks and the explosion radius;
obtaining a new candidate population according to the fireworks, the explosion sparks and the Gaussian mutation sparks, and obtaining a current optimal value according to the candidate population;
judging whether a second preset iteration termination condition is met, if so, stopping iteration, and correspondingly sending the split model parameters to each client in the group according to a preset splitting mode;
if not, obtaining a new firework according to a preset selection rule, and returning to the step of calculating the number of explosion sparks and the explosion radius of each firework according to the fitness value of each firework.
8. A terminal device comprising a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor, the processor implementing the model training method of the edge device according to any one of claims 1 to 5 when executing the computer program.
9. A computer-readable storage medium, comprising a stored computer program, wherein the computer program, when executed, controls an apparatus in which the computer-readable storage medium is located to perform the model training method for an edge device according to any one of claims 1 to 5.
CN202210067014.XA 2022-01-20 2022-01-20 Model training method, device, equipment and medium for edge equipment Pending CN114519417A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210067014.XA CN114519417A (en) 2022-01-20 2022-01-20 Model training method, device, equipment and medium for edge equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210067014.XA CN114519417A (en) 2022-01-20 2022-01-20 Model training method, device, equipment and medium for edge equipment

Publications (1)

Publication Number Publication Date
CN114519417A true CN114519417A (en) 2022-05-20

Family

ID=81597467

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210067014.XA Pending CN114519417A (en) 2022-01-20 2022-01-20 Model training method, device, equipment and medium for edge equipment

Country Status (1)

Country Link
CN (1) CN114519417A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116701001A (en) * 2023-08-08 2023-09-05 国网浙江省电力有限公司信息通信分公司 Target task allocation method and device, electronic equipment and storage medium
CN116701001B (en) * 2023-08-08 2023-10-20 国网浙江省电力有限公司信息通信分公司 Target task allocation method and device, electronic equipment and storage medium

Similar Documents

Publication Publication Date Title
CN112181666B (en) Equipment assessment and federal learning importance aggregation method based on edge intelligence
US11663920B2 (en) Method and device of path optimization for UAV, and storage medium thereof
EP3380939B1 (en) Adaptive artificial neural network selection techniques
CN111176820B (en) Deep neural network-based edge computing task allocation method and device
CN111461226A (en) Countermeasure sample generation method, device, terminal and readable storage medium
EP4100887A1 (en) Method and system for splitting and bit-width assignment of deep learning models for inference on distributed systems
CN114757352B (en) Intelligent agent training method, cross-domain heterogeneous environment task scheduling method and related device
CN114519417A (en) Model training method, device, equipment and medium for edge equipment
CN112084017A (en) Memory management method and device, electronic equipment and storage medium
CN110196805B (en) Data processing method, data processing apparatus, storage medium, and electronic apparatus
CN116932086A (en) Mobile edge computing and unloading method and system based on Harris eagle algorithm
KR102402314B1 (en) Federated learning method for edge network
CN111488208A (en) Edge cloud cooperative computing node scheduling optimization method based on variable step length bat algorithm
CN116524941A (en) Self-adaptive quantization compression method and system for voice model and electronic equipment
WO2023142351A1 (en) Weight adjustment method and apparatus, and storage medium and electronic apparatus
CN110149325A (en) A kind of Intelligent key sharing method, device and equipment
CN111935824B (en) Wireless resource allocation strategy updating method, device, equipment and storage medium
CN114692888A (en) System parameter processing method, device, equipment and storage medium
CN109359182A (en) A kind of answer method and device
CN111770187B (en) Resource downloading method and device
CN109962920B (en) Method, device and system for determining split page number
CN116225311B (en) Configuration method, device and server for terminal equipment storage system parameters
CN115460629A (en) Network communication scheduling optimization method and device, terminal equipment and storage medium
US20220245423A1 (en) Electronic device, user terminal, and method for running scalable deep learning network
CN115841146A (en) Model generation method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination