CN117076113B - Industrial heterogeneous equipment multi-job scheduling method based on federal learning - Google Patents
- Publication number
- CN117076113B CN117076113B CN202311035418.1A CN202311035418A CN117076113B CN 117076113 B CN117076113 B CN 117076113B CN 202311035418 A CN202311035418 A CN 202311035418A CN 117076113 B CN117076113 B CN 117076113B
- Authority
- CN
- China
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G06F9/5072—Grid computing
- G06F9/4881—Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
- G06F9/5027—Allocation of resources to service a request, the resource being a machine, e.g. CPUs, Servers, Terminals
- G06N3/0442—Recurrent networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
- G06N3/045—Combinations of networks
- G06N3/0464—Convolutional networks [CNN, ConvNet]
- G06N3/084—Backpropagation, e.g. using gradient descent
- G06N3/092—Reinforcement learning
- G06N3/098—Distributed learning, e.g. federated learning
- G06N3/0985—Hyperparameter optimisation; Meta-learning; Learning-to-learn
- H04L67/10—Protocols in which an application is distributed across nodes in the network
- G06F2209/483—Multiproc (indexing scheme relating to G06F9/48)
- G06F2209/5011—Pool (indexing scheme relating to G06F9/50)
- G06F2209/502—Proximity (indexing scheme relating to G06F9/50)
- G06F2209/5021—Priority (indexing scheme relating to G06F9/50)
Abstract
The invention provides a federated-learning-based multi-job scheduling method for industrial heterogeneous equipment, comprising the following steps: S1, describing the optimization problem of the multi-job actual edge network as an optimization problem formula; S2, modeling the optimization problem formula as a Dec-POMDP model whose parameters comprise S, A, O, R, and P; and S3, solving the Dec-POMDP model with the federated multi-job scheduling algorithm FMJS to obtain a scheduling strategy. In a heterogeneous distributed industrial environment, the method balances the number of times each industrial client is scheduled into federated jobs, so that the industrial clients are scheduled fairly and efficiently and the model ultimately converges with high efficiency. Moreover, the edge intelligent network for industrial manufacturing is a changeable network environment: not only does the network state itself change, but the local data of the industrial clients also changes continuously over time.
Description
Technical Field
The invention relates to the technical field of industrial computing and networks, and in particular to a multi-job scheduling method for industrial heterogeneous equipment based on federated learning.
Background
With the ever-increasing demands of industrial production and manufacturing for informatization and intelligence, and the continuous evolution of communication network technology, the industrial Internet of Things has advanced rapidly. At the same time, complicated production and manufacturing scenarios are multiplying — especially new applications that need machine learning for decision support and intelligent assistance — and the introduction of Edge Computing (EC) lets these new intelligent industrial applications strike a better balance among personalization, low latency, operating efficiency, and so on. However, deploying machine learning models in an industrial manufacturing environment differs greatly from traditional machine learning deployment. Traditionally, the model is trained in a centralized manner: data from multiple parties is collected at one node (such as a data center) for computation and training. Collecting large-scale data at a single node does yield an accurate model, but that node is far from the service requesters, which inevitably causes high data-transmission delay and carries a substantial risk of leaking users' private data in transit. Therefore, in industrial production scenarios, making full use of each participating entity's local data for model training while protecting user data privacy — thereby obtaining a global model of higher accuracy — is a more suitable way to build machine learning models in industrial settings.
Based on the above analysis, the invention targets industrial production and manufacturing scenarios and studies, on the basis of federated learning (FL), a new method for optimally scheduling service requests from heterogeneous industrial equipment. Under the federated learning model, each industrial client (such as an industrial data collector, an industrial sensor, or an industrial computing unit) computes in coordination with a parameter server (such as an industrial edge service node or an industrial gateway) located at the edge of the industrial Internet of Things. The industrial client's original local data is never transmitted over the communication link; only the model parameters obtained through gradient descent are transmitted. On one hand, because raw data (such as industrial images and industrial video) need not be transmitted, the pressure on network bandwidth is reduced; on the other hand, the privacy and safety of data in the industrial process is effectively protected, relieving the contradiction between privacy protection and the large-scale data that machine learning requires.
Although the federated learning training mode and model architecture have advantages in distributed machine learning and privacy computing, and can mitigate problems such as data silos and communication pressure, many open problems still deserve research and optimization. For federated learning of a single task, equipment must be selected from a huge set of industrial clients to participate in training in order to ensure model accuracy and performance; however, considering the time overhead of distributed model training, the server selects only part of the equipment for the task, and equipment that is not scheduled sits idle, so the utilization of the industrial clients is low. Moreover, edge computing needs to provide terminal devices at the network edge with services better optimized for real-time performance and low energy consumption, so that a large number of industrial clients can obtain more accurate service at lower cost through the federated learning mode.
Disclosure of Invention
The invention aims to at least solve the technical problems in the prior art, and in particular creatively provides a federated-learning-based multi-job scheduling method for industrial heterogeneous equipment.
In order to achieve the above object of the present invention, the present invention provides a multi-job scheduling method for industrial heterogeneous devices based on federal learning, comprising the steps of:
S1, describing the optimization problem of the multi-job actual edge network as the following optimization problem formula:
wherein a value function corresponds to job j in round r; φ1 and φ2 are constants used to weight the two indices; one index is the time sensitivity of the industrial clients participating in job j, and the other is the job enthusiasm of the industrial clients participating in job j. In any training round r the following constraint is satisfied: the global loss function corresponding to each job j in the job set must not exceed a given loss threshold L_j; an indicator variable records whether the industrial client with index n participates in the execution of job j in round r; N represents the total number of training rounds; J represents the total number of jobs;
S2, modeling the optimization problem formula as a Dec-POMDP model whose parameters comprise S, A, O, R, and P, where S is the state set of each agent, A is the joint action set of the multiple agents, O is the agents' observation set, R is the overall reward of the multi-agent system, and P is the state-transition probability of the environment;
and S3, solving the Dec-POMDP model with the federated multi-job scheduling algorithm FMJS to obtain a scheduling strategy.
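The optimization formula of step S1 is an image lost in extraction. Under assumed notation — a binary scheduling indicator a_{n,j}^r, a per-job value function v_j^r, and the ASJ/WED indices defined later in the description — one hedged reconstruction consistent with the variable list above reads:

```latex
\begin{aligned}
\max_{\{a_{n,j}^{r}\}}\quad
  & \sum_{r=1}^{N}\sum_{j=1}^{J} v_{j}^{r},
  \qquad v_{j}^{r} \;=\; \varphi_{1}\,\mathrm{ASJ}_{j}^{r} \;+\; \varphi_{2}\,\mathrm{WED}_{j}^{r},\\
\text{s.t.}\quad
  & F_{j}\!\left(w_{j}^{r}\right) \le L_{j},
  \qquad \sum_{j=1}^{J} a_{n,j}^{r} \le 1,
  \qquad a_{n,j}^{r} \in \{0,1\}, \quad \forall\, n,\, j,\, r .
\end{aligned}
```

Here N counts training rounds (as in the variable list above) and F_j is job j's global loss; the exact symbols of the patent's original formula may differ.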
Further, the relevant quantities are calculated as follows: the global loss function corresponds to job j; λ_j represents the weight of the corresponding job; D_j denotes the data size of job j; N represents the total number of industrial clients; an indicator shows whether the industrial client with index n participates in the execution of job j; a term represents the data size with which the industrial client with index n participates in job j; a term represents the loss value computed by the industrial client on data related to job j; and a term represents the loss value obtained with the gradient-descent algorithm on the input/output data pair (x, y).
Further, the federated multi-job scheduling algorithm FMJS comprises the following steps:
Abstract each federated job into an agent and execute a corresponding agent network for each. The inputs of the network are the agent's observation O n,r and the previous round's action A n,r-1; the output is the value function Q n of the corresponding agent. Here O n,r denotes the observation of the n-th agent in round r, and Q n is a value function in reinforcement learning. The agent network consists, in order, of an FC fully connected layer, a GRU gated recurrent unit, and an FC fully connected layer.
To ensure the agents' learning performance, the outputs of the agent networks and the data of the multi-job environment are stored in an experience replay memory; meanwhile, the reward value is calculated from the agent networks' outputs and also stored in the memory. A mixing network combines the current distributed agent value functions with the experience data from the memory, and the combined value is fed back into the agent networks.
The above steps are repeated until the set number of training rounds is reached and the mixing network outputs its function values. A scheduler trained through this process can, according to the current system state of the industrial heterogeneous equipment, schedule suitable industrial clients for the corresponding jobs to participate in individual federated learning tasks.
An agent is an abstract representation of an independent federated learning task, i.e., a task that uses different data and trains a different network model. For example, user A needs to participate in a task that trains a CNN on facial data, and user B needs to participate in a task that trains an LSTM on text data. If user C also needs to participate in a task that trains an LSTM on text data, then B and C are not independent, while A and C, and A and B, are independent.
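The FC → GRU → FC agent network described above can be sketched in a few dozen lines. This is a minimal illustrative reconstruction, not the patent's implementation: the hidden size, initialization, activations, and the one-hot encoding of the previous action are assumptions.

```python
import numpy as np

class GRUCell:
    """Minimal GRU gate unit (the middle layer of the agent network)."""
    def __init__(self, in_dim, hid_dim, rng):
        init = lambda *s: 0.1 * rng.standard_normal(s)
        self.Wz, self.Uz = init(hid_dim, in_dim), init(hid_dim, hid_dim)
        self.Wr, self.Ur = init(hid_dim, in_dim), init(hid_dim, hid_dim)
        self.Wh, self.Uh = init(hid_dim, in_dim), init(hid_dim, hid_dim)

    def __call__(self, x, h):
        sig = lambda v: 1.0 / (1.0 + np.exp(-v))
        z = sig(self.Wz @ x + self.Uz @ h)            # update gate
        r = sig(self.Wr @ x + self.Ur @ h)            # reset gate
        h_cand = np.tanh(self.Wh @ x + self.Uh @ (r * h))
        return (1.0 - z) * h + z * h_cand

class AgentNet:
    """FC -> GRU -> FC: maps (O_{n,r}, A_{n,r-1}) to the value function Q_n."""
    def __init__(self, obs_dim, n_actions, hid_dim=16, seed=0):
        rng = np.random.default_rng(seed)
        in_dim = obs_dim + n_actions                  # observation + one-hot previous action
        self.W1 = 0.1 * rng.standard_normal((hid_dim, in_dim))
        self.gru = GRUCell(hid_dim, hid_dim, rng)
        self.W2 = 0.1 * rng.standard_normal((n_actions, hid_dim))
        self.h = np.zeros(hid_dim)                    # recurrent state carried across rounds

    def forward(self, obs, prev_action_onehot):
        x = np.maximum(0.0, self.W1 @ np.concatenate([obs, prev_action_onehot]))  # first FC + ReLU
        self.h = self.gru(x, self.h)                  # GRU keeps history of past rounds
        return self.W2 @ self.h                       # second FC: Q-value per candidate action
```

The recurrent hidden state is what lets each agent condition its scheduling value on earlier rounds' observations, matching the role the GRU layer plays in the description.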
Further, the reward value is calculated as follows: c j represents the contribution of the j-th job to the whole multi-job system; R j represents the reward value of each agent; φ1 and φ2 are constants used to weight the two indices; one index is the time sensitivity of the industrial clients participating in job j, and the other is the job enthusiasm of the industrial clients participating in job j.
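The reward formula itself is an image lost in extraction. Given the variable list — contribution c_j, agent reward R_j, weights φ1 and φ2, and the ASJ/WED indices — one hedged reconstruction is:

```latex
c_{j}^{r} \;=\; \varphi_{1}\,\mathrm{ASJ}_{j}^{r} \;+\; \varphi_{2}\,\mathrm{WED}_{j}^{r},
\qquad
R_{j} \;=\; c_{j}^{r},
```

i.e., each agent's reward equals its job's weighted combination of time sensitivity and job enthusiasm; the patent's original formula may combine these terms differently.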
Further, the time sensitivity is updated by a formula in which: ω is a constant with 0 ≤ ω ≤ 1; the time sensitivity of the industrial client with index n toward job j in round r appears on the left of the equation as the value to be adjusted and on the right as the value before adjustment; a correction factor enters the update; and a further term represents the time sensitivity of the industrial client toward job j in round r.
Further, the updated value is determined according to the following judgment conditions: one case applies when r = 0 or a first condition holds; a second case applies when r ≠ 0 and a second condition holds; a third case applies when r ≠ 0 and a third condition holds.
The correction factor is calculated from: the data size of the client's dataset; the global dataset corresponding to the j-th job; and the number of times industrial client n has been scheduled to participate in job j within r rounds.
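The exact ASD update rule is an image lost in extraction; the sketch below is a hypothetical reconstruction from the surrounding description. Assumed (and not stated explicitly in the source): a scheduled client's freshly re-collected data resets its ASD to the initial constant; an unscheduled client's ASD decays, mixed by ω; and the correction factor shrinks with the number of times the client has already been scheduled, so rarely scheduled clients are not starved.

```python
def update_asd(asd_prev, was_scheduled, times_scheduled, data_share,
               omega=0.5, asd_init=1.0):
    """One-round ASD update for one (client, job) pair.

    asd_prev        : ASD value from the previous round
    was_scheduled   : whether the client was scheduled into the job this round
    times_scheduled : how often it has been scheduled into this job so far
    data_share      : client's share of the job's global data size (hypothetical
                      form of the correction factor's inputs)
    """
    if was_scheduled:
        return asd_init                              # data refreshed -> freshest state
    tau = data_share / (1.0 + times_scheduled)       # correction factor (assumed form)
    return omega * asd_prev + (1.0 - omega) * tau    # staleness accumulates
```

With this form, clients that sit unscheduled drift toward a small correction term, while frequently scheduled clients receive less of a boost — matching the balancing role the correction factor is given in the text.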
Further, the job enthusiasm is calculated by a formula in which: a term denotes the device enthusiasm of the industrial client with index n toward job j in round r; exp denotes the exponential function; β1 and β2 are constants; a term denotes the training time of the device; n indexes the n-th industrial client; and the result is the device enthusiasm of the industrial client toward job j in round r.
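The WED formula is an image lost in extraction. One form consistent with the listed variables (an exponential decreasing in the device's training time, with constants β1, β2) is, as a hedged reconstruction:

```latex
e_{n,j}^{r} \;=\; \beta_{1}\,\exp\!\bigl(-\beta_{2}\, t_{n,j}^{r}\bigr),
```

where t_{n,j}^r is the device's training time for job j in round r: a device that trains faster is "more enthusiastic" and scores higher. The patent's original expression may shift or normalize this differently.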
Further, the training time of the device satisfies the following distribution: the execution time of industrial client n for job j in round r obeys a probability distribution in which t is the function variable, e is the natural base, μ n and q n denote the computing-power fluctuation value and the maximum computing power of industrial client n respectively, a term denotes the size of the job-j dataset owned locally by industrial client n, and i denotes the number of local model updates of the industrial client.
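The distribution itself is an image lost in extraction. The variable list (fluctuation μ_n, maximum computing power q_n, local data size D_{n,j}, update count i) matches the shifted-exponential runtime model common in distributed-learning analyses; a hedged reconstruction along those lines is:

```latex
\Pr\!\left[\, t_{n,j}^{r} < t \,\right]
\;=\;
1 \;-\; \exp\!\left( -\,\frac{\mu_{n}\, q_{n}}{i\, D_{n,j}}
  \left( t - \frac{i\, D_{n,j}}{q_{n}} \right) \right),
\qquad t \;\ge\; \frac{i\, D_{n,j}}{q_{n}} .
```

The shift i·D_{n,j}/q_n is the minimum time to run i local updates at full computing power, and μ_n controls how heavy the straggling tail is; the patent's exact parameterization may differ.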
In summary, with the adopted technical scheme, the method can balance the number of times each device is scheduled into federated jobs in a heterogeneous distributed data environment, so that the industrial clients are scheduled fairly and the model ultimately converges with high efficiency. In addition, the edge intelligent network is a changeable network environment: the network state changes, and the local data of the industrial clients changes continuously over time; the method can schedule the devices in the best state to participate in the federated jobs and thereby obtain better model performance. The specific advantages of the invention are:
1) Facing a real edge-network scenario, a novel value-function model is proposed that involves the timeliness of the data on each edge device and the time the device spends participating in work;
2) To improve the overall training efficiency of the multi-job system, a novel scheduling method is designed with the value-function model as the reward standard.
Additional aspects and advantages of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.
Drawings
The foregoing and/or additional aspects and advantages of the invention will become apparent and may be better understood from the following description of embodiments taken in conjunction with the accompanying drawings in which:
FIG. 1 is a diagram of a multi-job federal learning application framework of the present invention.
FIG. 2 is a diagram illustrating the difference between the MIT of the present invention and the conventional multi-task learning.
FIG. 3 is a schematic diagram of the process of MIT according to the present invention.
FIG. 4 is a flow chart of a multiple federal operation in accordance with the present invention.
Fig. 5 is a convergence comparison diagram of CNN of the present invention under IID and NIID.
Fig. 6 is a convergence comparison diagram of LeNet of the present invention under IID and NIID.
Fig. 7 is a convergence comparison diagram of VGG of the present invention under IID and NIID.
Detailed Description
Embodiments of the present invention are described in detail below, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to like or similar elements or elements having like or similar functions throughout. The embodiments described below by referring to the drawings are illustrative only and are not to be construed as limiting the invention.
1 System model
1.1 Formalized definition
The system framework comprises a central parameter server and multiple industrial clients; all industrial clients form the device pool of the multi-job system. Furthermore, a single industrial client can be abstracted into multiple executing sub-processes according to the number of tasks to be executed, each sub-process completing the training for its corresponding task. FIG. 1 shows the multi-job federated learning framework; the parameter server can in turn be abstracted into multiple AP (Access Point) interfaces with independent computing and scheduling capabilities according to the number of jobs to be processed, simulating the parameter servers corresponding to different jobs.
In the framework shown in FIG. 1, the set of industrial clients is given, where N is the total number of industrial clients; all industrial clients constitute the pool of candidate industrial clients. The set of jobs to be executed consists of multiple independent jobs (Multi Independent Task, MIT) with no model or data intersection. This differs from conventional multi-task learning, as shown in fig. 2: multi-task learning (Multi Task Learning, MTL) succeeds because different tasks can absorb model information useful to one another and so obtain a more robust model, whereas the multiple jobs considered here have no data or model correlation, and the industrial clients used for model training only partially overlap across jobs. The job set contains J jobs in total; a job's type may be a text application, an image-data application, and so on. The industrial client with index n owns J n datasets, i.e., it can participate in the model training of J n jobs, where 0 < J n ≤ J. The datasets owned by all the industrial clients can be described as the problem shown in formula (1), where N represents the total number of industrial clients and the numbers of jobs available to the individual industrial clients n are summed.
Meanwhile, in order to reduce the computational burden and energy consumption of the industrial clients, each industrial client participates in the training of at most one job within the same training round, and in any round every job needs to obtain a certain number of industrial clients in order to run. The scheduling of the industrial clients and the execution of the multiple jobs are further constrained and explained as shown in formula (2), where an indicator shows whether the industrial client with index n participates in the execution of job j in round r.
The wireless access point AP holds a list of the industrial client devices that can participate in its corresponding job. Each job independently schedules industrial clients from the candidate edge industrial clients to participate in its execution process; as a participant in a specific federated job and the owner of the original data, the industrial client iteratively trains a local model on its local data and completes the updating and uploading of the model parameters.
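Formula (2) itself is an image lost in extraction. From the prose — at most one job per client per round, and a required number of clients per job — a hedged reconstruction, with a_{n,j}^r the participation indicator and m_j a hypothetical minimum client count for job j, is:

```latex
\sum_{j=1}^{J} a_{n,j}^{r} \;\le\; 1 \quad \forall\, n,\, r;
\qquad
\sum_{n=1}^{N} a_{n,j}^{r} \;\ge\; m_{j} \quad \forall\, j,\, r;
\qquad
a_{n,j}^{r} \in \{0,1\}.
```

The first inequality encodes "one job at most per round per client", and the second encodes "every job obtains a certain number of industrial clients"; the original formula's exact symbols may differ.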
1.2 Problem description and handling procedures
If an industrial client owns a dataset corresponding to the j-th job, the dataset is expressed as a pair of X j, Y j — the input data vector and the output label vector respectively — with a positive data size for the data samples. Otherwise, for the j-th job of industrial client n, the set of data samples is empty, so the corresponding data size is 0; this represents that the industrial client has no dataset for the job, i.e., it is not scheduled to participate in the training process of the job with index j. The overall dataset of job j is the union of the clients' job-j datasets, with a corresponding total data size.
To facilitate the description of the problem, this disclosure studies the industrial-client scheduling and job-execution problem of multiple federated jobs on the basis of a general synchronous federated learning method. The global loss function of the multi-job system is expressed in terms of the following quantities: the global loss function corresponding to job j; λ j, the weight of the corresponding job; D j, the data size of job j; N, the total number of industrial clients; an indicator of whether the industrial client with index n participates in the execution of job j; the data size with which the industrial client with index n participates in job j; the loss value computed by the industrial client on data related to job j; and the loss value obtained with the gradient-descent algorithm on the input/output data pair (x, y). The overall optimization goal of the federated multi-job model is to minimize this global loss.
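The loss-function formula is an image lost in extraction, but the variable list pins down the standard federated form; a hedged reconstruction under assumed notation (a_{n,j} the participation indicator, 𝒟_{n,j} client n's job-j dataset) is:

```latex
F(\mathbf{w}) \;=\; \sum_{j=1}^{J} \lambda_{j}\, F_{j}(w_{j}),
\qquad
F_{j}(w_{j}) \;=\; \frac{1}{D_{j}} \sum_{n=1}^{N} a_{n,j}
  \!\!\sum_{(x,y)\in \mathcal{D}_{n,j}}\!\! f\bigl(w_{j};\, x,\, y\bigr),
\qquad
\min_{\mathbf{w}} \; F(\mathbf{w}).
```

Each job's loss is the data-size-normalized sum of per-sample losses over its participating clients, and the system loss is the λ_j-weighted sum over jobs; the patent's original symbols may differ.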
The process of MIT is shown in fig. 3; the whole process comprises seven core steps. First, based on the state of the whole multi-job system (such as the idleness of the devices), the server's multiple virtual APs schedule optimal devices to participate in the corresponding federated jobs and transmit the global model of the corresponding job to the scheduled devices, completing step ①. A scheduled device receives the global model, completes several iterations of local-model training on its local data, and then uploads the corresponding device information and model parameters to the server, corresponding to steps ②~④ in fig. 3; this matches the classical federated learning execution process. After receiving the information from the industrial clients, the AP processes it further: on one hand, step ⑤ records the relevant performance parameters of the industrial client and performs performance monitoring, which helps step ⑥ calculate the scheduling priority of the industrial client for the next round's device scheduling of the corresponding job; on the other hand, step ⑦ uses an aggregation method to update the global model of the corresponding job through parameter aggregation of the DNN models. This process loops until the specified number of training rounds is reached or the model converges.
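The per-round scheduling and aggregation steps can be sketched as follows. This is a simplified illustration, not the patent's algorithm: the priority table stands in for the ASD/WED-based value function the Dec-POMDP agents would produce, and `k_per_job` is an assumed fixed per-job client count.

```python
def schedule_round(clients, jobs, priority, k_per_job):
    """One scheduling step of a multi-job round (steps 1 and 6, simplified).

    clients  : dict  client name -> set of job ids it holds data for
    priority : dict  (client, job) -> scheduling priority
    Returns {job: list of scheduled clients}, enforcing the constraint that
    each client joins at most one job per round.
    """
    free = set(clients)
    schedule = {}
    for j in jobs:
        eligible = sorted((c for c in free if j in clients[c]),
                          key=lambda c: priority[(c, j)], reverse=True)
        schedule[j] = eligible[:k_per_job]   # top-priority devices for this job's AP
        free -= set(schedule[j])             # removed from the pool for this round
    return schedule

def fedavg(updates, sizes):
    """Step 7: data-size-weighted parameter aggregation (FedAvg-style sketch)."""
    total = float(sum(sizes))
    return [sum(u[d] * s for u, s in zip(updates, sizes)) / total
            for d in range(len(updates[0]))]
```

Looping `schedule_round` followed by local training and `fedavg` per job reproduces, in miniature, the cycle described for steps ①–⑦.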
Further, we build a value function model based on time sensitivity, how often industrial clients are scheduled, and how enthusiastic industrial clients are to participate, in order to optimize the industrial client scheduling problem in an edge network. The value function is determined jointly by the job time sensitivity (Age Sensitivity of Job, ASJ) and the work enthusiasm of the devices (Work Enthusiasm of Device, WED), where the ASJ of a job is composed of the time sensitivity (Age Sensitivity of Device, ASD) of each industrial client participating in that job. Using this value function, the industrial client scheduling strategy is evaluated and optimized to find the industrial clients with fresher data, so that the fitting effect of the model is improved while the overall training time of the job is reduced.
Definition 1 (Age Sensitivity of Device, ASD). Specifically, the ASD represents the time sensitivity of the industrial client with index n toward job j at round r; its initial value is defined as a constant. Suppose industrial client n is scheduled by the AP to participate in training: after completing the corresponding job, it immediately re-collects the data of its internet-of-things terminals to update the information in its local database, and while this data update is in progress, the corresponding ASD value is set to −∞. The ASD value of an industrial client not scheduled in round r decreases, and the AP preferentially selects industrial clients with larger ASD to participate in the federated job, because the data freshness of such clients is higher. The ASD of industrial client n is expressed as:
where the indicator equals 1 if industrial client n is scheduled into job j, and 0 otherwise, and the following basic assumptions are made: ① in the initialization stage, the data owned by every industrial client is the latest data, i.e., the time sensitivity of each industrial client toward every job j at round 1 equals the initial constant; ② while a device re-collects and updates terminal data after being scheduled, the communication link is assumed to be smooth, with no abnormal blocking or retransmission, but the data update still requires a certain waiting time, so the ASD value of a just-scheduled industrial client becomes −∞ and the device cannot be scheduled during this stage. At the same time, to avoid industrial clients with an ASD advantage always being called preferentially while a small number of industrial clients never get the opportunity to participate, a correction factor is introduced to balance the relationship between ASD and the number of times a client has been scheduled. Specifically, the correction factor is:
where the first quantity represents the data size of the dataset owned by the industrial client;
the second represents the global dataset corresponding to the j-th job;
and the third indicates the number of times industrial client n has been scheduled to participate in job j within r rounds.
Combining equations (5) and (6), the final form of ASD is expressed as:
where ω is a constant and 0 ≤ ω ≤ 1.
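A hedged sketch of the ASD bookkeeping follows. The exact forms of equations (5) to (7) are not recoverable from the text, so the unit decay step, the freeze to −∞ during the data refresh, and the assumed form of the correction factor (local-to-global data ratio damped by the scheduling count) are all illustrative assumptions:

```python
import math

A0 = 10.0  # assumed initial ASD constant (the experiments set it to 10)

def correction_factor(local_size, global_size, times_scheduled):
    # assumed form: the client's data share, damped by how often it
    # has already been scheduled, so rarely-picked clients catch up
    return (local_size / global_size) / (1 + times_scheduled)

def asd_update(asd_prev, scheduled, local_size, global_size,
               times_scheduled, omega=0.5):
    """One-round ASD update (sketch of Eqs. (5)-(7)): a freshly
    scheduled client is frozen out while it re-collects data, an idle
    client's ASD decays, and omega mixes in the correction factor."""
    if scheduled:
        return -math.inf          # data being refreshed: cannot be scheduled
    raw = asd_prev - 1.0          # unscheduled clients lose freshness
    theta = correction_factor(local_size, global_size, times_scheduled)
    return omega * raw + (1 - omega) * theta

asd = asd_update(A0, scheduled=False, local_size=200, global_size=1000,
                 times_scheduled=2, omega=0.5)
frozen = asd_update(A0, scheduled=True, local_size=200, global_size=1000,
                    times_scheduled=2)
```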
The overall time sensitivity of job j, the ASJ, is obtained by accumulating the ASD values of the scheduled industrial clients:
Definition 2 (Work Enthusiasm of Device, WED). WED is a metric of the enthusiasm of an industrial client to participate in job j, and it is correlated with the time the device needs to execute job j. The training time of the device is assumed to satisfy the following distribution:
where μ_n > 0 and q_n > 0 are constants representing, respectively, the computing-power fluctuation value and the maximum computing power of industrial client n, and I represents the number of local model updates of the industrial client.
The dataset term represents the size of the job-j-related dataset owned locally by industrial client n.
The execution time of industrial client n for job j at the r-th round obeys a probability distribution, specifically a shifted exponential distribution.
Here t is a pure function variable: the training time is a random variable in the probabilistic sense, and t is its corresponding realization.
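The shifted exponential training-time model can be sampled as below. The exact parameterization (a deterministic shift I·D/q at the maximum computing power q, plus an exponential fluctuation governed by μ) is an assumption consistent with, but not stated by, the text:

```python
import numpy as np

def sample_training_time(mu, q, data_size, local_iters, rng):
    """Sample a local training time from a shifted exponential:
    the best-case time I*D/q at maximum computing power q, plus an
    exponential delay whose intensity is set by the fluctuation mu.
    This parameterization is an assumption for illustration."""
    shift = local_iters * data_size / q           # fastest possible time
    rate = mu * q / (local_iters * data_size)     # fluctuation intensity
    return shift + rng.exponential(1.0 / rate)

rng = np.random.default_rng(1)
times = [sample_training_time(mu=2.0, q=50.0, data_size=500,
                              local_iters=5, rng=rng)
         for _ in range(1000)]
```

With these numbers the shift is 50 time units, so no sampled time falls below it, matching the "minimum time at maximum computing power" reading of the model.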
The work enthusiasm of the device is then expressed as:
where the time term represents the training time of the device;
β_1 and β_2 are constants determined by the system's quality-of-service requirements. As the execution time an industrial client requires for job j increases, the enthusiasm of the device to participate in the corresponding job decreases, because such work occupies significant computational and communication resources of the industrial client.
The overall device enthusiasm for job j is therefore expressed as:
n represents the nth industrial client;
In summary, the value function corresponding to job j at round r is composed of the ASJ and the WED and can be expressed as:
where ψ_1 and ψ_2 are constants used to weight the two indexes.
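The combination of ASJ and WED in equation (12) can be sketched as follows, using the constants reported in the experiments (ψ_1 = 0.5, ψ_2 = 1.5, β_1 = 0.001, β_2 = −10). The exponential-decay form of WED is an assumption, chosen only because it decreases with expected execution time as the text requires:

```python
import math

def wed(expected_time, beta1=0.001, beta2=-10.0):
    # assumed decreasing form: enthusiasm decays exponentially with the
    # expected execution time (beta2 < 0), scaled by beta1
    return beta1 * math.exp(beta2 * expected_time)

def job_value(asd_of_scheduled, expected_times, psi1=0.5, psi2=1.5):
    """Sketch of Eq. (12): ASJ accumulates the ASD of the scheduled
    clients, WED accumulates their enthusiasm, and the job value is
    their psi-weighted sum."""
    asj = sum(asd_of_scheduled)
    wed_j = sum(wed(t) for t in expected_times)
    return psi1 * asj + psi2 * wed_j

v = job_value(asd_of_scheduled=[8.0, 6.0], expected_times=[0.1, 0.2])
```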
The optimization problem for the multi-job edge network can now be restated as equation (13): in every training round r, the constraint that the global loss value of any job j is less than or equal to a given loss value L_j must hold, together with the constraint of equation (2), while the global value function of the multi-job system is maximized. Multiple jobs have a complex relationship in their scheduling choices: if job A and job B schedule the same device at the same time, but the device can respond to only one job at a time, device selection becomes difficult. The scheduling result of one job therefore has a potential impact on the industrial client scheduling of the other jobs, and the solution space of the scheduling problem is exponential.
2 Device scheduling algorithm in federated multi-job
Based on the foregoing problem description and system model, a federated multi-job scheduling algorithm, FMJS, is proposed for solving the problem in equation (13). Each individual federated learning job is abstracted into a single agent. The industrial client scheduling policy one job adopts in a given round influences the idle state of devices over a period of time and thereby the scheduling of the other jobs, so the globally optimal solution of equation (13) must be solved cooperatively by multiple jobs; that is, the multiple agents adopt a cooperative mode, improving the efficiency and cooperation level of the model. Meanwhile, the complex and changeable correspondence between jobs and industrial clients keeps the multi-job system in an unstable environment for long periods, so the problem is modeled as a decentralized partially observable Markov decision process (Dec-POMDP) and solved approximately by reinforcement learning. The model is described by the six-tuple shown in equation (14): a set of agents in the multi-agent system, the state set S of the agents, the joint action set A of the multiple agents, the observation set O of the agents, the overall reward R of the multi-agent system, and the state transition probability P of the environment, where P(S'|S, A) denotes the probability that the multi-agent system, taking joint action A in state S, transitions from state S to state S'.
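The six-tuple of equation (14) can be written down directly as a container type; the concrete field values below are purely illustrative:

```python
from typing import NamedTuple, Callable, Sequence

class DecPOMDP(NamedTuple):
    """Six-tuple of Eq. (14) for the multi-job system (sketch):
    the agent set, states S, joint actions A, observations O,
    team reward R, and transition kernel P(s' | s, a)."""
    agents: Sequence            # one agent per federated job
    states: Sequence            # S
    joint_actions: Sequence     # A
    observations: Sequence      # O
    reward: Callable            # R(s, a) -> float, shared team reward
    transition: Callable        # P(s, a) -> distribution over next states

model = DecPOMDP(
    agents=["VGG", "CNN", "LeNet"],
    states=["idle", "busy"],
    joint_actions=[(0, 1, 2)],
    observations=["device_pool_snapshot"],
    reward=lambda s, a: 0.0,
    transition=lambda s, a: {"idle": 0.5, "busy": 0.5},
)
```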
where S_J represents the state set of the agent with index J, and more generally S_j represents the state set of the j-th agent;
A_J represents the action set of the J-th agent;
O_J represents the observation set of the J-th agent;
The state space S_j of an agent is set according to the larger environment the agent is in, namely the idle state of each industrial client in the industrial client pool at the r-th round, the dispatch time accumulated over r rounds, and the work enthusiasm of the industrial clients.
Because the multiple federated jobs discussed here schedule devices from the same candidate industrial client pool, but the industrial clients available for scheduling by the individual federated jobs are not exactly identical, and because the state space of an agent is related to the states of the industrial clients available to it, the state spaces of the i-th and j-th agents satisfy:
Si≈Sj,1≤i≤J,1≤j≤J,i≠j (15)
The designed multi-job device scheduling algorithm FMJS follows the design of the multi-agent reinforcement learning algorithm QMIX, as in the scheduler part shown in fig. 4. The framework mainly comprises three parts: an agent network (Agent Network), an experience memory bank (Experience Memory), and a mixing network (Mixing Network). Each federated job is abstracted into an agent that executes its own agent network; the network's inputs are the agent's observation O_n,r and the previous round's action A_n,r-1, and its output is the value function Q_n of the corresponding agent. The agent network consists, in order, of an FC fully connected layer, a GRU gated recurrent unit, and another FC fully connected layer. To ensure the learning performance of the agents, the outputs of the agent networks and the data of the multi-job environment are stored in the experience memory bank for training the agents, while the distributed agent value functions are combined through the mixing network. The server side comprises a scheduler and an aggregator; a scheduler trained by this process can, according to the current system state, schedule suitable industrial clients for the corresponding jobs to participate in a single federated learning task, including model training and model aggregation.
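A minimal NumPy sketch of the FC → GRU → FC agent network follows. Layer sizes, initialization, and the sigmoid output head are illustrative assumptions standing in for the trained network of fig. 4:

```python
import numpy as np

rng = np.random.default_rng(0)

def dense(x, w, b):
    return x @ w + b

def gru_step(x, h, p):
    """Single GRU step (update gate z, reset gate r, candidate h~)."""
    z = 1 / (1 + np.exp(-(x @ p["Wz"] + h @ p["Uz"])))
    r = 1 / (1 + np.exp(-(x @ p["Wr"] + h @ p["Ur"])))
    h_tilde = np.tanh(x @ p["Wh"] + (r * h) @ p["Uh"])
    return (1 - z) * h + z * h_tilde

def agent_network(obs, prev_action, h, params):
    """FC -> GRU -> FC: input is the observation concatenated with the
    previous action; output is a sigmoid score in (0, 1) per schedulable
    client, standing in for the agent value function Q_n."""
    x = np.concatenate([obs, prev_action])
    x = np.tanh(dense(x, params["W1"], params["b1"]))  # first FC layer
    h = gru_step(x, h, params["gru"])                  # GRU carries round history
    q = dense(h, params["W2"], params["b2"])           # second FC layer
    return 1 / (1 + np.exp(-q)), h                     # scores in (0, 1)

obs_dim, act_dim, hid, n_clients = 6, 4, 8, 5
params = {
    "W1": rng.normal(size=(obs_dim + act_dim, hid)) * 0.1, "b1": np.zeros(hid),
    "gru": {k: rng.normal(size=(hid, hid)) * 0.1
            for k in ("Wz", "Uz", "Wr", "Ur", "Wh", "Uh")},
    "W2": rng.normal(size=(hid, n_clients)) * 0.1, "b2": np.zeros(n_clients),
}
q, h = agent_network(rng.normal(size=obs_dim), np.zeros(act_dim),
                     np.zeros(hid), params)
```

The recurrent hidden state `h` is what lets the agent condition this round's scores on its scheduling history, which is why the GRU sits between the two FC layers.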
Because what matters here is the overall optimality of the multi-job system, i.e., maximizing the joint reward value R (total reward) of the multiple agents rather than maximizing each agent's individual reward value R_j, the scheduler is designed with a "centralized training, distributed execution" architecture, and the joint reward value R is obtained by aggregating the reward values R_j of the agents using equation (16).
where c_j represents the contribution of the j-th job to the entire multi-job system, and R_j is composed of the ASJ and WED set forth above with weight values ψ_1 and ψ_2, respectively.
The experience memory bank stores the historical observations, actions, states, and similar records of every agent; p data records are drawn at random from it to train the reinforcement learning model, which reduces the strong correlation between adjacent state data. The mixing network combines the global state (system information such as whether each industrial client is idle) with the local value functions Q_j in a nonlinear way to obtain the joint global value function Q_tot, and the reinforcement learning reward value R in FMJS is set to be positively correlated with the objective of the multi-job system, i.e., with the corresponding value function at round r. Meanwhile, to ensure that the global Q_tot is monotonic in each agent's local value function Q_j, so that the actions selected by maximizing the local value functions also maximize the global value function, the local action a_j of each agent is selected by a greedy strategy, obtained by maximizing the local value function Q_j, as in equation (17):
where Q_1, Q_2, …, Q_J are computed by the agent networks shown in fig. 4. The input of each agent network is the observation of the multi-job system environment and the decision the network made in the previous step, i.e., the inputs correspond to O and A in equation (14). The specific calculation flow is that the input passes in turn through a fully connected neural network (FC), a gated recurrent unit network (GRU), and another fully connected neural network (FC); according to the maximum number of industrial clients that can be scheduled for each job, the network outputs that many probability values (each in the range 0 to 1), from which the corresponding Q_j and Q_tot are computed by the mixing network in the scheduler network of fig. 4.
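Equations (16) and (17) and the monotonic mixing property can be sketched in a few lines. The fixed non-negative mixing weights below stand in for the hypernetwork-generated weights of real QMIX:

```python
import numpy as np

def greedy_actions(local_qs):
    """Eq. (17): each agent picks the action maximizing its own Q_j."""
    return [int(np.argmax(q)) for q in local_qs]

def monotonic_mix(local_q_values, state_weights):
    """Sketch of the mixing network's key property: Q_tot combines the
    local values with non-negative weights, so maximizing each Q_j also
    maximizes Q_tot. Real QMIX produces these weights from the global
    state with a hypernetwork; here they are fixed for illustration."""
    w = np.abs(state_weights)   # non-negativity enforces monotonicity
    return float(np.dot(w, local_q_values))

def team_reward(contribs, job_rewards):
    """Eq. (16): joint reward R as the contribution-weighted sum of the
    per-job rewards R_j."""
    return float(np.dot(contribs, job_rewards))

qs = [np.array([0.1, 0.9, 0.3]), np.array([0.7, 0.2, 0.6])]
acts = greedy_actions(qs)
q_tot = monotonic_mix([q.max() for q in qs], state_weights=[0.5, 0.5])
R = team_reward([0.4, 0.36, 0.24], [1.0, 0.5, 0.2])
```

The contribution vector here reuses the experimental setting c_1 = 0.4, c_2 = 0.36, c_3 = 0.24 reported later in the text.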
The pseudo code portion of FMJS is shown in table 1.
Table 1 FMJS scheduling algorithm
It follows that the method of the invention has the following advantages:
1) A system framework for the parallel execution of multiple heterogeneous federated jobs is presented. In the practical application scenario of edge computing, each industrial client can participate in several jobs and a given job can be executed by several industrial clients. To ensure that the multiple jobs in the system can run simultaneously and so improve the execution efficiency of the multi-job system, a parallel execution framework for multiple heterogeneous federated jobs is provided.
2) A value function model combining time sensitivity and work enthusiasm is presented. In the mobile edge computing scenario, the freshness of the data strongly influences the training and application of the model (for example in autonomous driving): the more closely the data tracks the current time, the higher its real-time value, i.e., the higher its time sensitivity, and the better the model fitting effect. Meanwhile, terminal devices are sensitive to the use of energy (such as electric power) resources; if a device participates in a job for a long time, the resource consumption caused by communication and computation exceeds that of a short job, which dampens the device's work enthusiasm. Therefore, a value function model combining time sensitivity and work enthusiasm is proposed to evaluate and optimize the model accuracy and time overhead of the multi-heterogeneous-federated-job system.
3) A device scheduling algorithm based on multi-agent reinforcement learning is provided. The multiple federated jobs to be processed are abstracted into multiple agents that adopt centralized learning with distributed execution: each agent formulates its local policy independently after learning from global information, i.e., it selects idle and optimal devices from the device pool to participate in the corresponding job. To guarantee the overall optimality of the multi-job system, the scheduling algorithm is built on a cooperation mechanism, and the agents obtain an overall reward only through their joint actions.
3 Experiment and Performance evaluation
3.1 Experimental scene establishment
The experiments take computer vision as the basic task type; the classical neural network models CNN, LeNet, and VGG are embedded in the federated learning models, and the three tasks form the image classification jobs to be executed in parallel by the multi-federated-job system, i.e., J = 3, where J represents the total number of jobs. In terms of the number of layers and neurons, the VGG model has the greatest complexity, CNN is next (with five convolutional layers), and LeNet (with three convolutional layers) is the simplest. The network structures of the three jobs share no common structure or data inclusion relationship and are mutually independent jobs; the framework underlying the network models is TensorFlow 2.3.0 GPU. For the three independent network models, three different public datasets are selected in this chapter for training and prediction: EMNIST Letters, EMNIST Digits, and CIFAR-10. EMNIST is an expanded version of the classical handwritten digit dataset MNIST, in which each image is a 28 × 28 handwritten character; EMNIST Letters and EMNIST Digits are subsets of this dataset. Letters contains 145,600 images in 26 case-insensitive letter categories, with training and test sets of 4,800 and 800 images per category, respectively. Digits has 10 categories of digit images, 28,000 images in total, where each digit has training and test sets of 2,400 and 400 images, respectively.
The CIFAR-10 dataset also has 10 categories of image data, but the image types are richer, with larger noise and more varied features; each image is a 32 × 32 three-channel color picture, and each category has 5,000 training and 1,000 test images. CNN uses the EMNIST Letters dataset, VGG uses CIFAR-10, and LeNet is trained on EMNIST Digits. The relevant data characteristics are shown in Table 2:
TABLE 2 Data scale and attribute characteristics for industrial clients

|                   | EMNIST Digits | EMNIST Letters | CIFAR-10 |
| Training set size | 2400          | 1248           | 500      |
| Test set size     | 400           | 208            | 100      |
| Feature size      | 28×28         | 28×28          | 32×32    |
| Training model    | LeNet         | CNN            | VGG      |
| Heterogeneous     | Y/N           | Y/N            | Y/N      |
Furthermore, to simulate the federated learning application scenario effectively, the original data are distributed to the corresponding edge industrial clients under different sampling modes; by changing the sampling parameter, the degree of heterogeneity of the data is varied, which makes it convenient to discuss and analyze the influence of heterogeneous data on the overall federated learning. The Y/N entries in the heterogeneity row of Table 2 indicate whether a heterogeneous data relationship exists. In the IID mode, data collection and partitioning are completed independently and identically distributed, i.e., each industrial client obtains the same number and classes of images at random from the corresponding dataset. In the NIID mode, each industrial client randomly draws two categories of images and then takes a number of additional images from the remaining categories. Regardless of the sampling scheme, industrial clients that own the same dataset are assumed to own the same total amount of data.
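The two partitioning modes can be sketched as follows. The 80/20 split between the two main classes and the remainder in the NIID-2 branch is an assumed ratio, since the text only says two categories plus several images from the remaining categories:

```python
import numpy as np

def partition(labels, n_clients, per_client, mode, rng):
    """Split a labeled dataset across clients. 'iid': each client draws
    uniformly from all classes. 'niid2': each client takes two main
    classes first, then tops up from the rest. Every client ends with
    the same amount of data, as the text assumes."""
    idx_by_class = {c: list(np.flatnonzero(labels == c))
                    for c in np.unique(labels)}
    clients = []
    for _ in range(n_clients):
        if mode == "iid":
            take = rng.choice(len(labels), size=per_client, replace=False)
        else:
            main = rng.choice(list(idx_by_class), size=2, replace=False)
            pool = np.concatenate([idx_by_class[c] for c in main])
            k = min(per_client * 4 // 5, len(pool))  # ~80% from 2 classes (assumed)
            take = rng.choice(pool, size=k, replace=False)
            rest = np.setdiff1d(np.arange(len(labels)), take)
            take = np.concatenate(
                [take, rng.choice(rest, size=per_client - k, replace=False)])
        clients.append(np.asarray(take))
    return clients

rng = np.random.default_rng(0)
labels = np.repeat(np.arange(10), 100)   # 10 classes, 100 samples each
parts = partition(labels, n_clients=4, per_client=50, mode="niid2", rng=rng)
```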
The hardware used here is an NVIDIA GeForce RTX 2080 Ti, emulating a multi-federated-task distributed environment that contains one parameter server and 120 industrial client devices in total, i.e., N = 120. The 120 devices and the parameter server are connected through a network, and several AP interfaces are virtualized in the parameter server for multiplexing; the number of APs is 3. Each job randomly takes 100 industrial clients out of the 120 terminals as its candidate set, and these form the industrial client sets available for scheduling by the three types of jobs. Specifically, during one round of execution, each job selects only a small proportion of optimal devices from its corresponding industrial clients to execute the job.
3.2 Experimental results and analysis
During the experiments, the job corresponding to CNN must complete 200 rounds of training in the FMJS algorithm, LeNet 300 rounds, and VGG 400 rounds of iterative training; the multi-job system ends when the last job completes, so the total number of training rounds of the FMJS scheduling algorithm is 1000. The batch size is set to 32, the RMSprop optimizer is adopted, the learning rate is set to 0.0009, and the data generated by the interaction between the agents and the environment are normalized and then stored in the experience memory bank. Meanwhile, borrowing from the literature the idea that giving scheduling priority to the more difficult task improves multi-task learning performance, the contributions of VGG, CNN, and LeNet to the whole multi-job system are set to c_1 = 0.4, c_2 = 0.36, and c_3 = 0.24, respectively, and the weight values corresponding to ASJ and WED are set to ψ_1 = 0.5 and ψ_2 = 1.5, respectively. The initial value of ASD is set to 10, and the two parameters β_1 and β_2 involved in device enthusiasm are set to 0.001 and −10, respectively.
The time overhead required by the algorithm is tested and evaluated and compared with the performance of the baseline algorithms of Section 4.3.2. The data are divided into two groups by data acquisition mode (NIID-2, IID), and each group contains performance comparisons of the three tasks (VGG, CNN, LeNet) across the different scheduling algorithms.
To evaluate the effectiveness of the FMJS algorithm, the three jobs were run on NIID and IID feature data and compared with the above baseline algorithms; the performance, elapsed training time, and model accuracy results are shown in figs. 5, 6, and 7, respectively. Because of data-heterogeneity disturbances, the accuracy curves based on the NIID feature data exhibit some fluctuation. Meanwhile, because the algorithm proposed here completes the training and evaluation of scheduling performance based on the value function model of equation (12), the accuracy reached by the three jobs after base-model convergence is higher than that of the baseline algorithms. Furthermore, because of the participation enthusiasm of the devices, the total time overhead required by the devices scheduled by FMJS is significantly lower than the baselines.
The left and right subgraphs in fig. 5 show the convergence of the CNN model on the EMNIST Letters dataset after partitioning in the IID and NIID modes respectively, with the devices selected by each scheduling algorithm applied to the CNN model. When the data features are independent and identically distributed (IID) across devices holding the same dataset, the interference from data heterogeneity is much smaller than with non-independent, identically distributed (NIID) data. Comparing the CNN IID and CNN NIID scenarios, the industrial clients scheduled by the various methods exhibit a more stable and smooth convergence over time under CNN IID, while the time-accuracy curves under CNN NIID show some jitter, indicating that heterogeneous data does affect the convergence process of the model; FMJS achieves higher prediction accuracy in both scenarios. In particular, in the NIID scenario the industrial client sequence scheduled by the FMJS algorithm is clearly superior to the other baselines. Although the Greedy algorithm has some advantage over FMJS in convergence speed in the initial stage of training, it is limited by its locally optimal nature: as the scheduling converges, its performance becomes even worse than the Random and CS scheduling algorithms. The other algorithms do not consider the influence of ASD on device scheduling, so both their accuracy and their convergence in the NIID scenario are worse than FMJS; once timeliness is considered, the CS algorithm is even weaker than Random scheduling.
The industrial client sequences produced by the various scheduling algorithms, the time required to complete 200 rounds of CNN model training, and the converged accuracy values are shown in Table 3. In the NIID environment, FMJS saves 20.8% of the time on average and improves accuracy by 4.38% on average compared with the other algorithms; in the IID environment, it saves 38.4% of the time and improves accuracy by 2.16% on average. Because the better homogeneity of the data in the IID scenario aids training and convergence while the NIID scenario poses greater challenges to model training, the advantage of the proposed FMJS algorithm is more pronounced in the NIID data scenario faced by federated learning.
Table 3 time overhead and accuracy for each algorithm to complete CNN operation
FIG. 6 illustrates the training effect of applying the industrial client sequences scheduled by the various algorithms to the LeNet model after EMNIST Digits is partitioned in the IID and NIID modes. The behavior is similar to CNN IID and CNN NIID: Greedy has a slight advantage in convergence speed early in training, and FMJS shows better convergence and accuracy as time advances. Table 4 shows the total time required to complete 300 rounds of LeNet training and the model accuracy achieved. Although the classical LeNet model already reaches high accuracy, in the federated heterogeneous distributed computing architecture the scheduling algorithm can still improve accuracy by 1.25% on average in the NIID scenario while saving 11.4% of the time on average compared with the other baselines; in the IID environment, it saves 14.2% of the time and improves accuracy by 0.06% on average.
Table 4 time overhead and accuracy for each algorithm to complete the LeNet operation
Fig. 7 shows the training effect of applying the device sequences scheduled by the various algorithms to the VGG model after CIFAR-10 is partitioned in the IID and NIID modes. Since the multi-job scheduling algorithm assigns VGG the largest contribution weight during training, the efficiency and effect of FMJS in VGG IID and VGG NIID differ from CNN and LeNet: in the initial stage FMJS converges faster than the Greedy scheduling algorithm, because during FMJS training the industrial client selection preferentially favors VGG. Table 5 shows the total time required to complete 400 rounds of VGG training and the model accuracy achieved. In the NIID environment, FMJS saves 11.3% of the time and improves accuracy by 4.8% on average compared with the other algorithms; in the IID environment, it saves 20.6% of the time and improves accuracy by 1.61% on average.
Table 5 time overhead and accuracy for each algorithm to complete VGG operation
Weighted by the job contributions c_1 = 0.4, c_2 = 0.36, c_3 = 0.24, the FMJS algorithm proposed in this chapter reduces the overall time overhead by 25.4% and 14.7% in the IID and NIID scenarios respectively, and improves overall accuracy by 1.43% and 3.79% respectively.
In summary, for edge-intelligence networks in which multiple jobs must run independently at the same time and multiple industrial clients can be selected to participate in different jobs, a multi-federated-job learning structure is proposed to improve the efficiency and performance of the multi-job system. A multi-job system optimization problem based on time sensitivity and device work enthusiasm is introduced and modeled as a Dec-POMDP; a federated multi-job device scheduling algorithm, FMJS, is designed on the QMIX reinforcement learning model, converting the selection and scheduling of devices for multiple jobs into multi-agent device scheduling, and finally the globally optimal action selection strategy is trained with the overall maximum reward of the multi-agent system as the criterion. Compared with several standard methods, the proposed scheduling algorithm improves peak accuracy by 2.93% and 4.88% and reduces training time by 25.4% and 14.7% in the IID and NIID scenarios, respectively.
While embodiments of the present invention have been shown and described, it will be understood by those of ordinary skill in the art that: many changes, modifications, substitutions and variations may be made to the embodiments without departing from the spirit and principles of the invention, the scope of which is defined by the claims and their equivalents.
Claims (7)
1. The multi-job scheduling method for the industrial heterogeneous equipment based on federal learning is characterized by comprising the following steps of:
s1, describing the optimization problem of the multi-job actual edge network as the following optimization problem formula:
wherein the first quantity represents the value function corresponding to job j at round r;
ψ_1 and ψ_2 are both constants, used to weight the two indexes;
representing a time sensitivity of participation by the industrial client in job j;
representing the job enthusiasm for participation in job j by an industrial client;
and in any training round r, the following constraint is satisfied:
where L j represents a given penalty value for job j;
representing a global loss function corresponding to the job j;
Representing a set of jobs;
indicating whether the industrial client with index n participates in the execution of the job j in the r-th round;
N represents the total training round number;
J represents the total number of jobs;
s2, modeling the optimization problem formula as a Dec-POMDP model, whose parameters comprise: S, A, O, R, P, wherein S is the state set representation of each agent, A is the joint action set of the multiple agents, O is the observation set of the agents, R is the overall reward of the multi-agent system, and P is the state transition probability of the environment;
s3, solving a Dec-POMDP model by adopting a federal multi-job scheduling algorithm FMJS to obtain a scheduling strategy;
The federal multi-job scheduling algorithm FMJS includes the steps of:
abstracting each federated job into an agent and executing the corresponding agent networks respectively, wherein the inputs of each network are the observed value O_n,r of the agent and the action A_n,r-1 of the previous round, and the output is the value function Q_n of the corresponding agent; wherein O_n,r represents the observed value of the nth agent at the r-th round, and Q_n is a value function in reinforcement learning; the agent network is composed, in sequence, of an FC fully connected network layer, a GRU gated recurrent unit, and an FC fully connected network layer;
storing the data related to the outputs of the agent networks and the multi-job environment in an experience memory bank, calculating a reward value from the outputs of the agent networks, and storing the calculated reward value in the experience memory bank; combining the current distributed agent value functions and the experience data of the experience memory bank through a mixing network, and feeding the combined function value back into the agent networks;
and repeating the above steps until the set number of training rounds is reached and the mixing network outputs the functions; the scheduler trained by this process can, according to the current system state of the industrial heterogeneous devices, schedule suitable industrial clients for the corresponding jobs to participate in a single federated learning task.
2. The industrial heterogeneous equipment multi-job scheduling method based on federated learning according to claim 1, wherein and are calculated as follows:
wherein represents the global loss function corresponding to job j;
λ j represents the weight value of the corresponding job;
D j represents the data size of job j;
N represents the total number of industrial clients;
 indicates whether the industrial client with index n participates in the execution of job j;
 represents the data size with which the industrial client with index n participates in job j;
 represents the loss value computed by the industrial client based on its job-j data;
 represents the loss value obtained by the gradient descent algorithm on the input/output data pair (x, y).
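The symbol definitions above describe a data-size-weighted aggregation of per-client losses. The claim's formula image is not reproduced in this text, so the following is a hedged sketch of the standard FedAvg-style form those symbols suggest:

```python
def global_loss(client_losses, client_sizes, participates):
    """FedAvg-style aggregate for one job j (a sketch -- the patent's formula
    image is missing, so this exact form is an assumption):
    client_losses[n]  -- local loss of client n on its job-j data
    client_sizes[n]   -- data size with which client n participates in job j
    participates[n]   -- indicator in {0, 1}: whether client n runs job j
    Each participating client's local loss is weighted by its share of the
    job's total participating data.
    """
    total = sum(s for s, p in zip(client_sizes, participates) if p)
    if total == 0:
        return 0.0
    return sum(p * (s / total) * l
               for l, s, p in zip(client_losses, client_sizes, participates))
```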
3. The industrial heterogeneous equipment multi-job scheduling method based on federated learning according to claim 1, wherein the reward value is calculated as follows:
wherein c j represents the contribution degree of the j-th job to the overall multi-job system;
R j represents the reward value of each agent;
φ 1 and φ 2 are both constants, used to weight the two indices;
 represents the time sensitivity of the industrial client's participation in job j;
 represents the job enthusiasm of the industrial client's participation in job j.
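Claim 3 defines the reward in terms of c j, the constants φ 1 and φ 2, the time sensitivity, and the job enthusiasm, but the formula image itself is absent. The combination below is therefore an assumption: the job's contribution scaling a φ-weighted sum of the two indices.

```python
def agent_reward(contribution, time_sensitivity, enthusiasm, phi1=0.5, phi2=0.5):
    """Hypothetical reward for job j's agent (claim 3's formula image is
    missing; this weighted-sum form and the phi defaults are assumptions):
    contribution     -- c_j, the job's contribution to the multi-job system
    time_sensitivity -- the client's time-sensitivity index for job j
    enthusiasm       -- the client's job-enthusiasm index for job j
    """
    return contribution * (phi1 * time_sensitivity + phi2 * enthusiasm)
```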
4. The industrial heterogeneous equipment multi-job scheduling method based on federated learning according to claim 1 or claim 3, wherein is calculated as follows:
wherein ω is a constant and 0 ≤ ω ≤ 1;
 represents the time sensitivity of the industrial client with index n to job j at round r;
 represents a correction factor;
 represents the time sensitivity of the industrial client to job j at round r.
5. The industrial heterogeneous equipment multi-job scheduling method based on federated learning according to claim 4, wherein the value of is determined according to the following conditions:
when r = 0 or , then ;
when r ≠ 0 and , then ;
when r ≠ 0 and , then ;
the correction factor is calculated as follows:
wherein represents the data size of the dataset ;
 represents the global dataset corresponding to the j-th job;
 represents the number of times industrial client n is scheduled to participate in job j within r rounds.
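Claims 4 and 5 describe an ω-weighted recursive update of the time sensitivity together with a correction factor built from the client's data share and its scheduling count. The published formula images are missing, so the forms below are assumptions: an exponential-moving-average recursion, and a correction factor equal to the data share damped by how often the client has already been scheduled.

```python
def time_sensitivity(prev, correction, omega=0.8):
    """Assumed shape for claim 4's recursion (formula image missing):
        T^r = omega * T^{r-1} + (1 - omega) * correction
    prev       -- the client's time sensitivity at round r-1
    correction -- the correction factor of claim 5
    omega      -- constant with 0 <= omega <= 1
    """
    return omega * prev + (1.0 - omega) * correction

def correction_factor(local_size, global_size, times_scheduled):
    """Assumed form for claim 5's correction factor: the client's share of
    job j's global dataset, damped by its scheduling count over r rounds."""
    return (local_size / global_size) / (1 + times_scheduled)
```

Under this reading, a client holding a large share of the job's data but rarely scheduled gets a large correction, raising its sensitivity and so its chance of being picked next round.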
6. The industrial heterogeneous equipment multi-job scheduling method based on federated learning according to claim 1 or claim 3, wherein is calculated as follows:
wherein represents the device enthusiasm of the industrial client with index n for job j at round r;
exp represents the exponential function;
β 1 and β 2 are constants;
 represents the training time of the device;
n represents the n-th industrial client;
 represents the device enthusiasm of the industrial client for job j at round r.
7. The industrial heterogeneous equipment multi-job scheduling method based on federated learning according to claim 6, wherein the training time of the device satisfies the following distribution:
wherein represents the probability distribution obeyed by the execution time of industrial client n for job j at round r;
t represents the function variable;
e represents the natural base;
μ n and q n represent the computing-power fluctuation value and the computing-power maximum value of industrial client n, respectively;
 represents the size of the job-j dataset locally owned by industrial client n;
I represents the number of local model updates of the industrial client.
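The μ n / q n parametrization in claim 7 matches the shifted-exponential training-time model commonly used for heterogeneous federated clients: a deterministic minimum time set by the client's maximum computing power, plus a random straggling term governed by its fluctuation. Since the claim's distribution image is absent, the sampler below assumes that standard form; the enthusiasm function of claim 6 is likewise assumed to decay exponentially with training time, with illustrative β 1, β 2 defaults.

```python
import math
import random

def sample_training_time(mu_n, q_n, data_size, local_updates, rng=random):
    """Sample a client's local training time (assumed shifted-exponential
    form; the claim's distribution image is missing):
        t = I*D/q_n  +  Exp(rate = mu_n / (I*D))
    mu_n          -- computing-power fluctuation value of client n
    q_n           -- computing-power maximum value of client n
    data_size     -- size D of client n's local job-j dataset
    local_updates -- I, the number of local model updates
    """
    work = local_updates * data_size
    t_min = work / q_n                       # fastest possible finish time
    slowdown = rng.expovariate(mu_n / work)  # random straggling term
    return t_min + slowdown

def device_enthusiasm(train_time, beta1=1.0, beta2=0.1):
    """Assumed reading of claim 6: enthusiasm decays exponentially with the
    device's training time (beta1, beta2 are the claim's constants)."""
    return math.exp(beta1 - beta2 * train_time)
```

Under these assumptions, slower (straggling) clients get lower enthusiasm scores, which lowers their reward contribution and discourages the scheduler from repeatedly picking them.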
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311035418.1A CN117076113B (en) | 2023-08-17 | 2023-08-17 | Industrial heterogeneous equipment multi-job scheduling method based on federal learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN117076113A CN117076113A (en) | 2023-11-17 |
CN117076113B true CN117076113B (en) | 2024-09-06 |
Family
ID=88712692
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311035418.1A Active CN117076113B (en) | 2023-08-17 | 2023-08-17 | Industrial heterogeneous equipment multi-job scheduling method based on federal learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117076113B (en) |
Family Cites Families (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112381428B (en) * | 2020-11-19 | 2023-09-19 | 平安科技(深圳)有限公司 | Service distribution method, device, equipment and storage medium based on reinforcement learning |
CN112734172B (en) * | 2020-12-25 | 2022-04-01 | 南京理工大学 | Hybrid flow shop scheduling method based on time sequence difference |
CN113191484B (en) * | 2021-04-25 | 2022-10-14 | 清华大学 | Federal learning client intelligent selection method and system based on deep reinforcement learning |
CN114219159A (en) * | 2021-12-20 | 2022-03-22 | 湖南大学 | Production line scheduling method based on federal learning and attention mechanism |
CN114971819A (en) * | 2022-03-28 | 2022-08-30 | 东北大学 | User bidding method and device based on multi-agent reinforcement learning algorithm under federal learning |
CN115936143A (en) * | 2022-12-06 | 2023-04-07 | 长春工业大学 | Federal learning optimization method in non-independent same distribution environment based on reinforcement learning |
CN116486192A (en) * | 2023-03-09 | 2023-07-25 | 西安电子科技大学广州研究院 | Federal learning method and system based on deep reinforcement learning |
CN116416508A (en) * | 2023-03-17 | 2023-07-11 | 西安电子科技大学广州研究院 | Method for accelerating convergence of global federal learning model and federal learning system |
CN116389270A (en) * | 2023-03-29 | 2023-07-04 | 华东师范大学 | DRL (dynamic random link) joint optimization client selection and bandwidth allocation based method in federal learning |
Non-Patent Citations (1)
Title |
---|
Ji Liu, Juncheng Jia, Beichen Ma, Chendi Zhou, Jingbo Zhou, Yang Zhou, Huaiyu Dai, Dejing Dou. Multi-Job Intelligent Scheduling With Cross-Device Federated Learning. IEEE. 2022, 535-551. * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
He et al. | Edge-aided computing and transmission scheduling for LTE-U-enabled IoT | |
CN111026549B (en) | Automatic test resource scheduling method for power information communication equipment | |
CN111611062B (en) | Cloud-edge collaborative hierarchical computing method and cloud-edge collaborative hierarchical computing system | |
Liu et al. | Communication-efficient asynchronous federated learning in resource-constrained edge computing | |
CN112632615B (en) | Scientific workflow data layout method based on hybrid cloud environment | |
CN113037877A (en) | Optimization method for time-space data and resource scheduling under cloud edge architecture | |
CN113887748B (en) | Online federal learning task allocation method and device, and federal learning method and system | |
Li et al. | Research on QoS service composition based on coevolutionary genetic algorithm | |
Liu et al. | Hastening stream offloading of inference via multi-exit dnns in mobile edge computing | |
CN115967990A (en) | Classification and prediction-based border collaborative service unloading method | |
Wang et al. | Recommending-and-grabbing: A crowdsourcing-based order allocation pattern for on-demand food delivery | |
CN116939866A (en) | Wireless federal learning efficiency improving method based on collaborative computing and resource allocation joint optimization | |
Yang et al. | Multi-agent reinforcement learning based file caching strategy in mobile edge computing | |
CN118396294A (en) | Cloud manufacturing scheduling method based on quantum multi-agent reinforcement learning | |
Serhani et al. | Dynamic Data Sample Selection and Scheduling in Edge Federated Learning | |
CN117076113B (en) | Industrial heterogeneous equipment multi-job scheduling method based on federal learning | |
Zhang et al. | Design and Analysis of an Efficient Multiresource Allocation System for Cooperative Computing in Internet of Things | |
CN114385359B (en) | Cloud edge task time sequence cooperation method for Internet of things | |
Cui et al. | The learning stimulated sensing-transmission coordination via age of updates in distributed UAV swarm | |
CN106209978B (en) | Alliance relation service combination selection system and method | |
Fu et al. | Hybrid recruitment scheme based on deep learning in vehicular crowdsensing | |
Sun et al. | Optimizing task-specific timeliness with edge-assisted scheduling for status update | |
He et al. | Client selection and resource allocation for federated learning in digital-twin-enabled industrial Internet of Things | |
Su et al. | Communication cost-aware client selection in online federated learning: A Lyapunov approach | |
CN117971475B (en) | Intelligent management method and system for GPU computing force pool |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||