CN117915405B

CN117915405B - Distributed multi-unmanned aerial vehicle cooperative task unloading method

Info

Publication number: CN117915405B
Application number: CN202410302408.8A
Authority: CN
Inventors: 韦宝泉; 肖昊祥; 邓芳明; 姜玉静; 于小四; 曾建军; 沈阳; 喻斌; 陈斌; 潘剑
Original assignee: East China Jiaotong University
Current assignee: East China Jiaotong University
Priority date: 2024-03-18
Filing date: 2024-03-18
Publication date: 2024-05-31
Anticipated expiration: 2044-03-18
Also published as: CN117915405A

Abstract

The invention discloses a distributed multi-unmanned aerial vehicle (multi-UAV) cooperative task offloading method comprising the following steps. Step S1: a distributed multi-UAV cooperative system model is established according to the time-varying characteristics of UAV inspection task arrivals and the computing load of independent edge nodes; the system model comprises a task model, a computation offloading model and a communication model. Step S2: an offloading decision problem aimed at minimizing system delay is constructed based on the task model and the computation offloading model, the offloading decision problem is converted into a Markov decision process, and the Markov decision process is solved by a multi-agent deep deterministic policy gradient (MADDPG) algorithm to obtain a globally optimal cooperative task offloading strategy among the UAVs in the current time slot. The invention makes full use of the computing resources of the edge nodes for reasonable task offloading and resource allocation.

Description

Distributed multi-unmanned aerial vehicle cooperative task unloading method

Technical Field

The invention relates to the technical field of data processing, and in particular to a distributed multi-UAV cooperative task offloading method.

Background

In recent years, unmanned aerial vehicles (UAVs) have been widely used to inspect overhead transmission lines, improving inspection efficiency and reliability. However, as UAV power inspection becomes more widespread and its performance requirements rise, the massive data it generates takes a long time to upload to a cloud center for processing, which incurs high communication and delay costs.

The rise of mobile edge computing (MEC) can alleviate these problems: MEC servers are deployed at the edge, closer to the terminal equipment, to provide real-time computing and communication services for terminals, which effectively improves inspection efficiency. However, communication links in some remote areas are unstable, so data transmission reliability is low. In particular, when a conventional fixed MEC server faces a sudden power failure, the resources it can provide are limited, which may cause task offloading to fail or aggravate delay, so high-quality service cannot be provided. In addition, computing and communication resources in the inspection system are limited, and resource competition arises when multiple UAVs perform inspection simultaneously. How to make full use of the computing resources of edge nodes for reasonable task offloading and resource allocation is therefore a problem to be solved by those skilled in the art.

Disclosure of Invention

The invention aims to provide a distributed multi-unmanned aerial vehicle (multi-UAV) cooperative task offloading method that makes full use of the computing resources of edge nodes for reasonable task offloading and resource allocation.

A distributed multi-UAV cooperative task offloading method comprises the following steps:

Step S1: a distributed multi-UAV cooperative system model is established according to the time-varying characteristics of UAV inspection task arrivals and the computing load of independent edge nodes, wherein the system model comprises a task model, a computation offloading model and a communication model;

Step S2: an offloading decision problem aimed at minimizing system delay is constructed based on the task model and the computation offloading model; the offloading decision problem is converted into a Markov decision process; the Markov decision process is then solved by a multi-agent deep deterministic policy gradient algorithm to obtain a globally optimal cooperative task offloading strategy among the UAVs in the current time slot. Each UAV is provided with an actor network, which makes decisions according to the globally optimal cooperative task offloading strategy, performs local computing resource allocation and task migration between UAVs, and shares the state of the next time slot with the communication model so that the offloading action of the next time slot can be carried out;

The globally optimal cooperative task offloading strategy is used to instruct the UAV to offload the load tasks of the current time slot, and is obtained based on backlog information of the task queues on the UAV. The communication model is a global state sharing model based on a variational recurrent neural network; it extracts and encodes latent information in the high-dimensional global state and is deployed in a centralized user plane function.

In the distributed multi-UAV cooperative task offloading method provided by the invention, UAVs carrying edge servers are flexibly deployed as aerial mobile edge nodes that integrate inspection, computation and storage. The time-varying characteristics of task arrivals and the computing loads of multiple independent edge nodes are considered together, and a distributed multi-UAV cooperative system model is established on this basis. Arriving tasks are divided into different priorities according to their tolerable delay, and an offloading decision problem aimed at minimizing system delay is constructed. The problem is then converted into a Markov decision process and solved by a multi-agent deep deterministic policy gradient algorithm. A global state sharing model based on a variational recurrent neural network (variational RNN) is designed as the communication model, which can significantly reduce the communication overhead among UAVs. A globally optimal cooperative task offloading strategy among the UAVs is thereby obtained: each UAV makes decisions according to its local state and the shared global information, tasks are allocated among the UAVs, task processing delay is effectively reduced, and resource utilization is improved. The invention can adaptively adjust the offloading strategy according to the urgency and load state of the tasks on the UAVs, thereby minimizing the overall task processing delay of the system, achieving load balancing and improving online inspection efficiency.

Drawings

Fig. 1 is a schematic flow chart of the distributed multi-UAV cooperative task offloading method of the invention.

Detailed Description

For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

Referring to Fig. 1, the distributed multi-UAV cooperative task offloading method provided by the embodiment of the invention includes steps S1 to S2.

Step S1: a distributed multi-UAV cooperative system model is established according to the time-varying characteristics of UAV inspection task arrivals and the computing load of independent edge nodes; the system model comprises a task model, a computation offloading model and a communication model.

In this embodiment, the method is applied to an online inspection system for power transmission lines. The system comprises a line layer and a UAV layer: the line layer is the power transmission line electrical equipment to be monitored, and the UAV layer is a group of UAVs carrying MEC servers. Each UAV, equipped with a computing unit and a communication antenna, hovers in a preset area and serves as a mobile edge node providing computing and communication services. The UAVs are connected through wireless backhaul links, so tasks can be migrated to adjacent edge nodes for cooperative computation. Because users are arbitrarily distributed, the tasks offloaded to different UAVs are random.

In this embodiment, the actor network and the critic network corresponding to each UAV are obtained by training on generated sample data.

In order to coordinate the offloading decisions of all agents, the invention designs a communication model for collecting and sharing global states. The communication model is a global state sharing model based on a variational recurrent neural network (variational RNN), which integrates a variational autoencoder (VAE) into a recurrent neural network (RNN). It extracts and encodes latent information in the high-dimensional global state and is deployed in a centralized user plane function (UPF). The model can significantly reduce the communication overhead of task data transmission between UAVs.

Step S2: an offloading decision problem aimed at minimizing system delay is constructed based on the task model and the computation offloading model; the offloading decision problem is converted into a Markov decision process; the Markov decision process is then solved by a multi-agent deep deterministic policy gradient (MADDPG) algorithm to obtain a globally optimal cooperative task offloading strategy among the UAVs in the current time slot. Each UAV is provided with an actor network, which makes decisions according to the globally optimal cooperative task offloading strategy, performs local computing resource allocation and task migration between UAVs, and shares the state of the next time slot with the communication model so that the offloading action of the next time slot can be carried out;

The globally optimal cooperative task offloading strategy is used to instruct the UAV to offload the load tasks of the current time slot, and is obtained based on backlog information of the task queues on the UAV.

In this embodiment, the task model divides the computing tasks arriving at a UAV into $K$ priorities according to task urgency: the lower a task's tolerable delay, the higher its priority, and tasks of the same priority form one task queue. The $i$-th task in queue $k$ of UAV $b$ is denoted $q_{b,k,i}$; $D_{b,k,i}$ represents the amount of task data, $c_{b,k,i}$ represents the CPU cycles required to complete the task, and $T^{\max}_{b,k,i}$ represents the maximum tolerated delay of the task. The following variables are defined:

(1) The number of tasks offloaded to queue $k$ of UAV $b$ in the current time slot is denoted $N_{b,k}(t)$; the number of tasks in UAV $b$ is then $N_b(t) = \sum_{k=1}^{K} N_{b,k}(t)$;

(2) The queue load is the number of CPU cycles required to complete all computing tasks in queue $k$ of UAV $b$, expressed as $L_{b,k}(t) = \sum_{i=1}^{N_{b,k}(t)} c_{b,k,i}$; the queue load of UAV $b$ is $L_b(t) = \sum_{k=1}^{K} L_{b,k}(t)$;

(3) The urgency of the backlog tasks in queue $k$ of UAV $b$, $U_{b,k}(t)$, is defined in terms of $U_{b,k,i}(t)$, the remaining time before the $i$-th task in queue $k$ of UAV $b$ times out; the urgency of the backlog tasks of UAV $b$ is denoted $U_b(t)$;

In addition, $N_m$ denotes the maximum number of tasks in a queue, and $L_m$ and $U_m$ denote the maximum queue load and the maximum urgency of backlog tasks, respectively. Normalizing gives the normalized number of tasks $n_b(t)$ in UAV $b$, the normalized queue load $l_b(t)$ of UAV $b$ and the normalized urgency $u_b(t)$ of the backlog tasks of UAV $b$, where $n_b(t) = N_b(t)/N_m$, $l_b(t) = L_b(t)/L_m$ and $u_b(t) = U_b(t)/U_m$.
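The task-count and queue-load bookkeeping can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: the `Task` class, the `queue_stats` helper and all numeric values are hypothetical, and the urgency term is left out because its exact definition is not given in the text.

```python
from dataclasses import dataclass


@dataclass
class Task:
    data_bits: float   # amount of task data, D_{b,k,i}
    cpu_cycles: float  # CPU cycles required to complete the task, c_{b,k,i}
    max_delay: float   # maximum tolerated delay, in seconds


def queue_stats(queues, n_max, l_max):
    """Return raw and normalized task count and queue load for one UAV.

    `queues` is a list of K priority queues, each a list of Task objects.
    `n_max` and `l_max` play the role of N_m and L_m (assumed constants).
    """
    n_total = sum(len(q) for q in queues)                         # N_b(t)
    l_total = sum(task.cpu_cycles for q in queues for task in q)  # L_b(t)
    return {
        "N_b": n_total,
        "L_b": l_total,
        "n_b": n_total / n_max,  # normalized task count
        "l_b": l_total / l_max,  # normalized queue load
    }


# Illustrative values only: two priority queues on one UAV.
queues = [
    [Task(1e6, 2e8, 0.5), Task(2e6, 3e8, 0.5)],  # high priority
    [Task(4e6, 5e8, 2.0)],                       # low priority
]
stats = queue_stats(queues, n_max=10, l_max=2e9)
```

The normalized values are the quantities that later make up the local state of each UAV.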

The computation offloading model comprises local computation on the UAV and computation executed after migration to another UAV;

Local computation on the UAV satisfies the following expression:

$$T^{\mathrm{loc}}_{b,k,i}(t) = T^{\mathrm{wait}}_{b,k,i}(t) + T^{\mathrm{comp}}_{b,k,i}(t), \qquad T^{\mathrm{comp}}_{b,k,i}(t) = \frac{C_{b,k,i}}{f_{b,k}(t)}, \qquad C_{b,k,i} = \sum_{j=1}^{i} c_{b,k,j};$$

where $T^{\mathrm{loc}}_{b,k,i}(t)$ represents the total delay of local computation on the UAV; $T^{\mathrm{wait}}_{b,k,i}(t)$ represents the time task $q_{b,k,i}$ has waited in the queue from its arrival to time $t$; $T^{\mathrm{comp}}_{b,k,i}(t)$ represents the computation delay for the CPU cycles UAV $b$ needs to complete the first $i$ queued tasks; $C_{b,k,i}$ represents the number of CPU cycles UAV $b$ needs to complete the first $i$ queued tasks; $j$ indexes the $j$-th task; $f_{b,k}(t)$ represents the CPU cycles that UAV $b$ allocates to queue $k$ at time $t$, with $f_{b,k}(t) = \alpha_{b,k}(t) F_b$; $F_b$ represents the total computing resources of UAV $b$; and $\alpha_{b,k}(t)$ represents the proportion of CPU cycles allocated to queue $k$ at time $t$.
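As a rough sketch of the local-computation delay described above, the total delay can be computed as the waiting time plus the cumulative CPU cycles of the queued tasks divided by the CPU frequency allocated to the queue. The function name, argument names and numbers below are illustrative assumptions, and the task index is 0-based for convenience.

```python
def local_delay(wait_time, queue_cycles, i, total_cpu_hz, share):
    """Total local delay of the task at 0-based position `i` in a queue.

    wait_time     -- time the task has already waited in the queue (s)
    queue_cycles  -- CPU cycles of the tasks in the queue, in order
    total_cpu_hz  -- total computing resources of the UAV (cycles/s)
    share         -- fraction of CPU cycles allocated to this queue, in (0, 1]
    """
    if not 0 < share <= 1:
        raise ValueError("share must be in (0, 1]")
    allocated_hz = share * total_cpu_hz
    # Tasks ahead in the queue must finish first, so their cycles are included.
    cumulative_cycles = sum(queue_cycles[: i + 1])
    return wait_time + cumulative_cycles / allocated_hz


# Second task in a queue holding tasks of 2e8, 3e8 and 5e8 cycles,
# with half of a 2 GHz CPU allocated to the queue.
delay = local_delay(0.1, [2e8, 3e8, 5e8], i=1, total_cpu_hz=2e9, share=0.5)
```

With these numbers the allocated frequency is 1e9 cycles/s, the cumulative load is 5e8 cycles, and the delay is 0.1 s of waiting plus 0.5 s of computation.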

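Computation executed after migration to another UAV satisfies the following expression:

$$T^{\mathrm{mig}}_{b,k,i}(t) = T^{\mathrm{wait}}_{b',k,i'}(t) + T^{\mathrm{tran}}_{b,b'} + T^{\mathrm{comp}}_{b',k,i'}(t), \qquad T^{\mathrm{tran}}_{b,b'} = \frac{D_{b,k,i}}{r_{b,b'}}, \qquad r_{b,b'} = B^{\mathrm{up}} \log_2\!\left(1 + \frac{P H}{\sigma^2}\right), \qquad T^{\mathrm{comp}}_{b',k,i'}(t) = \frac{C_{b',k,i'}}{f_{b',k}(t)};$$

where task $q_{b,k,i}$ is migrated to a lightly loaded UAV $b'$ for execution and becomes the $i'$-th task in queue $k$ of UAV $b'$, denoted $q_{b',k,i'}$; $T^{\mathrm{mig}}_{b,k,i}(t)$ represents the total delay of computation executed after migration to another UAV; $T^{\mathrm{wait}}_{b',k,i'}(t)$ represents the time task $q_{b',k,i'}$ has waited in the queue from its arrival to time $t$; $T^{\mathrm{tran}}_{b,b'}$ represents the task transfer time between the UAVs; $T^{\mathrm{comp}}_{b',k,i'}(t)$ represents the computation delay for UAV $b'$ to complete task $q_{b',k,i'}$; $r_{b,b'}$ is the rate at which UAV $b$ transmits the task to UAV $b'$; $B^{\mathrm{up}}$, $P$ and $H$ are the channel bandwidth, transmit power and channel gain between the UAVs, and $\sigma^2$ represents the noise power; $C_{b',k,i'}$ represents the CPU cycles UAV $b'$ needs to complete the first $i'$ queued tasks; and $f_{b',k}(t)$ represents the CPU cycles that UAV $b'$ allocates to queue $k$ at time $t$.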
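The migration delay can be sketched in the same style. The sketch below assumes the transmission rate follows the standard Shannon capacity formula (bandwidth times log2 of one plus the signal-to-noise ratio), which is consistent with the listed quantities (bandwidth, power, channel gain, noise power) but is an assumption here; all names and values are hypothetical.

```python
import math


def link_rate(bandwidth_hz, tx_power, channel_gain, noise_power):
    """Assumed Shannon-style rate of the UAV-to-UAV link, in bit/s."""
    return bandwidth_hz * math.log2(1 + tx_power * channel_gain / noise_power)


def migration_delay(wait_time, data_bits, rate_bps,
                    cycles_ahead, task_cycles, allocated_hz):
    """Waiting time + transfer time + computation time on the target UAV.

    cycles_ahead -- CPU cycles of tasks already queued ahead on the target UAV
    task_cycles  -- CPU cycles of the migrated task itself
    allocated_hz -- CPU cycles/s the target UAV allocates to this queue
    """
    transfer_time = data_bits / rate_bps
    compute_time = (cycles_ahead + task_cycles) / allocated_hz
    return wait_time + transfer_time + compute_time


# Illustrative numbers: 1 MHz link with SNR 3, giving 2 Mbit/s.
rate = link_rate(bandwidth_hz=1e6, tx_power=3.0, channel_gain=1.0, noise_power=1.0)
delay = migration_delay(wait_time=0.1, data_bits=4e6, rate_bps=rate,
                        cycles_ahead=1e9, task_cycles=5e8, allocated_hz=1e9)
```

Comparing this value with the local delay of the same task gives a simple intuition for when migration to a lightly loaded neighbour pays off: the transfer time must be smaller than the computation time saved.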

The objective is to minimize the system delay over all tasks on all UAVs, where $B$ represents the total number of UAVs and $N$ represents the total number of tasks.

In this embodiment, in the step of converting the offloading decision problem into a Markov decision process, each UAV is regarded as an agent, and the state, action and reward are designed as follows:

(1) State

At time slot $t$, the local state $s_b(t)$ of UAV $b$ is defined as the task information in UAV $b$. It consists of the normalized number of tasks $n_b(t)$ in UAV $b$, the normalized queue load $l_b(t)$ of UAV $b$ and the normalized urgency $u_b(t)$ of the backlog tasks of UAV $b$, and is written $s_b(t) = \{n_b(t), l_b(t), u_b(t)\}$.

(2) Action

To adapt to dynamic task demands, each UAV needs to adjust the computing resource allocation of each queue, and a heavily loaded UAV can migrate tasks to an adjacent lightly loaded UAV to achieve load balancing. The action $a_b(t)$ of UAV $b$ includes the computing resource allocation $\alpha_b(t)$ and the task migration decision $\beta_b(t)$ (the proportion of queue load to be migrated at time $t$), i.e. $a_b(t) = \{\alpha_b(t), \beta_b(t)\}$;

(3) Reward

The goal is to minimize the total time consumption while avoiding task timeouts. Thus, for UAV $b$, the instant reward $r_b(t)$ is defined in terms of the time remaining between task completion and task timeout, weighted by priority. Here $k$ represents the priority and $w_k$ the corresponding priority weight: the higher the priority, the larger $w_k$. If a task is completed in time the system obtains a reward, and the shorter the time taken to process the task, the larger the reward; otherwise there is no reward. The goal of the overall system is to maximize the total reward of all UAVs.

The total reward of all UAVs, $R(t)$, is:

$$R(t) = \sum_{b=1}^{B} r_b(t).$$
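The state, action and reward design can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: the softmax and sigmoid squashing of raw actor outputs, the helper names, and in particular the reward form (priority weight times remaining time, zero on timeout), which is only one reading of the verbal description since the exact reward expression is not available in the text.

```python
import math


def decode_action(raw):
    """Map raw actor outputs to an action.

    The first K values become per-queue CPU shares (softmax, sums to 1);
    the last value becomes the migrated load ratio (sigmoid, in (0, 1)).
    """
    *logits, mig = raw
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]  # subtract max for stability
    z = sum(exps)
    shares = [e / z for e in exps]
    migrate_ratio = 1.0 / (1.0 + math.exp(-mig))
    return shares, migrate_ratio


def instant_reward(completed, weights):
    """Assumed reward: weight * remaining time, or 0 if the task timed out.

    completed -- list of (priority k, total delay, max tolerated delay)
    weights   -- priority weights, larger for higher priority
    """
    return sum(weights[k] * max(t_max - delay, 0.0)
               for k, delay, t_max in completed)


def total_reward(per_uav_rewards):
    """Sum of the instant rewards of all UAVs."""
    return sum(per_uav_rewards)


shares, ratio = decode_action([0.0, 0.0, 0.0])
r_b = instant_reward([(0, 0.3, 0.5), (1, 2.5, 2.0)], weights=[2.0, 1.0])
r_all = total_reward([r_b, 0.0])
```

In the example the first task finishes 0.2 s before its deadline and earns a weighted reward, while the second task exceeds its deadline and earns nothing.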

In the multi-UAV cooperative system, computing tasks can be executed on the local UAV or on an adjacent UAV, so all UAVs need to coordinate their actions to make full use of the distributed computing resources, that is, to maximize the return of the distributed cooperative system. The key issue is therefore to find an optimal policy that maximizes the long-term return.

The offloading decision problem is thus converted into solving the following expression:

$$\pi^{*} = \arg\max_{\pi} V^{\pi}(s_0), \qquad V^{\pi}(s_0) = \mathbb{E}\!\left[\sum_{t} \gamma^{t} R(t) \;\Big|\; s = s_0\right];$$

where $\gamma$ represents the discount factor, $V$ represents the value function, $\pi^{*}$ represents the optimal policy, $\mathbb{E}$ represents the expectation over actions taken in any state, $s$ represents the state variable, and $s_0$ represents the initial state.
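The long-term return that the optimal policy maximizes is, in the usual reinforcement-learning sense, a discounted sum of per-slot rewards. The following sketch shows that computation for one finite trajectory; the function name and the sample values are illustrative, and the finite horizon is an assumption.

```python
def discounted_return(rewards, gamma):
    """Discounted sum of rewards: r_0 + gamma * r_1 + gamma^2 * r_2 + ..."""
    g = 0.0
    # Accumulate from the last slot backwards so each step is one multiply-add.
    for r in reversed(rewards):
        g = r + gamma * g
    return g


# Three time slots with total reward 1.0 each and discount factor 0.5.
ret = discounted_return([1.0, 1.0, 1.0], gamma=0.5)
```

A discount factor close to 1 makes the agents value delay reductions in future time slots almost as much as immediate ones, while a small discount factor makes the policy short-sighted.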

In addition, in this embodiment, in the step of solving the Markov decision process by the multi-agent deep deterministic policy gradient algorithm, the actor network on each UAV learns the globally optimal cooperative task offloading strategy, which maps the system state, comprising the local state $s_b(t)$ and the global state $g(t)$, to the action $a_b(t)$;

The critic network of each UAV is deployed on the centralized user plane function and is used to estimate the action-value function, evaluating the system reward after an action is taken in a state. In the offline training stage, the error between the output of the critic network and the actual reward is propagated back and used to update the parameters of the actor network;

An LSTM network is used for state estimation: the history state sequence $\{s_b(t-1), s_b(t-2), \dots\}$ is used to estimate the current state $\hat{s}_b(t)$, where $s_b(t-1)$ represents the state at time $t-1$. The LSTM network consists of a forget gate, an input gate and an output gate, and the state estimation process is expressed as:

$$h_b(t) = f_{\mathrm{LSTM}}\big(s_b(t-1),\, h_b(t-1);\, \theta\big);$$

where $h_b(t)$ represents the hidden state at time $t$, specifically the time-dependent features encoded from the history state sequence $\{s_b(t-1), s_b(t-2), \dots\}$, with $s_b(t-2)$ representing the state at time $t-2$; $f_{\mathrm{LSTM}}$ is the LSTM encoding function; $s_b(t-1)$ represents the UAV state at time $t-1$; $h_b(t-1)$ represents the hidden state at time $t-1$; and $\theta$ denotes the network parameters;

In the offline phase, the LSTM network is trained on the collected historical trajectories, and the network parameters $\theta$ are updated according to the deviation between the estimate $\hat{s}_b(t)$ and the actual state $s_b(t)$ accumulated over the time slots, where $T$ represents the total number of time slots;

In the operation phase, $s_b(t)$ is estimated from the history state sequence.
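To make the recurrence concrete, here is a minimal, untrained LSTM cell in pure Python that encodes a history of local states into a hidden state. It is a sketch under several assumptions: biases are omitted, weights are random, the readout layer that would map the hidden state to a state estimate is not shown, and the mean squared error is used as one plausible training loss because the exact expression is not given in the text. All class and function names are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(w, v):
    return [sum(wi * vi for wi, vi in zip(row, v)) for row in w]


class LSTMCell:
    """Single LSTM cell with forget, input and output gates (no biases)."""

    def __init__(self, input_dim, hidden_dim, seed=0):
        rng = random.Random(seed)
        size = input_dim + hidden_dim

        def mat():
            return [[rng.uniform(-0.1, 0.1) for _ in range(size)]
                    for _ in range(hidden_dim)]

        # One weight matrix per gate, plus one for the candidate cell update.
        self.w_f, self.w_i, self.w_o, self.w_c = mat(), mat(), mat(), mat()
        self.hidden_dim = hidden_dim

    def step(self, x, h, c):
        z = list(x) + list(h)
        f = [sigmoid(v) for v in matvec(self.w_f, z)]    # forget gate
        i = [sigmoid(v) for v in matvec(self.w_i, z)]    # input gate
        o = [sigmoid(v) for v in matvec(self.w_o, z)]    # output gate
        g = [math.tanh(v) for v in matvec(self.w_c, z)]  # candidate update
        c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
        h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
        return h_new, c_new


def encode_history(cell, history):
    """Run the cell over past states (oldest first); return the hidden state."""
    h = [0.0] * cell.hidden_dim
    c = [0.0] * cell.hidden_dim
    for s in history:
        h, c = cell.step(s, h, c)
    return h


def mse(estimates, targets):
    """Mean squared error between estimated and actual state vectors."""
    errors = [(e - t) ** 2
              for est, tgt in zip(estimates, targets)
              for e, t in zip(est, tgt)]
    return sum(errors) / len(errors)


# History of normalized local states (n_b, l_b, u_b) for two past slots.
history = [[0.2, 0.1, 0.5], [0.3, 0.2, 0.4]]
cell = LSTMCell(input_dim=3, hidden_dim=4)
hidden = encode_history(cell, history)
```

In practice a deep-learning framework would supply the LSTM and the gradient-based parameter update; the hand-written cell only shows how each new hidden state depends on the previous state and the previous hidden state.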

In addition, in this embodiment, in the multi-agent deep deterministic policy gradient algorithm, the actor network and the critic network are trained as follows:

Consider the complete information process in time slot $t$. Each agent makes a decision according to the state $\hat{s}_b(t)$ estimated from the historical trajectory and the global state $g(t)$. The offloading decision function in the actor network maps the state $\hat{s}_b(t)$ to the action $a_b(t)$, and each UAV then performs computing resource allocation and task migration according to the offloading decision. At the end of the time slot, the critic network evaluates the current offloading decision by computing the reward obtained for taking action $a_b(t)$ in state $\hat{s}_b(t)$, and its parameters are then updated by minimizing the loss function $L(\phi)$, which is the deviation between the actual reward value and the estimated value after the action is taken. Meanwhile, the communication model updates the global state to $g(t+1)$ for the next decision.

In the loss function $L(\phi)$, $y$ represents the actual reward value, $Q$ represents the value estimation function of the critic network, and $\phi$ represents the network parameters of the critic network.
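The critic update can be sketched as follows. The text only states that the loss is the deviation between the actual reward value and the critic's estimate; the squared-error form and the one-step bootstrapped target used below are the conventional choices in deep deterministic policy gradient methods and are assumptions here, as are the function names and numbers.

```python
def td_target(reward, gamma, next_q):
    """Assumed target value y: reward plus discounted next-state estimate."""
    return reward + gamma * next_q


def critic_loss(targets, estimates):
    """Mean squared deviation between targets y and critic estimates Q."""
    return sum((y - q) ** 2 for y, q in zip(targets, estimates)) / len(targets)


# Two sampled transitions from a replay buffer (illustrative numbers).
y1 = td_target(reward=1.0, gamma=0.5, next_q=2.0)  # 1.0 + 0.5 * 2.0
y2 = td_target(reward=1.0, gamma=0.5, next_q=0.0)
loss = critic_loss([y1, y2], [1.0, 1.0])
```

Minimizing this loss with respect to the critic parameters drives the estimates toward the observed targets; the resulting critic is then used to guide the actor update during offline training.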

In summary, in the distributed multi-UAV cooperative task offloading method provided by the invention, UAVs carrying edge servers are flexibly deployed as aerial mobile edge nodes that integrate inspection, computation and storage. The time-varying characteristics of task arrivals and the computing loads of multiple independent edge nodes are considered together, and a distributed multi-UAV cooperative system model is built on this basis. Arriving tasks are divided into different priorities according to their tolerable delay, and an offloading decision problem aimed at minimizing system delay is constructed. The problem is then converted into a Markov decision process and solved by a multi-agent deep deterministic policy gradient algorithm. A global state sharing model based on a variational recurrent neural network (variational RNN) is designed as the communication model, which can significantly reduce the communication overhead among UAVs. A globally optimal cooperative task offloading strategy among the UAVs is thereby obtained: each UAV makes decisions according to its local state and the shared global information, tasks are migrated among the UAVs, task processing delay is effectively reduced, and resource utilization is improved. The invention can adaptively adjust the offloading strategy according to the urgency and load state of the tasks on the UAVs, thereby minimizing the overall task processing delay of the system, achieving load balancing and improving online inspection efficiency.

In the description of the present specification, a description referring to terms "one embodiment," "some embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiments or examples. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.

While embodiments of the present invention have been shown and described, it will be understood by those of ordinary skill in the art that: many changes, modifications, substitutions and variations may be made to the embodiments without departing from the spirit and principles of the invention, the scope of which is defined by the claims and their equivalents.

Claims

1. The distributed multi-unmanned aerial vehicle cooperative task unloading method is characterized by comprising the following steps of:

the globally optimal cooperative task offloading strategy is used to instruct the UAV to offload the load tasks of the current time slot; the globally optimal cooperative task offloading strategy is obtained based on backlog information of the task queues on the UAV; the communication model is a global state sharing model based on a variational recurrent neural network, which extracts and encodes latent information in the high-dimensional global state and is deployed in a centralized user plane function;

The distributed multi-UAV cooperative task offloading method is applied to an online inspection system for power transmission lines; the system comprises a line layer and a UAV layer, the line layer being the power transmission line electrical equipment to be monitored and the UAV layer being a group of UAVs carrying MEC servers; each UAV, equipped with a computing unit and a communication antenna, hovers in a preset area and serves as a mobile edge node providing computing and communication services, and the UAVs are connected through wireless backhaul links;

The task model divides the computing tasks arriving at the UAV into $K$ priorities according to task urgency; $q_{b,k,i}$ denotes the $i$-th task in queue $k$ of UAV $b$, $D_{b,k,i}$ represents the amount of task data, $c_{b,k,i}$ represents the CPU cycles required to complete the task, and $T^{\max}_{b,k,i}$ represents the maximum tolerated delay of the task; the following variables are defined:

$N_m$ denotes the maximum number of tasks in a queue, and $L_m$ and $U_m$ denote the maximum queue load and the maximum urgency of backlog tasks, respectively; normalizing gives the normalized number of tasks $n_b(t)$ in UAV $b$, the normalized queue load $l_b(t)$ of UAV $b$ and the normalized urgency $u_b(t)$ of the backlog tasks of UAV $b$, where $n_b(t) = N_b(t)/N_m$, $l_b(t) = L_b(t)/L_m$ and $u_b(t) = U_b(t)/U_m$;

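local computation on the UAV satisfies the following expression:

$$T^{\mathrm{loc}}_{b,k,i}(t) = T^{\mathrm{wait}}_{b,k,i}(t) + T^{\mathrm{comp}}_{b,k,i}(t), \qquad T^{\mathrm{comp}}_{b,k,i}(t) = \frac{C_{b,k,i}}{f_{b,k}(t)}, \qquad C_{b,k,i} = \sum_{j=1}^{i} c_{b,k,j};$$

where $T^{\mathrm{loc}}_{b,k,i}(t)$ represents the total delay of local computation on the UAV; $T^{\mathrm{wait}}_{b,k,i}(t)$ represents the time task $q_{b,k,i}$ has waited in the queue from its arrival to time $t$; $T^{\mathrm{comp}}_{b,k,i}(t)$ represents the computation delay for the CPU cycles UAV $b$ needs to complete the first $i$ queued tasks; $C_{b,k,i}$ represents the number of CPU cycles UAV $b$ needs to complete the first $i$ queued tasks; $j$ indexes the $j$-th task; $f_{b,k}(t)$ represents the CPU cycles that UAV $b$ allocates to queue $k$ at time $t$, with $f_{b,k}(t) = \alpha_{b,k}(t) F_b$; $F_b$ represents the total computing resources of UAV $b$; and $\alpha_{b,k}(t)$ represents the proportion of CPU cycles allocated to queue $k$ at time $t$;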

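computation executed after migration to another UAV satisfies the following expression:

$$T^{\mathrm{mig}}_{b,k,i}(t) = T^{\mathrm{wait}}_{b',k,i'}(t) + T^{\mathrm{tran}}_{b,b'} + T^{\mathrm{comp}}_{b',k,i'}(t), \qquad T^{\mathrm{tran}}_{b,b'} = \frac{D_{b,k,i}}{r_{b,b'}}, \qquad r_{b,b'} = B^{\mathrm{up}} \log_2\!\left(1 + \frac{P H}{\sigma^2}\right), \qquad T^{\mathrm{comp}}_{b',k,i'}(t) = \frac{C_{b',k,i'}}{f_{b',k}(t)};$$

where task $q_{b,k,i}$ is migrated to a lightly loaded UAV $b'$ for execution and becomes the $i'$-th task in queue $k$ of UAV $b'$, denoted $q_{b',k,i'}$; $T^{\mathrm{mig}}_{b,k,i}(t)$ represents the total delay of computation executed after migration to another UAV; $T^{\mathrm{wait}}_{b',k,i'}(t)$ represents the time task $q_{b',k,i'}$ has waited in the queue from its arrival to time $t$; $T^{\mathrm{tran}}_{b,b'}$ represents the task transfer time between the UAVs; $T^{\mathrm{comp}}_{b',k,i'}(t)$ represents the computation delay for UAV $b'$ to complete task $q_{b',k,i'}$; $r_{b,b'}$ is the rate at which UAV $b$ transmits the task to UAV $b'$; $B^{\mathrm{up}}$, $P$ and $H$ are the channel bandwidth, transmit power and channel gain between the UAVs, and $\sigma^2$ represents the noise power; $C_{b',k,i'}$ represents the CPU cycles UAV $b'$ needs to complete the first $i'$ queued tasks; and $f_{b',k}(t)$ represents the CPU cycles that UAV $b'$ allocates to queue $k$ at time $t$;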

the objective is to minimize the system delay over all tasks on all UAVs, where $B$ represents the total number of UAVs and $N$ represents the total number of tasks;

In the step of converting the offloading decision problem into a Markov decision process, each UAV is regarded as an agent, and the state, action and reward are designed as follows:

(1) State:

At time slot $t$, the local state $s_b(t)$ of UAV $b$ is defined as the task information in UAV $b$, consisting of the normalized number of tasks $n_b(t)$ in UAV $b$, the normalized queue load $l_b(t)$ of UAV $b$ and the normalized urgency $u_b(t)$ of the backlog tasks of UAV $b$, and is written $s_b(t) = \{n_b(t), l_b(t), u_b(t)\}$;

(2) Action:

The action $a_b(t)$ of UAV $b$ includes the computing resource allocation $\alpha_b(t)$ and the task migration decision $\beta_b(t)$, i.e. $a_b(t) = \{\alpha_b(t), \beta_b(t)\}$;

(3) Reward:

for UAV $b$, the instant reward $r_b(t)$ is defined in terms of the time remaining between task completion and task timeout, weighted by the priority weight $w_k$;

then the total reward of all UAVs, $R(t)$, is $R(t) = \sum_{b=1}^{B} r_b(t)$, and the offloading decision problem is converted into solving:

$$\pi^{*} = \arg\max_{\pi} V^{\pi}(s_0), \qquad V^{\pi}(s_0) = \mathbb{E}\!\left[\sum_{t} \gamma^{t} R(t) \;\Big|\; s = s_0\right];$$

where $\gamma$ represents the discount factor, $V$ represents the value function, $\pi^{*}$ represents the optimal policy, $\mathbb{E}$ represents the expectation over actions taken in any state, $s$ represents the state variable, and $s_0$ represents the initial state;

In the step of solving the Markov decision process by the multi-agent deep deterministic policy gradient algorithm, the actor network on each UAV learns the globally optimal cooperative task offloading strategy, which maps the system state, comprising the local state $s_b(t)$ and the global state $g(t)$, to the action $a_b(t)$;

the state estimation process is expressed as:

$$h_b(t) = f_{\mathrm{LSTM}}\big(s_b(t-1),\, h_b(t-1);\, \theta\big);$$

where $h_b(t)$ represents the hidden state at time $t$, specifically the time-dependent features encoded from the history state sequence $\{s_b(t-1), s_b(t-2), \dots\}$, with $s_b(t-2)$ representing the state at time $t-2$; $f_{\mathrm{LSTM}}$ is the LSTM encoding function; $s_b(t-1)$ represents the UAV state at time $t-1$; $h_b(t-1)$ represents the hidden state at time $t-1$; and $\theta$ denotes the network parameters;

the network parameters $\theta$ are updated according to the deviation between the estimate $\hat{s}_b(t)$ and the actual state $s_b(t)$ accumulated over the time slots, where $T$ represents the total number of time slots;

in the operation phase, $s_b(t)$ is estimated from the history state sequence.