CN114513814A - Edge network computing resource dynamic optimization method based on unmanned aerial vehicle auxiliary node - Google Patents

Edge network computing resource dynamic optimization method based on unmanned aerial vehicle auxiliary node

Info

Publication number
CN114513814A
Authority
CN
China
Prior art keywords
unmanned aerial vehicle
user
network
task
Prior art date
Legal status
Pending
Application number
CN202210079544.6A
Other languages
Chinese (zh)
Inventor
鲍宁海
高鹏雷
陈奎
Current Assignee
Chongqing University of Post and Telecommunications
Original Assignee
Chongqing University of Post and Telecommunications
Priority date
Filing date
Publication date
Application filed by Chongqing University of Post and Telecommunications filed Critical Chongqing University of Post and Telecommunications
Priority to CN202210079544.6A priority Critical patent/CN114513814A/en
Publication of CN114513814A publication Critical patent/CN114513814A/en
Pending legal-status Critical Current

Classifications

    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04W - WIRELESS COMMUNICATION NETWORKS
    • H04W 28/00 - Network traffic management; Network resource management
    • H04W 28/02 - Traffic management, e.g. flow control or congestion control
    • H04W 28/08 - Load balancing or load distribution
    • H04W 28/09 - Management thereof
    • H04W 28/0925 - Management thereof using policies
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04W - WIRELESS COMMUNICATION NETWORKS
    • H04W 24/00 - Supervisory, monitoring or testing arrangements
    • H04W 24/02 - Arrangements for optimising operational condition
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04W - WIRELESS COMMUNICATION NETWORKS
    • H04W 24/00 - Supervisory, monitoring or testing arrangements
    • H04W 24/06 - Testing, supervising or monitoring using simulated traffic
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04W - WIRELESS COMMUNICATION NETWORKS
    • H04W 28/00 - Network traffic management; Network resource management
    • H04W 28/02 - Traffic management, e.g. flow control or congestion control
    • H04W 28/0226 - Traffic management based on location or mobility
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04W - WIRELESS COMMUNICATION NETWORKS
    • H04W 28/00 - Network traffic management; Network resource management
    • H04W 28/02 - Traffic management, e.g. flow control or congestion control
    • H04W 28/08 - Load balancing or load distribution
    • H04W 28/09 - Management thereof
    • H04W 28/0958 - Management thereof based on metrics or performance parameters
    • H04W 28/0967 - Quality of Service [QoS] parameters
    • H04W 28/0975 - Quality of Service [QoS] parameters for reducing delays
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04W - WIRELESS COMMUNICATION NETWORKS
    • H04W 4/00 - Services specially adapted for wireless communication networks; Facilities therefor
    • H04W 4/30 - Services specially adapted for particular environments, situations or purposes
    • H04W 4/40 - Services specially adapted for particular environments, situations or purposes for vehicles, e.g. vehicle-to-pedestrians [V2P]

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Quality & Reliability (AREA)
  • Mobile Radio Communication Systems (AREA)

Abstract

The invention discloses a method for dynamically optimizing edge network computing resources based on an unmanned aerial vehicle auxiliary node, and belongs to the technical field of communication. Aiming at the problems of insufficient server computing resources and deteriorated task offloading quality caused by bursty local user traffic in an edge network cell, a computing resource dynamic optimization method based on adaptive cruising of an unmanned aerial vehicle auxiliary node is provided. According to the position distribution and task offloading requirements of ground users, a deep reinforcement learning method is adopted to dynamically plan the cruising trajectory of the unmanned aerial vehicle, and a task offloading scheduling strategy maximizes the utilization of the server resources of the unmanned aerial vehicle node and the base station node during cruising, so that the task interruption rate of local users is effectively reduced and the average task offloading delay is reduced.

Description

Edge network computing resource dynamic optimization method based on unmanned aerial vehicle auxiliary node
Technical Field
The invention belongs to the technical field of communication, and particularly relates to a dynamic optimization method for computing resources of an edge network based on an unmanned aerial vehicle auxiliary node.
Background
With the popularization and development of mobile networks, novel applications such as augmented reality, virtual reality and automatic driving are constantly emerging and have greatly enriched people's daily life. However, these applications are generally delay-sensitive and consume a large amount of computing resources, and it is difficult for a mobile terminal to process them quickly and efficiently. Mobile edge computing can provide the computing resources required for task offloading close to the user by sinking cloud resources to the edge network, effectively shortening the transmission delay between the user and the cloud server.
However, rapid changes in the distribution of terrestrial users and random traffic bursts of local area users may cause huge pressure on the fixed server resources of the edge network, resulting in situations of low utilization of computing resources and deteriorated user service experience. Therefore, the low-altitude unmanned aerial vehicle is used as an auxiliary node of the edge computing network, flexible resource supplement is provided for ground nodes, and the method becomes an important mode for future network construction and development.
The invention provides a dynamic computing resource optimization method based on unmanned aerial vehicle auxiliary node adaptive cruise, aiming at the problems of server computing resource shortage and task unloading quality deterioration caused by local user traffic burst in an edge network cell, so that the task interruption rate of local users is effectively reduced, and the average task unloading delay is reduced.
Disclosure of Invention
The present invention is directed to solving the above problems of the prior art. An edge network computing resource dynamic optimization method based on unmanned aerial vehicle auxiliary nodes is provided. The technical scheme of the invention is as follows:
an edge network computing resource dynamic optimization method based on unmanned aerial vehicle auxiliary nodes comprises the following steps:
101. constructing a discrete time-state model according to a Markov decision process, including discretizing the cruising time of the unmanned aerial vehicle into time slots and setting a time slot variable k, a ground-air network state vector s_k, an unmanned aerial vehicle three-dimensional action vector a_k and an unmanned aerial vehicle action reward function r_k, wherein s_k, a_k, r_k transition and change correspondingly as the time slot number k increases, and initializing the time slot variable k = 0;
102. taking the unmanned aerial vehicle controller as an agent and constructing a deep reinforcement learning model on the basis of the idea of the twin-delayed deep deterministic policy gradient (TD3) algorithm, including establishing a system environment collector, an unmanned aerial vehicle action strategy network π, an unmanned aerial vehicle state-action value network Q, a task scheduling strategy generator, an unmanned aerial vehicle action reward generator, an experience sample storage area E and a random sample set Mini-Batch;
103. letting k = k + 1; if the three-dimensional coordinate position of the unmanned aerial vehicle does not change for n consecutive time slots, jumping to step 106; otherwise determining the user object set I_j of unmanned aerial vehicle j according to the effective coverage range of unmanned aerial vehicle j and the user object set I_o = I - I_j of base station o, wherein I represents the whole user object set; obtaining through the task scheduling strategy generator the task offloading decision variable sets {b_{i,j}^k | i ∈ I_j} and {b_{i,o}^k | i ∈ I_o} for I_j and I_o, and jumping to step 104;
104. executing the task offloading requests of the users i according to {b_{i,j}^k} and {b_{i,o}^k}, obtaining the corresponding reward value r_k through the unmanned aerial vehicle action reward generator, obtaining the k-slot unmanned aerial vehicle three-dimensional action vector a_k through the unmanned aerial vehicle action strategy network π, calculating s_{k+1} from the k-slot ground-air network state vector s_k and the action vector a_k, and storing [s_k, a_k, r_k, s_{k+1}] as an experience sample into the experience sample storage area E;
105. randomly sampling from an experience sample storage area E to obtain a Mini-Batch sample set, respectively importing the Mini-Batch sample set into an action strategy network pi and a state-action value network Q for training, and jumping to step 103;
106. the algorithm ends.
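Steps 101-106 form a slot-by-slot control loop around the agent. The following Python sketch illustrates that loop under simplified, assumed interfaces; the env, scheduler, agent and replay objects and their method names are hypothetical placeholders and are not part of the invention:

```python
# Minimal sketch of the slot-by-slot loop of steps 101-106.
# All object interfaces used here are hypothetical placeholders for illustration.
def cruise_optimization(env, scheduler, agent, replay, max_slots=10_000, n_idle=10):
    s_k = env.reset()                              # step 101: k = 0, initial ground-air state
    idle_slots = 0
    for k in range(1, max_slots + 1):              # step 103: k = k + 1
        if idle_slots >= n_idle:                   # drone position unchanged for n slots
            break                                  # step 106: algorithm ends
        I_j, I_o = scheduler.split_users(env.drone_position(), env.users())
        b_j, b_o = scheduler.offload_decisions(I_j, I_o)      # task-offloading decision sets
        r_k = env.execute_offloading(b_j, b_o)                # step 104: reward value r_k
        a_k = agent.select_action(s_k)                        # action from policy network pi
        s_next, moved = env.apply_action(a_k)                 # ground-air state s_{k+1}
        replay.add(s_k, a_k, r_k, s_next)                     # experience sample into E
        agent.train(replay.sample_minibatch())                # step 105: Mini-Batch training
        idle_slots = 0 if moved else idle_slots + 1
        s_k = s_next
```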
Further, in step 101 the discrete time-state model is constructed according to a Markov decision process; the k-slot ground-air network state vector s_k and the unmanned aerial vehicle three-dimensional action vector a_k are given by formulas (1) and (2), and the unmanned aerial vehicle action reward function r_k is given by formula (3), which is available only as an image in the original and is expressed in terms of the weighting factor ω, the time-slot size Δt, the per-user average unit task delays D̄_i^k and the indicators η_i^k:

s_k = { q_j^k, u_i^k : i ∈ I }    (1)

a_k = { φ_j^k, h_j^k }    (2)

In formula (1), q_j^k represents the three-dimensional coordinate position of unmanned aerial vehicle j in time slot k and u_i^k represents the two-dimensional coordinate position of user i in time slot k; in formula (2), φ_j^k represents the horizontal movement direction of unmanned aerial vehicle j in time slot k and h_j^k represents the vertical movement distance of unmanned aerial vehicle j in time slot k. In formula (3), ω represents the weighting factor of the unmanned aerial vehicle action reward function, ω ∈ (0,1), and Δt represents the time-slot size; D̄_i^k represents the average unit task delay of user i in time slot k and is given by formula (4), which is likewise available only as an image and is expressed in terms of the connection states a_{i,j}^k, a_{i,o}^k and the offloaded task amounts b_{i,j}^k, b_{i,o}^k; η_i^k indicates whether the average unit task delay of user i in time slot k satisfies the average unit task tolerance delay τ_i, as shown in formula (5):

η_i^k = 1 if D̄_i^k ≤ τ_i, and η_i^k = 0 otherwise    (5)

In formula (4), a_{i,j}^k represents the connection state of user i and unmanned aerial vehicle j: if user i offloads its task to unmanned aerial vehicle j for execution in time slot k then a_{i,j}^k = 1, otherwise a_{i,j}^k = 0; a_{i,o}^k represents the connection state of user i and base station o: if user i offloads its task to base station o for execution in time slot k then a_{i,o}^k = 1, otherwise a_{i,o}^k = 0; user i can be connected to at most one unmanned aerial vehicle or base station in time slot k, i.e. a_{i,j}^k + a_{i,o}^k ≤ 1; b_{i,j}^k represents the amount of tasks user i offloads to unmanned aerial vehicle j in time slot k, b_{i,o}^k represents the amount of tasks user i offloads to base station o in time slot k, and τ_i represents the average unit task tolerance delay of user i.
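Because formula (3) survives only as an image, the reward can only be described qualitatively above. The sketch below shows one plausible instantiation for illustration, assuming the reward increases with the fraction of users whose average unit task delay meets τ_i and decreases with the delays normalized by Δt; the exact weighting in the patent may differ:

```python
import numpy as np

# Illustrative per-slot reward in the spirit of formulas (3)-(5).
# The weighted combination below is an assumption, not the patent's exact formula (3).
def slot_reward(avg_unit_delay, tau, omega=0.5, delta_t=1.0):
    avg_unit_delay = np.asarray(avg_unit_delay, dtype=float)   # \bar D_i^k for each user i
    tau = np.asarray(tau, dtype=float)                         # tolerance delay tau_i per user
    eta = (avg_unit_delay <= tau).astype(float)                # formula (5): indicator eta_i^k
    satisfied = eta.mean()                                     # fraction of users meeting tau_i
    norm_delay = (avg_unit_delay / delta_t).mean()             # delays normalized by slot length
    return omega * satisfied - (1.0 - omega) * norm_delay      # assumed weighted combination

# Example: three users, two of which meet their tolerance delay.
print(slot_reward([0.2, 0.5, 1.4], [0.4, 0.6, 1.0], omega=0.7, delta_t=1.0))
```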
Further, the deep reinforcement learning model constructed in step 102 on the basis of the twin-delayed deep deterministic policy gradient algorithm comprises a system environment collector, an unmanned aerial vehicle action strategy network π, an unmanned aerial vehicle state-action value network Q, a task scheduling strategy generator, an unmanned aerial vehicle action reward generator, an experience sample storage area E and a random sample set Mini-Batch, specifically:
the system environment collector collects, in the k-slot ground-air network, the two-dimensional coordinate positions u_i^k of the ground users, the user task offloading requests, the three-dimensional coordinate position q_j^k of the unmanned aerial vehicle and the remaining available computing resources of the unmanned aerial vehicle; the unmanned aerial vehicle action strategy network π generates the unmanned aerial vehicle three-dimensional action vector a_k under the k-slot ground-air network state s_k; the unmanned aerial vehicle state-action value network Q generates the action evaluation value q of executing the unmanned aerial vehicle three-dimensional action vector a_k under the k-slot ground-air network state s_k; the task scheduling strategy generator generates the k-slot user offloading strategy and obtains the task offloading decision variable sets {b_{i,j}^k} and {b_{i,o}^k}; the unmanned aerial vehicle action reward generator generates the action reward value r_k of unmanned aerial vehicle j after the offloading tasks of time slot k are completed; after the unmanned aerial vehicle executes action a_k, the ground-air network state transitions from s_k to s_{k+1}; the k-slot experience sample [s_k, a_k, r_k, s_{k+1}] is added to the experience sample storage area E; the random sample set Mini-Batch is formed by randomly extracting a fixed number of samples from the experience sample storage area E. Both the unmanned aerial vehicle action strategy network π and the unmanned aerial vehicle state-action value network Q are neural networks, each comprising several hidden layers, and each hidden layer comprises several neurons.
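The experience sample storage area E together with the random sample set Mini-Batch behaves like a standard replay buffer. A minimal sketch follows; the capacity and batch size are illustrative assumptions:

```python
import random
from collections import deque

# Minimal experience-sample storage area E with random Mini-Batch sampling.
class ReplayBuffer:
    def __init__(self, capacity=100_000):
        self.storage = deque(maxlen=capacity)          # experience sample storage area E

    def add(self, s_k, a_k, r_k, s_next):
        self.storage.append((s_k, a_k, r_k, s_next))   # store [s_k, a_k, r_k, s_{k+1}]

    def sample_minibatch(self, batch_size=128):
        population = list(self.storage)
        return random.sample(population, min(batch_size, len(population)))  # Mini-Batch

# Example usage with toy transitions.
buf = ReplayBuffer()
for k in range(5):
    buf.add([k, 0.0], [0.1], -0.2, [k + 1, 0.0])
print(len(buf.sample_minibatch(batch_size=3)))
```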
Further, the method by which the task scheduling strategy generator in step 103 decides the user task offloading variable sets {b_{i,j}^k} and {b_{i,o}^k} comprises the following steps (a sketch of this procedure follows the list):
1) adding the users i within the effective coverage range of unmanned aerial vehicle j to the service object set I_j of unmanned aerial vehicle j and setting a_{i,j}^k = 1 for i ∈ I_j; the service object set of base station o is I_o = I - I_j, with a_{i,o}^k = 1 for i ∈ I_o; arranging the users i in I_j and in I_o in descending order of a per-user metric (the ordering expression is available only as an image in the original);
2) according to the task offloading delay D_i^k of user i, calculating the task amount b_{i,j}^k that each user i ∈ I_j offloads to unmanned aerial vehicle j;
3) according to the task offloading delay D_i^k of user i, calculating the task amount b_{i,o}^k that each user i ∈ I_o offloads to base station o.
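A sketch of the user-splitting and ordering part of this procedure is given below. The circular horizontal coverage model and the use of the tolerance delay as the ordering key are assumptions for illustration, since the coverage test and the ordering metric are available only as images in the original:

```python
import math

# Split users into the drone's service set I_j and the base station's set I_o = I - I_j,
# then order each set in descending order of an assumed per-user metric (tau here).
def split_and_order(users, drone_xyz, coverage_radius):
    # users: list of dicts with keys "pos" (x, y) and "tau" -- an assumed, illustrative layout.
    dx, dy = drone_xyz[0], drone_xyz[1]
    I_j = [u for u in users
           if math.hypot(u["pos"][0] - dx, u["pos"][1] - dy) <= coverage_radius]
    I_o = [u for u in users if u not in I_j]
    key = lambda u: u["tau"]                      # assumed ordering metric
    return sorted(I_j, key=key, reverse=True), sorted(I_o, key=key, reverse=True)

users = [{"pos": (10, 5), "tau": 0.8}, {"pos": (120, 40), "tau": 0.3}, {"pos": (15, -2), "tau": 0.5}]
I_j, I_o = split_and_order(users, drone_xyz=(0, 0, 100), coverage_radius=50)
print(len(I_j), len(I_o))
```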
Further, the task amount b_{i,j}^k offloaded by user i to unmanned aerial vehicle j in step 2) is calculated as shown in formulas (6) and (7), which are available only as images in the original, wherein f_{i,j}^k denotes the computing resources allocated by unmanned aerial vehicle j to user i in time slot k, C_j represents the total amount of computing resources of unmanned aerial vehicle j, R_{i,j}^k represents the uplink transmission rate from user i to unmanned aerial vehicle j in time slot k, F represents the unit task size, and c_i represents the task complexity of user i.
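Because formulas (6) and (7) are images, the sketch below is only an illustrative guess at the mechanics: it assumes an equal-share allocation of the drone's resources C_j and bounds the offloadable amount by what fits into one slot. The patent's actual formulas may differ:

```python
# Illustrative computation of the task amount b_{i,j}^k a user can offload within one slot.
# Both the equal-share allocation and the slot-length bound are assumptions.
def task_amount_to_drone(num_served_users, C_j, R_ij, F, c_i, delta_t):
    f_ij = C_j / max(num_served_users, 1)          # assumed allocation of drone resources C_j
    per_unit_delay = F / R_ij + F * c_i / f_ij     # transmit + compute delay of one task unit
    return int(delta_t // per_unit_delay)          # units that fit into a slot of length delta_t

# Example: 4 served users, 8e9 cycles/s total, 20 Mbit/s uplink, 1 Mbit units, 100 cycles/bit.
print(task_amount_to_drone(4, C_j=8e9, R_ij=20e6, F=1e6, c_i=100, delta_t=1.0))
```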
Further, the task amount b_{i,o}^k offloaded by user i to base station o in step 3) is calculated as shown in formulas (8) and (9), which are available only as images in the original, wherein f_{i,o}^k denotes the computing resources allocated by base station o to user i in time slot k, C_o represents the total amount of computing resources of base station o, and R_{i,o}^k represents the uplink transmission rate from user i to base station o in time slot k.
Further, the task offloading delay of user i in steps 2) and 3) is shown in formula (10), and the task offloading delay constraint is shown in formula (11), which is available only as an image in the original and bounds the task offloading delay of user i:

D_i^k = D_i^{trans,k} + D_i^{comp,k}    (10)

In formula (10), because the size of the task computation result is much smaller than that of the task itself, only the uplink transmission delay of the user task offloading and the computation delay of the task are considered, and the downlink transmission delay of the task computation result is ignored; D_i^k represents the total task offloading delay of user i in time slot k, D_i^{trans,k} represents the transmission delay of the offloading task of user i in time slot k, as shown in formula (12), and D_i^{comp,k} represents the computation delay of the offloaded task, as shown in formula (13):

D_i^{trans,k} = a_{i,j}^k b_{i,j}^k F / R_{i,j}^k + a_{i,o}^k b_{i,o}^k F / R_{i,o}^k    (12)

D_i^{comp,k} = a_{i,j}^k b_{i,j}^k F c_i / f_{i,j}^k + a_{i,o}^k b_{i,o}^k F c_i / f_{i,o}^k    (13)

In formula (12), R_{i,j}^k and R_{i,o}^k represent the uplink transmission rates from user i to unmanned aerial vehicle j and to base station o in time slot k, as shown in formulas (14) and (15):

R_{i,j}^k = W log2( 1 + p_i g_{i,j}^k / σ² )    (14)

R_{i,o}^k = W log2( 1 + p_i g_{i,o}^k / σ² )    (15)

In formulas (14) and (15), W is the user channel bandwidth, p_i is the user transmit power, σ² is the noise power, and g_{i,j}^k and g_{i,o}^k represent the communication channel gains from user i to unmanned aerial vehicle j and to base station o in time slot k, respectively.
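A runnable numerical sketch of the delay model in formulas (10) and (12)-(15); all parameter values below are illustrative only:

```python
import math

# Shannon-type uplink rate (formulas (14)/(15)) and the transmission-plus-computation
# delay of an offloaded task amount (formulas (12), (13) and (10)).
def uplink_rate(W, p_i, gain, sigma2):
    return W * math.log2(1.0 + p_i * gain / sigma2)            # formulas (14)/(15)

def offload_delay(b_units, F, c_i, rate, f_alloc):
    d_trans = b_units * F / rate                               # formula (12): transmission delay
    d_comp = b_units * F * c_i / f_alloc                       # formula (13): computation delay
    return d_trans + d_comp                                    # formula (10): total delay

R = uplink_rate(W=1e6, p_i=0.1, gain=1e-5, sigma2=1e-9)        # about W * log2(1 + 1000)
print(R, offload_delay(b_units=5, F=1e6, c_i=100, rate=R, f_alloc=2e9))
```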
Further, in step 104 the k-slot unmanned aerial vehicle three-dimensional action vector a_k is obtained through the unmanned aerial vehicle action strategy network π, and s_{k+1} is calculated from the k-slot ground-air network state vector s_k and the action vector a_k, specifically:
the k-slot ground-air network state vector s_k = { q_j^k, u_i^k : i ∈ I } is input to the unmanned aerial vehicle action strategy network π, and the three-dimensional action vector a_k = { φ_j^k, h_j^k } of unmanned aerial vehicle j is obtained through forward propagation of the neurons in each layer of the π network; the position of unmanned aerial vehicle j in time slot k+1 is then obtained by moving the horizontal coordinates a distance L in the direction φ_j^k and the vertical coordinate by h_j^k, i.e. q_j^{k+1} = ( x_j^k + L cos φ_j^k, y_j^k + L sin φ_j^k, z_j^k + h_j^k ), where φ_j^k ∈ [0, 2π) and L is the horizontal moving distance of unmanned aerial vehicle j in time slot k.
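A small sketch of this state-transition step; the trigonometric decomposition of the horizontal step is an assumption consistent with the description of φ_j^k, h_j^k and L:

```python
import math

# Next drone position from horizontal direction phi, horizontal step L and vertical step h.
def next_drone_position(q_k, phi, h, L):
    x, y, z = q_k
    return (x + L * math.cos(phi), y + L * math.sin(phi), z + h)

print(next_drone_position((0.0, 0.0, 100.0), phi=math.pi / 4, h=-5.0, L=20.0))
```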
Further, in step 105 the Mini-Batch sample data set is obtained from the experience sample storage area E by random sampling, and the method for optimizing the state-action value network and the action strategy network is as follows:
to address the instability during learning of the state-action value network Q, which comprises Q_1(s_k, a_k; θ^{Q_1}) and Q_2(s_k, a_k; θ^{Q_2}), and of the action strategy network π(s_k; θ^π), the target network of Q_1(s_k, a_k; θ^{Q_1}) is defined as Q'_1(s_k, a_k; θ^{Q'_1}), the target network of Q_2(s_k, a_k; θ^{Q_2}) is defined as Q'_2(s_k, a_k; θ^{Q'_2}), and the target network of π(s_k; θ^π) is defined as π'(s_k; θ^{π'}).
The parameters θ^{Q_m} (m = 1, 2) of the state-action value networks Q_m are updated by the gradient descent method as shown in formula (16):

θ^{Q_m} ← θ^{Q_m} - μ^{Q_m} ∇_{θ^{Q_m}} L(θ^{Q_m})    (16)

where μ^{Q_m} is the learning rate of Q_m and θ^{Q_m} represents the network structure parameters of Q_m; the loss function L(θ^{Q_m}) is shown in formula (17):

L(θ^{Q_m}) = (1/|X|) Σ_{x_k ∈ X} [ r_k + γ min_{m'=1,2} Q'_{m'}(s_{k+1}, a'_{k+1}; θ^{Q'_{m'}}) - Q_m(s_k, a_k; θ^{Q_m}) ]²    (17)

where a'_{k+1} = a_{k+1} + ε, ε ~ clip( N(0, σ), -κ, κ ), clip(·) denotes the clipping function, N(0, σ) denotes Gaussian noise with mean 0 and variance σ, κ denotes the clipping parameter, γ denotes the discount factor, and X denotes the sample set randomly sampled from E, X = { x_k }, x_k = [ s_k, a_k, r_k, s_{k+1} ].
The network parameters θ^π of the action strategy network π(s_k; θ^π) are updated as shown in formula (18):

θ^π ← θ^π + μ^π ∇_{θ^π} J(θ^π)    (18)

where μ^π is the learning rate of π(s_k; θ^π) and θ^π represents the network structure parameters of π(s_k; θ^π); the policy gradient ∇_{θ^π} J(θ^π) of π(s_k; θ^π) is shown in formula (19):

∇_{θ^π} J(θ^π) = (1/|X|) Σ_{x_k ∈ X} ∇_a Q_1(s_k, a; θ^{Q_1}) |_{a = π(s_k; θ^π)} ∇_{θ^π} π(s_k; θ^π)    (19)

The network parameters θ^{Q'_m} and θ^{π'} of the target networks Q'_m(s_k, a_k; θ^{Q'_m}) and π'(s_k; θ^{π'}) are updated as shown in formulas (20) and (21), with update factor ρ ∈ (0, 1):

θ^{Q'_m} ← ρ θ^{Q_m} + (1 - ρ) θ^{Q'_m}    (20)

θ^{π'} ← ρ θ^π + (1 - ρ) θ^{π'}    (21)
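A compact PyTorch sketch of one update following formulas (16)-(21): clipped target-policy noise, twin-critic minimum target, mean-squared critic loss, deterministic policy gradient through Q_1, and soft target updates with factor ρ. The actor/critic modules, optimizers, batch layout and hyper-parameter values are assumptions for illustration and are not taken from the patent:

```python
import torch
import torch.nn.functional as F_nn

# One TD3 update over a Mini-Batch; actor/critics are assumed torch modules built elsewhere.
def td3_update(batch, actor, actor_t, critics, critics_t, critic_opts, actor_opt,
               gamma=0.99, sigma=0.2, kappa=0.5, rho=0.005, update_actor=True):
    s, a, r, s_next = (torch.as_tensor(x, dtype=torch.float32) for x in batch)
    with torch.no_grad():
        noise = torch.clamp(torch.randn_like(a) * sigma, -kappa, kappa)   # eps ~ clip(N(0,sigma))
        a_next = actor_t(s_next) + noise                                   # a'_{k+1} = a_{k+1} + eps
        target_q = torch.min(*(q(s_next, a_next) for q in critics_t))      # min over twin targets
        y = r.unsqueeze(-1) + gamma * target_q                              # TD target, formula (17)
    for q, opt in zip(critics, critic_opts):                               # formula (16): gradient step
        loss = F_nn.mse_loss(q(s, a), y)
        opt.zero_grad(); loss.backward(); opt.step()
    if update_actor:                                                       # formulas (18)-(19)
        actor_loss = -critics[0](s, actor(s)).mean()
        actor_opt.zero_grad(); actor_loss.backward(); actor_opt.step()
        pairs = [(actor, actor_t), (critics[0], critics_t[0]), (critics[1], critics_t[1])]
        for net, net_t in pairs:                                           # formulas (20)-(21)
            for p, p_t in zip(net.parameters(), net_t.parameters()):
                p_t.data.mul_(1.0 - rho).add_(rho * p.data)
```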
The invention has the following advantages and beneficial effects:
the invention discloses a dynamic optimization method for computing resources of an edge network based on unmanned aerial vehicle auxiliary nodes. The existing problem of task unloading of the edge network based on unmanned aerial vehicle assistance mostly focuses on reducing task unloading time delay of ground users through optimized deployment of unmanned aerial vehicle resources, but neglects the situation that local area user traffic is sudden possibly in an actual scene. The invention provides a dynamic optimization method of computing resources based on unmanned aerial vehicle auxiliary node adaptive cruise, aiming at the problems of shortage of edge server resources and deterioration of task unloading quality caused by burst of local traffic in a cell. According to the position distribution and task unloading requirements of ground users, a deep reinforcement learning method is adopted to dynamically plan the cruising track of the unmanned aerial vehicle, and the utilization rate of server resources of unmanned aerial vehicle nodes and base station nodes in the cruising process is maximized through a task unloading scheduling strategy, so that the task interruption rate of local users is effectively reduced, and the average task unloading delay is reduced.
Drawings
Fig. 1 is a flowchart of a method for dynamically optimizing edge network computing resources based on an auxiliary node of an unmanned aerial vehicle according to a preferred embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be described in detail and clearly with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the present invention.
The technical scheme for solving the technical problems is as follows:
the concepts and models involved in the present disclosure are as follows.
1. System model:
assuming that users in an edge network cell are randomly distributed, an edge server can provide task unloading service for the users in the cell through a cell base station. An unmanned aerial vehicle auxiliary edge node is configured in the cell, and task unloading service can be provided for users in the effective coverage range of the unmanned aerial vehicle auxiliary edge node. When local user traffic in a cell is sudden, the unmanned aerial vehicle node can optimize the distribution state of computing resources and task unloading scheduling through self-adaptive cruise, the task interruption rate of local users is reduced, and the average task unloading delay is reduced.
2. Other symbols used in the present invention are described below:
s_k: state vector
a_k: action vector
r_k: reward function
π(s_k; θ^π): unmanned aerial vehicle action strategy network
Q_1(s_k, a_k; θ^{Q_1}), Q_2(s_k, a_k; θ^{Q_2}): unmanned aerial vehicle state-action value networks
θ: neural network parameters
Δt: time slot size
D̄_i^k: average unit task delay of user i in time slot k
η_i^k: whether user i satisfies the average unit task tolerance delay in time slot k
b_{i,j}^k: task amount user i offloads to unmanned aerial vehicle j in time slot k
b_{i,o}^k: task amount user i offloads to base station o in time slot k
f_{i,j}^k: computing resources allocated by unmanned aerial vehicle j to user i in time slot k
f_{i,o}^k: computing resources allocated by base station o to user i in time slot k
φ_j^k: horizontal movement direction of the unmanned aerial vehicle in time slot k
h_j^k: vertical movement distance of the unmanned aerial vehicle in time slot k
a_{i,j}^k: connection state of user i with unmanned aerial vehicle j in time slot k
a_{i,o}^k: connection state of user i with base station o in time slot k
D_i^{trans,k}: transmission delay of the offloading task of user i
D_i^{comp,k}: computation delay of the offloading task of user i
C_j: total amount of computing resources of unmanned aerial vehicle j
C_o: total amount of computing resources of base station o
W: user channel bandwidth
p_i: transmit power of user i
σ²: noise power
g_{i,j}^k: communication channel gain from user i to unmanned aerial vehicle j in time slot k
g_{i,o}^k: communication channel gain from user i to base station o in time slot k
F: unit task size
c_i: task complexity of user i
The technical scheme of the invention is explained as follows:
1. Task offloading delay and its constraint
The task offloading delay D_i^k of user i is shown in formula (1), and the task delay constraint is shown in formula (2), which is available only as an image in the original and bounds the task offloading delay of user i:

D_i^k = D_i^{trans,k} + D_i^{comp,k}    (1)

In formula (1), D_i^{trans,k} represents the transmission delay of the offloading task of user i in time slot k, as shown in formula (3), and D_i^{comp,k} represents the computation delay of the offloading task of user i, as shown in formula (4):

D_i^{trans,k} = a_{i,j}^k b_{i,j}^k F / R_{i,j}^k + a_{i,o}^k b_{i,o}^k F / R_{i,o}^k    (3)

D_i^{comp,k} = a_{i,j}^k b_{i,j}^k F c_i / f_{i,j}^k + a_{i,o}^k b_{i,o}^k F c_i / f_{i,o}^k    (4)

In formula (3), F represents the unit task size and R_{i,j}^k, R_{i,o}^k represent the uplink transmission rates from user i to unmanned aerial vehicle j and to base station o in time slot k; in formula (4), c_i represents the task complexity of user i, f_{i,j}^k represents the computing resources allocated by unmanned aerial vehicle j to user i in time slot k, and f_{i,o}^k represents the computing resources allocated by base station o to user i in time slot k. R_{i,j}^k and R_{i,o}^k are shown in formulas (5) and (6):

R_{i,j}^k = W log2( 1 + p_i g_{i,j}^k / σ² )    (5)

R_{i,o}^k = W log2( 1 + p_i g_{i,o}^k / σ² )    (6)

In formulas (5) and (6), W is the user channel bandwidth, p_i is the user transmit power, σ² is the noise power, and g_{i,j}^k and g_{i,o}^k represent the communication channel gains from user i to unmanned aerial vehicle j and to base station o in time slot k, respectively.
2. State vector, action vector and reward function of the Markov decision model
The ground-air network state vector, the unmanned aerial vehicle three-dimensional action vector and the unmanned aerial vehicle action reward function are shown in formulas (7), (8) and (9), respectively; formula (9) is available only as an image in the original and expresses the reward in terms of the weighting factor ω, the time-slot size Δt, the per-user average unit task delays D̄_i^k and the indicators η_i^k:

s_k = { q_j^k, u_i^k : i ∈ I }    (7)

a_k = { φ_j^k, h_j^k }    (8)

In formula (7), q_j^k represents the three-dimensional coordinate position of unmanned aerial vehicle j in time slot k and u_i^k represents the two-dimensional coordinate position of user i in time slot k; in formula (8), φ_j^k represents the horizontal movement direction of unmanned aerial vehicle j in time slot k and h_j^k represents the vertical movement distance of unmanned aerial vehicle j in time slot k. In formula (9), ω represents the weighting factor of the unmanned aerial vehicle action reward function, ω ∈ (0,1); D̄_i^k represents the average unit task delay of user i in time slot k and is given by formula (10), which is likewise available only as an image and is expressed in terms of the connection states a_{i,j}^k, a_{i,o}^k and the offloaded task amounts b_{i,j}^k, b_{i,o}^k; η_i^k indicates whether the average unit task delay of user i in time slot k satisfies the average unit task tolerance delay τ_i, as shown in formula (11):

η_i^k = 1 if D̄_i^k ≤ τ_i, and η_i^k = 0 otherwise    (11)

In formula (10), a_{i,j}^k represents the connection state of user i and unmanned aerial vehicle j: if user i offloads its task to unmanned aerial vehicle j for execution in time slot k then a_{i,j}^k = 1, otherwise a_{i,j}^k = 0; a_{i,o}^k represents the connection state of user i and base station o: if user i offloads its task to base station o for execution in time slot k then a_{i,o}^k = 1, otherwise a_{i,o}^k = 0; user i can be connected to at most one unmanned aerial vehicle or base station in time slot k, i.e. a_{i,j}^k + a_{i,o}^k ≤ 1; Δt represents the time-slot size, b_{i,j}^k represents the amount of tasks user i offloads to unmanned aerial vehicle j in time slot k, b_{i,o}^k represents the amount of tasks user i offloads to base station o in time slot k, and τ_i represents the average unit task tolerance delay of user i.
3. Deep reinforcement learning model constructed on the basis of the twin-delayed deep deterministic policy gradient (TD3) algorithm
According to the Markov decision process, the cruising time of the unmanned aerial vehicle is divided into time slots of equal size; within any time slot k ∈ K, the relative position relation and the connection state between the unmanned aerial vehicle and the ground users remain unchanged.
The unmanned aerial vehicle controller deployed in the base station control center serves as the agent, and the deep reinforcement learning model is constructed on the basis of the idea of the twin-delayed deep deterministic policy gradient algorithm, which originates from Fujimoto S, van Hoof H, Meger D, "Addressing Function Approximation Error in Actor-Critic Methods", 35th International Conference on Machine Learning (ICML 2018), July 10-15, 2018. The deep reinforcement learning model comprises a system environment collector, a neural-network-based unmanned aerial vehicle action strategy network π(s_k; θ^π) and unmanned aerial vehicle state-action value networks Q_1(s_k, a_k; θ^{Q_1}) and Q_2(s_k, a_k; θ^{Q_2}), a task scheduling strategy generator, an unmanned aerial vehicle action reward generator, an experience sample storage area E and a random sample set Mini-Batch.
The system environment collector collects, in the k-slot ground-air network, the two-dimensional coordinate positions u_i^k of the ground users, the user task offloading requests, the three-dimensional coordinate position q_j^k of the unmanned aerial vehicle and the remaining available computing resources of the unmanned aerial vehicle. The unmanned aerial vehicle action strategy network π(s_k; θ^π) generates the unmanned aerial vehicle three-dimensional action vector a_k under the k-slot ground-air network state vector s_k; π(s_k; θ^π) may adopt two hidden layers, each being a layer of 256 neurons, and the neuron activation function may adopt the ReLU function. The unmanned aerial vehicle state-action value network Q(s_k, a_k; θ^Q) generates the action evaluation value q of executing the unmanned aerial vehicle three-dimensional action vector a_k under the k-slot ground-air network state vector s_k; it may adopt two networks of the same structure, Q_1(s_k, a_k; θ^{Q_1}) and Q_2(s_k, a_k; θ^{Q_2}), each with three hidden layers of 256 neurons, and the neuron activation function may adopt the ReLU function.
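The network shapes just described (a two-hidden-layer actor and three-hidden-layer twin critics, 256 ReLU neurons per layer) can be sketched as follows; the state/action dimensions and the tanh output scaling are illustrative assumptions:

```python
import torch
import torch.nn as nn

# Actor pi(s_k): two hidden layers of 256 ReLU neurons; Critic Q(s_k, a_k): three hidden layers.
class Actor(nn.Module):
    def __init__(self, state_dim, action_dim, max_action=1.0):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, action_dim), nn.Tanh())   # bounded action (e.g. direction / climb)
        self.max_action = max_action

    def forward(self, s):
        return self.max_action * self.net(s)

class Critic(nn.Module):
    def __init__(self, state_dim, action_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, 1))                        # scalar action evaluation value q

    def forward(self, s, a):
        return self.net(torch.cat([s, a], dim=-1))

actor, critic = Actor(state_dim=8, action_dim=2), Critic(state_dim=8, action_dim=2)
print(actor(torch.zeros(1, 8)).shape, critic(torch.zeros(1, 8), torch.zeros(1, 2)).shape)
```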
The task scheduling strategy generator generates the k-slot user offloading strategy and obtains the user object set I_j of unmanned aerial vehicle j and the user object set I_o of base station o together with the task offloading decision variable sets {b_{i,j}^k} and {b_{i,o}^k}. The unmanned aerial vehicle action reward generator generates the action reward value r_k after unmanned aerial vehicle j completes the offloading tasks of time slot k. After the unmanned aerial vehicle executes action a_k, the ground-air network state transitions from s_k to s_{k+1}. The k-slot experience sample [s_k, a_k, r_k, s_{k+1}] is added to the experience sample storage area E. The random sample set Mini-Batch is formed by randomly extracting a fixed number of samples from the experience sample storage area E; the Mini-Batch sample set is imported into the action strategy network π(s_k; θ^π) and the state-action value networks Q_1(s_k, a_k; θ^{Q_1}) and Q_2(s_k, a_k; θ^{Q_2}) for training, so as to update the neural network parameters θ^π, θ^{Q_1} and θ^{Q_2}.
4. Method for calculating s_{k+1} from the ground-air network state vector s_k and the action vector a_k
The k-slot ground-air network state vector s_k = { q_j^k, u_i^k : i ∈ I } is input to the unmanned aerial vehicle action strategy network π, and the three-dimensional action vector a_k = { φ_j^k, h_j^k } of unmanned aerial vehicle j is obtained through forward propagation of the neurons in each layer of the π network; the position of unmanned aerial vehicle j in time slot k+1 is then obtained by moving the horizontal coordinates a distance L in the direction φ_j^k and the vertical coordinate by h_j^k, i.e. q_j^{k+1} = ( x_j^k + L cos φ_j^k, y_j^k + L sin φ_j^k, z_j^k + h_j^k ), where φ_j^k ∈ [0, 2π) and L is the horizontal moving distance of unmanned aerial vehicle j in time slot k.
5. Calculation of the task amount offloaded by a user to the unmanned aerial vehicle
The task amount b_{i,j}^k offloaded by a user to the unmanned aerial vehicle is calculated as shown in formulas (12) and (13), which are available only as images in the original, wherein f_{i,j}^k denotes the computing resources allocated by unmanned aerial vehicle j to user i in time slot k and C_j represents the total amount of computing resources of unmanned aerial vehicle j.
6. Calculation of the task amount offloaded by a user to the base station
The task amount b_{i,o}^k offloaded by a user to the base station is calculated as shown in formulas (14) and (15), which are available only as images in the original, wherein f_{i,o}^k denotes the computing resources allocated by base station o to user i in time slot k and C_o represents the total amount of computing resources of base station o.
7. User offloading task scheduling method
1) Add the users i within the effective coverage range of unmanned aerial vehicle j to the service object set I_j of unmanned aerial vehicle j and set a_{i,j}^k = 1 for i ∈ I_j; the service object set of base station o is I_o = I - I_j, with a_{i,o}^k = 1 for i ∈ I_o; arrange the users i in I_j and in I_o in descending order of a per-user metric (the ordering expression is available only as an image in the original);
2) according to formulas (12) and (13), calculate the task amount b_{i,j}^k that each user i ∈ I_j offloads to unmanned aerial vehicle j;
3) according to formulas (14) and (15), calculate the task amount b_{i,o}^k that each user i ∈ I_o offloads to base station o.
8. State-action value network and action strategy network updating method
To address the instability during learning of the state-action value networks Q_1(s_k, a_k; θ^{Q_1}) and Q_2(s_k, a_k; θ^{Q_2}) and of the action strategy network π(s_k; θ^π), the target network of Q_1(s_k, a_k; θ^{Q_1}) is defined as Q'_1(s_k, a_k; θ^{Q'_1}), the target network of Q_2(s_k, a_k; θ^{Q_2}) is defined as Q'_2(s_k, a_k; θ^{Q'_2}), and the target network of π(s_k; θ^π) is defined as π'(s_k; θ^{π'}).
The parameters θ^{Q_m} (m = 1, 2) of the state-action value networks Q_m are updated by the gradient descent method as shown in formula (16):

θ^{Q_m} ← θ^{Q_m} - μ^{Q_m} ∇_{θ^{Q_m}} L(θ^{Q_m})    (16)

where μ^{Q_m} is the learning rate of Q_m; the loss function L(θ^{Q_m}) is shown in formula (17):

L(θ^{Q_m}) = (1/|X|) Σ_{x_k ∈ X} [ r_k + γ min_{m'=1,2} Q'_{m'}(s_{k+1}, a'_{k+1}; θ^{Q'_{m'}}) - Q_m(s_k, a_k; θ^{Q_m}) ]²    (17)

where a'_{k+1} = a_{k+1} + ε, ε ~ clip( N(0, σ), -κ, κ ), clip(·) denotes the clipping function, N(0, σ) denotes Gaussian noise with mean 0 and variance σ, κ denotes the clipping parameter, γ denotes the discount factor, and X denotes the sample set randomly sampled from E, X = { x_k }, x_k = [ s_k, a_k, r_k, s_{k+1} ].
The parameters θ^π of the action strategy network π(s_k; θ^π) are updated as shown in formula (18):

θ^π ← θ^π + μ^π ∇_{θ^π} J(θ^π)    (18)

where μ^π is the learning rate of π(s_k; θ^π); the policy gradient ∇_{θ^π} J(θ^π) of π(s_k; θ^π) is shown in formula (19):

∇_{θ^π} J(θ^π) = (1/|X|) Σ_{x_k ∈ X} ∇_a Q_1(s_k, a; θ^{Q_1}) |_{a = π(s_k; θ^π)} ∇_{θ^π} π(s_k; θ^π)    (19)

The parameters θ^{Q'_m} and θ^{π'} of the target networks Q'_m(s_k, a_k; θ^{Q'_m}) and π'(s_k; θ^{π'}) are updated as shown in formulas (20) and (21), with update factor ρ ∈ (0, 1):

θ^{Q'_m} ← ρ θ^{Q_m} + (1 - ρ) θ^{Q'_m}    (20)

θ^{π'} ← ρ θ^π + (1 - ρ) θ^{π'}    (21)
A method for dynamically optimizing edge network computing resources based on an unmanned aerial vehicle auxiliary node is specifically implemented by the following steps:
Step 1: construct a discrete time-state model according to the Markov decision process, including discretizing the cruising time of the unmanned aerial vehicle into time slots and setting the time slot variable k, the ground-air network state vector s_k, the unmanned aerial vehicle three-dimensional action vector a_k and the unmanned aerial vehicle action reward function r_k, where s_k, a_k, r_k transition and change correspondingly as the slot number k increases; initialize the time slot variable k = 0;
Step 2: take the unmanned aerial vehicle controller as the agent and construct the deep reinforcement learning model according to the idea of the twin-delayed deep deterministic policy gradient algorithm, establishing the unmanned aerial vehicle action strategy network π(s_k; θ^π) and the unmanned aerial vehicle state-action value networks Q_1(s_k, a_k; θ^{Q_1}) and Q_2(s_k, a_k; θ^{Q_2});
Step 3: let k = k + 1; if the three-dimensional position of the unmanned aerial vehicle does not change for n consecutive time slots, jump to step 6; otherwise, determine the user object set I_j of unmanned aerial vehicle j according to the effective coverage range of unmanned aerial vehicle j and the user object set I_o = I - I_j of base station o, where I represents the whole user object set; obtain the task offloading decision variable sets {b_{i,j}^k} and {b_{i,o}^k} for I_j and I_o through the task scheduling strategy generator, and jump to step 4;
Step 4: execute the task offloading of the users i according to {b_{i,j}^k} and {b_{i,o}^k}, obtain the corresponding reward value r_k through the unmanned aerial vehicle action reward generator, obtain the k-slot unmanned aerial vehicle three-dimensional action vector a_k through the unmanned aerial vehicle action strategy network π, calculate s_{k+1} from the k-slot ground-air network state vector s_k and the action vector a_k, and store [s_k, a_k, r_k, s_{k+1}] as an experience sample into the experience sample storage area E;
Step 5: obtain a Mini-Batch sample set by random sampling from the storage area E, import it into the action strategy network π(s_k; θ^π) and the state-action value networks Q_1(s_k, a_k; θ^{Q_1}) and Q_2(s_k, a_k; θ^{Q_2}) for training, and jump to step 3;
Step 6: the algorithm ends.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
The above examples are to be construed as merely illustrative and not limitative of the remainder of the disclosure. After reading the description of the invention, the skilled person can make various changes or modifications to the invention, and these equivalent changes and modifications also fall into the scope of the invention defined by the claims.

Claims (9)

1. A dynamic optimization method for computing resources of an edge network based on an unmanned aerial vehicle auxiliary node is characterized by comprising the following steps:
101. constructing a discrete time-state model according to a Markov decision process, including discretizing the cruising time of the unmanned aerial vehicle into time slots and setting a time slot variable k, a ground-air network state vector s_k, an unmanned aerial vehicle three-dimensional action vector a_k and an unmanned aerial vehicle action reward function r_k, wherein s_k, a_k, r_k transition and change correspondingly as the time slot number k increases, and initializing the time slot variable k = 0;
102. taking the unmanned aerial vehicle controller as an agent and constructing a deep reinforcement learning model on the basis of the idea of the twin-delayed deep deterministic policy gradient (TD3) algorithm, including establishing a system environment collector, an unmanned aerial vehicle action strategy network π, an unmanned aerial vehicle state-action value network Q, a task scheduling strategy generator, an unmanned aerial vehicle action reward generator, an experience sample storage area E and a random sample set Mini-Batch;
103. letting k = k + 1; if the three-dimensional coordinate position of the unmanned aerial vehicle does not change for n consecutive time slots, jumping to step 106; otherwise determining the user object set I_j of unmanned aerial vehicle j according to the effective coverage range of unmanned aerial vehicle j and the user object set I_o = I - I_j of base station o, wherein I represents the whole user object set; obtaining through the task scheduling strategy generator the task offloading decision variable sets {b_{i,j}^k | i ∈ I_j} and {b_{i,o}^k | i ∈ I_o} for I_j and I_o, and jumping to step 104;
104. executing the task offloading requests of the users i according to {b_{i,j}^k} and {b_{i,o}^k}, obtaining the corresponding reward value r_k through the unmanned aerial vehicle action reward generator, obtaining the k-slot unmanned aerial vehicle three-dimensional action vector a_k through the unmanned aerial vehicle action strategy network π, calculating s_{k+1} from the k-slot ground-air network state vector s_k and the action vector a_k, and storing [s_k, a_k, r_k, s_{k+1}] as an experience sample into the experience sample storage area E;
105. randomly sampling from the experience sample storage area E to obtain a Mini-Batch sample set, respectively importing the Mini-Batch sample set into the action strategy network π and the state-action value network Q for training, and jumping to step 103;
106. the algorithm ends.
2. The method of claim 1, wherein the discrete time-state model is constructed in step 101 according to a Markov decision process, the k-slot ground-air network state vector s_k and the unmanned aerial vehicle three-dimensional action vector a_k being given by formulas (1) and (2), and the unmanned aerial vehicle action reward function r_k being given by formula (3), which is available only as an image in the original and is expressed in terms of the weighting factor ω, the time-slot size Δt, the per-user average unit task delays D̄_i^k and the indicators η_i^k:

s_k = { q_j^k, u_i^k : i ∈ I }    (1)

a_k = { φ_j^k, h_j^k }    (2)

wherein in formula (1) q_j^k represents the three-dimensional coordinate position of unmanned aerial vehicle j in time slot k and u_i^k represents the two-dimensional coordinate position of user i in time slot k; in formula (2) φ_j^k represents the horizontal movement direction of unmanned aerial vehicle j in time slot k and h_j^k represents the vertical movement distance of unmanned aerial vehicle j in time slot k; in formula (3) ω represents the weighting factor of the unmanned aerial vehicle action reward function, ω ∈ (0,1), and Δt represents the time-slot size; D̄_i^k represents the average unit task delay of user i in time slot k and is given by formula (4), which is likewise available only as an image and is expressed in terms of the connection states a_{i,j}^k, a_{i,o}^k and the offloaded task amounts b_{i,j}^k, b_{i,o}^k; η_i^k indicates whether the average unit task delay of user i in time slot k satisfies the average unit task tolerance delay τ_i, as shown in formula (5):

η_i^k = 1 if D̄_i^k ≤ τ_i, and η_i^k = 0 otherwise    (5)

wherein a_{i,j}^k represents the connection state of user i and unmanned aerial vehicle j: if user i offloads its task to unmanned aerial vehicle j for execution in time slot k then a_{i,j}^k = 1, otherwise a_{i,j}^k = 0; a_{i,o}^k represents the connection state of user i and base station o: if user i offloads its task to base station o for execution in time slot k then a_{i,o}^k = 1, otherwise a_{i,o}^k = 0; user i can be connected to at most one unmanned aerial vehicle or base station in time slot k, i.e. a_{i,j}^k + a_{i,o}^k ≤ 1; b_{i,j}^k represents the amount of tasks user i offloads to unmanned aerial vehicle j in time slot k, b_{i,o}^k represents the amount of tasks user i offloads to base station o in time slot k, and τ_i represents the average unit task tolerance delay of user i.
3. The method according to claim 1, wherein the deep reinforcement learning model constructed in step 102 on the basis of the twin-delayed deep deterministic policy gradient algorithm comprises a system environment collector, an unmanned aerial vehicle action strategy network π, an unmanned aerial vehicle state-action value network Q, a task scheduling strategy generator, an unmanned aerial vehicle action reward generator, an experience sample storage area E and a random sample set Mini-Batch, specifically:
the system environment collector collects, in the k-slot ground-air network, the two-dimensional coordinate positions u_i^k of the ground users, the user task offloading requests, the three-dimensional coordinate position q_j^k of the unmanned aerial vehicle and the remaining available computing resources of the unmanned aerial vehicle; the unmanned aerial vehicle action strategy network π generates the unmanned aerial vehicle three-dimensional action vector a_k under the k-slot ground-air network state s_k; the unmanned aerial vehicle state-action value network Q generates the action evaluation value q of executing the unmanned aerial vehicle three-dimensional action vector a_k under the k-slot ground-air network state s_k; the task scheduling strategy generator generates the k-slot user offloading strategy and obtains the task offloading decision variable sets {b_{i,j}^k} and {b_{i,o}^k}; the unmanned aerial vehicle action reward generator generates the action reward value r_k of unmanned aerial vehicle j after the offloading tasks of time slot k are completed; after the unmanned aerial vehicle executes action a_k, the ground-air network state transitions from s_k to s_{k+1}; the k-slot experience sample [s_k, a_k, r_k, s_{k+1}] is added to the experience sample storage area E; the random sample set Mini-Batch is formed by randomly extracting a fixed number of samples from the experience sample storage area E; both the unmanned aerial vehicle action strategy network π and the unmanned aerial vehicle state-action value network Q are neural networks, each comprising several hidden layers, and each hidden layer comprises several neurons.
4. The method of claim 1, wherein the method by which the task scheduling strategy generator in step 103 decides the user task offloading variable sets {b_{i,j}^k} and {b_{i,o}^k} comprises the following steps:
1) adding the users i within the effective coverage range of unmanned aerial vehicle j to the service object set I_j of unmanned aerial vehicle j and setting a_{i,j}^k = 1 for i ∈ I_j; the service object set of base station o is I_o = I - I_j, with a_{i,o}^k = 1 for i ∈ I_o; arranging the users i in I_j and in I_o in descending order of a per-user metric (the ordering expression is available only as an image in the original);
2) according to the task offloading delay D_i^k of user i, calculating the task amount b_{i,j}^k that each user i ∈ I_j offloads to unmanned aerial vehicle j;
3) according to the task offloading delay D_i^k of user i, calculating the task amount b_{i,o}^k that each user i ∈ I_o offloads to base station o.
5. The method for dynamically optimizing edge network computing resources based on an unmanned aerial vehicle auxiliary node according to claim 4, wherein the task amount b_{i,j}^k offloaded by user i to unmanned aerial vehicle j in step 2) is calculated as shown in formulas (6) and (7), which are available only as images in the original, wherein f_{i,j}^k denotes the computing resources allocated by unmanned aerial vehicle j to user i in time slot k, C_j represents the total amount of computing resources of unmanned aerial vehicle j, R_{i,j}^k represents the uplink transmission rate from user i to unmanned aerial vehicle j in time slot k, F represents the unit task size, and c_i represents the task complexity of user i.
6. The method for dynamically optimizing edge network computing resources based on an unmanned aerial vehicle auxiliary node according to claim 5, wherein the task amount b_{i,o}^k offloaded by user i to base station o in step 3) is calculated as shown in formulas (8) and (9), which are available only as images in the original, wherein f_{i,o}^k denotes the computing resources allocated by base station o to user i in time slot k, C_o represents the total amount of computing resources of base station o, and R_{i,o}^k represents the uplink transmission rate from user i to base station o in time slot k.
7. The method for dynamically optimizing edge network computing resources based on an unmanned aerial vehicle auxiliary node according to claim 4, wherein the task offloading delay of user i in steps 2) and 3) is shown in formula (10), and the task offloading delay constraint is shown in formula (11), which is available only as an image in the original and bounds the task offloading delay of user i:

D_i^k = D_i^{trans,k} + D_i^{comp,k}    (10)

in formula (10), because the size of the task computation result is much smaller than that of the task itself, only the uplink transmission delay of the user task offloading and the computation delay of the task are considered, and the downlink transmission delay of the task computation result is ignored; D_i^k represents the total task offloading delay of user i in time slot k, D_i^{trans,k} represents the transmission delay of the offloading task of user i in time slot k, as shown in formula (12), and D_i^{comp,k} represents the computation delay of the offloaded task, as shown in formula (13):

D_i^{trans,k} = a_{i,j}^k b_{i,j}^k F / R_{i,j}^k + a_{i,o}^k b_{i,o}^k F / R_{i,o}^k    (12)

D_i^{comp,k} = a_{i,j}^k b_{i,j}^k F c_i / f_{i,j}^k + a_{i,o}^k b_{i,o}^k F c_i / f_{i,o}^k    (13)

in formula (12), R_{i,j}^k and R_{i,o}^k represent the uplink transmission rates from user i to unmanned aerial vehicle j and to base station o in time slot k, as shown in formulas (14) and (15):

R_{i,j}^k = W log2( 1 + p_i g_{i,j}^k / σ² )    (14)

R_{i,o}^k = W log2( 1 + p_i g_{i,o}^k / σ² )    (15)

in formulas (14) and (15), W is the user channel bandwidth, p_i is the user transmit power, σ² is the noise power, and g_{i,j}^k and g_{i,o}^k represent the communication channel gains from user i to unmanned aerial vehicle j and to base station o in time slot k, respectively.
8. The method for dynamically optimizing edge network computing resources based on an unmanned aerial vehicle auxiliary node as claimed in claim 1, wherein in step 104 the k-slot unmanned aerial vehicle three-dimensional action vector a_k is obtained through the unmanned aerial vehicle action strategy network π and s_{k+1} is calculated from the k-slot ground-air network state vector s_k and the action vector a_k, specifically:
the k-slot ground-air network state vector s_k = { q_j^k, u_i^k : i ∈ I } is input to the unmanned aerial vehicle action strategy network π, and the three-dimensional action vector a_k = { φ_j^k, h_j^k } of unmanned aerial vehicle j is obtained through forward propagation of the neurons in each layer of the π network; the position of unmanned aerial vehicle j in time slot k+1 is then obtained by moving the horizontal coordinates a distance L in the direction φ_j^k and the vertical coordinate by h_j^k, i.e. q_j^{k+1} = ( x_j^k + L cos φ_j^k, y_j^k + L sin φ_j^k, z_j^k + h_j^k ), where φ_j^k ∈ [0, 2π) and L is the horizontal moving distance of unmanned aerial vehicle j in time slot k.
9. The method according to claim 1, wherein in step 105 the Mini-Batch sample data set is obtained from the experience sample storage area E by random sampling, and the method for optimizing the state-action value network and the action strategy network is as follows:
to address the instability during learning of the state-action value network Q, which comprises Q_1(s_k, a_k; θ^{Q_1}) and Q_2(s_k, a_k; θ^{Q_2}), and of the action strategy network π(s_k; θ^π), the target network of Q_1(s_k, a_k; θ^{Q_1}) is defined as Q'_1(s_k, a_k; θ^{Q'_1}), the target network of Q_2(s_k, a_k; θ^{Q_2}) is defined as Q'_2(s_k, a_k; θ^{Q'_2}), and the target network of π(s_k; θ^π) is defined as π'(s_k; θ^{π'});
the parameters θ^{Q_m} (m = 1, 2) of the state-action value networks Q_m are updated by the gradient descent method as shown in formula (16):

θ^{Q_m} ← θ^{Q_m} - μ^{Q_m} ∇_{θ^{Q_m}} L(θ^{Q_m})    (16)

where μ^{Q_m} is the learning rate of Q_m and θ^{Q_m} represents the network structure parameters of Q_m; the loss function L(θ^{Q_m}) is shown in formula (17):

L(θ^{Q_m}) = (1/|X|) Σ_{x_k ∈ X} [ r_k + γ min_{m'=1,2} Q'_{m'}(s_{k+1}, a'_{k+1}; θ^{Q'_{m'}}) - Q_m(s_k, a_k; θ^{Q_m}) ]²    (17)

where a'_{k+1} = a_{k+1} + ε, ε ~ clip( N(0, σ), -κ, κ ), clip(·) denotes the clipping function, N(0, σ) denotes Gaussian noise with mean 0 and variance σ, κ denotes the clipping parameter, γ denotes the discount factor, and X denotes the sample set randomly sampled from E, X = { x_k }, x_k = [ s_k, a_k, r_k, s_{k+1} ];
the network parameters θ^π of the action strategy network π(s_k; θ^π) are updated as shown in formula (18):

θ^π ← θ^π + μ^π ∇_{θ^π} J(θ^π)    (18)

where μ^π is the learning rate of π(s_k; θ^π) and θ^π represents the network structure parameters of π(s_k; θ^π); the policy gradient ∇_{θ^π} J(θ^π) of π(s_k; θ^π) is shown in formula (19):

∇_{θ^π} J(θ^π) = (1/|X|) Σ_{x_k ∈ X} ∇_a Q_1(s_k, a; θ^{Q_1}) |_{a = π(s_k; θ^π)} ∇_{θ^π} π(s_k; θ^π)    (19)

the network parameters θ^{Q'_m} and θ^{π'} of the target networks Q'_m(s_k, a_k; θ^{Q'_m}) and π'(s_k; θ^{π'}) are updated as shown in formulas (20) and (21), with update factor ρ ∈ (0, 1):

θ^{Q'_m} ← ρ θ^{Q_m} + (1 - ρ) θ^{Q'_m}    (20)

θ^{π'} ← ρ θ^π + (1 - ρ) θ^{π'}    (21)
CN202210079544.6A 2022-01-24 2022-01-24 Edge network computing resource dynamic optimization method based on unmanned aerial vehicle auxiliary node Pending CN114513814A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210079544.6A CN114513814A (en) 2022-01-24 2022-01-24 Edge network computing resource dynamic optimization method based on unmanned aerial vehicle auxiliary node

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210079544.6A CN114513814A (en) 2022-01-24 2022-01-24 Edge network computing resource dynamic optimization method based on unmanned aerial vehicle auxiliary node

Publications (1)

Publication Number Publication Date
CN114513814A true CN114513814A (en) 2022-05-17

Family

ID=81549326

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210079544.6A Pending CN114513814A (en) 2022-01-24 2022-01-24 Edge network computing resource dynamic optimization method based on unmanned aerial vehicle auxiliary node

Country Status (1)

Country Link
CN (1) CN114513814A (en)


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116257361A (en) * 2023-03-15 2023-06-13 北京信息科技大学 Unmanned aerial vehicle-assisted fault-prone mobile edge computing resource scheduling optimization method
CN116257361B (en) * 2023-03-15 2023-11-10 北京信息科技大学 Unmanned aerial vehicle-assisted fault-prone mobile edge computing resource scheduling optimization method

Similar Documents

Publication Publication Date Title
CN111800828B (en) Mobile edge computing resource allocation method for ultra-dense network
CN111556461B (en) Vehicle-mounted edge network task distribution and unloading method based on deep Q network
CN111132077A (en) Multi-access edge computing task unloading method based on D2D in Internet of vehicles environment
Chen et al. Efficiency and fairness oriented dynamic task offloading in internet of vehicles
CN112543049B (en) Energy efficiency optimization method and device of integrated ground satellite network
CN112911648A (en) Air-ground combined mobile edge calculation unloading optimization method
CN113395654A (en) Method for task unloading and resource allocation of multiple unmanned aerial vehicles of edge computing system
CN114051254B (en) Green cloud edge collaborative computing unloading method based on star-ground fusion network
CN113543074A (en) Joint computing migration and resource allocation method based on vehicle-road cloud cooperation
CN113613301B (en) Air-ground integrated network intelligent switching method based on DQN
WO2022242468A1 (en) Task offloading method and apparatus, scheduling optimization method and apparatus, electronic device, and storage medium
CN113282352A (en) Energy-saving unloading method based on multi-unmanned aerial vehicle cooperative auxiliary edge calculation
Zhou et al. Dynamic channel allocation for multi-UAVs: A deep reinforcement learning approach
CN116887355A (en) Multi-unmanned aerial vehicle fair collaboration and task unloading optimization method and system
CN114785397A (en) Unmanned aerial vehicle base station control method, flight trajectory optimization model construction and training method
CN116321293A (en) Edge computing unloading and resource allocation method based on multi-agent reinforcement learning
CN116112981A (en) Unmanned aerial vehicle task unloading method based on edge calculation
CN114513814A (en) Edge network computing resource dynamic optimization method based on unmanned aerial vehicle auxiliary node
Li et al. DNN partition and offloading strategy with improved particle swarm genetic algorithm in VEC
CN112911618B (en) Unmanned aerial vehicle server task unloading scheduling method based on resource exit scene
CN114520991B (en) Unmanned aerial vehicle cluster-based edge network self-adaptive deployment method
CN111930435B (en) Task unloading decision method based on PD-BPSO technology
Zhou et al. Improved artificial bee colony algorithm-based channel allocation scheme in low earth orbit satellite downlinks
CN116208968B (en) Track planning method and device based on federal learning
CN116321181A (en) Online track and resource optimization method for multi-unmanned aerial vehicle auxiliary edge calculation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination