CN114268986A - Unmanned aerial vehicle computing unloading and charging service efficiency optimization method - Google Patents


Info

Publication number
CN114268986A
Authority
CN
China
Prior art keywords
aerial vehicle
unmanned aerial
internet
service
things
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111529547.7A
Other languages
Chinese (zh)
Inventor
丁文锐
王晨晨
罗祎喆
王玉峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beihang University
Original Assignee
Beihang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beihang University filed Critical Beihang University
Priority to CN202111529547.7A priority Critical patent/CN114268986A/en
Publication of CN114268986A publication Critical patent/CN114268986A/en
Pending legal-status Critical Current

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses an Internet of Things-oriented method for optimizing the efficiency of unmanned aerial vehicle computation offloading and charging services, comprising the following steps: a reinforcement learning network is constructed in advance; the current environment state is sensed; the action of the unmanned aerial vehicle is determined by the reinforcement learning network from the sensed state data; and the flight path of the unmanned aerial vehicle and the computation offloading and charging service decisions for the Internet of Things sensors are determined from that action. With the invention, the unmanned aerial vehicle serves both as a mobile edge server providing computation offloading and as a wireless charging device that charges Internet of Things devices. By modeling the positions of the Internet of Things sensors, the position of the unmanned aerial vehicle, the remaining energy of the sensors, their task information, and the channel state between the unmanned aerial vehicle and the sensors, a deep reinforcement learning algorithm plans the unmanned aerial vehicle trajectory and makes the user service decisions with low computational complexity, effectively improving the computing capability and working time of the Internet of Things system.

Description

Unmanned aerial vehicle computation offloading and charging service efficiency optimization method
Technical Field
The invention relates to the fields of mobile edge computing and wireless energy transfer, in particular to neural-network-based deep reinforcement learning, and specifically to a method for optimizing the efficiency of unmanned aerial vehicle computation offloading and charging services.
Background
In recent years, with the development of Internet of Things technology, new Internet of Things applications have emerged continuously. However, many Internet of Things devices have limited or even no computing capability while their computation demands are high, and their working time is restricted by limited battery energy. Mobile edge computing deploys computing resources close to the Internet of Things devices, so that task offloading can effectively meet the devices' traffic demands and reduce task delay. Thanks to its high mobility, an unmanned aerial vehicle can act as a mobile edge server for task offloading from Internet of Things devices, and it can additionally serve as a wireless charging device that charges them. Since the energy supply of the unmanned aerial vehicle is itself limited, designing its flight path planning and user service strategy is crucial for maximizing the system efficiency.
Disclosure of Invention
In view of the above problems, the invention provides a method for optimizing the efficiency of unmanned aerial vehicle computation offloading and charging services. It maximizes the long-term reward of the system under the energy constraint of the unmanned aerial vehicle: by setting a suitable reward function, the flight path and user service mechanism of the unmanned aerial vehicle are planned reasonably, thereby maximizing the system efficiency.
The unmanned aerial vehicle computation offloading and charging service efficiency optimization method of the invention comprises the following specific steps:
Step 1: construct a deep reinforcement learning network based on DQN.
Step 2: sense the current environment state information and determine the executable actions of the unmanned aerial vehicle through the deep reinforcement learning network constructed in step 1.
Step 3: based on the sensed current environment state information, predict the value of every action in that state through the deep reinforcement learning network and select the action with the largest value, thereby determining the flight target position of the unmanned aerial vehicle, the service target of the unmanned aerial vehicle, and the service type decision for the Internet of Things device, the service type comprising a computation offloading service and a charging service; at the same time, calculate the energy consumption of the unmanned aerial vehicle.
Step 4: calculate the reward function and iteratively update the DQN deep reinforcement learning network.
The invention has the advantages that:
The unmanned aerial vehicle computation offloading and charging service efficiency optimization method of the invention maximizes the long-term reward of the system under the energy constraint of the unmanned aerial vehicle. By setting a suitable reward function, the flight path and user service mechanism of the unmanned aerial vehicle are planned reasonably; the service process is modeled as a Markov decision process, an intelligent decision method based on deep reinforcement learning is designed, and the flight path, user allocation and service type of the unmanned aerial vehicle are optimized jointly. The unmanned aerial vehicle can thus make decisions quickly from the environment state information, the high computational load of traditional algorithms is avoided, and the decision cost is reduced.
Drawings
Fig. 1 is a flowchart of the unmanned aerial vehicle computation offloading and charging service efficiency optimization method of the invention;
Fig. 2 is a diagram of the system model to which the method is applied;
Fig. 3 is a simulation plot of the cumulative reward as a function of the number of iterations in an embodiment of the method.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings.
The unmanned aerial vehicle computation offloading and charging service efficiency optimization method of the invention comprises the following steps, as shown in Fig. 1:
Step 1: construct a deep reinforcement learning network based on DQN and initialize the parameters of the DQN neural network.
(1) Construct the DQN algorithm, i.e., a neural network structure comprising a target (real) network and an estimation network. The two networks have the same structure but different parameters; the input is the current environment state and the action taken by the unmanned aerial vehicle, and the output is the value of that environment state-action pair.
(2) Construct a memory bank to store the data generated by the interaction between the unmanned aerial vehicle and the environment, including the environment state information data and the unmanned aerial vehicle action data generated by each interaction. Whenever the parameters of the estimation network in the DQN algorithm are updated, a batch of data is drawn at random from the memory bank for the update, which breaks the correlation between samples and mitigates the problem of non-stationary distributions.
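As an illustrative sketch only (not code from the patent), the two networks and the memory bank of step 1 might be set up as follows in Python (PyTorch is assumed for illustration); following the 163 × 256 × 256 × 128 structure of the embodiment described later, the network here maps a state directly to a Q value for every executable action, and all names and sizes are assumptions.

```python
import random
from collections import deque

import torch
import torch.nn as nn


class QNetwork(nn.Module):
    """Maps an environment state to a Q value for every executable action."""
    def __init__(self, state_dim, n_actions, hidden=(256, 256)):
        super().__init__()
        layers, last = [], state_dim
        for width in hidden:
            layers += [nn.Linear(last, width), nn.ReLU()]
            last = width
        layers.append(nn.Linear(last, n_actions))
        self.net = nn.Sequential(*layers)

    def forward(self, state):
        return self.net(state)


class ReplayMemory:
    """Memory bank storing (state, action, reward, next_state, done) transitions."""
    def __init__(self, capacity=10000):
        self.buffer = deque(maxlen=capacity)

    def push(self, transition):
        self.buffer.append(transition)

    def sample(self, batch_size):
        # Random sampling breaks the correlation between consecutive transitions.
        return random.sample(self.buffer, batch_size)


# Estimation ("online") network and target ("real") network: same structure,
# separate parameters; the target network is refreshed from the online one periodically.
state_dim, n_actions = 163, 128          # assumed sizes, taken from the embodiment below
online_net = QNetwork(state_dim, n_actions)
target_net = QNetwork(state_dim, n_actions)
target_net.load_state_dict(online_net.state_dict())
memory = ReplayMemory(capacity=10000)    # memory bank capacity from the embodiment
```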
Step 2: and (3) sensing the current environment state information, and determining the action of the unmanned aerial vehicle through the deep reinforcement learning network constructed in the step (1), wherein the action comprises unmanned aerial vehicle flight target position selection, service target selection and service type selection.
The on-board sensors of the unmanned aerial vehicle sense, as the state information: the current position of each Internet of Things device, the current position of the unmanned aerial vehicle, the current remaining energy of each Internet of Things device, the current remaining energy of the unmanned aerial vehicle, the computation task size of each Internet of Things device (i.e., the number of data bits to be processed), and the channel state between the unmanned aerial vehicle and each Internet of Things device. The environment state information data is expressed in the form of the following matrix:
[state matrix image not reproduced]
where S(t) denotes the environment state information at time t; l_1(t), …, l_N(t) denote the positions of the N Internet of Things devices at time t; l_V(t) denotes the position of the unmanned aerial vehicle at time t; E_1(t), …, E_N(t) denote the remaining energies of the N Internet of Things devices at time t; E_V(t) denotes the remaining energy of the unmanned aerial vehicle at time t; D_1(t), …, D_N(t) denote the task sizes, in bits, of the N Internet of Things devices at time t; and g_1(t), …, g_N(t) denote the channel gains between the N Internet of Things devices and the unmanned aerial vehicle at time t.
Based on the sensed state data, the action of the unmanned aerial vehicle is determined through the deep reinforcement learning network; the action comprises the selection of the flight target position, the service target and the service type.
The action data are expressed in matrix form as:
A(t) = [a_n(t), a_m(t), a_T(t)]
where A(t) denotes the action executed by the unmanned aerial vehicle at time t; a_n(t) ∈ {1, 2, …, N} is the Internet of Things device the unmanned aerial vehicle chooses to serve at time t; a_m(t) ∈ {1, 2, …, M} is the position the unmanned aerial vehicle flies to at time t, i.e., one of M preset access points; and a_T(t) ∈ {0, 1} is the service type selected at time t, with a_T(t) = 0 meaning that the unmanned aerial vehicle provides the computation offloading service and a_T(t) = 1 meaning that it provides the charging service. The unmanned aerial vehicle therefore has M × 2 × 2 executable actions in total.
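For concreteness, a minimal sketch (the layout and enumeration are assumptions, not taken from the patent) of how the state S(t) could be flattened into a network input and how a flat action index could be decoded into the triple [a_n(t), a_m(t), a_T(t)]:

```python
import numpy as np


def build_state(dev_pos, uav_pos, dev_energy, uav_energy, task_bits, channel_gain):
    """Flatten S(t): device positions, UAV position, device energies, UAV energy,
    task sizes and channel gains.  Shapes are assumptions (dev_pos has shape (N, 2))."""
    return np.concatenate([
        np.asarray(dev_pos).ravel(),
        np.asarray(uav_pos).ravel(),
        np.asarray(dev_energy).ravel(),
        [uav_energy],
        np.asarray(task_bits).ravel(),
        np.asarray(channel_gain).ravel(),
    ]).astype(np.float32)


def decode_action(index, n_devices, n_access_points):
    """Decode a flat action index into (a_n, a_m, a_T): served device,
    target access point, and service type (0 = offloading, 1 = charging)."""
    a_T = index % 2
    a_m = (index // 2) % n_access_points
    a_n = (index // (2 * n_access_points)) % n_devices
    return a_n, a_m, a_T
```

With N = 32 devices and two-dimensional positions (the constant altitude omitted from l_V(t)), this state vector has 2·32 + 2 + 32 + 1 + 32 + 32 = 163 entries, which matches the network input size used in the embodiment below; the action indexing above is only one possible enumeration.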
Step 3: based on the sensed current environment state information, the value of every action in that state is predicted through the deep reinforcement learning network and the action with the largest value is selected, thereby determining the flight target position of the unmanned aerial vehicle, the service target of the unmanned aerial vehicle, and the service type decision for the Internet of Things device (computation offloading service or charging service); at the same time, the energy consumption of the unmanned aerial vehicle is calculated.
The energy consumption of the unmanned aerial vehicle comprises the following 4 components:
(1) Flight energy consumption: only horizontal flight is considered. The unmanned aerial vehicle flies at a constant horizontal speed V_f at altitude h with flight power P_f, and the flight energy consumed at time t depends on the position at the current time and at the next time. With the position of the unmanned aerial vehicle in the three-dimensional coordinate system at time t denoted l_V(t) = [x_V(t), y_V(t), h], the flight energy consumption e_f(t) of the unmanned aerial vehicle is:
[flight-energy formula image not reproduced]
(2) Hovering energy consumption: the energy consumed while the unmanned aerial vehicle hovers at time t to provide the computation offloading service or the charging service to an Internet of Things device, with a constant hovering power P_h. When the unmanned aerial vehicle provides the computation offloading service, the channel is treated as a line-of-sight channel. With the position of Internet of Things device n denoted l_n(t), the channel gain and the data transmission rate between the unmanned aerial vehicle and device n are respectively:
[channel-gain and transmission-rate formula images not reproduced]
where B is the channel bandwidth, ρ_0 is the channel power gain at a reference distance of 1 meter, P_n is the fixed transmission power of Internet of Things device n, and σ² is the Gaussian white noise power. The hovering energy consumption e_h-compute(t) when the unmanned aerial vehicle provides the computation offloading service is:
[formula image not reproduced]
where D_n(t) denotes the task size, in bits, of Internet of Things device n at time t. When the unmanned aerial vehicle provides the charging service, the charging power is:
[formula image not reproduced]
where P_0 is the transmission power of the unmanned aerial vehicle when it provides the charging service and β_0 ∈ (0, 1) is the energy conversion efficiency.
With the full battery capacity of an Internet of Things device denoted E_b and the remaining energy of device n at time t denoted E_n(t), the hovering energy consumption e_h-charge(t) when the unmanned aerial vehicle provides the charging service is:
[formula image not reproduced]
(3) Computation energy consumption: at time t, with effective capacitance coefficient γ_c, C CPU cycles required to process 1 bit of data, and CPU frequency f_c of the unmanned aerial vehicle, the computation energy consumption e_compute(t) of the unmanned aerial vehicle is:
[formula image not reproduced]
(4) Charging energy consumption e_charge(t) of the unmanned aerial vehicle:
[formula image not reproduced]
The total energy consumption of the unmanned aerial vehicle is therefore:
W(t) = e_f(t) + (1 − a_T(t)) (e_h-compute(t) + e_compute(t)) + a_T(t) (e_h-charge(t) + e_charge(t))
The remaining energy of the unmanned aerial vehicle is then E_V(t+1) = E_V(t) − W(t). The remaining energy of the unmanned aerial vehicle is part of the environment state it senses; it is strongly correlated with the reward function and helps the deep reinforcement learning network fit the state-action value more accurately.
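Since the equation images are not reproduced above, the following sketch fills in the four energy terms with the standard models suggested by the surrounding definitions (power × duration for flight and hovering, a free-space line-of-sight channel with reference gain ρ_0, the Shannon rate, a linear energy-harvesting model, and the usual γ_c·C·D·f_c² CPU energy model); the exact expressions and the parameter values in the patent may differ, so everything below is an assumption for illustration.

```python
import numpy as np


def total_energy(uav_pos, uav_next_pos, dev_pos, task_bits, dev_energy, a_T,
                 V_f=10.0, P_f=10.0, P_h=5.0, B=1e6, rho_0=1e-3, P_n=0.1,
                 sigma2=1e-9, P_0=1.0, beta_0=0.5, E_b=0.2,
                 gamma_c=1e-27, C=1000, f_c=1e9):
    """Total UAV energy W(t) for one decision step; all parameter values are placeholders."""
    # (1) Flight energy: flight power times flight time (distance divided by speed V_f).
    e_f = P_f * np.linalg.norm(np.asarray(uav_next_pos) - np.asarray(uav_pos)) / V_f

    # Line-of-sight channel gain between the hovering UAV and the served device n.
    d = np.linalg.norm(np.asarray(uav_next_pos) - np.asarray(dev_pos))
    g_n = rho_0 / d**2

    if a_T == 0:   # computation offloading service
        # (2) Hovering energy while the device uploads D_n(t) bits at the Shannon rate.
        R_n = B * np.log2(1.0 + P_n * g_n / sigma2)
        e_hover = P_h * task_bits / R_n
        # (3) Computation energy for processing the offloaded bits on the UAV CPU.
        e_task = gamma_c * C * task_bits * f_c ** 2
    else:          # charging service
        # Received charging power under a linear energy-harvesting model.
        p_charge = beta_0 * P_0 * g_n
        t_charge = (E_b - dev_energy) / p_charge   # time needed to refill the device battery
        # (2) Hovering energy and (4) charging (transmit) energy over that time.
        e_hover = P_h * t_charge
        e_task = P_0 * t_charge
    return e_f + e_hover + e_task
```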
And 4, step 4: deep reinforcement learning network for calculating reward function and iteratively updating DQN
The unmanned aerial vehicle obtains a positive reward for processing the computation task offloaded by an Internet of Things device:
[reward formula image not reproduced]
where the normalization constant shown in the formula is used to normalize D_n(t).
The unmanned aerial vehicle also obtains a positive reward for charging an Internet of Things device:
[reward formula image not reproduced]
Combining the two ways of obtaining a positive reward, the reward function can be defined as:
[reward-function formula image not reproduced]
where r_penalty denotes the penalty term applied when an Internet of Things device runs out of energy while the unmanned aerial vehicle is working. In addition, if the remaining energy of the unmanned aerial vehicle falls below a threshold b in any decision period, the next state is a terminal state and the unmanned aerial vehicle ends its service and returns. If the unmanned aerial vehicle runs out of energy on the way back, the task is not completed, and a sufficiently large penalty term r_penalty is added after the reward function; for example, when the number of Internet of Things devices is N, r_penalty can be set to a value greater than or equal to 2N. When an Internet of Things device runs out of energy during the service of the unmanned aerial vehicle, a penalty term with a preset value is added after the reward function.
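A sketch of the reward just described (the patent's exact expression is in an image that is not reproduced; the normalizations and penalty values used here are assumptions):

```python
def reward(a_T, task_bits, dev_energy, D_max, E_b,
           device_depleted=False, uav_failed_return=False,
           device_penalty=1.0, r_penalty=64.0):
    """One-step reward sketch: normalized offloaded bits or normalized delivered energy,
    minus penalties.  device_penalty is the preset penalty applied when an Internet of
    Things device runs out of energy; r_penalty >= 2N (e.g. 64 for N = 32) is applied
    when the unmanned aerial vehicle runs out of energy before returning."""
    r = task_bits / D_max if a_T == 0 else (E_b - dev_energy) / E_b
    if device_depleted:
        r -= device_penalty
    if uav_failed_return:
        r -= r_penalty
    return r
```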
At each time t the unmanned aerial vehicle is in some environment state S(t); every executable action a has a state-action value (Q value), and the current decision selects the action with the largest Q value:
A(t) ← argmax_a Q(S(t), a)
After the action is determined, the unmanned aerial vehicle executes it, enters the next state S(t+1) and obtains the reward R(t+1); the Q value of the action is updated at the same time in order to update the neural network:
Q(S(t), A(t)) ← Q(S(t), A(t)) + α [R(t+1) + γ max_a Q(S(t+1), a) − Q(S(t), A(t))]
As the neural network is trained, the state-action value Q gradually converges and the unmanned aerial vehicle selects, in each state, the action with the largest Q value. When the average reward over 100 consecutive episodes (100 runs) exceeds a preset value (chosen according to actual requirements), the optimal policy for the flight trajectory and user service of the unmanned aerial vehicle is finally obtained.
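The action selection and the temporal-difference update above could be implemented roughly as follows, reusing the online and target networks from the earlier sketch (α corresponds to the optimizer's learning rate; this is an illustrative assumption rather than the patent's code):

```python
import numpy as np
import torch
import torch.nn.functional as F

optimizer = torch.optim.Adam(online_net.parameters(), lr=1e-4)   # learning rate from the embodiment
gamma = 0.99                                                     # discount factor from the embodiment


def select_action(state, epsilon):
    """Greedy selection A(t) = argmax_a Q(S(t), a), with probability epsilon of exploring."""
    if torch.rand(1).item() < epsilon:
        return int(torch.randint(n_actions, (1,)).item())
    with torch.no_grad():
        q = online_net(torch.as_tensor(state, dtype=torch.float32).unsqueeze(0))
    return int(q.argmax(dim=1).item())


def dqn_update(batch):
    """Move Q(S(t), A(t)) towards R(t+1) + gamma * max_a Q(S(t+1), a)."""
    states, actions, rewards, next_states, dones = zip(*batch)
    states = torch.as_tensor(np.stack(states), dtype=torch.float32)
    next_states = torch.as_tensor(np.stack(next_states), dtype=torch.float32)
    actions = torch.as_tensor(actions, dtype=torch.int64)
    rewards = torch.as_tensor(rewards, dtype=torch.float32)
    dones = torch.as_tensor(dones, dtype=torch.float32)

    q_sa = online_net(states).gather(1, actions.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        target = rewards + gamma * (1.0 - dones) * target_net(next_states).max(dim=1).values
    loss = F.smooth_l1_loss(q_sa, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```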
The following further describes the embodiments of the present invention with reference to a method flowchart and a system model diagram.
During operation of the system, the DQN neural network parameters and the environment state are first initialized. An action is selected with a greedy strategy (with exploration) and executed, i.e., the unmanned aerial vehicle flies to the target position and completes the service for the selected target; after the decision is completed, the unmanned aerial vehicle enters the next state and obtains a reward, and the state transition data are stored in the memory bank. Once the memory bank is full, data are sampled from it to train the neural network to fit the state-action value function. When the neural network finally converges, the DQN network can be used to guide the decisions of the unmanned aerial vehicle and obtain the optimal system efficiency. The overall flow of the algorithm is given below:
[algorithm flow table images not reproduced]
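Because the algorithm tables are not reproduced, the overall flow described in the preceding paragraph is sketched below, reusing the pieces from the earlier snippets; UavMecEnv stands in for a gym-style environment implementing the system model and is an assumption, as are the loop constants.

```python
env = UavMecEnv()                      # placeholder gym-style environment for the system model
n_episodes, batch_size, target_sync, epsilon = 500, 32, 1000, 0.1   # assumed values
step_count = 0

for episode in range(n_episodes):
    state = env.reset()
    done = False
    while not done:
        action = select_action(state, epsilon)            # greedy selection with exploration
        next_state, r, done, _ = env.step(action)         # fly to target, serve, obtain reward
        memory.push((state, action, r, next_state, float(done)))
        state = next_state
        step_count += 1
        if len(memory.buffer) == memory.buffer.maxlen:    # train once the memory bank is full
            dqn_update(memory.sample(batch_size))
        if step_count % target_sync == 0:                 # periodically refresh the target network
            target_net.load_state_dict(online_net.state_dict())
```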
Example:
In this example, an area of size 500 m × 500 m is considered. The unmanned aerial vehicle provides computation offloading and charging services for the Internet of Things devices in this area from an altitude of 5 m, as shown in Fig. 2. 32 coordinates are randomly generated in the area as the positions of the Internet of Things devices, which also serve as part of the access points, and another 32 coordinates are generated as the remaining access points. The task information of the Internet of Things devices is generated randomly, their energies are generated randomly in the range 0.05-0.2 J, and during a task only the energy consumed by the data transmission of the Internet of Things devices is considered. The specific parameter values of the system model are given in Table 1 below:
TABLE 1 System model parameters
[Table 1 images not reproduced]
The system model simulation environment is implemented in Python and developed on the OpenAI gym module. The learning algorithm is the DQN algorithm from Stable Baselines, the open-source reinforcement learning algorithm library of OpenAI. The discount factor γ of the algorithm is 0.99, the learning rate is 0.0001, the capacity of the memory bank is 10000, the total number of iterations is 2.5 × 10^5, the network structure is 163 × 256 × 256 × 128, and the remaining parameters take their default values.
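With the Stable Baselines DQN implementation named above, the training setup could look roughly like the following; UavMecEnv is again a placeholder for the gym environment of the system model, while the discount factor, learning rate, memory capacity, hidden-layer sizes and iteration count are the values listed in the text.

```python
from stable_baselines import DQN
from stable_baselines.deepq.policies import MlpPolicy

env = UavMecEnv()                      # placeholder gym environment for the system model

model = DQN(
    MlpPolicy,
    env,
    gamma=0.99,                        # discount factor
    learning_rate=1e-4,                # learning rate
    buffer_size=10000,                 # memory bank capacity
    policy_kwargs=dict(layers=[256, 256]),   # hidden layers of the 163 x 256 x 256 x 128 network
    verbose=1,
)
model.learn(total_timesteps=250000)    # 2.5 x 10^5 iterations
model.save("uav_dqn")
```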
Fig. 3 is a simulation plot of the system performance (i.e., the reward) as a function of the number of iterations in this example. It can be seen that in the initial stage the algorithm mainly explores and accumulates experience and the obtained reward is small; as the number of iterations increases, the reward grows and the system performance gradually improves. Because a greedy policy with exploration is used during training, some fluctuation remains in each round, but overall the algorithm converges. With the network structure used, the number of parameters is 140672 and the amount of computation is 280064 operations, comprising 140032 multiplications and 140032 additions; the model is small and the computation is fast.

Claims (8)

1. An unmanned aerial vehicle computation offloading and charging service efficiency optimization method, characterized by comprising the following specific steps:
Step 1: constructing a deep reinforcement learning network based on DQN;
Step 2: sensing the current environment state information and determining the executable actions of the unmanned aerial vehicle through the deep reinforcement learning network constructed in step 1;
Step 3: predicting the value of every action in the current state through the deep reinforcement learning network based on the sensed environment state information, selecting the action with the largest value, and thereby determining the flight target position of the unmanned aerial vehicle, the service target of the unmanned aerial vehicle, and the service type decision for the Internet of Things device, the service type comprising a computation offloading service and a charging service, while calculating the energy consumption of the unmanned aerial vehicle;
Step 4: calculating the reward function and iteratively updating the DQN deep reinforcement learning network.
2. The unmanned aerial vehicle computation offloading and charging service efficiency optimization method according to claim 1, characterized in that: in step 1, a DQN algorithm is constructed comprising a neural network structure with a target (real) network and an estimation network, the input of both networks being the current environment state and the action taken by the unmanned aerial vehicle, and the output being the value of the environment state-action pair; at the same time, a memory bank is constructed to store the data generated by the interaction between the unmanned aerial vehicle and the environment, including the environment state information data and the unmanned aerial vehicle action data generated by each interaction.
3. The unmanned aerial vehicle computation offloading and charging service efficiency optimization method according to claim 1, characterized in that: in step 2, the current environment state information comprises, as the state information, the current position of each Internet of Things device, the current position of the unmanned aerial vehicle, the current remaining energy of each Internet of Things device, the current remaining energy of the unmanned aerial vehicle, the computation task size of each Internet of Things device, and the channel state between the unmanned aerial vehicle and each Internet of Things device;
the executable actions of the unmanned aerial vehicle comprise: selection of the flight target position, selection of the service target, and selection of the service type.
4. The unmanned aerial vehicle computation offloading and charging service efficiency optimization method according to claim 3, characterized in that: in step 2, the environment state information data is expressed in the form of the following matrix:
[state matrix image not reproduced]
where S(t) denotes the environment state information at time t; l_1(t), …, l_N(t) denote the positions of the N Internet of Things devices at time t; l_V(t) denotes the position of the unmanned aerial vehicle at time t; E_1(t), …, E_N(t) denote the remaining energies of the N Internet of Things devices at time t; E_V(t) denotes the remaining energy of the unmanned aerial vehicle at time t; D_1(t), …, D_N(t) denote the task sizes, in bits, of the N Internet of Things devices at time t; and g_1(t), …, g_N(t) denote the channel gains between the N Internet of Things devices and the unmanned aerial vehicle at time t;
the executable action data of the unmanned aerial vehicle is expressed in matrix form as:
A(t) = [a_n(t), a_m(t), a_T(t)]
where A(t) denotes the action executed by the unmanned aerial vehicle at time t; a_n(t) ∈ {1, 2, …, N} is the Internet of Things device the unmanned aerial vehicle chooses to serve at time t; a_m(t) ∈ {1, 2, …, M} is the position the unmanned aerial vehicle flies to at time t, i.e., one of M preset access points; and a_T(t) ∈ {0, 1} is the service type selected by the unmanned aerial vehicle at time t, with a_T(t) = 0 denoting that the unmanned aerial vehicle provides the computation offloading service and a_T(t) = 1 denoting that it provides the charging service.
5. The unmanned aerial vehicle computation offloading and charging service efficiency optimization method according to claim 1, characterized in that: in step 3, the energy consumption of the unmanned aerial vehicle comprises the following 4 components:
(1) flight energy consumption: only horizontal flight is considered; the unmanned aerial vehicle flies at a constant horizontal speed V_f at altitude h with flight power P_f, and the flight energy consumed at time t depends on the position at the current time and at the next time; with the position of the unmanned aerial vehicle in the three-dimensional coordinate system at time t denoted l_V(t) = [x_V(t), y_V(t), h], the flight energy consumption e_f(t) of the unmanned aerial vehicle is:
[flight-energy formula image not reproduced]
(2) hovering energy consumption: the energy consumed while the unmanned aerial vehicle hovers at time t to provide the computation offloading service or the charging service to an Internet of Things device, with a constant hovering power P_h; when the unmanned aerial vehicle provides the computation offloading service, the channel is treated as a line-of-sight channel; with the position of Internet of Things device n denoted l_n(t), the channel gain and the data transmission rate between the unmanned aerial vehicle and device n are respectively:
[channel-gain and transmission-rate formula images not reproduced]
where B is the channel bandwidth, ρ_0 is the channel power gain at a reference distance of 1 meter, P_n is the fixed transmission power of Internet of Things device n, and σ² is the Gaussian white noise power; the hovering energy consumption e_h-compute(t) when the unmanned aerial vehicle provides the computation offloading service is:
[formula image not reproduced]
where D_n(t) denotes the task size, in bits, of Internet of Things device n at time t; when the unmanned aerial vehicle provides the charging service, the charging power is:
[formula image not reproduced]
where P_0 is the transmission power of the unmanned aerial vehicle when it provides the charging service and β_0 ∈ (0, 1) is the energy conversion efficiency;
with the full battery capacity of an Internet of Things device denoted E_b and the remaining energy of device n at time t denoted E_n(t), the hovering energy consumption e_h-charge(t) when the unmanned aerial vehicle provides the charging service is:
[formula image not reproduced]
(3) computation energy consumption: at time t, with effective capacitance coefficient γ_c, C CPU cycles required to process 1 bit of data, and CPU frequency f_c of the unmanned aerial vehicle, the computation energy consumption e_compute(t) of the unmanned aerial vehicle is:
[formula image not reproduced]
(4) charging energy consumption e_charge(t) of the unmanned aerial vehicle:
[formula image not reproduced]
the total energy consumption of the unmanned aerial vehicle is therefore:
W(t) = e_f(t) + (1 − a_T(t)) (e_h-compute(t) + e_compute(t)) + a_T(t) (e_h-charge(t) + e_charge(t))
and the remaining energy of the unmanned aerial vehicle is: E_V(t+1) = E_V(t) − W(t).
6. The unmanned aerial vehicle computation offloading and charging service efficiency optimization method according to claim 1, characterized in that: in step 4, the unmanned aerial vehicle obtains a positive reward both when it processes a computation task offloaded by an Internet of Things device and when it charges an Internet of Things device, and the reward function is defined as:
[reward-function formula image not reproduced]
where the first term (image not reproduced) is the positive reward obtained by the unmanned aerial vehicle for processing the computation task offloaded by the Internet of Things device, the second term (image not reproduced) is the positive reward obtained by the unmanned aerial vehicle for charging the Internet of Things device, a_T(t) denotes the service type selected by the unmanned aerial vehicle at time t, and r_penalty denotes the penalty term applied when an Internet of Things device runs out of energy while the unmanned aerial vehicle is working.
7. The unmanned aerial vehicle computation offloading and charging service efficiency optimization method according to any one of claims 1 to 6, characterized in that: in step 4, the reward function is applied as follows: if the remaining energy of the unmanned aerial vehicle falls below the threshold b in any decision period, the next state is a terminal state, and the unmanned aerial vehicle ends its service and returns; if the unmanned aerial vehicle runs out of energy on the way back, the task is not completed and a penalty term r_penalty is added after the reward function; when an Internet of Things device runs out of energy during the service of the unmanned aerial vehicle, a penalty term with a preset value is added after the reward function.
8. The unmanned aerial vehicle computation offloading and charging service efficiency optimization method according to any one of claims 1 to 6, characterized in that: in step 4, the iterative updating method of the DQN deep reinforcement learning network is as follows:
after the action is determined, the unmanned aerial vehicle executes the action, enters the next state, obtains the reward value, and updates the value corresponding to the action; as the neural network is trained, the values of all states and actions gradually converge, the unmanned aerial vehicle selects the action with the largest Q value in each state, and the optimal policy for the flight trajectory and user service of the unmanned aerial vehicle is finally obtained.
CN202111529547.7A 2021-12-14 2021-12-14 Unmanned aerial vehicle computing unloading and charging service efficiency optimization method Pending CN114268986A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111529547.7A CN114268986A (en) 2021-12-14 2021-12-14 Unmanned aerial vehicle computing unloading and charging service efficiency optimization method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111529547.7A CN114268986A (en) 2021-12-14 2021-12-14 Unmanned aerial vehicle computing unloading and charging service efficiency optimization method

Publications (1)

Publication Number Publication Date
CN114268986A true CN114268986A (en) 2022-04-01

Family

ID=80827104

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111529547.7A Pending CN114268986A (en) 2021-12-14 2021-12-14 Unmanned aerial vehicle computing unloading and charging service efficiency optimization method

Country Status (1)

Country Link
CN (1) CN114268986A (en)



Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110488861A (en) * 2019-07-30 2019-11-22 北京邮电大学 Unmanned plane track optimizing method, device and unmanned plane based on deeply study
CN110428115A (en) * 2019-08-13 2019-11-08 南京理工大学 Maximization system benefit method under dynamic environment based on deeply study
US20210165405A1 (en) * 2019-12-03 2021-06-03 University-Industry Cooperation Group Of Kyung Hee University Multiple unmanned aerial vehicles navigation optimization method and multiple unmanned aerial vehicles system using the same
CN111786713A (en) * 2020-06-04 2020-10-16 大连理工大学 Unmanned aerial vehicle network hovering position optimization method based on multi-agent deep reinforcement learning
CN113190039A (en) * 2021-04-27 2021-07-30 大连理工大学 Unmanned aerial vehicle acquisition path planning method based on hierarchical deep reinforcement learning

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115314135A (en) * 2022-08-09 2022-11-08 电子科技大学 Communication perception integrated waveform design method for unmanned aerial vehicle cooperation
CN115314135B (en) * 2022-08-09 2024-06-11 电子科技大学 Unmanned aerial vehicle cooperative communication perception integrated waveform design method
CN115713222A (en) * 2023-01-09 2023-02-24 南京邮电大学 Utility-driven unmanned aerial vehicle sensing network charging scheduling method
CN117856474A (en) * 2024-03-08 2024-04-09 广州国曜科技有限公司 Control method and device of wireless power transmission system
CN117856474B (en) * 2024-03-08 2024-05-14 广州国曜科技有限公司 Control method and device of wireless power transmission system

Similar Documents

Publication Publication Date Title
US11663920B2 (en) Method and device of path optimization for UAV, and storage medium thereof
WO2021017227A1 (en) Path optimization method and device for unmanned aerial vehicle, and storage medium
CN111556461B (en) Vehicle-mounted edge network task distribution and unloading method based on deep Q network
CN111754000A (en) Quality-aware edge intelligent federal learning method and system
CN114268986A (en) Unmanned aerial vehicle computing unloading and charging service efficiency optimization method
CN113395654A (en) Method for task unloading and resource allocation of multiple unmanned aerial vehicles of edge computing system
CN111176820A (en) Deep neural network-based edge computing task allocation method and device
CN115827108B (en) Unmanned aerial vehicle edge calculation unloading method based on multi-target deep reinforcement learning
CN113032904A (en) Model construction method, task allocation method, device, equipment and medium
CN114169234A (en) Scheduling optimization method and system for unmanned aerial vehicle-assisted mobile edge calculation
CN116451934B (en) Multi-unmanned aerial vehicle edge calculation path optimization and dependent task scheduling optimization method and system
CN113132943A (en) Task unloading scheduling and resource allocation method for vehicle-side cooperation in Internet of vehicles
CN113377131B (en) Method for acquiring unmanned aerial vehicle collected data track by using reinforcement learning
CN116489712B (en) Mobile edge computing task unloading method based on deep reinforcement learning
CN113660681A (en) Multi-agent resource optimization method applied to unmanned aerial vehicle cluster auxiliary transmission
Ebrahim et al. A deep learning approach for task offloading in multi-UAV aided mobile edge computing
CN113507717A (en) Unmanned aerial vehicle track optimization method and system based on vehicle track prediction
CN112804103A (en) Intelligent calculation migration method for joint resource allocation and control in block chain enabled Internet of things
CN113987692B (en) Deep neural network partitioning method for unmanned aerial vehicle and edge computing server
CN114840021A (en) Trajectory planning method, device, equipment and medium for data collection of unmanned aerial vehicle
CN111488208B (en) Bian Yun collaborative computing node scheduling optimization method based on variable-step-size bat algorithm
CN116774584A (en) Unmanned aerial vehicle differentiated service track optimization method based on multi-agent deep reinforcement learning
CN116546421A (en) Unmanned aerial vehicle position deployment and minimum energy consumption AWAQ algorithm based on edge calculation
CN111930435A (en) Task unloading decision method based on PD-BPSO technology
CN114942799B (en) Workflow scheduling method based on reinforcement learning in cloud edge environment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination