CN114268986A - Unmanned aerial vehicle computing unloading and charging service efficiency optimization method - Google Patents
Unmanned aerial vehicle computing unloading and charging service efficiency optimization method
- Publication number
- CN114268986A (application CN202111529547.7A)
- Authority
- CN
- China
- Prior art keywords
- unmanned aerial vehicle
- internet of things
- service
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Abstract
The invention discloses an Internet-of-Things-oriented method for optimizing the efficiency of unmanned aerial vehicle computation offloading and charging services, comprising the following steps: constructing a reinforcement learning network in advance; sensing the current environment state; determining the action of the unmanned aerial vehicle through the reinforcement learning network based on the sensed state data; and, based on that action, determining the flight trajectory of the unmanned aerial vehicle and the computation offloading and charging service decisions for the Internet of Things sensors. In the invention, the unmanned aerial vehicle serves both as a mobile edge server providing computation offloading services and as a wireless charging device that charges the Internet of Things devices. By modeling the positions of the Internet of Things sensors, the position of the unmanned aerial vehicle, the residual energy of the sensors, their task information, and the channel states between the unmanned aerial vehicle and the sensors, a deep reinforcement learning algorithm performs trajectory planning and user service decisions, so that the computational complexity is low and the computing capability and working time of the Internet of Things system are effectively improved.
Description
Technical Field
The invention relates to the fields of mobile edge computing and wireless energy transfer, in particular to neural-network-based deep reinforcement learning algorithms, and specifically to a method for optimizing the efficiency of unmanned aerial vehicle computation offloading and charging services.
Background
In recent years, with the development of Internet of Things technology, emerging Internet of Things applications have appeared one after another. However, many Internet of Things devices have limited or even no computing power while facing high computation demands, and their working time is constrained by finite battery energy. By deploying computing resources near the Internet of Things devices, mobile edge computing can effectively meet their traffic demands through task offloading and reduce task delay. Thanks to its high mobility, an unmanned aerial vehicle can act as a mobile edge server to realize task offloading for Internet of Things devices, and it can also act as a wireless charging device to charge them. However, the energy supply of the unmanned aerial vehicle itself is limited, so designing its flight path planning and user service strategy is crucial to maximizing system efficiency.
Disclosure of Invention
In view of the above problems, the invention provides an unmanned aerial vehicle computation offloading and charging service efficiency optimization method that maximizes the long-term reward of the system under the energy constraint of the unmanned aerial vehicle; by appropriately designing the reward function, the flight path and the user service mechanism of the unmanned aerial vehicle are reasonably planned, thereby maximizing system efficiency.
The unmanned aerial vehicle computation offloading and charging service efficiency optimization method of the invention comprises the following specific steps:
step 1: and constructing a deep reinforcement learning network based on the DQN.
Step 2: and (3) sensing the current environment state information, and determining executable actions of the unmanned aerial vehicle through the deep reinforcement learning network constructed in the step 1.
And step 3: based on the current environment state sensing information, the values of all actions in the state are predicted through a deep reinforcement learning network, the action with the maximum value is selected, and then the flight target position of the unmanned aerial vehicle, the unmanned aerial vehicle service target and the service type decision of the Internet of things equipment are determined, wherein the service type decision comprises the calculation of unloading service and charging service, and meanwhile, the energy consumption of the unmanned aerial vehicle is calculated.
And 4, step 4: and calculating a reward function, and iteratively updating the DQN deep reinforcement learning network.
The invention has the following advantages:
In the unmanned aerial vehicle computation offloading and charging service efficiency optimization method, the long-term reward of the system is maximized under the energy constraint of the unmanned aerial vehicle, and the flight path and user service mechanism are reasonably planned by appropriately designing the reward function. The service process is modeled as a Markov decision process, an intelligent decision method based on deep reinforcement learning is designed, and the flight path, user allocation, and service type of the unmanned aerial vehicle are jointly optimized, so that the unmanned aerial vehicle can make decisions quickly from the environment state information, overcoming the high computational cost of traditional algorithms and reducing the decision cost.
Drawings
Fig. 1 is a flowchart of the unmanned aerial vehicle computation offloading and charging service efficiency optimization method of the present invention;
fig. 2 is a diagram of the system model used in the method;
fig. 3 is a simulation plot of the cumulative reward as a function of the number of iterations in an embodiment of the method.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings.
The invention discloses an unmanned aerial vehicle computation offloading and charging service efficiency optimization method which, as shown in Fig. 1, comprises the following steps:
step 1: and constructing a deep reinforcement learning network based on the DQN, and initializing parameters of the DQN neural network.
(1) And (3) constructing a DQN algorithm, namely a neural network structure including a real network and an estimation network, wherein the neural network structure and the real network structure are the same but have different parameters, the input is the current environmental state and the action taken by the unmanned aerial vehicle, and the output is the value of the environmental state-action pair.
(2) And constructing a memory base to store data generated by interaction between the unmanned aerial vehicle and the environment, wherein the data comprises environment state information data, unmanned aerial vehicle action data and the like generated by each interaction, and randomly extracting a part of data in the memory base for updating when updating the estimated network parameters in the DQN algorithm every time so as to break the correlation and unsteady distribution problems among the data.
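As a minimal illustration of the memory bank described above, the following Python sketch (class and method names are our own, not from the patent) stores interaction tuples in a fixed-size buffer and draws uniform random mini-batches:

```python
import random
from collections import deque

class ReplayMemory:
    """Fixed-size memory bank of (state, action, reward, next_state) transitions.

    Uniform random sampling breaks the temporal correlation between
    consecutive interaction samples, as described in the text.
    """
    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)  # oldest transitions are evicted first

    def store(self, state, action, reward, next_state):
        self.buffer.append((state, action, reward, next_state))

    def sample(self, batch_size):
        # Uniformly sample a mini-batch for one update of the estimation network
        return random.sample(self.buffer, batch_size)

    def __len__(self):
        return len(self.buffer)
```

Training only begins once the buffer holds enough transitions, after which each gradient step draws an independent batch rather than the most recent trajectory.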
Step 2: and (3) sensing the current environment state information, and determining the action of the unmanned aerial vehicle through the deep reinforcement learning network constructed in the step (1), wherein the action comprises unmanned aerial vehicle flight target position selection, service target selection and service type selection.
Sensing current position information of each Internet of things device, current position information of the unmanned aerial vehicle, current residual power information of each Internet of things device, current residual power information of the unmanned aerial vehicle, calculation task size information (namely the number of data bits required to be processed) of each Internet of things device, and channel state information between the unmanned aerial vehicle and each Internet of things device by using an unmanned aerial vehicle airborne sensor as state information. The environmental state information data is expressed in the form of the following matrix:
wherein S (t) represents environmental status information at time t,respectively representing the positions of N pieces of Internet of things equipment at the time t; lV(t) represents the position of the drone at time t;respectively representing the residual electric quantity of N pieces of Internet of things equipment at the time t; eV(t) represents the remaining capacity of the unmanned aerial vehicle at the moment t;respectively representing the task sizes of N pieces of Internet of things equipment at t moment, wherein the unit is bit;respectively represent the channel gain between N thing networking device and the unmanned aerial vehicle at moment t.
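Assuming the quantities above are simply concatenated into the network input (the flattening order and function name here are our own convention), the state can be packed as follows; with N = 32 devices the length works out to 64 + 2 + 32 + 1 + 32 + 32 = 163, matching the 163-unit input layer reported in the embodiment:

```python
import numpy as np

def build_state(dev_pos, uav_pos, dev_energy, uav_energy, task_bits, gains):
    """Flatten the per-device quantities into one observation vector S(t).

    dev_pos: (N, 2) device coordinates; uav_pos: (2,); dev_energy, task_bits
    and gains are length-N arrays; uav_energy is a scalar.
    """
    return np.concatenate([
        np.asarray(dev_pos, dtype=np.float32).ravel(),   # l_1 .. l_N
        np.asarray(uav_pos, dtype=np.float32).ravel(),   # l_V
        np.asarray(dev_energy, dtype=np.float32).ravel(),  # E_1 .. E_N
        np.asarray([uav_energy], dtype=np.float32),      # E_V
        np.asarray(task_bits, dtype=np.float32).ravel(), # D_1 .. D_N
        np.asarray(gains, dtype=np.float32).ravel(),     # g_1 .. g_N
    ]).astype(np.float32)
```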
Based on the perception state data, determining the action of the unmanned aerial vehicle through a deep reinforcement learning network, and the method comprises the following steps: unmanned aerial vehicle flight target position selection, service target selection and service type selection.
The action data is expressed in matrix form:
A(t)=[an(t),am(t),aT(t)]
where A(t) denotes the action executed by the unmanned aerial vehicle at time t; a_n(t) ∈ {1, 2, …, N} is the Internet of Things device selected for service at time t; a_m(t) ∈ {1, 2, …, M} is the position to which the unmanned aerial vehicle flies at time t, i.e., one of M preset access points; and a_T(t) ∈ {0, 1} is the service type selected at time t, where a_T(t) = 0 means the unmanned aerial vehicle provides computation offloading service and a_T(t) = 1 means it provides charging service. The unmanned aerial vehicle therefore has N × M × 2 executable actions in total.
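Since a DQN head outputs one value per discrete action, the triple A(t) must be flattened into a single action index. One possible encoding (a zero-based convention of our own, not from the patent) is:

```python
def encode_action(n, m, t, N, M):
    """Map (service target n, flight position m, service type t) to a flat index.

    n in [0, N): IoT device index; m in [0, M): access-point index;
    t in {0, 1}: 0 = computation offloading, 1 = charging.
    With a_n, a_m and a_T as defined above, the flat space has N*M*2 entries.
    """
    return (n * M + m) * 2 + t

def decode_action(idx, N, M):
    """Inverse of encode_action: recover (n, m, t) from the greedy argmax index."""
    t = idx % 2
    m = (idx // 2) % M
    n = idx // (2 * M)
    return n, m, t
```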
Step 3: based on the sensed current environment state information, predict the value of every action in that state through the deep reinforcement learning network and select the action with the largest value, thereby determining the flight target position of the unmanned aerial vehicle, the service target, and the service type decision (computation offloading service or charging service) for the Internet of Things device, while also calculating the energy consumption of the unmanned aerial vehicle.
The energy consumption of the unmanned aerial vehicle comprises the following 4 components:
(1) Flight energy consumption: only horizontal flight of the unmanned aerial vehicle is considered. The unmanned aerial vehicle flies at a constant horizontal speed V_f at altitude h with flight power P_f. The flight energy consumption at time t is determined by the current position and the next position; with the position state of the unmanned aerial vehicle in the three-dimensional coordinate system at time t denoted l_V(t) = [x_V(t), y_V(t), h], the flight energy consumption is:
e_f(t) = P_f ||l_V(t+1) − l_V(t)|| / V_f
(2) Hovering energy consumption: the energy consumed while the unmanned aerial vehicle hovers at time t to provide computation offloading or charging service to an Internet of Things device, with constant hovering power P_h. When the unmanned aerial vehicle provides computation offloading service, the channel is treated as a line-of-sight channel. With the position state of Internet of Things device n denoted l_n(t) and d_n(t) the distance between the unmanned aerial vehicle and device n, the channel gain and the data transmission rate between them are respectively:
g_n(t) = p_0 / d_n(t)²
r_n(t) = B log₂(1 + P_n g_n(t) / σ²)
where B is the channel bandwidth, p_0 is the channel power gain at a reference distance of 1 meter, P_n is the fixed transmission power of Internet of Things device n, and σ² is the Gaussian white noise power. The hovering energy consumption when the unmanned aerial vehicle provides computation offloading service is:
e_h-compute(t) = P_h D_n(t) / r_n(t)
where D_n(t) denotes the task size, in bits, of Internet of Things device n at time t. When the unmanned aerial vehicle provides charging service, the received charging power is:
P_c,n(t) = β_0 P_0 g_n(t)
where P_0 is the transmission power of the unmanned aerial vehicle when providing charging service and β_0 ∈ (0, 1) is the energy conversion efficiency.
Considering that the full battery capacity of an Internet of Things device is E_b, the hovering energy consumption when the unmanned aerial vehicle provides charging service is:
e_h-charge(t) = P_h (E_b − E_n(t)) / P_c,n(t)
where E_n(t) is the residual energy of Internet of Things device n at time t.
(3) Computation energy consumption: at time t, with effective capacitance coefficient γ_c, C CPU cycles required to process 1 bit of data, and CPU frequency f_c, the computation energy consumption of the unmanned aerial vehicle is:
e_compute(t) = γ_c C D_n(t) f_c²
(4) The charging energy consumption of the unmanned aerial vehicle (transmission power P_0 over the charging duration) is:
e_charge(t) = P_0 (E_b − E_n(t)) / P_c,n(t)
Therefore, the total energy consumption of the unmanned aerial vehicle is:
W(t) = e_f(t) + (1 − a_T(t))(e_h-compute(t) + e_compute(t)) + a_T(t)(e_h-charge(t) + e_charge(t))
The residual energy of the unmanned aerial vehicle is then E_V(t+1) = E_V(t) − W(t). The residual energy, as part of the environment state sensed by the unmanned aerial vehicle, is strongly correlated with the reward function and helps the deep reinforcement learning network fit the state-action value more accurately.
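The energy bookkeeping above can be sketched in Python as follows. The formulas follow the standard models the text describes (flight energy = power × distance/speed, CPU energy = γ_c·C·D·f_c²); all function and parameter names are our own illustrative choices:

```python
import math

def flight_energy(p_f, v_f, pos_now, pos_next):
    """e_f = P_f * (distance / V_f): constant-power horizontal flight at speed V_f."""
    return p_f * math.dist(pos_now, pos_next) / v_f

def offload_hover_energy(p_h, task_bits, rate):
    """Hover at power P_h while the device uploads D_n(t) bits at rate r_n(t)."""
    return p_h * task_bits / rate

def compute_energy(gamma_c, cycles_per_bit, task_bits, f_c):
    """CPU model: e = gamma_c * C * D * f_c^2 (effective capacitance gamma_c)."""
    return gamma_c * cycles_per_bit * task_bits * f_c ** 2

def total_energy(e_f, a_t, e_h_compute, e_compute, e_h_charge, e_charge):
    """W(t) from the text: offloading terms when a_T = 0, charging terms when a_T = 1."""
    return e_f + (1 - a_t) * (e_h_compute + e_compute) + a_t * (e_h_charge + e_charge)
```

Subtracting `total_energy(...)` from the current battery level gives the E_V(t+1) fed back into the next state.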
Step 4: calculate the reward function and iteratively update the DQN deep reinforcement learning network.
The unmanned aerial vehicle obtains a positive reward for processing the computation tasks offloaded by the Internet of Things devices:
The unmanned aerial vehicle also obtains a positive reward for charging the Internet of Things devices:
Combining the two positive rewards, the reward function can be defined as:
where r_penalty denotes the penalty term applied when an Internet of Things device runs out of energy while the unmanned aerial vehicle is working. In addition, if the residual energy of the unmanned aerial vehicle falls below a threshold b in any decision period, the next state is a terminal state and the unmanned aerial vehicle ends its service and returns. If the unmanned aerial vehicle runs out of energy on the return trip, the task is not completed and a sufficiently large penalty term r_penalty is appended to the reward function; for example, when the number of Internet of Things devices is N, r_penalty can be set to a value greater than or equal to 2N. When an Internet of Things device runs out of energy during the unmanned aerial vehicle's service, a penalty term with a preset value is likewise appended to the reward function.
At each time t the unmanned aerial vehicle is in some environment state S(t); every executable action a has a state-action value (Q value), and the current decision selects the action with the largest Q value, namely
A(t) ← argmax_a Q(S(t), a)
After determining the action, the unmanned aerial vehicle executes it, enters the next state S(t+1), and obtains a reward R(t+1); meanwhile, the Q value corresponding to the action is updated in order to update the neural network:
Q(S(t), A(t)) ← Q(S(t), A(t)) + α[R(t+1) + γ max_a Q(S(t+1), a) − Q(S(t), A(t))]
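The tabular form of this update rule can be sketched as below; in the DQN itself the same target R + γ·max_a Q(S′, a) is fitted by the estimation network instead of stored in a table, and all names here are illustrative:

```python
def q_update(q, state, action, reward, next_state, alpha, gamma, terminal=False):
    """One step of Q(S,A) <- Q(S,A) + alpha*[R + gamma*max_a Q(S',a) - Q(S,A)].

    q is a dict mapping (state, action) -> value; unseen pairs default to 0.
    """
    next_vals = [v for (s, a), v in q.items() if s == next_state]
    best_next = 0.0 if (terminal or not next_vals) else max(next_vals)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q[(state, action)]
```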
As the neural network trains, the state-action value Q gradually converges and the unmanned aerial vehicle selects the action with the largest Q value in each state. If the average reward over 100 consecutive episodes (100 runs) exceeds a preset value (chosen according to actual requirements), the optimal policy for the unmanned aerial vehicle flight trajectory and user service has been obtained.
The embodiments of the present invention are described in further detail below with reference to the method flowchart and the system model diagram.
During system operation, the DQN neural network parameters and the environment state are first initialized. Actions are selected with an ε-greedy strategy and executed: the unmanned aerial vehicle flies to the target position and completes the service for the selected target. After each decision, the unmanned aerial vehicle enters the next state and obtains a reward, and the state transition data are stored in the memory bank. Once the memory bank is full, data are sampled from it to train the neural network to fit the state-action value function. When the neural network finally converges, the DQN network can guide the unmanned aerial vehicle's decisions to obtain the optimal system efficiency. The overall algorithm flow is given below:
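The action-selection step of this loop — explore with probability ε, otherwise take the argmax of the network's predicted state-action values — can be sketched as (function name ours):

```python
import random

def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick a uniformly random action index with probability epsilon,
    otherwise return the index of the largest predicted Q value."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)
```

During training ε is typically annealed toward a small value, which explains the residual episode-to-episode fluctuation noted in the simulation results.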
example (b):
In this example, a 500 m × 500 m area is considered, and the unmanned aerial vehicle provides computation offloading and charging services for the Internet of Things devices in the area from an altitude of 5 m, as shown in Fig. 2. 32 coordinates are randomly generated in the area as the Internet of Things device positions (which also serve as part of the access points), and another 32 coordinates are generated as additional access points. The task information of the Internet of Things devices is randomly generated, their energy is randomly generated in the range 0.05-0.2 J, and only the energy consumed by data transmission is considered during a task. The specific parameter values of the system model are listed in Table 1 below:
TABLE 1 System model parameters
The system model simulation environment is implemented in Python and developed on the OpenAI gym module; the learning algorithm is the DQN implementation in the open-source reinforcement learning library Stable Baselines (based on OpenAI Baselines). The discount factor γ is 0.99, the learning rate is 0.0001, the memory bank capacity is 10000, the total number of iterations is 2.5 × 10^5, the network structure is 163 × 256 × 256 × 128, and the remaining parameters take default values.
Fig. 3 shows the system performance (i.e., the reward) as a function of the number of iterations in this example. In the initial stage the algorithm explores to accumulate experience and the obtained reward is small; as the number of iterations increases the reward gradually grows and system performance improves. Because an ε-greedy policy is used during training, some fluctuation remains between episodes, but the algorithm converges overall. Under the network structure used, the number of parameters is 140672 and the computation amount is 280064 operations (140032 multiplications and 140032 additions), so the model is small and the computation is fast.
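The stated parameter and operation counts can be checked directly from the 163 × 256 × 256 × 128 layer sizes: each layer pair contributes weights (one multiplication each per forward pass) plus biases. A small helper (our own) makes the arithmetic explicit:

```python
def mlp_counts(layers):
    """Return (parameters, multiplications) for a fully-connected network.

    Each consecutive layer pair (i, o) contributes i*o weights + o biases to
    the parameter count, and i*o multiplications per forward pass.
    """
    weights = sum(i * o for i, o in zip(layers, layers[1:]))
    biases = sum(layers[1:])
    return weights + biases, weights
```

For [163, 256, 256, 128] this gives 140032 weights + 640 biases = 140672 parameters and 140032 multiplications, matching the figures quoted above.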
Claims (8)
1. An unmanned aerial vehicle computation offloading and charging service efficiency optimization method, characterized by comprising the following specific steps:
step 1: constructing a DQN-based deep reinforcement learning network;
step 2: sensing the current environment state information and determining the executable actions of the unmanned aerial vehicle through the deep reinforcement learning network constructed in step 1;
step 3: predicting the value of every action in the current state through the deep reinforcement learning network based on the sensed current environment state information, selecting the action with the largest value, and thereby determining the flight target position of the unmanned aerial vehicle, the service target, and the service type decision for the Internet of Things device, the service type decision comprising computation offloading service and charging service, while also calculating the energy consumption of the unmanned aerial vehicle;
step 4: calculating the reward function and iteratively updating the DQN deep reinforcement learning network.
2. The unmanned aerial vehicle computation offloading and charging service efficiency optimization method according to claim 1, characterized in that: in step 1, a DQN algorithm is constructed whose neural networks comprise a target network and an estimation network of identical structure; the input of both is the current environment state and the action taken by the unmanned aerial vehicle, and the output is the value of the environment state-action pair; meanwhile, a memory bank is constructed to store the data generated by the interaction between the unmanned aerial vehicle and the environment, including the environment state information and unmanned aerial vehicle action data generated by each interaction.
3. The unmanned aerial vehicle computation offloading and charging service efficiency optimization method according to claim 1, characterized in that: in step 2, the current environment state information comprises, as the state information, the current position of each Internet of Things device, the current position of the unmanned aerial vehicle, the current residual energy of each Internet of Things device, the current residual energy of the unmanned aerial vehicle, the computation task size of each Internet of Things device, and the channel state between the unmanned aerial vehicle and each Internet of Things device;
the executable actions of the unmanned aerial vehicle comprise: flight target position selection, service target selection, and service type selection.
4. The unmanned aerial vehicle computation offloading and charging service efficiency optimization method according to claim 3, characterized in that: in step 2, the environment state information is expressed as the following matrix:
S(t) = [l_1(t), …, l_N(t), l_V(t), E_1(t), …, E_N(t), E_V(t), D_1(t), …, D_N(t), g_1(t), …, g_N(t)]
where S(t) denotes the environment state information at time t; l_1(t), …, l_N(t) denote the positions of the N Internet of Things devices at time t; l_V(t) denotes the position of the unmanned aerial vehicle at time t; E_1(t), …, E_N(t) denote the residual energy of the N Internet of Things devices at time t; E_V(t) denotes the residual energy of the unmanned aerial vehicle at time t; D_1(t), …, D_N(t) denote the task sizes (in bits) of the N Internet of Things devices at time t; and g_1(t), …, g_N(t) denote the channel gains between the N Internet of Things devices and the unmanned aerial vehicle at time t;
the executable action data of the unmanned aerial vehicle are expressed in matrix form:
A(t) = [a_n(t), a_m(t), a_T(t)]
where A(t) denotes the action executed by the unmanned aerial vehicle at time t; a_n(t) ∈ {1, 2, …, N} is the Internet of Things device selected for service at time t; a_m(t) ∈ {1, 2, …, M} is the position to which the unmanned aerial vehicle flies at time t, i.e., one of M preset access points; and a_T(t) ∈ {0, 1} is the service type selected at time t, where a_T(t) = 0 means the unmanned aerial vehicle provides computation offloading service and a_T(t) = 1 means it provides charging service.
5. The unmanned aerial vehicle computation offloading and charging service efficiency optimization method according to claim 1, characterized in that: in step 3, the energy consumption of the unmanned aerial vehicle comprises the following 4 components:
(1) flight energy consumption: only horizontal flight of the unmanned aerial vehicle is considered; the unmanned aerial vehicle flies at a constant horizontal speed V_f at altitude h with flight power P_f; the flight energy consumption at time t is determined by the current position and the next position, and with the position state of the unmanned aerial vehicle in the three-dimensional coordinate system at time t denoted l_V(t) = [x_V(t), y_V(t), h], the flight energy consumption is:
e_f(t) = P_f ||l_V(t+1) − l_V(t)|| / V_f
(2) hovering energy consumption: the energy consumed while the unmanned aerial vehicle hovers at time t to provide computation offloading or charging service to an Internet of Things device, with constant hovering power P_h; when the unmanned aerial vehicle provides computation offloading service, the channel is treated as a line-of-sight channel, and with the position state of Internet of Things device n denoted l_n(t) and d_n(t) the distance between the unmanned aerial vehicle and device n, the channel gain and the data transmission rate between them are respectively:
g_n(t) = p_0 / d_n(t)²
r_n(t) = B log₂(1 + P_n g_n(t) / σ²)
where B is the channel bandwidth, p_0 is the channel power gain at a reference distance of 1 meter, P_n is the fixed transmission power of Internet of Things device n, and σ² is the Gaussian white noise power; the hovering energy consumption when the unmanned aerial vehicle provides computation offloading service is:
e_h-compute(t) = P_h D_n(t) / r_n(t)
where D_n(t) denotes the task size, in bits, of Internet of Things device n at time t; when the unmanned aerial vehicle provides charging service, the received charging power is:
P_c,n(t) = β_0 P_0 g_n(t)
where P_0 is the transmission power of the unmanned aerial vehicle when providing charging service and β_0 ∈ (0, 1) is the energy conversion efficiency;
considering that the full battery capacity of an Internet of Things device is E_b and E_n(t) is the residual energy of Internet of Things device n at time t, the hovering energy consumption when the unmanned aerial vehicle provides charging service is:
e_h-charge(t) = P_h (E_b − E_n(t)) / P_c,n(t)
(3) computation energy consumption: at time t, with effective capacitance coefficient γ_c, C CPU cycles required to process 1 bit of data, and CPU frequency f_c, the computation energy consumption of the unmanned aerial vehicle is:
e_compute(t) = γ_c C D_n(t) f_c²
(4) the charging energy consumption of the unmanned aerial vehicle is:
e_charge(t) = P_0 (E_b − E_n(t)) / P_c,n(t)
therefore, the total energy consumption of the unmanned aerial vehicle is:
W(t) = e_f(t) + (1 − a_T(t))(e_h-compute(t) + e_compute(t)) + a_T(t)(e_h-charge(t) + e_charge(t))
and the residual energy of the unmanned aerial vehicle is: E_V(t+1) = E_V(t) − W(t).
6. The unmanned aerial vehicle computation offloading and charging service efficiency optimization method according to claim 1, characterized in that: in step 4, the unmanned aerial vehicle obtains a positive reward both for processing the computation tasks offloaded by the Internet of Things devices and for charging the Internet of Things devices, and the reward function is defined accordingly:
where the first term is the positive reward obtained by the unmanned aerial vehicle for processing the computation tasks offloaded by the Internet of Things devices, the second term is the positive reward obtained for charging the Internet of Things devices, a_T(t) denotes the service type selected by the unmanned aerial vehicle at time t, and r_penalty denotes the penalty term applied when an Internet of Things device runs out of energy while the unmanned aerial vehicle is working.
7. The unmanned aerial vehicle computation offloading and charging service efficiency optimization method according to any one of claims 1 to 6, characterized in that: in step 4, the reward function is applied as follows: if the residual energy of the unmanned aerial vehicle falls below the threshold b in any decision period, the next state is a terminal state and the unmanned aerial vehicle ends its service and returns; if the unmanned aerial vehicle runs out of energy on the return trip, the task is not completed and a penalty term r_penalty is appended to the reward function; when an Internet of Things device runs out of energy during the unmanned aerial vehicle's service, a penalty term with a preset value is appended to the reward function.
8. The method of optimizing computation offloading and charging service efficiency for an unmanned aerial vehicle as claimed in any one of claims 1 to 6, wherein in step 4 the DQN deep reinforcement learning network is iteratively updated as follows:
After an action is determined, the unmanned aerial vehicle executes it, enters the next state, obtains a reward value, and updates the value corresponding to that action. As the neural network is trained, the values of all state-action pairs gradually converge; the unmanned aerial vehicle then selects, in each state, the action with the maximum Q value, finally yielding the optimal policy for the unmanned aerial vehicle's flight trajectory and user service.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111529547.7A CN114268986A (en) | 2021-12-14 | 2021-12-14 | Unmanned aerial vehicle computing unloading and charging service efficiency optimization method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114268986A true CN114268986A (en) | 2022-04-01 |
Family
ID=80827104
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111529547.7A Pending CN114268986A (en) | 2021-12-14 | 2021-12-14 | Unmanned aerial vehicle computing unloading and charging service efficiency optimization method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114268986A (en) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115314135A (en) * | 2022-08-09 | 2022-11-08 | 电子科技大学 | Communication perception integrated waveform design method for unmanned aerial vehicle cooperation |
CN115314135B (en) * | 2022-08-09 | 2024-06-11 | 电子科技大学 | Unmanned aerial vehicle cooperative communication perception integrated waveform design method |
CN115713222A (en) * | 2023-01-09 | 2023-02-24 | 南京邮电大学 | Utility-driven unmanned aerial vehicle sensing network charging scheduling method |
CN117856474A (en) * | 2024-03-08 | 2024-04-09 | 广州国曜科技有限公司 | Control method and device of wireless power transmission system |
CN117856474B (en) * | 2024-03-08 | 2024-05-14 | 广州国曜科技有限公司 | Control method and device of wireless power transmission system |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110428115A (en) * | 2019-08-13 | 2019-11-08 | 南京理工大学 | Maximization system benefit method under dynamic environment based on deeply study |
CN110488861A (en) * | 2019-07-30 | 2019-11-22 | 北京邮电大学 | Unmanned plane track optimizing method, device and unmanned plane based on deeply study |
CN111786713A (en) * | 2020-06-04 | 2020-10-16 | 大连理工大学 | Unmanned aerial vehicle network hovering position optimization method based on multi-agent deep reinforcement learning |
US20210165405A1 (en) * | 2019-12-03 | 2021-06-03 | University-Industry Cooperation Group Of Kyung Hee University | Multiple unmanned aerial vehicles navigation optimization method and multiple unmanned aerial vehicles system using the same |
CN113190039A (en) * | 2021-04-27 | 2021-07-30 | 大连理工大学 | Unmanned aerial vehicle acquisition path planning method based on hierarchical deep reinforcement learning |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11663920B2 (en) | Method and device of path optimization for UAV, and storage medium thereof | |
WO2021017227A1 (en) | Path optimization method and device for unmanned aerial vehicle, and storage medium | |
CN111556461B (en) | Vehicle-mounted edge network task distribution and unloading method based on deep Q network | |
CN111754000A (en) | Quality-aware edge intelligent federal learning method and system | |
CN114268986A (en) | Unmanned aerial vehicle computing unloading and charging service efficiency optimization method | |
CN113395654A (en) | Method for task unloading and resource allocation of multiple unmanned aerial vehicles of edge computing system | |
CN111176820A (en) | Deep neural network-based edge computing task allocation method and device | |
CN115827108B (en) | Unmanned aerial vehicle edge calculation unloading method based on multi-target deep reinforcement learning | |
CN113032904A (en) | Model construction method, task allocation method, device, equipment and medium | |
CN114169234A (en) | Scheduling optimization method and system for unmanned aerial vehicle-assisted mobile edge calculation | |
CN116451934B (en) | Multi-unmanned aerial vehicle edge calculation path optimization and dependent task scheduling optimization method and system | |
CN113132943A (en) | Task unloading scheduling and resource allocation method for vehicle-side cooperation in Internet of vehicles | |
CN113377131B (en) | Method for acquiring unmanned aerial vehicle collected data track by using reinforcement learning | |
CN116489712B (en) | Mobile edge computing task unloading method based on deep reinforcement learning | |
CN113660681A (en) | Multi-agent resource optimization method applied to unmanned aerial vehicle cluster auxiliary transmission | |
Ebrahim et al. | A deep learning approach for task offloading in multi-UAV aided mobile edge computing | |
CN113507717A (en) | Unmanned aerial vehicle track optimization method and system based on vehicle track prediction | |
CN112804103A (en) | Intelligent calculation migration method for joint resource allocation and control in block chain enabled Internet of things | |
CN113987692B (en) | Deep neural network partitioning method for unmanned aerial vehicle and edge computing server | |
CN114840021A (en) | Trajectory planning method, device, equipment and medium for data collection of unmanned aerial vehicle | |
CN111488208B (en) | Bian Yun collaborative computing node scheduling optimization method based on variable-step-size bat algorithm | |
CN116774584A (en) | Unmanned aerial vehicle differentiated service track optimization method based on multi-agent deep reinforcement learning | |
CN116546421A (en) | Unmanned aerial vehicle position deployment and minimum energy consumption AWAQ algorithm based on edge calculation | |
CN111930435A (en) | Task unloading decision method based on PD-BPSO technology | |
CN114942799B (en) | Workflow scheduling method based on reinforcement learning in cloud edge environment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||