CN114222251A - Adaptive network forming and track optimizing method for multiple unmanned aerial vehicles - Google Patents
- Publication number
- CN114222251A
- Authority
- CN
- China
- Prior art keywords
- unmanned aerial
- reward
- aerial vehicle
- drone
- strategy
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W4/00—Services specially adapted for wireless communication networks; Facilities therefor
- H04W4/02—Services making use of location information
- H04W4/029—Location-based management or tracking services
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W24/00—Supervisory, monitoring or testing arrangements
- H04W24/02—Arrangements for optimising operational condition
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W24/00—Supervisory, monitoring or testing arrangements
- H04W24/10—Scheduling measurement reports ; Arrangements for measurement reports
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W28/00—Network traffic management; Network resource management
- H04W28/02—Traffic management, e.g. flow control or congestion control
- H04W28/0278—Traffic management, e.g. flow control or congestion control using buffer status reports
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W4/00—Services specially adapted for wireless communication networks; Facilities therefor
- H04W4/30—Services specially adapted for particular environments, situations or purposes
- H04W4/40—Services specially adapted for particular environments, situations or purposes for vehicles, e.g. vehicle-to-pedestrians [V2P]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W52/00—Power management, e.g. TPC [Transmission Power Control], power saving or power classes
- H04W52/02—Power saving arrangements
- H04W52/0209—Power saving arrangements in terminal devices
- H04W52/0261—Power saving arrangements in terminal devices managing power supply demand, e.g. depending on battery level
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D30/00—Reducing energy consumption in communication networks
- Y02D30/70—Reducing energy consumption in communication networks in wireless communication networks
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Traffic Control Systems (AREA)
- Mobile Radio Communication Systems (AREA)
Abstract
The invention discloses an adaptive network formation and trajectory optimization method for multiple unmanned aerial vehicles (UAVs), comprising the following steps: based on a heuristic method, adjusting the network formation according to the energy consumption and data buffer state of the UAVs to obtain an adaptive network formation strategy; and, based on a multi-agent reinforcement learning method and in combination with the adaptive network formation strategy, jointly optimizing the trajectories of the multiple UAVs. The method cooperatively optimizes the trajectory strategy and the network formation strategy, makes full use of the advantages of a cooperative multi-UAV network, and addresses the difficulty that ground user equipment has in transmitting data. The method can be widely applied in the field of wireless communication.
Description
Technical Field
The invention relates to the field of wireless communication, in particular to a method for self-adaptive network forming and track optimization of multiple unmanned aerial vehicles.
Background
UAV-assisted wireless communication networks are regarded as a promising solution to many of the problems that arise as the Internet of Things (IoT) develops. As the IoT scales up, problems such as limited energy supply, remote locations, and non-line-of-sight obstacles can be addressed by deploying flying UAVs in the wireless network to assist IoT users with data transmission. However, existing work on single-UAV and multi-UAV assisted systems mainly plans the UAV flight path or treats the UAV control problem in isolation, and does not consider the coupling between the UAV trajectories and the network connections among multiple UAVs.
Disclosure of Invention
The invention aims to provide an adaptive network formation and trajectory optimization method for multiple unmanned aerial vehicles that cooperatively optimizes the trajectory strategy and the network formation strategy, makes full use of the advantages of a cooperative multi-UAV network, and addresses the difficulty that ground user equipment has in transmitting data.
The first technical scheme adopted by the invention is as follows: an adaptive network formation and trajectory optimization method for multiple unmanned aerial vehicles, comprising the following steps:
based on a heuristic method, adjusting the network formation according to the energy consumption and data buffer state of the unmanned aerial vehicles to obtain an adaptive network formation strategy;
based on a multi-agent reinforcement learning method and in combination with the adaptive network formation strategy, jointly optimizing the trajectories of the multiple unmanned aerial vehicles.
Further, the step of adjusting the network formation according to the energy consumption and data buffer state of the unmanned aerial vehicles based on a heuristic method to obtain an adaptive network formation strategy specifically comprises:
in each transmission sub-slot $t$, each unmanned aerial vehicle reports its current state to the base station;
the current state comprises the location $l_i(t)$, the network formation strategy $(\Phi(t), \Psi_k(t))$, the energy consumption $E_i(t)$, and the buffer information $D_i(t)$;
once the base station has collected the state information of all unmanned aerial vehicles, it adjusts the network formation matrices $(\Phi(t), \Psi_k(t))$ with the goal of balancing the energy consumption and queue sizes of the unmanned aerial vehicles;
the base station evaluates the cost function $c_j(t)$ of each drone in each time slot $t$, and when the cost function of the $i$-th drone keeps increasing and exceeds a threshold, the $i$-th drone is allowed to connect to the nearby drone with the minimum cost $c_j(t)$.
Further, the method also comprises:
when a connection is judged to be unfavorable for data collection by the base station, prohibiting the connection between the unmanned aerial vehicles concerned.
Further, the step of jointly optimizing the trajectories of the multiple drones based on a multi-agent reinforcement learning method in combination with the adaptive network formation strategy specifically comprises:
for the multi-UAV system, defining the joint observation $s(t)$ of the states $s_i(t)$ of all UAVs and the actions $a_i(t)$;
the $i$-th UAV takes action $a_i(t)$ in state $s(t)$ in time slot $t$ and obtains a reward $R_i(s(t), a_i(t))$;
performing trajectory optimization according to the reward $R_i(s(t), a_i(t))$.
Further, the reward comprises an energy reward $R_{i,e}(t)$, a transmission reward $R_{i,d}(t)$, and a perception reward $R_{i,c}(t)$:
the energy reward $R_{i,e}(t)$, defined as the negative of the energy consumed, drives the $i$-th drone to reduce its energy consumption in each time slot;
the transmission reward $R_{i,d}(t)$ represents the amount of data transmitted from the $i$-th drone to the base station;
the perception reward $R_{i,c}(t)$ represents the data returned by the sensors of the IoT users within the coverage area of the $i$-th drone.
Further, the method also comprises:
performing trajectory optimization in combination with a penalty function.
The method has the following beneficial effects: the invention takes into account the coupling among multiple unmanned aerial vehicles and optimizes the transmission performance of the wireless network by planning the scheduling of the multiple unmanned aerial vehicles as a whole. Because adaptive network formation is considered jointly, the performance of the multi-UAV assisted wireless network system can be greatly improved, making the system more flexible and applicable to a wider range of scenarios.
Drawings
Fig. 1 is a block diagram of a multi-drone assisted wireless network system according to a specific embodiment of the present invention;
Fig. 2 is a schematic diagram of the adaptive network formation and trajectory optimization method for multiple drones according to an embodiment of the present invention.
Detailed Description
The invention is described in further detail below with reference to the figures and the specific embodiments. The step numbers in the following embodiments are provided only for convenience of illustration, the order between the steps is not limited at all, and the execution order of each step in the embodiments can be adapted according to the understanding of those skilled in the art.
Referring to Fig. 1, the multi-drone assisted wireless network system has one base station (BS) and a plurality of drones (UAVs). One set indexes the fleet of $N$ drones, and another set indexes the sensors or IoT users deployed on the ground. These ground users are beyond the range of direct communication with the base station, so multiple drones are deployed to fly over the designated area, collect the users' sensing data, and deliver it to the BS. Each UAV may connect directly to the BS or relay its information back to the BS through other UAVs. Each drone is assumed to be equipped with an antenna that supports direct UAV-to-UAV communication (U2U communication). Different network topologies can be formed by allocating channels on the different links between drones; this is the adaptive network formation referred to in this scheme, and it can potentially reduce the overall transmission delay and energy consumption of multi-hop relay transmission. Furthermore, as each drone optimizes and follows its own trajectory, the network structure formed by the drones also changes over time. The scheme therefore optimizes the network formation and the drone trajectories jointly.
Referring to Fig. 2, the invention provides an adaptive network formation and trajectory optimization method for multiple drones, comprising the following steps:
S1. Adjusting the network formation according to the energy consumption and buffer state of the unmanned aerial vehicles based on a heuristic method to obtain an adaptive network formation strategy.
Given the trajectories of the multiple drones, the drone topology must be formed adaptively, which makes the problem a nonlinear integer program. Although such a problem can be solved by existing branch-and-bound methods, its computational complexity is very high because the data buffers and energy consumption of the drones and IoT users evolve dynamically across time slots. A simple heuristic, the energy- and delay-aware network formation (EDA-NF) algorithm, is therefore proposed to adjust the drone network formation according to the drones' energy consumption and data buffer state. The basic idea of EDA-NF is to balance the energy consumption and data queue sizes of the different drones. Specifically, in each transmission sub-slot $t$, the $i$-th drone (UAV-i) reports its current state to the BS, including its current location $l_i(t)$, network formation strategy $(\Phi(t), \Psi_k(t))$, energy consumption $E_i(t)$, and data buffer information $D_i(t)$. Once the BS has collected the state information of all drones, it adjusts the network formation strategy $(\Phi(t), \Psi_k(t))$ to balance the drones' energy consumption and queue sizes. The BS evaluates the cost function $c_j(t)$ of each UAV in each time slot $t$. When the cost function of UAV-i keeps increasing and exceeds a certain threshold, UAV-i attempts to open a U2U channel to the neighboring drone with the minimum cost $c_j(t)$ (or to send directly to the BS). Meanwhile, the BS may forbid other drones from opening U2U connections to UAV-i. In this way, the information transmission capability of the drones can be greatly improved and the transmission delay reduced.
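The EDA-NF adjustment described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented algorithm: the text does not give the form of the cost function, so a weighted sum of energy consumption and buffer size is assumed, and "keeps increasing" is modeled as a strictly rising cost over the last few reports. The names `DroneState`, `cost`, `eda_nf_step`, and the node id `BS = 0` are hypothetical.

```python
from dataclasses import dataclass, field

BS = 0  # hypothetical node id reserved for the base station


@dataclass
class DroneState:
    """State a drone reports to the BS in a transmission sub-slot."""
    energy: float        # energy consumption E_i(t)
    buffer: float        # buffered data D_i(t)
    neighbors: list      # ids of drones within U2U range
    next_hop: int = BS   # current forwarding target
    cost_history: list = field(default_factory=list)


def cost(state, w_energy=1.0, w_buffer=1.0):
    # Assumed cost c_i(t): weighted sum of energy use and queue size.
    return w_energy * state.energy + w_buffer * state.buffer


def eda_nf_step(states, threshold, window=3):
    """One BS-side adjustment of the forwarding topology."""
    costs = {i: cost(s) for i, s in states.items()}
    for i, s in states.items():
        s.cost_history.append(costs[i])
    for i, s in states.items():
        recent = s.cost_history[-window:]
        rising = len(recent) == window and all(
            a < b for a, b in zip(recent, recent[1:]))
        if rising and costs[i] > threshold:
            # Prefer the cheapest less-loaded neighbor, else go to the BS.
            candidates = [j for j in s.neighbors if costs[j] < costs[i]]
            s.next_hop = min(candidates, key=costs.get) if candidates else BS
    return {i: s.next_hop for i, s in states.items()}
```

The sketch omits the BS forbidding incoming U2U connections and does not guard against forwarding loops; both would be needed in a fuller model.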
S2. Jointly optimizing the trajectories of the multiple unmanned aerial vehicles based on a multi-agent deep reinforcement learning (MADRL) method in combination with the adaptive network formation strategy.
Given the network formation strategy $(\Phi(t), \Psi_k(t))$, the remaining task is to update the drone trajectories for the remaining period of time. Because of its dynamic nature, trajectory optimization is very complex, so this scheme reformulates path planning as a Markov decision process (MDP), which is solved approximately with a model-free deep reinforcement learning (DRL) method. The MDP is characterized by a tuple that includes the state space, the action space, and the reward $R$, which is a function of the state-action pair $(s_t, a_t)$. For a multi-drone system, the joint observation of all UAV states is denoted $s(t) = (s_1(t), s_2(t), \ldots, s_N(t))$. Similarly, the joint action is $a(t) = (a_1(t), a_2(t), \ldots, a_N(t))$. The state $s_i(t)$ of each drone includes its position $l_i(t)$, the network formation strategy $(\Phi(t), \Psi_k(t))$, its energy state $E_i(t)$, and its buffer size $D_i(t)$. The action $a_i(t)$ of each drone includes its flight direction $d_i(t)$ and speed $v_i(t)$. By taking action $a_i(t)$ in state $s(t)$ in time slot $t$, UAV-i obtains its own reward $R_i(s(t), a_i(t))$. In a multi-drone system, the reward of UAV-i also depends on the actions of the other drones, denoted $a_{-i}(t)$. The reward of UAV-i consists of three parts: the energy reward $R_{i,e}(t)$, the transmission reward $R_{i,d}(t)$, and the perception reward $R_{i,c}(t)$. The energy reward is defined as the negative of the energy consumed, which pushes UAV-i to reduce its energy consumption in each slot. To reduce transmission delay, each drone is rewarded for forwarding as much data as possible: the transmission reward $R_{i,d}(t)$ is the amount of data transmitted from UAV-i to the BS or to a relay drone. The perception reward $R_{i,c}(t)$ is determined by the data returned by the sensors of the IoT users within the coverage area of UAV-i. These reward definitions are then used to approximate the original design objective. In addition, a penalty term $R_{i,p}(t)$ is required to ensure a minimum safe distance between UAV-i and the other drones. If the constraint $\|l_i(t) - l_j(t)\| \ge d_{\min}$ does not hold, a large penalty value is simply assigned to $R_{i,p}(t)$.
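As an illustration of how the three reward terms and the safety penalty might be combined, the sketch below sums them for one drone in one slot. The weights, the penalty magnitude, and the function name `total_reward` are assumptions; the text does not specify how the terms are weighted or how large the penalty is.

```python
import math


def total_reward(energy_used, data_forwarded, data_sensed,
                 position, other_positions, d_min,
                 penalty=100.0, weights=(1.0, 1.0, 1.0)):
    """Per-slot reward for one drone (illustrative weighting)."""
    w_e, w_d, w_c = weights
    r_energy = -energy_used      # R_{i,e}(t): negative of consumed energy
    r_transmit = data_forwarded  # R_{i,d}(t): data sent to the BS or a relay
    r_sense = data_sensed        # R_{i,c}(t): data returned by covered users
    # R_{i,p}(t): large penalty when any other drone is closer than d_min.
    too_close = any(math.dist(position, q) < d_min for q in other_positions)
    r_penalty = -penalty if too_close else 0.0
    return w_e * r_energy + w_d * r_transmit + w_c * r_sense + r_penalty
```

With unit weights, a drone that spends 2 units of energy, forwards 5 units of data, and senses 3 units receives 6 when it keeps the safe distance, and a strongly negative value when it does not.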
Given the drones' network formation strategy $(\Phi(t), \Psi_k(t))$, each drone needs to search for the optimal flight direction $d_i(t)$ and speed $v_i(t)$ based on local observations in order to update its trajectory. Because there are multiple drones in the system, the observation of each drone depends not only on its own action but also on the actions of the other drones. The drone trajectories can therefore be learned with the multi-agent deep deterministic policy gradient (MADDPG) algorithm from multi-agent deep reinforcement learning. Combined with the EDA-NF algorithm, training proceeds as follows: in the offline training stage, the BS collects the state updates of all drones and trains the drones' Critic and Actor networks simultaneously in a centralized manner. After offline training, the Critic and Actor networks can issue commands to the individual drones to guide each drone's decisions in a decentralized manner.
Using the trajectories learned by the MADDPG algorithm, each UAV follows its trajectory to receive the IoT users' data and forwards it to the BS or to other drones in the next slot. Once the BS receives the data or status updates forwarded by the drones, it evaluates the cost function $c_j(t)$ of each drone. The result can be used to initialize the drones' network formation strategy, as shown in lines 8-10 of Algorithm 1. The network formation matrices $(\Phi(t), \Psi_k(t))$ are then fed to the MADDPG algorithm as input, and training outputs the drone trajectories.
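The alternation between network formation and trajectory updates can be outlined as a simple loop. This is only a structural sketch, not the Algorithm 1 referenced in the text: `form_network`, `plan_trajectories`, and `step_env` are hypothetical callables standing in for the EDA-NF heuristic, the trained policies, and the environment dynamics.

```python
def two_stage_loop(states, form_network, plan_trajectories, step_env, num_slots):
    """Alternate network formation and trajectory updates in each time slot."""
    history = []
    for t in range(num_slots):
        # Stage 1: the BS adjusts the topology from the reported states.
        topology = form_network(states)
        # Stage 2: the policies choose flight direction and speed.
        actions = plan_trajectories(states, topology)
        # Drones fly, sense, and forward data; new states are reported.
        states = step_env(states, actions, topology)
        history.append((t, topology, actions))
    return states, history
```

Passing the topology into `plan_trajectories` mirrors the statement that the network formation matrices serve as input to the trajectory learner.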
This scheme is based mainly on the following UAV communication principles:
1) Network formation and sub-channel allocation
We consider a slotted frame structure. In each time slot, each UAV may fly to a location, receive data from IoT users, buffer the data, and then offload it to the BS. We assume that each drone has a maximum buffer capacity $D_{\max}$ for data caching. Status information of the drone (e.g., its location, data buffer size, and network status) may also be reported to the BS during the offload phase. The drone channels are described as follows:
IoT user-to-UAV (I2U communication): the I2U channel is used by each drone to collect sensing data from the IoT devices within its signal coverage. We assume that a direct channel from the IoT device to the BS is not available. The drone collects data from the ground sensors along a planned trajectory.
UAV-to-BS (U2B communication): each drone may report its data to the BS over the U2B channel. We assume that U2B transmission relies on a dedicated cellular channel shared by all UAVs. The data rate on the U2B channel depends on the drone's location and the channel conditions.
UAV-to-UAV (U2U communication): if some drones are far away from the BS, we allow them to connect to nearby drones through the U2U channel. Through multi-hop relaying, the sensing data of all IoT users can be forwarded to the base station. The network formation of the drones also affects the overall delay performance.
Two sets denote the drones that forward sensing data over the U2B and U2U channels, respectively, in time slot $t$. Together they cover all drones, meaning that each UAV is connected either to the BS or to other UAVs. For drones far away from the base station, the direct link may have a lower signal-to-noise ratio (SNR) and a larger transmission delay, so persisting with this strategy may lead to more hover time and higher energy consumption. In that case, the drone may instead use the U2U channel and connect to other drones.
Considering the limited channel resources in cellular systems, we assume that all drones share a set of orthogonal sub-channels. Let the binary matrix $\Phi(t)$ denote the U2B sub-channel allocation strategy, in which an entry equal to 1 indicates that UAV-i uses the $k$-th sub-channel on the U2B channel to offload its data. Similarly, define the binary matrix $\Psi_k(t)$ as the U2U sub-channel allocation strategy, in which an entry equal to 1 represents a U2U connection between UAV-i and UAV-j on the $k$-th sub-channel. The sub-channel allocation is subject to resource constraints.
The path planning algorithm of this scheme applies to each sub-channel $k$; it adjusts the two matrices $(\Phi(t), \Psi_k(t))$ to determine the drone network formation in each time slot $t$.
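The two allocation matrices can be represented as nested lists and checked for consistency. The sketch below is illustrative only: the resource constraints themselves are not reproduced in the text, so it assumes that each drone uses exactly one outgoing link (to the BS or to another drone) and that self-links are not allowed. The function name `is_valid_formation` and the index layout are hypothetical.

```python
def is_valid_formation(phi, psi):
    """Check an assumed one-outgoing-link rule for every drone.

    phi[i][k] == 1: UAV-i offloads to the BS on sub-channel k.
    psi[k][i][j] == 1: UAV-i forwards to UAV-j on sub-channel k.
    """
    n = len(phi)
    num_channels = len(phi[0])
    for i in range(n):
        if any(psi[k][i][i] for k in range(num_channels)):
            return False  # a drone cannot relay to itself
        u2b_links = sum(phi[i])
        u2u_links = sum(psi[k][i][j]
                        for k in range(num_channels) for j in range(n))
        if u2b_links + u2u_links != 1:
            return False  # assumed: exactly one outgoing link per drone
    return True
```

For two drones and one sub-channel, `phi = [[1], [0]]` with `psi = [[[0, 0], [1, 0]]]` describes UAV-0 offloading to the BS and UAV-1 relaying through UAV-0.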
2) Channel models for I2U, U2U, and U2B
All drones are assumed to fly at a fixed height $H$ while collecting sensing data from IoT users; the problem formulation and solution can be generalized to the case where the flying height varies over time. The trajectory $L_i$ of each UAV-i is defined as the set of its locations in the different time slots. Each location is specified by two-dimensional coordinates, i.e. $l_i(t) = (x_i(t), y_i(t))$. The BS is fixed at the coordinate origin. Suppose UAV-i moves in direction $d_i(t)$ at a limited speed $v_i(t) \le v_{\max}$. The position of UAV-i in the next time slot $t+1$ is then given by $l_i(t+1) = l_i(t) + v_i(t)\, d_i(t)$. The distance between UAV-i and UAV-j is expressed as:
$d_{i,j}(t) = \|l_i(t) - l_j(t)\|$
With $H_b$ denoting the height of the BS antenna, the distance $d_{i,0}$ between UAV-i and the BS can be found likewise. Given the position of an IoT device on the ground, its distance to UAV-i follows in the same way.
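The position update and distance computations can be sketched in a few lines. Two points are assumptions rather than statements from the text: the direction $d_i(t)$ is represented as a heading angle in radians, and the UAV-to-BS distance is taken as the three-dimensional distance between a drone at height $H$ and an antenna at height $H_b$ above the origin. The function names are hypothetical.

```python
import math


def next_position(pos, speed, heading, v_max):
    """l_i(t+1) = l_i(t) + v_i(t) d_i(t), with the speed clipped to [0, v_max]."""
    v = min(max(speed, 0.0), v_max)
    dx, dy = math.cos(heading), math.sin(heading)  # unit direction d_i(t)
    return (pos[0] + v * dx, pos[1] + v * dy)


def uav_distance(p, q):
    """d_{i,j}(t) = ||l_i(t) - l_j(t)|| for two drones at the same height."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def bs_distance(p, uav_height, bs_height):
    """Assumed 3D distance from a UAV to the BS antenna above the origin."""
    return math.sqrt(p[0] ** 2 + p[1] ** 2 + (uav_height - bs_height) ** 2)
```

Clipping the speed inside `next_position` is one simple way to enforce $v_i(t) \le v_{\max}$; a learned policy could equally bound its output before calling it.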
Links between UAVs and between a UAV and the BS are typically line-of-sight, so the U2U and U2B channels adopt a simplified exponential channel fading model. When UAV-i transmits to UAV-j on sub-channel $k$, the received power at UAV-j on that sub-channel is determined by the transmit power of UAV-i on the $k$-th sub-channel, the constant power gain $\beta_{i,j}$ introduced by the amplifiers and antennas of the transceivers, and the path loss, which depends on the distance between the transceivers, with $\alpha_u$ denoting the path loss constant. If another UAV-m ($m \ne i$) also transmits on the same sub-channel $k$, it contributes interference power at UAV-j. The data rate from UAV-i to UAV-j over all sub-channels can thus be expressed in terms of the received power, the interference power, and the noise power on each sub-channel $k$. The U2B data rate is defined similarly. Each drone collects sensing data while flying over the ground IoT devices, so I2U communication also qualifies as line-of-sight transmission. The I2U channel can therefore be characterized approximately in the same way as the U2U and U2B channels.
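A hedged numerical sketch of the link model follows. The exact expressions are not reproduced in the text, so this assumes a conventional power-law path loss (transmit power times constant gain times distance raised to a negative exponent) and a Shannon-type rate with interference treated as noise. The function names and the `bandwidth` parameter are hypothetical.

```python
import math


def received_power(p_tx, gain, distance, alpha):
    """Assumed power-law model: p_tx * gain * distance ** (-alpha)."""
    return p_tx * gain * distance ** (-alpha)


def link_rate(signal, interferers, noise, bandwidth=1.0):
    """Rate on one sub-channel, treating co-channel interference as noise."""
    sinr = signal / (noise + sum(interferers))
    return bandwidth * math.log2(1.0 + sinr)
```

Under this assumption, the rate over all sub-channels would be the sum of `link_rate` over the sub-channels allocated to the link.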
The scheme models the problem as follows. For each UAV-i, a time slot is further divided into sensing, transmission, and flight sub-slots, denoted $t_{i,s}$, $t_{i,o}$, and $t_{i,f}$, respectively. The data $s_i(t)$ received by UAV-i during sensing depends on its coverage and the I2U transmission rate. Let $W_m(t)$ denote the data remaining at IoT user $m$. The data queue of IoT user $m$ is updated by subtracting the data collected from it in the slot and applying $[X]^+$, where $[X]^+$ denotes $\max\{0, X\}$, $x_{i,m}(t) \in \{0, 1\}$ indicates whether IoT user $m$ communicates with UAV-i, and $s_{i,m}(t) \le D_m$ is the amount of sensing data that UAV-i collects from user $m$. The data $s_i(t)$ aggregates the data collected from the set of users under the coverage of UAV-i, and the buffer dynamics of the UAV are modeled accordingly: $O_i(t)$ denotes the output data of UAV-i, whose first term $o_{i,0}(t)$ is the data transmitted to the BS and whose second term is the data sent to other drones, while the third term of $D_i(t+1)$ is the data received from other drones.
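The queue dynamics can be illustrated with two small update functions. The user-queue update follows the $[X]^+$ operation named in the text. The UAV buffer update is an assumed reconstruction from the three described terms (remaining data after output, newly sensed data, and data received from other drones), and the clipping at the buffer capacity is an added assumption. Function names are hypothetical.

```python
def update_user_queue(w_m, collected):
    """W_m(t+1) = [W_m(t) - data collected from user m]^+."""
    return max(0.0, w_m - collected)


def update_uav_buffer(d_i, sent_to_bs, sent_to_uavs, sensed, received, d_max):
    """Assumed UAV buffer update built from the three terms in the text."""
    outgoing = sent_to_bs + sum(sent_to_uavs)  # O_i(t)
    remaining = max(0.0, d_i - outgoing)       # [D_i(t) - O_i(t)]^+
    # Newly sensed data and relayed data arrive; cap at the buffer capacity.
    return min(d_max, remaining + sensed + sum(received))
```

For example, a drone holding 10 units that sends 4 to the BS and 2 to a relay, senses 3, and receives 1 ends the slot with 8 units buffered.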
Given a task completion time $T$, the total energy consumption of the drones is obtained by summing the energy consumed by each drone over all time slots and sub-channels, and the problem is modeled as minimizing this total energy.
The scheme optimizes the network formation strategy $(\Phi(t), \Psi_k(t))$ and the binary matrix $X(t)$ with entries $x_{i,m}(t)$, which specifies the I2U association in each time slot, subject to the constraints mentioned above. All these matrix variables are optimized jointly with the drone trajectories $L_i$. The total number of time slots $T$ needed to offload all user data is also optimized. The problem can be simplified with a fixed sensing strategy: given the drone positions, the I2U association matrix $X(t)$ is determined. The constraints $D_i(t) \le D_{\max}$, $D_i(T) = 0$, $W_m(0) = D_m$, and $W_m(T) = 0$ ensure that the sensing data of all IoT users are successfully offloaded to the BS after $T$ slots. The inequalities $\|l_i(t+1) - l_i(t)\| \le v_{\max}\, t_{i,f}$ and $\|l_i(t) - l_j(t)\| \ge d_{\min}$ limit the flight speed of the drones and the distance between them. In practice, the transmit power $p_i$ of the drones is far smaller than the power they consume for hovering and flying, so it can be omitted from the optimization problem.
The above describes the modeling details, the complexity, and the simplification of the problem; the two iterative algorithms proposed in this scheme are well suited to handling it.
In this scheme, adaptive network formation is added to multi-drone path planning, and a two-stage algorithm iterates between adaptive network formation and trajectory optimization so that the two cooperate. The adaptive network formation is based on the heuristic EDA-NF algorithm, which balances the drones' energy consumption and data buffer queue sizes. Compared with traditional strategies, the algorithm has low computational complexity and is more efficient. Combining trajectory optimization with network formation makes the system more flexible, allowing the drones to adapt the network structure as their positions change. The algorithm also optimizes energy consumption and delay, reducing both for the system as a whole and making practical deployment feasible.
While the preferred embodiments of the present invention have been illustrated and described, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.
Claims (6)
1. An adaptive network formation and trajectory optimization method for multiple unmanned aerial vehicles, characterized by comprising the following steps:
based on a heuristic method, adjusting the network formation according to the energy consumption and data buffer state of the unmanned aerial vehicles to obtain an adaptive network formation strategy;
based on a multi-agent reinforcement learning method and in combination with the adaptive network formation strategy, jointly optimizing the trajectories of the multiple unmanned aerial vehicles.
2. The method of claim 1, wherein the step of adjusting the network formation according to the energy consumption and data buffer state of the unmanned aerial vehicles based on a heuristic method to obtain the adaptive network formation strategy comprises:
in each transmission sub-slot $t$, each unmanned aerial vehicle reports its current state to the base station;
the current state comprises the location $l_i(t)$, the network formation strategy $(\Phi(t), \Psi_k(t))$, the energy consumption $E_i(t)$, and the data buffer information $D_i(t)$;
once the base station has collected the state information of all unmanned aerial vehicles, it adjusts the network formation strategy $(\Phi(t), \Psi_k(t))$ with the goal of balancing the energy consumption and queue sizes of the unmanned aerial vehicles;
the base station evaluates the cost function of each drone in each time slot $t$, and when the cost function of the $i$-th drone keeps increasing and exceeds a threshold, the $i$-th drone is allowed to connect to the nearby drone with the minimum cost $c_j(t)$.
3. The method of claim 2, further comprising:
when a connection is judged to be unfavorable for data collection by the base station, prohibiting the connection between the unmanned aerial vehicles concerned.
4. The method of claim 3, wherein the step of jointly optimizing the trajectories of the multiple drones based on a multi-agent reinforcement learning method in combination with the adaptive network formation strategy comprises:
for the multi-UAV system, defining the joint observation $s(t)$ of the states $s_i(t)$ of all UAVs and the actions $a_i(t)$;
the $i$-th UAV takes action $a_i(t)$ in state $s(t)$ in time slot $t$ and obtains a reward $R_i(s(t), a_i(t))$;
performing trajectory optimization according to the reward $R_i(s(t), a_i(t))$.
5. The method of claim 4, wherein the reward comprises an energy reward $R_{i,e}(t)$, a transmission reward $R_{i,d}(t)$, and a perception reward $R_{i,c}(t)$:
the energy reward $R_{i,e}(t)$, defined as the negative of the energy consumed, drives the $i$-th drone to reduce its energy consumption in each time slot;
the transmission reward $R_{i,d}(t)$ represents the amount of data transmitted from the $i$-th drone to the base station;
the perception reward $R_{i,c}(t)$ represents the data returned by the sensors of the IoT users within the coverage area of the $i$-th drone.
6. The method of claim 5, further comprising:
performing trajectory optimization in combination with a penalty function.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111439489.9A CN114222251B (en) | 2021-11-30 | Self-adaptive network forming and track optimizing method for multiple unmanned aerial vehicles |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114222251A (en) | 2022-03-22 |
CN114222251B (en) | 2024-06-28 |
Patent Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20200359297A1 (en) * | 2018-12-28 | 2020-11-12 | Beijing University Of Posts And Telecommunications | Method of Route Construction of UAV Network, UAV and Storage Medium thereof |
CN110166107A (en) * | 2019-05-17 | 2019-08-23 | Wuhan University | UAV relay system resource allocation method based on a wireless energy-carrying communication network |
CN110531617A (en) * | 2019-07-30 | 2019-12-03 | Beijing University of Posts and Telecommunications | Multi-UAV 3D hovering position joint optimization method, device, and UAV base station |
CN110380776A (en) * | 2019-08-22 | 2019-10-25 | University of Electronic Science and Technology of China | UAV-based data collection method for Internet of Things systems |
CN111193536A (en) * | 2019-12-11 | 2020-05-22 | Northwestern Polytechnical University | Multi-unmanned aerial vehicle base station track optimization and power distribution method |
CN111786713A (en) * | 2020-06-04 | 2020-10-16 | Dalian University of Technology | Unmanned aerial vehicle network hovering position optimization method based on multi-agent deep reinforcement learning |
CN112188515A (en) * | 2020-08-27 | 2021-01-05 | Tsinghua University | Deep and distant sea information service quality optimization method based on unmanned aerial vehicle network |
CN112104502A (en) * | 2020-09-16 | 2020-12-18 | Yunnan University | Time-sensitive multitask edge computing and cache cooperation unloading strategy method |
CN112256056A (en) * | 2020-10-19 | 2021-01-22 | Sun Yat-sen University | Unmanned aerial vehicle control method and system based on multi-agent deep reinforcement learning |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114980020A (en) * | 2022-05-17 | 2022-08-30 | Chongqing University of Posts and Telecommunications | Unmanned aerial vehicle data collection method based on MADDPG algorithm |
CN115167506A (en) * | 2022-06-27 | 2022-10-11 | South China Normal University | Method, device, equipment and storage medium for updating and planning flight line of unmanned aerial vehicle |
CN116506965A (en) * | 2023-06-20 | 2023-07-28 | Southern University of Science and Technology | Multi-unmanned aerial vehicle communication resource allocation method and terminal |
CN116506965B (en) * | 2023-06-20 | 2023-09-19 | Southern University of Science and Technology | Multi-unmanned aerial vehicle communication resource allocation method and terminal |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Samir et al. | | Optimizing age of information through aerial reconfigurable intelligent surfaces: A deep reinforcement learning approach |
CN108832998B | | Cooperative data distribution method in air-ground converged communication network |
CN112118556A | | Unmanned aerial vehicle track and power joint optimization method based on deep reinforcement learning |
CN111163511A | | Intelligent reflection surface assisted uplink power distribution method with limited delay in millimeter wave communication |
CN114422363B | | Capacity optimization method and device for unmanned aerial vehicle-mounted RIS auxiliary communication system |
CN108668257B | | A kind of distribution unmanned plane postman relaying track optimizing method |
CN108834049B | | Wireless energy supply communication network and method and device for determining working state of wireless energy supply communication network |
CN115802318B | | Unmanned aerial vehicle-based auxiliary Internet of vehicles resource optimization method, equipment and medium |
CN113255218B | | Unmanned aerial vehicle autonomous navigation and resource scheduling method of wireless self-powered communication network |
CN115499921A | | Three-dimensional trajectory design and resource scheduling optimization method for complex unmanned aerial vehicle network |
Gong et al. | | Bayesian optimization enhanced deep reinforcement learning for trajectory planning and network formation in multi-UAV networks |
CN113206701A | | Three-dimensional deployment and power distribution joint optimization method for unmanned aerial vehicle flight base station |
Ghorbel et al. | | Energy efficient data collection for wireless sensors using drones |
CN114222251B | | Self-adaptive network forming and track optimizing method for multiple unmanned aerial vehicles |
Gendia et al. | | UAV positioning with joint NOMA power allocation and receiver node activation |
CN114222251A | | Adaptive network forming and track optimizing method for multiple unmanned aerial vehicles |
CN116009590B | | Unmanned aerial vehicle network distributed track planning method, system, equipment and medium |
CN111741520A | | Cognitive underwater acoustic communication system power distribution method based on particle swarm |
CN115412156B | | Urban monitoring-oriented satellite energy-carrying Internet of things resource optimal allocation method |
CN116684851A | | MAPPO-based multi-RIS auxiliary Internet of vehicles throughput improving method |
Wang et al. | | Adaptive network formation and trajectory optimization for multi-UAV-assisted wireless data offloading |
Liu et al. | | Outage probability minimization for vehicular networks via joint clustering, UAV trajectory optimization and power allocation |
CN115802370A | | Communication method and device |
CN113055826A | | Large-scale unmanned aerial vehicle cluster data collection method combining clustering and three-dimensional trajectory planning |
CN112929977B | | Deep learning amplification forwarding cooperative network energy efficiency resource allocation method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant |