CN114222251A - Adaptive network forming and track optimizing method for multiple unmanned aerial vehicles


Publication number
CN114222251A
Authority
CN
China
Prior art keywords
unmanned aerial
reward
aerial vehicle
drone
strategy
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111439489.9A
Other languages
Chinese (zh)
Other versions
CN114222251B (en)
Inventor
龚世民
王猛
王海东
龙钰斯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sun Yat Sen University
Sun Yat Sen University Shenzhen Campus
Original Assignee
Sun Yat Sen University
Sun Yat Sen University Shenzhen Campus
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sun Yat Sen University and Sun Yat Sen University Shenzhen Campus
Priority to CN202111439489.9A
Priority claimed from CN202111439489.9A
Publication of CN114222251A
Application granted
Publication of CN114222251B
Current legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W4/00 Services specially adapted for wireless communication networks; Facilities therefor
    • H04W4/02 Services making use of location information
    • H04W4/029 Location-based management or tracking services
    • H04W4/30 Services specially adapted for particular environments, situations or purposes
    • H04W4/40 Services specially adapted for particular environments, situations or purposes for vehicles, e.g. vehicle-to-pedestrians [V2P]
    • H04W24/00 Supervisory, monitoring or testing arrangements
    • H04W24/02 Arrangements for optimising operational condition
    • H04W24/10 Scheduling measurement reports; Arrangements for measurement reports
    • H04W28/00 Network traffic management; Network resource management
    • H04W28/02 Traffic management, e.g. flow control or congestion control
    • H04W28/0278 Traffic management, e.g. flow control or congestion control using buffer status reports
    • H04W52/00 Power management, e.g. TPC [Transmission Power Control], power saving or power classes
    • H04W52/02 Power saving arrangements
    • H04W52/0209 Power saving arrangements in terminal devices
    • H04W52/0261 Power saving arrangements in terminal devices managing power supply demand, e.g. depending on battery level
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00 Reducing energy consumption in communication networks
    • Y02D30/70 Reducing energy consumption in communication networks in wireless communication networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Traffic Control Systems (AREA)
  • Mobile Radio Communication Systems (AREA)

Abstract

The invention discloses an adaptive network formation and trajectory optimization method for multiple unmanned aerial vehicles (UAVs), comprising the following steps: based on a heuristic method, adjusting the network formation according to the energy consumption and data buffer state of the UAVs to obtain an adaptive network formation strategy; and, based on a multi-agent reinforcement learning method and in combination with the adaptive network formation strategy, jointly optimizing the trajectories of the multiple UAVs. The method optimizes the trajectory strategy and the network formation strategy cooperatively, makes full use of the advantages of a cooperative multi-UAV network, and addresses the difficulty ground user equipment has in transmitting data. The method can be widely applied in the field of wireless communication.

Description

Adaptive network forming and track optimizing method for multiple unmanned aerial vehicles
Technical Field
The invention relates to the field of wireless communication, in particular to a method for self-adaptive network forming and track optimization of multiple unmanned aerial vehicles.
Background
UAV-assisted wireless communication networks are regarded as a promising solution to various problems that arise in the development of the Internet of Things (IoT). As the IoT grows in scale, problems such as limited energy supply, remote locations and non-line-of-sight obstacles can be addressed by deploying flying UAVs in the wireless network to assist the data transmission of IoT users. However, existing work on single-UAV and multi-UAV assisted systems mainly plans the UAV flight paths or considers the UAV control problem in isolation, and does not consider the coupling between the UAV trajectories and the network connections among multiple UAVs.
Disclosure of Invention
The invention aims to provide an adaptive network formation and trajectory optimization method for multiple unmanned aerial vehicles that cooperatively optimizes the trajectory strategy and the network formation strategy, makes full use of the advantages of a cooperative multi-UAV network, and addresses the difficulty ground user equipment has in transmitting data.
The first technical scheme adopted by the invention is as follows: an adaptive network formation and trajectory optimization method for multiple unmanned aerial vehicles, comprising the following steps:
based on a heuristic method, adjusting the network formation according to the energy consumption and data buffer state of the unmanned aerial vehicles to obtain an adaptive network formation strategy;
based on a multi-agent reinforcement learning method and in combination with the adaptive network formation strategy, jointly optimizing the trajectories of the multiple unmanned aerial vehicles.
Further, the step of adjusting the network formation according to the energy consumption and data buffer state of the unmanned aerial vehicles based on a heuristic method to obtain an adaptive network formation strategy specifically includes:
at each transmission sub-time slot t, each unmanned aerial vehicle reports its current state to the base station;
the current state comprises the location
Figure BDA0003382459870000011
the network formation strategy $(\Phi(t), \Psi_k(t))$, the energy consumption
Figure BDA0003382459870000012
and the buffer information
Figure BDA0003382459870000013
when the base station has collected the state information of all the unmanned aerial vehicles, it adjusts the network formation matrices $(\Phi(t), \Psi_k(t))$ with the aim of balancing the energy consumption and queue sizes of the unmanned aerial vehicles;
the base station evaluates the cost function $c_j(t)$ of each drone in each time slot t, and when the cost function of the i-th drone keeps increasing beyond a threshold, the i-th drone is allowed to connect to the nearby drone with the minimum cost $c_j(t)$.
Further, the method also includes:
when the situation is judged to be unfavorable for data collection by the base station, prohibiting connections between certain unmanned aerial vehicles.
Further, the step of jointly optimizing the trajectories of the multiple drones based on a multi-agent reinforcement learning method and in combination with the adaptive network formation strategy specifically includes:
for the multi-UAV system, defining the state $s_i(t)$ of each UAV within the joint observation of all UAV states, and its action $a_i(t)$;
in time slot t and state $s(t)$, the i-th UAV takes action $a_i(t)$ and obtains a reward $R_i(s(t), a_i(t))$;
performing trajectory optimization according to the reward $R_i(s(t), a_i(t))$.
Further, the reward includes an energy reward $R_{i,e}(t)$, a transmission reward $R_{i,d}(t)$ and a sensing reward $R_{i,c}(t)$:
the energy reward $R_{i,e}(t)$ is defined as the negative of the energy consumed, and drives the i-th drone to reduce its energy consumption in each time slot;
the transmission reward $R_{i,d}(t)$ represents the amount of data transmitted from the i-th drone to the base station;
the sensing reward $R_{i,c}(t)$ represents the data transmitted back by the sensors of the IoT users in the coverage area of the i-th drone.
Further, the method also includes:
performing the trajectory optimization in combination with a penalty function.
The beneficial effects of the method are as follows: the invention accounts for the coupling among multiple unmanned aerial vehicles and optimizes the transmission performance of the wireless network by planning the scheduling of the unmanned aerial vehicles as a whole. Because adaptive network formation is considered jointly, the performance of the multi-UAV assisted wireless network system can be greatly improved, the system becomes more flexible, and it applies to a wider range of scenarios.
Drawings
Fig. 1 is a block diagram of a multi-drone assisted wireless network system according to a specific embodiment of the present invention;
Fig. 2 is a schematic structural diagram of a method for adaptive network formation and trajectory optimization of multiple drones according to an embodiment of the present invention.
Detailed Description
The invention is described in further detail below with reference to the figures and the specific embodiments. The step numbers in the following embodiments are provided only for convenience of illustration, the order between the steps is not limited at all, and the execution order of each step in the embodiments can be adapted according to the understanding of those skilled in the art.
Referring to fig. 1, the multi-drone assisted wireless network system has one base station (BS) and a plurality of drones (UAVs). The set
Figure BDA0003382459870000021
denotes the fleet of drones. The set
Figure BDA0003382459870000022
denotes the sensors or IoT users deployed on the ground, which are beyond the range of direct communication with the base station; multiple drones are therefore deployed to fly in the designated area and collect the users' sensing data for the BS. Each UAV may connect directly to the BS or relay its information back to the BS through other UAVs. Each drone is assumed to be equipped with an antenna and to support direct UAV-to-UAV communication (U2U communication). Allocating channels to different links between the drones forms different network topologies; this is the adaptive network formation referred to in this scheme, and it can potentially reduce the overall transmission delay and energy consumption of multi-hop relay transmission. Furthermore, as each drone optimizes and follows its own trajectory, the network structure formed by the drones changes over time. This scheme therefore optimizes the network formation and the drone trajectories jointly.
Referring to fig. 2, the invention provides a method for adaptive network formation and trajectory optimization of multiple drones, comprising the following steps:
S1, adjusting the network formation according to the energy consumption and buffer state of the unmanned aerial vehicles based on a heuristic method to obtain an adaptive network formation strategy;
Given the trajectories of the multiple drones
Figure BDA0003382459870000031
the topology of the drones must be formed adaptively, which makes the problem a non-linear integer program. Although the problem can be solved by existing branch-and-bound methods, its computational complexity is very high because the data buffers and energy consumption of the drones and IoT users evolve dynamically across time slots. Therefore, a simple heuristic, the energy- and delay-aware network formation (EDA-NF) algorithm, is proposed to adjust the network formation according to the energy consumption and data buffer state of the drones. The basic idea of EDA-NF is to balance the energy consumption and data queue sizes of the different drones. Specifically, at each transmission sub-slot t, the i-th drone (UAV-i) reports its current state to the BS, including its current location $l_i(t)$, network formation strategy $(\Phi(t), \Psi_k(t))$, energy consumption
Figure BDA0003382459870000032
and data buffer information
Figure BDA0003382459870000033
. When the BS has collected the state information of all drones, it adjusts the network formation strategy $(\Phi(t), \Psi_k(t))$ to balance the energy consumption and queue sizes of the drones. The BS evaluates the cost function $c_j(t)$ of each UAV in each time slot t. When the cost function of UAV-i keeps increasing beyond a certain threshold, UAV-i attempts to open a U2U channel and connect to the neighboring drone with the minimum cost $c_j(t)$ (or to send directly to the BS). Meanwhile, the BS can forbid U2U connections from other drones to UAV-i. In this way the information transmission capability of the drones can be greatly improved and the transmission delay reduced.
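The EDA-NF update described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented algorithm: the source does not give the form of the cost function, so `cost` is assumed to be a weighted sum of energy consumption and queue size, and "keeps increasing" is assumed to mean a strict rise over the last few reports. The names `UavState`, `keeps_increasing` and `eda_nf_step` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class UavState:
    energy: float                                 # energy consumed so far
    queue: float                                  # buffered data awaiting offload
    history: list = field(default_factory=list)   # past cost values


def cost(state, w_energy=0.5, w_queue=0.5):
    # Assumed cost: weighted sum of energy consumption and queue size.
    return w_energy * state.energy + w_queue * state.queue


def keeps_increasing(history, threshold, window=3):
    # True when the last `window` costs rise strictly and the latest exceeds the threshold.
    recent = history[-window:]
    if len(recent) < window:
        return False
    return all(a < b for a, b in zip(recent, recent[1:])) and recent[-1] > threshold


def eda_nf_step(states, neighbors, threshold):
    """One base-station update; returns {overloaded UAV id: relay id or "BS"}."""
    costs = {uid: cost(st) for uid, st in states.items()}
    for uid, st in states.items():
        st.history.append(costs[uid])
    overloaded = {uid for uid, st in states.items()
                  if keeps_increasing(st.history, threshold)}
    links = {}
    for uid in sorted(overloaded):
        # Overloaded UAVs are not offered as relays, mirroring the BS
        # forbidding U2U connections to them.
        candidates = [n for n in neighbors.get(uid, []) if n not in overloaded]
        links[uid] = min(candidates, key=costs.get) if candidates else "BS"
    return links


states = {
    1: UavState(energy=4.0, queue=4.0, history=[1.0, 2.0]),
    2: UavState(energy=1.0, queue=1.0),
    3: UavState(energy=2.0, queue=2.0),
}
print(eda_nf_step(states, {1: [2, 3]}, threshold=3.0))  # {1: 2}
```

In the example, UAV 1 has a rising cost above the threshold and is linked to UAV 2, its lowest-cost neighbor; with no eligible neighbor it would fall back to the BS.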
S2, jointly optimizing the trajectories of the multiple unmanned aerial vehicles based on a multi-agent deep reinforcement learning (MADRL) method and in combination with the adaptive network formation strategy.
Given a network formation strategy $(\Phi(t), \Psi_k(t))$, the remaining task is to update the trajectories of the drones for the remaining period of time. Trajectory optimization is very complex because of its dynamic nature, so this scheme reformulates path planning as a Markov decision process (MDP) and solves it approximately with a model-free deep reinforcement learning (DRL) method. The tuple
Figure BDA0003382459870000041
characterizes the MDP, where
Figure BDA0003382459870000042
and
Figure BDA0003382459870000043
denote the state space and the action space, and $R$ is a function of the state-action pair $(s_t, a_t)$. For a multi-drone system, the joint observation of all UAV states is denoted by $s(t)$, i.e. $s(t) = (s_1(t), s_2(t), \ldots, s_N(t))$. Similarly, the joint action is $a(t) = (a_1(t), a_2(t), \ldots, a_N(t))$. The state $s_i(t)$ of each drone includes its position $l_i(t)$, the network formation strategy $(\Phi(t), \Psi_k(t))$, its energy state $E_i(t)$ and its buffer size $D_i(t)$. The action $a_i(t)$ of each drone includes the flight direction $d_i(t)$ and speed $v_i(t)$. By taking action $a_i(t)$ in state $s(t)$ in time slot t, UAV-i obtains its own reward $R_i(s(t), a_i(t))$. In a multi-drone system, the reward of UAV-i also depends on the actions of the other drones, denoted $a_{-i}(t)$. The reward of UAV-i consists of three parts: the energy reward $R_{i,e}(t)$, the transmission reward $R_{i,d}(t)$ and the sensing reward $R_{i,c}(t)$. The energy reward is defined as
Figure BDA0003382459870000044
It drives UAV-i to reduce its energy consumption in each time slot. To reduce transmission delay, each drone is rewarded for forwarding as much data as possible. The transmission reward $R_{i,d}(t)$ is the amount of data transmitted from UAV-i to the BS or to a relay drone, i.e.
Figure BDA0003382459870000045
. For the sensing reward
Figure BDA0003382459870000046
this part of the reward is determined by the data transmitted back from the sensors of the IoT users in the coverage area of UAV-i. The reward definitions above are then used to approximate the original design objective. In addition, a penalty term $R_{i,p}(t)$ is required to ensure a minimum safe distance between UAV-i and the other drones: if the constraint $\|l_i(t) - l_j(t)\| \ge d_{\min}$ does not hold, $R_{i,p}(t)$ is simply assigned a large penalty value.
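The reward structure described above can be illustrated with a short sketch. The exact reward expressions are given in the source only as equation images, so the following is an assumption-laden illustration: the energy reward is taken as the negative of the energy consumed (as the text states), the transmission and sensing rewards as raw data amounts, and the total as a weighted sum minus the penalty. The weights, the penalty value and the function names `safety_penalty` and `uav_reward` are hypothetical.

```python
import math


def safety_penalty(pos_i, other_positions, d_min, penalty=100.0):
    # Large penalty when any other UAV is closer than the minimum safe distance.
    for pos_j in other_positions:
        if math.dist(pos_i, pos_j) < d_min:
            return penalty
    return 0.0


def uav_reward(energy_used, data_forwarded, data_sensed,
               pos_i, other_positions, d_min, weights=(1.0, 1.0, 1.0)):
    """Assumed combination of the energy, transmission and sensing rewards."""
    w_e, w_d, w_c = weights
    r_energy = -energy_used        # negative of consumed energy
    r_transmit = data_forwarded    # data sent to the BS or a relay UAV
    r_sense = data_sensed          # data collected from covered IoT users
    r_penalty = safety_penalty(pos_i, other_positions, d_min)
    return w_e * r_energy + w_d * r_transmit + w_c * r_sense - r_penalty


# A UAV far from its neighbor keeps the full reward; a close one is penalized.
print(uav_reward(2.0, 5.0, 3.0, (0.0, 0.0), [(10.0, 0.0)], d_min=1.0))  # 6.0
print(uav_reward(2.0, 5.0, 3.0, (0.0, 0.0), [(0.5, 0.0)], d_min=1.0))   # -94.0
```

A large fixed penalty is the simplest way to encode the safe-distance constraint in the reward; a graded penalty that grows as the distance shrinks would be an equally plausible design.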
Given the network formation strategy $(\Phi(t), \Psi_k(t))$ of the drones, each drone needs to search for the optimal flight direction $d_i(t)$ and speed $v_i(t)$ based on its local observations to update its trajectory. Because there are multiple drones in the system, the observation of each drone depends not only on its own actions but also on the actions of the other drones. The drone trajectories can therefore be learned with the multi-agent deep deterministic policy gradient (MADDPG) algorithm from multi-agent deep reinforcement learning. Combined with the EDA-NF algorithm, training proceeds as follows: in the offline training stage, the BS collects the state updates of all drones and trains their Critic and Actor networks simultaneously in a centralized manner. After offline training, the Critic and Actor networks can be distributed to the individual drones to guide each drone's decisions in a decentralized manner.
Using the trajectories learned by the MADDPG algorithm, each UAV follows its trajectory to receive data from IoT users and forwards it to the BS or to another drone in the next time slot. Once the BS receives the data or status updates forwarded by the drones, it evaluates the cost function $c_j(t)$ of each drone. The result can be used to initialize the network formation strategy of the drones, as shown in lines 8-10 of Algorithm 1. The network formation matrices $(\Phi(t), \Psi_k(t))$ serve as input to the MADDPG algorithm, and the drone trajectories are output through training.
This scheme is mainly based on the following drone communication principles:
1) Network formation and sub-channel allocation
We consider a slotted frame structure
Figure BDA0003382459870000051
. In each time slot
Figure BDA0003382459870000052
each UAV may fly to a location, receive data from IoT users, buffer the data, and then offload it to the BS. We assume that each drone has a maximum buffer capacity $D_{\max}$ for data caching. The state information of a drone (e.g., its location, data buffer size and network status) can also be updated to the BS during the offloading phase. The channels of the drones are described as follows:
IoT user-to-UAV (I2U communication): the I2U channel is used for each drone to collect sensory data from IoT devices within its signal coverage. We assume that a direct channel from the IoT device to the BS is not available. The drone will collect data from the ground sensors in a planned trajectory.
UAV-to-BS (U2B communication): each drone may report its data to the BS over the U2B channel. We assume that the U2B transmission relies on a dedicated cellular channel shared by all UAVs. The data rate on the U2B channel depends on the drone's location and channel conditions.
UAV-to-UAV (U2U communication): if some drones are far away from the BS, we allow them to connect to nearby drones through the U2U channel. Through multi-hop relaying, the sensing data of all IoT users can be forwarded to the base station. The network formation of the drones also affects the overall delay performance.
Let
Figure BDA0003382459870000053
and
Figure BDA0003382459870000054
denote the sets of drones that forward sensing data over the U2B and U2U channels in time slot t, respectively. The set of all drones satisfies
Figure BDA0003382459870000055
which means that each UAV is connected either to the BS or to other UAVs. For drones far away from the base station, the direct link may have a lower signal-to-noise ratio (SNR) and a larger transmission delay, so continuing to use it may lead to more hovering time and higher energy consumption. In this case, the drone may instead use the U2U channel and connect to other drones in the set
Figure BDA0003382459870000056
.
Considering the limited channel resources in cellular systems, we assume that all drones share
Figure BDA0003382459870000057
orthogonal sub-channels. The set of all sub-channels is denoted as
Figure BDA0003382459870000058
. Let the binary matrix
Figure BDA0003382459870000059
denote the U2B sub-channel allocation strategy, where
Figure BDA00033824598700000510
indicates that the k-th sub-channel is allocated to UAV-i for offloading its data over the U2B channel. Similarly, define the binary matrix
Figure BDA00033824598700000511
as the U2U sub-channel allocation strategy, where
Figure BDA00033824598700000512
indicates a U2U connection between UAV-i and UAV-j on the k-th sub-channel. The sub-channel allocation is subject to the following resource constraints:
Figure BDA0003382459870000061
The path planning algorithm of this scheme applies to each sub-channel k, and adjusts the two matrices $(\Phi(t), \Psi_k(t))$ in the constraints above to determine the drone network formation in each time slot t.
2) Channel models for I2U, U2U and U2B
All drones are assumed to fly at a fixed altitude $H$ while collecting sensing data from IoT users; the problem formulation and solution can be generalized to the case where the altitude varies over time. The trajectory of each UAV-i is defined as the set of its locations in the different time slots, i.e.
Figure BDA0003382459870000062
Each location is specified by two-dimensional coordinates, i.e. $l_i(t) = (x_i(t), y_i(t))$. The BS is fixed at the coordinate origin. Suppose UAV-i moves in direction $d_i(t)$ with a limited speed $v_i(t) \le v_{\max}$. The position of UAV-i in the next time slot $t+1$ is then given by $l_i(t+1) = l_i(t) + v_i(t) d_i(t)$. The distance between UAV-i and UAV-j is expressed as:
$d_{i,j}(t) = \|l_i(t) - l_j(t)\|$
With $H_b$ denoting the height of the BS antenna, the distance $d_{i,0}$ between UAV-i and the BS can be found in the same way. Given an IoT device
Figure BDA0003382459870000063
with ground position
Figure BDA0003382459870000064
its distance to UAV-i is given by
Figure BDA0003382459870000065
.
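The position update and the distances discussed above can be sketched as follows. This is an illustration under assumptions: the direction $d_i(t)$ is represented by a heading angle so that it acts as a unit vector, a flight duration `dt` (defaulting to one slot) scales the displacement, and the UAV-to-BS and UAV-to-user distances are taken as ordinary three-dimensional Euclidean distances for a UAV at altitude `altitude`, since the source gives those expressions only as equation images. All function names are hypothetical.

```python
import math


def step_position(pos, speed, heading, v_max, dt=1.0):
    # l_i(t+1) = l_i(t) + v_i(t) * d_i(t), with the speed clipped to v_max
    # and the direction given by a heading angle in radians (assumption).
    speed = min(speed, v_max)
    x, y = pos
    return (x + speed * dt * math.cos(heading),
            y + speed * dt * math.sin(heading))


def uav_distance(pos_i, pos_j):
    # d_{i,j}(t) = ||l_i(t) - l_j(t)||, both UAVs at the same altitude.
    return math.dist(pos_i, pos_j)


def distance_to_bs(pos, altitude, bs_height):
    # BS at the coordinate origin with antenna height H_b (assumed 3-D geometry).
    return math.sqrt(pos[0] ** 2 + pos[1] ** 2 + (altitude - bs_height) ** 2)


def distance_to_user(pos, user_pos, altitude):
    # IoT device on the ground, UAV at altitude H (assumed 3-D geometry).
    return math.sqrt(math.dist(pos, user_pos) ** 2 + altitude ** 2)


print(step_position((0.0, 0.0), speed=5.0, heading=0.0, v_max=3.0))  # (3.0, 0.0)
print(distance_to_bs((3.0, 4.0), altitude=12.0, bs_height=0.0))      # 13.0
```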
The links between UAVs and between a UAV and the BS are typically line-of-sight, so the U2U and U2B channels adopt a simplified exponential channel fading model. When UAV-i transmits to UAV-j on sub-channel k, the received power of UAV-j on sub-channel k is expressed as
Figure BDA0003382459870000066
where
Figure BDA0003382459870000067
is the transmit power of UAV-i on the k-th sub-channel and $\beta_{i,j}$ is a constant power gain caused by the amplifier and antenna of the transceiver. The path loss
Figure BDA0003382459870000068
depends on the distance between the transceivers, and $\alpha_u$ denotes the path loss constant. If another drone UAV-m ($m \ne i$) also transmits on the same sub-channel k, the interference power at UAV-j is given by:
Figure BDA0003382459870000069
Thus, the data rate from UAV-i to UAV-j over all sub-channels can be expressed as
Figure BDA00033824598700000610
where
Figure BDA0003382459870000071
is the noise power on the k-th sub-channel. The U2B data rate can be defined similarly. Each drone collects sensing data as it flies over the ground IoT devices, so I2U communication also qualifies as line-of-sight transmission. The I2U channel can therefore be characterized approximately in the same way as the U2U and U2B channels.
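The channel model above can be illustrated with a short sketch. The source gives the received power, interference and rate only as equation images, so the forms used here are assumptions: a power-law path loss $d^{-\alpha_u}$, received power equal to transmit power times gain times path loss, and a Shannon-type rate $\log_2(1 + \text{SINR})$ with unit bandwidth summed over sub-channels. The function names and argument layout are hypothetical.

```python
import math


def received_power(p_tx, beta, distance, alpha):
    # Assumed model: transmit power x constant gain x distance^(-alpha).
    return p_tx * beta * distance ** (-alpha)


def link_rate(tx_powers, beta, distance, interferers, noise, alpha=2.0):
    """Rate from UAV-i to UAV-j summed over sub-channels.

    tx_powers:   {k: transmit power of UAV-i on sub-channel k}
    interferers: {k: [(power of UAV-m on k, distance from UAV-m to UAV-j), ...]}
    noise:       {k: noise power on sub-channel k}
    """
    rate = 0.0
    for k, p in tx_powers.items():
        signal = received_power(p, beta, distance, alpha)
        interference = sum(received_power(pm, beta, dm, alpha)
                           for pm, dm in interferers.get(k, []))
        rate += math.log2(1.0 + signal / (interference + noise[k]))
    return rate


# One sub-channel, no interference: SINR = 1, so the rate is 1 bit/s/Hz.
print(link_rate({0: 4.0}, beta=1.0, distance=2.0, interferers={}, noise={0: 1.0}))  # 1.0
```

Adding an interferer on the same sub-channel lowers the rate, which is the coupling that motivates assigning orthogonal sub-channels to nearby links.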
The scheme models the problem as follows. For each UAV-i, a time slot is further divided into sensing, transmission and flight sub-slots, denoted by $t_{i,s}$, $t_{i,o}$ and $t_{i,f}$, respectively. The data $s_i(t)$ received by UAV-i during sensing depends on its coverage and the I2U transmission rate. Let $W_m(t)$ denote the remaining data of IoT user
Figure BDA0003382459870000072
. The data queue of IoT user m is updated as follows:
Figure BDA0003382459870000073
where $[X]^+$ denotes the operation $\max\{0, X\}$, $x_{i,m}(t) \in \{0, 1\}$ indicates whether IoT user m communicates with UAV-i, and $s_{i,m}(t) \le D_m$ is the amount of sensing data collected by UAV-i. Let
Figure BDA0003382459870000074
denote the set of users under the coverage of UAV-i; then
Figure BDA0003382459870000075
and the buffer dynamics of the UAV can be modeled as follows:
Figure BDA0003382459870000076
where
Figure BDA0003382459870000077
denotes the output data of UAV-i. The first term of $O_i(t)$, $o_{i,0}(t)$, is the data transmitted to the BS, and the second term
Figure BDA0003382459870000078
is the data sent to other drones. The third term of $D_i(t+1)$,
Figure BDA0003382459870000079
is the data received from other drones.
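The queue and buffer dynamics described above can be sketched as follows. The update equations appear in the source only as equation images, so the forms here are assumptions consistent with the surrounding text: the user queue shrinks by the data collected from it and is clipped at zero, and the UAV buffer is the clipped remainder after output, plus newly sensed data, plus data received from other drones. The function names are hypothetical.

```python
def update_user_queue(w_m, collected):
    # W_m(t+1) = [W_m(t) - sum of data collected from user m]^+
    return max(0.0, w_m - sum(collected))


def update_uav_buffer(d_i, to_bs, to_uavs, sensed, from_uavs):
    # O_i(t) = data sent to the BS + data sent to other UAVs.
    output = to_bs + sum(to_uavs)
    # Assumed three-term update: clipped remainder + sensed data + relayed-in data.
    return max(0.0, d_i - output) + sensed + sum(from_uavs)


print(update_user_queue(5.0, [2.0, 4.0]))                  # 0.0
print(update_uav_buffer(10.0, 3.0, [2.0], 4.0, [1.0]))     # 10.0
```

The result of `update_uav_buffer` would still need to respect the buffer capacity $D_{\max}$; that check is left out of this sketch.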
Given a task completion time T, the total energy consumption of the drones can be expressed as
Figure BDA00033824598700000710
where
Figure BDA00033824598700000711
represents the total energy consumption of the drones on the different sub-channels. The problem of minimizing the total energy is then modeled as follows:
Figure BDA00033824598700000712
The scheme optimizes, under the constraints above, the network formation strategy $(\Phi(t), \Psi_k(t))$ and the binary matrix
Figure BDA00033824598700000713
This matrix specifies the I2U connection policy in each time slot. All these matrix variables are optimized jointly with the trajectories $L_i$ of the drones. The total number of time slots T needed to offload all user data is optimized as well. The problem can be simplified by fixing the sensing strategy: given the positions of the drones, the I2U association matrix $X(t)$ can be determined. The constraints $D_i(t) \le D_{\max}$, $D_i(T) = 0$, $W_m(0) = D_m$ and $W_m(T) = 0$ ensure that the sensing data of all IoT users are offloaded to the BS after T time slots. The inequalities $\|l_i(t+1) - l_i(t)\| \le v_{\max} t_{i,f}$ and $\|l_i(t) - l_j(t)\| \ge d_{\min}$ limit the flight speed of the drones and the distance between them. In practice, the transmit power $p_i$ of the drones is far smaller than the power consumed by hovering and flying, so it can be neglected in the optimization problem.
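The speed and safe-distance constraints on the trajectories can be checked with a small helper. This is a sketch under assumptions: trajectories are lists of 2-D positions of equal length, the flight sub-slot duration is the same for every drone and slot, and the function name `trajectories_feasible` is hypothetical.

```python
import math
from itertools import combinations


def trajectories_feasible(trajs, v_max, t_flight, d_min):
    """trajs: {uav_id: [(x, y), ...]} with one position per time slot."""
    # Speed limit: ||l_i(t+1) - l_i(t)|| <= v_max * t_flight
    for path in trajs.values():
        for a, b in zip(path, path[1:]):
            if math.dist(a, b) > v_max * t_flight + 1e-9:
                return False
    # Safe distance: ||l_i(t) - l_j(t)|| >= d_min for every pair and slot
    for (_, path_i), (_, path_j) in combinations(trajs.items(), 2):
        for a, b in zip(path_i, path_j):
            if math.dist(a, b) < d_min:
                return False
    return True


trajs = {1: [(0.0, 0.0), (1.0, 0.0)], 2: [(0.0, 5.0), (1.0, 5.0)]}
print(trajectories_feasible(trajs, v_max=1.0, t_flight=1.0, d_min=2.0))  # True
print(trajectories_feasible(trajs, v_max=1.0, t_flight=1.0, d_min=6.0))  # False
```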
The modeling details, complexity and simplification of the problem have been described above; the two iterative algorithms proposed in this scheme perform well on this problem.
In this scheme, adaptive network formation is added to multi-drone path planning, and a two-stage algorithm iterates between adaptive network formation and trajectory optimization so that the two cooperate. The adaptive network formation is based on the heuristic EDA-NF algorithm, which balances the energy consumption and data buffer queue sizes of the drones. Compared with traditional strategies, the algorithm has low computational complexity and is more efficient. Combining trajectory optimization with network formation makes the system more flexible and allows the drones to adapt the network structure as their positions change. The algorithm also optimizes energy consumption and delay, reducing both for the system and making practical use of the system feasible.
While the preferred embodiments of the present invention have been illustrated and described, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (6)

1. An adaptive network formation and trajectory optimization method for multiple unmanned aerial vehicles, characterized by comprising the following steps:
based on a heuristic method, adjusting the network formation according to the energy consumption and data buffer state of the unmanned aerial vehicles to obtain an adaptive network formation strategy;
based on a multi-agent reinforcement learning method and in combination with the adaptive network formation strategy, jointly optimizing the trajectories of the multiple unmanned aerial vehicles.
2. The method of claim 1, wherein the step of adjusting the network formation according to the energy consumption and data buffer state of the unmanned aerial vehicles based on a heuristic method to obtain an adaptive network formation strategy specifically comprises:
at each transmission sub-time slot t, each unmanned aerial vehicle reports its current state to the base station;
the current state includes the location $l_i(t)$, the network formation strategy $(\Phi(t), \Psi_k(t))$, the energy consumption
Figure FDA0003382459860000011
and the data buffer information
Figure FDA0003382459860000012
when the base station has collected the state information of all the unmanned aerial vehicles, it adjusts the network formation strategy $(\Phi(t), \Psi_k(t))$ with the aim of balancing the energy consumption and queue sizes of the unmanned aerial vehicles;
the base station evaluates the cost function of each drone in each time slot t, and when the cost function of the i-th drone keeps increasing beyond a threshold, the i-th drone is allowed to connect to the nearby drone with the minimum cost $c_j(t)$.
3. The method of claim 2, further comprising:
when the situation is judged to be unfavorable for data collection by the base station, prohibiting connections between certain unmanned aerial vehicles.
4. The method of claim 3, wherein the step of jointly optimizing the trajectories of the multiple drones based on a multi-agent reinforcement learning method and in combination with the adaptive network formation strategy specifically comprises:
for the multi-UAV system, defining the state $s_i(t)$ of each UAV within the joint observation of all UAV states, and its action $a_i(t)$;
in time slot t and state $s(t)$, the i-th UAV takes action $a_i(t)$ and obtains a reward $R_i(s(t), a_i(t))$;
performing trajectory optimization according to the reward $R_i(s(t), a_i(t))$.
5. The method of claim 4, wherein the reward comprises an energy reward R_{i,e}(t), a transmission reward R_{i,d}(t), and a sensing reward R_{i,c}(t):
the energy reward R_{i,e}(t) is defined as the negative of the energy consumed, and causes the i-th drone to reduce its energy consumption in each time slot;
the transmission reward R_{i,d}(t) represents the amount of data transmitted from the i-th drone to the base station;
the sensing reward R_{i,c}(t) represents the data sent back by the sensors of the Internet of Things users within the coverage area of the i-th drone.
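As a non-limiting illustration, the three reward components of claim 5 can be combined into a single per-drone reward as sketched below. The claims do not state how the components are combined; the weighted sum, the default weights, and the function names are assumptions made only for this sketch.

```python
def energy_reward(energy_consumed):
    # R_{i,e}(t): negative of the energy consumed in the slot.
    return -energy_consumed


def total_reward(energy_consumed, data_to_bs, data_sensed, weights=(1.0, 1.0, 1.0)):
    """Assumed combination of the three rewards for drone i in slot t.

    energy_consumed: energy used by the drone in the slot
    data_to_bs:      amount of data delivered to the base station, R_{i,d}(t)
    data_sensed:     amount of data received from IoT sensors in coverage, R_{i,c}(t)
    weights:         hypothetical trade-off weights (w_e, w_d, w_c)
    """
    w_e, w_d, w_c = weights
    return w_e * energy_reward(energy_consumed) + w_d * data_to_bs + w_c * data_sensed
```

For example, with unit weights, a drone that consumes 2.0 units of energy, delivers 3.0 units of data and collects 1.5 units of sensor data receives a reward of -2.0 + 3.0 + 1.5 = 2.5.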
6. The method of claim 5, further comprising:
performing the trajectory optimization in combination with a penalty function.
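The claim does not specify the penalty function. Purely as a hypothetical example, a penalty could be subtracted from the reward when a drone leaves a service area or comes too close to another drone; the area bounds, minimum separation, penalty weight, and function names below are all assumptions and are not taken from the claims.

```python
import math


def penalty(position, others, area=(0.0, 0.0, 1000.0, 1000.0), min_sep=10.0, weight=5.0):
    """Hypothetical penalty for drone i in one time slot.

    position: (x, y) of the drone
    others:   list of (x, y) positions of the other drones
    area:     assumed service area as (x_min, y_min, x_max, y_max)
    """
    x, y = position
    x_min, y_min, x_max, y_max = area
    p = 0.0
    # Penalize leaving the assumed service area.
    if not (x_min <= x <= x_max and y_min <= y <= y_max):
        p += weight
    # Penalize each drone that is closer than the assumed minimum separation.
    p += weight * sum(1 for o in others if math.dist(position, o) < min_sep)
    return p


def penalized_reward(reward, position, others, **kwargs):
    # Reward used for trajectory optimization: base reward minus the penalty.
    return reward - penalty(position, others, **kwargs)
```

With these defaults, a drone inside the area and far from the others keeps its reward unchanged, while each violation lowers the reward by `weight`.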
CN202111439489.9A 2021-11-30 Self-adaptive network forming and track optimizing method for multiple unmanned aerial vehicles Active CN114222251B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111439489.9A CN114222251B (en) 2021-11-30 Self-adaptive network forming and track optimizing method for multiple unmanned aerial vehicles

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111439489.9A CN114222251B (en) 2021-11-30 Self-adaptive network forming and track optimizing method for multiple unmanned aerial vehicles

Publications (2)

Publication Number Publication Date
CN114222251A (en) 2022-03-22
CN114222251B (en) 2024-06-28

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110166107A (en) * 2019-05-17 2019-08-23 武汉大学 Resource allocation method for unmanned aerial vehicle relay system based on wireless energy-carrying communication network
CN110380776A (en) * 2019-08-22 2019-10-25 电子科技大学 Data collection method for Internet of things system based on unmanned aerial vehicle
CN110531617A (en) * 2019-07-30 2019-12-03 北京邮电大学 Joint optimization method and device for 3D hovering positions of multiple unmanned aerial vehicles, and unmanned aerial vehicle base station
CN111193536A (en) * 2019-12-11 2020-05-22 西北工业大学 Multi-unmanned aerial vehicle base station track optimization and power distribution method
CN111786713A (en) * 2020-06-04 2020-10-16 大连理工大学 Unmanned aerial vehicle network hovering position optimization method based on multi-agent deep reinforcement learning
US20200359297A1 (en) * 2018-12-28 2020-11-12 Beijing University Of Posts And Telecommunications Method of Route Construction of UAV Network, UAV and Storage Medium thereof
CN112104502A (en) * 2020-09-16 2020-12-18 云南大学 Time-sensitive multitask edge computing and cache cooperation unloading strategy method
CN112188515A (en) * 2020-08-27 2021-01-05 清华大学 Deep and distant sea information service quality optimization method based on unmanned aerial vehicle network
CN112256056A (en) * 2020-10-19 2021-01-22 中山大学 Unmanned aerial vehicle control method and system based on multi-agent deep reinforcement learning

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114980020A (en) * 2022-05-17 2022-08-30 重庆邮电大学 Unmanned aerial vehicle data collection method based on MADDPG algorithm
CN115167506A (en) * 2022-06-27 2022-10-11 华南师范大学 Method, device, equipment and storage medium for updating and planning flight line of unmanned aerial vehicle
CN116506965A (en) * 2023-06-20 2023-07-28 南方科技大学 Multi-unmanned aerial vehicle communication resource allocation method and terminal
CN116506965B (en) * 2023-06-20 2023-09-19 南方科技大学 Multi-unmanned aerial vehicle communication resource allocation method and terminal

Similar Documents

Publication Publication Date Title
Samir et al. Optimizing age of information through aerial reconfigurable intelligent surfaces: A deep reinforcement learning approach
CN108832998B (en) Cooperative data distribution method in air-ground converged communication network
CN112118556A (en) Unmanned aerial vehicle track and power joint optimization method based on deep reinforcement learning
CN111163511A (en) Intelligent reflection surface assisted uplink power distribution method with limited delay in millimeter wave communication
CN114422363B (en) Capacity optimization method and device for unmanned aerial vehicle-mounted RIS auxiliary communication system
CN108668257B (en) Distributed unmanned aerial vehicle postman relay trajectory optimization method
CN108834049B (en) Wireless energy supply communication network and method and device for determining working state of wireless energy supply communication network
CN115802318B (en) Unmanned aerial vehicle-based auxiliary Internet of vehicles resource optimization method, equipment and medium
CN113255218B (en) Unmanned aerial vehicle autonomous navigation and resource scheduling method of wireless self-powered communication network
CN115499921A (en) Three-dimensional trajectory design and resource scheduling optimization method for complex unmanned aerial vehicle network
Gong et al. Bayesian optimization enhanced deep reinforcement learning for trajectory planning and network formation in multi-UAV networks
CN113206701A (en) Three-dimensional deployment and power distribution joint optimization method for unmanned aerial vehicle flight base station
Ghorbel et al. Energy efficient data collection for wireless sensors using drones
CN114222251B (en) Self-adaptive network forming and track optimizing method for multiple unmanned aerial vehicles
Gendia et al. UAV positioning with joint NOMA power allocation and receiver node activation
CN114222251A (en) Adaptive network forming and track optimizing method for multiple unmanned aerial vehicles
CN116009590B (en) Unmanned aerial vehicle network distributed track planning method, system, equipment and medium
CN111741520A (en) Cognitive underwater acoustic communication system power distribution method based on particle swarm
CN115412156B (en) Urban monitoring-oriented satellite energy-carrying Internet of things resource optimal allocation method
CN116684851A (en) MAPPO-based multi-RIS auxiliary Internet of vehicles throughput improving method
Wang et al. Adaptive network formation and trajectory optimization for multi-UAV-assisted wireless data offloading
Liu et al. Outage probability minimization for vehicular networks via joint clustering, UAV trajectory optimization and power allocation
CN115802370A (en) Communication method and device
CN113055826A (en) Large-scale unmanned aerial vehicle cluster data collection method combining clustering and three-dimensional trajectory planning
CN112929977B (en) Deep learning amplification forwarding cooperative network energy efficiency resource allocation method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant