CN116225058A - Unmanned aerial vehicle track planning method and device, electronic equipment and storage medium - Google Patents

Unmanned aerial vehicle track planning method and device, electronic equipment and storage medium

Info

Publication number
CN116225058A
Authority
CN
China
Prior art keywords
aerial vehicle
unmanned aerial
user terminal
area
hot spot
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310181830.8A
Other languages
Chinese (zh)
Inventor
李文璟
喻鹏
田静悦
周凡钦
丰雷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Posts and Telecommunications
Original Assignee
Beijing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Posts and Telecommunications
Priority to CN202310181830.8A
Publication of CN116225058A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G05: CONTROLLING; REGULATING
    • G05D: SYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
    • G05D1/00: Control of position, course, altitude or attitude of land, water, air or space vehicles, e.g. using automatic pilots
    • G05D1/10: Simultaneous control of position or course in three dimensions
    • G05D1/101: Simultaneous control of position or course in three dimensions specially adapted for aircraft
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00: Reducing energy consumption in communication networks
    • Y02D30/70: Reducing energy consumption in communication networks in wireless communication networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Aviation & Aerospace Engineering (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Automation & Control Theory (AREA)
  • Traffic Control Systems (AREA)

Abstract

The invention provides an unmanned aerial vehicle track planning method and device, electronic equipment, and a storage medium. User terminals in an area to be served are clustered to obtain a plurality of user terminal clusters, each of which corresponds to a hot spot area. According to the distribution of the hot spot areas, a coordinate point sequence corresponding to the optimal energy consumption ratio of communication between the unmanned aerial vehicle and the user terminal clusters in the area to be served is acquired and used as the flight passing point sequence of the unmanned aerial vehicle. The flight track of the unmanned aerial vehicle in the area to be served is then acquired based on this flight passing point sequence. Calculating the optimal energy consumption ratio of communication between the unmanned aerial vehicle and the corresponding user terminal clusters improves the signal transmission capacity of the whole communication system. Dividing the area into hot spot areas allows each hot spot area to be covered quickly, reduces the energy the unmanned aerial vehicle would otherwise lose in early exploration of the hot spot areas, and improves signal transmission efficiency, so that optimal track planning of the unmanned aerial vehicle in the area to be served is achieved.

Description

Unmanned aerial vehicle track planning method and device, electronic equipment and storage medium
Technical Field
The invention relates to the technical field of unmanned aerial vehicles, in particular to an unmanned aerial vehicle track planning method, an unmanned aerial vehicle track planning device, electronic equipment and a storage medium.
Background
In recent years, natural disasters such as earthquakes and floods, as well as terrorist attacks, have destroyed ground mobile networks and interrupted communication in the affected areas. Because traffic routes are damaged and personnel and vehicles have difficulty getting through, ground infrastructure is hard to repair in a short time. Search and rescue teams often rely on public safety communication networks, which support only voice services and cannot readily provide higher-level services, and this hampers search and rescue work. The existing point-to-point and satellite communication modes are difficult to use on a large scale because of problems such as difficult route optimization and the limited resources of connectable sites. Because an unmanned aerial vehicle is small, easy and cheap to deploy, and highly flexible, an aerial base station based on an unmanned aerial vehicle is regarded as an effective means of emergency communication. However, resources are scarce in emergency scenarios and the battery capacity of the unmanned aerial vehicle is limited, so planning its track reasonably is very important. Existing track planning methods do not consider the energy lost while the unmanned aerial vehicle communicates with the corresponding user terminal cluster, so the signal transmission capacity of the whole communication system is poor. They also perform point-to-point coverage of the whole area according to the maximum coverage radius of the unmanned aerial vehicle, which causes a large signal energy loss and reduces signal transmission efficiency.
Disclosure of Invention
The invention provides an unmanned aerial vehicle track planning method and device, electronic equipment, and a storage medium, which address two defects of the prior art: the energy lost while the unmanned aerial vehicle communicates with the corresponding user terminal cluster is not considered, so the signal transmission capacity of the whole communication system is poor; and the whole area is covered point-to-point according to the maximum coverage radius of the unmanned aerial vehicle, so the signal energy loss is large.
The invention provides an unmanned aerial vehicle track planning method, which comprises the following steps:
clustering user terminals in a to-be-served area to obtain a plurality of user terminal clusters, wherein each user terminal cluster corresponds to one hot spot area;
according to the distribution of the hot spot areas, acquiring a coordinate point sequence corresponding to the optimal energy consumption ratio of the unmanned aerial vehicle and the user terminal cluster communication in the area to be served, and taking the coordinate point sequence as a flight passing point sequence of the unmanned aerial vehicle;
and acquiring the flight track of the unmanned aerial vehicle in the to-be-serviced area based on the flight passing point sequence of the unmanned aerial vehicle.
According to the unmanned aerial vehicle track planning method provided by the invention, acquiring, according to the distribution of the hot spot areas, the coordinate point sequence corresponding to the optimal energy consumption ratio of communication between the unmanned aerial vehicle and the user terminal clusters in the area to be served comprises the following steps:
establishing a deep reinforcement learning model covering the hot spot areas in the area to be served, and taking the boundary points of the area to be served as the state limit values of the deep reinforcement learning model;
constructing a reward function, wherein the reward function comprises the ratio of the transmission data amount of the unmanned aerial vehicle to the total energy loss of the unmanned aerial vehicle, and the coverage condition of the hot spot areas;
and solving the deep reinforcement learning model according to the reward function to obtain the coordinate point sequence corresponding to the optimal energy consumption ratio of communication between the unmanned aerial vehicle and the corresponding user terminal clusters in the area to be served.
According to the unmanned aerial vehicle track planning method provided by the invention, the method for acquiring the transmission data amount of the unmanned aerial vehicle comprises the following steps:
acquiring the average link loss of communication between the unmanned aerial vehicle and the corresponding user terminal cluster based on a channel model;
acquiring the average time-frequency resource proportion occupied by a user terminal in the hot spot area based on a beam scheduling model;
and inputting the average link loss, the average time-frequency resource proportion occupied by a user terminal in the hot spot area, and the duration of communication between the unmanned aerial vehicle and the user terminal into a transmission model to acquire the transmission data amount of the unmanned aerial vehicle.
According to the unmanned aerial vehicle track planning method provided by the invention, the method for acquiring the total energy loss of the unmanned aerial vehicle comprises the following steps:
acquiring, based on an energy consumption model, the constant-speed flight energy loss, the acceleration flight energy loss, and the deceleration flight energy loss of the unmanned aerial vehicle, and the energy loss of communication between the unmanned aerial vehicle and the user terminal;
and taking the sum of the constant-speed flight energy loss, the acceleration flight energy loss, the deceleration flight energy loss, and the communication energy loss as the total energy loss of the unmanned aerial vehicle.
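The summation above, and the data-to-energy ratio that the reward function is built on, can be sketched in a few lines of Python. This is a minimal illustration rather than the patent's energy consumption model: the class name `EnergyTerms`, its field names, and the sample values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class EnergyTerms:
    """Energy losses in joules. All field names are illustrative assumptions."""
    cruise: float       # constant-speed flight
    accelerate: float   # acceleration phases
    decelerate: float   # deceleration phases
    communicate: float  # communication with user terminals

    def total(self) -> float:
        # Total energy loss is the plain sum of the four terms.
        return self.cruise + self.accelerate + self.decelerate + self.communicate


def energy_ratio(data_bits: float, energy: EnergyTerms) -> float:
    """Transmitted data per unit of energy (bits per joule)."""
    total = energy.total()
    if total <= 0:
        raise ValueError("total energy loss must be positive")
    return data_bits / total


# Hypothetical sample values, not taken from the source.
terms = EnergyTerms(cruise=800.0, accelerate=120.0, decelerate=80.0, communicate=200.0)
ratio = energy_ratio(2.4e6, terms)
```

A larger ratio means more data delivered for the same energy, which is the quantity the planning method seeks to maximize.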
According to the unmanned aerial vehicle track planning method provided by the invention, solving the deep reinforcement learning model according to the reward function comprises the following steps:
presetting a passing point sequence in the area to be served as the state value, wherein the state value is the current position coordinate;
determining an action according to the ε-greedy strategy;
executing the action under the current state value to obtain the next state value and a reward value, wherein the reward value is obtained from the reward function;
guiding the action by the reward value to obtain the optimal next state value;
stepping the deep reinforcement learning model forward to obtain the next state value at each moment in a future period of time;
and obtaining, according to the optimal next state value at each moment in the future period of time, the optimal coordinate point sequence and the corresponding optimal energy efficiency ratio of communication between the unmanned aerial vehicle and the corresponding user terminal clusters in the area to be served, and taking them as the output of the deep reinforcement learning model.
According to the unmanned aerial vehicle track planning method provided by the invention, the reward function further comprises constraint conditions, and the constraint conditions comprise:
the signal transmission rate between the unmanned aerial vehicle and the user terminal meets the signal transmission rate required by the user terminal;
the average link loss of the user terminal does not exceed the maximum link loss;
the displacement of the unmanned aerial vehicle at each moment does not exceed the maximum coverage diameter of the unmanned aerial vehicle;
and the starting points of the flight track of the unmanned aerial vehicle coincide.
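The four constraints can be checked on a candidate passing point sequence as sketched below. This is an illustration under stated assumptions: the function name and parameters are hypothetical, and the fourth constraint is read here as "the track ends where it started", which is one interpretation of the source wording and not confirmed by it.

```python
import math


def satisfies_constraints(waypoints, rates, required_rates, link_losses,
                          max_link_loss, max_coverage_diameter, tol=1e-9):
    """Return True if a candidate track meets all four constraints.

    waypoints: list of (x, y) passing points, one per moment.
    rates / required_rates: achieved and required rate per user terminal.
    link_losses: average link loss per user terminal.
    """
    # 1. Every terminal's achieved rate meets its required rate.
    if any(r < req for r, req in zip(rates, required_rates)):
        return False
    # 2. No terminal's average link loss exceeds the maximum link loss.
    if any(loss > max_link_loss for loss in link_losses):
        return False
    # 3. Displacement between consecutive moments stays within the
    #    maximum coverage diameter.
    for p, q in zip(waypoints, waypoints[1:]):
        if math.dist(p, q) > max_coverage_diameter + tol:
            return False
    # 4. ASSUMPTION: the first and last passing points coincide (closed track).
    return math.dist(waypoints[0], waypoints[-1]) <= tol


# Hypothetical closed square track with 100 m sides.
square = [(0.0, 0.0), (100.0, 0.0), (100.0, 100.0), (0.0, 100.0), (0.0, 0.0)]
ok = satisfies_constraints(square, [5.0, 6.0], [4.0, 6.0], [90.0, 95.0], 100.0, 150.0)
```

In a reinforcement learning setting, a failed check would typically translate into a penalty in the reward rather than a hard rejection; how the source combines the two is not stated here.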
According to the unmanned aerial vehicle track planning method provided by the invention, clustering the user terminals in the area to be served comprises the following steps:
randomly selecting a user terminal, and calculating the neighborhood user density value of the user terminal within the clustering radius;
if the neighborhood user density value is greater than or equal to a preset threshold, dividing the user terminal and the user terminals within the clustering radius into one user terminal cluster;
repeating the above steps until the neighborhood user density value of every user terminal not in a user terminal cluster is smaller than the preset threshold.
The invention also provides an unmanned aerial vehicle track planning device, which comprises:
the hot spot detection module is used for clustering the user terminals in the area to be served to obtain a plurality of user terminal clusters, and each user terminal cluster corresponds to one hot spot area;
the solving module is used for acquiring a coordinate point sequence corresponding to the optimal energy consumption ratio of the unmanned aerial vehicle and the user terminal cluster communication in the to-be-serviced area according to the hot spot area distribution, and taking the coordinate point sequence as a flight passing point sequence of the unmanned aerial vehicle;
and the track planning module is used for acquiring the flight track of the unmanned aerial vehicle in the to-be-serviced area based on the flight passing point sequence of the unmanned aerial vehicle.
The invention also provides electronic equipment, which comprises a memory, a processor and a computer program stored in the memory and capable of running on the processor, wherein the processor realizes any one of the unmanned aerial vehicle track planning methods when executing the program.
The invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements a method of unmanned aerial vehicle trajectory planning as described in any of the above.
The invention provides an unmanned aerial vehicle track planning method and device, electronic equipment, and a storage medium. User terminals in an area to be served are clustered to obtain a plurality of user terminal clusters, each of which corresponds to a hot spot area. According to the distribution of the hot spot areas, a coordinate point sequence corresponding to the optimal energy consumption ratio of communication between the unmanned aerial vehicle and the user terminal clusters in the area to be served is acquired and used as the flight passing point sequence of the unmanned aerial vehicle. The flight track of the unmanned aerial vehicle in the area to be served is then acquired based on this flight passing point sequence. Calculating the optimal energy consumption ratio of communication between the unmanned aerial vehicle and the corresponding user terminal clusters improves the signal transmission capacity of the whole communication system. Dividing the area into hot spot areas allows each hot spot area to be covered quickly, reduces the energy the unmanned aerial vehicle would otherwise lose in early exploration of the hot spot areas, and improves signal transmission efficiency, so that optimal track planning of the unmanned aerial vehicle in the area to be served is achieved.
Drawings
To illustrate the technical solutions of the invention or of the prior art more clearly, the drawings used in the description of the embodiments or of the prior art are briefly introduced below. The drawings described below show some embodiments of the invention; a person skilled in the art can obtain other drawings from them without inventive effort.
Fig. 1 is a first schematic flow chart of the unmanned aerial vehicle track planning method provided by the invention;
Fig. 2 is a second schematic flow chart of the unmanned aerial vehicle track planning method provided by the invention;
Fig. 3 is a third schematic flow chart of the unmanned aerial vehicle track planning method provided by the invention;
Fig. 4 is a fourth schematic flow chart of the unmanned aerial vehicle track planning method provided by the invention;
Fig. 5 is a fifth schematic flow chart of the unmanned aerial vehicle track planning method provided by the invention;
Fig. 6 is a sixth schematic flow chart of the unmanned aerial vehicle track planning method provided by the invention;
Fig. 7 is a seventh schematic flow chart of the unmanned aerial vehicle track planning method provided by the invention;
Fig. 8 is an eighth schematic flow chart of the unmanned aerial vehicle track planning method provided by the invention;
Fig. 9 is a schematic structural diagram of the unmanned aerial vehicle track planning device provided by the invention;
Fig. 10 is a schematic structural diagram of the electronic equipment provided by the invention.
Detailed Description
To make the objects, technical solutions, and advantages of the invention clearer, the technical solutions of the invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, but not all, embodiments of the invention. All other embodiments obtained by a person skilled in the art based on these embodiments without inventive effort fall within the scope of protection of the invention.
Fig. 1 is a flow chart of the unmanned aerial vehicle track planning method provided by the invention. As shown in Fig. 1, the method comprises the following steps:
Step 101: clustering user terminals in an area to be served to obtain a plurality of user terminal clusters, wherein each user terminal cluster corresponds to a hot spot area;
Step 102: acquiring, according to the distribution of the hot spot areas, a coordinate point sequence corresponding to the optimal energy consumption ratio of communication between the unmanned aerial vehicle and the user terminal clusters in the area to be served, and taking the coordinate point sequence as the flight passing point sequence of the unmanned aerial vehicle;
Step 103: acquiring the flight track of the unmanned aerial vehicle in the area to be served based on the flight passing point sequence of the unmanned aerial vehicle.
Existing track planning methods for unmanned aerial vehicles do not consider the energy lost while the unmanned aerial vehicle communicates with the corresponding user terminal cluster, so the signal transmission capacity of the whole communication system is poor. They also cover the whole area according to the maximum coverage radius of the unmanned aerial vehicle, which causes a large signal energy loss and reduces signal transmission efficiency.
In the unmanned aerial vehicle track planning method provided by the invention, user terminals in an area to be served are clustered to obtain a plurality of user terminal clusters, each of which corresponds to a hot spot area. According to the distribution of the hot spot areas, a coordinate point sequence corresponding to the optimal energy consumption ratio of communication between the unmanned aerial vehicle and the user terminal clusters in the area to be served is acquired and used as the flight passing point sequence of the unmanned aerial vehicle. The flight track of the unmanned aerial vehicle in the area to be served is then acquired based on this flight passing point sequence. Calculating the optimal energy consumption ratio of communication between the unmanned aerial vehicle and the corresponding user terminal clusters improves the signal transmission capacity of the whole communication system. Dividing the area into hot spot areas allows each hot spot area to be covered quickly, reduces the energy the unmanned aerial vehicle would otherwise lose in early exploration of the hot spot areas, and improves signal transmission efficiency, so that optimal track planning of the unmanned aerial vehicle in the area to be served is achieved.
Based on any of the above embodiments, as shown in Fig. 2, clustering the user terminals in the area to be served includes:
Step 201: randomly selecting a user terminal, and calculating the neighborhood user density value of the user terminal within the clustering radius;
Step 202: if the neighborhood user density value is greater than or equal to a preset threshold, dividing the user terminal and the user terminals within the clustering radius into one user terminal cluster;
Step 203: repeating the above steps until the neighborhood user density value of every user terminal not in a user terminal cluster is smaller than the preset threshold.
Because the embodiment of the invention addresses coverage compensation for hot spot areas in emergency communication, the unmanned aerial vehicle must dynamically cover the hot spot areas during flight to provide additional coverage compensation. The hot spot areas of the area to be served therefore need to be determined first, which provides the basis for subsequent track planning.
In the embodiment of the invention, the DBSCAN algorithm is used to cluster the ground user terminals according to their distribution density, and each cluster is a hot spot area. DBSCAN (Density-Based Spatial Clustering of Applications with Noise) is a typical density-based clustering method. It defines a cluster as a maximal set of density-connected points, can divide areas of sufficient density into clusters, and can find clusters of arbitrary shape in noisy spatial data sets.
Given the characteristics of emergency communication scenarios, coverage is focused on the hot spot areas where users gather after a disaster. A hot spot area detection algorithm based on the DBSCAN algorithm is applied to the area to be served: the distribution of ground users is analyzed and clustered in combination with historical user distribution information to determine the number and extent of the hot spot areas, so that the aerial base station based on the unmanned aerial vehicle can plan its track over them.
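The clustering step can be illustrated with a compact, dependency-free version of the textbook DBSCAN procedure. It is a sketch only: the function name `dbscan`, the parameter names `eps` (clustering radius) and `min_pts` (density threshold), and the sample coordinates are assumptions, and the hot spot detection algorithm of the embodiment may differ in detail, for example in how it uses historical user distribution information.

```python
import math


def dbscan(points, eps, min_pts):
    """Label each point with a cluster index (0, 1, ...) or -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)

    def neighbors(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE          # may later become a border point
            continue
        cluster += 1                   # point i is a core point: new cluster
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster    # noise reachable from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_pts:
                queue.extend(reachable)  # j is also a core point: keep expanding
    return labels


# Hypothetical user terminal positions: two dense groups and one isolated user.
terminals = [(0, 0), (0, 1), (1, 0), (1, 1),
             (10, 10), (10, 11), (11, 10), (11, 11),
             (50, 50)]
labels = dbscan(terminals, eps=1.5, min_pts=3)
```

Each non-negative label corresponds to one hot spot area, and terminals labeled -1 lie outside every hot spot area.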
Based on any of the above embodiments, as shown in Fig. 3, acquiring, according to the distribution of the hot spot areas, the coordinate point sequence corresponding to the optimal energy consumption ratio of communication between the unmanned aerial vehicle and the user terminal clusters in the area to be served includes:
Step 301: establishing a deep reinforcement learning model covering the hot spot areas in the area to be served, and taking the boundary points of the area to be served as the state limit values of the deep reinforcement learning model;
Step 302: constructing a reward function, wherein the reward function comprises the ratio of the transmission data amount of the unmanned aerial vehicle to the total energy loss of the unmanned aerial vehicle, and the coverage condition of the hot spot areas;
Step 303: solving the deep reinforcement learning model according to the reward function to obtain the coordinate point sequence corresponding to the optimal energy consumption ratio of communication between the unmanned aerial vehicle and the user terminal clusters in the area to be served.
In an embodiment of the invention, the DQN (Deep Q Network) algorithm is used to build the deep reinforcement learning model that covers the hot spot areas. The DQN algorithm improves on the Q-learning algorithm by applying a neural network: the neural network replaces the Q-table, and the optimal action is obtained by direct calculation. DQN is a value-based algorithm that relies on the interaction between the agent and the environment to complete the learning process. The agent observes the environment to obtain the state s_t, obtains the Q(s, a) values of all actions for that state from the value neural network, and determines the action a_t according to the ε-greedy strategy. The environment responds with the reward r_t, given by the reward function, and the new state s_{t+1}. The DQN algorithm iterates this procedure, together with the historical experience stored in the experience pool, until an optimal state is obtained.
According to the embodiment of the invention, the user terminals are divided into a plurality of clusters according to their positions and communication requirements, and the center of each cluster is a coverage compensation communication target of the unmanned aerial vehicle. To meet the emergency service requirements of the area to be served effectively, to make the track easy to solve, and to focus coverage on the hot spot areas, the flight of the unmanned aerial vehicle is reduced to a plurality of passing points. The flight track is obtained by presetting different numbers of passing points and solving for their positions.
The unmanned aerial vehicle takes actions to adjust the positions of the passing points according to the current environment and the rewards received, and thereby completes its movement. Because each action selection is discrete and independent of the others, the entire movement process can be modeled as a Markov decision process (MDP), which is a formalized description of the environment in reinforcement learning, that is, a model of the environment in which the agent is located. In reinforcement learning, almost all problems can be formally represented as a Markov decision process. When the flight path is planned, the deep reinforcement learning (DRL) algorithm can continuously change the positions of the passing points according to the environment and thus find their optimal positions more quickly.
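Two pieces of the loop described above, ε-greedy action selection and the one-step target that a DQN regresses its Q-values toward, can be sketched without a neural network. The function names, the discount factor `gamma`, and the sample Q-values are assumptions for illustration; the source does not give the network architecture or hyperparameters.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))      # explore
    # Exploit: index of the largest Q-value.
    return max(range(len(q_values)), key=q_values.__getitem__)


def td_target(reward, next_q_values, gamma, done=False):
    """One-step Q-learning target: r + gamma * max_a' Q(s', a')."""
    if done:
        return reward                            # no bootstrap at episode end
    return reward + gamma * max(next_q_values)


# Hypothetical Q-values for three actions in the current state.
q_now = [0.1, 0.9, 0.3]
greedy_action = epsilon_greedy(q_now, epsilon=0.0)   # epsilon = 0: always greedy
target = td_target(reward=1.0, next_q_values=[0.5, 2.0], gamma=0.9)
```

In a full DQN, `q_values` would come from the value network, transitions would be sampled from the experience pool, and the network would be trained to reduce the gap between Q(s_t, a_t) and `td_target`.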
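A single MDP transition for one passing point can be sketched as follows, with the boundary of the area to be served acting as the state limit. The discrete action set, the step size, and the clipping behavior at the boundary are assumptions made for illustration; the source does not list the action space.

```python
# Hypothetical discrete action set: stay, or move one step along an axis.
ACTIONS = {
    0: (0.0, 0.0),    # stay
    1: (1.0, 0.0),    # +x
    2: (-1.0, 0.0),   # -x
    3: (0.0, 1.0),    # +y
    4: (0.0, -1.0),   # -y
}


def step(state, action, step_size, bounds):
    """Move a passing point and clip it to the boundary of the served area.

    state: (x, y) current position of the passing point.
    bounds: ((x_min, x_max), (y_min, y_max)) of the area to be served.
    """
    (x_min, x_max), (y_min, y_max) = bounds
    dx, dy = ACTIONS[action]
    x = min(max(state[0] + dx * step_size, x_min), x_max)
    y = min(max(state[1] + dy * step_size, y_min), y_max)
    return (x, y)


area = ((0.0, 100.0), (0.0, 100.0))
inside = step((50.0, 50.0), 3, 10.0, area)    # ordinary move in +y
clipped = step((95.0, 50.0), 1, 10.0, area)   # would leave the area, so it is clipped
```

Because the next position depends only on the current position and the chosen action, the transition has the Markov property that the MDP formulation relies on.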
Based on any of the above embodiments, as shown in Fig. 4, the method for acquiring the transmission data amount of the unmanned aerial vehicle includes:
Step 401: obtaining the average link loss of communication between the unmanned aerial vehicle and the corresponding user terminal cluster based on a channel model;
In the embodiment of the invention, the unmanned aerial vehicle is deployed to serve densely distributed users: after the basic aerial base station has completed basic coverage of the whole area, the unmanned aerial vehicle provides coverage compensation for the hot spot areas, so the scenario is a dense urban environment. In this scenario, the link between the unmanned aerial vehicle and the user terminal is sometimes blocked by an obstruction, which produces a non-line-of-sight link. Both the line-of-sight link and the non-line-of-sight link are therefore considered for the transmission link established between the unmanned aerial vehicle and the user terminal.
The average link loss of communication between the unmanned aerial vehicle (UAV) and user terminal i during time slot t,
Figure BDA0004102552100000093
can be obtained by the following formula:
Figure BDA0004102552100000092
where P(LoS) is the line-of-sight propagation probability and P(NLoS) is the non-line-of-sight propagation probability, which can be obtained by:
$$P(\mathrm{LoS}) = a\,(\theta_i(t) - \theta_o)^{b}$$
$$P(\mathrm{NLoS}) = 1 - a\,(\theta_i(t) - \theta_o)^{b}$$
where $a$ is the first environmental coefficient and $b$ is the second environmental coefficient, both of which take different values in different environments; $\theta_i(t)$ is the angle between the ground and the transmission link established between the unmanned aerial vehicle and the user terminal, with $\theta_i \in [\theta_o, 90^\circ]$. $\theta_i(t)$ can be obtained by the following formula:
Figure BDA0004102552100000091
where $(x(t), y(t), z(t))$ are the coordinates of the unmanned aerial vehicle and $(x_i, y_i, h_i)$ are the coordinates of user terminal i.
For example, $(a, b) = (0.33, 0.23)$, $(PL_{NLoS}, PL_{LoS}) = (2, 2.65)$, and $\theta_o = 15^\circ$.
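The line-of-sight probability can be evaluated directly from the expression a(θ_i(t) − θ_o)^b with the example coefficients. The sketch below also includes two pieces that are assumptions, because the corresponding formulas appear only as images in the source: the elevation angle is computed as the arctangent of height difference over horizontal distance, and the average link loss is taken as the probability-weighted mean of a line-of-sight loss and a non-line-of-sight loss supplied by the caller. All function names are hypothetical.

```python
import math


def elevation_angle_deg(uav, ue):
    """Angle between the ground and the UAV-terminal link, in degrees.

    ASSUMPTION: arctan(height difference / horizontal distance).
    uav = (x, y, z), ue = (x_i, y_i, h_i).
    """
    dx, dy, dz = uav[0] - ue[0], uav[1] - ue[1], uav[2] - ue[2]
    return math.degrees(math.atan2(dz, math.hypot(dx, dy)))


def p_los(theta_deg, a=0.33, b=0.23, theta_o=15.0):
    """Line-of-sight probability a * (theta - theta_o) ** b."""
    if not theta_o <= theta_deg <= 90.0:
        raise ValueError("theta must lie in [theta_o, 90] degrees")
    return a * (theta_deg - theta_o) ** b


def average_link_loss(theta_deg, loss_los, loss_nlos, **coeffs):
    """ASSUMPTION: probability-weighted mean of the two link losses."""
    p = p_los(theta_deg, **coeffs)
    return p * loss_los + (1.0 - p) * loss_nlos


theta = elevation_angle_deg((0.0, 0.0, 100.0), (100.0, 0.0, 0.0))
p_overhead = p_los(90.0)   # UAV directly above the terminal
p_edge = p_los(15.0)       # lowest admissible angle
```

With the example coefficients, the probability rises from zero at the lowest admissible angle toward roughly 0.89 when the unmanned aerial vehicle is directly overhead, which matches the intuition that steeper links are less likely to be blocked.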
Step 402: acquiring the average time-frequency resource proportion occupied by a user terminal in the hot spot area based on a beam scheduling model;
In the embodiment of the invention, the unmanned aerial vehicle and the user terminal need to align their main beams before data transmission to ensure that the user receives a high-quality signal. In the first stage, a small number of wide beams perform a coarse-grained scan within the communication sector of the unmanned aerial vehicle to determine the wide beam aligned with the user. In the second stage, a plurality of narrow beams scan within the sector covered by that wide beam; because narrow millimeter-wave beams are strongly directional, a beam-level scan is needed to cover the whole area under consideration, after which beam alignment is achieved.
Beam alignment incurs a time overhead. Almost all of it arises in the second stage, because the time spent in the first stage is negligible. The beam alignment time τ can be obtained by the following formula:
Figure BDA0004102552100000101
where $\delta_{T,s}$ is the sector width at the unmanned aerial vehicle, $\delta_{R,s}$ is the sector width at the user terminal, $\delta_{T,i}$ is the beam width at the unmanned aerial vehicle, $\delta_{R,i}$ is the beam width at the user terminal, and $T_p$ is the time taken for the beam to traverse the entire sector and transmit the pilot signal at each position in order to perform beam alignment.
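Since the formula for τ survives only as an image, the sketch below uses one plausible form as an explicit assumption: the number of transmit beam positions times the number of receive beam positions, multiplied by a per-position pilot time. Both that form and the reading of the last argument as a per-position time are assumptions, as are the function name and the sample values.

```python
def beam_alignment_time(sector_tx, sector_rx, beam_tx, beam_rx, pilot_time):
    """ASSUMED form: (sector_tx / beam_tx) * (sector_rx / beam_rx) * pilot_time.

    Widths are angles in the same unit (for example degrees); pilot_time is
    taken here as the time spent per beam position, in seconds.
    """
    if min(sector_tx, sector_rx, beam_tx, beam_rx) <= 0:
        raise ValueError("sector and beam widths must be positive")
    tx_positions = sector_tx / beam_tx   # beam positions to try at the UAV
    rx_positions = sector_rx / beam_rx   # beam positions to try at the terminal
    return tx_positions * rx_positions * pilot_time


# Hypothetical values: 90-degree sectors, 10-degree and 30-degree beams.
tau = beam_alignment_time(90.0, 90.0, 10.0, 30.0, pilot_time=1e-5)
```

Under this assumed form, narrower beams lengthen alignment, which is the trade-off the passage describes: narrow millimeter-wave beams give better directivity but require more positions to be scanned.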
In the hot spot area, when the number of the user terminals receiving the unmanned aerial vehicle signals is larger than or equal to the number of the beams transmitted by the unmanned aerial vehicle, the beam scheduling is needed by adopting a round robin scheme, and the average time-frequency resource ratio occupied by the user terminals is the ratio of the number of the beams transmitted by the unmanned aerial vehicle to the number of the user terminals receiving the unmanned aerial vehicle signals in the hot spot area; otherwise, the average time-frequency resource ratio occupied by the user terminal is 1. Average time-frequency resource ratio eta occupied by User Equipment (UE) u The approximation is:
$$\eta_u = \begin{cases} \dfrac{N_b}{N_u}, & N_u \ge N_b \\ 1, & \text{otherwise} \end{cases}$$
where N_b is the number of beams transmitted by the unmanned aerial vehicle, and N_u is the number of user terminals receiving the unmanned aerial vehicle signal in the hot spot area.
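The round-robin rule described above can be sketched in a few lines of Python. The function name and argument names are hypothetical and chosen only for illustration; the logic follows the text: the ratio is N_b / N_u when N_u ≥ N_b and 1 otherwise.

```python
def average_resource_ratio(num_beams: int, num_users: int) -> float:
    """Average time-frequency resource ratio eta_u of one user terminal.

    num_beams: N_b, beams transmitted by the unmanned aerial vehicle.
    num_users: N_u, user terminals receiving its signal in the hot spot area.
    """
    if num_beams <= 0 or num_users < 0:
        raise ValueError("num_beams must be positive and num_users non-negative")
    if num_users >= num_beams:
        # More users than beams: round-robin scheduling shares the beams.
        return num_beams / num_users
    # Every user can be given a dedicated beam.
    return 1.0
```

For example, with 4 beams and 8 receiving user terminals each terminal occupies half of the time-frequency resources on average.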
Step 403: input the average link loss and the average time-frequency resource ratio occupied by the user terminals in the hot spot area into the transmission model to obtain the transmission data amount of the unmanned aerial vehicle.
In the embodiment of the invention, the transmission data amount of the unmanned aerial vehicle is the total amount of data, in bits, received by all user terminals covered by the unmanned aerial vehicle during its flight time T. The transmission data amount of the unmanned aerial vehicle
Figure BDA0004102552100000112
can be obtained as follows:
Figure BDA0004102552100000113
where
Figure BDA0004102552100000114
is the total transmission data amount of the i-th user terminal, in bits. The total transmission data amount of the i-th user terminal
Figure BDA0004102552100000115
can be obtained as follows:
Figure BDA0004102552100000116
When the unmanned aerial vehicle communicates with the user terminals in flight, the information throughput R_i(t) of the i-th user terminal in the coverage area is obtained, in nats, from the Shannon formula as follows:
Figure BDA0004102552100000117
where η_u is the average time-frequency resource ratio occupied by the user terminal, B_U is the bandwidth allocated to all user terminals within the connection range of the unmanned aerial vehicle, τ is the beam alignment time, t_tr is the time during which the unmanned aerial vehicle remains in communication with the i-th user terminal, and
Figure BDA0004102552100000118
is the signal-to-noise ratio when the unmanned aerial vehicle communicates with the i-th user terminal.
To simplify T_tr, the cluster center of each user terminal cluster is taken as the communication target in the subsequent simulation, and the average transmission time is the time during which the unmanned aerial vehicle remains in communication with the user terminal corresponding to each cluster center, namely
Figure BDA0004102552100000121
where v is the constant flight speed of the unmanned aerial vehicle. The maximum coverage radius of the unmanned aerial vehicle is
Figure BDA0004102552100000122
where h_U is the flight height of the unmanned aerial vehicle, and θ_max is the maximum communication angle between the unmanned aerial vehicle and the cluster center of the user terminal cluster under the condition that the average link loss does not exceed the maximum link loss PL_max.
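Since the throughput formula survives here only as an image placeholder, the following Python sketch is an assumption rather than the patent's exact expression. It combines the quantities defined above in one commonly used form: the Shannon capacity in nats (natural logarithm), scaled by the resource ratio η_u and by the fraction of the communication time that is left after beam alignment, 1 − τ / t_tr. The function name and the exact placement of each factor are hypothetical.

```python
import math


def effective_rate(eta_u: float, bandwidth_hz: float, tau: float,
                   t_tr: float, sinr_linear: float) -> float:
    """Assumed form: eta_u * B_U * (1 - tau / t_tr) * ln(1 + SINR), in nats/s.

    sinr_linear is the signal-to-noise ratio as a linear ratio, not in dB.
    """
    if t_tr <= 0:
        raise ValueError("t_tr must be positive")
    # Fraction of the contact time that remains for data after beam alignment.
    useful_fraction = max(0.0, 1.0 - tau / t_tr)
    return eta_u * bandwidth_hz * useful_fraction * math.log1p(sinr_linear)
```

Under this assumed form, a beam alignment time equal to or longer than the contact time leaves no time for data, so the rate drops to zero.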
While communicating with the unmanned aerial vehicle, the user terminal also receives service signals from other base stations and unmanned aerial vehicles. Apart from the signal of the base station or unmanned aerial vehicle that serves the user terminal, the signals of all other base stations and unmanned aerial vehicles are treated as noise, so
Figure BDA0004102552100000123
can be obtained as follows:
Figure BDA0004102552100000124
where the signal-to-noise ratio is in dB,
Figure BDA0004102552100000125
represents the signal power of the unmanned aerial vehicle received by the i-th user terminal,
Figure BDA0004102552100000126
represents the power received from base station BS_m, D is the set of all nearby ground base stations that may generate interference, h_A is the free-space propagation path loss of the base station (BS), the channel gain is ignored, and σ² represents the additive white Gaussian noise (AWGN) power.
The signal strength received by the i-th user terminal is the main-lobe transmit power of the unmanned aerial vehicle minus the path loss incurred during signal transmission, so the received power of the i-th user terminal can be obtained as follows:
Figure BDA0004102552100000127
where Pt_UAV represents the transmit power of the unmanned aerial vehicle, G_M represents the main lobe gain of the unmanned aerial vehicle, and
Figure BDA0004102552100000128
represents the path loss between the user terminal and the unmanned aerial vehicle at any given location.
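The received-power and signal-to-noise relations described above can be illustrated with a short Python sketch. It is a minimal sketch under stated assumptions: powers are in dBm, gains and losses in dB, the main lobe gain is added to the transmit power before the path loss is subtracted, and interference from other base stations is summed in the linear domain together with the noise power. All function names are hypothetical.

```python
import math


def dbm_to_mw(power_dbm: float) -> float:
    """Convert a power level from dBm to milliwatts."""
    return 10 ** (power_dbm / 10)


def received_power_dbm(tx_power_dbm: float, main_lobe_gain_db: float,
                       path_loss_db: float) -> float:
    """Main-lobe transmit power minus path loss, all in the dB domain."""
    return tx_power_dbm + main_lobe_gain_db - path_loss_db


def sinr_db(signal_dbm: float, interference_dbm: list, noise_dbm: float) -> float:
    """Signal over (sum of interfering powers + noise), returned in dB."""
    interference_mw = sum(dbm_to_mw(p) for p in interference_dbm)
    ratio = dbm_to_mw(signal_dbm) / (interference_mw + dbm_to_mw(noise_dbm))
    return 10 * math.log10(ratio)
```

With no interfering base station the result reduces to the plain signal-to-noise ratio, and each additional interferer lowers it.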
In addition to the air-to-ground (A2G) propagation path loss, the directional millimeter-wave antenna gain is also an important factor affecting the power. For ease of calculation, we assume that a three-dimensional beam of the unmanned aerial vehicle has the same gain G_M within its beam width and a small constant side lobe gain G_S outside the beam width. The main lobe gain can be obtained as follows, where δ_i is the cone half-angle of the millimeter-wave beam of the i-th user terminal.
Figure BDA0004102552100000129
Therefore, the transmission data amount of the unmanned aerial vehicle
Figure BDA0004102552100000131
can be obtained from the above calculation results as follows:
Figure BDA0004102552100000132
The communication process between the unmanned aerial vehicle and the user terminal is refined through the channel model, the beam scheduling model and the transmission model. The channel model accounts for both the line-of-sight and the non-line-of-sight link loss, which makes the calculated average link loss more accurate. The beam scheduling model refines the communication duration between the unmanned aerial vehicle and the user terminal by removing the beam alignment time from it, which yields the actual communication duration. The transmission model takes the average link loss and the signal loss before the unmanned aerial vehicle communicates with the user terminal as influencing factors, which makes the calculated transmission data amount of the unmanned aerial vehicle more accurate.
Based on any of the above embodiments, as shown in fig. 5, the method for acquiring the total energy loss of the unmanned aerial vehicle includes:
Step 501: based on the energy consumption model, obtain the constant-speed flight energy loss, the acceleration flight energy loss and the deceleration flight energy loss of the unmanned aerial vehicle, and the energy loss of communication between the unmanned aerial vehicle and the user terminals;
In the embodiment of the invention, the unmanned aerial vehicle keeps providing data service during flight, and its energy loss mainly comprises two parts: flight (movement) energy and communication energy. When the rotary-wing unmanned aerial vehicle is to communicate with the user terminals at a given position, it starts from an initial speed of 0, accelerates to its maximum flight speed, and completes deceleration before reaching that position, so the movement energy comprises energy in different flight states. The flight energy loss of the unmanned aerial vehicle is E_F, the hover energy loss is E_H, the communication energy is E_C, the acceleration flight energy loss is E_A, and the deceleration flight energy loss is E_D.
The hover power P_H can be obtained as follows:
Figure BDA0004102552100000133
where Ro is the number of rotors of the unmanned aerial vehicle, G represents gravity (G = mg, where m is the mass of the unmanned aerial vehicle and g is the gravitational acceleration), ρ represents the air density, and β represents the rotor disk radius.
With T_F denoting the flight time of the unmanned aerial vehicle, the energy loss E_F during constant-speed flight can be obtained as follows:
Figure BDA0004102552100000141
where P_H is the hover power and f is the air resistance.
With T_C denoting the communication time between the unmanned aerial vehicle and the user terminal, the communication energy loss E_C can be obtained as follows:
Figure BDA0004102552100000142
where Pt_UAV is the transmit power of the unmanned aerial vehicle.
The acceleration flight energy loss E_A and the deceleration flight energy loss E_D can be obtained as follows:
Figure BDA0004102552100000143
where v represents the maximum speed that the unmanned aerial vehicle can reach, namely its constant flight speed, m is the mass of the unmanned aerial vehicle, and a is the acceleration of the unmanned aerial vehicle.
Step 502: take the sum of the constant-speed flight energy loss, the acceleration flight energy loss and the deceleration flight energy loss of the unmanned aerial vehicle and the energy loss of communication between the unmanned aerial vehicle and the user terminals as the total energy loss of the unmanned aerial vehicle.
The total energy loss E of the drone can be obtained by:
Figure BDA0004102552100000144
where the total energy loss E of the unmanned aerial vehicle is in joules.
The acceleration and deceleration times of the unmanned aerial vehicle are very short and can be ignored relative to a long-distance flight, so the unmanned aerial vehicle is in constant-speed flight most of the time, the air resistance can be considered constant, and communication is maintained throughout the flight. The total energy loss E of the unmanned aerial vehicle over the whole flight time T can therefore be taken as the constant-speed flight energy loss plus the communication energy loss, and can be expressed as:
Figure BDA0004102552100000151
According to the air-to-ground channel characteristics, the channel model, the beam scheduling model, the transmission model and the energy consumption model of the unmanned aerial vehicle are considered jointly in order to improve the energy efficiency ratio of the system and finally obtain the optimal track of the unmanned aerial vehicle in the hot spot area.
Based on any of the above embodiments, as shown in fig. 6, solving the deep reinforcement learning model according to the reward function includes:
Step 601: preset a sequence of passing points in the area to be served as the state value, where the state value is the current position coordinates;
In the embodiment of the invention, the state space S is expressed as the 3D position coordinate sequence of the n passing points of the unmanned aerial vehicle, S = [[x_1, y_1, h_1], [x_2, y_2, h_2], …, [x_n, y_n, h_n]], whose entries are the x-, y- and z-axis coordinates of the n passing points.
Step 602: determine the action according to the ε-greedy strategy;
In the embodiment of the invention, the action space A represents the 3D position change of the n-th passing point of the unmanned aerial vehicle, A = a_n, where a_n denotes the change of the passing point [x_n, y_n, h_n] along the x-axis or y-axis direction, with a granularity of 1 m.
In the embodiment of the invention, if a random number is smaller than ε, the action corresponding to the maximum value of Q(s, a) is selected through the value neural network of the DQN algorithm, and if the random number is larger than ε, an action is selected at random from the action space.
Step 603: execute the action under the current state value to obtain the next state value and a reward value, where the reward value is obtained from the reward function;
Step 604: guide the action through the reward value to obtain the optimal next state value;
Step 605: roll the deep reinforcement learning model forward to obtain the next state value at each moment over a future period of time;
Step 606: according to the optimal next state value at each moment over the future period of time, obtain the optimal coordinate point sequence and the corresponding optimal energy efficiency ratio of the communication between the unmanned aerial vehicle and the corresponding user terminal clusters in the area to be served, and use them as the output of the deep reinforcement learning model.
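The action selection rule of step 602 can be sketched in Python as below. The sketch follows the convention stated in this document, where the greedy action is taken when the random number is smaller than ε; note that many textbook presentations use the opposite convention, with ε as the exploration probability. The function name and the list-of-Q-values interface are hypothetical.

```python
import random


def select_action(q_values, epsilon, rng=random):
    """Pick an action index from a list of Q(s, a) values.

    With probability epsilon the action with the largest Q value is chosen
    (the convention used in this document); otherwise a random action is drawn.
    """
    if rng.random() < epsilon:
        # Exploit: index of the maximum Q value.
        return max(range(len(q_values)), key=q_values.__getitem__)
    # Explore: uniform choice over the action space.
    return rng.randrange(len(q_values))
```

Passing a seeded `random.Random` instance as `rng` makes a training run reproducible.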
In the embodiment of the invention, the action of the unmanned aerial vehicle is guided by the reward value calculated by the reward function: the larger the reward value, the closer the action of the air base station is to the optimization target. To achieve the optimization goal of maximizing the energy efficiency of the unmanned aerial vehicle while fully covering the hot spot areas, the reward function is defined as follows:
Figure BDA0004102552100000161
Figure BDA0004102552100000162
represents the energy efficiency of the unmanned aerial vehicle at the current position, E is a proportionality coefficient, and C is the set of all hot spot areas. V represents the access status of each hot spot area:
Figure BDA0004102552100000163
In an embodiment of the present invention, the reward function is further subject to constraint conditions, which include:
the signal transmission rate between the unmanned aerial vehicle and the user terminal meets the signal transmission rate required by the user terminal;
the average link loss of the user terminal does not exceed the maximum link loss;
the displacement of the unmanned aerial vehicle at each moment does not exceed the maximum coverage diameter of the unmanned aerial vehicle;
the start point and the end point of the flight track of the unmanned aerial vehicle coincide.
In the embodiment of the invention, the optimal energy efficiency ratio and the constraint conditions thereof are as follows:
Figure BDA0004102552100000164
where
Figure BDA0004102552100000165
Figure BDA0004102552100000166
Figure BDA0004102552100000167
Figure BDA0004102552100000171
u(0)=u(f)
Constraint 1: the transmission rate between the unmanned aerial vehicle and the user terminal meets the rate required by the user. R_i(t) is the signal transmission rate between the i-th user terminal and the unmanned aerial vehicle.
Constraint 2: the average path loss of the user terminal does not exceed the maximum path loss; that is, the user terminal is within coverage and its QoS (Quality of Service) requirement is met, while the coverage is increased as much as possible.
Figure BDA0004102552100000172
is the path loss between the i-th user terminal and the unmanned aerial vehicle, and PL^U_max is the maximum path loss.
Constraint 3: the movement of the unmanned aerial vehicle at each moment does not exceed the maximum diameter
Figure BDA0004102552100000173
which ensures continuous and complete coverage of the users in the disaster area. u(t) is the position coordinates of the unmanned aerial vehicle at each point in time.
Constraint 4: the start point and the end point of the path coincide; the unmanned aerial vehicle flies one loop, returns to the origin, and starts a new flight cycle.
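The four constraints can be checked for a candidate track with a short Python sketch. It is only an illustration under stated assumptions: the path loss limit is checked per user terminal, the track is given as a list of 3D passing points whose first and last entries must match, and the step limit is twice a maximum coverage radius supplied by the caller. All names are hypothetical.

```python
import math


def satisfies_constraints(rates, required_rates, path_losses_db,
                          max_path_loss_db, waypoints, max_coverage_radius):
    """Return True when a candidate track meets constraints 1 to 4."""
    # Constraint 1: every user terminal gets at least its required rate.
    rate_ok = all(r >= need for r, need in zip(rates, required_rates))
    # Constraint 2: no path loss exceeds the maximum path loss.
    loss_ok = all(pl <= max_path_loss_db for pl in path_losses_db)
    # Constraint 3: each move is at most the maximum coverage diameter.
    step_ok = all(math.dist(a, b) <= 2 * max_coverage_radius
                  for a, b in zip(waypoints, waypoints[1:]))
    # Constraint 4: the track is a closed loop, u(0) = u(f).
    loop_ok = len(waypoints) > 0 and tuple(waypoints[0]) == tuple(waypoints[-1])
    return rate_ok and loss_ok and step_ok and loop_ok
```

A planner could use such a check to reject or penalize candidate passing point sequences before evaluating their reward.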
In the embodiment of the invention, the flight track of the unmanned aerial vehicle in the area to be served is represented by a preset number of passing points. The first state value and the action value of the passing point sequence are randomly initialized, where the first state value is the current 3D coordinates of the passing point sequence and the action value is the change in the current 3D coordinates of the passing point sequence;
an action value is determined according to the ε-greedy strategy, and a reward value and a second state value are obtained after the passing point sequence executes the action, where the reward value is the reward returned by the environment after it receives the action value, and the second state value is the next 3D coordinates of the passing point sequence;
a sample consisting of the first state value, the action value, the reward value and the second state value is stored in a sample set, the first state value is replaced by the second state value to generate a new sample, and the new sample is stored in the sample set;
samples are drawn at random from the sample set, the first state value and the action value are used as the input of the predicted value neural network to calculate a predicted Q value, and the weight parameters of the predicted value neural network are continuously updated according to the reward value and the predicted Q value;
at intervals, the weight parameters of the target value neural network are updated to the weight parameters of the predicted value neural network;
a loss value is calculated through a loss function from the target Q value and the predicted Q value, and gradient descent updates are performed according to the loss value until the target neural network converges, which yields the optimal track.
In the embodiment of the invention, given conditions such as the number of passing points and the user distribution, the unmanned aerial vehicle continuously explores and learns the environment, keeps moving the passing points to acquire new information, and searches for the passing point coordinates that give the optimal energy efficiency ratio. This realizes a global search, avoids falling into local optima, allows migration between different environments, and enables rapid deployment.
Fig. 7 is a flowchart of a general method for unmanned aerial vehicle trajectory planning provided by an embodiment of the present invention, and as shown in fig. 7, the general method for unmanned aerial vehicle trajectory planning provided by the embodiment of the present invention includes:
Step 701: acquire the hot spot areas with a clustering algorithm according to the historical distribution information of the ground user terminals;
Step 702: establish an optimization model of the communication between the air base station and the user clusters;
Step 703: based on the optimization model, solve for the passing point sequence corresponding to the optimal energy efficiency ratio in the area to be served with the emergency-communication-oriented air base station track planning algorithm that fully covers the hot spot areas, and finally obtain the energy-efficient track of the air base station with full hot spot coverage.
In the embodiment of the invention, taking into account the signal loss during signal transmission and the energy efficiency of the air base station, the hot spot areas are determined with the DBSCAN algorithm, which identifies the key coverage areas for air base station track planning. The energy-efficient track of the air base station is then calculated with the air base station track planning algorithm based on deep reinforcement learning, so that the emergency communication service of the area to be served is ensured.
Fig. 8 is a flowchart of an energy-efficient air base station track planning algorithm provided by an embodiment of the present invention, where, as shown in fig. 8, the energy-efficient air base station track planning algorithm provided by the embodiment of the present invention includes:
Step 801: divide the user hot spot areas with the DBSCAN algorithm according to the historical distribution information of the ground user terminals;
In the embodiment of the invention, the input of the user hot spot area division algorithm comprises: the user position data set J, the clustering radius parameter ε, and the neighborhood user density threshold M; the output comprises: the density-based user clusters and their cluster centers.
The specific steps are as follows:
(1) Mark all user points as unvisited, where each user point is represented by its position coordinates in the user position data set J; randomly select an unvisited user object p from the user points marked unvisited and mark p as visited;
(2) If the ε-neighborhood of p contains at least M user objects, create a new user cluster C and add p to C;
(3) Let N denote the set of user points within the ε-neighborhood of p. For each p' in N: if p' is unvisited, mark p' as visited; if the ε-neighborhood of p' contains at least M user points, add those user points to N; if p' is not yet a member of any user cluster, add p' to C;
(4) Obtain the cluster center coordinates of each user cluster by averaging the x coordinates and the y coordinates of all its points.
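The clustering steps above can be sketched with a small pure-Python DBSCAN. This is a minimal illustration, not the patent's implementation: points are visited in index order rather than at random, neighborhoods are found by brute force, and the function names `dbscan` and `cluster_centers` are hypothetical. The parameters `eps` and `min_pts` play the roles of ε and M.

```python
import math


def dbscan(points, eps, min_pts):
    """Return one cluster label per 2D point; -1 marks noise."""
    labels = [None] * len(points)  # None means unvisited

    def neighbors(i):
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # not a core point: provisionally noise
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(seeds)
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # former noise becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_pts:
                queue.extend(reachable)  # j is a core point: keep expanding
    return labels


def cluster_centers(points, labels):
    """Average the x and y coordinates of each cluster's points."""
    sums = {}
    for (x, y), label in zip(points, labels):
        if label == -1:
            continue
        sx, sy, n = sums.get(label, (0.0, 0.0, 0))
        sums[label] = (sx + x, sy + y, n + 1)
    return {k: (sx / n, sy / n) for k, (sx, sy, n) in sums.items()}
```

The returned centers are the communication targets that the track planning stage then has to cover.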
Step 802: represent the flight track of the unmanned aerial vehicle in the area to be served by a preset number of passing points, and randomly initialize the first state value and the action value of the passing point sequence, where the first state value is the current 3D coordinates of the passing point sequence and the action value is the change in the current 3D coordinates of the passing point sequence;
Step 803: determine an action value according to the ε-greedy strategy, and obtain a reward value and a second state value after the passing point sequence executes the action, where the reward value is the reward returned by the environment after it receives the action value, and the second state value is the next 3D coordinates of the passing point sequence; store a sample consisting of the first state value, the action value, the reward value and the second state value in a sample set, replace the first state value with the second state value to generate a new sample, and store the new sample in the sample set;
Step 804: draw samples at random from the sample set, use the first state value and the action value as the input of the predicted value neural network to calculate a predicted Q value, and continuously update the weight parameters of the predicted value neural network according to the reward value and the predicted Q value;
Step 805: at intervals, update the weight parameters of the target value neural network to the weight parameters of the predicted value neural network; calculate a loss value through a loss function from the target Q value and the predicted Q value, and perform gradient descent updates according to the loss value until the target neural network converges, which yields the optimal track.
According to the embodiment of the invention, the user terminals are divided into a plurality of clusters according to their positions and communication requirements, and the center of each cluster is a coverage compensation communication target of the unmanned aerial vehicle. To meet the emergency service requirements of the area to be served effectively, make the track easy to solve, and give priority coverage to the hot spot areas, the flight process of the unmanned aerial vehicle is represented as a number of passing points. The flight track of the unmanned aerial vehicle is obtained by presetting different numbers of passing points and solving for them. Given conditions such as the number of passing points and the user distribution, the unmanned aerial vehicle continuously explores and learns the environment, keeps moving the passing points to acquire new information, and searches for the passing point coordinates that give the optimal energy efficiency ratio. This realizes a global search, avoids falling into local optima, allows migration between different environments, and enables rapid deployment.
In the embodiment of the invention, the emergency communication air base station track planning algorithm flow based on the deep reinforcement learning algorithm is as follows:
(1) Randomly select an initial state.
(2) Select action A with the ε-greedy policy based on the current state. If a random number is smaller than ε, select the action corresponding to the maximum Q value through the Q-Network; if the random number is larger than ε, select an action at random from the action space.
(3) After the action is selected, the agent executes action A in the environment, and the environment returns the next state (S_) and the reward (R).
(4) Store the quadruple (S, A, R, S_) in the experience pool.
(5) Take the next state (S_) as the current state (S) and repeat step (2).
(6) Update the network in the DQN: sample at random from the experience pool, feed the state (S) and the action (A) of each sample into the value network to calculate the Q value, and update the value network according to the actual reward (R) and the Q value.
(7) After the value network has been updated a certain number of times, copy its parameters to the target network to update the target network parameters.
(8) Repeat steps (5) to (7) until the target Q-Network converges.
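The experience pool used in steps (4) and (6) can be sketched as a bounded buffer with uniform random sampling. This is a minimal Python sketch; the class name `ExperiencePool`, its methods, and the choice to discard the oldest samples when the pool is full are assumptions made for illustration.

```python
import random
from collections import deque


class ExperiencePool:
    """Fixed-capacity store of (S, A, R, S_) quadruples."""

    def __init__(self, capacity):
        # A bounded deque drops the oldest sample once capacity is reached.
        self._buffer = deque(maxlen=capacity)

    def store(self, state, action, reward, next_state):
        self._buffer.append((state, action, reward, next_state))

    def sample(self, batch_size, rng=random):
        """Draw up to batch_size distinct samples uniformly at random."""
        k = min(batch_size, len(self._buffer))
        return rng.sample(list(self._buffer), k)

    def __len__(self):
        return len(self._buffer)
```

Sampling at random from the pool, rather than training on consecutive transitions, is what breaks the correlation between successive samples.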
The inputs of the DQN-based air base station track optimization algorithm comprise: the user set U, the number of passing points N, the number of algorithm iterations Z, and the UAV flight height H, which are used for calculating the total transmission data amount; in the embodiment of the invention, the UAV flight height H is a fixed value. The outputs comprise: the air base station running track and the energy efficiency.
The specific steps are as follows:
(1) Initialize the user distribution state and the air base station passing point sequence s; initialize the Q network and the target network
Figure BDA0004102552100000212
Randomly generate the weight θ; set the target value network weight θ⁻ = θ;
(2) Subject to the following constraints: the signal transmission rate between the unmanned aerial vehicle and the user terminal meets the signal transmission rate required by the user terminal; the average link loss of the user terminal does not exceed the maximum link loss; the displacement of the unmanned aerial vehicle at each moment does not exceed the maximum coverage diameter of the unmanned aerial vehicle; and the start point and the end point of the flight track of the unmanned aerial vehicle coincide, determine the air base station action a_t according to the ε-greedy strategy: if a random number is smaller than ε, select the action corresponding to the maximum value of Q(s, a) through the value neural network of the DQN algorithm; if the random number is greater than or equal to ε, select an action at random from the action space;
(3) Execute action a_t and receive the reward R_t (given by the reward function) and the new state s_{t+1};
(4) Store the sample (s_t, a_t, R_t, s_{t+1}) in the memory pool D, which helps eliminate sample correlation;
(5) Randomly draw a sample (s_j, a_j, R_j, s_{j+1}) from the memory pool;
(6) If step j+1 reaches the preset number of training steps, let y_j = r_j; otherwise, let
Figure BDA0004102552100000211
where r is the reward value of the step and γ is the discount coefficient;
(7) Update θ by applying gradient descent to (y_j − Q(s_j, a_j; θ))²;
(8) Every C steps, synchronize the network parameters to the target value network parameters; after training the specified steps for the preset number of times, obtain the optimal passing point sequence corresponding to the optimal energy efficiency ratio.
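Steps (6) and (7) can be illustrated with the target and loss computation below. The non-terminal target is shown in the patent only as an image, so the sketch assumes the standard DQN form, y_j = r_j + γ · max over a' of the target network's Q(s_{j+1}, a'; θ⁻). Function names are hypothetical, and the neural networks themselves are left out: the Q values are passed in as plain numbers.

```python
def td_target(reward, next_q_values, gamma, terminal):
    """Target y_j for one sample (assumed standard DQN form).

    next_q_values: target-network Q values of every action in state s_{j+1}.
    terminal: True when step j+1 reaches the preset number of training steps.
    """
    if terminal:
        return reward
    return reward + gamma * max(next_q_values)


def squared_td_error(target, q_value):
    """Loss term (y_j - Q(s_j, a_j; theta))^2 minimized by gradient descent."""
    return (target - q_value) ** 2
```

In a full implementation the gradient of this loss with respect to θ would be computed by an automatic differentiation framework, and θ would be copied into θ⁻ every C steps as described in step (8).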
The invention provides an air base station track planning mechanism for emergency communication in the to-be-served area of a communication network. An energy-efficient track planning model of the air base station is first determined, and a two-step mechanism is then provided for track planning over the hot spot areas in emergency communication. First, users are clustered with the DBSCAN clustering algorithm according to their distribution density in the area to be served, which yields the hot spot areas that subsequently serve as the key areas for track planning. Communication takes place between the user cluster heads and the air base station, the coverage radius of the air base station is calculated from the model, and the areas to be covered are kept within the coverage range while the air base station is in flight. Second, the flight track is obtained by solving for the passing points with a deep reinforcement learning method.
The unmanned aerial vehicle track planning device provided by the invention is described below, and the unmanned aerial vehicle track planning device described below and the unmanned aerial vehicle track planning method described above can be referred to correspondingly.
Fig. 9 is a schematic structural diagram of an unmanned aerial vehicle trajectory planning device according to an embodiment of the present invention, and as shown in fig. 9, the unmanned aerial vehicle trajectory planning device according to an embodiment of the present invention includes:
the hotspot detection module 901 is configured to cluster user terminals in a to-be-served area to obtain a plurality of user terminal clusters, where each user terminal cluster corresponds to one hot spot area;
the solving module 902 is configured to obtain, according to the hot spot area distribution, a coordinate point sequence corresponding to an optimal energy consumption ratio of communication between the unmanned aerial vehicle and the user terminal clusters in the to-be-served area, and use the coordinate point sequence as a flight passing point sequence of the unmanned aerial vehicle;
the track planning module 903 is configured to obtain a flight track of the unmanned aerial vehicle in the to-be-served area based on the flight passing point sequence of the unmanned aerial vehicle.
The unmanned aerial vehicle track planning device provided by the invention is used for obtaining a plurality of user terminal clusters by clustering the user terminals in the area to be served, wherein each user terminal cluster corresponds to one hot spot area; according to the distribution of the hot spot areas, a coordinate point sequence corresponding to the optimal energy consumption ratio of the unmanned aerial vehicle and the user terminal cluster communication in the area to be served is obtained, and the coordinate point sequence is used as a flight passing point sequence of the unmanned aerial vehicle; and acquiring the flight track of the unmanned aerial vehicle in the area to be serviced based on the flight passing point sequence of the unmanned aerial vehicle. By calculating the optimal energy consumption ratio of the unmanned aerial vehicle and the corresponding user terminal cluster communication in the to-be-served area, the signal transmission capacity of the whole communication system is improved, meanwhile, the quick coverage of the communication of each hot spot area is realized by dividing the hot spot area, the energy loss of the unmanned aerial vehicle caused by the early exploration of the hot spot area is reduced, the signal transmission efficiency is improved, and finally, the optimal track planning of the unmanned aerial vehicle in the to-be-served area is realized.
Fig. 10 illustrates a physical structure diagram of an electronic device, as shown in fig. 10, which may include: processor 1010, communication interface 1020, memory 1030, and communication bus 1040, wherein processor 1010, communication interface 1020, and memory 1030 communicate with each other via communication bus 1040. The processor 1010 may invoke logic instructions in the memory 1030 to perform a drone trajectory planning method comprising: clustering user terminals in a to-be-served area to obtain a plurality of user terminal clusters, wherein each user terminal cluster corresponds to one hot spot area; according to the distribution of the hot spot areas, a coordinate point sequence corresponding to the optimal energy consumption ratio of the unmanned aerial vehicle and the user terminal cluster communication in the area to be served is obtained, and the coordinate point sequence is used as a flight passing point sequence of the unmanned aerial vehicle; and acquiring the flight track of the unmanned aerial vehicle in the area to be serviced based on the flight passing point sequence of the unmanned aerial vehicle.
Further, the logic instructions in the memory 1030 described above may be implemented in the form of software functional units and, when sold or used as a stand-alone product, stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence, or the part of it that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and comprises several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the methods of the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or any other medium capable of storing program code.
In another aspect, the present invention also provides a non-transitory computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the unmanned aerial vehicle track planning method provided above, the method comprising: clustering user terminals in an area to be served to obtain a plurality of user terminal clusters, wherein each user terminal cluster corresponds to one hot spot area; obtaining, according to the distribution of the hot spot areas, a coordinate point sequence corresponding to the optimal energy consumption ratio of communication between the unmanned aerial vehicle and the user terminal clusters in the area to be served, and using the coordinate point sequence as a flight passing point sequence of the unmanned aerial vehicle; and acquiring the flight track of the unmanned aerial vehicle in the area to be served based on the flight passing point sequence of the unmanned aerial vehicle.
The apparatus embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement the invention without creative effort.
From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus a necessary general-purpose hardware platform, or of course by means of hardware. Based on this understanding, the foregoing technical solutions, in essence or in the part that contributes to the prior art, may be embodied in the form of a software product, which may be stored in a computer-readable storage medium, such as a ROM/RAM, a magnetic disk, or an optical disk, and which includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform the methods of the various embodiments or of some parts of the embodiments.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. An unmanned aerial vehicle track planning method, characterized by comprising the following steps:
clustering user terminals in a to-be-served area to obtain a plurality of user terminal clusters, wherein each user terminal cluster corresponds to one hot spot area;
according to the distribution of the hot spot areas, acquiring a coordinate point sequence corresponding to the optimal energy consumption ratio of communication between the unmanned aerial vehicle and the user terminal clusters in the to-be-served area, and taking the coordinate point sequence as a flight passing point sequence of the unmanned aerial vehicle;
and acquiring the flight track of the unmanned aerial vehicle in the to-be-served area based on the flight passing point sequence of the unmanned aerial vehicle.
2. The unmanned aerial vehicle track planning method according to claim 1, wherein the acquiring, according to the distribution of the hot spot areas, of a coordinate point sequence corresponding to the optimal energy consumption ratio of communication between the unmanned aerial vehicle and the user terminal clusters in the to-be-served area comprises:
establishing a deep reinforcement learning model covering the hot spot areas in the to-be-served area, and taking boundary points of the to-be-served area as state limit values of the deep reinforcement learning model;
constructing a reward function, wherein the reward function comprises a ratio function of the transmission data quantity of the unmanned aerial vehicle to the total energy loss of the unmanned aerial vehicle and a coverage condition of the hot spot areas;
and solving the deep reinforcement learning model according to the reward function to obtain the coordinate point sequence corresponding to the optimal energy consumption ratio of communication between the unmanned aerial vehicle and the corresponding user terminal clusters in the to-be-served area.
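By way of non-limiting illustration, the reward function of claim 2 can be sketched as the ratio of transmitted data to total energy loss, combined with a coverage term. The function name `reward`, the additive combination, the coverage fraction, and the weight `coverage_weight` are assumptions made for this sketch; the claim states only that the reward function comprises the ratio function and a coverage condition, not how the two are combined.

```python
def reward(data_bits: float, energy_joules: float,
           covered_hotspots: int, total_hotspots: int,
           coverage_weight: float = 1.0) -> float:
    """Energy-efficiency ratio plus a weighted coverage fraction.

    The additive form and the weight are illustrative assumptions.
    """
    if energy_joules <= 0:
        raise ValueError("total energy loss must be positive")
    if total_hotspots <= 0:
        raise ValueError("there must be at least one hot spot area")
    efficiency = data_bits / energy_joules          # data per unit of energy
    coverage = covered_hotspots / total_hotspots    # fraction of hot spots covered
    return efficiency + coverage_weight * coverage
```

For example, `reward(100.0, 50.0, 1, 2)` combines an efficiency of 2.0 with a coverage fraction of 0.5.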
3. The unmanned aerial vehicle trajectory planning method according to claim 2, wherein the transmission data quantity of the unmanned aerial vehicle is acquired by:
acquiring the average link loss of communication between the unmanned aerial vehicle and the corresponding user terminal cluster based on a channel model;
acquiring the average time-frequency resource proportion occupied by the user terminals in the hot spot area based on a beam scheduling model;
and inputting the average link loss, the average time-frequency resource proportion occupied by the user terminals in the hot spot area, and the communication duration between the unmanned aerial vehicle and the user terminals into a transmission model to acquire the transmission data quantity of the unmanned aerial vehicle.
4. The unmanned aerial vehicle trajectory planning method of claim 2, wherein the total energy loss of the unmanned aerial vehicle is acquired by:
acquiring, based on an energy consumption model, the constant-speed flight energy loss, the acceleration flight energy loss, and the deceleration flight energy loss of the unmanned aerial vehicle, and the energy loss of communication between the unmanned aerial vehicle and the user terminal;
and taking the sum of the constant-speed flight energy loss, the acceleration flight energy loss, the deceleration flight energy loss, and the energy loss of communication between the unmanned aerial vehicle and the user terminal as the total energy loss of the unmanned aerial vehicle.
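By way of non-limiting illustration, the summation in claim 4 can be sketched as follows. The class name `EnergyLoss`, its field names, and the use of joules are assumptions made for this sketch; how each component is computed by the energy consumption model is outside its scope.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EnergyLoss:
    """The four energy-loss components named in claim 4 (units assumed: joules)."""
    constant_speed: float   # constant-speed flight energy loss
    acceleration: float     # acceleration flight energy loss
    deceleration: float     # deceleration flight energy loss
    communication: float    # UAV-to-user-terminal communication energy loss

    def total(self) -> float:
        """Total energy loss as the plain sum of the four components."""
        return (self.constant_speed + self.acceleration
                + self.deceleration + self.communication)
```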
5. The unmanned aerial vehicle trajectory planning method of claim 2, wherein said solving the deep reinforcement learning model from the reward function comprises:
presetting a passing point sequence in the to-be-served area as a state value, wherein the state value is a current position coordinate;
determining an action according to an ε-greedy strategy;
executing the action under the current state value to obtain a next state value and a reward value, wherein the reward value is obtained according to the reward function;
guiding the action through the reward value to obtain an optimal next state value;
advancing the deep reinforcement learning model forward in time to obtain the optimal next state value at each moment in a future period of time;
and obtaining, according to the optimal next state value at each moment in the future period of time, an optimal coordinate point sequence and the corresponding optimal energy efficiency ratio of communication between the unmanned aerial vehicle and the corresponding user terminal clusters in the to-be-served area, and taking the optimal coordinate point sequence and the corresponding optimal energy efficiency ratio as output values of the deep reinforcement learning model.
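By way of non-limiting illustration, the solving loop of claim 5 can be sketched with a tabular Q-learning stand-in. This is not the deep reinforcement learning model of the claims: a lookup table replaces the neural network so that the sketch stays self-contained. The grid world, the four unit moves in `ACTIONS`, the function names, and all hyperparameters (`alpha`, `gamma`, `epsilon`, `episodes`, `horizon`) are assumptions made for this sketch.

```python
import random

ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]  # hypothetical unit moves on a grid


def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy: explore with probability epsilon, otherwise exploit."""
    if rng.random() < epsilon:
        return rng.randrange(len(ACTIONS))
    values = [q.get((state, a), 0.0) for a in range(len(ACTIONS))]
    return max(range(len(ACTIONS)), key=values.__getitem__)


def step(state, action, size):
    """Apply the move and clamp to the area boundary (the state limit values)."""
    dx, dy = ACTIONS[action]
    x = min(max(state[0] + dx, 0), size - 1)
    y = min(max(state[1] + dy, 0), size - 1)
    return (x, y)


def train(reward_fn, start, size, episodes=200, horizon=20,
          alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning; reward_fn maps the next position to a reward value."""
    rng = random.Random(seed)
    q = {}
    for _ in range(episodes):
        state = start
        for _ in range(horizon):
            action = choose_action(q, state, epsilon, rng)
            nxt = step(state, action, size)
            best_next = max(q.get((nxt, b), 0.0) for b in range(len(ACTIONS)))
            old = q.get((state, action), 0.0)
            # The reward value steers the action-value estimate for this state.
            q[(state, action)] = old + alpha * (reward_fn(nxt) + gamma * best_next - old)
            state = nxt
    return q


def greedy_rollout(q, start, size, horizon):
    """Advance the learned policy forward to read off a passing point sequence."""
    rng = random.Random(0)  # never consulted for exploration because epsilon is 0
    path, state = [start], start
    for _ in range(horizon):
        state = step(state, choose_action(q, state, 0.0, rng), size)
        path.append(state)
    return path
```

Calling `greedy_rollout(train(reward_fn, (0, 0), 4), (0, 0), 4, 6)` returns a list of seven grid coordinates, which plays the role of the coordinate point sequence in this toy setting.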
6. The unmanned aerial vehicle trajectory planning method of claim 2, wherein the reward function further comprises constraints comprising:
the signal transmission rate between the unmanned aerial vehicle and the user terminal meets the signal transmission rate required by the user terminal;
the average link loss of the user terminal does not exceed the maximum link loss;
the displacement of the unmanned aerial vehicle at each moment does not exceed the maximum coverage diameter of the unmanned aerial vehicle;
and the start point and the end point of the flight track of the unmanned aerial vehicle coincide.
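By way of non-limiting illustration, the four constraints of claim 6 can be checked for a candidate passing point sequence as sketched below. The function name and parameter names are assumptions made for this sketch. The last constraint is read here as requiring the track to return to its starting point, which is an interpretation of the translated wording rather than something the claim states explicitly.

```python
import math


def satisfies_constraints(waypoints, rates, required_rates,
                          link_losses, max_link_loss, max_coverage_diameter):
    """Return True if a candidate track meets all four constraints."""
    # 1. Each user's achieved rate meets that user's required rate.
    rate_ok = all(r >= need for r, need in zip(rates, required_rates))
    # 2. No average link loss exceeds the maximum link loss.
    loss_ok = all(loss <= max_link_loss for loss in link_losses)
    # 3. Displacement between consecutive moments stays within the coverage diameter.
    step_ok = all(math.dist(a, b) <= max_coverage_diameter
                  for a, b in zip(waypoints, waypoints[1:]))
    # 4. Assumed reading: the track is closed (start point equals end point).
    closed = len(waypoints) > 0 and waypoints[0] == waypoints[-1]
    return rate_ok and loss_ok and step_ok and closed
```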
7. The unmanned aerial vehicle trajectory planning method of claim 1, wherein the clustering of user terminals in the area to be served comprises:
randomly selecting a user terminal, and calculating the density value of the neighboring users of that user terminal within the clustering radius;
if the density value of the neighboring users is greater than or equal to a preset threshold, grouping that user terminal and the user terminals within the clustering radius into one user terminal cluster;
repeating the above steps until the density value of the neighboring users of every user terminal that is not in a user terminal cluster is smaller than the preset threshold.
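By way of non-limiting illustration, the clustering steps of claim 7 can be sketched as follows. Several details are assumptions made for this sketch: the density value is taken to be the number of still-unclustered neighbors within the clustering radius, Euclidean distance is used, and the random draw is made only among users that already meet the threshold (so that no draw is wasted), which yields the same stopping condition as the claim. The function and parameter names are hypothetical.

```python
import math
import random


def cluster_users(points, radius, min_neighbors, seed=0):
    """Greedy density-based grouping of user terminals into hot spot clusters.

    Returns a list of clusters, each a sorted list of indices into `points`.
    Users left unclustered are those whose neighbor count stays below the threshold.
    """
    rng = random.Random(seed)
    unassigned = set(range(len(points)))
    clusters = []
    while True:
        # Neighbor lists for every unclustered user that meets the density threshold.
        dense = {}
        for i in unassigned:
            nbrs = [j for j in unassigned
                    if j != i and math.dist(points[i], points[j]) <= radius]
            if len(nbrs) >= min_neighbors:
                dense[i] = nbrs
        if not dense:  # every remaining user is below the threshold: stop
            return clusters
        picked = rng.choice(sorted(dense))      # random selection among dense users
        members = [picked] + dense[picked]      # the user plus its in-radius neighbors
        clusters.append(sorted(members))
        unassigned -= set(members)
```

With two tight groups of three users and one isolated user, a radius of 2 and a threshold of 2 neighbors, the sketch yields two clusters and leaves the isolated user unclustered.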
8. An unmanned aerial vehicle trajectory planning device, characterized by comprising:
a hot spot detection module, used for clustering the user terminals in a to-be-served area to obtain a plurality of user terminal clusters, wherein each user terminal cluster corresponds to one hot spot area;
a solving module, used for acquiring, according to the distribution of the hot spot areas, a coordinate point sequence corresponding to the optimal energy consumption ratio of communication between the unmanned aerial vehicle and the user terminal clusters in the to-be-served area, and taking the coordinate point sequence as a flight passing point sequence of the unmanned aerial vehicle;
and a track planning module, used for acquiring the flight track of the unmanned aerial vehicle in the to-be-served area based on the flight passing point sequence of the unmanned aerial vehicle.
9. An electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, implements the unmanned aerial vehicle trajectory planning method of any one of claims 1 to 7.
10. A non-transitory computer readable storage medium having stored thereon a computer program, wherein the computer program when executed by a processor implements the unmanned aerial vehicle trajectory planning method of any one of claims 1 to 7.
CN202310181830.8A 2023-02-21 2023-02-21 Unmanned aerial vehicle track planning method and device, electronic equipment and storage medium Pending CN116225058A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310181830.8A CN116225058A (en) 2023-02-21 2023-02-21 Unmanned aerial vehicle track planning method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310181830.8A CN116225058A (en) 2023-02-21 2023-02-21 Unmanned aerial vehicle track planning method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN116225058A true CN116225058A (en) 2023-06-06

Family

ID=86580205

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310181830.8A Pending CN116225058A (en) 2023-02-21 2023-02-21 Unmanned aerial vehicle track planning method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN116225058A (en)

Similar Documents

Publication Publication Date Title
CN111786713B (en) Unmanned aerial vehicle network hovering position optimization method based on multi-agent deep reinforcement learning
US20210165405A1 (en) Multiple unmanned aerial vehicles navigation optimization method and multiple unmanned aerial vehicles system using the same
Liu et al. Reinforcement learning in multiple-UAV networks: Deployment and movement design
CN110531617B (en) Multi-unmanned aerial vehicle 3D hovering position joint optimization method and device and unmanned aerial vehicle base station
CN109885088B (en) Unmanned aerial vehicle flight trajectory optimization method based on machine learning in edge computing network
CN111045443A (en) Movement control method, device, equipment and storage medium
CN113873434A (en) Communication network hotspot area capacity enhancement oriented multi-aerial base station deployment method
Savkin et al. Range-based reactive deployment of autonomous drones for optimal coverage in disaster areas
CN111381499B (en) Internet-connected aircraft self-adaptive control method based on three-dimensional space radio frequency map learning
CN110312265B (en) Power distribution method and system for unmanned aerial vehicle formation communication coverage
Luo et al. A two-step environment-learning-based method for optimal UAV deployment
CN114942653B (en) Method and device for determining unmanned cluster flight strategy and electronic equipment
CN115499921A (en) Three-dimensional trajectory design and resource scheduling optimization method for complex unmanned aerial vehicle network
CN114339842B (en) Method and device for designing dynamic trajectory of unmanned aerial vehicle cluster in time-varying scene based on deep reinforcement learning
CN115407794A (en) Sea area safety communication unmanned aerial vehicle track real-time planning method based on reinforcement learning
Parvaresh et al. A continuous actor–critic deep Q-learning-enabled deployment of UAV base stations: Toward 6G small cells in the skies of smart cities
Cui et al. Model-free based automated trajectory optimization for UAVs toward data transmission
CN116009590B (en) Unmanned aerial vehicle network distributed track planning method, system, equipment and medium
CN116225058A (en) Unmanned aerial vehicle track planning method and device, electronic equipment and storage medium
CN113727278A (en) Path planning method, access network equipment and flight control equipment
CN114520991B (en) Unmanned aerial vehicle cluster-based edge network self-adaptive deployment method
CN114374951B (en) Dynamic pre-deployment method for multiple unmanned aerial vehicles
CN116578354A (en) Method and device for unloading edge calculation tasks of electric power inspection unmanned aerial vehicle
CN112243239B (en) Unmanned aerial vehicle deployment method based on overpass and related device
US20230351205A1 (en) Scheduling for federated learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination