CN114257298B - Intelligent reflecting surface phase shift and unmanned aerial vehicle path planning method - Google Patents

Intelligent reflecting surface phase shift and unmanned aerial vehicle path planning method

Info

Publication number
CN114257298B
Authority
CN
China
Prior art keywords
aerial vehicle
unmanned aerial
intelligent
network
terminal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210050792.8A
Other languages
Chinese (zh)
Other versions
CN114257298A (en)
Inventor
梅海波
蔡勇
车畅
张子歌
庞宇
梁楚雄
孙小博
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China
Priority to CN202210050792.8A
Publication of CN114257298A
Application granted
Publication of CN114257298B
Legal status: Active (current)
Anticipated expiration

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04B: TRANSMISSION
    • H04B7/00: Radio transmission systems, i.e. using radiation field
    • H04B7/14: Relay systems
    • H04B7/15: Active relay systems
    • H04B7/185: Space-based or airborne stations; Stations for satellite systems
    • H04B7/18502: Airborne stations
    • H04B7/18506: Communications with or from aircraft, i.e. aeronautical mobile service
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04W: WIRELESS COMMUNICATION NETWORKS
    • H04W24/00: Supervisory, monitoring or testing arrangements
    • H04W24/02: Arrangements for optimising operational condition
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04W: WIRELESS COMMUNICATION NETWORKS
    • H04W76/00: Connection management
    • H04W76/10: Connection setup
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00: Reducing energy consumption in communication networks
    • Y02D30/70: Reducing energy consumption in communication networks in wireless communication networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • Astronomy & Astrophysics (AREA)
  • Aviation & Aerospace Engineering (AREA)
  • General Physics & Mathematics (AREA)
  • Control Of Position, Course, Altitude, Or Attitude Of Moving Bodies (AREA)

Abstract

The invention discloses an intelligent reflecting surface phase shift and unmanned aerial vehicle path planning method, and relates to the fields of unmanned aerial vehicle air-ground communication, intelligent-reflecting-surface-assisted communication, and deep learning.

Description

Intelligent reflecting surface phase shift and unmanned aerial vehicle path planning method
Technical Field
The invention relates to the fields of unmanned aerial vehicle air-ground communication, intelligent-reflecting-surface-assisted communication, and deep learning, and in particular to an intelligent reflecting surface phase shift and unmanned aerial vehicle path planning method.
Background
Wireless communication supported by unmanned aerial vehicles has been a research hotspot in recent years. Owing to its high mobility, an unmanned aerial vehicle can operate flexibly in three-dimensional space and support air-ground wireless communication by adjusting its horizontal and vertical positions simultaneously. This mobility makes unmanned-aerial-vehicle-assisted wireless networks particularly suitable for on-demand deployment in emergencies, where the unmanned aerial vehicle mainly serves as a temporary aerial base station or access point that transmits data to, or receives data from, ground terminals. In practice, however, the communication link between the unmanned aerial vehicle and a terminal is likely to be blocked by obstacles in the area, which attenuates the signal and reduces the data transmission rate. The intelligent reflecting surface has been proposed to address this problem. It is a metasurface equipped with integrated electronic circuits that can be programmed to alter an incoming electromagnetic field in a customizable manner, with each surface element realized by a reflective array. A blocked link between the unmanned aerial vehicle and a ground terminal can be redirected and re-established via an intelligent reflecting surface on a building, so that the energy and spectrum of the cellular system are used efficiently and the unmanned aerial vehicle is helped to overcome signal blockage in air-ground wireless communication.
Despite these advantages, three technical problems remain to be resolved.
First, with the assistance of an intelligent reflecting surface, both segments of the communication link between a ground terminal and the unmanned aerial vehicle are affected by the movement of the unmanned aerial vehicle, which makes the three-dimensional trajectory design difficult.
Second, the phase shift of the intelligent reflecting surface must be calculated in real time according to the current condition of the communication link, and traditional algorithms are computationally expensive and have poor real-time performance, which degrades the quality of the communication link.
Finally, the energy of the unmanned aerial vehicle is limited, so its total propulsion energy should be kept to a minimum while the system maintains high energy efficiency.
In general, these problems affect one another in an unmanned aerial vehicle air-ground communication system, and solving the resulting joint optimization problem is particularly important for improving the energy efficiency of unmanned aerial vehicle air-ground communication.
Disclosure of Invention
The invention aims to solve the above problems by providing an intelligent reflecting surface phase shift and unmanned aerial vehicle path planning method that is based on deep reinforcement learning and that, with the assistance of an intelligent reflecting surface, minimizes the energy consumption of the unmanned aerial vehicle while maximizing the gain of the wireless communication network.
The invention achieves this purpose through the following technical scheme:
An intelligent reflecting surface phase shift and unmanned aerial vehicle path planning method, characterized by comprising the following steps:
S1. Establish an unmanned aerial vehicle-terminal communication model assisted by the intelligent reflecting surface;
S2. Collect information on the unmanned aerial vehicle, the intelligent reflecting surface, and the ground terminals in the current area, and import it into the communication model;
S3. Establish a deep reinforcement learning network, and initialize the initial and target network parameters;
S4. Initialize the state of the intelligent-reflecting-surface-assisted unmanned aerial vehicle-terminal communication scenario in the deep reinforcement learning network;
S5. Execute an action according to the state and the reward;
S6. Determine whether the unmanned aerial vehicle is out of bounds or over the speed limit, and if so, apply a penalty and cancel the action;
S7. Apply the intelligent reflecting surface phase shift parameters and execute the action;
S8. Save the action, the reward, the current state, and the next state as a sample;
S9. If the task is not completed, repeat steps S5 to S8 a fixed number of times or until the task is completed;
S10. Randomly select a mini-batch from the samples obtained in S8 to calculate the target values;
S11. Update the network parameters by minimizing a loss function and by gradient descent, respectively;
S12. Repeat steps S4 to S11 a fixed number of times to obtain a stabilized intelligent reflecting surface phase shift and unmanned aerial vehicle path plan.
The beneficial effects of the invention are as follows: the design is based on a deep reinforcement learning framework, which, compared with traditional convex optimization algorithms, achieves a trade-off between computational complexity and accuracy. Combined with intelligent reflecting surface technology, it jointly optimizes the three-dimensional trajectory of the unmanned aerial vehicle and the phase shift of the intelligent reflecting surface, so as to maximize the gain of the wireless communication network while minimizing the energy consumption of the unmanned aerial vehicle, thereby improving the efficiency of communication between the unmanned aerial vehicle and the ground terminals.
Drawings
FIG. 1 is a flow chart of a method of intelligent reflective surface phase shifting and unmanned aerial vehicle path planning of the present invention;
FIG. 2 is a scene model diagram of the intelligent reflecting surface phase shift and unmanned aerial vehicle path planning method of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention. It is to be understood that the embodiments described are only a few embodiments of the present invention, and not all embodiments. The components of embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the present invention, presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.
In the description of the present invention, it is to be understood that the terms "upper", "lower", "inner", "outer", "left", "right", and the like indicate orientations or positional relationships based on orientations or positional relationships shown in the drawings, or orientations or positional relationships conventionally placed when the product of the present invention is used, or orientations or positional relationships conventionally understood by those skilled in the art, which are merely for convenience of description and simplification of description, and do not indicate or imply that the device or element referred to must have a particular orientation, be constructed and operated in a particular orientation, and therefore, should not be construed as limiting the present invention.
Furthermore, the terms "first," "second," and the like are used merely to distinguish one description from another, and are not to be construed as indicating or implying relative importance.
In the description of the present invention, it is also to be noted that, unless otherwise explicitly stated or limited, the terms "disposed" and "connected" are to be interpreted broadly, and for example, "connected" may be a fixed connection, a detachable connection, or an integral connection; can be mechanically or electrically connected; the connection may be direct or indirect through an intermediate medium, and the connection may be internal to the two elements. The specific meanings of the above terms in the present invention can be understood by those skilled in the art according to specific situations.
The following detailed description of embodiments of the invention refers to the accompanying drawings.
The invention provides an intelligent reflecting surface phase shift and unmanned aerial vehicle path planning method that effectively balances computational complexity against computational accuracy and maximizes the gain of the wireless communication network while minimizing the energy consumption of the unmanned aerial vehicle. The method consists of three parts: system model establishment, model transformation, and solution. As shown in FIG. 1, the method specifically comprises the following steps:
S1. Establish an unmanned aerial vehicle-terminal communication model assisted by the intelligent reflecting surface, specifically:
The three-dimensional area in which the unmanned aerial vehicle communicates with the terminals with the assistance of the intelligent reflecting surface is evenly divided into a plurality of cells. The horizontal coordinate of the center of the i-th cell is [Figure GDA0003743550210000041], where [Figure GDA0003743550210000042] is the set of horizontal center coordinates of all cells, and $x_s$ and $y_s$ are the horizontal distances between two adjacent cells in the x and y directions.
[Figure GDA0003743550210000043] is the horizontal position of the unmanned aerial vehicle in the n-th time slot, where [Figure GDA0003743550210000044] and N denotes the total number of time slots. As shown in FIG. 2, [Figure GDA0003743550210000045] and [Figure GDA0003743550210000046] are the preset horizontal centers for take-off and landing of the unmanned aerial vehicle. [Figure GDA0003743550210000047] is the vertical position of the unmanned aerial vehicle in the n-th time slot.
Hence the spatial coordinates [Figure GDA0003743550210000048] and the time slot durations [Figure GDA0003743550210000049] characterize the path plan of the unmanned aerial vehicle.
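The path characterization above can be illustrated with a minimal sketch. The patent's own cell-center and position formulas are only available as images, so the indexing convention in `cell_center` (cells counted from 0 at the origin) and the names `Trajectory` and `horizontal_speeds` are assumptions made for illustration, not the patent's definitions.

```python
import math
from dataclasses import dataclass


@dataclass
class Trajectory:
    """Hypothetical container for a path plan: one entry per time slot."""
    horizontal: list  # (x, y) position in each slot, in meters
    vertical: list    # altitude in each slot, in meters
    durations: list   # duration of each slot, in seconds


def cell_center(i, j, x_s, y_s):
    """Horizontal center of grid cell (i, j).

    Assumption: cells are indexed from 0 starting at the origin and are
    x_s meters wide in x and y_s meters wide in y.
    """
    return ((i + 0.5) * x_s, (j + 0.5) * y_s)


def horizontal_speeds(traj):
    """Mean horizontal speed per slot: distance to the next position / slot duration."""
    speeds = []
    for n in range(len(traj.horizontal) - 1):
        (x0, y0), (x1, y1) = traj.horizontal[n], traj.horizontal[n + 1]
        speeds.append(math.hypot(x1 - x0, y1 - y0) / traj.durations[n])
    return speeds


if __name__ == "__main__":
    traj = Trajectory(
        horizontal=[(0.0, 0.0), (3.0, 4.0), (3.0, 4.0)],
        vertical=[10.0, 10.0, 10.0],
        durations=[1.0, 2.0, 1.0],
    )
    print(cell_center(0, 0, 10.0, 20.0))
    print(horizontal_speeds(traj))
```

Keeping positions and slot durations together makes the per-slot speed, which the energy model needs, a derived quantity rather than a separate input.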
Establish the energy consumption model. From the horizontal flight speed of the unmanned aerial vehicle [Figure GDA0003743550210000051], the constant blade power $P_0$, the hovering induced power $P_1$, the constant descending or ascending power $P_2$, the rotor blade tip speed $U_{tip}$, the mean rotor induced velocity in hover $v_0$, the fuselage drag ratio $d_0$, the rotor solidity $S$, the air density $\rho$, and the rotor disc area $G$, calculate the propulsion energy of the rotary-wing unmanned aerial vehicle:
Figure GDA0003743550210000052
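Because the propulsion energy formula survives only as an image, the sketch below substitutes the rotary-wing power model that is widely used in the UAV communication literature and that takes the same inputs ($P_0$, $P_1$, $U_{tip}$, $v_0$, $d_0$, $S$, $\rho$, $G$). Whether the patent's formula matches it term for term is an assumption, as is the way $P_2$ is added in slots with vertical movement. The numeric values are illustrative and not taken from the patent.

```python
import math


def propulsion_power(v, P0, P1, U_tip, v0, d0, S, rho, G):
    """Power (W) at horizontal speed v (m/s) under the assumed rotary-wing model."""
    blade = P0 * (1.0 + 3.0 * v ** 2 / U_tip ** 2)
    induced = P1 * math.sqrt(
        math.sqrt(1.0 + v ** 4 / (4.0 * v0 ** 4)) - v ** 2 / (2.0 * v0 ** 2)
    )
    parasite = 0.5 * d0 * rho * S * G * v ** 3
    return blade + induced + parasite


def propulsion_energy(speeds, vertical_moves, durations, P2, **rotor):
    """Energy (J) over all slots.

    Assumption: the constant power P2 is added in every slot in which the
    vehicle climbs or descends (vertical_moves[n] is True).
    """
    total = 0.0
    for v, moves_vertically, t in zip(speeds, vertical_moves, durations):
        power = propulsion_power(v, **rotor)
        if moves_vertically:
            power += P2
        total += t * power
    return total


# Illustrative rotor parameters (assumed values, not from the patent).
ROTOR = dict(P0=79.86, P1=88.63, U_tip=120.0, v0=4.03,
             d0=0.6, S=0.05, rho=1.225, G=0.503)

if __name__ == "__main__":
    for v in (0.0, 10.0, 20.0):
        print(v, propulsion_power(v, **ROTOR))
    print(propulsion_energy([0.0, 0.0], [False, True], [1.0, 2.0], P2=10.0, **ROTOR))
```

At zero speed the model reduces to $P_0 + P_1$, the hovering power, which gives a quick sanity check on any parameter set.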
Establish the communication model between the intelligent reflecting surface and the unmanned aerial vehicle. Given the number of reflecting elements $M_c \times M_r$ of each uniform planar array on the intelligent reflecting surface, with a column spacing of $d_c$ meters and a row spacing of $d_r$ meters, calculate the channel gain between the unmanned aerial vehicle and the intelligent reflecting surface in the n-th time slot:
Figure GDA0003743550210000053
In the formula, $\xi$ is the channel loss at a distance of 1 meter; the distance between the intelligent reflecting surface and the unmanned aerial vehicle in the n-th time slot is expressed as [Figure GDA0003743550210000054]; $z_R$ and $w_R$ are the positions of the first element of the intelligent reflecting surface in the vertical and horizontal directions, respectively; $\lambda$ is the carrier wavelength; [Figure GDA0003743550210000055] and [Figure GDA0003743550210000056] are, respectively, the cosine and sine of the horizontal angle of arrival of the signal at the intelligent reflecting surface; and [Figure GDA0003743550210000057] is the sine of the vertical angle of arrival of the signal at the intelligent reflecting surface.
Establish the communication model between the intelligent reflecting surface and the ground terminals. Calculate the channel gain between the k-th terminal and the intelligent reflecting surface:
Figure GDA0003743550210000058
Here, for the link between the k-th terminal and the intelligent reflecting surface, [Figure GDA0003743550210000059] and [Figure GDA00037435502100000510] are, respectively, the cosine and sine of the horizontal signal emission angle of the k-th terminal, and [Figure GDA0003743550210000061] is the sine of the vertical signal emission angle of the k-th terminal. Further, the channel gain of the k-th terminal over the whole link may be expressed as
Figure GDA0003743550210000062
where [Figure GDA0003743550210000063] is the reflection phase coefficient matrix of the intelligent reflecting surface and
Figure GDA0003743550210000064
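The two-hop channel described above can be sketched as follows. The patent's gain formulas are images, so this example assumes a generic far-field line-of-sight response for a uniform planar array and a cascaded gain of the form $h_{RK}^{H}\,\Theta\,h_{UR}$ with a diagonal phase matrix $\Theta$. The function names and the direction-cosine arguments `dir_c` and `dir_r` are hypothetical.

```python
import cmath
import math


def upa_channel(M_c, M_r, d_c, d_r, wavelength, dir_c, dir_r, xi, distance):
    """Line-of-sight channel vector of an M_c x M_r uniform planar array.

    Assumptions: far-field propagation, amplitude sqrt(xi) / distance, and
    dir_c / dir_r are the direction cosines along the column and row axes.
    """
    amplitude = math.sqrt(xi) / distance
    k = 2.0 * math.pi / wavelength
    return [
        amplitude * cmath.exp(-1j * k * (c * d_c * dir_c + r * d_r * dir_r))
        for r in range(M_r)
        for c in range(M_c)
    ]


def cascaded_gain(h_ur, h_rk, phases):
    """Complex gain of the reflected link: sum_m conj(h_rk[m]) * e^{j*phase_m} * h_ur[m]."""
    return sum(
        hk.conjugate() * cmath.exp(1j * theta) * hu
        for hu, hk, theta in zip(h_ur, h_rk, phases)
    )


def aligned_phases(h_ur, h_rk):
    """Phase shifts that make every reflected term add in phase."""
    return [cmath.phase(hk) - cmath.phase(hu) for hu, hk in zip(h_ur, h_rk)]


if __name__ == "__main__":
    h_ur = upa_channel(4, 2, 0.05, 0.05, 0.1, 0.3, 0.5, 1e-3, 50.0)
    h_rk = upa_channel(4, 2, 0.05, 0.05, 0.1, -0.2, 0.7, 1e-3, 20.0)
    unaligned = cascaded_gain(h_ur, h_rk, [0.0] * len(h_ur))
    aligned = cascaded_gain(h_ur, h_rk, aligned_phases(h_ur, h_rk))
    print(abs(unaligned), abs(aligned))
```

With aligned phases the magnitude of the cascaded gain equals the sum of the per-element magnitude products, which is the largest value any phase setting can reach under this model. That is why the phase shift is worth optimizing together with the trajectory.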
Establish the communication link model between the unmanned aerial vehicle and the ground terminals assisted by the intelligent reflecting surface. Calculate the blocking probability of the link between the unmanned aerial vehicle and the k-th ground terminal in the n-th time slot:
Figure GDA0003743550210000065
where [Figure GDA0003743550210000066], and a and b are variables that change with the communication environment. Further, the average channel gain achievable by the k-th terminal is expressed as
Figure GDA0003743550210000067
and the channel rate is
Figure GDA0003743550210000068
In the formula, P is the fixed transmit power of the unmanned aerial vehicle, B is the bandwidth, $\sigma$ is the noise variable, and $c_{k,n} \in \{0, 1\}$ indicates whether the k-th terminal is scheduled (in any one time slot the intelligent reflecting surface serves at most one terminal).
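A short sketch of the link model follows. The blocking-probability formula is an image in the source, so the sigmoid line-of-sight model below, a common choice with two environment parameters a and b, is an assumption. Treating the noise term as a noise power in a Shannon-type rate expression is likewise an assumption, and all numeric values are illustrative.

```python
import math


def blocking_probability(height, horizontal_distance, a, b):
    """1 - P_LoS under an assumed sigmoid line-of-sight model (elevation in degrees)."""
    elevation = math.degrees(math.atan2(height, horizontal_distance))
    p_los = 1.0 / (1.0 + a * math.exp(-b * (elevation - a)))
    return 1.0 - p_los


def channel_rate(scheduled, gain, P, B, noise_power):
    """Rate in bit/s; scheduled is c_{k,n} in {0, 1}, gain is a power gain."""
    return scheduled * B * math.log2(1.0 + P * gain / noise_power)


if __name__ == "__main__":
    # Illustrative environment parameters (assumed, not from the patent).
    a, b = 9.61, 0.16
    print(blocking_probability(10.0, 100.0, a, b))   # low elevation
    print(blocking_probability(100.0, 10.0, a, b))   # high elevation
    print(channel_rate(1, 3.0, 1.0, 1e6, 1.0))
    print(channel_rate(0, 3.0, 1.0, 1e6, 1.0))
```

Under this model the blocking probability falls as the elevation angle grows, so altitude trades link quality against the energy spent climbing.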
S2. Collect information on the unmanned aerial vehicle, the intelligent reflecting surface, and the ground terminals in the current area, and import it into the communication model:
Collect the unmanned aerial vehicle information L, H, T, the intelligent reflecting surface information $\Theta$, and the ground terminal information C in the current area, and import them into the communication model, where [Figure GDA0003743550210000069] is the set of horizontal positions of the unmanned aerial vehicle, [Figure GDA00037435502100000610] is the set of vertical positions of the unmanned aerial vehicle, [Figure GDA00037435502100000611] is the duration of each flight time slot of the unmanned aerial vehicle, [Figure GDA00037435502100000612] is the reflection phase coefficient matrix of the intelligent reflecting surface, and [Figure GDA00037435502100000613] is the ground terminal scheduling scheme;
S3. Establish a deep reinforcement learning network, and initialize the initial and target network parameters:
Establish the deep reinforcement learning network, and initialize the experience replay buffer F and the number of time slots N. Initialize the parameters $\theta^{\pi}$ and $\theta^{\pi'}$ of the initial and target policy networks $\pi$ so that $\theta^{\pi'} = \theta^{\pi}$. Initialize the parameters $\theta^{Q}$ and $\theta^{Q'}$ of the initial and target deep reinforcement learning Q networks so that $\theta^{Q'} = \theta^{Q}$;
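The initialization in S3 can be sketched as follows. The source does not describe the network architecture, so the plain weight lists below stand in for the policy and Q networks; `init_agent` and the dictionary keys are hypothetical names.

```python
import copy
import random
from collections import deque


def init_agent(state_dim, action_dim, buffer_size, seed=0):
    """Create online and target parameters plus an empty replay buffer F.

    Assumption: each network is represented by a flat or nested list of
    weights; a real implementation would hold neural-network parameters.
    """
    rng = random.Random(seed)
    theta_pi = [[rng.uniform(-0.1, 0.1) for _ in range(state_dim)]
                for _ in range(action_dim)]
    theta_q = [rng.uniform(-0.1, 0.1) for _ in range(state_dim + action_dim)]
    return {
        "theta_pi": theta_pi,
        # Target networks start as independent copies of the online ones.
        "theta_pi_target": copy.deepcopy(theta_pi),
        "theta_q": theta_q,
        "theta_q_target": copy.deepcopy(theta_q),
        # Bounded buffer: the oldest samples are dropped once it is full.
        "buffer": deque(maxlen=buffer_size),
    }


if __name__ == "__main__":
    agent = init_agent(state_dim=3, action_dim=2, buffer_size=1000)
    print(agent["theta_pi"] == agent["theta_pi_target"])
    print(agent["theta_pi"] is agent["theta_pi_target"])
```

Copying rather than aliasing matters here: the target parameters must stay fixed while the online parameters are trained, and only move through the later soft update.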
S4. Initialize the state of the intelligent-reflecting-surface-assisted unmanned aerial vehicle-terminal communication scenario in the deep reinforcement learning network to s(1);
S5. Execute an action according to the state and the reward:
Randomly select the action [Figure GDA0003743550210000071] and execute it, where [Figure GDA0003743550210000072] is a random process and $\pi(s(n) \mid \theta^{\pi})$ denotes the policy selected in state $s(n)$ with network parameters $\theta^{\pi}$;
S6. Determine whether the unmanned aerial vehicle is out of bounds or over the speed limit; if so, apply a penalty and cancel the execution of action a(n);
S7. Apply the intelligent reflecting surface phase shift parameters and execute action a(n) to obtain the state s(n+1) and the reward r(s(n), a(n));
S8. Save the action, the reward, the current state, and the next state as a sample, that is, store the sample (s(n), a(n), r(·), s(n+1)) in the experience replay buffer F;
S9. If task $D_k$ is not completed, repeat steps S5 to S8 until the task is completed or N repetitions have been performed;
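Steps S5 to S8 can be sketched as a single environment step. The sketch assumes that an action is a three-dimensional displacement, that exploration noise is Gaussian (the patent's random term is only available as an image), and that a cancelled action leaves the position unchanged. The reward from rate and energy is omitted, and all names are hypothetical.

```python
import random
from collections import deque


def noisy_action(policy_action, noise_std, rng):
    """S5: policy output plus exploration noise (Gaussian noise is an assumption)."""
    return [a + rng.gauss(0.0, noise_std) for a in policy_action]


def apply_action(position, action, slot_duration, bounds, v_max, penalty):
    """S6/S7: move by `action`, or cancel it and return a penalty if infeasible.

    Returns (next_position, reward_adjustment, cancelled).
    """
    new_position = [p + a for p, a in zip(position, action)]
    speed = sum(a * a for a in action) ** 0.5 / slot_duration
    out_of_bounds = any(
        not (low <= p <= high) for p, (low, high) in zip(new_position, bounds)
    )
    if out_of_bounds or speed > v_max:
        return list(position), -penalty, True
    return new_position, 0.0, False


def store(buffer, state, action, reward, next_state):
    """S8: save the sample (s(n), a(n), r, s(n+1)) in the replay buffer."""
    buffer.append((state, action, reward, next_state))


if __name__ == "__main__":
    rng = random.Random(0)
    bounds = [(0.0, 100.0), (0.0, 100.0), (5.0, 50.0)]
    buffer = deque(maxlen=1000)
    state = [0.0, 0.0, 10.0]
    action = noisy_action([5.0, 0.0, 0.0], noise_std=0.1, rng=rng)
    next_state, reward, cancelled = apply_action(state, action, 1.0, bounds, 10.0, 1.0)
    store(buffer, state, action, reward, next_state)
    print(next_state, reward, cancelled, len(buffer))
```

Cancelling an infeasible action while still storing the penalized sample lets the agent learn the boundary and speed constraints from experience instead of having them hard-coded into the policy.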
S10. Randomly select a mini-batch from the samples obtained in S8 to calculate the target values:
Select a random mini-batch of samples (s(j), a(j), r(j), s(j+1)) from the M samples in the experience replay buffer F, where s(j) and s(j+1) are the states at times j and j+1, respectively, a(j) is the action at time j, and r(j) is the reward at time j. Calculate the target value $y(j) = r(j) + \gamma Q'(s(j+1), \pi'(s(j+1) \mid \theta^{\pi'}) \mid \theta^{Q'})$, where $\gamma \in (0, 1]$ is the discount factor and $Q'(s(j+1), \pi'(s(j+1) \mid \theta^{\pi'}) \mid \theta^{Q'})$ is the Q value of the target network Q'(·) obtained from the state, the policy, and the network parameters at time j+1.
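The target value in S10 can be computed as in the sketch below. The target policy and target Q network are passed in as plain callables, an illustrative simplification, and the toy functions in the usage example are not from the patent.

```python
import random


def sample_batch(buffer, batch_size, rng):
    """Draw a random mini-batch (without replacement) from the replay buffer."""
    return rng.sample(list(buffer), min(batch_size, len(buffer)))


def target_values(batch, target_policy, target_q, gamma):
    """y(j) = r(j) + gamma * Q'(s(j+1), pi'(s(j+1))) for every sample in the batch."""
    return [
        reward + gamma * target_q(next_state, target_policy(next_state))
        for (_state, _action, reward, next_state) in batch
    ]


if __name__ == "__main__":
    # Toy scalar stand-ins for the target networks (assumed for illustration).
    def target_policy(s):
        return 2.0 * s

    def target_q(s, a):
        return s + a

    buffer = [(0.0, 0.0, 1.0, 1.0), (1.0, 2.0, 0.5, 2.0)]
    batch = sample_batch(buffer, batch_size=2, rng=random.Random(0))
    print(target_values(batch, target_policy, target_q, gamma=0.5))
```

The target uses only the target networks' parameters, so it does not shift while the online Q network is being fitted to it within an update step.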
S11. Update the network parameters by minimizing a loss function and by gradient descent, respectively:
Update the weights $\theta^{Q}$ of the Q(·) network by minimizing the loss function
Figure GDA0003743550210000073
where $Q(s(j), a(j) \mid \theta^{Q})$ is the Q value obtained with network weights $\theta^{Q}$ when the state is s(j) and the action is a(j), and update the target network parameters $\theta^{Q'} = \delta\theta^{Q} + (1-\delta)\theta^{Q'}$, where $\delta$ is a proportionality coefficient. Update the weights $\theta^{\pi}$ of the $\pi(\cdot)$ network by computing the gradient
Figure GDA0003743550210000081
and descending along it, where $J(\pi)$ is the output value of the policy network, a is the action parameter, s is the state parameter, and $\pi$ is the policy parameter, and update the target network parameters $\theta^{\pi'} = \delta\theta^{\pi} + (1-\delta)\theta^{\pi'}$.
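The two update rules in S11 that are stated explicitly, the target-parameter updates, can be sketched as below. The loss function itself is an image in the source, so the mean squared error between y(j) and Q(s(j), a(j)) is an assumption, although it is the usual choice for this kind of critic. Parameters are flat lists for simplicity.

```python
def critic_loss(targets, q_values):
    """Assumed loss: mean squared error between targets y(j) and Q(s(j), a(j))."""
    return sum((y - q) ** 2 for y, q in zip(targets, q_values)) / len(targets)


def soft_update(online, target, delta):
    """theta' = delta * theta + (1 - delta) * theta', element by element."""
    return [delta * w + (1.0 - delta) * w_t for w, w_t in zip(online, target)]


if __name__ == "__main__":
    print(critic_loss([1.0, 3.0], [0.0, 1.0]))
    theta_q = [1.0, 2.0]
    theta_q_target = [0.0, 0.0]
    for _ in range(3):
        theta_q_target = soft_update(theta_q, theta_q_target, delta=0.5)
        print(theta_q_target)
```

A small $\delta$ makes the target parameters trail the online parameters slowly, which keeps the target values from S10 stable between updates.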
S12. Repeat steps S4 to S11 a fixed number of times to obtain a stabilized intelligent reflecting surface phase shift and unmanned aerial vehicle path plan.
The technical solution of the present invention is not limited to the limitations of the above specific embodiments, and all technical modifications made according to the technical solution of the present invention fall within the protection scope of the present invention.

Claims (1)

1. An intelligent reflecting surface phase shift and unmanned aerial vehicle path planning method, characterized by comprising the following steps:
S1. establishing an unmanned aerial vehicle-terminal communication model assisted by the intelligent reflecting surface;
S2. collecting information on the unmanned aerial vehicle, the intelligent reflecting surface, and the ground terminals in the current area, and importing it into the communication model;
S3. establishing a deep reinforcement learning network, and initializing the initial and target network parameters;
S4. initializing the state of the intelligent-reflecting-surface-assisted unmanned aerial vehicle-terminal communication scenario in the deep reinforcement learning network;
S5. executing an action according to the state and the reward;
S6. determining whether the unmanned aerial vehicle is out of bounds or over the speed limit, and if so, applying a penalty and cancelling the action;
S7. applying the intelligent reflecting surface phase shift parameters and executing the action;
S8. saving the action, the reward, the current state, and the next state as a sample;
S9. if the task is not completed, repeating steps S5 to S8 a fixed number of times or until the task is completed;
S10. randomly selecting a mini-batch from the samples obtained in S8 to calculate the target values;
S11. updating the network parameters by minimizing a loss function and by gradient descent, respectively;
S12. repeating steps S4 to S11 a fixed number of times to obtain a stabilized intelligent reflecting surface phase shift and unmanned aerial vehicle path plan;
wherein S1 comprises:
S101. characterizing the path of the unmanned aerial vehicle by the spatial coordinates [Figure FDA0003743550200000011] and the time slot durations [Figure FDA0003743550200000012];
wherein [Figure FDA0003743550200000013] is the horizontal position of the unmanned aerial vehicle in the n-th time slot, where [Figure FDA0003743550200000014] and N denotes the total number of time slots; [Figure FDA0003743550200000015] is the vertical position of the unmanned aerial vehicle in the n-th time slot;
the three-dimensional area in which the unmanned aerial vehicle communicates with the terminals with the assistance of the intelligent reflecting surface is evenly divided into a plurality of cells, and the horizontal coordinate of the center of the i-th cell is [Figure FDA00037435502000000211], where [Figure FDA00037435502000000210] is the set of horizontal center coordinates of all cells;
[Figure FDA0003743550200000029] and [Figure FDA0003743550200000028] are the preset horizontal centers for take-off and landing of the unmanned aerial vehicle;
S102. the propulsion energy of the rotary-wing unmanned aerial vehicle is:
Figure FDA0003743550200000021
in the above formula, [Figure FDA0003743550200000027] is the horizontal flight speed of the unmanned aerial vehicle; $P_0$ is the constant blade power; $P_1$ is the hovering induced power; $P_2$ is the constant descending or ascending power; $U_{tip}$ is the rotor blade tip speed; $v_0$ is the mean rotor induced velocity in hover; $d_0$ is the fuselage drag ratio; $S$ is the rotor solidity; $\rho$ is the air density; $G$ is the rotor disc area;
S103. given the number of reflecting elements $M_c \times M_r$ of each uniform planar array on the intelligent reflecting surface, with a column spacing of $d_c$ meters and a row spacing of $d_r$ meters, calculating the channel gain between the unmanned aerial vehicle and the intelligent reflecting surface in the n-th time slot:
Figure FDA0003743550200000022
in the above formula, $\xi$ is the channel loss at a distance of 1 meter; [Figure FDA0003743550200000023] is the distance between the intelligent reflecting surface and the unmanned aerial vehicle in the n-th time slot; $z_R$ and $w_R$ are the positions of the first element of the intelligent reflecting surface in the vertical and horizontal directions, respectively; $\lambda$ is the carrier wavelength; [Figure FDA0003743550200000026] and [Figure FDA0003743550200000024] are, respectively, the cosine and sine of the horizontal angle of arrival of the signal at the intelligent reflecting surface; [Figure FDA0003743550200000025] is the sine of the vertical angle of arrival of the signal at the intelligent reflecting surface;
S104. similarly, calculating the channel gain between the k-th terminal and the intelligent reflecting surface:
Figure FDA0003743550200000031
wherein the distance between the k-th terminal and the intelligent reflecting surface is [Figure FDA0003743550200000032]; [Figure FDA00037435502000000316] and [Figure FDA00037435502000000317] are, respectively, the cosine and sine of the horizontal signal emission angle of the k-th terminal; [Figure FDA00037435502000000313] is the sine of the vertical signal emission angle of the k-th terminal; further, the channel gain of the k-th terminal over the whole link is expressed as [Figure FDA00037435502000000314], where [Figure FDA00037435502000000315] is the reflection phase coefficient matrix of the intelligent reflecting surface and
Figure FDA00037435502000000312
S105. calculating the blocking probability of the link between the unmanned aerial vehicle and the k-th ground terminal in the n-th time slot, [Figure FDA00037435502000000310], where [Figure FDA00037435502000000311], and a and b are variables that change with the communication environment; further, the average channel gain achieved by the k-th terminal is expressed as [Figure FDA0003743550200000038], and the channel rate is [Figure FDA0003743550200000039], where P is the fixed transmit power of the unmanned aerial vehicle, B is the bandwidth, $\sigma$ is the noise variable, and $c_{k,n} \in \{0, 1\}$ indicates whether the k-th terminal is scheduled;
in S2, the unmanned aerial vehicle information L, H, T, the intelligent reflecting surface information $\Theta$, and the ground terminal information C in the current area are collected and imported into the communication model, where [Figure FDA0003743550200000037] is the set of horizontal positions of the unmanned aerial vehicle, [Figure FDA0003743550200000034] is the set of vertical positions of the unmanned aerial vehicle, [Figure FDA0003743550200000036] is the duration of each flight time slot of the unmanned aerial vehicle, [Figure FDA0003743550200000035] is the reflection phase coefficient matrix of the intelligent reflecting surface, and [Figure FDA0003743550200000033] is the ground terminal scheduling scheme;
in S3, a deep reinforcement learning network is established, and the experience replay buffer F and the number of time slots N are initialized; the parameters $\theta^{\pi}$ and $\theta^{\pi'}$ of the initial and target policy networks $\pi$ are initialized so that $\theta^{\pi'} = \theta^{\pi}$; the parameters $\theta^{Q}$ and $\theta^{Q'}$ of the initial and target deep reinforcement learning Q networks are initialized so that $\theta^{Q'} = \theta^{Q}$;
in S5, the action [Figure FDA0003743550200000041] is randomly selected and executed, where [Figure FDA0003743550200000042] is a random process and $\pi(s(n) \mid \theta^{\pi})$ denotes the policy selected in state $s(n)$ with network parameters $\theta^{\pi}$;
in S6, if the unmanned aerial vehicle flies beyond the boundary or its speed exceeds the upper limit, a penalty is applied and action a(n) is cancelled;
in S7, the intelligent reflecting surface phase shift parameters are applied and action a(n) is executed to obtain the state s(n+1) and the reward r(s(n), a(n));
in S8, the sample (s(n), a(n), r(·), s(n+1)) is stored in the experience replay buffer F;
in S10, a random mini-batch of samples (s(j), a(j), r(j), s(j+1)) is selected from the M samples in the experience replay buffer F, where s(j) and s(j+1) are the states at times j and j+1, respectively, a(j) is the action at time j, and r(j) is the reward at time j; the target value $y(j) = r(j) + \gamma Q'(s(j+1), \pi'(s(j+1) \mid \theta^{\pi'}) \mid \theta^{Q'})$ is calculated, where $\gamma \in (0, 1]$ is the discount factor and $Q'(s(j+1), \pi'(s(j+1) \mid \theta^{\pi'}) \mid \theta^{Q'})$ is the Q value of the target network Q'(·) obtained from the state, the policy, and the network parameters at time j+1;
in S11, the weights $\theta^{Q}$ of the Q(·) network are updated by minimizing the loss function [Figure FDA0003743550200000043], where $Q(s(j), a(j) \mid \theta^{Q})$ is the Q value obtained with network weights $\theta^{Q}$ when the state is s(j) and the action is a(j), and the target network parameters are updated as $\theta^{Q'} = \delta\theta^{Q} + (1-\delta)\theta^{Q'}$, where $\delta$ is a proportionality coefficient; the weights $\theta^{\pi}$ of the $\pi(\cdot)$ network are updated by computing the gradient [Figure FDA0003743550200000044] and descending along it, where $J(\pi)$ is the output value of the policy network, a is the action parameter, s is the state parameter, and $\pi$ is the policy parameter, and the target network parameters are updated as $\theta^{\pi'} = \delta\theta^{\pi} + (1-\delta)\theta^{\pi'}$.
CN202210050792.8A 2022-01-17 2022-01-17 Intelligent reflecting surface phase shift and unmanned aerial vehicle path planning method Active CN114257298B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210050792.8A CN114257298B (en) 2022-01-17 2022-01-17 Intelligent reflecting surface phase shift and unmanned aerial vehicle path planning method

Publications (2)

Publication Number Publication Date
CN114257298A (en) 2022-03-29
CN114257298B (en) 2022-09-27

Family

ID=80796579

Family Applications (1)

Application Number Status Publication Number Priority Date Filing Date Title
CN202210050792.8A Active CN114257298B (en) 2022-01-17 2022-01-17 Intelligent reflecting surface phase shift and unmanned aerial vehicle path planning method

Country Status (1)

Country Link
CN (1) CN114257298B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116878520B (en) * 2023-09-06 2024-01-26 北京邮电大学 Unmanned aerial vehicle path planning method

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113162679A (en) * 2021-04-01 2021-07-23 南京邮电大学 DDPG algorithm-based IRS (intelligent reflecting surface) assisted unmanned aerial vehicle communication joint optimization method
CN113316169A (en) * 2021-05-08 2021-08-27 北京科技大学 UAV auxiliary communication energy efficiency optimization method and device for smart port
CN113364495A (en) * 2021-05-25 2021-09-07 西安交通大学 Multi-unmanned aerial vehicle track and intelligent reflecting surface phase shift joint optimization method and system
JP2021148789A (en) * 2020-03-13 2021-09-27 株式会社スカイマティクス Radio wave propagation path maintenance system
CN113645635A (en) * 2021-08-12 2021-11-12 大连理工大学 Design method of intelligent reflector-assisted high-energy-efficiency unmanned aerial vehicle communication system
CN113708886A (en) * 2021-08-25 2021-11-26 中国人民解放军陆军工程大学 Unmanned aerial vehicle anti-interference communication system and joint track and beam forming optimization method

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
Joint Trajectory-Task-Cache Optimization With Phase-Shift Design of RIS-Assisted UAV for MEC; Haibo Mei et al.; IEEE Wireless Communications Letters; 2021-04-22; pp. 1586-1590 *
Optimizing Age of Information Through Aerial Reconfigurable Intelligent Surfaces: A Deep Reinforcement Learning Approach; Moataz Samir et al.; IEEE Transactions on Vehicular Technology; 2021-03-05; vol. 70, no. 4; pp. 3978-3983 *
Reconfigurable Intelligent Surface Assisted UAV Communication: Joint Trajectory Design and Passive Beamforming; Sixian Li et al.; IEEE Wireless Communications Letters; 2020-01-14; vol. 9, no. 5; pp. 716-720 *
Reconfigurable Intelligent Surface-Assisted Multi-UAV Networks: Efficient Resource Allocation With Deep Reinforcement Learning; Khoi Khac Nguyen et al.; IEEE Journal of Selected Topics in Signal Processing; 2021-12-09; vol. 16, no. 3; pp. 358-368 *
Research on UAV communication based on intelligent reflecting surfaces; Li Sixian; China Master's Theses Full-text Database, Engineering Science and Technology II; 2022-01-15; pp. 10-19 *
Research on UAV data collection and path planning based on deep reinforcement learning; Mou Zhiyu et al.; Chinese Journal on Internet of Things; 2020-09-30; vol. 4, no. 03; pp. 42-51 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant