CN115915069A

CN115915069A - Vehicular network communication method and system assisted by an unmanned aerial vehicle carrying an RIS

Info

Publication number: CN115915069A
Application number: CN202211348726.5A
Authority: CN
Inventors: 赵海涛; 孙文雪; 夏文超; 倪艺洋; 王琴; 徐卓然; 谈宇浩; 郝晴
Original assignee: Nanjing University of Posts and Telecommunications
Current assignee: Nanjing University of Posts and Telecommunications
Priority date: 2022-10-31
Filing date: 2022-10-31
Publication date: 2023-04-04

Abstract

The invention discloses a vehicular network communication method and system assisted by an unmanned aerial vehicle (UAV) carrying a reconfigurable intelligent surface (RIS). The UAV first predicts the position of each vehicle in the next time slot according to a prediction model. It then optimizes the phase shift matrix of the RIS reflecting units and its own trajectory so as to maximize the communication rate of the two vehicles in the next time slot. Finally, the phase shift factors of the RIS reflecting units are adjusted according to the optimization result, and the UAV flies to the position it is to reach in the next time slot. The invention addresses the problem that vehicles cannot communicate directly when obstacles block the link. Subject to the vehicle communication delay constraint and the UAV flight energy consumption constraint, optimizing the RIS phase shift matrix enhances the quality of the signal reflected to the vehicle, and optimizing the UAV trajectory lets the UAV serve the vehicle communication better. The optimization problem is solved with the DDPG algorithm to obtain a strategy that maximizes the vehicle communication rate, which meets the requirements of vehicular network communication and improves its performance.

Description

Vehicular network communication method and system assisted by an unmanned aerial vehicle (UAV) carrying an RIS

Technical Field

The invention belongs to the technical field of Internet of Vehicles communication, and particularly relates to a vehicular network communication method and system assisted by an unmanned aerial vehicle (UAV) carrying a reconfigurable intelligent surface (RIS).

Background

The main purpose of vehicle communication research is to make daily travel safer and more convenient, paving the way for intelligent transportation systems and autonomous driving applications. However, the propagation link for vehicle communication degrades easily in a complicated propagation environment, especially given the blockage caused by the many buildings in a city and the rapid channel variation caused by the high mobility of vehicles.

Thanks to their high mobility, low cost, and line-of-sight transmission, UAVs have been used as mobile aerial base stations or airborne relays that improve ground communication performance when deployed at optimal flight positions. However, conventional communication assistance usually relies on auxiliary devices such as relays, which consume more energy and are easily interfered with by the environment. Current research on UAV-assisted Internet of Vehicles concentrates on statically deploying the UAV to maximize the coverage of communication between the UAV and ground users, and does not consider the mobility of the UAV. In addition, the energy a UAV carries is limited, so its energy consumption must also be managed. Most existing studies that combine UAVs and RIS consider only static ground users or targets. In the Internet of Vehicles, however, the high mobility of vehicles makes the communication environment highly uncertain: the roadside unit (RSU) must continuously exchange state information with the vehicles and make online decisions from the instantaneous state. Solving for the optimal decision with a conventional optimization algorithm or a heuristic algorithm has a large computational cost and is difficult to deploy online in real time.

Disclosure of Invention

To solve this technical problem, the invention provides a vehicular network communication method and system assisted by a UAV carrying an RIS. By optimizing the RIS phase shift matrix and the UAV trajectory subject to the V2V communication delay constraint and the UAV energy consumption constraint, the communication rate between vehicles is maximized and the communication performance of the vehicular network is improved.

The invention adopts the following technical scheme:

A vehicular network communication method assisted by an unmanned aerial vehicle (UAV) carrying an RIS, characterized in that, for two vehicles in a target area whose direct communication is obstructed, a UAV carrying a preset number of RIS reflecting units serves as a communication relay, and the following steps are executed to maximize the communication rate of the two vehicles in each time slot:

Step A: predict the positions of the first vehicle and the second vehicle in the next time slot from the information of the specified vehicle state types of the first vehicle and the second vehicle in the current time slot;

Step B: obtain an optimized RIS reflecting unit phase shift matrix and an optimized UAV trajectory based on the predicted positions of the first vehicle and the second vehicle in the next time slot;

Step C: adjust the phase shift matrix of the RIS reflecting units, and the position to which the UAV flies in the next time slot, according to the optimized RIS reflecting unit phase shift matrix and UAV trajectory.

Preferably, in step A, the positions of the first vehicle $k_1$ and the second vehicle $k_2$ in the next time slot are predicted by formula from the information of the specified vehicle state types of the two vehicles in the current time slot. The prediction formulas are not reproduced in this text; the quantities they use are as follows.

The position of the first vehicle and the position of the second vehicle are each given by an x-axis coordinate and a y-axis coordinate in the current time slot $n$; $n+1$ denotes the next time slot after the current time slot $n$; $v_1(n)$ and $v_2(n)$ denote the speeds of the first and second vehicles in the current time slot $n$; $a_1(n)$ and $a_2(n)$ denote their accelerations in the current time slot $n$; $w_1(n)$ and $w_2(n)$ denote their angular velocities in the current time slot $n$; and $\theta_1(n)$ and $\theta_2(n)$ denote their deviation angles in the current time slot $n$.

Preferably, the information specifying each vehicle state type includes a position, a speed, a deviation angle, and an angular acceleration of the vehicle at the current time slot.
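The prediction step can be illustrated with a minimal sketch. The patent's own prediction formula is not reproduced in this text, so the code below assumes a simple constant-acceleration, constant-turn-rate motion model over one slot of duration `tau`. The function name `predict_position`, the parameter `tau`, and the mid-slot heading approximation are illustrative assumptions, not the patented formula.

```python
import math

def predict_position(x, y, v, a, w, theta, tau):
    """Predict a vehicle's (x, y) one time slot ahead.

    ASSUMPTION: constant acceleration a and constant angular velocity w over
    the slot; this is an illustrative model, not the patent's formula.
    """
    travelled = v * tau + 0.5 * a * tau ** 2   # distance covered during the slot
    heading = theta + 0.5 * w * tau            # mid-slot heading approximation
    return (x + travelled * math.cos(heading),
            y + travelled * math.sin(heading))

# Example: both vehicles predicted from their current-slot state (made-up values).
k1_next = predict_position(x=0.0, y=0.0, v=10.0, a=0.0, w=0.0, theta=0.0, tau=0.1)
k2_next = predict_position(x=5.0, y=-20.0, v=8.0, a=1.0, w=0.0,
                           theta=math.pi / 2, tau=0.1)
```

Any model that maps the current position, speed, acceleration, angular velocity, and deviation angle to a next-slot position could be substituted here; the optimization in step B only consumes the predicted coordinates.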

Preferably, step B obtains the optimized RIS reflecting unit phase shift matrix and UAV trajectory through the following steps:

Step B1: with the UAV carrying the RIS reflecting units as a communication relay, construct the optimization problem for the communication of the two vehicles in the next time slot from the communication rate model of the two vehicles;

Step B2: solve that optimization problem using the predicted positions of the first vehicle and the second vehicle in the next time slot, obtaining the optimized RIS reflecting unit phase shift matrix and UAV trajectory.

Preferably, in step B1, with the UAV carrying the RIS reflecting units as a communication relay, the communication rate of the two vehicles is modeled as a rate $R_b[i]$ that depends on the two channel gains $h_1[i]$ and $h_2[i]$. The formulas for $R_b[i]$, $h_1[i]$ and $h_2[i]$ are not reproduced in this text; their terms are defined as follows.

$R_b[i]$ denotes the communication rate of the two vehicles in time slot $i$, that is, the communication rate model of the two vehicles; $B_w$ is the bandwidth; the transmit power is that of the first vehicle $k_1$; $h_1[i]$ denotes the channel gain between the first vehicle $k_1$ and the RIS reflecting units in time slot $i$; $h_2[i]$ denotes the channel gain between the second vehicle $k_2$ and the RIS reflecting units in time slot $i$; $\Theta[i]$ denotes the RIS reflecting unit phase shift matrix in time slot $i$; $\sigma^2$ denotes the noise power; $\rho$ denotes the path loss at a reference distance of 1 m; $\alpha$ denotes the path loss exponent; $\lambda$ denotes the carrier wavelength; and $M$ denotes the total number of RIS reflecting units.

The geometry enters through the distance between the first vehicle $k_1$ and the UAV in time slot $i$, the distance between the second vehicle $k_2$ and the UAV in time slot $i$, the cosine of the signal angle from the first vehicle $k_1$ to the UAV, and the cosine of the signal angle from the UAV to the second vehicle $k_2$. $x_A[i]$ and $y_A[i]$ denote the x-axis and y-axis coordinates of the UAV in time slot $i$, and $H$ denotes its z-axis coordinate, which is a fixed height. The first vehicle $k_1$ and the second vehicle $k_2$ each have a z-axis coordinate in time slot $i$.
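Because the rate and distance formulas are missing from this text, the following is only a plausible reading assembled from the symbol definitions above, not the patent's own expressions. It assumes the usual cascaded-channel Shannon rate for a passive RIS relay and a Euclidean distance with the UAV at height $H$. The symbols $P_{k_1}$, $d_{kA}[i]$, $x_k[i]$, $y_k[i]$ and $z_k[i]$ are names introduced here for illustration.

```latex
R_b[i] = B_w \log_2\!\left(1 + \frac{P_{k_1}\,\bigl|h_2^{H}[i]\,\Theta[i]\,h_1[i]\bigr|^{2}}{\sigma^{2}}\right),
\qquad
d_{kA}[i] = \sqrt{(x_A[i]-x_k[i])^{2} + (y_A[i]-y_k[i])^{2} + (H-z_k[i])^{2}},
\quad k \in \{k_1, k_2\}.
```

Under this reading, the phase shift matrix $\Theta[i]$ affects the rate only through the magnitude of the cascaded channel, which is why optimizing it can strengthen the reflected signal without any additional transmit power.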

Preferably, in step B1, the optimization problem for the communication of the two vehicles in the next time slot is to maximize the communication rate $R_b[n+1]$ of the two vehicles in the next time slot over the RIS phase shift matrix and the UAV position, subject to (s.t.) a delivery-probability constraint and the constraints below. The objective expression and the delivery-probability constraint are not reproduced in this text; the remaining constraints are:

$$E_{UAV} \le E_{MAX}$$

$$\|l[n+1]-l[n]\|^2 \le D_l^2,\quad n = 1, 2, \ldots, N-1$$

$$\|l[1]-l_0\|^2 \le D_l^2$$

where $R_b[n+1]$ denotes the communication rate of the two vehicles in the next time slot $n+1$; $\Phi$ denotes the phase shift matrix of the RIS reflecting units; $\Theta[n+1]$ denotes the RIS reflecting unit phase shift matrix in time slot $n+1$; $B_w$ is the bandwidth; $L$ denotes the UAV position; the transmit power is that of the first vehicle $k_1$; $h_1[n+1]$ and $h_2[n+1]$ denote the channel gains between the RIS reflecting units and the first vehicle $k_1$ and the second vehicle $k_2$, respectively, in time slot $n+1$; $\sigma^2$ denotes the noise power; $\Delta T$ is the duration for which the two vehicles communicate with the UAV as a relay; $B$ denotes the total size of the data generated by the communication of the two vehicles within $\Delta T$; $N$ denotes the total number of communication time slots; $P_{th}$ denotes the preset delivery probability threshold; $E_{UAV}$ denotes the energy consumption of the UAV in the current time slot; $E_{MAX}$ denotes the maximum energy consumption of the UAV; $l[n]$, $l[n+1]$ and $l[1]$ denote the UAV position in the current time slot $n$, in time slot $n+1$, and in the first time slot; $l_0$ denotes the initial position of the UAV; and $D_l = V_{max}\tau$, where $V_{max}$ is the maximum flight speed of the UAV and $\tau$ is the time slot duration.
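The two trajectory constraints, $\|l[n+1]-l[n]\|^2 \le D_l^2$ with $D_l = V_{max}\tau$, can be checked directly in code. The sketch below tests whether a proposed move is feasible and, as an illustrative extra, shortens an infeasible move to the maximum step. The helper names `is_feasible_move` and `clip_move` are hypothetical, and clipping is just one possible way to enforce the constraint.

```python
import numpy as np

def max_step(v_max, tau):
    """D_l = V_max * tau: the farthest the UAV can fly in one slot."""
    return v_max * tau

def is_feasible_move(l_prev, l_next, v_max, tau):
    """Check ||l[n+1] - l[n]||^2 <= D_l^2 for horizontal positions."""
    step = np.asarray(l_next, dtype=float) - np.asarray(l_prev, dtype=float)
    return float(step @ step) <= max_step(v_max, tau) ** 2

def clip_move(l_prev, l_next, v_max, tau):
    """Hypothetical helper: scale an over-long move back to length D_l."""
    l_prev = np.asarray(l_prev, dtype=float)
    step = np.asarray(l_next, dtype=float) - l_prev
    norm = np.linalg.norm(step)
    d_l = max_step(v_max, tau)
    if norm <= d_l:
        return l_prev + step
    return l_prev + step * (d_l / norm)

# Example with V_max = 10 m/s and tau = 0.5 s, so D_l = 5 m (made-up values).
ok = is_feasible_move((0.0, 0.0), (3.0, 4.0), v_max=10.0, tau=0.5)
clipped = clip_move((0.0, 0.0), (6.0, 8.0), v_max=10.0, tau=0.5)
```

The same check applies to the first slot by passing the initial position $l_0$ as `l_prev`.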

Preferably, in step B2, the optimization problem for the communication of the two vehicles in the next time slot is solved with the DDPG algorithm, using the predicted positions of the first vehicle and the second vehicle in the next time slot, to obtain the optimized RIS reflecting unit phase shift matrix and UAV trajectory.

Preferably, for this optimization problem, the DDPG algorithm takes the UAV as the agent and the communication environment between the UAV and the two vehicles as the environment.

The state of the environment is set as $s(t) = \{h_1(t), h_2(t), R_b(t), B_k\}$,

where $h_1(t)$ denotes the channel gain between the first vehicle $k_1$ and the RIS reflecting units in time slot $t$; $h_2(t)$ denotes the channel gain between the second vehicle $k_2$ and the RIS reflecting units in time slot $t$; $R_b(t)$ denotes the communication rate of the two vehicles in time slot $t$; $B_k$ denotes the remaining payload of the environment state space in time slot $t$; and $s(t)$ denotes the state of the environment state space in time slot $t$.

Based on the state of the environment, the action of the agent consists of determining the RIS phase shift matrix and the UAV trajectory, that is, $a(t) = \{\Phi, L\}$,

where the first component is the RIS reflecting unit phase shift matrix in time slot $t$, the second component is the UAV trajectory in time slot $t$, and $a(t)$ denotes the action of the agent in the environment state space in time slot $t$.

Based on the action of the agent in the environment, a reward function is defined (its expression is not reproduced in this text), where $r(t)$ denotes the reward in time slot $t$.

Based on the environment state, the action of the agent in the environment, and the reward function, an evaluation function $Q$ of the environment is defined (its expression is likewise not reproduced), where $\pi$ denotes the policy, that is, how the agent acts in the environment; $E$ denotes expectation; $\gamma \in [0,1]$ is the discount factor applied to $r(t)$; and $r(s_t, a_t)$ denotes the immediate reward for taking action $a$ in environment state $s$ in time slot $t$.
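For orientation, the standard discounted action-value function used in reinforcement learning is consistent with the symbols defined above ($\pi$, $E$, $\gamma$, $r(s_t, a_t)$). It is shown here as an assumed form, since the patent's own expression is not reproduced in this text:

```latex
Q^{\pi}(s, a) = \mathbb{E}_{\pi}\!\left[\, \sum_{i=0}^{\infty} \gamma^{i}\, r(s_{t+i}, a_{t+i}) \;\middle|\; s_t = s,\ a_t = a \right]
```

With $\gamma$ close to 0 the agent weighs mostly the immediate rate, while $\gamma$ close to 1 makes it account for the rates achievable in later time slots as well.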

Preferably, in the DDPG algorithm, the optimal policy $\pi^*$ is obtained from the objective $\pi^* = \arg\max Q^{\pi}(s, a)$, where $\pi$ denotes a policy, $s$ denotes a state of the environment, and $a$ denotes an action of the agent in the environment.

A vehicular network communication system assisted by a UAV carrying an RIS, applying the UAV-carried RIS-assisted vehicular network communication method of claim 1, characterized in that the system comprises a position prediction module and an optimization module. The position prediction module predicts the positions of the first vehicle and the second vehicle in the next time slot from the information of the specified vehicle state types of the first vehicle and the second vehicle in the current time slot. The optimization module optimizes the phase shift matrix of the RIS reflecting units and the UAV trajectory based on the predicted positions of the first vehicle and the second vehicle in the next time slot.

The beneficial effects of the invention are as follows. The invention provides a vehicular network communication method and system assisted by a UAV carrying an RIS. Subject to the vehicle communication delay and reliability constraints and the UAV energy consumption constraint, the DDPG algorithm quickly solves the problem of maximizing the communication rate between the vehicles and yields the optimal inter-vehicle communication strategy. By optimizing the RIS phase shift matrix and the UAV trajectory under the V2V communication delay constraint and the UAV energy consumption constraint, the method maximizes the communication rate between vehicles and improves the communication performance of the vehicular network.

Drawings

Fig. 1 is a schematic overall flow chart of the UAV-carried RIS-assisted vehicular network communication method according to an embodiment of the present invention;

Fig. 2 is a schematic diagram of a UAV-carried RIS-assisted vehicular network communication scenario according to an embodiment of the present invention.

Detailed Description

The invention is further described below with reference to the accompanying drawings. The following examples are presented to enable one of ordinary skill in the art to more fully understand the present invention and are not intended to limit the invention in any way.

In a vehicular network communication method assisted by a UAV carrying an RIS, for two vehicles $k_1$ and $k_2$ in a target area whose direct communication is obstructed, a UAV carrying a preset number of RIS reflecting units serves as a communication relay; that is, one UAV carries one RIS with $M$ reflecting units. As shown in Fig. 1, the following steps are performed to maximize the communication rate of the two vehicles in each time slot. This embodiment considers the intersection scenario shown in Fig. 2: vehicles $k_1$ and $k_2$ each travel in their own lane, and a building blocks direct communication between them.

Step A: respectively predicting the vehicle positions of the first vehicle and the second vehicle in the next time slot based on the information of the specified vehicle state types of the first vehicle and the second vehicle in the current time slot; the information specifying each vehicle state type includes a position, a speed, a deviation angle, and an angular acceleration of the vehicle at the current time slot.

Based on the information of the specified vehicle state types of the first vehicle and the second vehicle in the current time slot, the positions of the first vehicle $k_1$ and the second vehicle $k_2$ in the next time slot are predicted by formula. The prediction formulas are not reproduced in this text; the quantities they use are as follows.

The position of the first vehicle and the position of the second vehicle are each given by an x-axis coordinate and a y-axis coordinate in the current time slot $n$; $n+1$ denotes the next time slot after the current time slot $n$; $v_1(n)$ and $v_2(n)$ denote the speeds of the first and second vehicles in the current time slot $n$; $a_1(n)$ and $a_2(n)$ denote their accelerations in the current time slot $n$; $w_1(n)$ and $w_2(n)$ denote their angular velocities in the current time slot $n$; and $\theta_1(n)$ and $\theta_2(n)$ denote their deviation angles in the current time slot $n$.

Consider a three-dimensional coordinate system. In the current time slot $n$, the coordinates of the UAV are $(x_A[n], y_A[n], H)$, and the UAV senses the current state information of the vehicles, including the position coordinates of vehicles $k_1$ and $k_2$.

The flight period $T$ of the UAV is divided into $N$ identical time slots, each lasting a sufficiently small constant $\tau$ (so that $T = N\tau$). The data for each time slot therefore approximates the data at a single instant, and the flight height $H$ is unchanged. Within each interval $\tau$, the positions of the UAV and the vehicles are treated as static.

After the UAV predicts the next position of each vehicle, it feeds that position back to the RIS. Optimizing the phase shift matrix of the RIS reflecting units enhances the quality of the reflected signal, and the position information is transmitted to the vehicles, which supports highly reliable inter-vehicle communication. At the same time, the UAV optimizes its own trajectory $L$ according to the predicted vehicle positions and determines the position to fly to in the next time slot, so that it can provide communication service to the vehicles within its coverage.

Step B: obtain an optimized RIS reflecting unit phase shift matrix and an optimized UAV trajectory based on the predicted positions of the first vehicle and the second vehicle in the next time slot.

Step B obtains the optimized RIS reflecting unit phase shift matrix and UAV trajectory through the following steps:

Step B1: with the UAV carrying the RIS reflecting units as a communication relay, construct the optimization problem for the communication of the two vehicles in the next time slot from the communication rate model of the two vehicles;

With the current time slot being the $n$-th time slot, the channel gain between vehicle $k_1$ and the RIS is expressed by a formula (not reproduced in this text) in terms of the distance between vehicle $k_1$ and the RIS, where $\rho$ is the path loss at a reference distance of 1 m, $\alpha$ is the path loss exponent, $\lambda$ denotes the carrier wavelength, and the remaining term is the cosine of the signal angle from vehicle $k_1$ to the RIS.

Similarly, in the $n$-th time slot, the channel gain between the RIS and vehicle $k_2$ is expressed in terms of the distance between vehicle $k_2$ and the RIS and the cosine of the signal angle from the RIS to vehicle $k_2$.

The communication rate between the two vehicles in the $n$-th time slot is then expressed (formula not reproduced) in terms of the bandwidth $B_w$, the transmit power of vehicle $k_1$, which this scheme sets to a constant in every time slot, and the noise power $\sigma^2$.

The communication rate between the two vehicles in the $(n+1)$-th time slot is expressed in the same way.

Therefore, with the UAV carrying the RIS reflecting units as a communication relay, the communication rate model of the two vehicles is expressed by formulas (not reproduced in this text) whose terms are defined as follows.

$R_b[i]$ denotes the communication rate of the two vehicles in time slot $i$, that is, the communication rate model of the two vehicles; $B_w$ is the bandwidth; the transmit power is that of the first vehicle $k_1$; $h_1[i]$ denotes the channel gain between the first vehicle $k_1$ and the RIS reflecting units in time slot $i$; $h_2[i]$ denotes the channel gain between the second vehicle $k_2$ and the RIS reflecting units in time slot $i$; $\Theta[i]$ denotes the RIS reflecting unit phase shift matrix in time slot $i$; $\sigma^2$ denotes the noise power; $\rho$ denotes the path loss at a reference distance of 1 m; $\alpha$ denotes the path loss exponent; $\lambda$ denotes the carrier wavelength; and $M$ denotes the total number of RIS reflecting units.

The geometry enters through the distance between each vehicle and the RIS reflecting units in time slot $i$ and through the cosine of the signal angle from the UAV to the second vehicle $k_2$. $x_A[i]$ and $y_A[i]$ denote the x-axis and y-axis coordinates of the UAV in time slot $i$, and $H$ denotes its z-axis coordinate, which is a fixed height. The first vehicle $k_1$ has a z-axis coordinate in time slot $i$, as does the second vehicle $k_2$, whose z-axis coordinate in time slot $i$ is taken as 0 in this scheme.
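To make the role of the phase shift matrix concrete, the rate model can be sketched numerically. Everything below that is not named in the text is an assumption: the uniform-linear-array steering vector with element spacing `spacing`, the choice to measure the signal angle along the x axis, the cascaded rate form $B_w \log_2(1 + P\,|h_2^H \Theta h_1|^2/\sigma^2)$, and the helper names. It is a sketch of a common RIS channel model, not the patent's formulas.

```python
import numpy as np

def distance(uav_xy, height, veh_xyz):
    """Euclidean distance between the UAV at (x_A, y_A, H) and a vehicle."""
    dx = uav_xy[0] - veh_xyz[0]
    dy = uav_xy[1] - veh_xyz[1]
    dz = height - veh_xyz[2]
    return float(np.sqrt(dx * dx + dy * dy + dz * dz))

def channel(uav_xy, height, veh_xyz, M, rho, alpha, wavelength, spacing):
    """ASSUMED line-of-sight channel vector over M reflecting units."""
    d = distance(uav_xy, height, veh_xyz)
    cos_angle = (uav_xy[0] - veh_xyz[0]) / d   # ASSUMPTION: angle along x axis
    m = np.arange(M)
    steering = np.exp(-1j * 2 * np.pi / wavelength * spacing * m * cos_angle)
    return np.sqrt(rho * d ** (-alpha)) * steering

def rate(h1, h2, phases, B_w, P, sigma2):
    """ASSUMED rate B_w * log2(1 + P * |h2^H Theta h1|^2 / sigma^2)."""
    theta = np.diag(np.exp(1j * phases))       # diagonal phase shift matrix
    gain = np.abs(h2.conj() @ theta @ h1) ** 2
    return B_w * np.log2(1.0 + P * gain / sigma2)

def aligned_phases(h1, h2):
    """Phases that make all M reflected paths add in phase."""
    return np.angle(h2) - np.angle(h1)

# Made-up scenario: UAV at 50 m, vehicles on the ground (z = 0), M = 8 units.
h1 = channel((0.0, 0.0), 50.0, (30.0, 0.0, 0.0), 8, 1e-3, 2.0, 0.1, 0.05)
h2 = channel((0.0, 0.0), 50.0, (0.0, 40.0, 0.0), 8, 1e-3, 2.0, 0.1, 0.05)
r_aligned = rate(h1, h2, aligned_phases(h1, h2), B_w=1e6, P=0.1, sigma2=1e-13)
r_zero = rate(h1, h2, np.zeros(8), B_w=1e6, P=0.1, sigma2=1e-13)
```

Under these assumptions the aligned phases give a rate no lower than the unoptimized surface, which illustrates why the phase shift matrix is worth optimizing in every slot.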

In step B1, the optimization problem for the communication of the two vehicles in the next time slot is to maximize their communication rate in the next time slot, that is, to maximize the V2V communication rate in the $(n+1)$-th time slot while satisfying the delay constraint of V2V communication and the energy consumption and trajectory constraints of the UAV. The objective expression and the delivery-probability constraint are not reproduced in this text; the remaining constraints are:

$$E_{UAV} \le E_{MAX}$$

$$\|l[n+1]-l[n]\|^2 \le D_l^2,\quad n = 1, 2, \ldots, N-1$$

$$\|l[1]-l_0\|^2 \le D_l^2$$

where $R_b[n+1]$ denotes the communication rate of the two vehicles in the next time slot $n+1$; $\Phi$ denotes the phase shift matrix of the RIS reflecting units; $\Theta[n+1]$ denotes the RIS reflecting unit phase shift matrix in time slot $n+1$; $B_w$ is the bandwidth; $L$ denotes the UAV position; the transmit power is that of the first vehicle $k_1$; $h_1[n+1]$ and $h_2[n+1]$ denote the channel gains between the RIS reflecting units and the first vehicle $k_1$ and the second vehicle $k_2$, respectively, in time slot $n+1$; $\sigma^2$ denotes the noise power; $\Delta T$ is the duration for which the two vehicles communicate with the UAV as a relay, namely $T$; $B$ denotes the total size of the data generated by the communication of the two vehicles within $\Delta T$, that is, the payload that the V2V link needs to deliver; $N$ denotes the total number of communication time slots; $P_{th}$ denotes the preset delivery probability threshold; $E_{UAV}$ denotes the energy consumption of the UAV in the current time slot; $E_{MAX}$ denotes the maximum energy consumption of the UAV; $l[n]$, $l[n+1]$ and $l[1]$ denote the UAV position in the current time slot $n$, in time slot $n+1$, and in the first time slot; $l_0$ denotes the initial position of the UAV; and $D_l = V_{max}\tau$, where $V_{max}$ is the maximum flight speed of the UAV and $\tau$ is the time slot duration.

The constraints have the following meanings. The delay and reliability requirements of V2V communication are expressed through the probability of successful delivery, defined as the probability that a V2V payload of $B$ bits is successfully delivered within the acceptable time $\Delta T$; the first constraint requires this probability to be no less than the delivery threshold $P_{th}$. $E_{UAV} \le E_{MAX}$ states that the current energy consumption of the UAV does not exceed its maximum energy consumption. The last two constraints state that the distance flown by the UAV in each time slot does not exceed $D_l = V_{max}\tau$, where $V_{max}$ is the maximum flight speed.
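One hedged way to check the delivery-probability constraint numerically is to estimate it from sampled rate traces. The sketch below assumes that a payload counts as delivered when the bits accumulated over $N$ slots of duration `tau` reach `payload_bits`. The patent's exact probability expression is not reproduced in this text, so this criterion and both helper names are illustrative assumptions.

```python
import numpy as np

def delivery_probability(rate_traces, tau, payload_bits):
    """Fraction of traces that deliver the payload.

    rate_traces: array of shape (num_traces, N), rate in bit/s per slot.
    ASSUMPTION: bits delivered in a trace = sum of per-slot rate * tau.
    """
    traces = np.asarray(rate_traces, dtype=float)
    delivered_bits = traces.sum(axis=1) * tau
    return float(np.mean(delivered_bits >= payload_bits))

def meets_threshold(rate_traces, tau, payload_bits, p_th):
    """Check the constraint: estimated delivery probability >= P_th."""
    return delivery_probability(rate_traces, tau, payload_bits) >= p_th

# Four made-up traces of N = 2 slots each, tau = 1 s, payload B = 10 bits.
traces = [[10.0, 10.0], [1.0, 1.0], [10.0, 0.0], [5.0, 5.0]]
p_hat = delivery_probability(traces, tau=1.0, payload_bits=10.0)
```

Here three of the four traces accumulate at least 10 bits, so the estimate is 0.75; whether that satisfies the constraint depends on the chosen threshold $P_{th}$.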

Because the optimization problem is non-convex and highly complex, it is solved with DDPG. According to the current state of the environment space and its existing decision policy, the agent feeds the current state into the action (Actor) network, which outputs an action; the evaluation (Critic) network estimates the state-action value, that is, the Q value of the action; and the selected action together with the resulting next state is passed to the target networks for the solution.

Step B2: using the predicted positions of the first vehicle and the second vehicle in the next time slot, solve the optimization problem for the communication of the two vehicles in the next time slot with the DDPG algorithm to obtain the optimized RIS reflecting unit phase shift matrix and UAV trajectory. That is, obtain the best action: the optimal RIS phase shift matrix and the optimal UAV trajectory that maximize the vehicle communication rate.

For this optimization problem, the DDPG algorithm takes the UAV as the agent that interacts with the vehicle communication environment, and the communication environment between the UAV and the two vehicles as the environment.

The state of the environment is set as $s(t) = \{h_1(t), h_2(t), R_b(t), B_k\}$,

where $h_1(t)$ denotes the channel gain between the first vehicle $k_1$ and the RIS reflecting units in time slot $t$; $h_2(t)$ denotes the channel gain between the second vehicle $k_2$ and the RIS reflecting units in time slot $t$; $R_b(t)$ denotes the communication rate of the two vehicles in time slot $t$; $B_k$ denotes the remaining payload of the environment state space in time slot $t$; and $s(t)$ denotes the state of the environment state space in time slot $t$.

Based on the state of the environment, the action of the agent consists of determining the RIS phase shift matrix and the UAV trajectory, that is, $a(t) = \{\Theta, L\}$,

where the first component is the RIS reflecting unit phase shift matrix in time slot $t$, the second component is the UAV trajectory in time slot $t$, and $a(t)$ denotes the action of the agent in the environment state space in time slot $t$.
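Since DDPG outputs continuous actions, one way to realize an action of this kind is to map a bounded actor output onto a phase shift matrix and a UAV move. The sketch below assumes the actor emits $M + 2$ values in $[-1, 1]$: the first $M$ become phase shifts, the last two a horizontal displacement scaled so that its length never exceeds $D_l = V_{max}\tau$. The function name `decode_action` and this encoding are illustrative assumptions; the patent does not specify how the action is encoded.

```python
import numpy as np

def decode_action(raw, M, l_prev, v_max, tau):
    """Map a raw actor output in [-1, 1]^(M+2) to (Theta, next UAV position).

    ASSUMPTION: this encoding is hypothetical, chosen so that every decoded
    action satisfies the per-slot flight-distance constraint.
    """
    raw = np.clip(np.asarray(raw, dtype=float), -1.0, 1.0)
    phases = (raw[:M] + 1.0) * np.pi           # phase shifts in [0, 2*pi]
    theta = np.diag(np.exp(1j * phases))       # unit-modulus diagonal matrix
    d_l = v_max * tau                          # maximum move per slot
    move = raw[M:M + 2] * d_l / np.sqrt(2.0)   # keeps the move length <= d_l
    return theta, np.asarray(l_prev, dtype=float) + move

# Example with M = 4 reflecting units, V_max = 10 m/s, tau = 0.5 s.
theta, l_next = decode_action(np.ones(6), M=4, l_prev=(0.0, 0.0),
                              v_max=10.0, tau=0.5)
```

Building feasibility into the decoding is a design choice: it avoids having to penalize trajectory violations in the reward, at the cost of restricting moves to a square inscribed in the feasible disc.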

Based on the action of the agent in the environment, a reward is returned to the vehicle communication system after action $a(t)$ is performed in the environment state space. The reward function (its expression is not reproduced in this text) is defined in terms of maximizing the V2V communication rate, where $r(t)$ denotes the reward in time slot $t$.

In the evaluation function $Q$, $\pi$ denotes the policy, that is, how the agent acts in the environment; $E$ denotes expectation; $\gamma \in [0,1]$ is the discount factor applied to $r(t)$; and $r(s_t, a_t)$ denotes the immediate reward for taking action $a$ in environment state $s$ in time slot $t+i$.

In the DDPG algorithm, the optimal policy at the current time is obtained as $\pi^* = \arg\max Q^{\pi}(s, a)$, where $\pi$ denotes a policy, $s$ denotes a state of the environment, and $a$ denotes an action of the agent in the environment. The agent takes actions according to the policy $\pi$, which is a mapping from the state space $S$ to the action space $A$, written $s(t) \in S \to a(t) \in A$. During learning, the agent improves its policy through experience transitions until it reaches the optimal policy $\pi^*$. During the DDPG solving process, the Q value converges to the optimal value $Q^*$ with probability 1.
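The experience-driven learning described here can be sketched with two small pieces: a fixed-size experience buffer and the target value used to train the Critic. The code below uses the usual DDPG form, in which the Target-Actor network supplies the next action to the Target-Critic network. The class and function names are hypothetical, and the networks are stood in for by plain callables, so this is a structural sketch rather than a trainable implementation.

```python
import random
from collections import deque

class ReplayBuffer:
    """Fixed-capacity experience buffer D of (s, a, r, s_next) tuples."""

    def __init__(self, capacity):
        self.data = deque(maxlen=capacity)     # oldest entries drop out first

    def store(self, s, a, r, s_next):
        self.data.append((s, a, r, s_next))

    def is_full(self):
        return len(self.data) == self.data.maxlen

    def sample(self, batch_size):
        """Draw a random mini-batch without replacement."""
        return random.sample(list(self.data), batch_size)

def td_targets(batch, gamma, target_actor, target_critic):
    """y_i = r_i + gamma * Q'(s_{i+1}, mu'(s_{i+1})) for each tuple.

    ASSUMPTION: target_actor and target_critic are callables standing in
    for the Target-Actor and Target-Critic networks.
    """
    return [r + gamma * target_critic(s_next, target_actor(s_next))
            for (_, _, r, s_next) in batch]

# Toy usage with scalar states and stand-in "networks".
buf = ReplayBuffer(capacity=3)
for k in range(5):
    buf.store(k, 0, 1.0, k + 1)
targets = td_targets([(0, 0, 1.0, 2.0)], gamma=0.9,
                     target_actor=lambda s: 0.5 * s,
                     target_critic=lambda s, a: s + a)
```

Sampling random mini-batches from the buffer breaks the correlation between consecutive time slots, and the separate target networks keep the regression target for the Critic from shifting at every update.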

Step C: adjust the phase shift matrix of the RIS reflecting units, and the position to which the UAV flies in the next time slot, according to the optimized RIS reflecting unit phase shift matrix and UAV trajectory. According to the optimization result $\Theta^*, L^*$ obtained with the DDPG algorithm, the phase shift factors of the RIS reflecting units are adjusted and the UAV flies to the position it is to reach in the next time slot, so that the communication rate of the two vehicles in the next time slot is maximized. Throughout the period in which the two vehicles communicate, the UAV trajectory and the RIS phase shift matrix are optimized for each time slot to maximize the vehicle communication rate in the next time slot and guarantee vehicle communication quality, until the communication between the two vehicles ends or the link is no longer blocked.

The optimization problem is solved using the DDPG algorithm. The core of the algorithm consists of a policy function and a Q-value function, which are approximated by an Actor network and a Critic network, respectively. During the DDPG solving process, the Q value converges to the optimal Q value $Q^{*}$ with probability 1; the algorithm performs stably in optimization over continuous action spaces and has good convergence. The specific steps are as follows:

1) Initializing an experience buffer D and the total number N of time slots;

2) Initializing the Actor network μ and the Critic network Q, with network parameters $\theta^{\mu}$ and $\theta^{Q}$;

3) Initializing the Target-Actor network μ′ and the Target-Critic network Q′, with network parameters $\theta^{\mu'}$ and $\theta^{Q'}$;

4) For each round (episode), training is performed by looping the following steps:

(1) Initializing the vehicle communication network environment and selecting the initial state $s_1$;

(2) For each step in the round, the following steps are cycled:

(1) According to the current input state $s_t$, the Actor network outputs the action $a_t$; the immediate reward $r_t$ and the next state $s_{t+1}$ are received, yielding the experience data $(s_t, a_t, r_t, s_{t+1})$;

(2) Storing the experience data $(s_t, a_t, r_t, s_{t+1})$ in the experience buffer D;

(3) When the experience buffer D is full, randomly sampling a mini-batch of experience data $(s_i, a_i, r_i, s_{i+1})$ from the experience buffer D;

(4) Calculating the expected return $y_i$ of the current action through the Target-Critic network:

$y_i = r_i + \gamma Q'(s_{i+1}, \max Q(s_{i+1} \mid \theta^{Q}) \mid \theta^{Q'})$

where $r_i$ denotes the immediate reward earned in time slot i, and $Q'(s_{i+1}, \max Q(s_{i+1} \mid \theta^{Q}) \mid \theta^{Q'})$ denotes the Q value corresponding to entering the next state $s_{i+1}$ on the basis of the action that yields the maximum Q value in time slot i.

(5) Defining the loss function of the Critic network and minimizing it to update the Critic network parameter $\theta^{Q}$:

(6) Updating the Actor network parameters by the following gradient function:

(7) Updating the Target-Actor network and Target-Critic network parameters by:

wherein κ < 1.

(3) The step loop is ended.

5) The round (episode) loop is ended.

6) After training is finished, obtaining the optimal policy that optimizes the unmanned aerial vehicle trajectory and the RIS phase shift matrix so as to maximize the communication rate.
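By way of non-limiting illustration, the Critic loss, the Actor gradient, and the target-network update referred to in steps (5) to (7) are commonly written in the DDPG literature as shown below. These are the standard textbook forms and are assumed here; they are not necessarily the exact expressions of the original text. The mini-batch size $N_b$ is a symbol introduced for this sketch only.

```latex
\begin{align}
% Step (5): mean squared error between the target y_i and the Critic output
L(\theta^{Q}) &= \frac{1}{N_b} \sum_{i} \bigl( y_i - Q(s_i, a_i \mid \theta^{Q}) \bigr)^{2} \\
% Step (6): deterministic policy gradient for the Actor
\nabla_{\theta^{\mu}} J &\approx \frac{1}{N_b} \sum_{i}
  \nabla_{a} Q(s, a \mid \theta^{Q}) \big|_{s = s_i,\, a = \mu(s_i)}
  \, \nabla_{\theta^{\mu}} \mu(s \mid \theta^{\mu}) \big|_{s = s_i} \\
% Step (7): soft update of the target networks with \kappa < 1
\theta^{Q'} &\leftarrow \kappa\, \theta^{Q} + (1 - \kappa)\, \theta^{Q'}, \qquad
\theta^{\mu'} \leftarrow \kappa\, \theta^{\mu} + (1 - \kappa)\, \theta^{\mu'}
\end{align}
```

A small κ makes the target networks track the online networks slowly, which is the usual reason such an update stabilizes training.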

The optimization result obtained by the DDPG algorithm gives the optimized RIS reflecting unit phase shift matrix and unmanned aerial vehicle trajectory variable for each time slot. The RIS phase shift factors are adjusted accordingly and the unmanned aerial vehicle flies to the position to be reached in the next time slot, thereby maximizing the vehicle communication rate in the next time slot and guaranteeing the vehicle communication quality.
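By way of non-limiting illustration, the bookkeeping used while the DDPG algorithm is trained (the experience buffer D, the target value $y_i$, and the target-network update with κ < 1) can be sketched in plain Python. The class and function names are hypothetical, network parameters are represented as flat lists of floats, and the update rule is the standard soft update, which is assumed here rather than taken from the original text.

```python
import random
from collections import deque


class ExperienceBuffer:
    """Fixed-capacity experience buffer D of (s, a, r, s_next) tuples."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = deque(maxlen=capacity)  # oldest entries are dropped first

    def store(self, s, a, r, s_next):
        self.data.append((s, a, r, s_next))

    def is_full(self):
        return len(self.data) == self.capacity

    def sample(self, batch_size, rng=random):
        # Uniform random mini-batch, drawn without replacement.
        return rng.sample(list(self.data), batch_size)


def td_target(r, gamma, q_next):
    """y_i = r_i + gamma * q_next, where q_next comes from the Target-Critic."""
    return r + gamma * q_next


def soft_update(target_params, online_params, kappa):
    """Assumed soft update: theta' <- kappa * theta + (1 - kappa) * theta'."""
    return [kappa * w + (1.0 - kappa) * w_t
            for w, w_t in zip(online_params, target_params)]
```

Sampling uniformly from the buffer breaks the correlation between consecutive time slots, which is the usual motivation for storing experience before training on it.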

An unmanned aerial vehicle-carried RIS-assisted vehicle network communication system, applied to the above unmanned aerial vehicle-carried RIS-assisted vehicle network communication method, comprises a position prediction module and an optimization module. The position prediction module predicts the vehicle positions of the first vehicle and the second vehicle at the next time slot based on information of the specified vehicle state types of the first vehicle and the second vehicle at the current time slot; the positions to which vehicles $k_1$ and $k_2$ travel at time slot n + 1 are obtained from the prediction model. The optimization module optimizes the phase shift matrix of the RIS reflecting unit and the unmanned aerial vehicle trajectory based on the predicted vehicle positions of the first vehicle and the second vehicle at the next time slot.

The reconfigurable intelligent surface (RIS) has become an important transmission technology for improving spectral or energy efficiency in next-generation (beyond-5G (B5G) and 6G) wireless communication networks, with numerous applications in the Internet of Things and vehicle communication systems. An RIS consists of a large number of reflecting elements and passively reflects incident signals, with the RIS controller intelligently adjusting the phase shift of each element, which improves link quality and significantly enhances coverage. Compared with traditional communication assistance technologies such as relays, the RIS consumes less energy because its reflection is passive, and it can operate in full-duplex mode without self-interference. The unmanned aerial vehicle offers high mobility and maneuverability, low cost, and line-of-sight transmission; used as a mobile aerial base station or an airborne relay, it improves ground communication performance when deployed at the optimal flight position.

Compared with the DQN algorithm, the Deep Deterministic Policy Gradient (DDPG) algorithm has better convergence. It comprises two networks, an action (Actor) network and an evaluation (Critic) network: the Actor network generates the current policy, and the Critic network judges the quality of that policy in the current state, so that the DDPG algorithm performs stably in optimization over continuous action spaces. The unmanned aerial vehicle acts as the agent: it selects and executes an action based on the current state, obtains a reward, and updates its policy through this feedback to select the optimal policy, thereby obtaining the optimal RIS phase shift matrix and unmanned aerial vehicle trajectory variable that maximize the communication rate between the vehicles and improve the performance of vehicle network communication.

Therefore, the invention designs an unmanned aerial vehicle-carried RIS-assisted vehicle network communication method and system. In a scenario where an unmanned aerial vehicle carrying an RIS assists vehicle network communication, the quality of the signal reflected to the predicted vehicle positions is enhanced by optimizing the phase shift matrix of the RIS, and the unmanned aerial vehicle trajectory is optimized according to the vehicle position prediction model to determine the position of the unmanned aerial vehicle in each time slot, thereby better serving vehicle communication within the coverage range. Subject to the V2V communication delay and reliability constraints and the unmanned aerial vehicle energy consumption constraint, the DDPG algorithm is used to quickly solve the optimization problem of maximizing the communication rate between the vehicles, yielding the optimal policy for communication between the vehicles and improving the communication performance of the vehicle network.

Although the present invention has been described in detail with reference to the foregoing embodiments, it will be apparent to those skilled in the art that modifications may be made to the embodiments described in the foregoing detailed description, or equivalent changes may be made in some of the features of the embodiments described above. All equivalent structures made by using the contents of the specification and the attached drawings of the invention can be directly or indirectly applied to other related technical fields, and are also within the protection scope of the patent of the invention.

Claims

1. An unmanned aerial vehicle-carried RIS-assisted vehicle network communication method, characterized in that: for two vehicles in a target area whose communication is obstructed, an unmanned aerial vehicle carrying a preset number of RIS reflecting units is used as a communication relay, and the following steps are executed to maximize the communication rate of the two vehicles in each time slot:

step A: predicting the vehicle positions of the first vehicle and the second vehicle at the next time slot, respectively, based on information of the specified vehicle state types of the first vehicle and the second vehicle at the current time slot;

step B: optimizing the phase shift matrix of the RIS reflecting unit and the unmanned aerial vehicle trajectory based on the predicted vehicle positions of the first vehicle and the second vehicle at the next time slot;

step C: adjusting the phase shift matrix of the RIS reflecting unit and the position to which the unmanned aerial vehicle flies in the next time slot, based on the optimized RIS reflecting unit phase shift matrix and unmanned aerial vehicle trajectory.

2. The unmanned aerial vehicle-carried RIS-assisted vehicle network communication method according to claim 1, characterized in that: in step A, based on the information of the specified vehicle state types of the first vehicle and the second vehicle at the current time slot, the vehicle positions of the first vehicle and the second vehicle at the next time slot are predicted by the following formulas, namely the vehicle position of the first vehicle $k_1$ at the next time slot

and the vehicle position of the second vehicle $k_2$ at the next time slot

wherein

indicates the position of the first vehicle;

indicates the position of the second vehicle;

indicates the y-axis coordinate position of the second vehicle $k_2$ at the current time slot n; n + 1 represents the time slot following the current time slot n; $v_1(n)$ represents the speed of the first vehicle at the current time slot n; $v_2(n)$ represents the speed of the second vehicle at the current time slot n; $a_1(n)$ represents the acceleration of the first vehicle at the current time slot n; $a_2(n)$ represents the acceleration of the second vehicle at the current time slot n; $w_1(n)$ represents the angular velocity of the first vehicle at the current time slot n; $w_2(n)$ represents the angular velocity of the second vehicle at the current time slot n; $\theta_1(n)$ represents the deflection angle of the first vehicle at the current time slot n; $\theta_2(n)$ represents the deflection angle of the second vehicle at the current time slot n.
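By way of non-limiting illustration, a one-slot position prediction using the quantities listed above (position, speed, acceleration, angular velocity, and deflection angle) can be sketched as follows. The prediction formulas themselves are not reproduced in this text, so the kinematic model below (constant acceleration along a heading taken at mid-slot) is an assumption of this sketch, as are the function name `predict_next_position` and the slot length parameter `tau`.

```python
import math


def predict_next_position(x, y, v, a, w, theta, tau):
    """Predict the (x, y) position one time slot ahead.

    Assumed model: the vehicle covers d = v*tau + 0.5*a*tau**2 along a
    heading equal to the deflection angle advanced by half a slot of
    rotation at angular velocity w.
    """
    distance = v * tau + 0.5 * a * tau ** 2
    heading = theta + 0.5 * w * tau  # mid-slot heading approximation
    return (x + distance * math.cos(heading),
            y + distance * math.sin(heading))
```

For example, a vehicle at the origin with speed 10, acceleration 2, zero angular velocity and zero deflection angle is predicted to lie 11 units along the x-axis after a slot of length 1.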

3. The unmanned aerial vehicle-carried RIS-assisted vehicle network communication method according to claim 1, characterized in that: the information of the specified vehicle state types includes the position, speed, deflection angle, and angular acceleration of the vehicle at the current time slot.

4. The unmanned aerial vehicle-carried RIS-assisted vehicle network communication method according to claim 1, characterized in that: in step B, the following steps are specifically executed to obtain the optimized RIS reflecting unit phase shift matrix and unmanned aerial vehicle trajectory:

5. The unmanned aerial vehicle-carried RIS-assisted vehicle network communication method according to claim 4, characterized in that: in step B1, with the unmanned aerial vehicle carrying the RIS reflecting units as a communication relay, the communication rate model of the two vehicles is expressed as follows:

wherein

in the formula, $R_b[i]$ represents the communication rate of the two vehicles at time slot i, namely the communication rate model between the two vehicles; $B_w$ is the bandwidth;

indicates the transmit power of the first vehicle $k_1$; $h_1[i]$ indicates the channel gain between the first vehicle $k_1$ and the RIS reflecting unit at time slot i; $h_2[i]$ indicates the channel gain between the second vehicle $k_2$ and the RIS reflecting unit at time slot i; $\Theta[i]$ represents the RIS reflecting unit phase shift matrix at time slot i; $\sigma^2$ represents the noise power; ρ represents the path loss at a reference distance of 1 m; α represents the path loss exponent; λ represents the carrier wavelength;

indicates the cosine of the angle of the signal from the unmanned aerial vehicle to the second vehicle $k_2$; $x_A[i]$ represents the x-axis coordinate position of the unmanned aerial vehicle at time slot i, $y_A[i]$ represents the y-axis coordinate position of the unmanned aerial vehicle at time slot i, and H represents the z-axis coordinate position of the unmanned aerial vehicle at time slot i, which is a fixed height;

indicates the z-axis coordinate position of the first vehicle $k_1$ at time slot i;

indicates the z-axis coordinate position of the second vehicle $k_2$ at time slot i.
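By way of non-limiting illustration, a rate computation built from the symbols named above (bandwidth, transmit power, the two channel gains, the RIS phase shifts, and the noise power) can be sketched as follows. The rate formula is not reproduced in this text, so the Shannon-type expression with a cascaded vehicle-RIS-vehicle channel and unit-modulus phase shifts $e^{j\theta_m}$ is an assumption of this sketch. Path loss and array geometry are assumed to be already folded into the per-element channel values `h1` and `h2`.

```python
import cmath
import math


def communication_rate(bandwidth_hz, tx_power, h1, h2, phase_angles, noise_power):
    """Assumed model: rate = B * log2(1 + P * |h2^H Theta h1|^2 / sigma^2).

    h1, h2: per-element complex channel values (vehicle to RIS, RIS to vehicle).
    phase_angles: per-element phase shifts theta_m, i.e. Theta = diag(e^{j*theta_m}).
    """
    # Cascaded channel: sum over RIS elements of conj(h2_m) * e^{j*theta_m} * h1_m.
    reflected = sum(
        h2_m.conjugate() * cmath.exp(1j * theta_m) * h1_m
        for h1_m, h2_m, theta_m in zip(h1, h2, phase_angles)
    )
    snr = tx_power * abs(reflected) ** 2 / noise_power
    return bandwidth_hz * math.log2(1.0 + snr)
```

Under this assumed model the rate grows when the phase shifts make the per-element terms add in phase, which is the effect that optimizing the phase shift matrix aims for.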

6. The unmanned aerial vehicle-carried RIS-assisted vehicle network communication method according to claim 4, characterized in that: in step B1, the optimization problem for the communication of the two vehicles in the next time slot is to maximize the communication rate of the two vehicles in the time slot following the current time slot, expressed as follows:

s.t.

$E_{UAV} \le E_{MAX}$

$\| l[n+1] - l[n] \|^2 \le D_l^2, \quad n = 1, 2, \ldots, N - 1$

$\| l[1] - l_0 \|^2 \le D_l^2$

indicates the transmit power of the first vehicle $k_1$; $h_1[n+1]$ indicates the channel gain between the first vehicle $k_1$ and the RIS reflecting unit at time slot n + 1; $h_2[n+1]$ indicates the channel gain between the second vehicle $k_2$ and the RIS reflecting unit at time slot n + 1; $\sigma^2$ represents the noise power;

indicates the delivery probability; ΔT is the communication duration of the two vehicles with the unmanned aerial vehicle as a communication relay; B represents the total size of the data generated by the communication of the two vehicles within the time ΔT; N represents the total number of communication time slots; $P_{th}$ represents a preset delivery probability threshold; $E_{UAV}$ represents the energy consumption of the unmanned aerial vehicle in the current time slot; $E_{MAX}$ represents the maximum energy consumption of the unmanned aerial vehicle; $l[n]$ represents the position of the unmanned aerial vehicle at the current time slot n; $l[n+1]$ represents the position of the unmanned aerial vehicle at time slot n + 1; $l[1]$ represents the position of the unmanned aerial vehicle at the first time slot; $l_0$ represents the initial position of the unmanned aerial vehicle; $D_l = V_{max}\tau$, where $V_{max}$ is the maximum flight speed of the unmanned aerial vehicle and τ is the length of a time slot.
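By way of non-limiting illustration, the energy and displacement constraints listed above can be checked for a candidate trajectory as follows. The function names are hypothetical, positions are given as coordinate tuples, and the delivery probability constraint is omitted because its expression is not reproduced in this text.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two coordinate tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def trajectory_feasible(l0, track, v_max, tau, e_uav, e_max):
    """Check E_UAV <= E_MAX and the per-slot displacement limits.

    track holds l[1], ..., l[N]; l0 is the initial position of the UAV.
    """
    d_l_sq = (v_max * tau) ** 2  # D_l^2 with D_l = V_max * tau
    points = [l0] + list(track)
    # Pairs (l0, l[1]) and (l[n], l[n+1]) for n = 1, ..., N - 1.
    steps_ok = all(
        squared_distance(p, q) <= d_l_sq
        for p, q in zip(points, points[1:])
    )
    return steps_ok and e_uav <= e_max
```

For instance, with $V_{max} = 5$ and τ = 1, moving from (0, 0) to (3, 4) to (6, 8) satisfies the displacement limits exactly, since each move has length 5.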

7. The unmanned aerial vehicle-carried RIS-assisted vehicle network communication method according to claim 4, characterized in that: in step B2, based on the predicted vehicle positions of the first vehicle and the second vehicle at the next time slot, the optimization problem for the communication of the two vehicles in the next time slot is solved using the DDPG algorithm to obtain the optimized RIS reflecting unit phase shift matrix and unmanned aerial vehicle trajectory.

8. The unmanned aerial vehicle-carried RIS-assisted vehicle network communication method according to claim 7, characterized in that: for the optimization problem for the communication of the two vehicles in the next time slot, in the DDPG algorithm the unmanned aerial vehicle serves as the agent, and the communication environment of the unmanned aerial vehicle and the two vehicles serves as the DDPG algorithm environment:

wherein $h_1[t]$ indicates the channel gain between the first vehicle $k_1$ and the RIS reflecting unit at time slot t; $h_2[t]$ indicates the channel gain between the second vehicle $k_2$ and the RIS reflecting unit at time slot t; $R_b(t)$ represents the communication rate of the two vehicles at time slot t; $b_k$ represents the remaining payload in the environment state space at time slot t; s(t) represents the state of the environment state space at time slot t;

based on the state of the environment, the actions of the agent in the environment include determining the RIS phase shift matrix and the unmanned aerial vehicle trajectory, i.e. $a(t) = \{\Phi, L\}$;

wherein

represents the RIS reflecting unit phase shift matrix at time slot t;

wherein r(t) represents the reward at time slot t;

wherein π represents the policy, namely the action taken by the agent in the environment; E represents the expectation; $\gamma \in [0, 1]$ represents the discount factor applied to r(t); and $r(s_t, a_t)$ represents the immediate reward for taking action a in environment state s at time slot t + i.

9. The unmanned aerial vehicle-carried RIS-assisted vehicle network communication method according to claim 8, characterized in that: in the DDPG algorithm, the optimal policy $\pi^{*}$ is obtained through the objective function $\pi^{*} = \arg\max Q^{\pi}(s, a)$, where π represents a policy, s represents a state of the environment, and a represents an action of the agent in the environment.

10. An unmanned aerial vehicle-carried RIS-assisted vehicle network communication system, applied to the unmanned aerial vehicle-carried RIS-assisted vehicle network communication method according to claim 1, characterized in that: the system comprises a position prediction module and an optimization module; the position prediction module predicts the vehicle positions of the first vehicle and the second vehicle at the next time slot, respectively, based on information of the specified vehicle state types of the first vehicle and the second vehicle at the current time slot; the optimization module optimizes the phase shift matrix of the RIS reflecting unit and the unmanned aerial vehicle trajectory based on the predicted vehicle positions of the first vehicle and the second vehicle at the next time slot.