CN114598702A - VR (virtual reality) service unmanned aerial vehicle edge calculation method based on deep learning - Google Patents
- Publication number
- CN114598702A (application number CN202210172797.8A)
- Authority
- CN
- China
- Prior art keywords
- rendering
- aerial vehicle
- unmanned aerial
- representing
- drone
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N7/00—Computing arrangements based on specific mathematical models
- G06N7/01—Probabilistic graphical models, e.g. probabilistic networks
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention provides a VR service unmanned aerial vehicle edge calculation method based on deep learning, relating to the field of unmanned aerial vehicle virtual reality technology, and comprising the following steps. S1: rendering in a drone MEC system through a preset VR rendering mode, wherein the drone MEC system comprises a drone and a plurality of VR devices. S2: obtaining the delay and energy consumption of each VR device in VR service rendering according to step S1, and judging the VR service requested by a VR device as finished when its rendering delay does not exceed a set value; the VR service rendering completion rate over T time slots is then optimized through a preset optimization flow, under the constraint that the total energy consumption of each VR device is less than or equal to a given threshold. With this method, the drone's flight trajectory and the VR service rendering mode can be jointly optimized under the constraints of the VR service characteristics and device energy, maximizing the rendering completion rate of VR services and improving the robustness of the system.
Description
Technical Field
The invention relates to the technical field of unmanned aerial vehicle virtualization, in particular to a VR (virtual reality) service unmanned aerial vehicle edge calculation method based on deep learning.
Background
With the development and commercialization of 5G technology, the VR (Virtual Reality) applications it supports bring brand-new technological life experiences to people. At present, simulated scenes are one of the key applications of VR technology; their implementation involves diverse foreground interaction information and rich background environment rendering, which requires VR devices to have sufficient energy, memory, and computing resources, and to guarantee real-time processing so as to improve the user's immersive quality of experience. However, as VR service data traffic grows significantly, the limited computing power of portable VR devices cannot complete processing within the specified delay threshold, making it difficult to meet users' quality-of-experience requirements. To address this challenge, edge computing servers are deployed at the edge of the wireless network, bringing computing resources closer to the VR devices and providing an offloading service that helps VR devices complete the rendering process in real time.
The unmanned aerial vehicle is widely applied by virtue of the characteristics of high flexibility, low deployment cost and the like, and the unmanned aerial vehicle carrying the edge computing platform, namely the Mobile Edge Computing (MEC) unmanned aerial vehicle is deployed in a wireless network, so that the deployment cost of network fixed infrastructure can be saved, and the movable high-performance computing power is provided for VR users as required.
In existing research, a drone acting as a mobile computing server has been proposed to help users complete computing tasks, jointly optimizing resource scheduling and the drone trajectory to minimize the total weighted energy consumption of the drone and users; however, the delay sensitivity of computing tasks is not considered, so this approach cannot be applied to VR services. Other work, targeting the ultra-large computation volume and ultra-low delay of VR services, studies the content transmission problem of VR users, using caching to improve VR system performance and striking a balance among communication, computation, and caching. However, because of the diversity of foreground interaction information, caching pre-rendered background content cannot fundamentally solve the real-time requirement of rendering processing.
Disclosure of Invention
The problem to be solved by the invention is how to perform joint optimization on the flight path of the unmanned aerial vehicle and the VR service rendering mode under the constraint of VR service characteristics and equipment energy, so that the rendering completion rate of VR service is maximized and the robustness of the system is improved.
In order to solve the problems, the invention provides a VR service unmanned aerial vehicle edge calculation method based on deep learning, which comprises the following steps:
s1: rendering in an unmanned aerial vehicle MEC system through a preset VR rendering mode, wherein the unmanned aerial vehicle MEC system comprises an unmanned aerial vehicle and a plurality of VR devices;
s2: obtaining delay and energy consumption of the VR device in VR service rendering according to the step S1, and determining that the VR service requested by the VR device is finished when the rendering delay does not exceed a set value; optimizing VR service rendering completion rate in T time slots through a preset optimization process, wherein the constraint condition is that the total energy consumption of each VR device is less than or equal to a given threshold;
s3: modeling the preset optimization flow as a Markov decision process, wherein the state of the drone MEC system comprises the energy of the users' VR devices and the position of the drone, the actions taken by the drone MEC system comprise selecting the drone's flight trajectory and the rendering mode, and the expected optimal policy is obtained through the MDP optimization objective.
In the method, the drone MEC system for VR service consists of one drone and multiple VR devices. Suppose each time slot in the system has the same length T_max, the drone's horizontal position coordinates are l(t) = [x(t), y(t)], its flying height is H, and the position of each VR device is c_n = [x_n, y_n]. When the distance from a VR device to the drone does not exceed the coverage radius R, i.e. ||l(t) - c_n|| <= R, the drone can provide service to that VR device. Here b_n(t) ∈ {0,1} represents the association state of device n with the drone: 1 if associated, 0 otherwise. Three VR rendering modes are provided to the user (a local rendering mode, a remote rendering mode, and a local-and-remote joint rendering mode), plus a non-rendering mode, in which the VR device does not render the requested VR task; in the non-rendering mode the rendering delay is set to 10*T_max, i.e. 10 times the delay threshold of the VR task, and the energy consumption of the VR device is zero. Under the constraints of the VR service characteristics and the VR device energy, the drone's flight trajectory and the VR rendering mode are jointly optimized to maximize the rendering completion rate of VR tasks. The problem is modeled as a Markov decision process; under a deep reinforcement learning framework, a twin delayed deep deterministic policy gradient algorithm schedules the drone and selects the VR rendering mode, finding an optimal policy that satisfies randomly arriving VR service requests.
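The coverage and association rule described above can be sketched as follows; the helper name `associated` and the coordinate-tuple inputs are illustrative, not from the patent.

```python
import math

def associated(l_t, c_n, R):
    """b_n(t): 1 if VR device n lies within the drone's coverage radius R,
    i.e. ||l(t) - c_n|| <= R, else 0.  Horizontal distance only; the fixed
    flying height H enters the channel model, not the association rule."""
    dist = math.hypot(l_t[0] - c_n[0], l_t[1] - c_n[1])
    return 1 if dist <= R else 0
```

The sum of `associated` over all devices gives N(t), the number of devices the drone serves in slot t.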
Further, the preset VR rendering mode in step S1 includes a local rendering mode, a remote rendering mode, and a local and remote joint rendering mode.
Further, in the local rendering mode each VR device simultaneously renders the foreground interaction information and the background environment information, and the time to complete rendering within a time slot is represented as:

t_n^loc(t) = mu_n * (D_n^f(t) + D_n^b(t)) / gamma_n,

where t denotes a time slot, n denotes a VR device, and gamma_n represents the computing power (CPU frequency) of VR device n; D_n^f(t) represents the rendering data, in bits, of the foreground interaction information required by VR device n to generate the service in time slot t; D_n^b(t) represents the rendering data, in bits, of the background environment information required by VR device n to generate the service in time slot t; and mu_n represents the CPU cycles required by VR device n to render one bit of data.

The energy consumed to complete rendering within the slot is expressed as:

E_n^loc(t) = kappa * mu_n * (D_n^f(t) + D_n^b(t)) * gamma_n^2,

where kappa denotes the effective capacitance coefficient of the device's processor chip.
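A minimal sketch of the local-mode delay and energy computation described above, assuming the standard cycles-over-frequency delay model and the CMOS energy model E = kappa * cycles * f^2 with an assumed capacitance coefficient `kappa` (the original formula images are not reproduced here):

```python
def local_rendering(D_f, D_b, mu_n, gamma_n, kappa=1e-28):
    """Local rendering of foreground (D_f bits) and background (D_b bits)
    on VR device n.  gamma_n: CPU frequency in cycles/s; mu_n: cycles/bit;
    kappa: assumed effective-capacitance coefficient of the device chip."""
    cycles = mu_n * (D_f + D_b)            # total CPU cycles for the task
    t_loc = cycles / gamma_n               # rendering completion time
    e_loc = kappa * cycles * gamma_n ** 2  # device energy consumption
    return t_loc, e_loc
```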
Further, the drone in the remote rendering mode performs a process comprising the steps of:
s11: acquiring foreground interaction information and background environment information through VR equipment, and transmitting the foreground interaction information and the background environment information to the unmanned aerial vehicle;
s12: rendering foreground interaction information and background environment information through an unmanned aerial vehicle (MEC) system;
s13: compressing and coding the rendered information by the unmanned aerial vehicle and transmitting the information to VR equipment of a user;
s14: the VR device receives, decodes, and applies the rendered information.
Further, the time to complete rendering in the remote rendering mode is expressed as:

t_n^re(t) = t_n^up(t) + t_n^mec(t) + t_n^comp(t) + t_n^down(t) + t_n^dec(t),

where t_n^up(t) represents the time required for the VR device to upload the foreground interaction information to the drone, t_n^mec(t) represents the rendering processing time of the drone MEC system, t_n^comp(t) represents the drone's encoding and compression time, t_n^down(t) represents the time required for the drone to transmit the rendered information to the user's VR device, t_n^dec(t) represents the time required for decoding by the VR device, and n denotes the VR device.

The energy consumed to complete rendering in the remote rendering mode is expressed as:

E_n^re(t) = E_n^up(t) + E_n^dec(t),

where E_n^up(t) represents the energy consumed by the user's VR device to upload the foreground interaction information to the drone, and E_n^dec(t) represents the energy consumed by the VR device in decoding.
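The remote-mode totals above (five delay stages from steps S11 to S14, and the two device-side energy terms) can be sketched as; the function names are illustrative:

```python
def remote_rendering_delay(t_up, t_mec, t_comp, t_down, t_dec):
    """Remote mode delay: upload + drone rendering + encoding/compression
    + downlink transmission + on-device decoding (steps S11-S14)."""
    return t_up + t_mec + t_comp + t_down + t_dec

def remote_rendering_energy(e_up, e_dec):
    """Device-side energy: only uploading and decoding consume VR-device
    energy; rendering and compression run on the drone."""
    return e_up + e_dec
```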
In uplink transmission, the foreground interaction information is transmitted over the Sub-6 GHz band, so the received signal-to-noise ratio of VR device n at the drone, Gamma_n^up(t), is:

Gamma_n^up(t) = P_n * g_n(t) * h_n^up(t) / (B_n(t) * N_0),

where P_n represents the transmission power of the VR device, B_n(t) denotes the bandwidth allocated to the VR device in the time slot, g_n(t) denotes the small-scale fading channel gain between the VR device and the drone, and h_n^up(t) represents the large-scale fading effect between the VR device and the drone:

h_n^up(t) = beta_up * d_n(t)^(-alpha_up),

where beta_up represents a constant related to the carrier frequency, alpha_up represents the path-loss exponent, d_n(t) represents the distance between the drone and the VR device in the time slot, and N_0 represents the white-noise power spectral density.

The VR devices share the uplink bandwidth by frequency-division multiplexing; according to the Shannon capacity formula, the upload data rate of the VR device in time slot t, r_n^up(t), is:

r_n^up(t) = (B_up / N(t)) * log2(1 + Gamma_n^up(t)),

where B_up represents the channel bandwidth and N(t) = sum_n b_n(t) represents the total number of VR devices associated with the drone in time slot t.
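A rough sketch of the FDM bandwidth split and Shannon-rate computation above; an equal bandwidth split among associated devices and the power-law large-scale fading model are assumptions made explicit here:

```python
import math

def uplink_rate(B_up, num_assoc, p_n, g_n, beta_up, alpha_up, d_n, N0):
    """Upload data rate of one VR device under FDM sharing.
    B_up: total uplink bandwidth; num_assoc: N(t), devices sharing it;
    p_n: device transmit power; g_n: small-scale fading gain;
    beta_up * d^(-alpha_up): large-scale fading; N0: noise PSD."""
    B_n = B_up / num_assoc                 # per-device bandwidth (assumed equal split)
    h = beta_up * d_n ** (-alpha_up)       # large-scale fading
    snr = p_n * g_n * h / (B_n * N0)       # received SNR at the drone
    return B_n * math.log2(1 + snr)        # Shannon capacity
```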
The uplink transmission delay is:

t_n^up(t) = D_n^f(t) / r_n^up(t),

where D_n^f(t) represents the size of the foreground interaction information transmitted by the user's VR device to the drone, and the corresponding uplink transmission energy consumption is E_n^up(t) = P_n * t_n^up(t). The drone's rendering processing delay is:

t_n^mec(t) = mu_uav * (D_n^f(t) + D_n^b(t)) / gamma_n^uav,

where gamma_uav represents the computing power of the drone, mu_uav represents the CPU cycles required by the drone to render one bit of data, D_n^f(t) + D_n^b(t) represents the rendering content, and gamma_n^uav = gamma_uav / N(t) represents the computing resources allocated to each associated VR device.
In encoding compression and downlink transmission, the drone MEC system compresses and encodes the rendered information before transmission, incurring a compression delay t_n^comp(t) that depends on the compressed data size, where D_n^c(t) denotes the size of the compressed data information.

In data decoding, the VR device receives the encoded rendering information transmitted by the drone and incurs a decoding delay t_n^dec(t). The data decoded by the VR device is the data information obtained over the downlink, and the energy consumed by the VR device in decoding is denoted E_n^dec(t).
In the above method, the downlink wireless channel is assumed to be a line-of-sight link and small-scale fading is ignored. The received signal-to-noise ratio of the VR device is:

Gamma_n^down(t) = P_uav * h_n(t) * h_n^down(t) / (B_down * N_0),

where P_uav represents the transmission power of the drone, h_n(t) represents the antenna gain corresponding to beamforming, h_n^down(t) = beta_down * d_n(t)^(-alpha_down) represents the path loss, beta_down represents a frequency-related constant, and alpha_down represents the path-loss exponent. The downlink transmission rate and the downlink transmission delay are then, respectively:

r_n^down(t) = B_down * log2(1 + Gamma_n^down(t)),   t_n^down(t) = D_n^c(t) / r_n^down(t),

where B_down denotes the downlink channel bandwidth.
Further, in the local-and-remote joint rendering mode the foreground interaction information is rendered on the VR device while the background environment is rendered on the drone. The rendering completion time of this mode is expressed as:

t_n^jo(t) = max( t_n^f,loc(t), t_n^b,re(t) ) + t_n^int(t),

where t_n^f,loc(t) represents the time required for local rendering, obtained by substituting the foreground interaction data D_n^f(t) into the local rendering formula; t_n^b,re(t) represents the time required for remote rendering, obtained by substituting the background environment data D_n^b(t) into the remote rendering formula; and t_n^int(t) represents the delay for the VR device to render and integrate the foreground interaction information with the background environment information. The total energy consumption is expressed as:

E_n^jo(t) = E_n^f,loc(t) + E_n^b,re(t) + E_n^int(t),

where E_n^int(t) represents the energy consumed by the VR device to integrate the rendering of the foreground interaction information and the background environment information.
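A sketch of the joint-mode completion time described above; treating the local foreground branch and the remote background branch as running in parallel (so the slower branch dominates) is our assumption, with the integration delay added at the end:

```python
def joint_rendering_delay(t_fg_local, t_bg_remote, t_integrate):
    """Local-and-remote joint mode: foreground rendered on the device,
    background on the drone, then the device integrates both results.
    Assumes the two rendering branches run in parallel."""
    return max(t_fg_local, t_bg_remote) + t_integrate
```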
Further, from the delay and energy consumption obtained in step S2, a binary parameter delta_n(t) indicates whether the VR service rendering of VR device n is completed:

delta_n(t) = I{ t_n^o(t) <= T_max },

where t_n^o(t) is the rendering delay under the selected mode o_n(t). Further, eta_n(t) ∈ {0,1} indicates whether VR device n has a VR service request in time slot t: eta_n(t) = 1 indicates a request and eta_n(t) = 0 indicates no request. The rendering completion rate of all VR devices in each time slot is then expressed as:

r(t) = sum_n eta_n(t) * delta_n(t) / sum_n eta_n(t),

where the drone trajectory L = [l(1), …, l(T)] and the selection of user rendering modes O = [o_1(1), …, o_N(1), …, o_1(T), …, o_N(T)] are the optimization variables; the optimization objective is the VR service rendering completion rate over T time slots, and the constraint is that the total energy consumption of each VR device is less than or equal to a given threshold E_th.

In the above process, I{x} denotes an indicator function that equals 1 when x is true and 0 otherwise. A VR rendering task requested by a VR device is deemed complete when its rendering delay does not exceed T_max.
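The per-slot completion rate over requesting devices can be sketched as follows (function and argument names are illustrative):

```python
def completion_rate(delays, requests, T_max):
    """Per-slot rendering completion rate: among devices with a request
    (eta_n(t) = 1), the fraction whose rendering delay is at most T_max.
    Devices without a request are excluded from both numerator and
    denominator; returns 0.0 if no device made a request."""
    done = sum(1 for d, req in zip(delays, requests) if req and d <= T_max)
    total = sum(requests)
    return done / total if total else 0.0
```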
Further, the Markov decision process in step S3 models the preset optimization flow. The energy state of the users' VR devices is represented as:

e(t) = [e_1(t), …, e_n(t), …, e_N(t)] ∈ [0, E_max]^N;

the drone position is represented as:

l(t) = [x(t), y(t)],

where E_max is the initial energy of each VR device.

The flight action of the drone is represented as:

d(t) = (k_1(t), k_2(t)), with k_1(t) ∈ [0, 2π] and k_2(t) ∈ [0, D_max],

where k_1(t) indicates the drone's flight direction, k_2(t) represents the drone's flight distance, and D_max represents the maximum distance the drone can fly in each time slot. The rendering-mode action is represented as:

O(t) = [o_1(t), …, o_N(t)].

The update process of the drone MEC system state is represented as:

l(t+1) = l(t) + [k_2(t)·cos(k_1(t)), k_2(t)·sin(k_1(t))];

according to the selected preset VR rendering mode, the energy update of each user's VR device is represented as:

e_n(t+1) = e_n(t) - E_n(t).
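The MDP state transition above (drone position advanced by the flight action, per-device energy reduced by the mode-dependent consumption) can be sketched as; `step` is an illustrative name:

```python
import math

def step(l_t, e_t, k1, k2, E_t):
    """One MDP transition: the drone moves distance k2 in direction k1
    (l(t+1) = l(t) + [k2*cos(k1), k2*sin(k1)]), and each device's
    remaining energy drops by the energy E_n(t) spent under the chosen
    rendering mode (e_n(t+1) = e_n(t) - E_n(t))."""
    l_next = (l_t[0] + k2 * math.cos(k1), l_t[1] + k2 * math.sin(k1))
    e_next = [e - used for e, used in zip(e_t, E_t)]
    return l_next, e_next
```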
In the method, the Markov decision process modeling the preset optimization flow further comprises a reward function, through which the expected optimal policy is obtained by interaction with the environment. Given the state s(t) and action a(t) in time slot t, the reward function is defined as the VR service rendering completion degree plus a penalty term:

R(t) = r(t) + g(t),

where g(t) is used to guide the drone MEC system toward a better policy and increase the long-term return: a penalty is added when an action causes the drone MEC system to violate the constraints, i.e. g(t) < 0, and g(t) = 0 if the constraints are satisfied. The long-term discounted reward obtained by the system under policy pi can be expressed as:

J(pi) = E_pi [ sum_{t=0}^{T} gamma^t * r_t(s_t, a_t) ],

where E_pi represents the expectation under policy pi, gamma ∈ [0,1] represents the discount factor, and r_t(s_t, a_t) denotes the immediate reward. The MDP optimization objective is to find the long-term optimal policy pi* that maximizes the long-term return.
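The discounted long-term return that the TD3 policy maximizes, with each per-slot reward being the completion degree plus the penalty term, can be computed as:

```python
def discounted_return(rewards, gamma):
    """Sum of gamma^t * r_t over a trajectory of per-slot rewards
    r_t = r(t) + g(t); gamma in [0, 1] is the discount factor."""
    return sum((gamma ** t) * r for t, r in enumerate(rewards))
```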
The technical scheme adopted by the invention has the following beneficial effects:
According to the invention, through Markov decision process modeling and the deep reinforcement learning TD3 (twin delayed deep deterministic policy gradient) algorithm, the drone's flight trajectory and the preset VR service rendering mode are jointly optimized, the rendering tasks borne by the VR devices and the drone are optimally allocated, and the drone's computing resources are reasonably distributed over the long term to the covered VR users, maximizing the utilization of computing resources and thereby maximizing the VR service rendering completion rate.
Drawings
Fig. 1 is a first flowchart of a VR service unmanned aerial vehicle edge calculation method based on deep learning according to an embodiment of the present invention;
fig. 2 is a second flowchart of a VR service unmanned aerial vehicle edge calculation method based on deep learning according to an embodiment of the present invention;
fig. 3 is a schematic diagram comparing convergence of the TD3 algorithm and the DDPG algorithm in the VR service unmanned aerial vehicle edge calculation method based on deep learning according to the embodiment of the present invention;
fig. 4 is a schematic comparison of different rendering modes under the TD3 algorithm in the VR service unmanned aerial vehicle edge calculation method based on deep learning according to the embodiment of the present invention;
fig. 5 is a schematic diagram illustrating a relationship between VR service completion rate and unmanned aerial vehicle computing power in a VR service unmanned aerial vehicle edge computing method based on deep learning according to an embodiment of the present invention;
fig. 6 is a schematic diagram illustrating a relationship between VR service completion rate and user number in the VR service unmanned aerial vehicle edge calculation method based on deep learning according to the embodiment of the present invention.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in detail below.
The following are specific embodiments of the present invention and are further described with reference to the drawings, but the present invention is not limited to these embodiments.
Examples
The embodiment provides a VR service unmanned aerial vehicle edge calculation method based on deep learning, as shown in fig. 1 and 2, the method includes the steps of:
s1: rendering in an unmanned aerial vehicle MEC system through a preset VR rendering mode, wherein the unmanned aerial vehicle MEC system comprises an unmanned aerial vehicle and a plurality of VR devices;
s2: obtaining delay and energy consumption of the VR device in VR service rendering according to the step S1, and determining that the VR service requested by the VR device is finished when the rendering delay does not exceed a set value; optimizing VR service rendering completion rate in T time slots through a preset optimization process, wherein the constraint condition is that the total energy consumption of each VR device is less than or equal to a given threshold;
s3: modeling the preset optimization flow as a Markov decision process, wherein the state of the drone MEC system comprises the energy of the users' VR devices and the position of the drone, the actions taken by the drone MEC system comprise selecting the drone's flight trajectory and the rendering mode, and the expected optimal policy is obtained through the MDP optimization objective.
Specifically, the drone MEC system for VR service consists of one drone and multiple VR devices. Suppose each time slot in the system has the same length T_max, the drone's horizontal position coordinates are l(t) = [x(t), y(t)], its flying height is H, and the position of each VR device is c_n = [x_n, y_n]. When the distance from a VR device to the drone does not exceed the coverage radius R, i.e. ||l(t) - c_n|| <= R, the drone can provide service to that VR device. Here b_n(t) ∈ {0,1} represents the association state of device n with the drone: 1 if associated, 0 otherwise. Three VR rendering modes are provided to the user (a local rendering mode, a remote rendering mode, and a local-and-remote joint rendering mode), plus a non-rendering mode, in which the VR device does not render the requested VR task; in the non-rendering mode the rendering delay is set to 10*T_max, i.e. 10 times the delay threshold of the VR task, and the energy consumption of the VR device is zero. Under the constraints of the VR service characteristics and the VR device energy, the drone's flight trajectory and the VR rendering mode are jointly optimized to maximize the rendering completion rate of VR tasks. The problem is modeled as a Markov decision process; under a deep reinforcement learning framework, a twin delayed deep deterministic policy gradient algorithm schedules the drone and selects the VR rendering mode, finding an optimal policy that satisfies randomly arriving VR service requests.
The preset VR rendering mode in step S1 includes a local rendering mode, a remote rendering mode, and a local and remote joint rendering mode.
Wherein, in the local rendering mode each VR device simultaneously renders the foreground interaction information and the background environment information, and the rendering completion time in the time slot is represented as:

t_n^loc(t) = mu_n * (D_n^f(t) + D_n^b(t)) / gamma_n,

where t denotes a time slot, n denotes a VR device, and gamma_n represents the computing power (CPU frequency) of VR device n; D_n^f(t) represents the rendering data, in bits, of the foreground interaction information required by VR device n to generate the service in time slot t; D_n^b(t) represents the rendering data, in bits, of the background environment information required by VR device n to generate the service in time slot t; and mu_n represents the CPU cycles required by VR device n to render one bit of data.

The energy consumed to complete rendering within the slot is expressed as:

E_n^loc(t) = kappa * mu_n * (D_n^f(t) + D_n^b(t)) * gamma_n^2,

where kappa denotes the effective capacitance coefficient of the device's processor chip.
Referring to fig. 2, in the remote rendering mode, the execution of the drone includes the steps of:
s11: acquiring foreground interaction information and background environment information through VR equipment, and transmitting the foreground interaction information and the background environment information to the unmanned aerial vehicle;
s12: rendering foreground interaction information and background environment information through an unmanned aerial vehicle (MEC) system;
s13: compressing and coding the rendered information by the unmanned aerial vehicle and transmitting the information to VR equipment of a user;
s14: the VR device receives, decodes, and applies the rendered information.
Wherein, the rendering completion time of the remote rendering mode is represented as:

t_n^re(t) = t_n^up(t) + t_n^mec(t) + t_n^comp(t) + t_n^down(t) + t_n^dec(t),

where t_n^up(t) represents the time required for the VR device to upload the foreground interaction information to the drone, t_n^mec(t) represents the rendering processing time of the drone MEC system, t_n^comp(t) represents the drone's encoding and compression time, t_n^down(t) represents the time required for the drone to transmit the rendered information to the user's VR device, t_n^dec(t) represents the time required for decoding by the VR device, and n denotes the VR device.

The energy consumed to complete rendering in the remote rendering mode is represented as:

E_n^re(t) = E_n^up(t) + E_n^dec(t),

where E_n^up(t) represents the energy consumed by the user's VR device to upload the foreground interaction information to the drone, and E_n^dec(t) represents the energy consumed by the VR device in decoding.
In uplink transmission, the foreground interaction information is transmitted over the Sub-6 GHz band, so the received signal-to-noise ratio of VR device n at the drone, Gamma_n^up(t), is:

Gamma_n^up(t) = P_n * g_n(t) * h_n^up(t) / (B_n(t) * N_0),

where P_n represents the transmission power of the VR device, B_n(t) denotes the bandwidth allocated to the VR device in the time slot, g_n(t) denotes the small-scale fading channel gain between the VR device and the drone, and h_n^up(t) represents the large-scale fading effect between the VR device and the drone:

h_n^up(t) = beta_up * d_n(t)^(-alpha_up),

where beta_up represents a constant related to the carrier frequency, alpha_up represents the path-loss exponent, d_n(t) represents the distance between the drone and the VR device in the time slot, and N_0 represents the white-noise power spectral density.

The VR devices share the uplink bandwidth by frequency-division multiplexing; according to the Shannon capacity formula, the upload data rate of the VR device in time slot t, r_n^up(t), is:

r_n^up(t) = (B_up / N(t)) * log2(1 + Gamma_n^up(t)),

where B_up represents the channel bandwidth and N(t) = sum_n b_n(t) represents the total number of VR devices associated with the drone in time slot t.
The uplink transmission delay is:

t_n^up(t) = D_n^f(t) / r_n^up(t),

where D_n^f(t) represents the size of the foreground interaction information transmitted by the user's VR device to the drone. The uplink transmission energy consumption is E_n^up(t) = P_n * t_n^up(t). The drone's rendering processing delay is:

t_n^mec(t) = mu_uav * (D_n^f(t) + D_n^b(t)) / gamma_n^uav,

where gamma_uav represents the computing power of the drone, mu_uav represents the CPU cycles required by the drone to render one bit of data, D_n^f(t) + D_n^b(t) represents the rendering content, and gamma_n^uav = gamma_uav / N(t) represents the computing resources allocated to each associated VR device.
In encoding compression and downlink transmission, the drone MEC system compresses and encodes the rendered information before transmission, incurring a compression delay t_n^comp(t) that depends on the compressed data size D_n^c(t).

In data decoding, the VR device receives the encoded rendering information transmitted by the drone and incurs a decoding delay t_n^dec(t). The data decoded by the VR device is the data information obtained over the downlink, and the energy consumed by the VR device in decoding is denoted E_n^dec(t).
Specifically, assuming the downlink wireless channel is a line-of-sight link and ignoring small-scale fading, the received signal-to-noise ratio of the VR device is:

Gamma_n^down(t) = P_uav * h_n(t) * h_n^down(t) / (B_down * N_0),

where P_uav represents the transmission power of the drone, h_n(t) represents the antenna gain corresponding to beamforming, h_n^down(t) = beta_down * d_n(t)^(-alpha_down) represents the path loss, beta_down represents a frequency-related constant, and alpha_down represents the path-loss exponent. The downlink transmission rate and the downlink transmission delay are then, respectively:

r_n^down(t) = B_down * log2(1 + Gamma_n^down(t)),   t_n^down(t) = D_n^c(t) / r_n^down(t),

where B_down denotes the downlink channel bandwidth.
Wherein, local and remote joint rendering mode carries out the mutual information rendering of prospect on VR equipment, carries out the background environment on unmanned aerial vehicle and renders, and local and remote joint rendering mode accomplishes the rendering time and expresses as:
wherein, the first and the second end of the pipe are connected with each other,data representing the time required for rendering locally, from foreground interaction informationReplacing in remote renderingCalculating to obtain;representing the time required for remote rendering; data of background environment informationReplacement ofCalculating by substitution;and representing the time delay of the VR device for rendering and integrating the foreground interaction information and the background environment information, wherein the total energy consumption is represented as:
where the additional term represents the energy consumed by the VR device to render and integrate the foreground interaction information and the background environment information.
Wherein the delay and the energy consumption amount in step S2 are expressed as:
A binary parameter δ_n(t) indicates whether the VR service rendering of the VR device is completed, expressed as:
A parameter η_n(t) ∈ {0,1} indicates whether the VR device has a VR service request in time slot t: η_n(t) = 1 indicates a request and η_n(t) = 0 indicates no request. The rendering completion rate of all VR devices in each time slot is then expressed as:
where the drone trajectory L = [l(1), …, l(T)] and the user rendering mode selection O = [o_1(1), …, o_N(1), …, o_1(T), …, o_N(T)] are the optimization variables; the optimization objective is the VR service rendering completion rate over T time slots, and the constraint condition is that the total energy consumption of each VR device is less than or equal to a given threshold E_th.
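The per-slot completion rate built from the binary indicators δ_n(t) and η_n(t) can be sketched as follows (illustrative Python; the convention that a slot with no requests counts as fully served is an assumption, not stated in the patent):

```python
def completion_rate(delta, eta):
    """Per-slot rendering completion rate: completed requests over issued requests.
    delta[n] = 1 if device n finished within the delay bound (the patent's
    delta_n(t)); eta[n] = 1 if device n issued a request in this slot (eta_n(t))."""
    requested = sum(eta)
    if requested == 0:
        return 1.0  # assumption: a slot with no requests counts as fully served
    return sum(d * e for d, e in zip(delta, eta)) / requested
```

For instance, with three devices requesting and two of them finishing in time, the slot's completion rate is 2/3.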
In step S3, the Markov decision process models the preset optimization flow, and the energy of the user's VR devices is represented as:
e(t) = [e_1(t), …, e_n(t), …, e_N(t)] ∈ [0, E_max]^N;
the drone position is represented as:
l(t) = [x(t), y(t)];
where E_max is the starting energy of each VR device;
the flight trajectory of the drone is represented as:
d(t) = (k_1(t), k_2(t)); k_1(t) ∈ [0, 2π], k_2(t) ∈ [0, D_max];
where k_1(t) indicates the flight direction of the drone and k_2(t) represents the flight distance of the drone; D_max represents the maximum flight distance of the drone in each time slot; the rendering mode is represented as:
O(t) = [o_1(t), …, o_N(t)];
the update process of the MEC system status of the drone is represented as:
l(t+1) = l(t) + [k_2(t)·cos(k_1(t)), k_2(t)·sin(k_1(t))];
according to the selection of the preset VR rendering mode, the energy updating process of the VR equipment of the user is represented as follows:
e_n(t+1) = e_n(t) - E_n(t).
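The two update equations above (drone position and per-device remaining energy) can be sketched as a single state-transition step. This is an illustrative Python sketch with assumed argument names:

```python
import math

def step(l, e, k1, k2, E):
    """One MEC-system state transition: the drone moves distance k2 in
    direction k1, i.e. l(t+1) = l(t) + [k2*cos(k1), k2*sin(k1)], and each
    device's remaining energy drops by its per-slot consumption E[n],
    i.e. e_n(t+1) = e_n(t) - E_n(t)."""
    x, y = l
    l_next = (x + k2 * math.cos(k1), y + k2 * math.sin(k1))
    e_next = [en - En for en, En in zip(e, E)]
    return l_next, e_next
```

For example, flying distance 5 in direction 0 moves the drone from (0, 0) to (5, 0) while each device's energy is decremented by its consumption for the slot.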
Specifically, the Markov decision process modeling the preset optimization flow further includes a reward function, which is used to obtain the expected optimal strategy through interaction with the environment. Given the state s(t) and action a(t) of time slot t, the reward function r(t) is defined as the VR service rendering completion degree plus a penalty term g(t). The term g(t) guides the drone MEC system toward a better strategy and increases the long-term return: when an action causes the drone MEC system to violate the constraint condition, a penalty is added, i.e., g(t) < 0; if the constraint condition is satisfied, g(t) = 0. The long-term average reward function obtained by the system under strategy π can be expressed as:
where E_π represents the expectation under strategy π, γ ∈ [0,1] represents the discount factor, and r_t(s_t, a_t) represents the instantaneous reward. The MDP optimization goal is to find the long-term optimal strategy π* that maximizes the long-term return.
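The discounted long-term return that the MDP objective maximizes can be sketched as follows (illustrative Python):

```python
def discounted_return(rewards, gamma):
    """Discounted return under a fixed policy: G = sum_t gamma^t * r_t,
    accumulated backwards for numerical clarity."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g
```

With γ = 0.5, a reward sequence [1, 1, 1] yields 1 + 0.5 + 0.25 = 1.75, showing how the discount factor weights near-term rewards more heavily.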
Specifically, the MDP problem can be solved by policy iteration or value iteration, whose computational complexity depends on the scale of the problem, i.e., the size of the state and action spaces. Since the rendering problem of the drone MEC system studied in the present invention is large in scale, the deep reinforcement learning TD3 (Twin Delayed Deep Deterministic Policy Gradient) algorithm is adopted. The algorithm proceeds as follows: first, initialize the Critic network parameters and the Actor network parameter θ; initialize the target networks; and initialize the experience replay pool B. Then, for each episode: initialize the drone location l(t) = [x(t), y(t)], the device locations c_n = [x_n, y_n], and the device energy e_n(t), n = 1, …, N. For each time slot t = 1:T: select the action a_t = π_θ(s_t) + ε, where ε ~ N(0, ξ), obtaining the drone flight direction k_1(t), flight distance k_2(t), and rendering mode O(t); execute action a_t and interact with the environment to obtain the reward r_t; store the experience sample (s_t, a_t, r_t, s_{t+1}) in the experience replay pool B; randomly draw a mini-batch of M experience samples from the replay pool; and compute the expected action reward with the Critic target networks:
where the target action is ã ← π_θ′(s′) + ε̃, with ε̃ ~ clip(N(0, ξ), -c, c) being clipped normal noise; update the Critic network parameters:
and update the Actor network parameter θ by the following policy gradient every d steps:
soft update target network:
where τ represents the soft update rate factor; the larger τ is, the faster the estimated network parameters and the policy network parameter θ are transferred to the target network parameters θ′. The above steps are repeated until the episode loop ends.
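Three characteristic TD3 operations described above (the clipped double-Q target, target-policy smoothing, and the soft update of the target networks) can be sketched as follows. This is an illustrative Python sketch of the update rules, not the patent's implementation:

```python
import random

def td3_target(r, gamma, q1_next, q2_next):
    """Clipped double-Q target: y = r + gamma * min(Q1', Q2'),
    taking the smaller target-critic estimate to curb overestimation."""
    return r + gamma * min(q1_next, q2_next)

def smoothed_target_action(pi_s_next, sigma, c):
    """Target-policy smoothing: add clipped Gaussian noise to the
    target actor's action, a~ = pi_theta'(s') + clip(N(0, sigma), -c, c)."""
    noise = max(-c, min(c, random.gauss(0.0, sigma)))
    return pi_s_next + noise

def soft_update(target_params, online_params, tau):
    """Soft update: theta' <- tau * theta + (1 - tau) * theta' per parameter."""
    return [tau * p + (1.0 - tau) * tp
            for tp, p in zip(target_params, online_params)]
```

A small τ (e.g. 0.005) makes the target networks trail the online networks slowly, which is what stabilizes the bootstrapped targets.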
Specifically, because the energy and computing power of portable VR devices are extremely limited, rendering tasks cannot be completed locally in real time. The method therefore adopts different modes, including local rendering, remote rendering, and collaborative rendering, to meet the requirements of different VR services; optimally configures the rendering tasks borne by the VR devices and the drone; and reasonably allocates the drone's computing resources to the covered VR users over the long term, maximizing the computing resource utilization rate. The method studies the mobile computing resource management problem of the drone MEC platform, jointly optimizes the drone's movement trajectory and the service rendering mode, accounts for network computing resource and device energy consumption constraints, and establishes a Markov decision process with the maximization of the VR user service rendering completion rate as the optimization objective.
Preferably, the target area is 300 m × 300 m, in which 20 VR devices are randomly distributed and 1 drone is deployed as an aerial computing platform. Each VR device may request one of four VR rendering tasks, following a Bernoulli distribution with parameter p = 0.95.
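The request model of this simulation setup (each device requesting one of four tasks per slot with Bernoulli parameter p = 0.95) can be sketched as follows; the function name and seeding are illustrative assumptions:

```python
import random

def generate_requests(num_devices=20, num_tasks=4, p=0.95, seed=0):
    """Per-slot VR request arrivals: each device requests one of num_tasks
    rendering tasks with probability p (Bernoulli), otherwise stays idle
    (None). Mirrors the 20-device, 4-task, p = 0.95 setup above."""
    rng = random.Random(seed)
    return [rng.randrange(num_tasks) if rng.random() < p else None
            for _ in range(num_devices)]
```

With p = 0.95, nearly every device issues a request in every slot, which is what stresses the drone's limited rendering capacity in the experiments that follow.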
As shown in fig. 3 to fig. 6, to verify the feasibility and effectiveness of the method of the present invention, simulation tests were performed on it, with the following results:
referring to fig. 3, two deep reinforcement learning algorithms are compared: the convergence performance of the two deep reinforcement learning algorithms is compared. TD3 (dual delay depth deterministic policy gradient) based algorithms and DDPG based algorithms. The algorithm based on TD3 converges faster than DDPG algorithm, and VR rendering completion rate is higher.
Referring to fig. 4, the convergence speed of the present invention when selecting different VR rendering modes is shown. When more VR rendering modes are added, convergence slows because the dimension of the action space increases significantly, but the VR service rendering completion rate improves. After convergence, the completion rate of the algorithms using the local rendering and remote rendering modes is almost the same as that of the algorithm proposed by the method.
Referring to fig. 5, VR rendering completion rate curves under different computing capabilities are presented. The results show that the proposed algorithm achieves the best performance among the three rendering options: it can effectively schedule the overall computing resources of the drone and all VR devices to render VR tasks over the long term.
Referring to fig. 6, VR service rendering completion rates under different network scales are considered. The results show that the VR rendering completion rates of the three rendering modes all decline as the number of VR devices increases. This is because, with limited communication and computing resources, the computing service provided to each VR device decreases significantly as more VR devices and their rendering tasks are involved. Nevertheless, among the rendering modes considered, the algorithm proposed by the present invention performs better than the other two.
Specifically, under a deep reinforcement learning framework, the method provides a solution based on the TD3 algorithm, which has good adaptability and scalability with respect to different user service characteristics and network scales. Simulation results show that the method performs well in terms of VR user service rendering success rate, convergence time, and related metrics.
According to the method, under the constraints of VR service characteristics and device energy, the flight path of the drone and the VR rendering mode are jointly optimized to maximize the rendering completion rate of VR tasks. The problem is modeled as a Markov decision process; under the deep reinforcement learning framework, the TD3 (Twin Delayed Deep Deterministic Policy Gradient) algorithm schedules the drone and selects the VR rendering mode, finding an optimal strategy that satisfies randomly arriving VR service requests.
According to the method, the flight path of the drone and the preset VR service rendering mode are optimized through Markov decision process modeling and the deep reinforcement learning twin delayed deep deterministic policy gradient (TD3) algorithm; the rendering tasks borne by the VR devices and the drone are optimally configured, and the drone's computing resources are reasonably allocated to the covered VR users over the long term, maximizing the computing resource utilization rate and thereby the VR service rendering completion rate.
Although the present disclosure has been described above, the scope of the present disclosure is not limited thereto. Those skilled in the art can make various changes and modifications without departing from the spirit and scope of the present disclosure, and such changes and modifications will fall within the scope of the present invention.
Claims (8)
1. A VR (virtual reality) service unmanned aerial vehicle edge calculation method based on deep learning is characterized by comprising the following steps:
s1: rendering in an unmanned aerial vehicle MEC system through a preset VR rendering mode, wherein the unmanned aerial vehicle MEC system comprises an unmanned aerial vehicle and a plurality of VR devices;
s2: obtaining delay and energy consumption of the VR device in VR service rendering according to the step S1, and determining that the VR service requested by the VR device is finished when the rendering delay does not exceed a set value; optimizing VR service rendering completion rate in T time slots through a preset optimization process, wherein the constraint condition is that the total energy consumption of each VR device is less than or equal to a given threshold;
s3: the method comprises the steps of modeling a preset optimization flow through a Markov decision process, wherein the state of an unmanned aerial vehicle MEC system comprises the energy consumption of VR equipment of a user and the position of the unmanned aerial vehicle, taking actions by the unmanned aerial vehicle MEC system comprise selecting the flight track and the rendering mode of the unmanned aerial vehicle, and obtaining an expected optimal strategy through an MDP optimization target.
2. The deep learning based VR business drone edge computing method of claim 1, wherein the preset VR rendering modes in step S1 include a local rendering mode, a remote rendering mode, and a local and remote joint rendering mode.
3. The deep learning based VR service drone edge computation method of claim 2, wherein the local rendering mode simultaneously renders foreground interaction information and background environment information for each VR device, and a time to complete rendering in a time slot is represented as:
where t denotes a time slot, n denotes a VR device, and γ_n represents the computing power of VR device n;
one term represents the rendering data, in bits, of the foreground interaction information required by VR device n to generate the service in time slot t; another term represents the rendering data, in bits, of the background environment information required by VR device n to generate the service in time slot t; μ_n represents the CPU cycles required by VR device n to render one bit of data;
the amount of energy consumed to complete rendering in a slot is expressed as:
4. The deep learning based VR business drone edge computation method of claim 2, wherein executing the remote rendering mode by the drone comprises the following steps:
s11: acquiring foreground interaction information and background environment information through VR equipment, and transmitting the foreground interaction information and the background environment information to the unmanned aerial vehicle;
s12: rendering foreground interaction information and background environment information through an unmanned aerial vehicle (MEC) system;
s13: compressing and coding the rendered information by the unmanned aerial vehicle and transmitting the information to VR equipment of a user;
s14: the VR device receives and decodes the rendered information and applies it.
5. The deep learning based VR business drone edge computation method of claim 4, wherein the remote rendering mode completion rendering time is expressed as:
where the successive terms represent, respectively, the time required for the VR device to upload foreground interaction information to the drone, the rendering processing time of the drone MEC system, the code compression time of the drone, the time required for the drone to transmit the rendered information to the user's VR device, and the time required for decoding by the VR device; n denotes the VR device;
the energy consumption amount for completing the rendering in the remote rendering mode is represented as:
where the two terms represent, respectively, the energy consumed by the user's VR device to upload foreground interaction information to the drone and the energy consumed by the VR device in decoding;
in uplink transmission, the foreground interaction information is transmitted in the Sub-6 GHz band, so the received signal-to-noise ratio of the VR device at the drone is:
where one term represents the transmission power of the VR device, B_n(t) denotes the bandwidth of the VR device in the time slot, g_n(t) represents the small-scale fading channel gain between the VR device and the drone, and a further term represents the large-scale fading effect between the VR device and the drone;
the large-scale fading effect is expressed as the following function of distance:
where β_up represents a constant related to the frequency of the VR device, a_up represents the path loss exponent, d_n(t) represents the distance between the drone and the VR device in the time slot, and N_0 represents the white noise power;
the VR devices share the uplink bandwidth via frequency division multiplexing; according to the Shannon capacity formula, the upload data rate of a VR device in time slot t is:
where B_up represents the channel bandwidth, and the remaining term represents the total number of VR devices associated with the drone in time slot t;
the uplink transmission delay is:
where the term represents the size of the foreground interaction information transmitted by the user's VR device to the drone;
the uplink transmission energy consumption is as follows:
the unmanned aerial vehicle rendering processing time delay is as follows:
where γ_uav represents the computational power of the drone, μ_uav represents the CPU cycles required by the drone to render one bit of data, and the remaining terms represent the rendering content and the computing resources allocated to each associated VR device;
in code compression and downlink transmission, the delay required for compressing rendered information by the MEC system of the unmanned aerial vehicle is as follows:
in data decoding, the decoding delay of the VR device receiving the encoded rendering information transmitted by the unmanned aerial vehicle is:
6. The method of claim 5, wherein in the local and remote joint rendering mode, foreground interaction information is rendered on the VR device and the background environment is rendered on the drone, and the time to complete rendering in this mode is expressed as:
where the local rendering time is calculated by substituting the foreground interaction information data into the local rendering delay expression; the remote rendering time is calculated by substituting the background environment information data into the remote rendering delay expression; and a further term represents the delay for the VR device to render and integrate the foreground interaction information with the background environment information. The total energy consumption is expressed as:
7. The deep learning based VR business drone edge computation method of claim 1, wherein the delay and energy consumption in step S2 are expressed as:
A binary parameter δ_n(t) indicates whether the VR service rendering of the VR device is completed, expressed as:
A parameter η_n(t) ∈ {0,1} indicates whether the VR device has a VR service request in time slot t: η_n(t) = 1 indicates a request and η_n(t) = 0 indicates no request. The rendering completion rate of all VR devices in each time slot is then expressed as:
where the drone trajectory L = [l(1), …, l(T)] and the user rendering mode selection O = [o_1(1), …, o_N(1), …, o_1(T), …, o_N(T)] are the optimization variables; the optimization objective is the VR service rendering completion rate over T time slots, and the constraint condition is that the total energy consumption of each VR device is less than or equal to a given threshold E_th.
8. The deep learning based VR business drone edge computing method of claim 1, wherein the Markov decision process in step S3 models the preset optimization flow, and the energy of the user's VR devices is expressed as:
e(t) = [e_1(t), …, e_n(t), …, e_N(t)] ∈ [0, E_max]^N;
the drone position is represented as:
l(t)=[x(t),y(t)];
where E_max is the starting energy of each VR device;
the flight trajectory of the drone is represented as:
d(t) = (k_1(t), k_2(t)); k_1(t) ∈ [0, 2π], k_2(t) ∈ [0, D_max];
where k_1(t) indicates the flight direction of the drone and k_2(t) represents the flight distance of the drone; D_max represents the maximum flight distance of the drone in each time slot; the rendering mode is represented as:
O(t) = [o_1(t), …, o_N(t)];
the update process of the MEC system status of the drone is represented as:
l(t+1) = l(t) + [k_2(t)·cos(k_1(t)), k_2(t)·sin(k_1(t))];
according to the selection of the preset VR rendering mode, the energy updating process of the VR equipment of the user is represented as follows:
e_n(t+1) = e_n(t) - E_n(t).
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210172797.8A CN114598702A (en) | 2022-02-24 | 2022-02-24 | VR (virtual reality) service unmanned aerial vehicle edge calculation method based on deep learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210172797.8A CN114598702A (en) | 2022-02-24 | 2022-02-24 | VR (virtual reality) service unmanned aerial vehicle edge calculation method based on deep learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114598702A true CN114598702A (en) | 2022-06-07 |
Family
ID=81804677
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210172797.8A Pending CN114598702A (en) | 2022-02-24 | 2022-02-24 | VR (virtual reality) service unmanned aerial vehicle edge calculation method based on deep learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114598702A (en) |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114065963A (en) * | 2021-11-04 | 2022-02-18 | 湖北工业大学 | Computing task unloading method based on deep reinforcement learning in power Internet of things |
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114065963A (en) * | 2021-11-04 | 2022-02-18 | 湖北工业大学 | Computing task unloading method based on deep reinforcement learning in power Internet of things |
Non-Patent Citations (1)
Title |
---|
SHENGJIE DING, ETC.: "UAV-enabled Edge Computing for Virtual Reality", 《ACM DIGITAL LIBRARY》, pages 1 - 8 * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN113950066B (en) | Single server part calculation unloading method, system and equipment under mobile edge environment | |
Cheng et al. | Space/aerial-assisted computing offloading for IoT applications: A learning-based approach | |
Liu et al. | Code-partitioning offloading schemes in mobile edge computing for augmented reality | |
Liu et al. | Path planning for UAV-mounted mobile edge computing with deep reinforcement learning | |
Guo et al. | An adaptive wireless virtual reality framework in future wireless networks: A distributed learning approach | |
CN111800828B (en) | Mobile edge computing resource allocation method for ultra-dense network | |
CN112422644B (en) | Method and system for unloading computing tasks, electronic device and storage medium | |
Nath et al. | Multi-user multi-channel computation offloading and resource allocation for mobile edge computing | |
CN113781002B (en) | Low-cost workflow application migration method based on agent model and multiple group optimization in cloud edge cooperative network | |
WO2022242468A1 (en) | Task offloading method and apparatus, scheduling optimization method and apparatus, electronic device, and storage medium | |
CN113867843B (en) | Mobile edge computing task unloading method based on deep reinforcement learning | |
CN112860429A (en) | Cost-efficiency optimization system and method for task unloading in mobile edge computing system | |
Saravanan et al. | Design of deep learning model for radio resource allocation in 5G for massive iot device | |
Chua et al. | Resource allocation for mobile metaverse with the Internet of Vehicles over 6G wireless communications: A deep reinforcement learning approach | |
CN113573363A (en) | MEC calculation unloading and resource allocation method based on deep reinforcement learning | |
CN113946423B (en) | Multi-task edge computing, scheduling and optimizing method based on graph attention network | |
CN117202264A (en) | 5G network slice oriented computing and unloading method in MEC environment | |
CN114980205A (en) | QoE (quality of experience) maximization method and device for multi-antenna unmanned aerial vehicle video transmission system | |
CN114598702A (en) | VR (virtual reality) service unmanned aerial vehicle edge calculation method based on deep learning | |
CN115580900A (en) | Unmanned aerial vehicle assisted cooperative task unloading method based on deep reinforcement learning | |
Pan et al. | Energy-efficient multiuser and multitask computation offloading optimization method | |
Zhang et al. | Cybertwin-driven multi-intelligent reflecting surfaces aided vehicular edge computing leveraged by deep reinforcement learning | |
CN117891532B (en) | Terminal energy efficiency optimization unloading method based on attention multi-index sorting | |
El Haber et al. | Multi-IRS Aided Mobile Edge Computing for High Reliability and Low Latency Services | |
Wang et al. | Computing Resource Allocation Strategy Using Biological Evolutionary Algorithm in UAV‐Assisted Mobile Edge Computing |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||