CN115037751A - Unmanned aerial vehicle-assisted heterogeneous Internet of vehicles task migration and resource allocation method

Info

Publication number
CN115037751A
Authority
CN
China
Prior art keywords
vehicle
unloading
task
vehicles
unmanned aerial
Prior art date
Legal status
Granted
Application number
CN202210744842.2A
Other languages
Chinese (zh)
Other versions
CN115037751B (en)
Inventor
宋晓勤
王书墨
宋铁成
彭昱捷
杨雨露
Current Assignee
Shenzhen Institute Of Southeast University
Original Assignee
Shenzhen Institute Of Southeast University
Priority date: 2022-06-28
Filing date: 2022-06-28
Application filed by Shenzhen Institute Of Southeast University
Priority to CN202210744842.2A
Publication of CN115037751A (2022-09-09)
Application granted
Publication of CN115037751B (2023-05-05)
Legal status: Active

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 67/10 Protocols in which an application is distributed across nodes in the network
    • H04L 41/145 Network analysis or design involving simulating, designing, planning or modelling of a network
    • H04L 67/1074 Peer-to-peer [P2P] networks for supporting data block transmission mechanisms
    • H04L 67/12 Protocols specially adapted for proprietary or special-purpose networking environments, e.g. medical networks, sensor networks, networks in vehicles or remote metering networks
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W 4/40 Services specially adapted for particular environments, situations or purposes for vehicles, e.g. vehicle-to-pedestrians [V2P]
    • H04W 4/46 Services specially adapted for vehicles for vehicle-to-vehicle communication [V2V]
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T 10/40 Engine management systems


Abstract

The invention provides an unmanned aerial vehicle (UAV)-assisted heterogeneous Internet of Vehicles task migration and resource allocation method for the scenario in which a mobile edge server and a UAV cooperatively serve computation offloading. First, the decision on whether a vehicle offloads is obtained through a potential game: each vehicle decides either to compute locally or to offload to the MEC server or a UAV. For the vehicles that decide to offload, a distributed resource allocation method is adopted that requires no centralized scheduling of channel state information by the base station; each offloading vehicle is regarded as an agent, the deep reinforcement learning model is trained with DDQN, and each offloading vehicle selects its offloading node and transmit power based on locally observed state information. The algorithm minimizes system delay under the maximum transmit power limit and strikes a good balance between complexity and performance.

Description

Unmanned aerial vehicle-assisted heterogeneous Internet of vehicles task migration and resource allocation method
Technical Field
The invention relates to an internet of vehicles technology, in particular to an unmanned aerial vehicle-assisted task migration and resource allocation method for the internet of vehicles, and more particularly to an unmanned aerial vehicle-assisted heterogeneous internet of vehicles task migration and resource allocation method.
Background
With the development of the Internet of Vehicles, various vehicle applications such as route planning, autonomous driving, and infotainment have emerged. These applications help ensure travel safety and also provide entertainment and connectivity during the journey. However, most of them are delay-sensitive, resource-intensive, computationally complex, and energy-demanding, while many vehicles still have limited storage capacity and insufficient computing resources to meet the applications' stringent delay constraints. Mobile edge computing (MEC) can provide low-delay computing services for vehicles by deploying computing and storage resources at the edge of the network; in a typical Internet of Vehicles scenario, MEC servers are deployed on roadside units to serve vehicles. In smart road construction, UAVs are used for road patrol, bridge inspection, and road damage inspection; while a UAV patrols an area, its considerable onboard computing capacity can also serve as an MEC server.
At present, UAV-assisted mobile edge computing is still at an early stage, and only a few works have studied the field in detail. Moreover, the existing research mainly optimizes the computation offloading strategy and does not fully consider cooperation between heterogeneous MEC servers and communication resource allocation under time-varying channels.
Therefore, the invention provides an unmanned aerial vehicle-assisted heterogeneous Internet of vehicles task migration and resource allocation method, which aims at the scene of cooperative computing and unloading of a mobile edge server and an unmanned aerial vehicle, takes minimization of system time delay as an optimization target of task migration and resource allocation, and achieves good balance between complexity and performance.
Disclosure of Invention
The purpose of the invention is as follows: in view of the problems in the prior art, an unmanned aerial vehicle-assisted heterogeneous Internet of Vehicles task migration and resource allocation method is provided in which UAVs supply computing resources to vehicles. The method transmits with a hybrid spectrum access technique and minimizes system delay.
The technical scheme is as follows: for the scenario of cooperative computation offloading by a mobile edge server and UAVs, system delay is minimized through reasonable and efficient computation offloading decisions and resource allocation. To reduce system delay and improve spectrum utilization, transmission uses a hybrid spectrum access technique: a vehicle offloads its task to the MEC server on a roadside unit over a vehicle-to-infrastructure (V2I) link, or to a UAV over a vehicle-to-vehicle (V2V) link, and the V2I and V2V links are placed in different slices via the 5G slicing technique so that they do not interfere with each other. First, the decision on whether a vehicle offloads is obtained through a potential game: each vehicle decides either to compute locally or to offload to the MEC server or a UAV. For the vehicles that decide to offload, a distributed resource allocation method is adopted that requires no centralized scheduling of channel state information by the base station; each offloading vehicle is regarded as an agent and selects its transmit power based on locally observed state information. A deep reinforcement learning model is established and optimized with the deep double Q-learning algorithm (DDQN), and the optimized DDQN model yields the offloading node and transmit power of each offloading vehicle. The invention is realized by the following technical scheme: an unmanned aerial vehicle-assisted heterogeneous Internet of Vehicles task migration and resource allocation method comprising the following steps:
(1) deploying a mobile edge computing (MEC) server at a roadside unit (RSU) and deploying UAVs in the system to provide computing services for vehicles; a vehicle's computing task can be processed locally or offloaded to a UAV or to the MEC server;
(2) establishing a communication model and a computation model comprising N vehicles and M UAVs, and on this basis a joint computation-migration and resource-allocation model;
(3) each vehicle acquiring the positions of the UAVs and the MEC server, the occupancy of their computing resources, and its task information;
(4) obtaining the decision on whether each vehicle offloads from the potential game and, according to the obtained offloading decisions, establishing for the vehicles that decide to offload a deep reinforcement learning model with the goal of reducing system delay;
(5) training the deep reinforcement learning model with DDQN;
(6) in the execution stage, each vehicle n with a computing task deciding through the potential game whether its task is offloaded; each vehicle $n_0$ that decides to offload obtains its current state $s_{n_0}^t$ from local observation and uses the trained deep reinforcement learning model to obtain its offloading node and transmit power.
Further, step (2) comprises the following specific steps:
(2a) The system consists of N vehicles, M unmanned aerial vehicles, and a roadside unit on which the MEC server is deployed. The vehicles are denoted by the set $\mathcal{N} = \{1, \dots, N\}$ and the UAVs by the set $\mathcal{M} = \{1, \dots, M\}$. The task of vehicle $n$ is represented as $W_n = (c_n, s_n, T_n^{max})$, where $c_n$ is the number of CPU cycles required for vehicle $n$ to complete the task, $s_n$ is the size of the task data to be offloaded, and $T_n^{max}$ is the maximum delay vehicle $n$ tolerates for task execution. In each time slot every vehicle generates one task. The offloading decision of the vehicle task is denoted $a_n \in \{0, 1, 2\}$: $a_n = 0$ means vehicle $n$ executes the computing task locally, $a_n = 1$ means vehicle $n$ offloads the task to the MEC server over the V2I link, and $a_n = 2$ means vehicle $n$ offloads the task to a UAV over the V2V link. With the 5G slicing technique, V2V and V2I communications do not interfere with each other. The set $\mathcal{Z} = \{loc, uav[m], mec\}$ denotes the task computation location, where $loc$, $uav[m]$, and $mec$ respectively indicate that the computing task is executed locally, offloaded to the m-th UAV, or offloaded to the MEC server. The binary indicator $x_{n,z} \in \{0, 1\}$ marks the offloading location: $x_{n,z} = 1$ means the computing task of vehicle $n$ is executed at location $z$, and $x_{n,z} = 0$ means it is not.
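As a concrete data model for the task tuple $W_n$ and the offloading decision $a_n$ above, the following minimal Python sketch may help; the class and constant names are ours, not the patent's:

```python
from dataclasses import dataclass

@dataclass
class Task:
    c_n: float    # CPU cycles required to complete the task
    s_n: float    # size of the task data to offload (bits)
    t_max: float  # maximum tolerable delay (seconds)

# Offloading decision a_n: 0 = compute locally,
# 1 = offload to the MEC server over V2I, 2 = offload to a UAV over V2V.
LOCAL, TO_MEC, TO_UAV = 0, 1, 2
```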
(2b) The signal-to-interference-plus-noise ratio (SINR) when vehicle $n$ offloads its task to UAV $m$ is expressed as

$\gamma_{n,uav[m]} = \frac{P_n h_{n,uav[m]}}{\sigma^2 + I_{n,uav[m]}}$,

and the transmission rate when vehicle $n$ offloads the task to UAV $m$ is expressed as

$R_{n,uav[m]} = B^{uav} \log_2(1 + \gamma_{n,uav[m]})$,

where $B^{uav}$ is the transmission bandwidth for offloading to a UAV, $P_n$ is the transmit power of vehicle $n$, $\sigma^2$ is the noise power, and $h_{n,uav[m]}$ is the channel gain from vehicle $n$ to UAV $m$. The interference that vehicles other than $n$ offloading to UAV $m$ cause to vehicle $n$ is

$I_{n,uav[m]} = \sum_{n' \ne n} J(a_{n'} = 2)\, P_{n'} h_{n',uav[m]}$,

where $J(a_{n'} = 2) = 1$ if $a_{n'} = 2$ and $J(a_{n'} = 2) = 0$ otherwise, $P_{n'}$ is the transmit power of vehicle $n'$, and $h_{n',uav[m]}$ is the channel gain from vehicle $n'$ to UAV $m$.
(2c) Likewise, the SINR when vehicle $n$ offloads its task to the MEC server is expressed as

$\gamma_{n,mec} = \frac{P_n h_{n,mec}}{\sigma^2 + I_{n,mec}}$,

and the corresponding transmission rate is

$R_{n,mec} = B^{mec} \log_2(1 + \gamma_{n,mec})$,

where $B^{mec}$ is the transmission bandwidth for offloading to the MEC server and $h_{n,mec}$ is the channel gain from vehicle $n$ to the MEC server. The interference that vehicles other than $n$ offloading to the MEC server cause to vehicle $n$ is

$I_{n,mec} = \sum_{n' \ne n} J(a_{n'} = 1)\, P_{n'} h_{n',mec}$,

where $J(a_{n'} = 1) = 1$ if $a_{n'} = 1$ and $J(a_{n'} = 1) = 0$ otherwise, and $h_{n',mec}$ is the channel gain from vehicle $n'$ to the MEC server.
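The two link models above share the same SINR and Shannon-rate structure, so they can be sketched with one pair of helper functions. All numeric values below are assumptions for illustration, not parameters from the patent:

```python
import math

def sinr(p_tx: float, gain: float, interference: float, noise_power: float) -> float:
    """SINR of an offloading link: received signal power over noise plus
    the interference from other vehicles offloading on the same slice."""
    return (p_tx * gain) / (noise_power + interference)

def rate(bandwidth_hz: float, sinr_value: float) -> float:
    """Achievable transmission rate (bit/s), B * log2(1 + SINR)."""
    return bandwidth_hz * math.log2(1.0 + sinr_value)

def v2v_interference(others) -> float:
    """Sum P_n' * h_n',uav[m] over vehicles n' with a_n' == 2,
    i.e. the J(a_n' = 2) indicator in the formula above."""
    return sum(p * h for a, p, h in others if a == 2)

# Example with assumed values: 1 MHz slice, -114 dBm noise, one co-slice interferer.
i_uav = v2v_interference([(2, 0.1, 1e-7), (1, 0.2, 5e-8)])
g = sinr(p_tx=0.2, gain=2e-6, interference=i_uav,
         noise_power=10 ** (-114 / 10) / 1000)  # dBm -> watts
print(f"SINR = {g:.1f}, rate = {rate(1e6, g) / 1e6:.2f} Mbit/s")
```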
(2d) The computation model is established as follows. If $a_n = 0$, vehicle $n$ executes the task locally; with $f_n^{loc}$ denoting the local computing capability of vehicle $n$, the local computation delay is

$T_n^{loc} = \frac{c_n}{f_n^{loc}}$.

If $a_n = 1$, vehicle $n$ offloads the task to the MEC server over the V2I link. The upload delay from vehicle $n$ to the MEC server is

$T_n^{mec,up} = \frac{s_n}{R_{n,mec}}$,

and the computation delay of the uploaded task at the MEC server is

$T_n^{mec,c} = \frac{c_n}{f_n^{mec}}$,

where $f_n^{mec}$ is the computing capability the MEC server allocates to the task of vehicle $n$.
If $a_n = 2$, vehicle $n$ offloads the task to a UAV over the V2V link. The upload delay from vehicle $n$ to UAV $m$ is

$T_{n,m}^{uav,up} = \frac{s_n}{R_{n,uav[m]}}$,

and the computation delay at UAV $m$ is

$T_{n,m}^{uav,c} = \frac{c_n}{f_{n,m}^{uav}}$,

where $f_{n,m}^{uav}$ is the computing capability UAV $m$ allocates to the task of vehicle $n$. The delay of returning results is ignored, so the delay when vehicle $n$ offloads the task to the MEC server is

$T_n^{mec} = T_n^{mec,up} + T_n^{mec,c}$,

and the delay when vehicle $n$ offloads the task to UAV $m$ is

$T_{n,m}^{uav} = T_{n,m}^{uav,up} + T_{n,m}^{uav,c}$.

Combining local computation, offloading to the MEC server, and offloading to a UAV, the task processing delay of vehicle $n$ can be expressed as

$T_n = x_{n,loc} T_n^{loc} + x_{n,mec} T_n^{mec} + \sum_{m \in \mathcal{M}} x_{n,uav[m]} T_{n,m}^{uav}$.
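A short sketch of these delay formulas, with assumed numbers (task size, rates, and CPU allocations are illustrative only):

```python
def local_delay(c_n: float, f_loc: float) -> float:
    """T_loc = c_n / f_loc: cycles needed over local cycles per second."""
    return c_n / f_loc

def offload_delay(s_n: float, rate_bps: float, c_n: float, f_alloc: float) -> float:
    """Upload delay s_n / R plus remote computation delay c_n / f;
    the result-return delay is ignored, as in the text."""
    return s_n / rate_bps + c_n / f_alloc

# Assumed example: 0.5 Mbit of data, 8e8 CPU cycles.
t_loc = local_delay(c_n=8e8, f_loc=1e9)                                # 0.80 s
t_mec = offload_delay(s_n=5e5, rate_bps=5.4e6, c_n=8e8, f_alloc=5e9)   # ~0.25 s
t_uav = offload_delay(s_n=5e5, rate_bps=3.0e6, c_n=8e8, f_alloc=3e9)   # ~0.43 s
print(min((t_loc, "local"), (t_mec, "mec"), (t_uav, "uav")))           # -> mec
```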
(2e) In summary, the following objective function and constraints are established:

$\min \sum_{n \in \mathcal{N}} T_n$

subject to:
C1, C2: each task is executed either locally, on the MEC server, or on a UAV, and each computing task can select only one computation mode;
C3: the local computing capability of vehicle $n$ satisfies $0 \le f_n^{loc} \le f_n^{loc,max}$, where $f_n^{loc,max}$ is the maximum local computing capability of vehicle $n$;
C4, C5: the computing capability allocated to a vehicle by the MEC server and by a UAV is non-negative;
C6, C7: the computing capability allocated by the MEC server and by a UAV cannot exceed their maximum capabilities $F_{mec}$ and $F_{uav[m]}$, respectively;
C8, C9: a task of vehicle $n$ offloaded to the MEC server or to a UAV must be completed within its maximum delay constraint;
C10: the transmit power of vehicle $n$ is non-negative and does not exceed its maximum transmit power.
Further, step (4) comprises the following specific steps:
(4a) The decision on whether each vehicle offloads is obtained from a potential game: the offloading decisions of the task-bearing vehicles are modeled as a potential game

$G = (\mathcal{N}, \{a_n\}_{n \in \mathcal{N}}, \{u_n\}_{n \in \mathcal{N}})$,

where $\mathcal{N}$ is the set of vehicles, $a_n$ is the offloading decision of vehicle $n$, and $u_n$ is the cost function of vehicle $n$.
In this game model every vehicle is a resource competitor, so the N vehicles compete for the limited resources in the network. Each vehicle chooses either to offload its computation or to execute the task locally: $a_n \in \{0, 1\}$, where $a_n = 0$ means vehicle $n$ executes the computing task locally and $a_n = 1$ means vehicle $n$ offloads the task to the MEC server or a UAV, and $A = (a_1, \dots, a_N)$ denotes the offloading decisions of all vehicles. Given the decision $a_n$ of vehicle $n$, its cost function is $u_n(a_n, a_{-n})$, where $a_{-n}$ denotes the decisions of all vehicles except $n$. Each vehicle seeks the offloading decision that minimizes its own cost, i.e.

$a_n^* = \arg\min_{a_n \in \{0,1\}} u_n(a_n, a_{-n})$.

The potential game converges to a Nash equilibrium: best-response iteration finds a decision profile $A^* = (a_1^*, \dots, a_N^*)$ at which no vehicle can reduce its own cost by unilaterally changing its current offloading decision.
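The patent does not spell out the cost function $u_n$ in closed form, so the sketch below assumes a simple congestion-dependent offloading cost purely to illustrate how best-response iteration reaches a Nash equilibrium:

```python
def best_response_iteration(local_cost, offload_base_cost, congestion_penalty,
                            max_iters=100):
    """Each vehicle repeatedly switches to the decision (0 = local, 1 = offload)
    that minimizes its own cost given the others' current decisions; the loop
    stops when no vehicle wants to deviate, i.e. at a Nash equilibrium."""
    n = len(local_cost)
    a = [0] * n  # start with every vehicle computing locally
    for _ in range(max_iters):
        changed = False
        for i in range(n):
            others_offloading = sum(a) - a[i]
            # assumed cost model: offloading gets worse as more vehicles offload
            u_offload = offload_base_cost[i] + congestion_penalty * others_offloading
            best = 1 if u_offload < local_cost[i] else 0
            if best != a[i]:
                a[i], changed = best, True
        if not changed:
            break  # no profitable unilateral deviation remains
    return a

print(best_response_iteration([0.8, 0.5, 0.9], [0.3, 0.4, 0.35],
                              congestion_penalty=0.2))  # -> [1, 0, 1]
```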
(4b) Based on the offloading decisions $A^*$, let $\mathcal{N}_0$ denote the set of vehicles whose decision is to offload and $N_0$ their number. The state $s$ is defined as the local observation together with low-dimensional fingerprint information related to the transmit power and the offloading node, comprising: the channel state information $h_{n_0,uav[m]}$ from vehicle $n_0$ to each UAV $m \in \mathcal{M}$; the channel state information $h_{n_0,mec}$ from vehicle $n_0$ to the MEC server; the interference $I_{n_0,uav[m]}$ vehicle $n_0$ receives on each UAV link; the interference $I_{n_0,mec}$ it receives on the MEC link; the task information $W_{n_0}$; the training episode index $e$; and the random exploration variable $\varepsilon$ of the $\varepsilon$-greedy algorithm. That is,

$s_{n_0}^t = \{h_{n_0,uav}, h_{n_0,mec}, I_{n_0,uav}, I_{n_0,mec}, W_{n_0}, e, \varepsilon\}$.

Each vehicle in $\mathcal{N}_0$ is regarded as an agent that, at every step, selects an offloading node and transmit power based on its current state $s_{n_0}^t$;
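In an implementation, the state above can simply be concatenated into a flat vector; a sketch reusing the hypothetical Task class from earlier:

```python
import numpy as np

def build_state(h_uav, h_mec, i_uav, i_mec, task, episode, eps):
    """Assemble s_{n0}^t: local observation (CSI and interference) plus the
    low-dimensional fingerprint (episode index e and exploration rate eps)."""
    return np.concatenate([
        np.atleast_1d(h_uav),              # CSI to each UAV
        [h_mec],                           # CSI to the MEC server
        np.atleast_1d(i_uav),              # interference on each UAV link
        [i_mec],                           # interference on the MEC link
        [task.c_n, task.s_n, task.t_max],  # task information W_{n0}
        [episode, eps],                    # fingerprint stabilizing multi-agent training
    ])
```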
(4c) The action of each offloading vehicle $n_0$ is its selected offloading node and transmit power, denoted

$a_{n_0}^t = (z_{n_0}, P_{n_0})$,

where $z_{n_0}$ is the task offloading node selected by vehicle $n_0$ and $P_{n_0}$ is one of its discrete transmit power levels;
(4d) The reward function $r$ is defined as follows. The goal of offloading is for every vehicle in $\mathcal{N}_0$ to select the offloading node and transmit power that minimize the task processing delay of all offloading vehicles under the maximum transmit power constraint, so the reward function can be expressed as

$r = b - \sum_{n_0 \in \mathcal{N}_0} \sum_{z_0 \in \mathcal{Z}} x_{n_0,z_0} T_{n_0,z_0}$,

where $b$ is a fixed value used to adjust the magnitude of the reward, $x_{n_0,z_0} = 1$ indicates that the task of vehicle $n_0$ is executed at location $z_0$ ($x_{n_0,z_0} = 0$ otherwise), and $T_{n_0,z_0}$ is the delay when the task of vehicle $n_0$ is executed at location $z_0$;
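Numerically the reward of one step is just the constant $b$ minus the realized delays of the offloading vehicles; a one-line sketch with assumed values:

```python
def reward(b: float, delays: dict) -> float:
    """r = b - sum of the task-processing delays of all offloading vehicles;
    delays maps each offloading vehicle id to the delay at its chosen node."""
    return b - sum(delays.values())

print(reward(b=2.0, delays={3: 0.25, 7: 0.43}))  # -> 1.32
```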
(4e) With the state, action, and reward function established, a deep reinforcement learning model is built on the basis of Q-learning. Each offloading vehicle $n_0$ maintains an evaluation function $Q(s_{n_0}^t, a_{n_0}^t)$ that represents the discounted reward vehicle $n_0$ obtains by performing action $a_{n_0}^t$ from state $s_{n_0}^t$. The Q-value update function is

$Q(s_{n_0}^t, a_{n_0}^t) \leftarrow Q(s_{n_0}^t, a_{n_0}^t) + \alpha \big[ r_t + \gamma \max_{a \in \mathcal{A}} Q(s_{n_0}^{t+1}, a) - Q(s_{n_0}^t, a_{n_0}^t) \big]$,

where $\alpha$ is the learning rate, $r_t$ is the instant reward, $\gamma$ is the discount factor, $s_{n_0}^t$ is the observation and fingerprint information vehicle $n_0$ acquires at time $t$, $s_{n_0}^{t+1}$ is the state after vehicle $n_0$ performs $a_{n_0}^t$ at time $t$, and $\mathcal{A}$ is the action space formed by the actions $a_{n_0}^t$.
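What distinguishes DDQN from vanilla deep Q-learning is the target computation: the prediction network selects the next action and the target network evaluates it, which curbs Q-value overestimation. A NumPy sketch of that target (batch shapes and values are assumptions):

```python
import numpy as np

def ddqn_target(q_pred_next: np.ndarray, q_target_next: np.ndarray,
                rewards: np.ndarray, gamma: float) -> np.ndarray:
    """Double-DQN target y = r + gamma * Q_target(s', argmax_a Q_pred(s', a))."""
    a_star = np.argmax(q_pred_next, axis=1)                 # chosen by prediction net
    q_eval = q_target_next[np.arange(len(a_star)), a_star]  # scored by target net
    return rewards + gamma * q_eval

# Batch of 2 transitions over 3 actions (offload-node / power-level pairs).
y = ddqn_target(np.array([[1.0, 2.0, 0.5], [0.2, 0.1, 0.9]]),
                np.array([[0.8, 1.5, 0.4], [0.3, 0.2, 1.1]]),
                rewards=np.array([0.1, -0.2]), gamma=0.9)
print(y)  # -> [1.45, 0.79]
```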
Further, step (5) comprises the following specific steps:
(5a) Start the environment simulator and initialize each agent's prediction network parameters $\theta_{n_0}$ and target network parameters $\theta_{n_0}^-$;
(5b) initialize the training episode counter $p$;
(5c) update the vehicle and UAV positions, obtain the UAV and MEC computing resource occupancy and the task information, and, at the start of episode $p$, initialize the time step $t$;
(5d) each agent asynchronously runs its prediction network: given the input state $s_{n_0}^t$ it outputs an action $a_{n_0}^t$, obtains the instant reward $r_t$, and moves to the next state $s_{n_0}^{t+1}$, yielding the training tuple $(s_{n_0}^t, a_{n_0}^t, r_t, s_{n_0}^{t+1})$;
(5e) store the training tuple in each agent's experience replay pool;
(5f) each agent randomly samples $N_k$ training tuples from its replay pool to form a data set $D$ and inputs it to the prediction network;
(5g) each agent computes the loss value Loss($n_0$) from the prediction network and the target network, and updates the prediction network parameters by backpropagation of the neural network using a mini-batch gradient descent strategy;
(5h) whenever the number of training steps reaches the target network update interval, copy the prediction network parameters $\theta_{n_0}$ into the target network parameters $\theta_{n_0}^-$;
(5i) if $t < K$, where $K$ is the total number of time steps in episode $p$, set $t = t + 1$ and return to step (5c); otherwise go to step (5j);
(5j) if $p < I$, where $I$ is the threshold on the number of training episodes, set $p = p + 1$ and return to step (5c); otherwise training is complete and the optimized deep reinforcement learning model is obtained.
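Steps (5a) through (5j) amount to a standard DDQN training skeleton; in the sketch below, env, predict_net, and target_net are placeholder objects with assumed interfaces, not the patent's implementation:

```python
import random
from collections import deque

def train_agent(env, predict_net, target_net, episodes=500, steps_per_episode=100,
                batch_size=32, gamma=0.9, target_update=50, eps=0.1):
    """One agent's DDQN loop: epsilon-greedy acting, experience replay,
    mini-batch updates, and periodic target-network synchronization."""
    replay = deque(maxlen=10_000)                  # experience replay pool (5e)
    step_count = 0
    for p in range(episodes):                      # episode loop (5b)/(5j)
        s = env.reset()                            # positions, resources, tasks (5c)
        for t in range(steps_per_episode):         # time-step loop (5i)
            if random.random() < eps:              # epsilon-greedy exploration
                a = random.randrange(predict_net.n_actions)
            else:
                a = predict_net.argmax_action(s)   # act via prediction network (5d)
            s_next, r = env.step(a)
            replay.append((s, a, r, s_next))       # store the transition (5e)
            if len(replay) >= batch_size:
                batch = random.sample(replay, batch_size)       # sample N_k tuples (5f)
                predict_net.sgd_step(batch, target_net, gamma)  # loss + backprop (5g)
            step_count += 1
            if step_count % target_update == 0:    # sync target network (5h)
                target_net.copy_from(predict_net)
            s = s_next
```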
Further, step (6) comprises the following specific steps:
(6a) According to the UAV and MEC positions, the computing resource occupancy, and the task information, the offloading decision of every vehicle is obtained through the potential game; each vehicle $n_0$ whose decision is not local computation obtains its current state $s_{n_0}^t$;
(6b) each such vehicle $n_0$ inputs the state $s_{n_0}^t$ into the trained deep reinforcement learning model;
(6c) the model outputs the optimal action strategy, i.e. the optimal offloading node $z_{n_0}^*$ and transmit power $P_{n_0}^*$ for each vehicle $n_0$ whose decision is not local computation.
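At run time no learning takes place; the trained network is queried greedily. A sketch under the same assumed interfaces as above:

```python
def execute(vehicles, potential_game, predict_nets):
    """Execution stage: the potential game filters out locally computing
    vehicles (6a); every offloading vehicle feeds its locally observed state
    to its trained network (6b) and takes the highest-Q action, i.e. its
    offloading node and transmit power (6c)."""
    decisions = potential_game(vehicles)
    actions = {}
    for v in vehicles:
        if decisions[v.id] == 0:
            continue                   # local computation, nothing to select
        s = v.observe()                # current state s_{n0}^t
        actions[v.id] = predict_nets[v.id].argmax_action(s)
    return actions
```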
Beneficial effects: the invention provides an unmanned aerial vehicle-assisted heterogeneous Internet of Vehicles task migration and resource allocation method for the scenario of cooperative computation offloading by a mobile edge server and UAVs. It transmits with a hybrid spectrum access technique, placing the V2V and V2I links in different slices via 5G slicing so that they do not interfere with each other; it obtains each vehicle's offload-or-compute-locally decision through a potential game; and it optimizes the offloading node and transmit power of the offloading vehicles with deep double Q-learning, thereby minimizing the system delay of task computation.
In conclusion, in the scenario of cooperative computation offloading by a mobile edge server and UAVs, the proposed method excels at minimizing system delay.
Drawings
Fig. 1 is a flowchart of a method for task migration and resource allocation in heterogeneous internet of vehicles assisted by an unmanned aerial vehicle according to an embodiment of the present invention;
fig. 2 is a schematic diagram of a system for task migration and resource allocation in heterogeneous internet of vehicles assisted by an unmanned aerial vehicle according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a DDQN algorithm framework provided by an embodiment of the present invention;
Detailed Description
The core idea of the invention is as follows: for the scenario of cooperative computation offloading by a mobile edge server and UAVs, transmission uses a hybrid spectrum access technique, with the V2V and V2I links placed in different slices via 5G slicing so that they do not interfere with each other; the decision on whether each vehicle offloads is obtained through a potential game; each offloading vehicle is regarded as an agent; a deep reinforcement learning model is established and optimized with deep double Q-learning; and the optimized model yields the optimal offloading node and transmit power for each offloading vehicle, thereby minimizing system delay.
The present invention is described in further detail below with reference to the drawings. Steps (1) through (6), namely the deployment of the MEC server at the roadside unit and of the UAVs providing computing services, the construction of the communication model, the computation model, and the joint computation-migration and resource-allocation model, the acquisition by each vehicle of the UAV and MEC positions, computing resource occupancy, and task information, the derivation of the offloading decisions through the potential game and the construction of the deep reinforcement learning model for the offloading vehicles, the DDQN-based training of that model, and the execution stage, are carried out exactly as set forth in the Disclosure of Invention above.
In fig. 1, the flow of the unmanned aerial vehicle-assisted heterogeneous Internet of Vehicles task migration and resource allocation method is depicted: a vehicle first decides through the potential game whether its task is offloaded, and each vehicle that decides to offload then selects its offloading node and transmit power with the DDQN-trained deep reinforcement learning model.
In fig. 2, a system model of drone-assisted heterogeneous internet of vehicles task migration and resource allocation is described, and it can be seen that the MEC server and the drone can provide computing services for the vehicle.
In fig. 3, the algorithm framework of the DDQN is depicted; the DDQN comprises two networks, a prediction network and a target network.
According to the description of the invention, it should be obvious to those skilled in the art that the invention provides the unmanned aerial vehicle-assisted heterogeneous Internet of vehicles task migration and resource allocation method, which can effectively reduce the system delay and achieve good balance between complexity and performance.
Details not described in the present application are well within the skill of those in the art.

Claims (1)

1. An unmanned aerial vehicle-assisted heterogeneous Internet of Vehicles task migration and resource allocation method, characterized by comprising the following steps:
(1) deploying a mobile edge computing (MEC) server at a roadside unit (RSU) and deploying UAVs in the system to provide computing services for vehicles, a vehicle's computing task being processed locally or offloaded to a UAV or the MEC server;
(2) establishing a communication model and a computation model comprising N vehicles and M UAVs, and on that basis a joint computation-migration and resource-allocation model;
(3) each vehicle acquiring the positions of the UAVs and the MEC server, the occupancy of their computing resources, and task information;
(4) obtaining the decision on whether each vehicle offloads from a potential game and, according to the obtained offloading decisions, establishing for the vehicles that decide to offload a deep reinforcement learning model with the goal of reducing system delay;
(5) training the deep reinforcement learning model with DDQN;
(6) in the execution stage, each vehicle n with a computing task deciding through the potential game whether its task is offloaded; each vehicle $n_0$ that decides to offload obtaining its current state $s_{n_0}^t$ from local observation and using the trained deep reinforcement learning model to obtain its offloading node and transmit power;
wherein step (4) comprises the following specific steps:
(4a) obtaining the decision on whether each vehicle offloads from a potential game: the offloading decisions of the task-bearing vehicles are modeled as a potential game $G = (\mathcal{N}, \{a_n\}, \{u_n\})$, where $\mathcal{N}$ is the set of vehicles, $a_n$ is the offloading decision of vehicle $n$, and $u_n$ is the cost function of vehicle $n$; in the game model every vehicle is a resource competitor, so the N vehicles compete for the limited resources in the network, and each vehicle chooses either to offload its computation or to execute the task locally, $a_n \in \{0, 1\}$, where $a_n = 0$ means vehicle $n$ executes the computing task locally, $a_n = 1$ means vehicle $n$ offloads the task to the MEC server or a UAV, and $A = (a_1, \dots, a_N)$ denotes the offloading decisions of all vehicles; given the decision $a_n$ of vehicle $n$, its cost function is $u_n(a_n, a_{-n})$, where $a_{-n}$ denotes the decisions of all vehicles except $n$ and the cost is built from $T_n^{loc}$, the delay when vehicle $n$ computes the task locally, $T_n^{mec}$, the delay when vehicle $n$ offloads the task to the MEC server, and $T_n^{uav}$, the delay when vehicle $n$ offloads the task to a UAV; each vehicle seeks the offloading decision minimizing its own cost, i.e. $a_n^* = \arg\min_{a_n \in \{0,1\}} u_n(a_n, a_{-n})$; the potential game converges to a Nash equilibrium, i.e. best-response iteration finds a decision profile $A^* = (a_1^*, \dots, a_N^*)$ at which no vehicle can reduce its own cost by unilaterally changing its current offloading decision;
(4b) based on the offloading decisions $A^*$, denoting by $\mathcal{N}_0$ the set of vehicles whose decision is to offload and by $N_0$ their number, regarding each vehicle in $\mathcal{N}_0$ as an agent, and defining the state $s$ as the observation and low-dimensional fingerprint information related to the transmit power and the offloading node, comprising the channel state information $h_{n_0,uav[m]}$ from vehicle $n_0$ to each UAV $m \in \mathcal{M}$, the channel state information $h_{n_0,mec}$ from vehicle $n_0$ to the MEC server, the interference $I_{n_0,uav[m]}$ received on each UAV link, the interference $I_{n_0,mec}$ received on the MEC link, the task information $W_{n_0}$, the training episode index $e$, and the random exploration variable $\varepsilon$ of the $\varepsilon$-greedy algorithm, i.e. $s_{n_0}^t = \{h_{n_0,uav}, h_{n_0,mec}, I_{n_0,uav}, I_{n_0,mec}, W_{n_0}, e, \varepsilon\}$; each agent selects an offloading node and transmit power at every step based on its current state $s_{n_0}^t$;
(4c) defining the action of each offloading vehicle $n_0$ as its selected offloading node and transmit power, $a_{n_0}^t = (z_{n_0}, P_{n_0})$, where $z_{n_0} \in \mathcal{Z}_{n_0}$ is the task offloading node selected by vehicle $n_0$, $\mathcal{Z}_{n_0}$ is its set of selectable task offloading nodes, and $P_{n_0}$ is one of its discrete transmit power levels;
(4d) defining the reward function $r$: the goal of offloading is for every vehicle in $\mathcal{N}_0$ to select the offloading node and transmit power that minimize the task processing delay of all offloading vehicles under the maximum transmit power constraint, so the reward function can be expressed as $r = b - \sum_{n_0 \in \mathcal{N}_0} \sum_{z_0} x_{n_0,z_0} T_{n_0,z_0}$, where $b$ is a fixed value used to adjust the magnitude of the reward, $x_{n_0,z_0} = 1$ indicates that the task of vehicle $n_0$ is executed at location $z_0$, $x_{n_0,z_0} = 0$ indicates that it is not, and $T_{n_0,z_0}$ is the delay when the task of vehicle $n_0$ is executed at location $z_0$;
(4e) according to the established state, action, and reward function, building a deep reinforcement learning model on the basis of Q-learning: each offloading vehicle $n_0$ maintains an evaluation function $Q(s_{n_0}^t, a_{n_0}^t)$ representing the discounted reward obtained by performing action $a_{n_0}^t$ from state $s_{n_0}^t$, with the Q-value update function $Q(s_{n_0}^t, a_{n_0}^t) \leftarrow Q(s_{n_0}^t, a_{n_0}^t) + \alpha [ r_t + \gamma \max_{a \in \mathcal{A}} Q(s_{n_0}^{t+1}, a) - Q(s_{n_0}^t, a_{n_0}^t) ]$, where $\alpha$ is the learning rate, $r_t$ is the instant reward function, $\gamma$ is the discount factor, $s_{n_0}^t$ is the observation and fingerprint information vehicle $n_0$ acquires at time $t$, $s_{n_0}^{t+1}$ is the state after vehicle $n_0$ performs $a_{n_0}^t$ at time $t$, and $\mathcal{A}$ is the action space formed by the actions $a_{n_0}^t$.
CN202210744842.2A (priority date 2022-06-28, filing date 2022-06-28) Unmanned aerial vehicle-assisted heterogeneous Internet of vehicles task migration and resource allocation method. Active. Granted as CN115037751B.

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202210744842.2A | 2022-06-28 | 2022-06-28 | Unmanned aerial vehicle-assisted heterogeneous Internet of vehicles task migration and resource allocation method

Publications (2)

Publication Number | Publication Date
CN115037751A | 2022-09-09
CN115037751B | 2023-05-05

Family

ID=83127119

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN202210744842.2A (Active) | Unmanned aerial vehicle-assisted heterogeneous Internet of vehicles task migration and resource allocation method | 2022-06-28 | 2022-06-28

Country Status (1)

Country | Link
CN | CN115037751B


Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111713126A (en) * 2017-11-15 2020-09-25 联想(北京)有限公司 Exchange of UL interference detection related information
US20200186964A1 (en) * 2018-12-07 2020-06-11 T-Mobile Usa, Inc. Uav supported vehicle-to-vehicle communication
CN112543115A (en) * 2020-11-13 2021-03-23 北京科技大学 Unmanned aerial vehicle edge computing network resource allocation method and device based on block chain
CN113543074A (en) * 2021-06-15 2021-10-22 南京航空航天大学 Joint computing migration and resource allocation method based on vehicle-road cloud cooperation
CN114584951A (en) * 2022-03-08 2022-06-03 南京航空航天大学 Combined computing unloading and resource allocation method based on multi-agent DDQN
CN114626298A (en) * 2022-03-14 2022-06-14 北京邮电大学 State updating method for efficient caching and task unloading in unmanned aerial vehicle-assisted Internet of vehicles
CN114650567A (en) * 2022-03-17 2022-06-21 江苏科技大学 Unmanned aerial vehicle-assisted V2I network task unloading method

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
M.-A. Messous; H. Sedjelmaci; N. Houari; S. Senouci: "Computation offloading game for an UAV network in mobile edge computing", 2017 IEEE International Conference on Communications (ICC)
A. Khalili; S. Zarandi; M. Rasti: "Joint Resource Allocation and Offloading Decision in Mobile Edge Computing", IEEE Communications Letters
Y. Ren; Z. Xie; Z. Ding; X. Sun; J. Xia; Y. Tian: "Computation offloading game in multiple unmanned aerial vehicle-enabled mobile edge computing networks", IET Communications
Song Xiaoqin; Tan Yazhu; Dong Li; Wang Jiankang; Hu Jing; Song Tiecheng: "Energy-efficiency-first resource allocation algorithm for cognitive vehicular networks based on the quasi-Newton interior point method", Journal of Southeast University (Natural Science Edition)
Dong Chao; Shen Yun; Qu Yuben: "A survey of edge intelligent computing based on unmanned aerial vehicles", Chinese Journal of Intelligent Science and Technology

Also Published As

Publication Number | Publication Date
CN115037751B | 2023-05-05


Legal Events

Date | Code | Title | Description
PB01 | Publication
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant