CN115037751B - Unmanned aerial vehicle-assisted heterogeneous Internet of vehicles task migration and resource allocation method - Google Patents
- Publication number: CN115037751B (application number CN202210744842.2A)
- Authority
- CN
- China
- Legal status: Active
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/14—Network analysis or design
- H04L41/145—Network analysis or design involving simulating, designing, planning or modelling of a network
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
- H04L67/104—Peer-to-peer [P2P] networks
- H04L67/1074—Peer-to-peer [P2P] networks for supporting data block transmission mechanisms
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/12—Protocols specially adapted for proprietary or special-purpose networking environments, e.g. medical networks, sensor networks, networks in vehicles or remote metering networks
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W4/00—Services specially adapted for wireless communication networks; Facilities therefor
- H04W4/30—Services specially adapted for particular environments, situations or purposes
- H04W4/40—Services specially adapted for particular environments, situations or purposes for vehicles, e.g. vehicle-to-pedestrians [V2P]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W4/00—Services specially adapted for wireless communication networks; Facilities therefor
- H04W4/30—Services specially adapted for particular environments, situations or purposes
- H04W4/40—Services specially adapted for particular environments, situations or purposes for vehicles, e.g. vehicle-to-pedestrians [V2P]
- H04W4/46—Services specially adapted for particular environments, situations or purposes for vehicles, e.g. vehicle-to-pedestrians [V2P] for vehicle-to-vehicle communication [V2V]
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Abstract
The invention provides an unmanned aerial vehicle (UAV)-assisted heterogeneous Internet of vehicles task migration and resource allocation method for the scenario in which a mobile edge computing (MEC) server and a UAV cooperatively perform computation offloading. First, the decision of whether each vehicle offloads is obtained through a potential game, i.e. each vehicle decides whether to compute locally or to offload to the MEC server or a UAV. For the vehicles that decide to offload, a distributed resource allocation method is adopted that does not require the base station to centrally schedule channel state information: each offloading vehicle is regarded as an agent and, through DDQN training of a deep reinforcement learning model, selects its offloading node and transmit power based on locally observed state information. The algorithm minimizes the system delay under the maximum transmit power constraint and achieves a good balance between complexity and performance.
Description
Technical Field
The invention relates to Internet of vehicles technology, and in particular to an unmanned aerial vehicle-assisted task migration and resource allocation method for a heterogeneous Internet of vehicles.
Background
With the development of the Internet of vehicles, various vehicle applications such as route planning, autonomous driving and infotainment are emerging. These applications improve travel safety and provide in-journey entertainment and connectivity. However, most of them are delay-sensitive and resource-intensive, characterized by complex computation and high energy requirements. Many current vehicles have limited storage capacity and insufficient computing resources to meet the strict delay constraints of these applications. Mobile edge computing (MEC) can provide low-latency computing services for vehicles by deploying computing and storage resources at the network edge; in a typical Internet-of-vehicles scenario, MEC servers deployed on roadside units provide computing services for vehicles. In intelligent road construction, unmanned aerial vehicles (UAVs) are used for road patrol, bridge inspection and road damage inspection. While a UAV patrols an area, its considerable computing power can also serve as an MEC server.
At present, UAV-assisted mobile edge computing is still at an early stage, and only a few studies have examined this field in detail. Existing research mainly optimizes the computation offloading strategy, but does not fully consider cooperation among heterogeneous MEC servers or communication resource allocation under time-varying channels.
Therefore, the invention provides an unmanned aerial vehicle-assisted heterogeneous Internet of vehicles task migration and resource allocation method that targets the scenario of cooperative computation offloading between a mobile edge server and a UAV, takes system delay minimization as the optimization objective of task migration and resource allocation, and achieves a good balance between complexity and performance.
Disclosure of Invention
The invention aims to address the problems in the prior art by providing an unmanned aerial vehicle-assisted heterogeneous Internet of vehicles task migration and resource allocation method in which UAVs provide computing resources for vehicles. The method transmits using a hybrid spectrum access technique so as to minimize the system delay.
The technical scheme is as follows. For the scenario of cooperative computation offloading between the mobile edge server and the UAV, the goal of minimizing the system delay is achieved through reasonable and efficient computation offloading decisions and resource allocation. To reduce the system delay and improve spectrum utilization, a hybrid spectrum access technique is adopted for transmission: a vehicle offloads tasks to the MEC server on the road side unit through a vehicle-to-infrastructure (V2I) link, offloads tasks to a UAV through a vehicle-to-vehicle (V2V) link, and the V2I and V2V links access different slices through the 5G slicing technique without mutual interference.

First, the decision of whether a vehicle offloads is obtained through a potential game, i.e. the vehicle decides to compute locally or to offload to the MEC server or a UAV. For the vehicles that decide to offload, a distributed resource allocation method is adopted that does not require the base station to centrally schedule channel state information: each offloading vehicle is regarded as an agent and selects its transmit power based on locally observed state information. A deep reinforcement learning model is established and optimized using a double deep Q-learning network (DDQN); the offloading node and transmit power of each offloading vehicle are then obtained from the optimized DDQN model.

The invention is realized by the following technical scheme: an unmanned aerial vehicle-assisted heterogeneous Internet of vehicles task migration and resource allocation method comprising the following steps:
(1) A mobile edge computing (MEC) server is deployed in the road side unit (RSU), and the system deploys unmanned aerial vehicles (UAVs) to provide computing services for vehicles; a vehicle's computation task can be processed locally or offloaded to a UAV or the MEC server;
(2) Establishing a communication model and a calculation model comprising N vehicles and M unmanned aerial vehicles, and further establishing a joint calculation migration and resource allocation model;
(3) Each vehicle acquires the positions of the UAVs and the MEC server, the occupation of computing resources, and the task information;
(4) Based on the potential game, the decision of whether each vehicle offloads is obtained; according to the obtained offloading decisions, a deep reinforcement learning model is established for the vehicles that decide to offload, with the aim of reducing the system delay;
(5) Training a deep reinforcement learning model based on the DDQN;
(6) In the execution stage, a vehicle n with a computation task judges whether to offload through the potential game; each vehicle n_0 that decides to offload obtains the current state \(s_t\) from local observations and obtains its offloading node and transmit power using the trained deep reinforcement learning model.
further, the step (2) includes the following specific steps:
(2a) A communication model for Internet-of-vehicles computation offloading is established. The system comprises N vehicles, M unmanned aerial vehicles and a road side unit on which the MEC server is deployed. The set of vehicles is denoted by \(\mathcal{N}=\{1,2,\dots,N\}\) and the set of UAVs by \(\mathcal{M}=\{1,2,\dots,M\}\). The task of vehicle n is represented as \(T_n=(c_n, s_n, \tau_n^{\max})\), where c_n denotes the number of CPU cycles required for vehicle n to complete the task, s_n denotes the size of the task data offloaded by vehicle n, and \(\tau_n^{\max}\) denotes the maximum tolerable delay for executing the task of vehicle n. In each time slot the vehicle generates one task. The offloading decision of the task of vehicle n is denoted by \(a_n \in \{0,1,2\}\): a_n = 0 means that vehicle n executes the computation task locally, a_n = 1 means that vehicle n offloads the task to the MEC server over the V2I link, and a_n = 2 means that vehicle n offloads the task to a UAV over the V2V link. Owing to the 5G slicing technique, the V2V and V2I communications do not interfere with each other.

The set \(\mathcal{Z}=\{loc, uav[1], \dots, uav[M], mec\}\) denotes the candidate task computation locations, where loc, uav[m] and mec respectively denote executing the computation task locally, offloading it to the m-th UAV, and offloading it to the MEC server. The offloading location is indicated by the binary variable \(x_{n,z} \in \{0,1\}\): \(x_{n,z} = 1\) means that the computation task of vehicle n is executed at location z, and \(x_{n,z} = 0\) means that it is not.
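As a purely illustrative sketch (not part of the claimed method), the task triple \(T_n=(c_n, s_n, \tau_n^{\max})\) and the offloading decision \(a_n\) can be represented as follows; all identifiers and numeric values are hypothetical.

```python
from dataclasses import dataclass
from enum import IntEnum

class Offload(IntEnum):
    """Offloading decision a_n of vehicle n."""
    LOCAL = 0  # a_n = 0: execute the task locally
    MEC = 1    # a_n = 1: offload to the MEC server over V2I
    UAV = 2    # a_n = 2: offload to a UAV over V2V

@dataclass
class Task:
    """Task T_n = (c_n, s_n, tau_n^max) of vehicle n."""
    cycles: float     # c_n: CPU cycles needed to complete the task
    data_bits: float  # s_n: size of the offloaded task data
    max_delay: float  # tau_n^max: maximum tolerable delay (seconds)

task = Task(cycles=1e9, data_bits=2e6, max_delay=0.5)
decision = Offload.MEC
```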
(2b) The signal-to-interference-plus-noise ratio (SINR) of vehicle n offloading its task to UAV m is:

\[ \gamma_{n,uav[m]} = \frac{P_n h_{n,uav[m]}}{\sigma^2 + I_{n,uav[m]}} \]

The transmission rate at which vehicle n offloads the task to UAV m is expressed as:

\[ R_{n,uav[m]} = B_{uav} \log_2\bigl(1+\gamma_{n,uav[m]}\bigr) \]

where \(B_{uav}\) denotes the transmission bandwidth for offloading tasks to a UAV, P_n denotes the transmit power of vehicle n, \(\sigma^2\) denotes the noise power, \(h_{n,uav[m]}\) denotes the channel gain from vehicle n to UAV m, and \(I_{n,uav[m]}\) denotes the interference caused to vehicle n by vehicles other than n that offload tasks to UAV m:

\[ I_{n,uav[m]} = \sum_{n' \neq n} J(a_{n'}=2)\, P_{n'} h_{n',uav[m]} \]

where J(a_{n'}=2) = 1 when a_{n'} = 2 and J(a_{n'}=2) = 0 otherwise, \(P_{n'}\) denotes the transmit power of vehicle n', and \(h_{n',uav[m]}\) denotes the channel gain from vehicle n' to UAV m;
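Under the formulas above, the vehicle-to-UAV interference, SINR and rate reduce to a few arithmetic operations; the following sketch uses hypothetical numeric values in linear units.

```python
import math

def uav_interference(other_vehicles):
    """I_{n,uav[m]}: sum of P_{n'} * h_{n',uav[m]} over vehicles n' != n
    whose offloading decision is a_{n'} = 2 (offload to a UAV)."""
    return sum(p * h for (a, p, h) in other_vehicles if a == 2)

def uav_sinr(p_n, h_n_uav, noise_power, interference):
    """gamma_{n,uav[m]} = P_n * h_{n,uav[m]} / (sigma^2 + I_{n,uav[m]})."""
    return p_n * h_n_uav / (noise_power + interference)

def uav_rate(bandwidth_hz, sinr):
    """R_{n,uav[m]} = B_uav * log2(1 + gamma), in bit/s."""
    return bandwidth_hz * math.log2(1.0 + sinr)

# Only the vehicle with a = 2 contributes interference on the UAV slice;
# the a = 1 vehicle transmits on the V2I slice and is excluded.
i = uav_interference([(2, 0.1, 1e-6), (1, 0.1, 1e-6)])
rate = uav_rate(1e6, uav_sinr(0.2, 1e-6, 1e-9, i))
```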
(2c) Likewise, the SINR of vehicle n offloading its task to the MEC server is expressed as:

\[ \gamma_{n,mec} = \frac{P_n h_{n,mec}}{\sigma^2 + I_{n,mec}} \]

The transmission rate at which vehicle n offloads the task to the MEC server is expressed as:

\[ R_{n,mec} = B_{mec} \log_2\bigl(1+\gamma_{n,mec}\bigr) \]

where \(B_{mec}\) denotes the transmission bandwidth for offloading tasks to the MEC server, P_n denotes the transmit power of vehicle n, \(\sigma^2\) denotes the noise power, \(h_{n,mec}\) denotes the channel gain from vehicle n to the MEC server, and \(I_{n,mec}\) denotes the interference caused to vehicle n by vehicles other than n that offload tasks to the MEC server:

\[ I_{n,mec} = \sum_{n' \neq n} J(a_{n'}=1)\, P_{n'} h_{n',mec} \]

where J(a_{n'}=1) = 1 when a_{n'} = 1 and J(a_{n'}=1) = 0 otherwise, \(P_{n'}\) denotes the transmit power of vehicle n', and \(h_{n',mec}\) denotes the channel gain from vehicle n' to the MEC server;
(2d) The computation model is established. When a_n = 0, vehicle n executes the computation task locally; with \(f_n^{loc}\) denoting the local computing capability of vehicle n, the local computation delay is:

\[ T_n^{loc} = \frac{c_n}{f_n^{loc}} \]

When a_n = 1, vehicle n offloads the task to the MEC server over the V2I link; the upload delay of vehicle n uploading the task to the MEC server is:

\[ T_{n,mec}^{up} = \frac{s_n}{R_{n,mec}} \]

and the computation delay of the task of vehicle n at the MEC server is:

\[ T_{n,mec}^{exe} = \frac{c_n}{f_{n,mec}} \]

where \(f_{n,mec}\) is the computing power allocated by the MEC server to the task of vehicle n. When a_n = 2, vehicle n offloads the task to a UAV over the V2V link; the upload delay of vehicle n uploading the task to UAV m is:

\[ T_{n,uav[m]}^{up} = \frac{s_n}{R_{n,uav[m]}} \]

and the computation delay of the task of vehicle n at UAV m is:

\[ T_{n,uav[m]}^{exe} = \frac{c_n}{f_{n,uav[m]}} \]

where \(f_{n,uav[m]}\) is the computing power allocated by UAV m to the task of vehicle n. Since the computation result is small, the delay of delivering it back is ignored, so the delay of vehicle n offloading the task to the MEC server is:

\[ T_n^{mec} = T_{n,mec}^{up} + T_{n,mec}^{exe} \]

and the delay of vehicle n offloading the task to UAV m is:

\[ T_n^{uav[m]} = T_{n,uav[m]}^{up} + T_{n,uav[m]}^{exe} \]

The overall delay, covering local computation, offloading to the MEC server and offloading to a UAV, can be expressed as:

\[ T_n = x_{n,loc} T_n^{loc} + x_{n,mec} T_n^{mec} + \sum_{m=1}^{M} x_{n,uav[m]} T_n^{uav[m]} \]
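The delay expressions above reduce to simple ratios of task size to rate and cycles to computing power; a minimal sketch with hypothetical numbers:

```python
def local_delay(cycles, f_local):
    """T_n^loc = c_n / f_n^loc."""
    return cycles / f_local

def offload_delay(data_bits, rate_bps, cycles, f_alloc):
    """Upload delay s_n / R plus remote computation delay c_n / f;
    the download delay of the small result is ignored."""
    return data_bits / rate_bps + cycles / f_alloc

t_loc = local_delay(1e9, 1e9)                # 1e9 cycles on a 1 GHz CPU
t_mec = offload_delay(2e6, 1e7, 1e9, 1e10)   # 0.2 s upload + 0.1 s compute
```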
(2e) In summary, the following objective function and constraints can be established:

\[ \min_{\{a_n\},\,\{f_n^{loc}\},\,\{f_{n,mec}\},\,\{f_{n,uav[m]}\},\,\{P_n\}} \ \sum_{n=1}^{N} T_n \]

subject to
C1: \(x_{n,z} \in \{0,1\}\); C2: \(\sum_{z \in \mathcal{Z}} x_{n,z} = 1\); C3: \(0 \le f_n^{loc} \le f_n^{loc,\max}\); C4: \(f_{n,mec} \ge 0\); C5: \(f_{n,uav[m]} \ge 0\); C6: \(\sum_{n} x_{n,mec} f_{n,mec} \le F_{mec}\); C7: \(\sum_{n} x_{n,uav[m]} f_{n,uav[m]} \le F_{uav[m]}\); C8: \(x_{n,mec} T_n^{mec} \le \tau_n^{\max}\); C9: \(x_{n,uav[m]} T_n^{uav[m]} \le \tau_n^{\max}\); C10: \(0 \le P_n \le P_n^{\max}\).
wherein the constraint conditions C1 and C2 indicate that the tasks can only be executed locally, offloaded to MEC server calculation or offloaded to unmanned aerial vehicle calculation, each calculation task can only select one calculation mode, the constraint condition C3 indicates the local calculation capability range of the vehicle n,is the local maximum computing power of the vehicle n, the constraints C4 and C5 mean that the computing power of the MEC server and the unmanned aerial vehicle allocated to the vehicle is not negative, the constraints C6 and C7 mean that the computing power of the MEC server and the unmanned aerial vehicle allocated to the vehicle cannot exceed the maximum computing power thereof, F mec Is the maximum computing power of MEC server, F uav[m] Is the maximum computing power of the unmanned aerial vehicle m; constraint conditions C8 and C9 indicate that the task of the vehicle n is unloaded to the MEC server or the unmanned aerial vehicle to execute calculation so as to meet the maximum time delay constraint; the C10 constraint indicates that the vehicle n transmit power is non-negative and meets its maximum transmit power constraint;
further, the step (4) comprises the following specific steps:
(4a) The decision of whether each vehicle offloads is obtained through a potential game. The offloading decisions of the task vehicles are modeled as a potential game, expressed as \(G=\{\mathcal{N}, \{A_n\}, \{u_n\}\}\), where \(\mathcal{N}\) is the set of vehicles, \(A_n\) is the set of offloading decisions of vehicle n, and \(u_n\) is the cost function of vehicle n.

In the game model each vehicle is a resource competitor, so N vehicles compete for the limited resources in the network; each vehicle can choose to offload the computation or to execute the task locally. Here \(a_n \in \{0,1\}\) is the offloading decision of vehicle n and \(\mathbf{a}=(a_1,\dots,a_N)\) is the set of offloading decisions of all vehicles: a_n = 0 means that vehicle n executes the computation task locally, and a_n = 1 means that vehicle n offloads the task to the MEC server or a UAV for computation. When the offloading decision of vehicle n is a_n, its cost function is expressed as \(u_n(a_n, a_{-n})\), where \(a_{-n}\) denotes the set of offloading decisions of all vehicles except vehicle n. Each vehicle seeks to minimize its own cost by finding the optimal offloading decision, i.e. \(a_n^{*} = \arg\min_{a_n \in A_n} u_n(a_n, a_{-n})\).

The potential game converges to a Nash equilibrium, i.e. through best-response iteration an offloading decision profile \(\mathbf{a}^{*}\) is found such that no vehicle can reduce its own cost by unilaterally changing its current offloading decision.
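The best-response iteration that drives the potential game to a Nash equilibrium can be sketched as follows. The congestion-style cost function is a hypothetical stand-in for the patent's \(u_n\): local execution has a fixed cost, while the offloading cost grows with the number of vehicles sharing the edge resources.

```python
def best_response_iteration(n_vehicles, cost, max_iter=100):
    """Repeatedly let each vehicle pick its cost-minimizing decision
    (0 = local, 1 = offload) given the others; stop when no vehicle
    wants to deviate, i.e. a Nash equilibrium is reached."""
    a = [0] * n_vehicles
    for _ in range(max_iter):
        changed = False
        for n in range(n_vehicles):
            best = min((0, 1), key=lambda d: cost(n, d, a))
            if best != a[n]:
                a[n] = best
                changed = True
        if not changed:
            return a
    return a

def cost(n, d, a):
    """Hypothetical congestion cost: offloading cost 0.2 * k, where k
    counts the offloaders (including vehicle n itself)."""
    if d == 0:
        return 1.0
    k = sum(1 for i, ai in enumerate(a) if i != n and ai == 1) + 1
    return 0.2 * k

eq = best_response_iteration(6, cost)
```

With six vehicles, offloading stays profitable only until the shared cost reaches the local cost, so the iteration settles with four offloaders and two local computers.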
(4b) Based on the offloading decisions, let the set \(\mathcal{N}_0 \subseteq \mathcal{N}\) denote the vehicles whose decision is to offload, and let \(N_0\) denote the number of vehicles in \(\mathcal{N}_0\). The state s is defined as the observed information related to the transmit power and offloading node together with low-dimensional fingerprint information, including the channel state information \(G_{uav}\) from vehicle n_0 to the UAVs, the channel state information \(G_{mec}\) from vehicle n_0 to the MEC server, the interference \(I_{uav}\) received by vehicle n_0 on the links to the UAVs, the interference \(I_{mec}\) received by vehicle n_0 on the link to the MEC server, the task information \(T_{n_0}\) of vehicle n_0, the training episode number e and the random exploration variable \(\varepsilon\) of the \(\varepsilon\)-greedy algorithm, i.e.

\[ s_t = \{G_{uav}, G_{mec}, I_{uav}, I_{mec}, T_{n_0}, e, \varepsilon\} \]

Each vehicle in \(\mathcal{N}_0\) is regarded as an agent; at each step the vehicle selects an offloading node and a transmit power based on the current state \(s_t\);

(4c) The action of each offloading vehicle n_0 is defined as the selected offloading node and transmit power, denoted as \(a_t = \{z_{n_0}, P_{n_0}\}\), where \(z_{n_0} \in \{uav[1], \dots, uav[M], mec\}\) is the task offloading node selected by vehicle n_0 and \(P_{n_0}\) is the discrete transmit power level chosen by vehicle n_0;
(4d) The reward function r is defined. Since the objective is to minimize the task processing delay of all vehicles whose decision is to offload, the reward function can be expressed as:

\[ r_t = b - \sum_{n_0 \in \mathcal{N}_0} T_{n_0} \]

where b is a fixed value used to adjust the magnitude of the reward function.
(4e) According to the established state, action and reward function, a Q-learning-based deep reinforcement learning model is built. Each offloading vehicle n_0 maintains an evaluation function \(Q_{n_0}(s_t, a_t)\) denoting the discounted reward obtained by offloading vehicle n_0 after executing action \(a_t\) in state \(s_t\). The Q-value update function is:

\[ Q_{n_0}(s_t, a_t) \leftarrow Q_{n_0}(s_t, a_t) + \alpha \bigl[ r_t + \gamma \max_{a' \in \mathcal{A}} Q_{n_0}(s_{t+1}, a') - Q_{n_0}(s_t, a_t) \bigr] \]

where \(r_t\) is the instantaneous reward, \(\alpha\) is the learning rate, \(\gamma\) is the discount factor, \(s_t\) is the observed information and low-dimensional fingerprint information acquired by offloading vehicle n_0 at time t, \(s_{t+1}\) is the state of offloading vehicle n_0 after executing action \(a_t\), and \(\mathcal{A}\) is the action space.
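The Q-value update above is the standard temporal-difference rule; a tabular sketch follows, where \(\alpha\), \(\gamma\) and the toy transition values are purely illustrative.

```python
def q_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    """One Q-learning step:
    Q(s,a) <- Q(s,a) + alpha * [r + gamma * max_a' Q(s',a') - Q(s,a)]."""
    best_next = max(Q.get((s_next, a2), 0.0) for a2 in actions)
    old = Q.get((s, a), 0.0)
    Q[(s, a)] = old + alpha * (r + gamma * best_next - old)
    return Q[(s, a)]

Q = {}  # empty table: all Q-values start at 0
v = q_update(Q, s=0, a=1, r=1.0, s_next=1, actions=[0, 1])
```

Starting from an all-zero table, the update moves Q(0, 1) a fraction alpha of the way toward the reward, yielding 0.1.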
Further, the step (5) comprises the following specific steps:
(5a) Start the environment simulator and initialize, for each agent, the prediction network parameters \(\theta\) and the target network parameters \(\theta^{-}\);

(5b) Initialize the training episode index p;

(5c) Update the vehicle and UAV positions, acquire the occupation of UAV and MEC computing resources, the task information, etc., and initialize the time step t within episode p;

(5d) Each agent asynchronously runs its prediction network: based on the input state \(s_t\) it outputs an action \(a_t\), obtains the instantaneous reward \(r_t\) and moves to the next state \(s_{t+1}\), thereby obtaining the training sample \((s_t, a_t, r_t, s_{t+1})\);

(5e) Each agent stores the training sample in its experience replay pool;

(5f) Each agent randomly samples \(N_k\) training samples from the experience replay pool to form the data set D, which is input to the prediction network;

(5g) Each agent computes the loss value Loss(n_0) and updates the parameters of its prediction network through neural-network back-propagation using a mini-batch gradient descent strategy;

(5h) When the number of training steps reaches the target-network update interval, the target network parameters \(\theta^{-}\) are updated from the prediction network parameters \(\theta\);

(5i) Judge whether t < K holds, where K is the total number of time steps in episode p; if so, set t = t + 1 and return to step (5c); otherwise go to step (5j);

(5j) Judge whether p < I holds, where I is the set threshold on the number of training episodes; if so, set p = p + 1 and return to step (5c); otherwise the optimization ends and the optimized deep reinforcement learning model is obtained;
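The defining feature of the DDQN used in the training steps above is that the prediction (online) network selects the next action while the target network evaluates it, which reduces the overestimation of standard Q-learning. A minimal sketch follows, with toy Q-tables standing in for the two networks (all values hypothetical).

```python
def ddqn_target(q_online, q_target, s_next, r, gamma=0.9, done=False):
    """Double-DQN target:
    y = r + gamma * Q_target(s', argmax_a Q_online(s', a))."""
    if done:
        return r
    # Online (prediction) network selects the greedy next action...
    a_star = max(range(len(q_online[s_next])), key=lambda a: q_online[s_next][a])
    # ...and the target network evaluates it.
    return r + gamma * q_target[s_next][a_star]

q_online = {1: [0.2, 0.8]}  # prediction net prefers action 1 in state 1
q_target = {1: [0.5, 0.4]}  # target net evaluates that action as 0.4
y = ddqn_target(q_online, q_target, s_next=1, r=1.0)
```

Note that a vanilla DQN would instead use max(q_target[1]) = 0.5, illustrating how decoupling action selection from evaluation lowers the bootstrapped target.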
further, the step (6) comprises the following specific steps:
(6a) According to the UAV positions, the occupation of MEC computing resources and the task information, the offloading decision of each vehicle is obtained through the potential game; each vehicle n_0 whose decision is not local computation acquires its state information \(s_t\) at that time;

(6b) Each vehicle n_0 whose decision is not local computation inputs the state information \(s_t\) into the trained deep reinforcement learning model;

(6c) The optimal action strategy is output, i.e. each vehicle n_0 whose decision is not local computation obtains its optimal offloading node \(z_{n_0}\) and transmit power \(P_{n_0}\).
The beneficial effects are as follows: the invention provides an unmanned aerial vehicle-assisted heterogeneous Internet of vehicles task migration and resource allocation method. For the scenario of cooperative computation offloading between the mobile edge server and the UAV, a hybrid spectrum access technique is adopted for transmission, with the V2V and V2I links accessing different slices through the 5G slicing technique without mutual interference. The decision of whether a vehicle computes locally or offloads is obtained through a potential game, and the offloading nodes and transmit powers of the offloading vehicles are optimized by deep double-Q learning, realizing task computation with minimized system delay. The algorithm combining the potential game and deep double-Q learning effectively solves the joint optimization problem of vehicle offloading decisions and transmit powers, achieving a good balance between complexity and performance.
In summary, in the scenario of cooperative computation offloading between the mobile edge server and the UAV, the unmanned aerial vehicle-assisted heterogeneous Internet of vehicles task migration and resource allocation method provided by the invention performs well in minimizing the system delay.
Drawings
Fig. 1 is a flowchart of an unmanned aerial vehicle assisted heterogeneous internet of vehicles task migration and resource allocation method provided by an embodiment of the present invention;
fig. 2 is a schematic diagram of a system for unmanned aerial vehicle assisted task migration and resource allocation of heterogeneous internet of vehicles according to an embodiment of the present invention;
fig. 3 is a schematic diagram of a DDQN algorithm framework provided by an embodiment of the present invention;
Detailed Description
The core idea of the invention is as follows: for the scenario of cooperative computation offloading between the mobile edge server and the UAV, a hybrid spectrum access technique is adopted for transmission, with the V2V and V2I links accessing different slices through the 5G slicing technique without mutual interference. The decision of whether each vehicle offloads is obtained through a potential game; each offloading vehicle is regarded as an agent, a deep reinforcement learning model is established, and the model is optimized by deep double-Q learning. The optimal offloading node and transmit power of each offloading vehicle are then obtained from the optimized deep reinforcement learning model, so as to achieve the goal of minimizing the system delay.
The present invention is described in further detail below.
In step (1), a mobile edge computing (MEC) server is deployed in the road side unit (RSU), and the system deploys unmanned aerial vehicles to provide computing services for vehicles; a vehicle's computation task can be processed locally or offloaded to a UAV or the MEC server;
step (2), establishing a communication model and a calculation model comprising N vehicles and M unmanned aerial vehicles, and further establishing a joint calculation migration and resource allocation model, wherein the method specifically comprises the following steps:
Steps (2a) to (2e) are as set forth in the correspondingly numbered steps above.
In step (3), each vehicle acquires the positions of the UAVs and the MEC server, the occupation of computing resources, and the task information;
step (4), obtaining the decision of whether each vehicle offloads based on the potential game, and, according to the obtained vehicle offloading decisions, establishing a deep reinforcement learning model for the vehicles determined to offload their tasks with the aim of reducing the system delay, which specifically comprises the following steps:
(4a) Obtain the decision of whether each vehicle offloads based on a potential game. The offloading decision problem of the task vehicles is modeled as a potential game, expressed as G = {N, {a_n}, {u_n}}, where N is the set of vehicles, a_n is the offloading decision of vehicle n, and u_n is the cost function of vehicle n.
In the game model each vehicle is a resource competitor, so N vehicles compete for the limited resources in the network. Each vehicle can choose to offload its computation or to compute the task locally: a_n ∈ {0,1} is the offloading decision of vehicle n, a = {a_1, ..., a_N} denotes the set of offloading decisions of all vehicles, a_n = 0 means that vehicle n executes the computing task locally, and a_n = 1 means that vehicle n offloads the task to the MEC server or a UAV for computation. When the offloading decision of vehicle n is a_n, its cost function is expressed as u_n(a_n, a_-n), where a_-n denotes the set of offloading decisions of all vehicles except vehicle n. Each vehicle seeks to minimize its own cost by finding the optimal offloading decision, i.e.
The potential game converges to a Nash equilibrium; that is, best-response iteration finds an offloading decision profile at which no vehicle can reduce its own cost by unilaterally changing its current offloading decision.
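A minimal sketch of such a best-response iteration follows. The cost model here is a stand-in assumption (offloading cost grows linearly with the number of concurrent offloaders, which gives the congestion-game structure that guarantees convergence); the patent's actual costs are the delay terms above.

```python
def best_response_iteration(n_vehicles, t_local, base_offload, congestion):
    """Iterate best responses for the binary offloading game until no vehicle
    can reduce its cost by unilaterally switching, i.e. a Nash equilibrium."""
    a = [0] * n_vehicles              # start with every vehicle computing locally
    changed = True
    while changed:
        changed = False
        for n in range(n_vehicles):
            others = sum(a) - a[n]                    # offloaders besides vehicle n
            cost_off = base_offload[n] + congestion * others
            best = 1 if cost_off < t_local[n] else 0  # best response of vehicle n
            if best != a[n]:
                a[n], changed = best, True
    return a

# Three vehicles: two with slow CPUs (high local delay) and one fast one.
decisions = best_response_iteration(
    3, t_local=[1.0, 1.0, 0.2], base_offload=[0.3, 0.3, 0.3], congestion=0.5)
```

Because the assumed game is a congestion game, it admits an exact potential function, so the improvement path is finite and the loop always terminates.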
(4b) Based on the offloading decisions, consider the vehicles whose offloading decision is a_n = 1, with N_0 denoting the number of such vehicles. Define the state s as the observed information related to the transmit power and the offloading nodes, together with low-dimensional fingerprint information, including: the channel state information from vehicle n_0 to the UAV, the channel state information from vehicle n_0 to the MEC server, the interference received on the link from vehicle n_0 to the UAV, the interference received on the link from vehicle n_0 to the MEC server, the task information of vehicle n_0, the training round number e, and the random exploration variable epsilon of the epsilon-greedy algorithm, i.e.
Each such vehicle is regarded as an agent; at every time step the vehicle selects an offloading node and a transmit power based on the current state s_t;
(4c) Define the action of each vehicle n_0 that decides to offload as its selected offloading node and transmit power, i.e., the pair consisting of the task offloading node selected by vehicle n_0 and the discrete transmit power level of vehicle n_0;
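The joint action space of (offloading node, power level) and the epsilon-greedy selection used during training can be sketched as follows; the node labels and the power grid are assumptions for illustration, not from the patent.

```python
import random

def epsilon_greedy(q_values, actions, epsilon):
    """Pick a random (node, power) pair with probability epsilon,
    otherwise the pair with the highest Q-value."""
    if random.random() < epsilon:
        return random.choice(actions)
    return max(actions, key=lambda act: q_values[act])

# Assumed action space: 2 offloading nodes (MEC=0, UAV=1) x 3 power levels.
actions = [(node, p) for node in (0, 1) for p in (0, 1, 2)]
q_values = {act: 0.0 for act in actions}
q_values[(1, 2)] = 1.0                     # pretend this pair currently looks best
a = epsilon_greedy(q_values, actions, epsilon=0.0)   # epsilon=0: purely greedy
```

During training, epsilon would start near 1 and decay, matching the random exploration variable in the state definition of (4b).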
(4d) Define the reward function r. The goal is that the vehicles with offloading decision a_n = 1 select offloading nodes and transmit powers that, under the maximum transmit power constraint, minimize the task processing delay of all offloading vehicles; the reward function can therefore be expressed as:
where b is a fixed value used to adjust the magnitude of the reward function.
(4e) According to the defined state, action, and reward function, establish a deep reinforcement learning model based on Q-learning. Each vehicle n_0 that decides to offload maintains a corresponding evaluation function Q(s, a), which represents the discounted reward obtained after the offloading vehicle n_0 executes action a in state s. The Q-value update function is:
where r_t is the instantaneous reward function, gamma is the discount factor, s_t is the observation and low-dimensional fingerprint information on the transmit power and offloading node acquired by the offloading vehicle n_0 at time t, s_{t+1} is the state reached after the offloading vehicle n_0 executes action a_t at time t, and A is the action space formed by the actions.
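The update in (4e), combined with the DDQN structure of step (5) — the action chosen by the prediction network but evaluated by the target network — can be sketched as below. Plain Q-tables stand in for the two neural networks; all values are illustrative assumptions.

```python
def ddqn_target(q_pred, q_target, r_t, s_next, actions, gamma):
    """Double-DQN target: argmax taken from the prediction network, value read
    from the target network, which reduces Q-value overestimation."""
    a_star = max(actions, key=lambda act: q_pred[(s_next, act)])
    return r_t + gamma * q_target[(s_next, a_star)]

def q_update(q_pred, q_target, s, a, r, s_next, actions, gamma, lr):
    """One temporal-difference step toward the double-DQN target."""
    y = ddqn_target(q_pred, q_target, r, s_next, actions, gamma)
    q_pred[(s, a)] += lr * (y - q_pred[(s, a)])

actions = [0, 1]
q_pred = {(s, act): 0.0 for s in (0, 1) for act in actions}
q_target = dict(q_pred)
q_pred[(1, 1)] = 0.5        # prediction net prefers action 1 in the next state
q_update(q_pred, q_target, s=0, a=0, r=1.0, s_next=1,
         actions=actions, gamma=0.9, lr=0.1)
```

Note that the target value still comes from `q_target`, so the bootstrap is decoupled from the argmax, which is the defining property of DDQN.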
step (5), training the deep reinforcement learning model based on DDQN, which specifically comprises the following steps:
(5a) Start the environment simulator and initialize the prediction network parameters and target network parameters of each agent;
(5b) Initialize the training round counter p;
(5c) Update the vehicle positions and the UAV positions, acquire the occupation of UAV and MEC computing resources together with the task information, and initialize the time step t within round p;
(5d) Each agent asynchronously runs its prediction network: based on the input state s_t it outputs the action a_t, obtains the instantaneous reward r_t, and moves to the next state s_{t+1}, thereby obtaining the training tuple (s_t, a_t, r_t, s_{t+1});
(5e) Store the training tuple in the experience replay pool;
(5f) Each agent randomly samples N_k training tuples from the experience replay pool to form the data set D and inputs it into the prediction network;
(5g) Each agent computes the loss value Loss(n_0) and updates the parameters of its prediction network through neural-network backpropagation with a mini-batch gradient descent strategy;
(5h) When the number of training steps reaches the target-network update interval, update the target network parameters from the prediction network parameters;
(5i) Judge whether t < K holds, where K is the total number of time steps in round p; if so, set t = t + 1 and return to step (5d); otherwise go to step (5j);
(5j) Judge whether p < I holds, where I is the set threshold of training rounds; if so, set p = p + 1 and return to step (5c); otherwise the optimization is finished and the optimized deep reinforcement learning model is obtained;
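Steps (5a)–(5j) can be summarized in a training-loop skeleton. The environment and agent objects, their method names, and all hyperparameters are placeholders assumed for illustration, not the patent's implementation.

```python
import random
from collections import deque

def train(env, agent, episodes, steps, batch, update_interval):
    """DDQN training skeleton: collect transitions, sample mini-batches,
    and periodically copy prediction weights into the target network."""
    replay = deque(maxlen=10000)          # experience replay pool, (5e)
    step_count = 0
    for p in range(episodes):             # round loop, (5b)/(5j)
        s = env.reset()                   # update positions, init t, (5c)
        for t in range(steps):            # time-step loop, (5i)
            a = agent.act(s)              # prediction net + epsilon-greedy, (5d)
            s_next, r = env.step(a)
            replay.append((s, a, r, s_next))
            if len(replay) >= batch:      # sample and learn, (5f)/(5g)
                agent.learn(random.sample(replay, batch))
            step_count += 1
            if step_count % update_interval == 0:
                agent.sync_target()       # target <- prediction, (5h)
            s = s_next
    return agent
```

Each of the N_0 agents would run this loop over its own networks; the asynchrony mentioned in (5d) means the agents do not wait for one another between steps.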
step (6), in the execution stage, each vehicle n with a computing task judges whether to offload its task through the potential game; each vehicle n_0 that decides to offload obtains the current state s_t from local observation and obtains its offloading node and transmit power using the trained deep reinforcement learning model, which specifically comprises the following steps:
(6a) Acquire the offloading decision of each vehicle through the potential game according to the UAV positions, the occupation of MEC computing resources, and the task information; each vehicle n_0 whose offloading decision is not local computation acquires its state information s_t at the current time;
(6b) Each vehicle n_0 whose offloading decision is not local computation inputs the state information s_t into its trained deep reinforcement learning model;
(6c) Output the optimal action policy, i.e., each vehicle n_0 whose offloading decision is not local computation selects its optimal offloading node and transmit power.
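The execution stage (6a)–(6c) then reduces to a single greedy forward pass per offloading vehicle. In this sketch a Q-table stands in for the trained network, and the node labels and values are assumptions.

```python
def execute(q, state, actions):
    """(6b)/(6c): feed the observed state to the trained model and return the
    greedy offloading node and transmit power (no exploration at execution)."""
    return max(actions, key=lambda act: q[(state, act)])

# Assumed action space: offloading node x discrete power level.
actions = [("mec", 0), ("mec", 1), ("uav", 0), ("uav", 1)]
q = {(0, act): v for act, v in zip(actions, [0.2, 0.1, 0.7, 0.4])}
node, power = execute(q, state=0, actions=actions)
```

Unlike training, no epsilon-greedy exploration is used here: the policy is deterministic given the observed state.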
In fig. 1, a flowchart of the UAV-assisted heterogeneous Internet-of-Vehicles task migration and resource allocation method is depicted: a vehicle first judges whether to offload its task through the potential game, and each offloading vehicle then selects an offloading node and a transmit power based on the deep reinforcement learning model trained by DDQN.
In fig. 2, the system model of UAV-assisted heterogeneous Internet-of-Vehicles task migration and resource allocation is depicted, in which the MEC server and the UAVs provide computing services for the vehicles.
In fig. 3, the algorithmic framework of DDQN is depicted, which contains two networks: a prediction network and a target network.
According to the above description, it should be apparent to those skilled in the art that the UAV-assisted heterogeneous Internet-of-Vehicles task migration and resource allocation method provided by the invention can effectively reduce the system delay and achieves a good balance between complexity and performance.
What is not described in detail in the present application belongs to the prior art known to those skilled in the art.
Claims (1)
1. The unmanned aerial vehicle assisted heterogeneous Internet of vehicles task migration and resource allocation method is characterized by comprising the following steps of:
(1) A mobile edge computing (MEC) server is deployed in a road side unit (RSU), and the system is equipped with unmanned aerial vehicles (UAVs) to provide computing services for the vehicles; a vehicle's computing task can be processed locally or offloaded to a UAV or the MEC server for computation;
(2) Establishing a communication model and a calculation model comprising N vehicles and M unmanned aerial vehicles, and further establishing a joint calculation migration and resource allocation model;
(3) Each vehicle acquires the positions of the UAVs and the MEC server, the occupation of computing resources, and the task information;
(4) Obtain the decision of whether each vehicle offloads based on the potential game, and, according to the obtained vehicle offloading decisions, establish a deep reinforcement learning model for the vehicles determined to offload their tasks with the aim of reducing the system delay;
(5) Training a deep reinforcement learning model based on the DDQN;
(6) In the execution stage, each vehicle n with a computing task judges whether to offload its task through the potential game; each vehicle n_0 that decides to offload obtains the current state s_t from local observation and obtains its offloading node and transmit power using the trained deep reinforcement learning model;
further, the step (4) comprises the following specific steps:
(4a) Obtain the decision of whether each vehicle offloads based on a potential game. The offloading decision problem of the task vehicles is modeled as a potential game, expressed as G = {N, {a_n}, {u_n}}, where N is the set of vehicles, a_n is the offloading decision of vehicle n, and u_n is the cost function of vehicle n. In the game model each vehicle is a resource competitor, so N vehicles compete for the limited resources in the network; each vehicle can choose to offload its computation or to compute the task locally, where a_n ∈ {0,1} is the offloading decision of vehicle n, a = {a_1, ..., a_N} denotes the set of offloading decisions of all vehicles, a_n = 0 means that vehicle n executes the computing task locally, and a_n = 1 means that vehicle n offloads the task to the MEC server or a UAV for computation; when the offloading decision of vehicle n is a_n, its cost function is expressed as u_n(a_n, a_-n), where a_-n denotes the set of offloading decisions of all vehicles except vehicle n; each vehicle seeks to minimize its own cost by finding the optimal offloading decision, i.e.
wherein the three delay terms are, respectively, the delay for vehicle n to compute the task locally, the delay for vehicle n to offload the task to the MEC server for computation, and the delay for vehicle n to offload the task to the UAV for computation; the potential game converges to a Nash equilibrium, i.e., best-response iteration finds an offloading decision profile at which no vehicle can reduce its own cost by unilaterally changing its current offloading decision;
(4b) Based on the offloading decisions, consider the vehicles whose offloading decision is a_n = 1, with N_0 denoting the number of such vehicles; regard each such vehicle as an agent, and define the state s as the observed information related to the transmit power and the offloading nodes, together with low-dimensional fingerprint information, including: the channel state information from vehicle n_0 to the UAV, the channel state information from vehicle n_0 to the MEC server, the interference received on the link from vehicle n_0 to the UAV, the interference received on the link from vehicle n_0 to the MEC server, the task information of vehicle n_0, the training round number e, and the random exploration variable epsilon of the epsilon-greedy algorithm, i.e.
each such agent, i.e., each vehicle n_0, selects an offloading node and a transmit power based on the current state s_t at every time step;
(4c) Define the action of each vehicle n_0 that decides to offload as its selected offloading node and transmit power, i.e., the pair consisting of the task offloading node selected by vehicle n_0 from its set of selectable task offloading nodes and the discrete transmit power level of vehicle n_0;
(4d) Define the reward function r. The goal is that each vehicle with offloading decision a_n = 1 selects an offloading node and a transmit power which, under the maximum transmit power constraint, minimize the task processing delay of all offloading vehicles; the reward function can therefore be expressed as:
where b is a fixed value used to adjust the magnitude of the reward function, and the remaining terms denote, respectively, that the task of vehicle n_0 is executed at position z_0, that the task of vehicle n_0 is not executed at position z_0, and the delay when the task of vehicle n_0 is executed at position z_0;
(4e) According to the defined state, action, and reward function, establish a deep reinforcement learning model based on Q-learning; each vehicle n_0 that decides to offload maintains a corresponding evaluation function Q(s, a), representing the discounted reward obtained after the offloading vehicle n_0 executes action a in state s; the Q-value update function is:
where r_t is the instantaneous reward function, gamma is the discount factor, s_t is the observation and low-dimensional fingerprint information on the transmit power and offloading node acquired by the offloading vehicle n_0 at time t, s_{t+1} is the state reached after the offloading vehicle n_0 executes action a_t at time t, and A is the action space formed by the actions.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210744842.2A CN115037751B (en) | 2022-06-28 | 2022-06-28 | Unmanned aerial vehicle-assisted heterogeneous Internet of vehicles task migration and resource allocation method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115037751A CN115037751A (en) | 2022-09-09 |
CN115037751B true CN115037751B (en) | 2023-05-05 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||