CN115002725A - Unmanned aerial vehicle-assisted Internet of vehicles resource allocation method and device and electronic equipment - Google Patents


Info

Publication number
CN115002725A
Authority
CN
China
Prior art keywords
vehicle
resource allocation
task
unmanned aerial vehicle
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210612465.7A
Other languages
Chinese (zh)
Inventor
王晔
童恩
张剑斌
周继革
王红妹
杜方芳
刘立军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Communications Group Co Ltd
China Mobile Group Jiangsu Co Ltd
Original Assignee
China Mobile Communications Group Co Ltd
China Mobile Group Jiangsu Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd and China Mobile Group Jiangsu Co Ltd
Priority to CN202210612465.7A
Publication of CN115002725A
Legal status: Pending

Classifications

    • H04W4/44 Services specially adapted for vehicles, for communication between vehicles and infrastructures, e.g. vehicle-to-cloud [V2C] or vehicle-to-home [V2H]
    • H04B7/18506 Communications with or from aircraft, i.e. aeronautical mobile service
    • H04W28/0967 Quality of Service [QoS] parameters
    • H04W28/0975 Quality of Service [QoS] parameters for reducing delays
    • H04W4/029 Location-based management or tracking services
    • H04W4/46 Services specially adapted for vehicles, for vehicle-to-vehicle communication [V2V]
    • H04W84/06 Airborne or Satellite Networks
    • Y02T10/40 Engine management systems

Abstract

The application relates to the field of Internet of Vehicles resource allocation and provides an unmanned aerial vehicle (UAV)-assisted Internet of Vehicles resource allocation method and device, and an electronic device. The method comprises the following steps: predicting the vehicle position at the next moment from the detected trajectory point data of the vehicle; receiving a task offloading request from the vehicle and establishing, based on the task offloading request, a resource allocation model for allocating Internet of Vehicles resources to the vehicle, where the task offloading request comprises the vehicle association mode, the amount of computing resources required by the task, the amount of data required by the task, and the maximum delay the task can tolerate, and the vehicle association mode is either associated with a UAV or not associated with a UAV; and solving the resource allocation model based on a reinforcement learning method and the deep deterministic policy gradient method to obtain the optimal vehicle association mode and the resource allocation strategy. By taking into account the mobility of vehicles and the time-varying nature of resource requests, the embodiments of the application satisfy the delay limits and quality-of-service requirements of vehicle tasks.

Description

Unmanned aerial vehicle-assisted Internet of vehicles resource allocation method and device and electronic equipment
Technical Field
The application relates to the technical field of Internet of Vehicles resource allocation, and in particular to an unmanned aerial vehicle-assisted Internet of Vehicles resource allocation method and device, and an electronic device.
Background
The Internet of Vehicles is a product of the convergence of the Internet and the Internet of Things, and provides convenient and diverse services for intelligent transportation. At present, the two major camps of Internet of Vehicles technology are dedicated short-range communication (DSRC), led by the United States, and the LTE-V (Long Term Evolution for Vehicles) system for inter-vehicle communication promoted by domestic enterprises. With the rapid development of industrial Internet of Things technology in vehicular networks, data exchange between vehicles, between vehicles and pedestrians, and between vehicles and infrastructure units is increasingly frequent and demands strong data processing capability. While providing services, information from surrounding vehicles must be processed continuously, and the data volume is very large; reasonable Internet of Vehicles resource allocation is therefore essential for reducing interference, improving network efficiency, and ultimately optimizing wireless communication performance.
At present, most existing vehicle resource allocation schemes ignore the mobility of vehicles and the time-varying nature of resource requests, and cannot meet the delay limits and quality-of-service requirements of vehicle tasks.
Disclosure of Invention
The embodiments of the application provide an unmanned aerial vehicle-assisted Internet of Vehicles resource allocation method and device, and an electronic device, aiming to solve the technical problem that existing vehicle resource allocation schemes mostly ignore the mobility of vehicles and the time-varying nature of resource requests and cannot meet the delay limits and quality-of-service requirements of vehicle tasks.
In a first aspect, an embodiment of the present application provides an unmanned aerial vehicle-assisted Internet of Vehicles resource allocation method, including:
predicting the vehicle position at the next moment from the detected trajectory point data of the vehicle;
receiving a task offloading request from the vehicle, and establishing, based on the task offloading request, a resource allocation model for allocating Internet of Vehicles resources to the vehicle; the task offloading request comprises the vehicle association mode, the amount of computing resources required by the task, the amount of data required by the task, and the maximum delay the task can tolerate; the vehicle association mode is either associated with a UAV or not associated with a UAV;
and solving the resource allocation model based on a reinforcement learning method and the deep deterministic policy gradient method to obtain the optimal vehicle association mode and the resource allocation strategy.
In one embodiment, predicting the vehicle position at the next moment from the detected vehicle trajectory point data comprises:
determining a plurality of trajectory point data of the detected vehicle;
calculating the speed and acceleration corresponding to the plurality of trajectory point data of the vehicle;
and calculating the distance between the vehicle and the UAV and the azimuth angle of the vehicle from the speed and acceleration of the plurality of trajectory point data.
In one embodiment, establishing a resource allocation model for Internet of Vehicles resource allocation for the vehicle based on the task offloading request includes:
establishing a resource allocation model for allocating Internet of Vehicles resources to the vehicle based on the task offloading request, the available resources of the vehicle, the available resources of the UAV, the transmission rate of the uplink between the vehicle and the UAV, and the transmission rate of the vehicle's own link.
In one embodiment, solving the resource allocation model based on the reinforcement learning method and the deep deterministic policy gradient method to obtain the optimal vehicle association mode and the resource allocation strategy includes:
obtaining an optimization problem function based on the resource allocation model;
converting the optimization problem function based on a reinforcement learning method;
and solving the converted result by the deep deterministic policy gradient method to obtain the optimal vehicle association mode and the resource allocation strategy.
In one embodiment, converting the optimization problem function based on the reinforcement learning method includes:
converting the optimization problem function into an environment state space set, an action decision set, and a reward function;
the environment state space set comprises the amount of computing resources required by the vehicle task, the amount of data required by the task, the maximum delay the task can tolerate, the vehicle position, and the UAV position;
the action decision set comprises the vehicle association mode and the proportions of computing resources and caching resources allocated by the UAV to its associated vehicles;
the reward function is constructed based on the maximum delay the vehicle task can tolerate and the caching resources the vehicle possesses.
In one embodiment, solving the converted result by the deep deterministic policy gradient method to obtain the optimal vehicle association mode and the resource allocation strategy comprises:
initializing the network parameters of the deep deterministic policy gradient method, selecting action decisions from the action decision set based on the state of the environment state space set, and executing them to obtain the reward function;
and training the networks of the deep deterministic policy gradient method with empirical data serving as the training set, and updating the network parameters to obtain the optimal vehicle association mode and the resource allocation strategy.
In a second aspect, an embodiment of the present application provides an unmanned aerial vehicle-assisted Internet of Vehicles resource allocation device, including:
a vehicle position prediction module, used for predicting the vehicle position at the next moment from the detected trajectory point data of the vehicle;
a resource allocation model establishing module, used for receiving a task offloading request from the vehicle and establishing, based on the task offloading request, a resource allocation model for allocating Internet of Vehicles resources to the vehicle; the task offloading request comprises the vehicle association mode, the amount of computing resources required by the task, the amount of data required by the task, and the maximum delay the task can tolerate; the vehicle association mode is either associated with a UAV or not associated with a UAV;
and a solving module, used for solving the resource allocation model based on a reinforcement learning method and the deep deterministic policy gradient method to obtain the optimal vehicle association mode and the resource allocation strategy.
In a third aspect, an embodiment of the present application provides an electronic device, including a processor and a memory storing a computer program, wherein the processor, when executing the program, implements the steps of the unmanned aerial vehicle-assisted Internet of Vehicles resource allocation method according to the first aspect.
In a fourth aspect, an embodiment of the present application provides a non-transitory computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of the unmanned aerial vehicle-assisted Internet of Vehicles resource allocation method according to the first aspect.
In a fifth aspect, an embodiment of the present application provides a computer program product comprising a computer program, wherein the computer program, when executed by a processor, implements the steps of the unmanned aerial vehicle-assisted Internet of Vehicles resource allocation method according to the first aspect.
According to the unmanned aerial vehicle-assisted Internet of Vehicles resource allocation method provided by the embodiments of the application, the vehicle position at the next moment is predicted from the detected trajectory point data of the vehicle, so that the mobility of the vehicle is taken into account and communication data are exchanged between vehicles in time; a task offloading request from the vehicle is received, and a resource allocation model for allocating Internet of Vehicles resources to the vehicle is established based on the task offloading request; and the resource allocation model is solved based on a reinforcement learning method and the deep deterministic policy gradient method to obtain the optimal vehicle association mode and the resource allocation strategy. The embodiments of the application thus take into account the time-varying nature of resources and allow the limited resources to be allocated reasonably and dynamically to the vehicles requesting them, so that the delay limits and quality-of-service requirements of vehicle tasks are met, the performance of vehicle-mounted communication is improved, and optimization over the vehicles' continuous action spaces is stable and converges quickly.
Drawings
In order to more clearly illustrate the technical solutions in the present application or in the prior art, the drawings used in the description of the embodiments or the prior art are briefly described below. It is obvious that the drawings in the following description show some embodiments of the present application, and that other drawings can be obtained from them by those skilled in the art without creative effort.
Fig. 1 is a schematic flow chart of an unmanned aerial vehicle-assisted Internet of Vehicles resource allocation method provided by an embodiment of the present application;
Fig. 2 is a UAV-assisted Internet of Vehicles scenario provided by an embodiment of the present application;
Fig. 3 is a second schematic flow chart of an unmanned aerial vehicle-assisted Internet of Vehicles resource allocation method provided by an embodiment of the present application;
Fig. 4 is a third schematic flow chart of an unmanned aerial vehicle-assisted Internet of Vehicles resource allocation method provided by an embodiment of the present application;
Fig. 5 is a deep reinforcement learning model based on the deep deterministic policy gradient method provided by an embodiment of the present application;
Fig. 6 is a schematic diagram of the algorithm flow of the deep deterministic policy gradient method provided by an embodiment of the present application;
Fig. 7 is a schematic structural diagram of an unmanned aerial vehicle-assisted Internet of Vehicles resource allocation device provided by an embodiment of the present application;
Fig. 8 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
Detailed Description
To make the purpose, technical solutions, and advantages of the present application clearer, the technical solutions in the present application are described below clearly and completely with reference to the drawings in the embodiments of the present application. It is obvious that the described embodiments are some, but not all, of the embodiments of the present application. All other embodiments obtained by a person skilled in the art from the embodiments given herein without creative effort shall fall within the protection scope of the present application.
Fig. 1 shows an unmanned aerial vehicle-assisted Internet of Vehicles resource allocation method. Referring to fig. 1, an embodiment of the present application provides an unmanned aerial vehicle-assisted Internet of Vehicles resource allocation method, which may include:
Step 100, predicting the vehicle position at the next moment from the detected trajectory point data of the vehicle;
The electronic device predicts the vehicle position at the next moment based on the detected trajectory point data of the vehicle, where the electronic device may be a UAV. In the embodiment of the present application, please refer to fig. 2, which shows a UAV-assisted Internet of Vehicles scenario according to the embodiment of the present application. The scenario consists of N vehicles on a straight two-way road and M rotor UAVs deployed in the air; effective resource allocation is carried out in the Internet of Vehicles so as to maximize the number of tasks successfully completed by the vehicles and the UAVs. A task successfully completed by a vehicle and a UAV means that the vehicle is associated with the UAV and offloads the task to the UAV's MEC (Multi-access Edge Computing) server for execution.
In one embodiment, referring to fig. 3, the step 100 of predicting the vehicle position at the next moment from the detected vehicle trajectory point data includes:
Step 110, determining a plurality of detected trajectory point data of the vehicle;
The electronic device determines a plurality of trajectory point data of the detected vehicle. Specifically, in practice, the trajectory point data of a vehicle can be sensed by the radar devices of several different UAVs. For example, when the number of UAVs is S, the S UAVs can each sense three trajectory points of the vehicle through their radar devices. If a vehicle is within the coverage of the S UAVs, the three sets of trajectory points sensed by the S UAVs are respectively:
set a: {(x_{n,1}, y_{n,1}), (x_{n,2}, y_{n,2}), ..., (x_{n,S}, y_{n,S})};
set b: {(x_{n-1,1}, y_{n-1,1}), (x_{n-1,2}, y_{n-1,2}), ..., (x_{n-1,S}, y_{n-1,S})};
set c: {(x_{n-2,1}, y_{n-2,1}), (x_{n-2,2}, y_{n-2,2}), ..., (x_{n-2,S}, y_{n-2,S})};
In the embodiment of the application, three fused trajectory points of the vehicle, (x_n, y_n), (x_{n-1}, y_{n-1}) and (x_{n-2}, y_{n-2}), are obtained by taking a weighted average of the trajectory point data. With weights w_s satisfying Σ_{s=1}^{S} w_s = 1, the weighted-average formulas are:

x_n = Σ_{s=1}^{S} w_s · x_{n,s},   y_n = Σ_{s=1}^{S} w_s · y_{n,s}

and analogously for (x_{n-1}, y_{n-1}) and (x_{n-2}, y_{n-2}). The fused points (x_n, y_n), (x_{n-1}, y_{n-1}), (x_{n-2}, y_{n-2}) are the plurality of detected trajectory point data of the vehicle determined in the embodiment of the present application.
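As an illustration of the fusion step only (not code from the patent: the function name and the equal default weights are assumptions, since the patent's weighting formula appears only as equation images), the weighted average can be sketched in Python:

```python
def fuse_track_points(points, weights=None):
    """Fuse one vehicle track point sensed by S different UAV radars
    into a single point via a weighted average (weights sum to 1)."""
    s = len(points)
    if weights is None:
        weights = [1.0 / s] * s  # equal weights: a simple illustrative assumption
    assert abs(sum(weights) - 1.0) < 1e-9
    x = sum(w * px for w, (px, _) in zip(weights, points))
    y = sum(w * py for w, (_, py) in zip(weights, points))
    return (x, y)

# Three UAVs (S = 3) each sense the same track point with small errors:
set_a = [(10.0, 4.0), (10.2, 4.1), (9.8, 3.9)]
fused_n = fuse_track_points(set_a)
```

Applying the same fusion to sets b and c yields the three fused points used for prediction below.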
Step 120, calculating the speed and acceleration corresponding to the plurality of trajectory point data of the vehicle;
The electronic device calculates the speed and acceleration of the vehicle corresponding to the plurality of trajectory point data. Specifically, in the embodiment of the present application, the speed and acceleration are calculated per axis as:

v_n = (x_n − x_{n−1}) / ΔT,   a_n = (v_n − v_{n−1}) / ΔT

where v denotes the speed, a denotes the acceleration, and ΔT denotes the time interval from one moment to the next.
Step 130, calculating the distance between the vehicle and the UAV and the azimuth angle of the vehicle from the speed and acceleration of the plurality of trajectory point data.
The electronic device calculates the distance between the vehicle and the UAV and the azimuth angle of the vehicle from the speed and acceleration of the plurality of trajectory point data. Specifically, the acceleration of the vehicle is assumed to change little from the (n−2)th time slot to the nth time slot, i.e. a_{x,n} ≈ a_{x,n−1} ≈ a_{x,n−2}, and the position coordinates of the vehicle at the next moment, including the azimuth angle of the vehicle, are predicted from the vehicle state information corresponding to the three trajectory points. From the predicted distances on the x and y axes, the distance between the vehicle and the corresponding UAV is then calculated from the known fixed flight height H of the UAV and the Pythagorean theorem:

x_{n+1|n} = 3x_n − 3x_{n−1} + x_{n−2}
y_{n+1|n} = 3y_n − 3y_{n−1} + y_{n−2}
d_{n+1|n} = sqrt( x_{n+1|n}^2 + y_{n+1|n}^2 + H^2 )
θ_{n+1|n} = arctan( y_{n+1|n} / x_{n+1|n} )

where x_{n+1|n} represents the distance component of the vehicle on the x-axis from the nth slot to the (n+1)th slot, y_{n+1|n} represents the distance component of the vehicle on the y-axis from the nth slot to the (n+1)th slot, H represents the fixed flight height of the UAV, d_{n+1|n} represents the straight-line distance from the vehicle to the UAV in the (n+1)th slot, and θ_{n+1|n} represents the azimuth angle of the vehicle in the (n+1)th slot.
This embodiment uses the UAV radar device to detect the trajectory point data of the vehicle and uses the prediction formulas to predict the next position of the vehicle based on the trajectory point data, thereby sensing the vehicle position and exchanging communication data between vehicles in time.
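The prediction chain of steps 110 to 130 — fused points to per-axis speed and acceleration, then to the extrapolated next position, the UAV distance, and the azimuth — can be sketched as follows. This is an illustrative sketch, not code from the patent; note that x_n + v·ΔT + a·ΔT² reduces to 3x_n − 3x_{n−1} + x_{n−2} for equally spaced samples, matching the formula above.

```python
import math

def predict_next(p_n, p_n1, p_n2, height_h, dt=1.0):
    """Predict the vehicle's next position from three fused track points
    (newest first), then the straight-line distance to a UAV at fixed
    flight height H, and the azimuth angle of the predicted point."""
    (xn, yn), (xn1, yn1), (xn2, yn2) = p_n, p_n1, p_n2
    # per-axis speed and acceleration over the interval dt (step 120)
    vx, vx_prev = (xn - xn1) / dt, (xn1 - xn2) / dt
    vy, vy_prev = (yn - yn1) / dt, (yn1 - yn2) / dt
    ax, ay = (vx - vx_prev) / dt, (vy - vy_prev) / dt
    # extrapolation: x + v*dt + a*dt^2 == 3*x_n - 3*x_{n-1} + x_{n-2}
    x_next = xn + vx * dt + ax * dt * dt
    y_next = yn + vy * dt + ay * dt * dt
    # Pythagorean distance to the UAV at fixed flight height H (step 130)
    dist = math.sqrt(x_next ** 2 + y_next ** 2 + height_h ** 2)
    azimuth = math.atan2(y_next, x_next)
    return (x_next, y_next), dist, azimuth
```

For a vehicle moving at constant speed along the x-axis, e.g. points (6, 0), (4, 0), (2, 0), the predicted next position is (8, 0).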
Step 200, receiving a task offloading request from the vehicle, and establishing, based on the task offloading request, a resource allocation model for allocating Internet of Vehicles resources to the vehicle; the task offloading request comprises the vehicle association mode, the amount of computing resources required by the task, the amount of data required by the task, and the maximum delay the task can tolerate; the vehicle association mode is either associated with a UAV or not associated with a UAV;
after the unmanned aerial vehicle predicts the position of the vehicle at the next moment, the vehicle can randomly generate different calculation tasks and send a task unloading request to the unmanned aerial vehicle according to the requirement. The task unload request includes a vehicle association schema that considers b i Being binary variables, b i (t)={0,1},b i 1 denotes a vehicleAssociating with the unmanned aerial vehicle, and unloading the task to an MEC server of the unmanned aerial vehicle for execution; b i 0 means that the vehicle is not associated with a drone, performing its own computational task. For task off-loading requests sent by vehicle i at time t
Figure BDA0003672452610000091
Is shown in which
Figure BDA0003672452610000092
The amount of computing resources required by the task, the amount of data required by the task, and the maximum delay that can be tolerated by the task, respectively.
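As an illustration only (the field names are assumptions; the patent gives the request symbols via equation images), the task offloading request can be represented as a small record:

```python
from dataclasses import dataclass

@dataclass
class TaskOffloadRequest:
    """Task offloading request sent by vehicle i at time t (names assumed:
    c = computing resources required, d = data required, t_max = max delay)."""
    vehicle_id: int
    associated: bool   # b_i(t): True = offload to the UAV MEC server, False = compute locally
    c: float           # amount of computing resources required by the task
    d: float           # amount of data required by the task
    t_max: float       # maximum delay the task can tolerate

req = TaskOffloadRequest(vehicle_id=3, associated=True, c=2.0e9, d=1.5e6, t_max=0.1)
```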
The electronic device receives the task offloading request from the vehicle and establishes, based on the task offloading request, a resource allocation model for allocating Internet of Vehicles resources to the vehicle.
In an embodiment, the step 200 of establishing a resource allocation model for Internet of Vehicles resource allocation for the vehicle based on the task offloading request specifically includes:
the electronic device establishes a resource allocation model for allocating Internet of Vehicles resources to the vehicle based on the task offloading request, the available resources of the vehicle, the available resources of the UAV, the transmission rate of the uplink between the vehicle and the UAV, and the transmission rate of the vehicle's own link.
The available resources of the vehicle comprise its computing resources and caching resources; the computing and caching resources with which vehicle i executes a task itself are denoted f_i^co and f_i^ca, where i is the serial number of the vehicle. The available resources of the UAV comprise its available computing and caching resources, denoted F_j^co and F_j^ca respectively, where j is the serial number of the UAV. The UAV allocates its computing and caching resources to the associated vehicles according to allocation proportions, denoted α_{j,i}(t) and β_{j,i}(t) respectively. The condition for the UAV and the vehicle to successfully complete a task is that the caching resource allocated by the UAV is greater than or equal to the amount of data required by the task, i.e.

β_{j,i}(t) · F_j^ca ≥ d_i(t).
For vehicle i ∈ N(t), the total time T_i(t) from generating the task to receiving the processing result is expressed as:

T_i(t) = b_i(t) · [ d_i(t) / e_{j,i}(t) + c_i(t) / ( α_{j,i}(t) · F_j^co ) ] + (1 − b_i(t)) · [ d_i(t) / e_i(t) + c_i(t) / f_i^co ],

where T_i(t) is the resource allocation model for Internet of Vehicles resource allocation, e_{j,i}(t) is the transmission rate of the uplink between the vehicle and the UAV, and e_i(t) is the transmission rate of the vehicle's own link.
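A hedged sketch of the delay model and the task-success condition: the patent's formulas are rendered only as equation images, so the exact form below (uplink transfer plus UAV computation when offloading; own-link transfer plus local computation otherwise) is a reconstruction, and all parameter names are assumptions.

```python
def total_delay(b_i, c_i, d_i, e_ji, alpha_ji, F_co_j, e_i, f_co_i):
    """Total time T_i(t) from task generation to result (reconstructed form).
    b_i: 1 if vehicle i is associated with UAV j, else 0.
    Offloaded: uplink transfer d_i/e_ji + UAV computation c_i/(alpha_ji * F_co_j).
    Local:     own-link transfer d_i/e_i + local computation c_i/f_co_i."""
    if b_i:
        return d_i / e_ji + c_i / (alpha_ji * F_co_j)
    return d_i / e_i + c_i / f_co_i

def task_success(b_i, c_i, d_i, t_max, e_ji, alpha_ji, beta_ji,
                 F_co_j, F_ca_j, e_i, f_co_i):
    """A task succeeds when the delay bound holds and, if offloaded, the UAV
    cache allocated to the vehicle covers the task data: beta * F_ca >= d_i."""
    if b_i and beta_ji * F_ca_j < d_i:
        return False
    return total_delay(b_i, c_i, d_i, e_ji, alpha_ji, F_co_j, e_i, f_co_i) <= t_max
```

For example, offloading a task with d_i = 10 over a 100-unit/s uplink and computing c_i = 50 on half of a 1000-unit UAV CPU gives T_i = 0.1 + 0.1 = 0.2.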
According to the embodiment of the application, the association mode between the vehicle and the UAV is determined after the vehicle task is generated, the dynamic resource allocation proportions of the Internet of Vehicles are adjusted reasonably, and a resource management model is established so as to maximize the number of tasks successfully completed by the vehicles and the UAVs, thereby meeting the delay limits and quality-of-service requirements of vehicle tasks.
Step 300, solving the resource allocation model based on a reinforcement learning method and the deep deterministic policy gradient method to obtain the optimal vehicle association mode and the resource allocation strategy.
The electronic device solves the resource allocation model based on a reinforcement learning method and the deep deterministic policy gradient method to obtain the optimal vehicle association mode and the resource allocation strategy.
Specifically, in an embodiment, referring to fig. 4, solving the resource allocation model based on the reinforcement learning method and the deep deterministic policy gradient method to obtain the optimal vehicle association mode and the resource allocation strategy includes:
step 310, obtaining an optimization problem function based on the resource allocation model;
after the electronic device establishes the resource allocation model, an optimization problem function F is obtained, which is expressed as the following formula:
Figure BDA0003672452610000103
Figure BDA0003672452610000111
where b (t) is a vehicle correlation pattern matrix, f co (t) is an allocation matrix of unmanned aerial vehicle computing resources, f ca And (t) is an unmanned aerial vehicle cache resource allocation matrix, H (·) is a step function, when the variable is greater than or equal to 0, the value is 1, otherwise, the value is 0. That is, for the vehicle i which is allocated with enough buffer resources and meets the task delay requirement, there are
Figure BDA0003672452610000112
Or alternatively
Figure BDA0003672452610000113
The constraint condition is to maximally utilize computing resources and cache resources.
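The objective — counting tasks that satisfy both the delay bound and the cache condition via the step function H(·) — can be evaluated as follows. This is an illustrative sketch; the per-vehicle dictionary layout is an assumption, flattening the matrices b(t), f_co(t), f_ca(t) into one record per vehicle.

```python
def H(x):
    """Step function: 1 when x >= 0, else 0."""
    return 1 if x >= 0 else 0

def completed_tasks(vehicles):
    """Value of the objective F for one time slot: the number of vehicles whose
    task meets the delay bound (t_max - T >= 0) and whose allocated UAV cache
    covers the task data (beta * F_ca - d >= 0)."""
    return sum(
        H(v["t_max"] - v["T"]) * H(v["beta"] * v["F_ca"] - v["d"])
        for v in vehicles
    )

fleet = [
    {"t_max": 0.2, "T": 0.15, "beta": 0.2, "F_ca": 100.0, "d": 10.0},   # succeeds
    {"t_max": 0.2, "T": 0.30, "beta": 0.2, "F_ca": 100.0, "d": 10.0},   # too slow
    {"t_max": 0.2, "T": 0.10, "beta": 0.05, "F_ca": 100.0, "d": 10.0},  # cache short
]
print(completed_tasks(fleet))  # -> 1
```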
Step 320, converting the optimization problem function based on a reinforcement learning method;
and the electronic equipment converts the optimization problem function based on a reinforcement learning method. Since the optimization problem function F is a non-convex function and has high complexity, the embodiment of the present application converts the optimization problem function F by using a reinforcement learning method.
Specifically, in an embodiment, the step 320 of converting the optimization problem function based on the reinforcement learning method includes:
The optimization problem function is converted into an environment state space set, an action decision set, and a reward function. The environment state space set comprises the computing resources each vehicle task requires, the data amount each task requires, the maximum delay each task can tolerate, the vehicle positions, and the unmanned aerial vehicle positions; the action decision set comprises the vehicle association mode together with the proportions of computing resources and cache resources the unmanned aerial vehicle allocates to its associated vehicles; and the reward function is constructed from the maximum delay the vehicle task can tolerate and the cache resources the vehicle itself possesses.
The environment state space set S at time t gathers, for the N vehicles, the computing resources each task requires, the data amount each task requires, the maximum tolerable delay of each task, and the vehicle coordinates, together with the coordinates of the M unmanned aerial vehicles (the leading entries appear as equation images BDA0003672452610000121 and BDA0003672452610000122 in the original):

S(t) = {…, x′_1(t), x′_2(t), …, x′_M(t), y′_1(t), y′_2(t), …, y′_M(t), z′_1(t), z′_2(t), …, z′_M(t)}
Let N′_j (j ∈ {1, 2, …, M}) be the number of vehicles associated with unmanned aerial vehicle j, and define an action space A. At time t, the unmanned aerial vehicle selects a vehicle association mode according to the current policy π, and the action decision set a(t) consists of that association mode together with the proportions of computing and cache resources allocated to the associated vehicles, i.e. (given as equation image BDA0003672452610000123 in the original):

a(t) = {b(t), f_co(t), f_ca(t)}
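The action a(t) can be pictured as a flat vector concatenating the association decisions with the per-vehicle resource proportions. A small sketch (the structure and the normalisation check are assumptions, since the equation image carrying the exact definition did not survive extraction):

```python
# Hypothetical action vector: association bits b(t), compute proportions
# f_co(t), and cache proportions f_ca(t), with each proportion set summing
# to at most 1 for a single UAV.
def make_action(assoc, f_co, f_ca):
    assert len(assoc) == len(f_co) == len(f_ca)
    assert sum(f_co) <= 1.0 + 1e-9 and sum(f_ca) <= 1.0 + 1e-9
    return list(assoc) + list(f_co) + list(f_ca)

# Three vehicles; vehicles 1 and 3 associate with the UAV and share its resources.
a_t = make_action(assoc=[1, 0, 1], f_co=[0.6, 0.0, 0.4], f_ca=[0.5, 0.0, 0.5])
```

The flat layout matches what a continuous-action policy network such as the Actor below would output.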
After the action decision a(t) is executed in the environment state s(t), a reward is returned to the unmanned aerial vehicle, defined by a reward function R (equation image BDA0003672452610000124 in the original). The reward function guides the unmanned aerial vehicle's policy updates; two reward elements are defined (equation images BDA0003672452610000125 and BDA0003672452610000131 in the original). The symbols in those expressions (equation image BDA0003672452610000132) denote, respectively, the computing resources the task requires, the data amount the task requires, and the maximum delay the task can tolerate; the computing and cache resources a vehicle has for executing the task itself are denoted by the symbols in equation image BDA0003672452610000133.
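Since the exact reward formulas survive only as equation images, the following is a hedged sketch of one plausible reading: reward the normalised delay margin when the task meets its deadline, penalise a miss, and average over vehicles as the instant reward r(t) described later:

```python
# Illustrative reward shape, NOT the patent's exact formula: positive
# normalised margin below the deadline, fixed penalty above it.
def vehicle_reward(delay_offload, delay_max):
    if delay_offload <= delay_max:
        return (delay_max - delay_offload) / delay_max  # normalised margin
    return -1.0                                         # deadline missed

def average_reward(delays, delay_maxes):
    """r(t): the average reward over the vehicles, as stated in the text."""
    rewards = [vehicle_reward(d, dm) for d, dm in zip(delays, delay_maxes)]
    return sum(rewards) / len(rewards)

r_t = average_reward(delays=[0.1, 0.3], delay_maxes=[0.2, 0.2])
```

With one vehicle at half its delay budget and one missing its deadline, the average reward comes out slightly negative, pushing the policy toward allocations that meet more deadlines.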
Step 330: solve the converted result by the deep deterministic policy gradient method to obtain the optimal vehicle association mode and the resource allocation policy.
The electronic device solves the converted result by the deep deterministic policy gradient method to obtain the optimal vehicle association mode and the resource allocation policy.
The electronic device solves the transformed optimization problem function using the deep deterministic policy gradient method; the deep reinforcement learning model based on this method is shown in fig. 5.
In the embodiment of the present application, the unmanned aerial vehicle (the electronic device) acts as the agent: it selects and executes actions based on the current state, obtains the reward, and updates its policy from this feedback to settle on the optimal one. From the determined S, A and R, an evaluation function Q is obtained (equation image BDA0003672452610000134 in the original), of the standard discounted form

Q^π(s(t), a(t)) = E[ Σ_{k≥0} γ^k r(t+k) ]

where E denotes expectation, γ < 1 is the discount factor of r(t), and r(t) is the instant reward returned to the unmanned aerial vehicle at time t, defined as the average reward over the vehicles.
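The discounted return that Q estimates can be computed with a short backward recursion; a minimal sketch (γ value is illustrative):

```python
# Discounted return G = r(t) + gamma*r(t+1) + gamma^2*r(t+2) + ...
# computed backwards: G_k = r_k + gamma * G_{k+1}.
def discounted_return(rewards, gamma=0.9):
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g

g0 = discounted_return([1.0, 1.0, 1.0], gamma=0.5)  # 1 + 0.5 + 0.25
```

The Critic network is trained to approximate exactly this quantity for state-action pairs.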
In one embodiment, step 330, solving the converted result by the deep deterministic policy gradient method to obtain the optimal vehicle association mode and the resource allocation policy, includes:
Step 331, initializing the network parameters of the deep deterministic policy gradient method, selecting and executing an action decision from the action decision set based on the state of the environment state space set, and obtaining the reward;
Step 332, training the networks of the deep deterministic policy gradient method with empirical data as the training set, and updating the network parameters to obtain the optimal vehicle association mode and the resource allocation policy.
Specifically, the DDPG method (i.e., the deep deterministic policy gradient method) comprises an Actor network and a Critic network: the Actor network generates the current policy, and the Critic network evaluates how good that policy is in the current state. An algorithm flow diagram of the deep deterministic policy gradient method according to the embodiment of the present application is shown in fig. 6.
In order to improve the stability of training, a Target-Actor network and a Target-Critic network are introduced, and step 330 of the embodiment of the present application comprises the following specific steps:
1) Initialize the Actor network π and the Critic network Q with network parameters θ^π and θ^Q;
2) Initialize the Target-Actor network π′ and the Target-Critic network Q′ with network parameters θ^π′ and θ^Q′;
3) Initialize the success experience pool R_success and the failure experience pool R_failure;
4) For each round (episode), loop over the following steps:
(1) Select an initial state s_1;
(2) For each step in the round (step), the following steps are cycled:
① According to the current input state s_t, the Actor network outputs the action a_t; the instant reward r_t and the next state s_{t+1} are received, giving the experience tuple (s_t, a_t, r_t, s_{t+1});
② Determine whether this round of learning terminated successfully; if so, store the experience (s_t, a_t, r_t, s_{t+1}) in the success experience pool R_success, otherwise execute step ③;
③ Store the experience (s_t, a_t, r_t, s_{t+1}) in the failure experience pool R_failure, and additionally take N_failure experiences out of R_success and put them into R_failure as well;
④ Randomly sample m experience tuples (s_i, a_i, r_i, s_{i+1}), i ≤ m, from the two experience pools;
⑤ Compute the expected return of the current action through the Target-Critic network:
y_i = r_i + γQ′(s_{i+1}, π′, θ^Q′)
⑥ Update the Critic network parameters by minimizing the loss function (equation image BDA0003672452610000151 in the original; the standard DDPG form is the mean squared temporal-difference error):
L = (1/m) Σ_{i=1}^{m} (y_i − Q(s_i, a_i, θ^Q))²
⑦ Update the Actor network parameters through the policy gradient (equation image BDA0003672452610000152 in the original; the standard DDPG form is):
∇_{θ^π} J ≈ (1/m) Σ_{i=1}^{m} ∇_a Q(s_i, a, θ^Q)|_{a=π(s_i)} ∇_{θ^π} π(s_i, θ^π)
⑧ Update the Target-Actor and Target-Critic network parameters by the soft update (equation image BDA0003672452610000153 in the original; the standard DDPG form, with τ ≪ 1, is):
θ^π′ ← τθ^π + (1 − τ)θ^π′,  θ^Q′ ← τθ^Q + (1 − τ)θ^Q′
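The soft target update in step ⑧ (assuming the standard DDPG form θ′ ← τθ + (1 − τ)θ′) can be sketched numerically; the flat parameter lists and τ value are illustrative:

```python
# Soft target update: blend a small fraction tau of the online parameters
# into the target parameters each step, keeping the targets slowly moving.
def soft_update(theta, theta_target, tau=0.01):
    return [tau * w + (1.0 - tau) * wt for w, wt in zip(theta, theta_target)]

new_target = soft_update([1.0, 2.0], [0.0, 0.0], tau=0.1)
```

With τ = 0.1 the target moves a tenth of the way toward the online network, which is what stabilises the bootstrapped targets y_i in step ⑤.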
(3) End the step loop.
5) End the round (episode) loop.
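The round/step loop above, with the dual success/failure experience pools, can be skeletonised as follows. The environment, the stub actor, and the pool-mixing rule are stand-ins (assumptions), not the patent's networks; only the bookkeeping of R_success and R_failure is illustrated:

```python
import random

random.seed(0)
R_success, R_failure = [], []

def actor(state):                     # stub policy in place of the Actor network
    return random.random()

def env_step(state, action):          # stub environment: reward, next state, success flag
    reward = 1.0 if action > 0.5 else -1.0
    return reward, state + 1, reward > 0

for episode in range(3):              # round (episode) loop
    s = 0
    for step in range(5):             # step loop
        a = actor(s)
        r, s_next, ok = env_step(s, a)
        exp = (s, a, r, s_next)
        if ok:
            R_success.append(exp)     # step 2: successful experience
        else:
            R_failure.append(exp)     # step 3: failed experience...
            if R_success:             # ...plus a sample drawn from R_success
                R_failure.append(random.choice(R_success))
        s = s_next
    # here a minibatch would be sampled from both pools to update the networks

total = len(R_success) + len(R_failure)
```

Mixing successful samples into the failure pool keeps minibatches drawn from R_failure from containing only negative-reward transitions.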
After training on the training set is finished, the optimization objective function is solved and the optimal policy for the vehicle association mode and the resource allocation proportions is obtained, satisfying the vehicles' delay limits and quality-of-service requirements and thereby improving the performance of vehicle-mounted communication.
According to the embodiment of the present application, the reinforcement learning method and the deep deterministic policy gradient (DDPG) method convert and solve the optimization problem function, effectively making the joint decision on vehicle association mode and resource allocation in the Internet of Vehicles, satisfying the vehicles' delay constraints and task quality-of-service requirements, improving the performance of vehicle-mounted communication, and remaining stable and fast-converging when optimizing over the vehicles' continuous action spaces.
In summary: the vehicle position at the next moment is predicted from the detected trajectory point data of the vehicle, so that vehicle mobility is taken into account and communication data can be exchanged between vehicles; a task unloading request from the vehicle is received, and a resource allocation model for allocating Internet of Vehicles resources to the vehicle is established based on the request; and the resource allocation model is solved with the reinforcement learning method and the deep deterministic policy gradient method to obtain the optimal vehicle association mode and the resource allocation policy. The embodiment of the present application thus accounts for the time-varying nature of the resources and dynamically allocates the limited resources to the requesting vehicles in a reasonable way, meeting the delay limits and quality-of-service requirements of vehicle tasks, improving the performance of vehicle-mounted communication, and remaining stable and fast-converging when optimizing over the vehicles' continuous action spaces.
The unmanned aerial vehicle-assisted Internet of Vehicles resource allocation device provided by the embodiment of the present application is described below; the device described below and the unmanned aerial vehicle-assisted Internet of Vehicles resource allocation method described above may be referred to in correspondence with each other.
Referring to fig. 7, an embodiment of the present application provides an unmanned aerial vehicle assisted internet of vehicles resource allocation device, including:
a vehicle position prediction module 201, configured to predict a vehicle position at a next moment of the vehicle according to the detected trajectory point data of the vehicle;
a resource allocation model establishing module 202, configured to receive a task unloading request of the vehicle, and establish a resource allocation model for allocating resources of the vehicle in the internet of vehicles based on the task unloading request; the task unloading request comprises a vehicle association mode, the amount of computing resources required by the task, the amount of data required by the task and the maximum delay tolerable by the task; the vehicle association mode comprises a vehicle and unmanned aerial vehicle association mode and a vehicle and unmanned aerial vehicle non-association mode;
and the solving module 203 is used for solving the resource allocation model based on a reinforcement learning method and a deep deterministic policy gradient method to obtain the optimal vehicle association mode and the resource allocation policy.
With the unmanned aerial vehicle-assisted Internet of Vehicles resource allocation device, the vehicle position at the next moment is predicted from the detected trajectory point data of the vehicle, so that vehicle mobility is taken into account and communication data can be exchanged between vehicles in time; a task unloading request from the vehicle is received, and a resource allocation model for allocating Internet of Vehicles resources is established based on the request; and the resource allocation model is solved with a reinforcement learning method and a deep deterministic policy gradient method to obtain the optimal vehicle association mode and the resource allocation policy. The embodiment of the present application thus accounts for the time-varying nature of the resources and dynamically allocates the limited resources to the requesting vehicles in a reasonable way, meeting the delay limits and quality-of-service requirements of vehicle tasks, improving the performance of vehicle-mounted communication, and remaining stable and fast-converging when optimizing over the vehicles' continuous action spaces.
In one embodiment, the vehicle position prediction module comprises:
a trajectory point data determination module for determining a plurality of detected trajectory point data of the vehicle;
the speed and acceleration calculation module is used for calculating the speed and acceleration corresponding to the plurality of track point data of the vehicle;
and the position prediction module is used for calculating the distance between the vehicle and the unmanned aerial vehicle and the azimuth angle of the vehicle according to the speeds and the accelerations of the plurality of track point data.
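A hedged sketch of what such a position prediction module might compute, assuming a constant-acceleration extrapolation from the last three trajectory points (the patent does not spell out the exact kinematic model, so the symbols and the finite-difference scheme below are illustrative):

```python
import math

def predict_next_position(points, dt=1.0):
    """Extrapolate the next (x, y) from the last three samples, dt apart."""
    (x0, y0), (x1, y1), (x2, y2) = points
    vx, vy = (x2 - x1) / dt, (y2 - y1) / dt          # latest velocity
    ax = ((x2 - x1) - (x1 - x0)) / dt**2             # finite-difference accel.
    ay = ((y2 - y1) - (y1 - y0)) / dt**2
    return (x2 + vx * dt + 0.5 * ax * dt**2,
            y2 + vy * dt + 0.5 * ay * dt**2)

def distance_and_azimuth(vehicle, uav):
    """Distance and azimuth (degrees) from a ground vehicle to a UAV."""
    dx, dy = uav[0] - vehicle[0], uav[1] - vehicle[1]
    dz = uav[2]                                       # UAV altitude
    dist = math.sqrt(dx * dx + dy * dy + dz * dz)
    azimuth = math.degrees(math.atan2(dy, dx))
    return dist, azimuth

pos = predict_next_position([(0, 0), (1, 0), (3, 0)])  # accelerating along x
```

For the accelerating track above, the vehicle advances 1, then 2 units per step, so the extrapolation places it 2.5 units further along x at the next step.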
In one embodiment, the resource allocation model building module is specifically configured to build a resource allocation model for allocating the vehicle-networking resources to the vehicle based on the task unloading request, the vehicle available resources, the drone available resources, the transmission rate of the vehicle and drone uplink, and the transmission rate of the vehicle own link.
In one embodiment, the solving module comprises:
an optimization problem function obtaining module, configured to obtain an optimization problem function based on the resource allocation model;
the conversion module is used for converting the optimization problem function based on a reinforcement learning method;
and the final solving module is used for solving the converted result according to a deep deterministic policy gradient method to obtain the optimal vehicle association mode and the resource allocation policy.
In one embodiment, the conversion module is specifically configured to convert the optimization problem function into an environment state space set, an action decision set, and a reward function;
the environment state space set comprises the amount of computing resources required by the vehicle task, the amount of data required by the task, the maximum delay tolerable by the task, the vehicle position and the unmanned aerial vehicle position;
the action decision set comprises a vehicle association mode, and a computing resource proportion and a cache resource proportion which are distributed to an associated vehicle by the unmanned aerial vehicle;
the reward function is constructed from the maximum delay the vehicle task can tolerate and the cache resources the vehicle itself possesses.
In one embodiment, the final solution module is configured to:
initializing the network parameters of the deep deterministic policy gradient method, selecting and executing an action decision from the action decision set based on the state of the environment state space set, and obtaining the reward;
and training the networks of the deep deterministic policy gradient method with empirical data as the training set, and updating the network parameters to obtain the optimal vehicle association mode and the resource allocation policy.
Fig. 8 illustrates a physical structure diagram of an electronic device. As shown in fig. 8, the electronic device may include: a processor 810, a communication interface 820, a memory 830 and a communication bus 840, wherein the processor 810, the communication interface 820 and the memory 830 communicate with each other via the communication bus 840. The processor 810 may invoke the computer program in the memory 830 to perform the steps of the unmanned aerial vehicle-assisted Internet of Vehicles resource allocation method, for example including: predicting the vehicle position at the next moment according to the detected trajectory point data of the vehicle; receiving a task unloading request of the vehicle, and establishing a resource allocation model for allocating the Internet of Vehicles resources to the vehicle based on the task unloading request, the task unloading request comprising a vehicle association mode, the amount of computing resources required by the task, the amount of data required by the task, and the maximum delay tolerable by the task, and the vehicle association mode comprising a mode in which the vehicle is associated with the unmanned aerial vehicle and a mode in which it is not; and solving the resource allocation model based on a reinforcement learning method and a deep deterministic policy gradient method to obtain the optimal vehicle association mode and the resource allocation policy.
In addition, the logic instructions in the memory 830 may be implemented in the form of software functional units and, when sold or used as an independent product, stored in a computer-readable storage medium. Based on such understanding, the technical solutions of the present application, or the portions thereof that substantially contribute over the prior art, may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
In another aspect, the present application further provides a computer program product, the computer program product comprising a computer program stored on a non-transitory computer-readable storage medium; when the computer program is executed by a processor, the computer is able to execute the steps of the unmanned aerial vehicle-assisted Internet of Vehicles resource allocation method provided in the foregoing embodiments, for example including: predicting the vehicle position at the next moment according to the detected trajectory point data of the vehicle; receiving a task unloading request of the vehicle, and establishing a resource allocation model for allocating the Internet of Vehicles resources to the vehicle based on the task unloading request, the task unloading request comprising a vehicle association mode, the amount of computing resources required by the task, the amount of data required by the task, and the maximum delay tolerable by the task, and the vehicle association mode comprising a mode in which the vehicle is associated with the unmanned aerial vehicle and a mode in which it is not; and solving the resource allocation model based on a reinforcement learning method and a deep deterministic policy gradient method to obtain the optimal vehicle association mode and the resource allocation policy.
On the other hand, embodiments of the present application further provide a processor-readable storage medium storing a computer program, the computer program being configured to cause a processor to perform the steps of the method provided in each of the above embodiments, for example including: predicting the vehicle position at the next moment according to the detected trajectory point data of the vehicle; receiving a task unloading request of the vehicle, and establishing a resource allocation model for allocating the Internet of Vehicles resources to the vehicle based on the task unloading request, the task unloading request comprising a vehicle association mode, the amount of computing resources required by the task, the amount of data required by the task, and the maximum delay tolerable by the task, and the vehicle association mode comprising a mode in which the vehicle is associated with the unmanned aerial vehicle and a mode in which it is not; and solving the resource allocation model based on a reinforcement learning method and a deep deterministic policy gradient method to obtain the optimal vehicle association mode and the resource allocation policy.
The processor-readable storage medium can be any available medium or data storage device that can be accessed by a processor, including, but not limited to, magnetic memory (e.g., floppy disks, hard disks, magnetic tape, magneto-optical disks (MOs), etc.), optical memory (e.g., CDs, DVDs, BDs, HVDs, etc.), and semiconductor memory (e.g., ROMs, EPROMs, EEPROMs, non-volatile memory (NAND FLASH), Solid State Disks (SSDs)), etc.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one position, or may be distributed on multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. Based on the understanding, the above technical solutions substantially or otherwise contributing to the prior art may be embodied in the form of a software product, which may be stored in a computer-readable storage medium, such as ROM/RAM, magnetic disk, optical disk, etc., and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method according to the various embodiments or some parts of the embodiments.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solutions of the present application, and not to limit the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions in the embodiments of the present application.

Claims (10)

1. An unmanned aerial vehicle-assisted Internet of vehicles resource allocation method is characterized by comprising the following steps:
predicting the vehicle position at the next moment of the vehicle according to the detected track point data of the vehicle;
receiving a task unloading request of the vehicle, and establishing a resource allocation model for allocating the vehicle networking resources for the vehicle based on the task unloading request; the task unloading request comprises a vehicle association mode, the amount of computing resources required by the task, the amount of data required by the task and the maximum delay tolerable by the task; the vehicle association mode comprises a vehicle and unmanned aerial vehicle association mode and a vehicle and unmanned aerial vehicle non-association mode;
and solving the resource allocation model based on a reinforcement learning method and a deep deterministic policy gradient method to obtain an optimal vehicle association mode and a resource allocation policy.
2. The unmanned aerial vehicle-assisted internet of vehicles resource allocation method of claim 1, wherein the predicting the vehicle position at the next moment of the vehicle according to the detected vehicle trajectory point data comprises:
determining a plurality of detected trajectory point data for the vehicle;
calculating the corresponding speed and acceleration of the plurality of track point data of the vehicle;
and calculating the distance between the vehicle and the unmanned aerial vehicle and the azimuth angle of the vehicle according to the speed and the acceleration of the plurality of track point data.
3. The unmanned aerial vehicle-assisted internet of vehicles resource allocation method of claim 1, wherein the establishing a resource allocation model for internet of vehicles resource allocation for vehicles based on the task offloading request comprises:
and establishing a resource allocation model for allocating the vehicle networking resources for the vehicle based on the task unloading request, the available resources of the vehicle, the available resources of the unmanned aerial vehicle, the transmission rate of the uplink of the vehicle and the unmanned aerial vehicle and the transmission rate of the link of the vehicle.
4. The unmanned aerial vehicle-assisted Internet of vehicles resource allocation method of claim 1, wherein the solving of the resource allocation model based on a reinforcement learning method and a deep deterministic policy gradient method to obtain an optimal vehicle association mode and a resource allocation policy comprises:
obtaining an optimization problem function based on the resource allocation model;
converting the optimization problem function based on a reinforcement learning method;
and solving the converted result according to a deep deterministic policy gradient method to obtain an optimal vehicle association mode and a resource allocation policy.
5. The unmanned aerial vehicle-assisted Internet of vehicles resource allocation method of claim 4, wherein converting the optimization problem function based on a reinforcement learning method comprises:
converting the optimization problem function into an environment state space set, an action decision set and a reward function;
the environment state space set comprises the amount of computing resources required by the vehicle task, the amount of data required by the task, the maximum delay tolerable by the task, the vehicle position and the unmanned aerial vehicle position;
the action decision set comprises a vehicle association mode, and a computing resource proportion and a cache resource proportion which are distributed to an associated vehicle by the unmanned aerial vehicle;
the reward function is constructed from the maximum delay the vehicle task can tolerate and the cache resources the vehicle itself possesses.
6. The unmanned aerial vehicle-assisted Internet of vehicles resource allocation method of claim 5, wherein solving the converted result according to the deep deterministic policy gradient method to obtain the optimal vehicle association mode and the resource allocation policy comprises:
initializing the network parameters of the deep deterministic policy gradient method, selecting and executing an action decision from the action decision set based on the state of the environment state space set, and obtaining the reward;
and training the networks of the deep deterministic policy gradient method with empirical data as the training set, and updating the network parameters to obtain the optimal vehicle association mode and the resource allocation policy.
7. An unmanned aerial vehicle-assisted Internet of vehicles resource allocation device, comprising:
the vehicle position prediction module is used for predicting the vehicle position at the next moment of the vehicle according to the track point data of the detected vehicle;
the resource allocation model establishing module is used for receiving a task unloading request of the vehicle and establishing a resource allocation model for allocating the vehicle networking resources for the vehicle based on the task unloading request; the task unloading request comprises a vehicle association mode, the amount of computing resources required by the task, the amount of data required by the task and the maximum delay tolerable by the task; the vehicle association mode comprises a vehicle and unmanned aerial vehicle association mode and a vehicle and unmanned aerial vehicle non-association mode;
and the solving module is used for solving the resource allocation model based on a reinforcement learning method and a deep deterministic policy gradient method to obtain an optimal vehicle association mode and a resource allocation policy.
8. An electronic device comprising a processor and a memory storing a computer program, wherein the processor, when executing the computer program, implements the steps of the unmanned aerial vehicle-assisted Internet of vehicles resource allocation method of any one of claims 1 to 6.
9. A non-transitory computer-readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the steps of the unmanned aerial vehicle-assisted Internet of vehicles resource allocation method of any one of claims 1 to 6.
10. A computer program product comprising a computer program, wherein the computer program, when executed by a processor, implements the steps of the unmanned aerial vehicle-assisted Internet of vehicles resource allocation method of any one of claims 1 to 6.
CN202210612465.7A 2022-05-31 2022-05-31 Unmanned aerial vehicle-assisted Internet of vehicles resource allocation method and device and electronic equipment Pending CN115002725A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210612465.7A CN115002725A (en) 2022-05-31 2022-05-31 Unmanned aerial vehicle-assisted Internet of vehicles resource allocation method and device and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210612465.7A CN115002725A (en) 2022-05-31 2022-05-31 Unmanned aerial vehicle-assisted Internet of vehicles resource allocation method and device and electronic equipment

Publications (1)

Publication Number Publication Date
CN115002725A true CN115002725A (en) 2022-09-02

Family

ID=83032102

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210612465.7A Pending CN115002725A (en) 2022-05-31 2022-05-31 Unmanned aerial vehicle-assisted Internet of vehicles resource allocation method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN115002725A (en)

Similar Documents

Publication Publication Date Title
US11886993B2 (en) Method and apparatus for task scheduling based on deep reinforcement learning, and device
CN111586696B (en) Resource allocation and unloading decision method based on multi-agent architecture reinforcement learning
CN107766135B (en) Task allocation method based on particle swarm optimization and simulated annealing optimization in moving cloud
CN109068391B (en) Internet of vehicles communication optimization algorithm based on edge calculation and Actor-Critic algorithm
CN111405569A (en) Calculation unloading and resource allocation method and device based on deep reinforcement learning
Yang et al. A parallel intelligence-driven resource scheduling scheme for digital twins-based intelligent vehicular systems
Chen et al. Efficiency and fairness oriented dynamic task offloading in internet of vehicles
CN111711666B (en) Internet of vehicles cloud computing resource optimization method based on reinforcement learning
CN113346944A (en) Time delay minimization calculation task unloading method and system in air-space-ground integrated network
CN112422644B (en) Method and system for unloading computing tasks, electronic device and storage medium
CN113543074A (en) Joint computing migration and resource allocation method based on vehicle-road cloud cooperation
CN113645273B (en) Internet of vehicles task unloading method based on service priority
Nguyen et al. Flexible computation offloading in a fuzzy-based mobile edge orchestrator for IoT applications
Lin et al. Deep reinforcement learning-based task scheduling and resource allocation for NOMA-MEC in Industrial Internet of Things
Ahmed et al. MARL based resource allocation scheme leveraging vehicular cloudlet in automotive-industry 5.0
Li et al. DNN Partition and Offloading Strategy with Improved Particle Swarm Genetic Algorithm in VEC
Hou et al. Hierarchical task offloading for vehicular fog computing based on multi-agent deep reinforcement learning
CN113709249A (en) Safe balanced unloading method and system for driving assisting service
CN117221951A (en) Task unloading method based on deep reinforcement learning in vehicle-mounted edge environment
CN116916272A (en) Resource allocation and task unloading method and system based on automatic driving automobile network
CN115002725A (en) Unmanned aerial vehicle-assisted Internet of vehicles resource allocation method and device and electronic equipment
CN116009590A (en) Unmanned aerial vehicle network distributed track planning method, system, equipment and medium
CN115550357A (en) Multi-agent multi-task cooperative unloading method
CN114928826A (en) Two-stage optimization method, controller and decision method for software-defined vehicle-mounted task unloading and resource allocation
CN114980127A (en) Calculation unloading method based on federal reinforcement learning in fog wireless access network

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination