CN114268923A - Internet of vehicles task unloading scheduling method and system - Google Patents

Internet of vehicles task unloading scheduling method and system

Info

Publication number
CN114268923A
Authority
CN
China
Prior art keywords
task
vehicle
time slot
vehicles
queue
Prior art date
Legal status
Pending
Application number
CN202111535739.9A
Other languages
Chinese (zh)
Inventor
鲁蔚锋
刘锐
徐佳
徐力杰
蒋凌云
Current Assignee
Nanjing University of Posts and Telecommunications
Original Assignee
Nanjing University of Posts and Telecommunications
Priority date
Filing date
Publication date
Application filed by Nanjing University of Posts and Telecommunications
Priority to CN202111535739.9A
Publication of CN114268923A


Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00: Reducing energy consumption in communication networks
    • Y02D30/70: Reducing energy consumption in communication networks in wireless communication networks

Abstract

The invention provides a task offloading scheduling method and system for the Internet of Vehicles. The method comprises: designing a queue model for each vehicle in view of the communication model and the computation model in the Internet of Vehicles; designing a system objective function under energy consumption and delay constraints; modeling task offloading scheduling as a Markov decision process; solving the optimal task offloading schedule based on a double deep Q network; and carrying out deep reinforcement learning training based on federated learning. The invention fully considers the computation and caching of computing tasks in the vehicle, uses federated learning to find an effective task scheduling strategy, guarantees the requirements of delay-sensitive tasks, minimizes the delay loss, energy loss and service charge of the system, and protects the privacy of user vehicles by adopting a distributed training method.

Description

Internet of vehicles task unloading scheduling method and system
Technical Field
The invention relates to a task offloading scheduling method and system for the Internet of Vehicles, and in particular to a task offloading scheduling method and system for the intelligent Internet of Vehicles based on federated learning.
Background
In recent years, the Internet of Things and autonomous vehicles have received much attention. In the intelligent Internet of Vehicles, not only the computing, communication and caching functions of the terminal vehicle but also the delay requirements of tasks must be considered, and these requirements depend to a great extent on the communication and computing capability of the system. Guaranteeing a reasonable allocation of computing and communication resources through computation offloading is a necessary condition for realizing an intelligent Internet of Vehicles. Vehicular edge computing migrates computing tasks to the network edge, which effectively reduces end-to-end delay and meets the low-delay, high-reliability requirements of Internet of Vehicles applications.
The dynamic changes in the Internet of Vehicles also introduce storage and communication complexity, because the topology of the network is constantly changing due to the high mobility of the vehicles. In such a dynamic environment, resource allocation is usually a non-convex optimization problem with complex objective functions and constraints that traditional optimization algorithms struggle to solve, whereas deep reinforcement learning handles such complex optimization problems well. With the development of 5G networks, terminal vehicles now have the conditions required for training artificial intelligence models, so training models on the vehicle has become possible. Federated learning, as a distributed machine learning method, can further reduce communication delay and protect the privacy of end users. Compared with traditional computation offloading in the Internet of Vehicles, the federated-learning-based task offloading scheduling method for the intelligent Internet of Vehicles jointly optimizes communication, caching and computing resources.
In view of the above, there is a need to design a federated-learning-based task offloading scheduling method and system for the Internet of Vehicles to solve the above problems.
Disclosure of Invention
The invention aims to solve the joint communication and computation optimization problem in the Internet of Vehicles environment, design a corresponding task queue and energy queue for each vehicle, find an effective task scheduling strategy by using federated learning, guarantee the requirements of delay-sensitive tasks, minimize the delay loss, energy loss and service charge of the system, and protect the privacy of user vehicles.
In order to achieve the above object, the present invention provides a task offloading scheduling method for the Internet of Vehicles. In the Internet of Vehicles, the whole road is divided into M disjoint road sections according to the coverage areas of the roadside units, a plurality of vehicles are present within the coverage area of one roadside unit, and the vehicles and the roadside unit complete the computation and offloading of tasks through wireless links. The method comprises the following steps:
Step 1: considering the communication model and the computation model in the Internet of Vehicles, designing a queue model for each vehicle;
Step 2: considering the energy consumption constraint and the delay constraint, designing a system objective function;
Step 3: modeling task offloading scheduling as a Markov decision process;
Step 4: solving the optimal task offloading schedule based on a double deep Q network;
Step 5: carrying out deep reinforcement learning training based on federated learning.
A further development of the invention is that step 1 comprises the following steps:
Step 1.1: calculate the wireless communication rate between the vehicle and the roadside unit, $r_i^m = B\log_2\left(1+\frac{P_i L_0 d_{i,m}^{-\alpha}}{P_w}\right)$; when task $k$ is offloaded to the roadside unit for computation, the uplink transmission delay between the vehicle and the RSU is $t_u(i,m,k)=\frac{I_k}{r_i^m}$, where $L_0$ is the path loss, $P_i$ is the transmit power of vehicle $v_i$, $P_w$ is the Gaussian white noise power, $\alpha$ is the path loss exponent, $d_{i,m}$ is the distance between the vehicle and the RSU, and $B$ is the channel bandwidth;
Step 1.2: each computation task may either be computed locally in the vehicle or offloaded to the RSU for computation; when the task is offloaded to the RSU, the computation time of offloading task $k$ to RSU $m$ is $t_m(k)=\frac{I_k c_k |V_i^m|}{F_m}$, where $F_m$ is the CPU frequency of edge server $m$ and $|V_i^m|$ is the number of vehicles in the vehicle set of roadside unit $m$; the transmission energy consumed by offloading to the edge server in one time slot is the product of the amount of data transmitted in that slot and the energy consumption per unit of data; when the task is computed locally in the vehicle, the local computation delay is $t_c(i,k)=\frac{I_k c_k}{f_i^{local}}$ and the local computation energy is $e_c(i,k)=P_i^{local}\,t_c(i,k)$, where $f_i^{local}$ and $P_i^{local}$ are the CPU frequency and power of the vehicle, respectively;
Step 1.3: calculate the total time for vehicle $i$ to process task $k$, i.e. the uplink transmission delay plus the RSU computation time when the task is offloaded, or the local computation delay when it is computed locally;
Step 1.4: each vehicle has $T_s$ task priority queues, where $T_s$ is the maximum delay limit over all task types, i.e. $T_s=\max\{T_k, k\in\{1,2,\dots,K\}\}$; the capacity of the vehicle task queues is defined per time slot, i.e. the maximum queue capacity ensures that the task queue can hold any task arriving in any time slot, and the task queues are numbered $\{1,2,\dots,l,\dots,T_s\}$;
Step 1.5: calculate the initial priority $pr(i,k)$ of each task $k$ generated in vehicle $i$ in each time slot from the task's data size and remaining processing time; tasks with smaller initial priority values are processed first;
Step 1.6: update the vehicle energy queue, whose change in each time slot is determined by the energy consumed by local computation and uplink transmission in that slot;
wherein the set of $n$ vehicles is $V=\{v_1,v_2,\dots,v_n\}$, and each vehicle $v_i\in V$ travels on the road at speed $S_i$; the set of roadside units is $G=\{R_1,R_2,\dots,R_M\}$, and the communication range of each roadside unit $R_m\in G$ is a circle of diameter $d_m$; the set of vehicles within the communication range of the same $R_m$ is denoted $V_i^m$, indicating that the vehicles within the communication range of an RSU can communicate with the corresponding RSU via V2I; the total time is divided into $N$ equal time slots $\tau$; there are $K$ different types of tasks, each type with a different generation probability, and the generation probability of the $k$-th type of task in each time slot is $\lambda_k$, where $k$ indexes the task types; each task is represented by a triple $a_k=\langle I_k,c_k,T_k\rangle$, where $I_k$, $c_k$ and $T_k$ respectively denote the data size of task $a_k$, the number of CPU cycles required to compute the task, and the task delay limit; $x_{i,k}^{t,m}$ and $y_{i,k}^{t}$ are binary decision variables: $x_{i,k}^{t,m}=1$ indicates that in time slot $t$ task $k$ of vehicle $v_i$ is offloaded to roadside unit $R_m$ for computation, $y_{i,k}^{t}=1$ indicates that in time slot $t$ task $k$ is computed locally by vehicle $v_i$, and when both are zero task $k$ remains in the vehicle's task queue during time slot $t$.
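A minimal sketch of the communication and computation quantities in step 1, assuming a Shannon-form rate, delay equal to cycles divided by frequency, and energy equal to power multiplied by time; the function names and closed forms below are illustrative assumptions rather than the patent's exact formulas, which are given only as images in the original filing.

```python
import math

def uplink_rate(B, P_i, L0, d_im, alpha, P_w):
    # Assumed Shannon-form rate r_i^m built from the quantities named in step 1.1.
    return B * math.log2(1 + (P_i * L0 * d_im ** (-alpha)) / P_w)

def uplink_delay(I_k, r_im):
    # t_u(i, m, k): time to push the I_k bits of task data over the V2I link.
    return I_k / r_im

def rsu_compute_time(I_k, c_k, num_vehicles_m, F_m):
    # t_m(k): RSU m's CPU frequency F_m is shared by the vehicles it serves.
    return I_k * c_k * num_vehicles_m / F_m

def local_compute_time(I_k, c_k, f_local):
    # t_c(i, k): assumed cycles / frequency (the translated text prints a product).
    return I_k * c_k / f_local

def local_compute_energy(P_local, t_c):
    # e_c(i, k): assumed vehicle CPU power multiplied by the local computation time.
    return P_local * t_c
```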
A further development of the invention is that step 2 comprises the following steps:
Step 2.1: calculate the amount of type-$k$ tasks processed locally by vehicle $i$ in one time slot;
Step 2.2: calculate the amount of tasks $h_t$ exceeding the delay limit in time slot $t$, which is determined by the amount of tasks in the queue with index 1 of vehicle $i$'s task queues in time slot $t$;
Step 2.3: minimize the total cost of system energy consumption and task processing; the system objective function minimizes this total cost (1), subject to:
$x_{i,k}^{t,m},\, y_{i,k}^{t} \in \{0,1\}$ (2)
each vehicle can only select one decision variable in one time slot (3)
$f_i \le f_{max}$ (4)
the task delay constraint of each time slot (5)
the task energy consumption constraint of each time slot (6)
where $W_2$ is a weight constant; constraint (2) represents that the offloading decision variables can only take the value 0 or 1; constraint (3) represents that each vehicle can only select one decision variable in one time slot; constraint (4) represents the vehicle CPU frequency constraint; constraint (5) represents the task delay constraint of each time slot; constraint (6) represents the task energy consumption constraint of each time slot.
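As a rough illustration of the objective in step 2.3 and the checks behind constraints (2)-(6), the following sketch combines the per-slot delay, energy, service charge and the penalty on overdue tasks with the weights $W_1$ and $W_2$; the additive form and the argument names are assumptions, since the exact objective expression appears only as a formula image.

```python
def slot_cost(delay, energy, service_charge, overdue_tasks, W1=1.0, W2=1.0):
    # Assumed per-slot cost: delay plus weighted energy plus service charge plus
    # the penalty W1 on the task amount h_t that exceeded its delay limit.
    return delay + W2 * energy + service_charge + W1 * overdue_tasks

def decision_is_feasible(x_vars, y_var, f_i, f_max, delay, T_k, energy, e_budget):
    # Constraint (2): binary decision variables.
    binary = all(v in (0, 1) for v in x_vars + [y_var])
    # Constraint (3): at most one decision per slot (offload, local, or keep).
    one_choice = sum(x_vars) + y_var <= 1
    # Constraints (4)-(6): CPU frequency cap, per-slot delay and energy limits.
    return binary and one_choice and f_i <= f_max and delay <= T_k and energy <= e_budget
```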
A further development of the invention is that said step 3 comprises the following steps:
Step 3.1: design the system state space $S_t$, which comprises the state of the vehicles in the $m$-th road section in time slot $t$, where $X_t$ denotes the position of the vehicle at time $t$, obtained from its position in the previous slot and the vehicle speed, the number of vehicles in the vehicle set of road section $m$ in time slot $t$, and the state of each task queue and of the energy queue of vehicle $i$ in time slot $t$;
Step 3.2: design the system action space $A_t$, which describes the task offloading decision space of a vehicle, i.e. the action space; its elements are the offloading decision variables $x_{i,k}^{t,m}$ ($1 \le m \le M$) and $y_{i,k}^{t}$;
Step 3.3: energy queue state transition: the energy queue of the next time slot is obtained from the current energy queue and the energy consumed in the current time slot;
Step 3.4: task queue state transition: the task queues of the next time slot are obtained from the current queues by removing the tasks offloaded through V2I communication or computed locally, shifting the remaining tasks to the queue with the next lower index, and inserting the newly generated tasks according to their initial priority;
Step 3.5: design the loss function of the system; in time slot $t$, with the system in state $S_t$ and the vehicle taking action $A_t$, the loss is $Loss_t(S_t,A_t)$;
Step 3.6: the optimal scheduling policy of the system is $\pi^{*}=\arg\min_{\pi}\mathbb{E}\left[\sum_{t}\eta^{t}\,Loss_t(S_t,A_t)\right]$, where $0 < \eta < 1$ is a discount factor that indicates the impact of future losses on the current operation.
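A small sketch of how the state of step 3.1 and the discounted objective of step 3.6 might be represented; the field layout of the state container is an assumption, and only the quantities named in the text (position, task queues, energy queue) are included.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class VehicleState:
    position: float                 # X_t, from the last-slot position and speed
    task_queues: List[List[float]]  # T_s priority queues of pending task amounts
    energy: float                   # energy-queue level in slot t

def discounted_total_loss(per_slot_losses, eta=0.9):
    # Step 3.6: the optimal policy minimizes the eta-discounted sum of losses.
    return sum(eta ** t * loss for t, loss in enumerate(per_slot_losses))
```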
A further development of the invention is that said step 4 comprises the following steps:
Step 4.1: initialize the weights $\theta$ of the double deep Q network, the values of the Q function, and the experience replay buffer;
Step 4.2: give the initial state $S_0$ of the Internet of Vehicles system;
Step 4.3: for each time slot $t=0,1,\dots,t_{max}$, execute steps 4.4-4.15;
Step 4.4: in state $S_t$, compute the expected minimum loss representation (Q value) of taking action $A_t$;
Step 4.5: compute the optimal decision of the deep Q network, i.e. the action that minimizes the Q value in state $S_t$;
Step 4.6: draw a probability value $p$ uniformly at random;
Step 4.7: if $p \le \varepsilon$, choose a random action $A_t$;
Step 4.8: if $p > \varepsilon$, select the action $A_t=\arg\min_{A}Q(S_t,A;\theta_t)$;
Step 4.9: perform action $A_t$ and obtain the state $S_{t+1}$ of the next time slot;
Step 4.10: calculate the loss function $Loss_t$;
Step 4.11: put the experience $(S_t,A_t,Loss_t,S_{t+1})$ into the experience replay buffer;
Step 4.12: randomly draw a batch of experiences from the buffer as training samples and compute the double deep Q function target, in which the online network selects the action for the next state and the target network evaluates it;
Step 4.13: compute the loss $L(\theta_t)$ between the double deep Q target and the current Q value estimate;
Step 4.14: calculate the gradient of $L(\theta_t)$;
Step 4.15: update $\theta_t$ based on the gradient descent method;
Step 4.16: the training model converges, and the trained model is obtained.
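The distinguishing element of step 4.12 is the double-Q target, in which one network chooses the next action and the other evaluates it. A minimal sketch follows; since the scheduler minimizes losses rather than maximizes rewards, argmin replaces the usual argmax, and the split into an online network and a target network is the standard double-DQN assumption rather than notation taken from the patent.

```python
import numpy as np

def double_q_target(loss_t, q_online_next, q_target_next, eta=0.9):
    # q_online_next / q_target_next: Q values of all actions in S_{t+1} under the
    # online network (theta_t) and the target network, respectively.
    a_star = int(np.argmin(q_online_next))       # online network selects the action
    return loss_t + eta * q_target_next[a_star]  # target network evaluates it
```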
A further development of the invention is that said step 5 comprises the following steps:
Step 5.1: given a road section $m$ and a time slot $t$, obtain the set of vehicles in this road section;
Step 5.2: for each time slot $t=1,2,\dots,T$, perform steps 5.3-5.10;
Step 5.3: randomly select a set of $n$ vehicles;
Step 5.4: obtain the global parameters of roadside unit $m$ from the previous time slot;
Step 5.5: on each vehicle, update the local model parameters with the global parameters of roadside unit $m$ from the previous time slot;
Step 5.6: obtain the vehicle's local data;
Step 5.7: train the model with the vehicle's local data to obtain the vehicle's local training parameters and training time;
Step 5.8: upload the local model parameters and the training time to roadside unit $m$;
Step 5.9: roadside unit $m$ receives the vehicle parameters and carries out global aggregation;
Step 5.10: the aggregation produces an improved global model, which is redistributed to the terminal vehicles;
Step 5.11: the training model converges, and the federated learning training is completed.
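The global aggregation in step 5.9 can be sketched as a weighted parameter average in the style of FedAvg; weighting each vehicle by its local data size is an assumption made here for illustration, since the patent gives the aggregation rule only as a formula image.

```python
import numpy as np

def global_aggregation(local_params, data_sizes):
    # RSU-side aggregation: weighted average of the parameter vectors uploaded by
    # the selected vehicles, each weighted by its (assumed) local data size.
    total = float(sum(data_sizes))
    return sum((n / total) * np.asarray(p, dtype=float)
               for p, n in zip(local_params, data_sizes))
```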
A further development of the invention consists in that, before step 5.2, a step of initializing the model parameters of the roadside units and of the vehicle is also included.
In order to achieve the purpose of the invention, the invention further provides a task unloading scheduling system of the internet of vehicles, which is used for implementing the method of any one of the preceding claims.
A further development of the invention is that the system comprises a Road Side Unit (RSU) and an end vehicle.
In a further development of the invention, each vehicle and roadside unit has its own neural network training model, and vehicles on the same road section can complete distributed federated learning together with the RSU.
The invention has the following beneficial effects: the invention considers the joint communication and computation optimization problem in the Internet of Vehicles environment, designs a corresponding task queue and energy queue for each vehicle, and fully considers the computation and caching of computing tasks in the vehicle; by using federated learning, the invention provides a federated-learning-based task offloading scheduling scheme for the intelligent Internet of Vehicles, finds an effective task scheduling strategy, guarantees the requirements of delay-sensitive tasks, minimizes the delay loss, energy loss and service charge of the system, and protects the privacy of user vehicles by adopting a distributed training method.
Drawings
FIG. 1 is a schematic diagram of the Internet of vehicles task offloading system of the present invention.
FIG. 2 is a schematic diagram of a vehicle mission queue.
FIG. 3 is a flow diagram for solving the optimal task offloading schedule using a double deep Q network.
FIG. 4 is a flow chart of a deep reinforcement learning training based on federated learning.
Fig. 5 is a schematic representation of the federated learning process.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in detail with reference to the accompanying drawings and specific embodiments.
It should be emphasized that, in describing the present invention, formulas and constraints are identified with consistent labels; however, the use of different labels to identify the same formula and/or constraint is not precluded, and the labels are provided only for the purpose of illustrating the features of the present invention more clearly.
The invention designs a federated-learning-based task offloading scheduling system and method for the intelligent Internet of Vehicles. The system mainly comprises two types of equipment, namely roadside units and terminal vehicles. Considering the computation offloading problem in edge computing, a vehicle's computing task can be offloaded to a roadside unit for computation as well as computed locally on the vehicle, and the optimal offloading scheduling scheme must be found to minimize the total loss of the system.
The present invention is further described in detail with reference to the following embodiments. The entire road is divided into M disjoint road sections according to the coverage areas of the roadside units, a plurality of vehicles are located within the coverage area of one roadside unit, and the vehicles and the roadside units complete the computation and offloading of tasks through wireless links. As shown in FIG. 1, there are two vehicles within the coverage area of roadside unit RSU 2, each having $T_s$ task priority queues and an energy queue; the task priority queues are used for storing the tasks to be processed, and the energy queue provides the corresponding energy for the vehicle's computing tasks. Meanwhile, as shown on the right side of FIG. 1, each vehicle and roadside unit has its own neural network training model, and vehicles in the same road section can complete distributed federated learning together with the RSU.
Considering the dynamic nature of the Internet of Vehicles environment, the topology of the network and the states of the vehicle queues differ in each time slot. The state transition of the vehicle task queues is shown in FIG. 2, which represents how the tasks in the queues change as each time slot passes, while the change of the energy queue is obtained through the energy queue change equation. By calculating the energy consumption of the system and the delay of the computing tasks, an objective optimization function can be established, namely minimizing the total cost of system energy consumption and task processing under the delay constraint and the energy consumption constraint that guarantee the vehicle tasks in each time slot.
The invention finds the optimal task offloading scheduling scheme by means of a double deep Q network: a Markov decision process is designed from the system state space $S_t$, the action space $A_t$ and the state transition equations, a converged model is trained following the flow shown in FIG. 3, and the optimal task offloading decision $\pi^{*}$ is found.
Distributed federated learning is adopted in the training process of deep reinforcement learning, and the federated learning process of one road section in the Internet of Vehicles is shown in FIG. 5. First, the terminal vehicle downloads an initial training model from the RSU and performs local model training using its own data to minimize the predefined loss function, then updates the trained model weights to the RSU via encrypted transmission. The RSU then collects the updated parameters from the terminal vehicles to produce an improved global model, i.e. global aggregation. Finally, the output of the RSU training model is redistributed to the terminal vehicles, and the terminals perform further local training using the global model as a reference. The training process is repeated until a specified number of iterations is reached. Federated learning reduces the communication load of edge-cloud interaction and protects the privacy of the terminal vehicles.
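From the vehicle's point of view, one round of the process above amounts to downloading the global parameters and taking a few local gradient steps before uploading. A minimal sketch, assuming plain gradient descent with an illustrative learning rate; the gradients are stand-ins for the DDQN gradients of step 4.14 and are not computed here.

```python
import numpy as np

def vehicle_local_update(global_params, local_grads, lr=0.01, steps=5):
    # Start from the downloaded global model, take a few gradient steps on the
    # locally computed gradients, and return the parameters for upload to the RSU.
    params = np.asarray(global_params, dtype=float).copy()
    for g in local_grads[:steps]:
        params -= lr * np.asarray(g, dtype=float)
    return params
```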
The invention provides a federated-learning-based task offloading scheduling method for the intelligent Internet of Vehicles, which specifically comprises the following steps:
Step 1: considering the communication model and the computation model in the Internet of Vehicles system, designing a queue model for each vehicle;
Step 2: considering the energy consumption constraint and the delay constraint, designing a system objective function;
Step 3: modeling task offloading scheduling as a Markov decision process;
Step 4: solving the optimal task offloading schedule based on a double deep Q network;
Step 5: carrying out deep reinforcement learning training based on federated learning.
The following will explain the task unloading scheduling method of the internet of vehicles in detail.
In the invention, the set of vehicles is $V=\{v_1,v_2,\dots,v_n\}$, and each vehicle $v_i\in V$ travels on the road at speed $S_i$. Each vehicle is provided with several task queues and an energy queue; the task queues are used for storing the tasks to be processed, and the energy queue provides the corresponding energy for the vehicle's computing tasks. The set of roadside units (RSUs) is $G=\{R_1,R_2,\dots,R_M\}$, and the communication range of each roadside unit $R_m\in G$ is a circle of diameter $d_m$. A vehicle can only communicate with one roadside unit (RSU) in a time slot, and the set of vehicles within the communication range of the same $R_m$ is denoted $V_i^m$, indicating that vehicles within the communication range of the RSU can communicate with the corresponding RSU via V2I. The whole road is divided into M road sections according to the communication ranges of the M RSUs.
To handle the time-varying nature of the Internet of Vehicles, a time-slicing technique is adopted to divide the total time into N equal time slots $\tau$, where $\tau$ is a small time interval. The system state is assumed to remain unchanged within each slot. Each $R_m$ communicates via V2I with the vehicles $V_i^m$ in its communication range to jointly complete the computation offloading of tasks.
The set of task types is $\{a_1,a_2,\dots,a_K\}$, and the generation probability of the $k$-th type of task in each time slot is $\lambda_k$. Each task is represented by a triple $a_k=\langle I_k,c_k,T_k\rangle$, where $I_k$, $c_k$ and $T_k$ respectively denote the data size of task $a_k$, the number of CPU cycles required to compute the task, and the task delay constraint. The decision variable $x_{i,k}^{t,m}=1$ indicates that in time slot $t$ task $k$ of vehicle $v_i$ is offloaded to roadside unit $R_m$ for computation, the decision variable $y_{i,k}^{t}=1$ indicates that in time slot $t$ task $k$ is computed locally by vehicle $v_i$, and when both are zero task $k$ remains in the vehicle's task queue during time slot $t$.
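For concreteness, the task triple and the per-slot decision can be represented as follows; the class and member names are illustrative only and are not notation used by the patent.

```python
from dataclasses import dataclass
from enum import Enum

@dataclass(frozen=True)
class TaskType:
    # a_k = <I_k, c_k, T_k>: data size, required CPU cycles, delay limit in slots.
    I_k: float
    c_k: float
    T_k: int

class SlotDecision(Enum):
    # Per-slot decision for task k of vehicle i (the binary variables x and y).
    OFFLOAD_TO_RSU = "x=1"   # offloaded to roadside unit R_m for computation
    COMPUTE_LOCALLY = "y=1"  # computed on the vehicle's own CPU
    KEEP_IN_QUEUE = "x=y=0"  # remains in the vehicle's task queue this slot
```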
Considering the communication model of the system, the wireless communication rate $r_i^m$ between the vehicle and the roadside unit can be calculated; from this communication rate, the uplink transmission delay $t_u(i,m,k)$ when task $k$ is offloaded to the roadside unit can be calculated. Considering the computation model of the system, the delay $t_c(i,k)$ of computing task $k$ locally in the vehicle and the computation time $t_m(k)$ of offloading the task to the roadside unit can be calculated, so that the total time for vehicle $i$ to process task $k$ is obtained, together with the energy consumption of local computation on the vehicle and the energy consumption of uplink transmission. Each vehicle has $T_s$ task priority queues, which store the tasks to be processed, and an energy queue, which provides the corresponding energy for the vehicle's computing tasks. For the tasks newly generated in each time slot, the initial priority $pr(i,k)$ is calculated according to the data size and the remaining processing time of the task, and the tasks are sorted by initial priority from small to large and placed into the vehicle task queue with the corresponding index. The vehicle task queue changes as shown in FIG. 2: for a task queue $l$, there are two input sources, namely the tasks received by the vehicle itself in time slot $t$ and the tasks transferred from the previous time slot, i.e. the leftover tasks in task queue $l+1$; there are three output destinations, namely the tasks transmitted by the vehicle to the roadside unit for computation through V2I communication in time slot $t$, the tasks transferred to task queue $l-1$ as the time slot elapses, and the tasks computed locally. The change of the vehicle energy queue is obtained from the energy change formula.
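The queue dynamics just described, i.e. the per-slot shift toward the queue with index 1 and the priority-based insertion of new tasks, can be sketched as below; representing each queue as a plain list and mapping the initial priority directly to a queue index are simplifying assumptions.

```python
def advance_task_queues(queues):
    # One time slot passes: whatever is still in queue 1 has exceeded its delay
    # limit; every other queue l+1 shifts down to queue l; the last queue empties.
    overdue = list(queues[0])
    shifted = [list(q) for q in queues[1:]] + [[]]
    return shifted, overdue

def enqueue_new_task(queues, task, priority_index):
    # A newly generated task is placed into the queue whose index comes from its
    # initial priority pr(i, k); smaller values are served sooner (assumed mapping).
    idx = min(max(priority_index, 1), len(queues)) - 1
    queues[idx].append(task)
```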
Therefore, considering the communication model and the computation model in the Internet of Vehicles system, step 1 of designing the queue model for each vehicle specifically includes the following:
Step 1.1: calculate the wireless communication rate $r_i^m$ between the vehicle and the roadside unit; when task $k$ is offloaded to the roadside unit for computation, the uplink transmission delay between the vehicle and the RSU is $t_u(i,m,k)=\frac{I_k}{r_i^m}$;
Step 1.2: when the task is offloaded to the RSU for computation, the computation time of offloading task $k$ to RSU $m$ is $t_m(k)=\frac{I_k c_k |V_i^m|}{F_m}$; the transmission energy consumed by offloading to the edge server in one time slot is the product of the amount of data transmitted in that slot and the energy consumption per unit of data; when the task is computed locally in the vehicle, the local computation delay is $t_c(i,k)=\frac{I_k c_k}{f_i^{local}}$ and the local computation energy is $e_c(i,k)=P_i^{local}\,t_c(i,k)$;
Step 1.3: calculate the total time for vehicle $i$ to process task $k$;
Step 1.4: the vehicle task queues are numbered $\{1,2,\dots,l,\dots,T_s\}$, where $T_s$ is the maximum delay limit over all task types, i.e. $T_s=\max\{T_k, k\in\{1,2,\dots,K\}\}$;
Step 1.5: calculate the initial priority $pr(i,k)$ of each task $k$ generated in vehicle $i$ in each time slot;
Step 1.6: update the vehicle energy queue according to the energy change formula.
For tasks exceeding the delay limit we introduce a penalty factor $W_1$: the tasks in the task queue with index 1 have only one time slot of processing time left, so if they cannot be processed in time they exceed the delay limit of the task, and the amount of tasks $h_t$ exceeding the delay limit in time slot $t$ is calculated accordingly. The total cost for vehicle $i$ to process task $k$ in time slot $t$ consists of the delay, the energy consumption and the service charges, the latter mainly including the service charge $cs_{k,m}$ of the roadside unit and the local computation cost $cs_{k,i}$ of the vehicle. The aim of the invention is to minimize the total system loss while guaranteeing the delay constraint and the energy consumption constraint of the vehicle tasks in each time slot.
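One way to picture the per-task cost just described is the following sketch, which selects the delay, energy and service-charge terms according to the offloading decision; the additive composition and the placement of the weight $W_2$ are assumptions for illustration only.

```python
def task_cost(offloaded, t_u, t_m, t_c, e_u, e_c, cs_rsu, cs_local, W2=1.0):
    # Cost of handling task k in slot t: delay plus weighted energy plus the
    # service charge, with the terms chosen by the offloading decision.
    if offloaded:
        return (t_u + t_m) + W2 * e_u + cs_rsu
    return t_c + W2 * e_c + cs_local
```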
Therefore, considering the limitations of the energy consumption constraint and the delay constraint, step 2 of designing the system objective function specifically includes the following steps:
Step 2.1: calculate the amount of type-$k$ tasks processed locally by vehicle $i$ in one time slot;
Step 2.2: the delay constraint of a delay-sensitive task is $T_k$; since a task exceeding the delay limit cannot be completed, a penalty mechanism is introduced that imposes a certain penalty on the amount of tasks exceeding the delay limit, with penalty factor $W_1$. The tasks in the task queue with index 1 have only one time slot of processing time left, and if they cannot be processed in time they exceed the delay limit of the task; $h_t$ is the amount of tasks exceeding the delay limit in time slot $t$, determined by the amount of tasks in the queue with index 1 of vehicle $i$'s task queues in time slot $t$;
Step 2.3: the goal of the system is to minimize the total cost of system energy consumption and task processing while guaranteeing the delay constraint and the energy consumption constraint of the vehicle tasks in each time slot, so the system objective function minimizes this total cost (1), subject to:
$x_{i,k}^{t,m},\, y_{i,k}^{t} \in \{0,1\}$ (2)
each vehicle can only select one decision variable in one time slot (3)
$f_i \le f_{max}$ (4)
the task delay constraint of each time slot (5)
the task energy consumption constraint of each time slot (6)
where $W_2$ is a weight constant; constraint (2) represents that the offloading decision variables can only take the value 0 or 1; constraint (3) represents that each vehicle can only select one decision variable in one time slot; constraint (4) represents the vehicle CPU frequency constraint; constraint (5) represents the task delay constraint of each time slot; constraint (6) represents the task energy consumption constraint of each time slot.
Further, step 3 of modeling task offloading scheduling as a Markov decision process comprises the following:
Step 3.1: design the system state space $S_t$, which comprises the state of the vehicles in the $m$-th road section in time slot $t$, where $X_t$ represents the position of the vehicle at time $t$ and can be determined from its position in the previous slot and the vehicle speed, the number of vehicles in the vehicle set of road section $m$ in time slot $t$, and the state of each task queue and of the energy queue of vehicle $i$ in time slot $t$;
Step 3.2: design the system action space $A_t$, which describes the task offloading decision space of a vehicle, i.e. the action space; its elements are the offloading decision variables $x_{i,k}^{t,m}$ ($1 \le m \le M$) and $y_{i,k}^{t}$, which determine whether the $k$-th task of the vehicle is executed locally, offloaded to an RSU (roadside unit), or kept in the vehicle's task queue;
Step 3.3: energy queue state transition: the energy queue of the next time slot is obtained from the current energy queue and the energy consumed in the current time slot;
Step 3.4: task queue state transition: the amount of tasks offloaded through V2I communication in time slot $t$, the amount of tasks computed locally by the vehicle in time slot $t$, and the amount of tasks generated by the vehicle in the time slot are calculated, where $l$ is the index of the task queue and an arriving task is placed into the task queue whose index is determined by its initial priority;
Step 3.5: design the loss function of the system; in time slot $t$, with the system in state $S_t$ and the vehicle taking action $A_t$, the loss is $Loss_t(S_t,A_t)$;
Step 3.6: the optimal scheduling policy of the system is $\pi^{*}=\arg\min_{\pi}\mathbb{E}\left[\sum_{t}\eta^{t}\,Loss_t(S_t,A_t)\right]$, where $0 < \eta < 1$ is a discount factor that indicates the impact of future losses on the current operation.
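The per-queue bookkeeping behind the task queue transition of step 3.4 can be sketched as a simple balance; treating the quantities as scalar task amounts is an assumption for illustration.

```python
def next_queue_amount(carried_over, newly_generated, offloaded_v2i, computed_locally):
    # What remains for the next slot is the carried-over amount plus new arrivals,
    # minus what was offloaded over V2I and what was computed locally (floored at 0).
    return max(carried_over + newly_generated - offloaded_v2i - computed_locally, 0.0)
```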
Further, as shown in FIG. 3, step 4 of solving the optimal task offloading schedule based on the double deep Q network includes the following steps:
Step 4.1: initialize the weights $\theta$ of the double deep Q network, the values of the Q function, and the experience replay buffer;
Step 4.2: give the initial state $S_0$ of the Internet of Vehicles system;
Step 4.3: for each time slot $t=0,1,\dots,t_{max}$, execute steps 4.4-4.15;
Step 4.4: in state $S_t$, compute the expected minimum loss representation (Q value) of taking action $A_t$;
Step 4.5: compute the optimal decision of the deep Q network, i.e. the action that minimizes the Q value in state $S_t$;
Step 4.6: draw a probability value $p$ uniformly at random;
Step 4.7: if $p \le \varepsilon$, choose a random action $A_t$;
Step 4.8: if $p > \varepsilon$, select the action $A_t=\arg\min_{A}Q(S_t,A;\theta_t)$;
Step 4.9: perform action $A_t$ and obtain the state $S_{t+1}$ of the next time slot;
Step 4.10: calculate the loss function $Loss_t$;
Step 4.11: put the experience $(S_t,A_t,Loss_t,S_{t+1})$ into the experience replay buffer;
Step 4.12: randomly draw a batch of experiences from the buffer as training samples and compute the double deep Q function target, in which the online network selects the action for the next state and the target network evaluates it;
Step 4.13: compute the loss $L(\theta_t)$ between the double deep Q target and the current Q value estimate;
Step 4.14: calculate the gradient of $L(\theta_t)$;
Step 4.15: update $\theta_t$ based on the gradient descent method;
Step 4.16: the training model converges, and the trained model is obtained.
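A compact sketch of the experience pool of steps 4.11-4.12 and the epsilon-greedy selection of steps 4.6-4.8; the buffer capacity, batch handling and the use of argmin (losses are minimized) are illustrative choices, not values prescribed by the patent.

```python
import random
from collections import deque

import numpy as np

class ReplayBuffer:
    # Experience pool of (S_t, A_t, Loss_t, S_{t+1}) tuples.
    def __init__(self, capacity=10000):
        self.buffer = deque(maxlen=capacity)

    def push(self, s, a, loss, s_next):
        self.buffer.append((s, a, loss, s_next))

    def sample(self, batch_size):
        return random.sample(list(self.buffer), min(batch_size, len(self.buffer)))

def epsilon_greedy(q_values, epsilon):
    # With probability epsilon explore; otherwise take the action with the
    # smallest predicted loss.
    if random.random() <= epsilon:
        return random.randrange(len(q_values))
    return int(np.argmin(q_values))
```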
further, as shown in fig. 4, for the model training process in each road segment, the step 5 of performing deep reinforcement learning training based on federal learning specifically includes the following steps:
step 5.1: given a road section m, and a time slot t, the set of vehicles in this road section
Figure BDA0003412539310000176
Step 5.2: performing steps 5.3-5.10 for time slot T ═ 1, 2., T;
step 5.3: randomly selecting n vehicle sets
Figure BDA0003412539310000177
Step 5.4: obtaining the global parameter of a last time slot of the roadside unit m:
Figure BDA0003412539310000178
step 5.5: using global parameters of a time slot on each vehicular roadside unit mTo update the model local parameters:
Figure BDA0003412539310000179
step 5.6: obtaining vehicle local data:
Figure BDA0003412539310000181
step 5.7: utilizing vehicle local data
Figure BDA0003412539310000182
Training the model to obtain vehicle local training parameters and time:
Figure BDA0003412539310000183
step 5.8: uploading model parameters
Figure BDA0003412539310000184
And
Figure BDA0003412539310000185
to the roadside unit m;
step 5.9: the roadside unit m receives the vehicle parameters to carry out global aggregation:
Figure BDA0003412539310000186
step 5.10: aggregating to produce an improved global model and reassigning to the end vehicles;
step 5.11: and (5) the training model is converged, and the federal learning training is completed.
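Putting steps 5.1-5.10 together for one road section, a schematic training loop might look as follows; the helpers local_train and aggregate and the vehicle objects are hypothetical stand-ins for the vehicle-side DDQN update and the RSU-side aggregation, not interfaces defined by the patent.

```python
import random

def federated_training_on_segment(rsu_params, vehicles, n, T, local_train, aggregate):
    # One road section m: in every slot, n randomly chosen vehicles pull the RSU's
    # last-slot global parameters, train locally, and the RSU aggregates the uploads.
    for t in range(1, T + 1):
        selected = random.sample(vehicles, min(n, len(vehicles)))
        uploads = [local_train(v, list(rsu_params)) for v in selected]
        rsu_params = aggregate(uploads)  # improved global model for the next slot
    return rsu_params
```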
Of course, a step of initializing the model parameters of the roadside units and the vehicle is also included before step 5.2.
In conclusion, the invention considers the joint communication and computation optimization problem in the Internet of Vehicles environment, designs a corresponding task queue and energy queue for each vehicle, and fully considers the computation and caching of computing tasks in the vehicle. In addition, by using federated learning, the invention provides a federated-learning-based task offloading scheduling scheme for the intelligent Internet of Vehicles, finds an effective task scheduling strategy, guarantees the requirements of delay-sensitive tasks, minimizes the delay loss, energy loss and service charge of the system, and protects the privacy of user vehicles by adopting a distributed training method.
Although the present invention has been described in detail with reference to the preferred embodiments, it will be understood by those skilled in the art that various changes may be made and equivalents may be substituted for elements thereof without departing from the spirit and scope of the present invention.

Claims (10)

1. A task offloading scheduling method for the Internet of Vehicles, characterized in that, in the Internet of Vehicles, the whole road is divided into M mutually disjoint road sections according to the coverage areas of the roadside units, a plurality of vehicles are located within the coverage area of one roadside unit, and the vehicles and the roadside units complete the computation and offloading of tasks through wireless links, the method comprising the following steps:
Step 1: considering the communication model and the computation model in the Internet of Vehicles, designing a queue model for each vehicle;
Step 2: considering the energy consumption constraint and the delay constraint, designing a system objective function;
Step 3: modeling task offloading scheduling as a Markov decision process;
Step 4: solving the optimal task offloading schedule based on a double deep Q network;
Step 5: carrying out deep reinforcement learning training based on federated learning.
2. The method of claim 1, wherein step 1 comprises the following steps:
Step 1.1: calculate the wireless communication rate between the vehicle and the roadside unit, $r_i^m = B\log_2\left(1+\frac{P_i L_0 d_{i,m}^{-\alpha}}{P_w}\right)$; when task $k$ is offloaded to the roadside unit for computation, the uplink transmission delay between the vehicle and the RSU is $t_u(i,m,k)=\frac{I_k}{r_i^m}$, where $L_0$ is the path loss, $P_i$ is the transmit power of vehicle $v_i$, $P_w$ is the Gaussian white noise power, $\alpha$ is the path loss exponent, $d_{i,m}$ is the distance between the vehicle and the RSU, and $B$ is the channel bandwidth;
Step 1.2: each computation task may either be computed locally in the vehicle or offloaded to the RSU for computation; when the task is offloaded to the RSU, the computation time of offloading task $k$ to RSU $m$ is $t_m(k)=\frac{I_k c_k |V_i^m|}{F_m}$, where $F_m$ is the CPU frequency of edge server $m$ and $|V_i^m|$ is the number of vehicles in the vehicle set of roadside unit $m$; the transmission energy consumed by offloading to the edge server in one time slot is the product of the amount of data transmitted in that slot and the energy consumption per unit of data; when the task is computed locally in the vehicle, the local computation delay is $t_c(i,k)=\frac{I_k c_k}{f_i^{local}}$ and the local computation energy is $e_c(i,k)=P_i^{local}\,t_c(i,k)$, where $f_i^{local}$ and $P_i^{local}$ are the CPU frequency and power of the vehicle, respectively;
Step 1.3: calculate the total time for vehicle $i$ to process task $k$;
Step 1.4: each vehicle has $T_s$ task priority queues, where $T_s$ is the maximum delay limit over all task types, i.e. $T_s=\max\{T_k, k\in\{1,2,\dots,K\}\}$; the capacity of the vehicle task queues is defined per time slot, i.e. the maximum queue capacity ensures that the task queue can hold any task arriving in any time slot, and the task queues are numbered $\{1,2,\dots,l,\dots,T_s\}$;
Step 1.5: calculate the initial priority $pr(i,k)$ of each task $k$ generated in vehicle $i$ in each time slot; tasks with smaller initial priority values are processed first;
Step 1.6: update the vehicle energy queue, whose change in each time slot is determined by the energy consumed by local computation and uplink transmission in that slot;
wherein the set of $n$ vehicles is $V=\{v_1,v_2,\dots,v_n\}$, and each vehicle $v_i\in V$ travels on the road at speed $S_i$; the set of $M$ roadside units is $G=\{R_1,R_2,\dots,R_M\}$, and the communication range of each roadside unit $R_m\in G$ is a circle of diameter $d_m$; the set of vehicles within the communication range of the same $R_m$ is denoted $V_i^m$, indicating that the vehicles within the communication range of an RSU can communicate with the corresponding RSU via V2I; the total time is divided into $N$ equal time slots $\tau$; there are $K$ different types of tasks, each type with a different generation probability, and the generation probability of the $k$-th type of task in each time slot is $\lambda_k$, where $k$ indexes the task types; each task is represented by a triple $a_k=\langle I_k,c_k,T_k\rangle$, where $I_k$, $c_k$ and $T_k$ respectively denote the data size of task $a_k$, the number of CPU cycles required to compute the task, and the task delay limit; $x_{i,k}^{t,m}$ and $y_{i,k}^{t}$ are binary decision variables: $x_{i,k}^{t,m}=1$ indicates that in time slot $t$ task $k$ of vehicle $v_i$ is offloaded to roadside unit $R_m$ for computation, $y_{i,k}^{t}=1$ indicates that in time slot $t$ task $k$ is computed locally by vehicle $v_i$, and when both are zero task $k$ remains in the vehicle's task queue during time slot $t$.
3. The method of claim 2, wherein step 2 comprises the following steps:
Step 2.1: calculate the amount of type-$k$ tasks processed locally by vehicle $i$ in one time slot;
Step 2.2: calculate the amount of tasks $h_t$ exceeding the delay limit in time slot $t$, which is determined by the amount of tasks in the queue with index 1 of vehicle $i$'s task queues in time slot $t$;
Step 2.3: minimize the total cost of system energy consumption and task processing; the system objective function minimizes this total cost (1), subject to:
$x_{i,k}^{t,m},\, y_{i,k}^{t} \in \{0,1\}$ (2)
each vehicle can only select one decision variable in one time slot (3)
$f_i \le f_{max}$ (4)
the task delay constraint of each time slot (5)
the task energy consumption constraint of each time slot (6)
where $W_2$ is a weight constant; constraint (2) represents that the offloading decision variables can only take the value 0 or 1; constraint (3) represents that each vehicle can only select one decision variable in one time slot; constraint (4) represents the vehicle CPU frequency constraint; constraint (5) represents the task delay constraint of each time slot; constraint (6) represents the task energy consumption constraint of each time slot.
4. The method of claim 3, wherein step 3 comprises the following steps:
Step 3.1: design the system state space $S_t$, which comprises the state of the vehicles in the $m$-th road section in time slot $t$, where $X_t$ denotes the position of the vehicle at time $t$, obtained from its position in the previous slot and the vehicle speed, the number of vehicles in the vehicle set of road section $m$ in time slot $t$, and the state of each task queue and of the energy queue of vehicle $i$ in time slot $t$;
Step 3.2: design the system action space $A_t$, which describes the task offloading decision space of a vehicle, i.e. the action space; its elements are the offloading decision variables $x_{i,k}^{t,m}$ ($1 \le m \le M$) and $y_{i,k}^{t}$;
Step 3.3: energy queue state transition: the energy queue of the next time slot is obtained from the current energy queue and the energy consumed in the current time slot;
Step 3.4: task queue state transition: the task queues of the next time slot are obtained from the current queues by removing the tasks offloaded through V2I communication or computed locally, shifting the remaining tasks to the queue with the next lower index, and inserting the newly generated tasks according to their initial priority;
Step 3.5: design the loss function of the system; in time slot $t$, with the system in state $S_t$ and the vehicle taking action $A_t$, the loss is $Loss_t(S_t,A_t)$;
Step 3.6: the optimal scheduling policy of the system is $\pi^{*}=\arg\min_{\pi}\mathbb{E}\left[\sum_{t}\eta^{t}\,Loss_t(S_t,A_t)\right]$, where $0 < \eta < 1$ is a discount factor that indicates the impact of future losses on the current operation.
5. The method of claim 4, wherein step 4 comprises the following steps:
Step 4.1: initialize the weights $\theta$ of the double deep Q network, the values of the Q function, and the experience replay buffer;
Step 4.2: give the initial state $S_0$ of the Internet of Vehicles system;
Step 4.3: for each time slot $t=0,1,\dots,t_{max}$, execute steps 4.4-4.15;
Step 4.4: in state $S_t$, compute the expected minimum loss representation (Q value) of taking action $A_t$;
Step 4.5: compute the optimal decision of the deep Q network, i.e. the action that minimizes the Q value in state $S_t$;
Step 4.6: draw a probability value $p$ uniformly at random;
Step 4.7: if $p \le \varepsilon$, choose a random action $A_t$;
Step 4.8: if $p > \varepsilon$, select the action $A_t=\arg\min_{A}Q(S_t,A;\theta_t)$;
Step 4.9: perform action $A_t$ and obtain the state $S_{t+1}$ of the next time slot;
Step 4.10: calculate the loss function $Loss_t$;
Step 4.11: put the experience $(S_t,A_t,Loss_t,S_{t+1})$ into the experience replay buffer;
Step 4.12: randomly draw a batch of experiences from the buffer as training samples and compute the double deep Q function target, in which the online network selects the action for the next state and the target network evaluates it;
Step 4.13: compute the loss $L(\theta_t)$ between the double deep Q target and the current Q value estimate;
Step 4.14: calculate the gradient of $L(\theta_t)$;
Step 4.15: update $\theta_t$ based on the gradient descent method;
Step 4.16: the training model converges, and the trained model is obtained.
6. The method of claim 5, wherein step 5 comprises the following steps:
Step 5.1: given a road section $m$ and a time slot $t$, obtain the set of vehicles in this road section;
Step 5.2: for each time slot $t=1,2,\dots,T$, perform steps 5.3-5.10;
Step 5.3: randomly select a set of $n$ vehicles;
Step 5.4: obtain the global parameters of roadside unit $m$ from the previous time slot;
Step 5.5: on each vehicle, update the local model parameters with the global parameters of roadside unit $m$ from the previous time slot;
Step 5.6: obtain the vehicle's local data;
Step 5.7: train the model with the vehicle's local data to obtain the vehicle's local training parameters and training time;
Step 5.8: upload the local model parameters and the training time to roadside unit $m$;
Step 5.9: roadside unit $m$ receives the vehicle parameters and carries out global aggregation;
Step 5.10: the aggregation produces an improved global model, which is redistributed to the terminal vehicles;
Step 5.11: the training model converges, and the federated learning training is completed.
7. The method of claim 6, wherein: before step 5.2, the method also comprises the step of initializing model parameters of the roadside units and the vehicles.
8. An Internet of Vehicles task offloading scheduling system, characterized in that the system is configured to carry out the method of any one of claims 1 to 7.
9. The system of claim 8, wherein: the system comprises roadside units (RSUs) and terminal vehicles; the whole road is divided into M mutually disjoint road sections according to the coverage areas of the roadside units, a plurality of vehicles are located within the coverage area of one roadside unit, and the vehicles and the roadside units complete the computation and offloading of tasks through wireless links, wherein each vehicle has $T_s$ task priority queues and an energy queue, the task priority queues are used for storing the tasks to be processed, and the energy queue provides the corresponding energy for the vehicle's computing tasks.
10. The system of claim 9, wherein: each vehicle and roadside unit has its own neural network training model, and vehicles in the same road section can complete distributed federated learning together with the RSU.
CN202111535739.9A 2021-12-15 2021-12-15 Internet of vehicles task unloading scheduling method and system Pending CN114268923A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111535739.9A CN114268923A (en) 2021-12-15 2021-12-15 Internet of vehicles task unloading scheduling method and system


Publications (1)

Publication Number Publication Date
CN114268923A 2022-04-01

Family

ID=80827399

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111535739.9A Pending CN114268923A (en) 2021-12-15 2021-12-15 Internet of vehicles task unloading scheduling method and system

Country Status (1)

Country Link
CN (1) CN114268923A (en)


Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114548608A (en) * 2022-04-26 2022-05-27 腾讯科技(深圳)有限公司 Model processing method and device, target traffic equipment and storage medium
CN114980029A (en) * 2022-05-20 2022-08-30 重庆邮电大学 Unloading method based on task relevance in Internet of vehicles
CN114860345A (en) * 2022-05-31 2022-08-05 南京邮电大学 Cache-assisted calculation unloading method in smart home scene
CN114860345B (en) * 2022-05-31 2023-09-08 南京邮电大学 Calculation unloading method based on cache assistance in smart home scene
CN115756873A (en) * 2022-12-15 2023-03-07 北京交通大学 Mobile edge computing unloading method and platform based on federal reinforcement learning
CN115756873B (en) * 2022-12-15 2023-10-13 北京交通大学 Mobile edge computing and unloading method and platform based on federation reinforcement learning
CN116506829A (en) * 2023-04-25 2023-07-28 江南大学 Federal edge learning vehicle selection method based on C-V2X communication
CN116506829B (en) * 2023-04-25 2024-05-10 广东北斗烽火台卫星定位科技有限公司 Federal edge learning vehicle selection method based on C-V2X communication

Similar Documents

Publication Publication Date Title
CN114268923A (en) Internet of vehicles task unloading scheduling method and system
Yu et al. Toward resource-efficient federated learning in mobile edge computing
CN112601197B (en) Resource optimization method in train-connected network based on non-orthogonal multiple access
CN109756378B (en) Intelligent computing unloading method under vehicle-mounted network
CN111918245B (en) Multi-agent-based vehicle speed perception calculation task unloading and resource allocation method
CN112104502B (en) Time-sensitive multitask edge computing and cache cooperation unloading strategy method
Huang et al. Vehicle speed aware computing task offloading and resource allocation based on multi-agent reinforcement learning in a vehicular edge computing network
US20220217792A1 (en) Industrial 5g dynamic multi-priority multi-access method based on deep reinforcement learning
CN113132943B (en) Task unloading scheduling and resource allocation method for vehicle-side cooperation in Internet of vehicles
CN113032904B (en) Model construction method, task allocation method, device, equipment and medium
CN109753751A (en) A kind of MEC Random Task moving method based on machine learning
CN110557732A (en) vehicle edge computing network task unloading load balancing system and balancing method
CN111915142B (en) Unmanned aerial vehicle auxiliary resource allocation method based on deep reinforcement learning
CN113641417B (en) Vehicle security task unloading method based on branch-and-bound method
CN114973673B (en) Task unloading method combining NOMA and content cache in vehicle-road cooperative system
CN114884949B (en) Task unloading method for low-orbit satellite Internet of things based on MADDPG algorithm
CN112153145A (en) Method and device for unloading calculation tasks facing Internet of vehicles in 5G edge environment
CN111352713B (en) Automatic driving reasoning task workflow scheduling method oriented to time delay optimization
CN115629873A (en) System and method for controlling unloading of vehicle-road cloud cooperative tasks and stability of task queue
CN114189869A (en) Unmanned vehicle collaborative path planning and resource allocation method based on edge calculation
CN114363803A (en) Energy-saving multi-task allocation method and system for mobile edge computing network
Vishnoi et al. Deep reinforcement learning based throughput maximization scheme for d2d users underlaying noma-enabled cellular network
CN114629769B (en) Traffic map generation method of self-organizing network
CN115052262A (en) Potential game-based vehicle networking computing unloading and power optimization method
CN114928611A (en) Internet of vehicles energy-saving calculation unloading optimization method based on IEEE802.11p protocol

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination