CN113312105B - Vehicle task part unloading strategy method based on Q learning - Google Patents


Info

Publication number
CN113312105B
CN113312105B (application number CN202110619282.3A)
Authority
CN
China
Prior art keywords
task
tasks
learning
vehicle
unloading
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110619282.3A
Other languages
Chinese (zh)
Other versions
CN113312105A (en)
Inventor
赵海涛
韩哲
王滨
张晖
倪艺洋
朱洪波
张峰
王星
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Hikvision Digital Technology Co Ltd
Nanjing University of Posts and Telecommunications
Original Assignee
Hangzhou Hikvision Digital Technology Co Ltd
Nanjing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Hikvision Digital Technology Co Ltd and Nanjing University of Posts and Telecommunications
Priority to CN202110619282.3A
Publication of CN113312105A
Application granted
Publication of CN113312105B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/445 Program loading or initiating
    • G06F9/44594 Unloading
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5083 Techniques for rebalancing the load in a distributed system
    • G06F9/5088 Techniques for rebalancing the load in a distributed system involving task migration
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1001 Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
    • H04L67/1004 Server selection for load balancing
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/12 Protocols specially adapted for proprietary or special-purpose networking environments, e.g. medical networks, sensor networks, networks in vehicles or remote metering networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computing Systems (AREA)
  • Medical Informatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Physics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to a Q-learning-based partial offloading strategy method for vehicle tasks, applied to a vehicular ad hoc network, comprising the following steps. First, the tasks requested by mobile vehicle terminals are classified and the two extreme task types are removed: delay-critical tasks are computed locally, while tasks requiring a large amount of computing resources are offloaded entirely to the MEC server. Second, for the remaining services whose type is not easily judged, a task classification factor β_n is defined to screen out tasks that are less delay-sensitive and require a moderate amount of computing resources; Q-learning-based partial offloading is then performed on the screened tasks. Finally, once the offloading decisions for all tasks requested by the mobile vehicle user terminals are determined, computing resources are allocated to the users within each MEC server. The strategy method of the invention makes full use of local resources and server resources and reduces the total overhead of the system.

Description

Vehicle task part unloading strategy method based on Q learning
Technical Field
The invention belongs to the field of the Internet of Vehicles, and particularly relates to a Q-learning-based partial offloading strategy method for vehicle tasks in a vehicular ad hoc network.
Background
With the development of the intelligent automobile industry, Intelligent Transport Systems (ITS) have become a research hotspot, and autonomous vehicle control and path planning will play an increasingly broad role in future intelligent transportation. Future autonomous vehicles will be equipped with many sensors that collect data related to the services of mobile vehicle terminals in the vehicle's surroundings, and many intelligent services in intelligent transportation not only rely on the data collected by these sensors but must also be delivered with low delay. Implementing intelligent transportation requires different sensors to collect different kinds of data, such as energy consumption, environmental characteristics, vehicle status, and driver fatigue level; if these data were processed separately, they would not only consume significant computing resources but also harm the timeliness and reliability of the intelligent services provided.
Mobile edge computing offloads complex tasks requested by the mobile vehicle terminal to an MEC server for computation and storage; compared with offloading to a cloud computing center, this shortens the transmission distance and reduces delay. Computation offloading is currently an active research topic among scholars at home and abroad. The offloading process is affected by many factors, such as the performance of the mobile vehicle device, the quality of the backhaul link, and the condition of the radio channel, so the key to computation offloading is finding a suitable offloading decision, which mainly depends on the delay and energy consumption required to compute the requested task locally or to offload it to the MEC server. As the concepts of recursive task decomposition and parallel computing have emerged, researchers have begun to focus on partial computation offloading in mobile edge computing. This technique lets the MEC server and the mobile vehicle terminal compute in parallel to reduce delay and optimize resource use: a task requested by the vehicle terminal is divided into two parts, one computed locally and the other offloaded to the MEC server. Compared with traditional computation offloading, the key problem of partial offloading is deciding how much of the task should be offloaded to the MEC server so that the total overhead is minimized.
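The parallel split described above can be sketched as a simple delay model. This is an illustration, not the patent's own formulation: the function, its parameters, and the linear transmission/compute model are assumptions, with the local and offloaded parts finishing in parallel so the completion time is the maximum of the two paths.

```python
def partial_offload_delay(d, c, rho, f_local, f_mec, rate):
    """Completion time when a fraction rho of a task is offloaded.

    d: task data size (bits); c: CPU cycles needed per bit;
    f_local, f_mec: local and MEC CPU frequencies (cycles/s);
    rate: uplink transmission rate (bits/s).
    The local and offloaded parts execute in parallel, so the
    task finishes when the slower of the two paths finishes.
    """
    t_local = (1 - rho) * d * c / f_local          # compute kept locally
    t_offload = rho * d / rate + rho * d * c / f_mec  # upload + remote compute
    return max(t_local, t_offload)
```

Offloading a larger fraction helps only while the MEC path (transmission plus remote compute) remains faster than local computation, which is why the split ratio is worth optimizing.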
ZL2019101438105 discloses a task offloading method based on mobile edge computing that minimizes vehicle energy consumption; it combines a maximum energy-saving selection algorithm with a short-term path prediction algorithm to satisfy the delay constraint while minimizing offloading energy consumption, but its offloading decision algorithm is too complex to compute efficiently.
ZL2020101714540 discloses a task offloading method and device based on a mobile edge computing scenario: the method collects task information from device terminals, uploads it to an edge server through a Small Cell base station, sets an optimization objective of minimum system overhead, and repeatedly decomposes and solves the resulting optimization problem. Although this approach significantly reduces the total overhead of the system, adjusting the offloading ratio while the vehicle is moving remains inefficient in terms of energy consumption.
Disclosure of Invention
In order to overcome the above shortcomings of the prior art, the invention provides a Q-learning-based partial offloading method for a vehicular ad hoc network that makes full use of local resources and server resources, thereby reducing the total overhead of the system.
In order to achieve the above purpose, the present invention is realized by the following technical scheme:
the invention relates to a partial unloading strategy method based on Q learning, which comprises the following steps:
s1: task classification is carried out on tasks requested by a mobile vehicle terminal, two extreme task types are eliminated, tasks with extremely sensitive time delay are directly locally unloaded, and tasks with large calculation resource amount are all unloaded to an MEC server for calculation;
s2: for the rest business of which the type is not easy to judge, defining the task classification factor as beta n Screening out tasks with less sensitive time delay and general calculation resource quantity;
s3: partial unloading based on Q learning is carried out on the screened tasks, so that an optimal strategy is obtained;
s4: after the offloading decision of the task requested by all the mobile vehicle user terminals is determined, the computing resources are allocated to the users in each MEC server.
Further, step S1 specifically includes: classifying the tasks requested by the mobile vehicle terminal. Two extreme, easily judged task types are considered. One is the delay-critical task, most commonly the safety message task, which is computed directly at the local terminal; the other is the task requiring a large amount of computing resources, i.e. more than 2/3 of the mobile vehicle terminal's own computing capacity, which is generally a map-type message task and is offloaded directly to the MEC server for computation.
Further, step S2 specifically includes: roughly judging the specific service type according to the task's remaining delay tolerance and required amount of computing resources, defining a task classification factor β_n, and selecting the tasks that are less delay-sensitive and require a moderate amount of computing resources for partial offloading.
Here, the task classification factor β_n is defined as
β_n = d_n C_n / T_n^max, n = 1, 2, …, N,
where d_n is the data size of message task n, C_n is the amount of computing resources required per unit of task data, T_n^max is the maximum tolerable delay for completing the task, and N is the total number of tasks.
Further, step S3 specifically includes: performing Q-learning-based partial offloading on the tasks that are less delay-sensitive and require a moderate amount of computing resources. The value of the action taken by a participant in each state is denoted Q(s, a); it reflects the feedback of the environment in the current state s when action a is performed, and thus measures the benefit of the current policy π. The Q values are stored in the form of a Q table, and, without any prior knowledge of the environment, the Q table is learned by iterative updates so that the value function Q(s, a) approaches the target function Q*(s, a), thereby obtaining the optimal strategy π*.
The invention provides a Q-learning-based partial offloading strategy method. It classifies the tasks of the vehicle terminal: first, the two extreme cases, delay-critical tasks and tasks with an extremely large computing demand, are removed; then the remaining tasks are classified according to their delay and computing-resource requirements, and partial offloading is performed on the tasks that are less delay-sensitive and have a moderate computing demand so as to reduce the total overhead of the system; finally, the optimal offloading decision is obtained with Q learning according to a multi-objective optimization model.
The beneficial effects of the invention are as follows: the invention provides a Q-learning-based partial offloading strategy method for the Internet of Vehicles that makes full use of local resources and server resources, thereby reducing the total overhead of the system.
Drawings
FIG. 1 is a schematic flow chart of the method of the present invention.
Detailed Description
Embodiments of the invention are disclosed in the drawings, and for purposes of explanation, numerous practical details are set forth in the following description. However, it should be understood that these practical details are not to be taken as limiting the invention.
The invention discloses a Q-learning-based partial offloading strategy method for vehicle tasks in a vehicular ad hoc network, comprising the following steps. First, the tasks requested by the mobile vehicle terminal are classified and the two extreme task types are removed: delay-critical tasks are computed locally, while tasks requiring a large amount of computing resources are offloaded entirely to the MEC server. Second, for the remaining services whose type is not easily judged, a task classification factor β_n is defined, the tasks that are less delay-sensitive and require a moderate amount of computing resources are screened out, and Q-learning-based partial offloading is performed on the screened tasks. Finally, after the offloading decisions of all tasks requested by the mobile vehicle user terminals are determined, computing resources are allocated to the users within each MEC server.
As shown in fig. 1, the Q-learning-based vehicle task partial offloading method in a vehicular ad hoc network according to the present invention specifically includes the following steps:
s1: for tasks requested by a mobile vehicle terminal, task classification is carried out, firstly, two extreme task types which are easy to judge are considered, one task is a task with extremely sensitive time delay, namely a safe message task, and the task is directly unloaded locally; the other is a task requiring a large amount of computing resources, namely a task requiring the computing resource capacity of the mobile vehicle terminal itself with the computing resource amount of >2/3, which is generally a map-like message task, and the task is directly offloaded to the MEC server for computing.
S2: for the rest business which is not easy to pass through the time delay tolerance and the calculation resource quantity of the task to simply judge the specific type, adopting the definition task classification factor as beta n Some of the time delays are not sensitive, the task with common computing resource quantity is partially unloaded, the complexity of the partial unloading decision algorithm can be reduced by the classification mode, and the task classification factor beta is defined n Can be expressed as
Figure GDA0003844964750000051
Wherein d is n For message task data size, C n The amount of resources required for a unit message task size,
Figure GDA0003844964750000052
to accomplish the maximum tolerable latency for the task, N represents a total of N tasks.
Define τ_1 and τ_2 as thresholds, and let a_nm be the offloading decision variable, a_nm ∈ {0, 1}. When β_n > τ_2, a_nm = 0 and the task is computed locally; when β_n < τ_1, a_nm = 1 and the task is offloaded to the MEC server; when τ_1 < β_n < τ_2, the task is partially offloaded.
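Steps S1 and S2 can be sketched as a single decision function. This is an illustrative reading of the patent: the form β_n = d_n C_n / T_n^max follows the stated variable definitions (the published formula is rendered as an image), and the task fields, thresholds, and function names are assumptions.

```python
def beta(d_n, C_n, T_max):
    # Task classification factor: required compute relative to the
    # tolerable delay (reconstructed form, see lead-in).
    return d_n * C_n / T_max

def offload_decision(task, tau1, tau2, local_capacity):
    """Return 'local', 'mec', or 'partial' following steps S1-S2.

    task: dict with illustrative keys d (data size), C (compute per
    unit data), T_max (delay bound), safety (delay-critical flag).
    """
    # S1: the two extreme, easily judged types.
    if task["safety"]:                                   # safety message: compute locally
        return "local"
    if task["d"] * task["C"] > (2 / 3) * local_capacity:  # compute-heavy, e.g. map data
        return "mec"
    # S2: threshold test on the classification factor.
    b = beta(task["d"], task["C"], task["T_max"])
    if b > tau2:
        return "local"    # a_nm = 0
    if b < tau1:
        return "mec"      # a_nm = 1
    return "partial"      # tau1 < beta_n < tau2: handled by Q learning
```

Only tasks that land in the middle band proceed to the Q-learning stage, which keeps the learning problem small.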
S3: partial unloading based on Q learning is carried out on the screened tasks, so that an optimal strategy is obtained;
in particular, the value Q of the action taken by the participant in each state may be denoted as Q (s, a), which reflects the feedback of the environment to the current state s when action a is performed, thereby measuring the degree of benefit of the current policy pi. The Q values are stored in the form of a Q table, and the value function Q (s, a) is approximated to the target function Q (s, a) by iteratively and updatably learning the Q table without any prior knowledge about the environment, thereby obtaining an optimal strategy pi *
Assuming that an offloading decision is made in one time slot of the network, the action of the mobile vehicle users in this slot is defined as
a = {a_1m, a_2m, …, a_Nm}.
In the method, purely random action selection would unbalance the load on the MEC servers, so actions are selected in a greedy manner with exploration rate ε. The action selection can be expressed as
a = argmax_a Q(s, a) with probability 1 − ε, or a random action with probability ε,
where V is the vehicle user set, V = {V_1, V_2, …, V_n}.
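The greedy selection with exploration rate ε described above is the standard ε-greedy rule. A minimal sketch follows; the dictionary encoding of the Q table and the state/action representations are assumptions, not from the patent.

```python
import random

def epsilon_greedy(Q, state, actions, eps):
    """Pick a random action with probability eps, else the greedy one.

    Q: dict mapping (state, action) -> value, i.e. the Q table;
    actions: candidate offloading actions for this time slot.
    """
    if random.random() < eps:
        return random.choice(actions)                              # explore
    return max(actions, key=lambda a: Q.get((state, a), 0.0))      # exploit
```

With eps = 0 the rule is purely greedy; a small positive eps keeps the learner visiting under-explored MEC servers, which is what counteracts the load imbalance mentioned above.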
The optimal selection strategy is
π*(s) = argmax_a Q(s, a),
with the state-action value satisfying
Q(s, a) = g(s, a) + η max_{a'} Q(s', a'),
where g(s, a) is the reward for performing action a in the current state s.
The Q function is adopted as the evaluation function, and the maximum total expected return generated after learning is
Q*(s, a) = E[ Σ_t η^t g(s_t, a_t) ],
where η is a learning parameter satisfying 0 ≤ η ≤ 1, so that the update formula of the Q value can be expressed as
Q(s, a) ← Q(s, a) + α [ g(s, a) + η max_{a'} Q(s', a') − Q(s, a) ],
where α is defined as the learning rate, which indicates how much of the user's reward is learned in this state. Since the goal of the method is to minimize the total overhead of the system, the reward for one time slot is
g = −(λ_1 T_total + λ_2 E_total),
where T_total is the time required to complete all computing tasks, E_total is the energy required to complete all tasks, and λ_1, λ_2 are weights satisfying λ_1 + λ_2 = 1.
It can be seen that learning requires only the current state: Q(s, a) is updated by continuous iteration until convergence, at which point the corresponding value of the user offloading decision variable a_nm is optimal.
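One step of the update described in S3 can be sketched as follows. The reward is the negative weighted system overhead, so maximizing return minimizes total delay-plus-energy cost; the function signature, default parameter values, and dictionary Q table are illustrative assumptions.

```python
def q_update(Q, s, a, s_next, actions,
             T_total, E_total, lam1=0.5, alpha=0.1, eta=0.9):
    """One Q-learning step for the offloading decision.

    Q: dict mapping (state, action) -> value; actions: candidate actions
    in the next state; T_total, E_total: delay and energy overhead of
    this slot; lam1: delay weight (energy weight is 1 - lam1);
    alpha: learning rate; eta: learning/discount parameter.
    """
    g = -(lam1 * T_total + (1 - lam1) * E_total)        # reward: negative overhead
    best_next = max(Q.get((s_next, ap), 0.0) for ap in actions)
    old = Q.get((s, a), 0.0)
    # Q(s,a) <- Q(s,a) + alpha * [g + eta * max_a' Q(s',a') - Q(s,a)]
    Q[(s, a)] = old + alpha * (g + eta * best_next - old)
    return Q[(s, a)]
```

Repeating this update over many slots (with ε-greedy action selection) drives the Q table toward the fixed point whose greedy policy is the optimal offloading strategy.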
S4: after the unloading decision of the tasks requested by all the mobile vehicle user terminals is determined, computing resource allocation is continuously carried out on the users in each MEC server, the resource allocation problem is a convex optimization problem, and the optimal solution of the task can be obtained through a Lagrange multiplier method according to KKT conditions.
The foregoing description is only illustrative of the invention and is not to be construed as limiting the invention. Various modifications and variations of the present invention will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement, or the like, which is within the spirit and principles of the present invention, should be included in the scope of the claims of the present invention.

Claims (3)

1. A Q-learning-based vehicle task partial offloading strategy method, characterized in that the method comprises the following steps:
s1: task classification is carried out on tasks requested by the mobile vehicle terminal, wherein the task classification firstly considers two extreme task types which are easy to judge, namely a task with extremely sensitive time delay, namely a safe message task, the task is directly unloaded locally, and the task with large calculation resource quantity is needed, namely the task with the calculation resource capacity of the mobile vehicle terminal per se with the needed calculation resource quantity of more than 2/3, and the task is directly unloaded to an MEC server for calculation;
s2: for the rest service definition task classification factor of the type which is not easy to judge is beta n Screening out tasks with less sensitive time delay and general calculation resource quantity, and defining task classification factor beta n Represented as
Figure FDA0003959576760000011
Wherein d is n For message task data size, C n The amount of resources required for a unit message task size,
Figure FDA0003959576760000012
to achieve the maximum tolerable latency for the task, N represents a total of N tasks,
definition τ 1 ,τ 2 For the threshold value, assume a nm To offload decision variables, where a nm E (0, 1), when beta n >τ 2 I.e. a nm =0, which is offloaded to the local for calculation; when beta is n <τ 1 I.e. a nm =1, which is offloaded to MEC server for computation, when τ 1 <β n <τ 2 When in use, the device is partially unloaded;
s3: partial unloading based on Q learning is carried out on the screened tasks, so that an optimal strategy is obtained;
s4: after the offloading decision of the task requested by all the mobile vehicle user terminals is determined, the computing resources are allocated to the users in each MEC server.
2. The Q-learning-based vehicle task partial offloading strategy method according to claim 1, characterized in that step S3 specifically comprises: performing Q-learning-based partial offloading on the tasks that are less delay-sensitive and require a moderate amount of computing resources; the value of the action taken by a participant in each state is denoted Q(s, a), which reflects the feedback of the environment in the current state s when action a is performed and thus measures the benefit of the current policy π; the Q values are stored in the form of a Q table, and without any prior knowledge of the environment the Q table is learned by iterative updates so that the value function Q(s, a) approaches the target function Q*(s, a), thereby obtaining the optimal strategy π*.
3. The Q-learning-based vehicle task partial offloading strategy method according to claim 1, characterized in that the method is applied to a vehicular ad hoc network.
CN202110619282.3A 2021-06-03 2021-06-03 Vehicle task part unloading strategy method based on Q learning Active CN113312105B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110619282.3A CN113312105B (en) 2021-06-03 2021-06-03 Vehicle task part unloading strategy method based on Q learning

Publications (2)

Publication Number Publication Date
CN113312105A CN113312105A (en) 2021-08-27
CN113312105B true CN113312105B (en) 2023-05-02

Family

ID=77377270

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110619282.3A Active CN113312105B (en) 2021-06-03 2021-06-03 Vehicle task part unloading strategy method based on Q learning

Country Status (1)

Country Link
CN (1) CN113312105B (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111372314A (en) * 2020-03-12 2020-07-03 湖南大学 Task unloading method and task unloading device based on mobile edge computing scene

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109710336B (en) * 2019-01-11 2021-01-05 中南林业科技大学 Mobile edge computing task scheduling method based on joint energy and delay optimization
CN111918311B (en) * 2020-08-12 2022-04-12 重庆邮电大学 Vehicle networking task unloading and resource allocation method based on 5G mobile edge computing

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111372314A (en) * 2020-03-12 2020-07-03 湖南大学 Task unloading method and task unloading device based on mobile edge computing scene


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant