CN115550944A - Dynamic service placement method based on edge computing and deep reinforcement learning in the Internet of Vehicles - Google Patents
- Publication number
- CN115550944A CN115550944A CN202210992657.5A CN202210992657A CN115550944A CN 115550944 A CN115550944 A CN 115550944A CN 202210992657 A CN202210992657 A CN 202210992657A CN 115550944 A CN115550944 A CN 115550944A
- Authority
- CN
- China
- Prior art keywords
- service
- network
- edge
- vehicles
- edge server
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W16/00—Network planning, e.g. coverage or traffic planning tools; Network deployment, e.g. resource partitioning or cells structures
- H04W16/18—Network planning tools
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5061—Partitioning or combining of resources
- G06F9/5072—Grid computing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16Y—INFORMATION AND COMMUNICATION TECHNOLOGY SPECIALLY ADAPTED FOR THE INTERNET OF THINGS [IoT]
- G16Y20/00—Information sensed or collected by the things
- G16Y20/10—Information sensed or collected by the things relating to the environment, e.g. temperature; relating to location
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16Y—INFORMATION AND COMMUNICATION TECHNOLOGY SPECIALLY ADAPTED FOR THE INTERNET OF THINGS [IoT]
- G16Y20/00—Information sensed or collected by the things
- G16Y20/30—Information sensed or collected by the things relating to resources, e.g. consumed power
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16Y—INFORMATION AND COMMUNICATION TECHNOLOGY SPECIALLY ADAPTED FOR THE INTERNET OF THINGS [IoT]
- G16Y30/00—IoT infrastructure
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W16/00—Network planning, e.g. coverage or traffic planning tools; Network deployment, e.g. resource partitioning or cells structures
- H04W16/22—Traffic simulation tools or models
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2209/00—Indexing scheme relating to G06F9/00
- G06F2209/50—Indexing scheme relating to G06F9/50
- G06F2209/502—Proximity
Abstract
The invention discloses a dynamic service placement method based on edge computing and deep reinforcement learning in the Internet of Vehicles, which comprises the following steps: 1) establishing a network and service request model, and acquiring network and service request related information; 2) establishing a network and service request calculation model; 3) constructing a state space, an action space, a policy function, and a reward function; 4) constructing an actor network and a critic network, and training them; 5) the actor network generates a service placement policy and inputs it into the critic network; 6) the critic network evaluates the quality of the service placement policy: if the evaluation fails, the actor network parameters are updated and the method returns to step 5); if the evaluation passes, the service placement policy is output. The invention minimizes the maximum edge resource usage and the service delay while accounting for vehicle mobility, changing demand, and the dynamics of different types of service requests.
Description
Technical Field
The invention relates to the field of the Internet of Vehicles, and in particular to a dynamic service placement method based on edge computing and deep reinforcement learning in the Internet of Vehicles.
Background
The Internet of Vehicles is an interactive network formed from information such as vehicle position, speed, and route. The rapid development of communication technology has brought many new possibilities to this field, and the emergence of fifth-generation mobile communication technology makes the Internet of Vehicles more intelligent and further expands its service coverage. However, as delay-sensitive applications such as intelligent voice assistance and automatic driving become the most popular applications in the field, the traditional cloud computing paradigm is increasingly unable to meet users' needs. The European Telecommunications Standards Institute (ETSI) therefore introduced mobile edge computing into the Internet of Vehicles: it extends the storage and computing resources of cloud computing closer to users, meeting their requirements for high reliability, low delay, and security in intelligent applications.
In the Internet of Vehicles, vehicles communicate with infrastructure to obtain services such as media downloads, cooperation messages, and decentralized environment notification messages, which coordinate applications such as remote driving, parking space discovery, and navigation. In the edge computing paradigm, multiple services can be deployed on an edge server to leverage its computing and storage resources. Service placement is one of the research hotspots in this field: it maps services to edge servers in the Internet of Vehicles so as to meet the demand for requested services while using edge resources efficiently. From the user's perspective, it is important to minimize the delay perceived by vehicles. From the service provider's perspective, it is desirable to maximize edge resource usage while keeping the resource load as balanced as possible across servers.
Disclosure of Invention
The invention aims to provide a dynamic service placement method based on edge computing and deep reinforcement learning in the Internet of Vehicles, which comprises the following steps:
1) Establishing a network and service request model, and acquiring network and service request related information;
the network and service request related information comprises edge server information, vehicle information and service information;
the edge server information comprises an edge server set E, an edge server E and residual resource capacity C of the edge server E e ;
The vehicle information includes a set of vehicles V.
The service information comprises a service set S and a vehicle number lambda of the request service S s One service instance at a time (e.g., media file download in an internet of vehicles environment, collaboration awareness messages, environment notification services, etc.) or the number of vehicles epsilon that can provide parallel connections, the specified time t and vehicle location loc in a service request message, the amount of resources R consumed by an edge server deployment service s s Time delay requirement threshold D s 。
2) Establishing a network and service request calculation model;
the network and service request calculation model comprises a total service delay calculation model and an edge resource utilization rate calculation model;
the total service delay calculation model is as follows:
in the formula (I), the compound is shown in the specification,the total service delay;propagation delay and queuing delay; dist (v, s) is the Euclidean distance between the vehicle v and the edge server deployed by the service s; c is the propagation speed of the signal through the communication medium;
number of vehicles lambda when requesting service s s When the number is less than or equal to epsilon, the queuing delayNumber of vehicles lambda when requesting service s s When is greater than epsilon, queuing delaySatisfies the following formula:
wherein the number is different by λ' s =λ s -ε;
where dist (v, s) is the Euclidean distance between the vehicle v and the edge server deployed by the service s; and c is the propagation speed of the signal through the communication medium.
The edge resource utilization calculation model is as follows: the edge resource usage U_e^s is the ratio between the resources consumed by the service instance and the available resources of the edge server:

U_e^s = R_s / C_e

where C_e is the remaining resource capacity of edge server e; U_e^s is the edge resource usage; and R_s is the amount of resources consumed by the edge server to deploy service s.
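The two calculation models above can be sketched in Python. This is an illustrative sketch under stated assumptions, not the patented implementation: the function names are introduced here, and since the explicit M/D/1 expression does not survive in this text, the service rate `mu` and the standard M/D/1 mean-waiting-time formula are assumptions.

```python
import math

def propagation_delay(vehicle_pos, server_pos, c=3e8):
    """Propagation delay: Euclidean distance dist(v, s) over signal speed c."""
    dist = math.dist(vehicle_pos, server_pos)
    return dist / c

def queuing_delay(lam_s, eps, mu=100.0):
    """Queuing delay: zero while demand fits one instance (lam_s <= eps);
    otherwise the standard M/D/1 mean waiting time at the excess demand
    lam' = lam_s - eps. The service rate mu is an assumed parameter."""
    if lam_s <= eps:
        return 0.0
    lam_excess = lam_s - eps
    assert lam_excess < mu, "queue must be stable"
    rho = lam_excess / mu
    return rho / (2 * mu * (1 - rho))  # M/D/1 mean wait

def total_service_delay(vehicle_pos, server_pos, lam_s, eps, c=3e8, mu=100.0):
    """Total service delay: propagation delay plus queuing delay."""
    return propagation_delay(vehicle_pos, server_pos, c) + queuing_delay(lam_s, eps, mu)

def edge_resource_usage(r_s, c_e):
    """Edge resource usage: resources consumed by the service instance
    over the remaining capacity of the edge server."""
    return r_s / c_e
```

Usage: `edge_resource_usage(2, 8)` gives a usage of 0.25, and a request load below ε contributes no queuing delay.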
3) Constructing a state space, an action space, a strategy function and a reward function;
the state space is characterized by a set of state spaces ω, namely:
ω={[v 1 ,loc 1 ,s],[v 2 ,loc 2 ,s],...,[v n ,loc n ,s]} t (6)
wherein S belongs to S; v. of 1 ,v 2 ,...,v n A set of vehicles; loc 1 ,loc 2 ,...,loc n At time t, a set of vehicle positions for service s is requested.
The action space is used for describing actions taken when the service is placed on the edge server;
wherein the action a taken at a given time t is as follows:
in the formula, pi is a strategy function required by generating action on an observation set of omega in a time unit t;the representation service s is deployed in an edge server e;meaning that service s is not deployed at edge server e.
The policy function π is the function executed by the actor network; it maps the state space to the action space, i.e., π: ω → a.
The objective of the policy function π is to minimize the maximum edge resource usage and the service delay, using the parameter β to control the relative importance of resource usage and service delay. The policy function minimizes

β · max_{s,e} U_e^s + (1 − β) · max_{v,s} D_v^s

where β is a weight coefficient.
The constraints of the policy function π include the mapping constraint (each service is placed on exactly one edge server), the delay constraint (the total service delay does not exceed the threshold D_s), and the resource constraint (the resources consumed on each edge server do not exceed its remaining capacity C_e).
The reward function returns, at each time unit t, an immediate reward r_t that depends on the service delay D_t^s observed at time t; γ is the reward (discount) factor.
4) Constructing an actor network and a critic network, and training them; the critic network is trained by minimizing the loss function

L(θ) = (y_t − Q(ω, a; θ))²

where θ is the critic network parameter; y_t is the target value used to evaluate policy quality; Q(ω, a; θ) is the estimated quality of the service placement policy; a further quantity used in training is the number of available resource units in the edge server.
5) The actor network generates a service placement strategy and inputs the strategy into the critic network;
6) The critic network evaluates the quality of the service placement policy; if the evaluation fails, the actor network parameters are updated and the method returns to step 5); if the evaluation passes, the service placement policy is output.
The critic network evaluates the policy quality of the service placement policy as follows: it judges whether the critic network loss function L(θ) has converged; if it has converged, the evaluation passes; otherwise, the evaluation fails.
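Steps 5) and 6) form a generate-and-evaluate loop that terminates when the critic loss converges. A minimal convergence test of the kind step 6) needs is sketched below; the window size and tolerance are assumed values, not taken from the patent.

```python
def loss_converged(losses, window=10, tol=1e-4):
    """Treat the critic loss as converged once the spread of the most
    recent `window` values drops below `tol`."""
    if len(losses) < window:
        return False
    recent = losses[-window:]
    return max(recent) - min(recent) < tol
```

A flat recent loss history passes the check; a still-decreasing one keeps the loop updating the actor network parameters.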
It is worth noting that the invention proposes a three-tier Internet of Vehicles architecture based on edge computing and considers the dynamic service placement problem, with the optimization goal of minimizing the maximum edge resource usage (from the service provider's perspective) and the service delay (from the user's perspective).
In addition, the invention provides a service placement framework based on deep reinforcement learning, consisting of a policy function (actor network) and a value function (critic network). The actor network produces a service placement policy, while the critic network evaluates the decisions made by the actor network based on the delays observed by the vehicles.
The technical effect of the invention is clear: it provides a dynamic service placement framework based on deep reinforcement learning in the Internet of Vehicles, which minimizes the maximum edge resource usage and the service delay while accounting for vehicle mobility, changing demand, and the dynamics of different types of service requests.
Drawings
FIG. 1 is a three-tier Internet of Vehicles architecture based on edge computing;
FIG. 2 is an agent structure;
FIG. 3 is a flow chart of the present invention.
Detailed Description
The present invention is further illustrated by the following examples, but the scope of the claimed subject matter should not be construed as limited to them. Various substitutions and modifications can be made, based on common technical knowledge and conventional means in the field, without departing from the technical idea and scope of the invention.
Example 1:
Referring to FIGS. 1 to 3, a dynamic service placement method based on edge computing and deep reinforcement learning in the Internet of Vehicles comprises the following steps:
1) Establishing a network and service request model, and acquiring network and service request related information;
the network and service request related information comprises edge server information, vehicle information and service information;
the edge server information comprises an edge server set E, an edge server E and residual resource capacity C of the edge server E e ;
The vehicle information includes a set of vehicles V.
The service information comprises a service set S and a vehicle number lambda of the request service S s Number of vehicles epsilon that can handle one service instance at a time (e.g., media file download, collaboration awareness messaging and context notification services in an internet of vehicles environment, etc.) or can provide parallel connections, specified time t and vehicle location loc in a service request message, amount of resources R consumed by edge server deployment service s s Time delay requirement threshold D s 。
2) Establishing a network and service request calculation model;
the network and service request calculation model comprises a total service delay calculation model and an edge resource utilization rate calculation model;
the total service delay calculation model is as follows:
in the formula (I), the compound is shown in the specification,the total service delay;propagation delay and queuing delay; dist (v, s) is the Euclidean distance between the vehicle v and the edge server deployed by the service s; c is the propagation speed of the signal through the communication medium;
number of vehicles lambda when requesting service s s When the number is less than or equal to epsilon, the queuing delayNumber of vehicles lambda when requesting service s s When more than epsilon, queuing delaySatisfies the following formula:
wherein, number quantity difference λ' s =λ s -ε;
where dist (v, s) is the Euclidean distance between the vehicle v and the edge server deployed by the service s; and c is the propagation speed of the signal through the communication medium.
The edge resource utilization calculation model is as follows: the edge resource usage U_e^s is the ratio between the resources consumed by the service instance and the available resources of the edge server:

U_e^s = R_s / C_e

where C_e is the remaining resource capacity of edge server e; U_e^s is the edge resource usage; and R_s is the amount of resources consumed by the edge server to deploy service s.
3) Constructing a state space, an action space, a strategy function and a reward function;
the state space is characterized by a set of state spaces ω, namely:
ω={[v 1 ,loc 1 ,s],[v 2 ,loc 2 ,s],...,[v n ,loc n ,s]} t (6)
wherein S belongs to S; v. of 1 ,v 2 ,...,v n A set of vehicles; loc 1 ,loc 2 ,...,loc n At t, a set of vehicle positions serving s is requested.
The action space is used for describing actions taken when the service is placed on the edge server;
The action a_t taken at a given time t is:

a_t = π(ω_t)

where π is the policy function that generates an action from the observation set ω in time unit t; the binary variable x_{s,e} = 1 indicates that service s is deployed on edge server e, and x_{s,e} = 0 indicates that it is not.
The strategy function pi is a function executed by an actor network and is used for mapping a state space to an action space, namely pi, omega → a;
the objective of the policy function pi is to minimize the maximum edge resource usage and service latency and to control the relative importance of resource usage and service latency by using the parameter beta. The policy function pi is expressed as
Wherein, beta is a weight coefficient;
the principle of the policy function pi is: and (4) iterating the service set and the edge server set through subscripts s and e, searching the maximum edge resource use and service delay, and minimizing the maximum edge resource use and service delay to obtain a corresponding strategy function pi.
The constraints of the policy function pi include mapping constraintsTime delay constraintResource constraints
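The iteration principle described above — sweep the subscripts s and e, find the largest edge resource usage and the largest service delay, and combine them with the weight β — can be sketched as follows. The mapping of symbols to variables (`x`, `R`, `C`, `D`) is assumed here for illustration.

```python
def placement_objective(x, R, C, D, beta=0.5):
    """Objective the policy minimizes: beta * (max edge resource usage)
    + (1 - beta) * (max service delay).
    x[s][e] in {0, 1}: service s placed on server e.
    R[s]: resources consumed by service s; C[e]: remaining capacity of e.
    D[v][s]: total service delay of vehicle v for service s."""
    n_s, n_e = len(x), len(x[0])
    # sweep subscripts s and e to find the maximum edge resource usage
    max_usage = max(
        (R[s] / C[e] for s in range(n_s) for e in range(n_e) if x[s][e] == 1),
        default=0.0,
    )
    # maximum service delay over all vehicles and services
    max_delay = max((max(row) for row in D), default=0.0)
    return beta * max_usage + (1 - beta) * max_delay
```

A search over candidate placements would keep the feasible `x` with the smallest objective value; β trades the provider's usage goal against the user's delay goal.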
The reward function returns, at each time unit t, an immediate reward r_t that depends on the service delay D_t^s observed at time t; γ is the reward (discount) factor.
4) Constructing an actor network and a critic network, and training them; the critic network is trained by minimizing the loss function

L(θ) = (y_t − Q(ω, a; θ))²

where θ is the critic network parameter; y_t is the target value used to evaluate policy quality; Q(ω, a; θ) is the estimated quality of the service placement policy; a further quantity used in training is the number of available resource units in the edge server.
5) The actor network generates a service placement strategy and inputs the strategy into the critic network;
6) The critic network evaluates the quality of the service placement policy; if the evaluation fails, the actor network parameters are updated and the method returns to step 5); if the evaluation passes, the service placement policy is output.
The critic network evaluates the policy quality of the service placement policy as follows: it judges whether the critic network loss function L(θ) has converged; if it has converged, the evaluation passes; otherwise, the evaluation fails.
Example 2:
A dynamic service placement method based on edge computing and deep reinforcement learning in the Internet of Vehicles comprises the following steps:
1) Establish a network and service request model, and acquire edge server information, vehicle information, and service information.
The server information, vehicle information, and service information comprise the edge server set E, an edge server e, the remaining resource capacity C_e of edge server e, the vehicle set V, the service set S, the number of vehicles λ_s requesting service s, the number of vehicles ε that one service instance (e.g., media file download, cooperative awareness messages, or environment notification services in an Internet of Vehicles environment) can handle at a time or to which it can provide parallel connections, the specified time t and vehicle location loc in a service request message, the amount of resources R_s consumed by an edge server to deploy service s, and the delay requirement threshold D_s.
2) Establish the calculation model.
2.1) Total service delay modeling. The entire edge Internet of Vehicles system is modeled as an M/D/1 queue. When service s is requested from edge server e, the total service delay of the vehicle, D_v^s, is the total time from when the vehicle sends the service request until it receives the corresponding response from the edge server. The total service delay consists of the propagation delay D_p^s and the queuing delay D_q^s:

D_v^s = D_p^s + D_q^s

If λ_s ≤ ε, the queuing delay D_q^s is 0. If λ_s > ε, a queue is created, and the average queuing delay for service s on the edge server is the M/D/1 average waiting time evaluated at the excess demand λ'_s = λ_s − ε.
The average propagation delay is the ratio of the distance to the propagation speed over the medium:

D_p^s = dist(v, s) / c

where dist(v, s) is the Euclidean distance between vehicle v and the edge server on which service s is deployed, and c is the propagation speed of the signal through the communication medium.
2.2) Edge resource usage modeling. The edge resource usage U_e^s is the ratio between the resources consumed by the service instance and the available resources of the edge server:

U_e^s = R_s / C_e
3) Design the state space. At a given time t, the state space set describes the network environment. The agent observes the environment and forms the state space set ω from the service request model:

ω = {[v_1, loc_1, s], [v_2, loc_2, s], ..., [v_n, loc_n, s]}_t    (6)

where s ∈ S; v_1, v_2, ..., v_n is the set of vehicles; and loc_1, loc_2, ..., loc_n is the set of positions, at time t, of the vehicles requesting service s.
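Equation (6) can be assembled directly: each entry of ω pairs a vehicle, its location at time t, and the requested service. The list-of-lists layout below is an assumed encoding for illustration.

```python
def build_state(vehicles, locations, service):
    """omega = {[v1, loc1, s], ..., [vn, locn, s]}_t per equation (6)."""
    assert len(vehicles) == len(locations), "one location per vehicle"
    return [[v, loc, service] for v, loc in zip(vehicles, locations)]
```

For two vehicles requesting service "s1", the observation set contains one [vehicle, location, service] triple per vehicle.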
4) Design the action space. The action space describes the actions taken by the policy module when placing a service on an edge server. The action taken at a given time t is

a_t = π(ω_t)

where π is the policy function required to generate an action from the observation set ω in time unit t. The binary variables x_{s,e} form a matrix indicating the location of each service s on the edge servers: x_{s,e} = 1 indicates that service s is deployed on edge server e; conversely, x_{s,e} = 0 indicates that it is not.
5) Design the policy function. The policy function π is the function executed by the actor network to map the state space to the action space, π: ω → a. The objective of π is to minimize the maximum edge resource usage and the service delay, using the parameter β to control their relative importance:

minimize  β · max_{s,e} U_e^s + (1 − β) · max_{v,s} D_v^s

The policy function is also subject to the mapping constraint (each service is placed on exactly one edge server), the delay constraint (the total service delay does not exceed the threshold D_s), and the resource constraint (the resources consumed on each edge server do not exceed its remaining capacity C_e).
6) Design the reward function. At each time unit t, in response to the action taken by the agent's actor network, the system receives from the environment an immediate reward r_t, which depends on the service delay observed at time t and is accumulated with the reward factor γ.
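The exact reward expression is not legible in this text; a reward shape consistent with the description — an immediate reward that falls as the observed service delay and peak resource usage rise, accumulated with the reward factor γ — is sketched below purely as an assumption.

```python
def immediate_reward(service_delay, max_usage, beta=0.5):
    """Assumed reward shape: the agent is rewarded for low delay and low
    peak resource usage, mirroring the minimization objective."""
    return -(beta * max_usage + (1 - beta) * service_delay)

def discounted_return(rewards, gamma=0.9):
    """Discounted sum of immediate rewards with reward factor gamma."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g
```

Under this shape, lowering either the observed delay or the peak edge usage strictly increases the reward the actor network receives.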
7) Construct the critic network and evaluate the quality Q(ω, a) of the decisions made by the actor network. States, actions, and rewards are input to train the critic network, updating its parameter θ to minimize the loss function

L(θ) = (y_t − Q(ω_t, a_t; θ))²

where y_t is the target value. A replay memory M stores the experience used to train the critic network: after a random period, the critic network samples experience from the replay memory and optimizes its network parameters for better performance.
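The replay-memory training described above can be sketched without a deep learning framework: store (state, action, reward, next state) transitions in the memory M, sample a random minibatch, and step each Q estimate toward its target y_t. The tabular Q dictionary below is an assumed stand-in for the patent's neural critic, and the fixed-size action space is illustrative.

```python
import random
from collections import deque, defaultdict

class ReplayCritic:
    """Tabular stand-in for the critic: minimizes (y_t - Q(w, a; theta))^2
    over transitions sampled from a replay memory M."""

    def __init__(self, gamma=0.9, lr=0.1, capacity=1000, n_actions=2):
        self.memory = deque(maxlen=capacity)   # replay memory M
        self.q = defaultdict(float)            # stand-in for Q(w, a; theta)
        self.gamma, self.lr, self.n_actions = gamma, lr, n_actions

    def store(self, state, action, reward, next_state):
        self.memory.append((state, action, reward, next_state))

    def train(self, batch_size=4):
        """Sample past experience and take one step toward each target y_t."""
        batch = random.sample(self.memory, min(batch_size, len(self.memory)))
        total_loss = 0.0
        for state, action, reward, next_state in batch:
            # target y_t: immediate reward plus discounted best next value
            best_next = max(self.q[(next_state, a)] for a in range(self.n_actions))
            y = reward + self.gamma * best_next
            td = y - self.q[(state, action)]
            self.q[(state, action)] += self.lr * td   # gradient step on (y - Q)^2
            total_loss += td ** 2
        return total_loss / len(batch)
```

Repeated calls to `train()` drive the stored Q values toward their targets, and the returned average loss is the quantity whose convergence step 6) of the claims checks.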
8) Once the training of the actor network and the critic network in the above steps converges, the actor network can find the optimal service placement policy while accounting for vehicle mobility and the dynamics of different types of service requests, and the critic network can evaluate the policy quality of the actor network through its value function.
Example 3:
a dynamic service placement method based on edge calculation and deep reinforcement learning in the Internet of vehicles comprises the following steps:
1) And establishing the network and service request model and acquiring the related information of the network and service request.
2) Establishing a network and service request calculation model;
3) Constructing a state space, an action space, a strategy function and a reward function;
4) Constructing an actor network and a critic network, and training the actor network and the critic network;
5) The actor network generates a service placement policy and inputs the policy into the critic network;
6) The critic network evaluates the policy quality of the service placement policy; if the evaluation fails, the actor network parameters are updated and the method returns to step 5); if the evaluation passes, the service placement policy is output.
Example 4:
According to the dynamic service placement method based on edge computing and deep reinforcement learning in the Internet of Vehicles of embodiment 3, the network and service request related information comprises edge server information, vehicle information, and service information.
The edge server information comprises the edge server set E, an edge server e, and the remaining resource capacity C_e of edge server e.
The vehicle information comprises the vehicle set V.
The service information comprises the service set S; the number of vehicles λ_s requesting service s; the number of vehicles ε that one service instance can handle at a time or to which it can provide parallel connections; the specified time t and vehicle location loc in a service request message; the amount of resources R_s consumed by the edge server to deploy service s; and the delay requirement threshold D_s. The service instances include media file downloads, cooperative awareness messages, and environment notification services in an Internet of Vehicles environment.
Example 5:
According to the dynamic service placement method based on edge computing and deep reinforcement learning in the Internet of Vehicles of embodiment 3, the network and service request calculation model comprises a total service delay calculation model and an edge resource utilization calculation model.
The total service delay calculation model is:

D_v^s = D_p^s + D_q^s,  with propagation delay  D_p^s = dist(v, s) / c

where D_v^s is the total service delay; D_p^s and D_q^s are the propagation delay and the queuing delay; dist(v, s) is the Euclidean distance between vehicle v and the edge server on which service s is deployed; and c is the propagation speed of the signal through the communication medium.
When the number of vehicles λ_s requesting service s satisfies λ_s ≤ ε, the queuing delay D_q^s = 0. When λ_s > ε, the queuing delay D_q^s is given by the M/D/1 average waiting time evaluated at the excess demand λ'_s = λ_s − ε.
The edge resource utilization calculation model is: the edge resource usage U_e^s is the ratio between the resources consumed by the service instance and the available resources of the edge server,

U_e^s = R_s / C_e

where C_e is the remaining resource capacity of edge server e and R_s is the amount of resources consumed by the edge server to deploy service s.
Example 6:
According to the dynamic service placement method based on edge computing and deep reinforcement learning in the Internet of Vehicles of embodiment 3, the state space is characterized by the state space set ω, namely:

ω = {[v_1, loc_1, s], [v_2, loc_2, s], ..., [v_n, loc_n, s]}_t    (6)

where s ∈ S; v_1, v_2, ..., v_n is the set of vehicles; and loc_1, loc_2, ..., loc_n is the set of positions, at time t, of the vehicles requesting service s.
Example 7:
According to the dynamic service placement method based on edge computing and deep reinforcement learning in the Internet of Vehicles of embodiment 3, the action space describes the actions taken when a service is placed on an edge server.
The action a_t taken at a given time t is

a_t = π(ω_t)

where π is the policy function required to generate an action from the observation set ω in time unit t; the binary variable x_{s,e} = 1 indicates that service s is deployed on edge server e, and x_{s,e} = 0 indicates that it is not.
Example 8:
A dynamic service placement method based on edge computing and deep reinforcement learning in the Internet of Vehicles, on the basis of Example 3, wherein the policy function π is the function executed by the actor network and maps the state space to the action space, i.e. π: ω → a;
the objective of the policy function π is to minimize both the maximum edge resource usage and the service delay, with the parameter β controlling the relative importance of resource usage and service delay;
the policy function pi is expressed as follows:
In the formula, β is the weight coefficient.
The constraints of the policy function π include a mapping constraint, a time delay constraint and a resource constraint.
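One way to read the objective and constraints is as a β-weighted scalar cost minimized over feasible placements; a hedged sketch (the weighted-sum form and all names are assumptions based only on the description of β and of the three constraint families):

```python
def placement_cost(usage, delay, beta=0.5):
    """Scalar objective: beta trades off edge resource usage against
    service delay (weighted-sum form assumed from the text)."""
    return beta * usage + (1.0 - beta) * delay

def feasible(r_s, cap_e, delay, d_threshold, mapped):
    """The three constraint families named in the text: mapping
    (service placed on some server), delay within threshold D_s,
    and resources R_s within remaining capacity C_e."""
    return mapped and delay <= d_threshold and r_s <= cap_e
```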
Example 9:
A dynamic service placement method based on edge computing and deep reinforcement learning in the Internet of Vehicles, on the basis of Example 3, wherein the reward function is as follows:
In the formula, the first quantity is the immediate reward, γ is the reward discount factor, and the last quantity is the service delay at time t;
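A concrete choice consistent with the description (reward improves as the service delay at time t falls, accumulated with discount factor γ) is the negated delay; this exact form is an assumption, since the patent's reward formula appears only as an image:

```python
def immediate_reward(delay_t):
    """Immediate reward at time t: lower service delay gives a higher
    reward (negated-delay form is an illustrative assumption)."""
    return -delay_t

def discounted_return(delays, gamma=0.9):
    """Return accumulated with reward discount factor gamma."""
    return sum((gamma ** k) * immediate_reward(d) for k, d in enumerate(delays))
```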
Example 10:
A dynamic service placement method based on edge computing and deep reinforcement learning in the Internet of Vehicles, on the basis of Example 3, wherein the loss function in the critic network training process is as follows:
In the formula, θ is the critic network parameter; the target value evaluates the quality of the policy; Q_i(ω, a; θ) is the policy quality of the service placement policy; the remaining quantity is the number of available resource units in the edge server;
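The critic loss described here matches the usual mean-squared error between the target values and the estimates Q_i(ω, a; θ); a sketch of that standard form (the patent's exact expression appears only as an image, so the MSE shape is an assumption):

```python
def critic_loss(q_values, targets):
    """L(theta): mean squared difference between the target values y_i
    and the critic's estimates Q_i(omega, a; theta)."""
    return sum((y - q) ** 2 for q, y in zip(q_values, targets)) / len(q_values)
```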
Example 11:
A dynamic service placement method based on edge computing and deep reinforcement learning in the Internet of Vehicles, on the basis of Example 3, wherein the method by which the critic network evaluates the policy quality of a service placement policy is: judging whether the critic network loss function has converged; if it has converged, the evaluation passes, and if it has not converged, the evaluation does not pass.
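The convergence-based pass/fail evaluation can be sketched as a simple test on the recent loss history; the tolerance and window size are illustrative assumptions:

```python
def loss_converged(loss_history, tol=1e-4, window=5):
    """Evaluation passes when the critic loss has converged: its spread
    over the last `window` recorded values is below tol."""
    if len(loss_history) < window:
        return False
    recent = loss_history[-window:]
    return max(recent) - min(recent) < tol
```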
Claims (9)
1. A dynamic service placement method based on edge computing and deep reinforcement learning in the Internet of Vehicles, characterized by comprising the following steps:
1) establishing a network and service request model and acquiring information related to the network and service requests;
2) establishing a network and service request calculation model;
3) constructing a state space, an action space, a policy function and a reward function;
4) constructing an actor network and a critic network, and training the actor network and the critic network;
5) generating, by the actor network, a service placement policy and inputting the policy into the critic network;
6) evaluating, by the critic network, the policy quality of the service placement policy; if the evaluation does not pass, updating the actor network parameters and returning to step 5); if the evaluation passes, outputting the service placement policy.
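Steps 5) and 6) form a generate-evaluate-update loop; a toy sketch with stand-in callables for the actor and critic networks (all names and the scalar "policy" are hypothetical):

```python
def refine_policy(policy, critic_passes, update_actor, max_iters=100):
    """Loop of steps 5)-6): the actor's policy is evaluated by the
    critic; on failure the actor parameters are updated and a new
    policy is generated, until the evaluation passes."""
    for _ in range(max_iters):
        if critic_passes(policy):      # step 6): evaluation passed
            return policy, True        # output the placement policy
        policy = update_actor(policy)  # update actor, back to step 5)
    return policy, False

# Toy usage: 'policy' is a scalar cost driven below a pass threshold.
result = refine_policy(1.0, lambda p: p < 0.1, lambda p: p * 0.5)
```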
2. The dynamic service placement method based on edge computing and deep reinforcement learning in the Internet of Vehicles according to claim 1, wherein the information related to the network and service requests comprises edge server information, vehicle information and service information;
the edge server information comprises an edge server set E, an edge server e and the remaining resource capacity C_e of edge server e;
the vehicle information comprises a set of vehicles V;
the service information comprises a service set S, the number λ_s of vehicles requesting service s, the number ε of vehicles that one service instance can handle at a time or provide parallel connections for, the specified time t and vehicle location loc in the service request message, the amount of resources R_s consumed by the edge server to deploy service s, and a delay requirement threshold D_s; the service instances include media file download, cooperative awareness messaging and environment notification services in an Internet of Vehicles environment.
3. The dynamic service placement method based on edge computing and deep reinforcement learning in the Internet of vehicles according to claim 1, wherein the network and service request computing model comprises a total service delay computing model and an edge resource utilization computing model;
the total service delay calculation model is as follows:
In the formula, the total service delay is the sum of the propagation delay and the queuing delay; dist(v, s) is the Euclidean distance between vehicle v and the edge server on which service s is deployed; c is the propagation speed of the signal through the communication medium, so the propagation delay is dist(v, s)/c.
When the number of vehicles λ_s requesting service s is less than or equal to ε, the queuing delay is zero; when λ_s is greater than ε, the queuing delay satisfies the following formula:
where the number of excess vehicles is λ'_s = λ_s − ε.
The edge resource utilization calculation model is as follows:
The edge resource usage rate is the ratio between the resources consumed by the service instance and the available resources of the edge server, as follows:
4. The dynamic service placement method based on edge computation and deep reinforcement learning in the internet of vehicles according to claim 1, wherein the state space is characterized by a state space set ω, namely:
ω={[v 1 ,loc 1 ,s],[v 2 ,loc 2 ,s],...,[v n ,loc n ,s]} t (6)
wherein s ∈ S; v_1, v_2, ..., v_n is the set of vehicles; loc_1, loc_2, ..., loc_n is the set of positions, at time t, of the vehicles requesting service s.
5. The dynamic service placement method based on edge computing and deep reinforcement learning in the internet of vehicles according to claim 1, wherein the action space is used for describing actions taken when placing services on an edge server;
wherein the action a taken at a given time t is as follows:
6. The dynamic service placement method based on edge computing and deep reinforcement learning in the Internet of Vehicles according to claim 1, wherein the policy function π is the function executed by the actor network and maps the state space to the action space, i.e. π: ω → a;
the objective of the policy function π is to minimize both the maximum edge resource usage and the service delay, with the parameter β controlling the relative importance of resource usage and service delay;
the policy function pi is expressed as follows:
In the formula, β is the weight coefficient.
7. The dynamic service placement method based on edge computing and deep reinforcement learning in the internet of vehicles according to claim 1, wherein the reward function is as follows:
8. The dynamic service placement method based on edge computing and deep reinforcement learning in the Internet of Vehicles according to claim 1, wherein the loss function in the critic network training process is as follows:
9. The dynamic service placement method based on edge computing and deep reinforcement learning in the Internet of Vehicles according to claim 8, wherein the method by which the critic network evaluates the policy quality of a service placement policy is: judging whether the critic network loss function has converged; if it has converged, the evaluation passes, and if it has not converged, the evaluation does not pass.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210992657.5A CN115550944B (en) | 2022-08-18 | 2022-08-18 | Dynamic service placement method based on edge calculation and deep reinforcement learning in Internet of vehicles |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115550944A true CN115550944A (en) | 2022-12-30 |
CN115550944B CN115550944B (en) | 2024-02-27 |
Family
ID=84725291
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210992657.5A Active CN115550944B (en) | 2022-08-18 | 2022-08-18 | Dynamic service placement method based on edge calculation and deep reinforcement learning in Internet of vehicles |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115550944B (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110213796A * | 2019-05-28 | 2019-09-06 | Dalian University of Technology | An intelligent resource allocation method in the Internet of Vehicles
US20210112441A1 * | 2020-12-23 | 2021-04-15 | Dario Sabella | Transportation operator collaboration system
CN113382383A * | 2021-06-11 | 2021-09-10 | Zhejiang University of Technology | Policy-gradient-based computation task offloading method for public transport vehicles
WO2021237996A1 * | 2020-05-26 | 2021-12-02 | Duolun Technology Co., Ltd. | Fuzzy c-means-based adaptive energy consumption optimization vehicle clustering method
CN114528042A * | 2022-01-30 | 2022-05-24 | Nanjing University of Information Science and Technology | Energy-saving service offloading method for connected and automated vehicles based on deep reinforcement learning
CN114625504A * | 2022-03-09 | 2022-06-14 | Tianjin University of Technology | Internet of Vehicles edge computing service migration method based on deep reinforcement learning
Non-Patent Citations (4)
Title |
---|
DASONG ZHUANG: "Offloading Strategy for Vehicles in the Architecture of Vehicle-MEC-Cloud", 《2022 IEEE/CIC INTERNATIONAL CONFERENCE ON COMMUNICATIONS IN CHINA (ICCC WORKSHOPS)》, 11 August 2022 (2022-08-11) * |
XIUHUA LI: "Task Offloading for End-Edge-Cloud Orchestrated Computing in Mobile Networks", 《2020 IEEE WIRELESS COMMUNICATIONS AND NETWORKING CONFERENCE (WCNC)》, 25 May 2020 (2020-05-25) * |
ZHANG HAIBO; JING KUNLUN; LIU KAIJIAN; HE XIAOFAN: "An Offloading Strategy Based on Software-Defined Networking and Mobile Edge Computing in the Internet of Vehicles", Journal of Electronics & Information Technology, no. 03, 15 March 2020 (2020-03-15) * |
PENG JUN; WANG CHENGLONG; JIANG FU; GU XIN; MOU ??; LIU WEIRONG: "A Fast Deep Q-Learning Network Edge-Cloud Migration Strategy for Vehicular Services", Journal of Electronics & Information Technology, no. 01, 15 January 2020 (2020-01-15) * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109756378B (en) | Intelligent computing unloading method under vehicle-mounted network | |
CN109391681B (en) | MEC-based V2X mobility prediction and content caching offloading scheme | |
CN112995950B (en) | Resource joint allocation method based on deep reinforcement learning in Internet of vehicles | |
Kazmi et al. | Infotainment enabled smart cars: A joint communication, caching, and computation approach | |
Chen et al. | Efficiency and fairness oriented dynamic task offloading in internet of vehicles | |
CN110312231A (en) | Content caching decision and resource allocation joint optimization method based on mobile edge calculations in a kind of car networking | |
CN114143346B (en) | Joint optimization method and system for task unloading and service caching of Internet of vehicles | |
CN112395090B (en) | Intelligent hybrid optimization method for service placement in mobile edge calculation | |
CN113507503B (en) | Internet of vehicles resource allocation method with load balancing function | |
CN111339554A (en) | User data privacy protection method based on mobile edge calculation | |
CN115209426B (en) | Dynamic deployment method for digital twin servers in edge car networking | |
CN114374741B (en) | Dynamic grouping internet of vehicles caching method based on reinforcement learning under MEC environment | |
CN115297171B (en) | Edge computing and unloading method and system for hierarchical decision of cellular Internet of vehicles | |
Wu et al. | A profit-aware coalition game for cooperative content caching at the network edge | |
CN110489218A (en) | Vehicle-mounted mist computing system task discharging method based on semi-Markovian decision process | |
CN114641041A (en) | Edge-intelligent-oriented Internet of vehicles slicing method and device | |
CN109495565A (en) | High concurrent service request processing method and equipment based on distributed ubiquitous computation | |
CN114979145A (en) | Content distribution method integrating sensing, communication and caching in Internet of vehicles | |
CN113709249B (en) | Safe balanced unloading method and system for driving assisting service | |
CN115550944A (en) | Dynamic service placement method based on edge calculation and deep reinforcement learning in Internet of vehicles | |
CN117221951A (en) | Task unloading method based on deep reinforcement learning in vehicle-mounted edge environment | |
CN116916272A (en) | Resource allocation and task unloading method and system based on automatic driving automobile network | |
CN116489668A (en) | Edge computing task unloading method based on high-altitude communication platform assistance | |
CN114928826A (en) | Two-stage optimization method, controller and decision method for software-defined vehicle-mounted task unloading and resource allocation | |
CN114189522B (en) | Priority-based blockchain consensus method and system in Internet of vehicles |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||